Article 01

Reward, punishment, and the transfer of learning

How often do we say, when setting objectives for a training programme, ‘Unless behaviour changes, nothing changes’? This is a direct reference to behaviourism, one of the most influential schools of psychological thinking. Behaviourism is based on the view that it is only people’s observable behaviour which matters; that focusing on invisible and therefore scientifically unprovable things such as ‘attitude’, ‘personality’, and ‘belief systems’ is misguided; and that people’s behaviour can be modified by the systematic application of ‘rewards’ and ‘punishments’. Behaviourism was developed through research with animals: Pavlov’s dog ‘learned’ to salivate at the sound of a bell, because it meant food was coming; Skinner’s rats ‘learned’ to press keys inside their specially designed ‘Skinner boxes’ to get pellets of food. But behaviourist principles have been adopted in many human settings, including the education of disturbed children and the rehabilitation of offenders.

Do we as trainers use ‘rewards’ to reinforce the messages of training? Surely we do. ‘Rewards’ include expressions of interest in certain points participants make, use of participants’ suggestions, and, of course, straightforward praise. Participants are also rewarding and punishing each other through their patterns of interaction, and we as trainers try to influence those patterns so that they contribute to the learning. We also make use of another kind of learning which the behaviourists first recognised, the social learning called ‘modelling’. Modelling happens when a ‘high status individual’ in a group behaves in a certain way. Others in the group will tend to follow that ‘modelled’ behaviour, because it is associated with high status and therefore potential reward. As trainers, we tend to be granted temporary ‘high status’ in the training room itself, and we often use that status to ‘model’ behaviour consistent with the messages of the training.

I don’t want to discuss here the ethics of reward and punishment. Since we simply can’t avoid dishing out rewards and punishments when we are in the trainer role, I’d like us rather to reflect on how effectively we are doing so.

The reward systems we set up in the training room are artificial, often completely unlike the reward systems which operate in the organisation as a whole. So behaviour which is rewarded in the training environment may not be rewarded once training has stopped. There is an interesting fact about reward in such artificial circumstances. If you reward a particular behaviour very reliably and consistently during learning (so, for example, the rat gets a food pellet every time it presses the key), the learned behaviour disappears soon after the reward does. But if you only reward the behaviour every so often during learning (‘partial reinforcement’), then the behaviour persists long after the reward has stopped.

So behaviourism suggests we should make ‘rewards’ during learning a bit random and unpredictable. This sheds a whole new light on those visits to our training programmes by ‘unreconstructed’ senior executives. We’ve all had the experience of briefing them carefully, of trying to ensure their comments on the participants’ work and ideas support the change we’re in the business of bringing about; in short, of getting them to conform to our ‘ideal’ pattern of rewards and punishments. Many of us have also had the experience of a ‘good’ training session being ‘ruined’ by demotivating remarks from the CEO at the final presentations. Behaviourism invites us to see all this differently. We should be glad our neat schedules of rewards and punishments have been interrupted. We should seek more involvement from representatives of the real-world reward system. And we should seek their involvement not so much at the end of the programme as during it.

It might be harder for us to manage these kinds of untidy learning experiences, but behaviour change which occurs under these conditions is much more likely to last.

Behaviourism also provides us with one of the reasons why self-managed learning and action learning approaches can be so effective. In self-managed learning, you are in charge of your own patterns of reward and punishment, and you take them with you when you leave the learning environment. In action learning, you are learning in the context of the everyday reward systems, so the problem of learning transfer is minimised.

If we follow some of these behaviourist thoughts through to their logical conclusion, we arrive at the idea that training which is perfectly designed to produce maximum behaviour change in the shortest possible time is likely to be less effective in the long term.
