What Not to Pair: The Consequence of Mixing Consequences

Since 2003, I’ve been privileged to be on the faculty of Karen Pryor’s outstanding ClickerExpo conferences. So far, we’ve held twenty-six of these events in which several hundred people and their dogs gather for three days of lectures and workshops about all aspects of clicker training. Included is an hour-long panel discussion; six faculty members sit on stage, answering questions from the audience. One year, an interesting question was, “Is there any training mistake that you can make that you can’t fix?” In other words, in clicker training, is there any way you might inadvertently cause permanent damage to your animal’s future ability to learn?

We first acknowledged that the use of strong punishers can, of course, create enduring fears and learning disabilities. After that disclaimer, my colleagues, for the most part, said that clicker training is very forgiving and you can click your way out of virtually any problem.

But I disagreed. There is a training mistake I’m becoming more aware of and I believe it can significantly impair an animal’s ability to learn and to trust the trainer. This mistake is made when you change the emotional meaning of either punishers or reinforcers. Trainers typically do this without realizing what’s occurring. Classical conditioning is the learning process responsible for this shift in the affective value of a punisher or a reinforcer.

Classical conditioning is a process whereby animals learn predictive associations between environmental stimuli. We all remember Ivan Pavlov, 100 years ago in Russia, teaching dogs to salivate in response to the sound of a bell. He did this through repetitive trials of linking the bell’s ringing and the dog’s eating. For classical conditioning to “work,” the bell sound (the conditioned stimulus or CS) had to precede the dog’s feeding (the unconditioned stimulus or US). In other words, these two stimuli had to occur in sequence, CS followed by US. Otherwise the CS would not acquire any predictive (or anticipatory) meaning for the animal. So for classical conditioning to take place, the US cannot precede the CS, nor can the two stimuli happen simultaneously.

Another essential tenet of classical conditioning is that the emotional value of the US spreads backwards to “infect” the CS. That is, after many pairings of “CS followed by US,” the emotion the animal feels in response to the sight or sound or smell or feel of the US will become the way it reflexively feels about the CS. The conditioned stimulus will take on the emotional value of the unconditioned stimulus. [Note that in some cases, the animal’s physiological reaction to the CS will be different from its reaction to the US. For example, the response of a rat to an electric shock is to abruptly increase activity, whereas the rat’s response to a tone that signals the shock is to dramatically reduce activity. For more examples, see Rescorla, R. (1988). Pavlovian Conditioning: It’s Not What You Think It Is. American Psychologist, 43, 151-160.]

Animal trainers can use the power of classical conditioning in a variety of productive ways. It is the process responsible for creating conditioned (or secondary) reinforcers of all kinds, including the sound of the clicker. [Note that the rules of classical conditioning require that there be a brief pause between the click and the subsequent treat. If the two occur simultaneously, even if the “treat” the animal perceives is simply the trainer’s hand reaching toward food, the animal will not regard the click as a reinforcer. Therefore, the click should precede any movements of the trainer’s hand, eyes or body toward the food or toy being used as the treat.] Classical conditioning can create conditioned emotional responses, either joyful or fearful, to environmental sights and sounds the animal perceives. It also allows trainers to follow simple rules to successfully transfer old behavioral cues to new cues.

But there is another way that classical conditioning sneaks into our training, one we can easily overlook. Classical conditioning can change the animal’s perception of which stimuli are reinforcers and which are punishers. And these learned changes in the affective value of supposedly reinforcing or punishing consequences can be pervasive and long-lasting.

Murray Sidman, in his revolutionary book “Coercion and Its Fallout,” explains this concept clearly. Demonstrating the principle with the example of a rat in an operant-conditioning chamber, Dr. Sidman describes the simple procedure that will turn electric shocks into positive reinforcers so powerful that they can be used to teach the rat a completely novel behavior. The experimenter can use the process of classical conditioning to link shocks with food (e.g., the rat gets shocked and then immediately receives food). Despite our common-sense assumption that electric shocks are immutably punishing to animals, as a result of this training, the painful stimulus can indeed become a positive reinforcer for the rat, something the animal will actively work to obtain (pp. 74-75).

Here’s an excerpt from another book, a wonderful old textbook by Frank A. Logan titled “Fundamentals of Learning and Motivation” (1970, Dubuque, Iowa: Wm. C. Brown).

“One way to illustrate the importance of the principle of the anticipatory response…is to use as the conditioned stimulus and the unconditioned stimulus two stimuli of which one is emotionally positive (such as food) and the other of which is emotionally negative (such as electric shock). Let us see what happens depending on the order of these two stimuli.

Consider first an environment in which an organism occasionally receives an electric shock which signals that he can go to a food bin and get some food. In this order, the shock is the conditioned stimulus and food is the unconditioned stimulus, and the emotionally positive responses to the latter become conditioned to the former. Unless it is very intense, the shock loses its aversive properties; the organism accepts the shock calmly and eagerly goes for food. Perhaps he would prefer that a tone or a light signaled food, but he lives quite satisfactorily in such an environment.

Let us now reverse the order by the space of a single second. If in the first case food was delivered one-half second after the shock, let us now arrange an environment in which food is freely available but one-half second after the organism takes a bite of food, an electric shock is delivered. In this order, the emotionally negative responses elicited by the shock become conditioned to the food and the organism rarely eats. He lives in a continual state of conflict and behaves poorly. His life, in effect, is miserable.

Viewed from a distance, these two environments are identical: the same amount of food and the same number of shocks could be obtained in both. But the organism’s response to them is markedly different: the second event comes to dominate the situation. This is because the emotional responses to it become conditioned to the first stimulus and change its affective value. Shock can become pleasant, food can become aversive.” [Emphasis is mine.] (pp. 55-56)
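
For readers who like to see the logic spelled out, the way the second event comes to dominate can be illustrated with a toy simulation. This is only a sketch under my own assumptions, not something taken from Dr. Logan’s or Dr. Sidman’s books: the affective values (-1 for shock, +1 for food), the learning rate, the number of trials and the function name are all hypothetical, and the update rule is a simple delta rule in the spirit of the Rescorla-Wagner model of conditioning. It also ignores stimulus intensity, which Dr. Logan notes does matter.

```python
def pair_stimuli(first_value, second_value, trials=20, learning_rate=0.3):
    """Return the first stimulus's affective value after repeated pairings.

    On every trial the first stimulus (the CS) is followed by the second
    (the US), and the CS value moves a fraction of the way toward the
    US value. The US value itself is assumed to stay fixed.
    """
    value = first_value
    for _ in range(trials):
        value += learning_rate * (second_value - value)
    return value


# Hypothetical affective values: negative is unpleasant, positive is pleasant.
SHOCK, FOOD = -1.0, 1.0

# Environment 1: shock signals food, so shock is the first stimulus.
shock_then_food = pair_stimuli(SHOCK, FOOD)

# Environment 2: eating is followed by shock, so food is the first stimulus.
food_then_shock = pair_stimuli(FOOD, SHOCK)

print(f"value of shock when it predicts food: {shock_then_food:+.2f}")
print(f"value of food when it predicts shock: {food_then_shock:+.2f}")
```

The two calls use exactly the same pair of stimuli; only the order differs. In this simplified model, whichever stimulus comes first ends up with a value close to that of the stimulus that follows it.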

We animal trainers must realize that both these learning processes – punishers becoming reinforcers, or reinforcers becoming punishers – happen quite frequently without us realizing it. That’s the real danger: that our dogs learn that leash pops, verbal reprimands and other aversives are actually reinforcers and that food, toys, petting and praise are actually punishers. This will destroy our ability to train new behaviors and it will cause frustration for the trainer and confusion for the dog.

Here’s a common example of how this problem develops in dog training. Suppose you yell at your puppy for chewing the leg of the dining-room table and, as soon as he stops, you praise him and maybe give him a treat. Some trainers call this the “Jekyll and Hyde routine” or the “bad cop, good cop procedure,” meaning you should be quite harsh with the dog when he is behaving badly but the instant he changes his behavior to something more desirable, you should switch from yelling or leash-popping to sweet-talking, smiling and feeding. Whether or not this method is effective in decreasing table-chewing, it is doing something much more significant. It is turning the reprimands (or hitting or leash pops) into conditioned reinforcers. This means that over time, the reprimand will become less effective at suppressing behavior. Because you don’t realize why this is happening, you are likely to think you must increase the intensity of your yelling (or hitting or leash-popping). But if you continue to follow each of these intensified aversive stimuli with a positive reinforcer, even they will lose their ability to punish (i.e., decrease the strength of the preceding behavior). Dr. Sidman states that even shocks (delivered through the floor grid of an operant-conditioning chamber) strong enough to throw rats off their feet can become positive reinforcers capable of motivating a rat to learn a new behavior.

Why should this be important for clicker trainers – and for every positive-reinforcement-based trainer – to understand? Don’t we avoid using punishers as much as possible? Yes, certainly, but I believe it is essential that we preserve the power of humane, mild punishers to suppress behavior. That way, on the infrequent occasions that we might decide to use one in our training program (e.g., saying “anh-anh!” as our dog’s front paws lift off the ground in an attempt to grab the roast chicken off the kitchen counter), it will function as expected. The worst-case scenario is that we decide to use an occasional punisher as part of a carefully planned training “set-up” but then find that it not only didn’t suppress the unwanted behavior, it actually increased it! It takes a wise trainer to realize that yelling louder at the dog in this situation will not solve the problem.

The rule for avoiding this damaging effect is simple: Follow any punishing event with a period of not interacting with your dog (or your horse, or even your child). Communicate nothing to the dog for 30 seconds if possible, or at least 10 seconds if that’s all you can manage. “Disconnect” from the animal for this brief period (psychologically, not physically – don’t drop the dog’s leash!). You need to ensure that nothing you do immediately after the punishment in any way counter-conditions its aversiveness. After delivering a punishment, avoid all “rebound lovingness” to your dog. Of course, we definitely want to positively reinforce any desirable behaviors the dog offers, but not in that blackout period after a punisher.

Might you want to use this process to intentionally defuse a potential punisher? Yes, I can think of several examples. Maybe someone in your family yells at your dog and you would like to use classical conditioning to teach the dog that yelling (or fur-tugging or collar-grabbing) is actually a good thing, that is, a predictor of yummy food, or a fun game of tug, or a romp in the yard. Or another situation that comes up with many of my clients is that, when they get anxious on a walk with their dog, they tighten the leash. The dog usually interprets this leash tension as uncomfortable and stressful. But, at home, as a training assignment, the client can use classical conditioning to convince the dog that a taut leash is a predictor of good things. After many trials of “leash tightens” (the conditioned stimulus) followed by “liver treat for the dog” (the unconditioned stimulus), the dog will react emotionally to the leash-tightening as he does to the liver treats (i.e., yippee!).

What about the second possibility Dr. Logan mentions, that food can become aversive? This is a huge problem for positive-reinforcement trainers, one I see quite often. The most common way it happens is that trainers present a food lure to the dog and then immediately follow it with something painful, annoying or spooky. Repeated occurrences of “food followed immediately by an aversive stimulus” will teach the dog to distrust and even avoid food. This is the real reason behind many cases in which the handler says, “But my dog isn’t motivated by food.”

Here are some situations in which accidental classical conditioning might create some level of food-avoidance:

Using steak pieces to lure your hesitant and fearful dog onto the teeter in agility class (steak followed by scary movement of the teeter)

Smearing peanut butter on your refrigerator door so your dog will lick it while you brush out his matted fur (peanut butter followed by too intense or too much uncomfortable grooming; a gentle brief bit of grooming might be fine though)

Giving your very anxious dog a new Kong™ toy stuffed with cheese and biscuits just before you leave for work (cheese and biscuits followed by extremely distressing separation; this is especially problematic for dogs who haven’t learned how to eat from a Kong™ in a stress-free situation first)

Feeding your leash-aggressive dog pieces of chicken as soon as you notice a dog approaching from across the street; your dog notices the approaching dog only after she has eaten some of the chicken (chicken followed by appearance of the “threatening” dog; a much better technique would be to feed the dog pieces of chicken after she notices the other dog)

Passing dog cookies out to strangers so they can feed and then pet your shy puppy (cookies followed by too much close interaction with an intimidating person)

Two less common ways that I see food taking on negative associations are when trainers try to keep feeding a satiated dog (maybe at the end of a training class) or when they attempt to force-feed a dog, usually in a stressful situation (e.g., at the veterinarian’s office).

It’s even possible to inadvertently “infect” the click sound with negative associations in just this same way. That’s why I’m always particularly careful about using a clicker in situations in which the dog is fearful. Though clicker training is an ideal way to help dogs overcome fears, it is essential that the click itself is not followed by events that frighten or overwhelm the dog. In the words of legendary trainer Bob Bailey: “Your clicker is forever.” So we must take care to keep it unambiguously positive.

When I am training any animal, it’s vitally important that I preserve the emotionally negative effect of any humane punisher I might ever decide to use and that I preserve the emotionally positive effect of all my unconditioned reinforcers (e.g., food, toys, play, walks, touch) and conditioned reinforcers (e.g., clicks, “yeses,” praise). Even the most technically proficient training and the most creative behavior-modification plan can’t compensate for failure to keep these reinforcers and punishers separate and discrete.

Bright Spot Dog Training

positively unique solutions
