Thursday, January 20, 2011

An introduction to dog training, pt 1

An introduction to dog training ends up sounding a lot like an introduction to psychology. Because it is. If you want to teach your dog anything it would be helpful to look into how animals (ourselves included) learn.

I'm going to use these first few posts to lay out a lot of the basic language that I think is important to understand when talking about training.

Classical conditioning

Pavlov's dog: By this point in your life you probably have at least a passing familiarity with Pavlov and his dogs. He was a physiologist who noticed that his experimental subjects would occasionally drool when no food was present – for instance, when the assistant who normally fed them walked into the room, even if he wasn't carrying food at the time. Pavlov designed an experiment that looked into the root of the dogs' responses. In phase one he would measure the dog's salivation in two situations: when meat powder was placed on the dog's tongue, and when a neutral stimulus was presented (a tone, which on its own would cause no salivation). In phase two he would sound the tone and then present the meat powder, repeating the pairing several times. In phase three he would sound the tone with no food present and, whelp, the dogs still salivated. They had learned that the sound of the tone indicated the imminent arrival of food.

Unconditioned stimulus (meat powder) -> Unconditioned response (salivation)

The process of conditioning:

Neutral stimulus (tone) -> Unconditioned stimulus (meat powder) -> Unconditioned response (salivation)

After conditioning has occurred:

Conditioned stimulus (tone) -> Conditioned response (salivation)

This is an important idea to understand about the learning process. The physical response is involuntary, but it still occurs even though it is now a step removed from the original trigger.

This reaction can be stretched a bit further. Say you pair a flashing light with the sound of the tone, which was previously paired with the arrival of meat. Eventually the flashing light on its own increases salivation, but the response is weaker. This is called second-order conditioning. You can normally extend the chain another few times, but you must understand that it's less effective each time.

The coolest thing about conditioning, and what's most important to remember, is that the learning process happens subconsciously. It is a natural reaction of animals' brains, and it can be used to explain the occurrence of various phobias, etc.

So how does this apply to dog training?

Say you're out walking your dog and it sees another dog approaching in the distance and starts barking at it and generally being an asshole. This sort of antisocial behaviour is normally born out of insecurity, and the dog has learned that if it barks and makes itself unapproachable, the other dog won't approach. The dog has been conditioned to feel that the presence of other dogs is unpleasant and reacts accordingly. To bring this back to Pavlov, the approach of the other dog is the conditioned stimulus, and the barking is the conditioned response.

So, well, your dog has already been conditioned to think that other dogs mean bad things. What now? Now it's time for counter-conditioning. Your goal is to change the approach of other dogs from an indicator of negative things into an indicator of positive things. You do this with food, 'cause, well, dogs love food (and they need it to live).

First you need to figure out what your dog's reaction distance is. Is it when the other dog gets within 10 feet of it, or when your dog sees another dog 6 blocks away? The reaction distance is your dog's threshold between being chill and freaking out. You want to keep the dog under threshold at all times if possible (but admittedly, this is not always possible). So, keep your distance from other dogs while you're doing this. Don't push your dog too hard.

Second, once you see the other dog approaching your dog's threshold, start popping food into your dog's mouth, one piece immediately after another. If your dog won't take food, you're too close to the other dog and you need to move away. Use awesome treats for this if your dog is really uninterested in taking food – steak, pizza, hotdogs, peanut butter, etc. Essentially your goal is to repeat this enough that your dog starts looking at you expecting food when it sees another dog. And your job is to provide food every single time.

Important things to remember: Your dog should notice the other dog before it gets food, so it understands more quickly that other dogs = incoming food. Counter-conditioning takes a LOT of time, so expect to spend months working on this. Progress might seem slow, and there are occasional setbacks, but keep at it.

This is an excellent video demonstration of how successful basic counter-conditioning can be:


Systematic desensitization is often coupled with counter-conditioning. It's used by psychologists to treat people with anxieties or phobias. The subject is exposed to a fear-evoking object or situation at an intensity low enough that it does not produce a fear response. Intensity can be modified via the degree of realism, proximity, etc. It is then gradually increased, but only as long as the subject continues to feel okay.


In general, a conditioned response will gradually disappear if it is not reinforced, a process called extinction. For instance, if Pavlov stopped offering meat powder after sounding the tone for a period of time, the dogs would cease to salivate, since the association between the tone and the food is no longer being reinforced. This is also why ignored behaviours often stop: the dog is no longer being reinforced for offering them.

However, some behaviours are self-reinforcing, and therefore very difficult to extinguish. For example, a dog often finds barking to be a pleasurable response to various stimuli (barking is FUN!), so even if you ignore a barking dog it's very unlikely to stop, since it's reinforcing the behaviour itself. That's not to say that you can't train a barking dog to be less barky, but it requires a different approach than simply ignoring it.
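If it helps to see the pairing-then-extinction pattern with Pavlov's tone as numbers, here is a toy sketch in Python. To be clear, this is not anything Pavlov wrote down. The update rule (nudge the tone-food association part of the way toward 1 when food follows the tone, and toward 0 when it doesn't), the learning rate of 0.3, and the function names are all assumptions I've made up purely for illustration.

```python
def update_association(strength, food_follows, learning_rate=0.3):
    """Move the tone->food association one step toward 1.0 (food) or 0.0 (no food).

    The rule and the learning rate are illustrative assumptions, not measured values.
    """
    target = 1.0 if food_follows else 0.0
    return strength + learning_rate * (target - strength)


def run_trials(strength, food_follows, n_trials, learning_rate=0.3):
    """Repeat the same kind of trial n_trials times; return the strength after each."""
    history = []
    for _ in range(n_trials):
        strength = update_association(strength, food_follows, learning_rate)
        history.append(strength)
    return history


# Conditioning: the tone is followed by meat powder, ten times in a row.
acquisition = run_trials(0.0, food_follows=True, n_trials=10)

# Extinction: the tone sounds alone ten times, starting where conditioning ended.
extinction = run_trials(acquisition[-1], food_follows=False, n_trials=10)

print("association after pairing:   ", round(acquisition[-1], 3))
print("association after extinction:", round(extinction[-1], 3))
```

The association climbs while the tone keeps predicting food and fades once it stops. A self-reinforcing behaviour like barking doesn't fit the second phase, because from the dog's point of view the reward never actually stops.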

Operant conditioning

Operant conditioning accounts for most of what we learn every day.

In classical conditioning, the neutral stimulus and the unconditioned stimulus are predictably paired, and the result is an association between the two. (Then the conditioned stimulus triggers the conditioned response.) Stimuli occur before or along with the conditioned response. But dogs (and humans) also learn many associations between responses and the stimuli that follow them – between a behaviour and its consequences.

Operant conditioning is all about consequences, whether they're good or bad. Learning is governed by the law of effect, which states that if an action is followed by a satisfying effect, the action is more likely to be repeated the next time the stimulus is present, and if an action is followed by an unsatisfying effect, it is less likely to be repeated. The subject learns by operating on the environment, hence the term operant conditioning.

In classical conditioning the conditioned response does not affect whether or when the stimulus occurs. Pavlov's dogs salivated when the tone sounded, but the salivation had no effect on the tone or on whether food was presented. By contrast, an operant has some effect on the world. When a child says “I'm hungry” and is then fed, the child has made an operant response that influences when food will appear. If a dog sits and then is fed, the dog has made an operant response that has also influenced when food will appear.

Reinforcement and punishment

There are four quadrants of consequences that can follow a response in operant conditioning. They are positive reinforcement, negative reinforcement, positive punishment, and negative punishment. A reinforcer increases the likelihood of a behaviour happening again, and a punisher decreases the likelihood of a behaviour happening again. The term “positive” means you're adding something to the environment; “negative” means you're taking something away from the environment. To clarify:

Positive reinforcement (R+): So, based on the definitions I just gave, a positive reinforcer is something you provide to the dog that will increase the likelihood of a behaviour repeating itself. Example: a treat following a dog sitting after you ask it to sit.

Negative reinforcement (R-): Negative reinforcement is when you take something away from the environment to increase the likelihood of a behaviour repeating itself. Example: upwards tension on a leash is released once a dog has sat after being asked to sit.

Positive punishment (P+): Positive punishment is adding something to the environment to decrease the likelihood of a behaviour repeating. Example: reprimanding a dog for jumping up on visitors.

Negative punishment (P-): Negative punishment is when you remove something from the environment to decrease the likelihood of a behaviour repeating. Example: putting a dog on “time out” after it jumps up on visitors, which takes away its access to them.
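If you find the positive/negative naming confusing (most people do), it can help to see that the four quadrants are just two independent yes/no questions. Here is a small Python sketch of that idea; the function name and the input labels are hypothetical, chosen only for this illustration, and the examples are the four given above.

```python
def quadrant(stimulus_change, behaviour_change):
    """Name the quadrant for a consequence.

    stimulus_change:  "added" or "removed" (what you did to the environment)
    behaviour_change: "increases" or "decreases" (what happens to the behaviour)
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


examples = [
    ("treat given after a sit", "added", "increases"),
    ("leash tension released after a sit", "removed", "increases"),
    ("reprimand for jumping up", "added", "decreases"),
    ("time out after jumping up", "removed", "decreases"),
]

for description, stimulus_change, behaviour_change in examples:
    print(f"{description}: {quadrant(stimulus_change, behaviour_change)}")
```

Notice that "positive" and "negative" never mean good or bad here. They only say whether something was added or taken away.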

I like to focus primarily on the R+ and P- quadrants. I like to reward good behaviour and ignore bad behaviour. If bad behaviour is ignored (and is not self-reinforcing) then its occurrence will decrease. (See the discussion of extinction above for more information.)

