How Dogs Learn
Long story short, dogs learn through association. Their behaviors are based on the rewards and corrections that follow. Basic learning in dogs is similar to learning in humans, as both are described by the work of B.F. Skinner (who developed operant conditioning) and Ivan Pavlov (who discovered classical conditioning).
By understanding the concepts below, we can really break down the why’s of your dog’s behavior.
Operant Conditioning
Operant conditioning is where the term “positive reinforcement” comes from, but the term is often used to mean force-free, rewards-based training in general, which isn’t always positive reinforcement. Operant conditioning is broken down into 4 quadrants. Positive (+) means adding something, negative (-) means taking something away, reinforcement means making a behavior more likely to be repeated, and punishment means making a behavior less likely to be repeated. For the sake of my typing fingers, let’s refer to these four as +R (positive reinforcement), -R (negative reinforcement), +P (positive punishment), and -P (negative punishment).
Going back to how “positive reinforcement” gets used, many will say that “negative reinforcement” is its opposite, meaning the use of aversive training techniques. In fact, what they are describing is +P: adding in something (an aversive) to stop a behavior. For example: The dog barks, so the owner yells at the dog, which makes the dog stop barking. An example of -R could be applying tension on a horse’s reins to steer them in a direction. You want the horse to turn left, so you apply tension on the reins to the left and keep consistent tension until the horse turns, at which point you release the tension. So you are removing the tension on the reins in order to reinforce the horse going in the direction you want. I know, it’s a lot. Here’s a picture for a visual of what operant conditioning can look like.
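If you happen to think in code, the four quadrants can also be written out as a tiny lookup. This is only a sketch in Python; the function name `quadrant` and its two arguments are made up for illustration. It just combines the two questions from above: was something added or removed, and did the behavior become more or less likely?

```python
def quadrant(stimulus_change: str, behavior_effect: str) -> str:
    """Name the operant conditioning quadrant.

    stimulus_change: "add" (something is added) or "remove" (something is taken away)
    behavior_effect: "increase" (behavior becomes more likely) or
                     "decrease" (behavior becomes less likely)
    Both argument names are hypothetical, chosen only for this example.
    """
    sign = {"add": "+", "remove": "-"}[stimulus_change]
    kind = {"increase": "R", "decrease": "P"}[behavior_effect]
    return sign + kind


# Owner yells (adds an aversive) and the barking stops.
yelling = quadrant("add", "decrease")
# Rein tension is released (removed) and the horse turns more readily.
reins = quadrant("remove", "increase")

print(yelling)  # +P
print(reins)    # -R
```

The point of the sketch is that “positive” and “negative” only ever describe adding or removing, never good or bad.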
When we talk about consequences driving behavior, that doesn’t always mean the consequence is something bad! It is simply the result of an action. It is often broken down using the formula of Antecedent + Behavior = Consequence, or A+B=C. The antecedent is the thing or event that preceded the behavior, so what happened before Sparky started lunging on the leash? Maybe A: Someone approaches Sparky (which he is not a fan of), B: Sparky barks and lunges at this person, so C: That person backs away from Sparky. 1 point for Sparky! That worked to get the person to leave him alone, so he is likely to do it again. The person going away drives Sparky to keep barking at strangers, so it is considered -R.
So how in the heck can knowing this be beneficial for your training? Knowing the ABC’s of the situation you’d like to address can help you come up with a plan. You don’t want Sparky to bark, so you have options. You can avoid the antecedent (strangers) altogether, which isn’t practical, change the consequence (the person doesn’t back off), or aim to reinforce a different behavior. If you are the one to add in a behavior, you can change the consequence. Let’s try this instead! A: Someone approaches Sparky, B: You toss a treat off to the side, so C: Sparky goes to chase the treat instead. In the end, Sparky is reinforced for moving away from strangers and is therefore more likely to make that choice in the future. Thus +R! By knowing what goal you’d like to set in training your dog, you can use this formula to help come up with a plan.
“I don’t want my dog to jump up on me anymore!” Okay, so let’s say you want to know why your dog insists on continuing to jump up on you, so that is where we’d start. A: Your dog jumps up on you, B: You push him away, C: He excitedly jumps up again. What can you change in this instance to make the consequence different? Well, we could think back and change the event leading up to the jumping, but in this instance, the jumping is already happening, so we’re going to have to stick with changing the behavior exhibited by you when the dog jumps. Rewind, let’s try that again! A: Your dog jumps on you, B: You turn your back to him, C: He backs off due to lack of engagement.
When we get a desired behavior from our dog, the question is how long do we need to continue the reinforcements to make sure they continue the behavior? That brings me to our next section!
How Reinforcements Affect Behavior
As we are teaching our dogs what behaviors earn a reinforcer, it also matters how we give the reinforcer and whether it is changing your dog’s emotional response to something. If treats are not creating a sudden burst of dopamine, they are not changing your dog’s emotional response and will not be effective in helping them feel more comfortable around a certain thing. If your dog receives a reward every time they do a behavior, regardless of whether they perform it well or not, they will not be motivated to offer the behavior when something more interesting is happening. Let’s go into how the timing and pairing of rewards affect your training.
Classical Conditioning
To put it simply, this is when something happens just before or at the same time as an event, your dog connects the two, and that connection affects your dog’s behavior. Many of us know about Ivan Pavlov’s experiment with dogs salivating at the sound of a metronome (often claimed to be a bell). His starting point was that dogs don’t need to learn certain things because reflexes are hard-wired into them. In the experiment, an unconditioned stimulus (dog food) naturally produces an unconditioned response (salivating). When a neutral stimulus (the metronome) is repeatedly paired with the food, it becomes a conditioned stimulus, and salivating at its sound becomes a conditioned response. Long story short, the sound of the metronome made the dogs salivate as they would at the sight of food because they had learned that food often follows. The dogs also had the same response to the sound of Pavlov’s assistant’s footsteps, as he was the one who always fed them.
You have probably noticed accidental conditioning with your dog, such as them feeling anxious at the sound of jingling keys because that is usually followed by you leaving. My dogs become restless and excited when they hear the clink of ceramic because their food bowls are ceramic and the noise usually means I’m putting their bowls on the counter to prepare their meals. In training, the most common use of classical conditioning is to add value to the clicker. Like the metronome, the sound of the clicker tells the dog that a treat is going to follow. By using this conditioned stimulus, you can more easily teach operant skills by making the marked behavior clearer to your dog.
Schedule of Reinforcement
The schedule on which we give a reinforcement affects the consistency of a behavior. B.F. Skinner used rats in his experiments, teaching them to push a lever in order to get a treat. We’ll go over the different schedules and the scenarios where each is useful to your training. There is a method to the madness! Or…um…the treating.
Continuous Reinforcement (CR) - This is when the reinforcement happens every time the desired behavior is displayed. As mentioned above, this can make the behavior inconsistent. If we are giving a treat to our dogs regardless of how well they are performing the behavior, there is no motivation for them to improve. This schedule should be temporary and used when a dog is first learning something new.
Fixed Duration (FD) - This is when a reward is given after a specific amount of time. For example: If we treat our dog for sitting when guests enter, but the next treat for sitting doesn’t happen for another 5 minutes, your dog’s ability or motivation to keep sitting politely can start to drop off and they go back to jumping up. Their behavior will then go back to being polite as the 5 minute mark approaches, tapering off again after the reward is given.
Fixed Ratio (FR) - This is when a reward is given after a certain number of responses. This can initially be effective, but then your dog will usually speed through the behaviors in order to get to the reward. If every time you go to give your dog a treat you ask for a “sit”, a “down”, then “give paw”, you may notice that your dog starts to offer all of these behaviors before you have even asked for them.
Variable Duration (VD) - This is when the reward is given after an unpredictable amount of time. This gives us a moderate, but steady response rate. For example: When we are rewarding our dogs for being in a “stay”, a treat might be given after 30 seconds of holding their “stay”, then after a minute, then after 10 seconds. Your dog will be more likely to stay in place when they aren’t sure whether the treat will be coming soon.
Variable Ratio (VR) - This is when a reward is given after an unpredictable number of responses. This gives us a very high and steady response rate. So if your dog understands what behavior we are asking of them, such as “down”, they will be more likely to respond quickly and consistently to the cue because sometimes they receive the treat and sometimes they do not. By responding every time, they are increasing their chances of this time being the time they get the treat. Gambling is addictive for a reason!
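To see the difference between a fixed and a variable ratio laid out plainly, here is a rough sketch in Python. It is not how Skinner ran his experiments, just a toy model: the function names `fixed_ratio` and `variable_ratio`, the ratio of 3, and the random seed are all assumptions picked for illustration. Each call stands for one response from your dog and returns whether that response earns a treat.

```python
import random


def fixed_ratio(n):
    """Toy model: reward exactly every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Toy model: reward after an unpredictable number of responses.

    The required number is redrawn after every treat from 1 to 2*mean_n - 1,
    so it averages out to mean_n. The range is an assumption for this sketch.
    """
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)

    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


rng = random.Random(0)  # seeded only so the run is repeatable
fr = fixed_ratio(3)
vr = variable_ratio(3, rng)

fr_pattern = [fr() for _ in range(12)]
vr_pattern = [vr() for _ in range(12)]

# "T" marks a treat, "." marks no treat.
print("FR:", "".join("T" if r else "." for r in fr_pattern))
print("VR:", "".join("T" if r else "." for r in vr_pattern))
```

The fixed ratio line always prints the same predictable rhythm, which is exactly what lets a dog learn to rush to the third response. The variable ratio line has no rhythm to learn, so every response might be the one that pays.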
So with the information we’ve gone over, let’s use an analogy. Your grandmother hands you a scratch-off ticket in an envelope, as she does every time she sees you (antecedent). You scratch the numbers off and win $10 (behavior). Feeling particularly lucky, you decide to buy another ticket to scratch (consequence). This is an example of +R: being rewarded with the $10 makes you repeat scratching off tickets. As you scratch off the next ticket, you win $2, which isn’t as much but is enough to motivate you to get another ticket and try again. After buying another ticket, this time you don’t win anything, but you are pretty sure the next one has to be a winner (variable ratio). Scratching tickets is reinforcing because you enjoy it. So every time you see your grandmother, especially when you see her holding an envelope, you get excited (classical conditioning).
Hopefully this gives you more insight on the processes behind a lot of the techniques used in dog training and why your dog might do what they do!