REWARD: A MANY SPLENDORED AND COMPLICATED THING
Reshaping the Dialogue
Copyright 2020 Ozuna

Reward is one of the most complicated concepts in dog training. For me, mastering its nuances is both the art and the craft of what we do as trainers. Yet too often the discussion in a million internet posts is phrased as “to reward or not to reward”, as if that were the sum total of the dialogue and the only relevant question. In fact, whether to reward or not when shaping a behavior is only one of a thousand questions to be asked. The discussion needs to be about whether we reward and, if so, what we are rewarding, how, when, why, where, in what fashion and in what context, as well as with whom we are working and what their likes, dislikes, preferences, strengths and weaknesses are: in short, an endless kaleidoscope of details that makes all the difference between good training and mediocre or bad training. This article is a brief, by no means comprehensive, look at some of the considerations that come into the training equation with respect to the question of reward.

Behavior:

Is the desired behavior, at this moment in time, something that will shape more easily with reward or without? Do you want the dog to do something or to NOT do something? If it’s a NOT do something, what do you want the dog to do instead? If it’s a DO something, what stage of learning are you at: showing, teaching, reinforcing, or proofing? Are you aware of the difference between those stages of learning? Depending on the stage of learning or rehab, or the end training goal, what precisely are you rewarding? Are you rewarding drive? Effort/try? Position? Execution? Steps towards execution? Polished execution? Execution under distraction? Polished execution under distraction? Focus? Changes in behavior, both to do or not to do something? How big a change in behavior, or how small? The exercise? Pieces of the exercise? Just showing up? As a colleague once said to me, “What are you trying to teach in this moment?” All of these things (and many others) can be separate and distinct rewardable moments. What are you trying to teach IN THIS MOMENT IN TIME: not an hour from now, not tomorrow, but where is the dog vis-à-vis your training objective right now? Do you need to break something into smaller pieces, link pieces together, start a new piece, review an old piece? What are you doing? What is your teaching plan? And since training is the ultimate life-humbling exercise, how can you shift to reframe or refocus if training is not proceeding according to plan?

Arousal levels:

Do you want arousal levels up or down? Is this a rehab or training of an already over-aroused, reactive or aggressive dog or, at the other end of the spectrum, a timid or fearful dog who, ironically, will be just as dysfunctionally flooded with adrenalin? Rewards may need to be low key, touch or voice only, where pups or dogs are already overexcited or aroused. Many puppies and dogs, even low to mid drive ones, now come into training already addicted to their own adrenalin. Adding fuel to the fire does not give you a brain to train. Or is this a competition dog in whom we are building, building, building focus and engagement for increasingly complex tasks and longer durations of concentration? Or again, at the other end of the spectrum, is this a competition dog who is getting too aroused, easily distracted and unfocused? How, when, what, and where we deliver reward may all need to be tweaked. Arousal is not in and of itself a bad thing; it depends on the context and on how the dog responds to arousal. If the response is snappy, nippy or reactive, that is not so great, and the reward may need to be modulated, or its timing very carefully orchestrated, to reward calm focus and not the moments where the dog is losing its mind. If the response is buoyant, focused performance, well then, mazel tov: reward is on point.

The Participant — the Dog:

What does the dog you are working with bring to the equation? Reward is as individual as the dogs that come through our doors. I currently have three personal dogs. Rajah, the old boy shepherd, wants his reward as food, food, food, delivered in a static (non-moving) fashion with quiet hands. The higher the food value, the happier he is. Steak would be his life goal, but chicken and cheese would not be amiss. That they do not appear on any regular basis, he considers my failing. Lily, the opinionated alpha elder Shiba, wants her reward food-based but tossed away from her in active prey mode. Hard, crunchy food doesn’t work for her; a soft, fast munch builds her drive. Whatever position or task we were training when I was competing with her, I had to find a way to deliver reward in an active fashion. That meant choosing rewards that met her nutritional needs and were soft, yet could be thrown without falling apart, and it meant not training on dirt, which at a grassless dirt/clay/sand ranch was challenging. Carport were us. Kerrtu, my 2.5-year-old shepherd, on the other hand, outgrew food early on and only wants an active reward, tug or ball, and loves it best when the reward is dynamic in two senses of the word. She loves it when the ball/stick/pinecone is thrown, but loves it even more if we incorporate the throwing while moving all across the ranch, not just hanging out in one place. She wants double dynamic reward, and thrives on that level of complexity. For her, we work out on the trail, in and out of formal commands and positional work. The other two would both shut down at that level of dynamic reward. They do not have the drive to sustain work for that kind of reward.

Typically, with my pups in training, I am hand feeding a portion of their food, a core resource, as opposed to feeding a treat, something additional, because I want an extremely high level of low-stress focus and low gut impact. Using food, a core resource, gets me a higher level of focus, and nearly all pups will click in to the food game within about 48 hours. We play it in various contexts, building positional memory: come to a hand target, touch, sit, down, place, and eye contact. With them, I am delivering the food in a quiet, fixed, cupped hand, NOT in my fingertips, because I want them taking food with their lips, not with their teeth; nine times out of ten these days, I am also building bite inhibition with mouthy youngsters. Using the cup of my hand and holding that steady also simulates the rooting reflex, so we are accessing core reflexes as well, giving me an extra deep neurological bang for my training buck.

Not all pups are food or toy driven. Touch is another core source of reward. I am working with a young Belgian shepherd right now who at times gets too nervous and overstimulated to take food. Ironically, given that he was resistant to touch at first, he now comes to my hand for stroking even if he doesn’t want food, like a cat arching its back for pets. He comes and shoves his head under my hand, and since coming is what I have asked him for, he gets his pets. He is touched, though, in a very specific way to increase his proprioceptive (total body) awareness. Again, we are using core neurological processes to aid development and learning. I use a slightly cupped hand, closed fingers (fingers touching), and firm one-way stroking from front to back and top to bottom. Touch was scary for him at first; now it is a reward.

If I am dealing with a dog who too easily goes into arousal, whether a fully reactive dog or just a pup who can’t process too much adrenalin, then my “reward” needs to be tempered accordingly. If we have a super mouthy or harder-edged, bitey pup or dog, then the last thing I probably want to do, at least in the beginning, is use food, and even touch may be too much and just elicit jumpy, erratic behavior. I’d probably start with just low-tone voice praise and evaluate the dog’s arousal balance point over time. What reward is too much, what is not enough, and what is just right in the moment is the continuing quest of the trainer. And just when we get a glimmer of rhythm with a dog, all of those considerations will change with the dog’s development, and sometimes from morning to afternoon.

The style and context of reward vary immensely, as do the style, context, and needs of the dog you are working with.

The Mechanics of Reward:

Books, videos, lecture series, workshops, even careers have been built on the subject of the mechanics of reward. I just want to raise a few points for the purpose of reframing and deepening the dialogue. The mechanics of reward are infinitely intricate: what is delivered, where, how, and when will all depend on that initial analysis of WHAT it is we are trying to shape, and thus what it is we want to reward, coupled with insight, knowledge and awareness of the dog with whom we are working. Added to which, just to make this more difficult, task, context, dog analysis, and reward are exceptionally fluid. We may reward different components of the exercise in very different ways during the course of a single practice session, let alone from session to session.

Mechanics: Timing

The timing of reward is an intricate dance, whether doing behavioral rehab or command-based training. For example, reward in my world, if working on command-based behaviors, is coordinated with a careful rhythm of words using a marker system: 1) name, 2) command, 3) marker (“yes”, “good”, or “no”, depending on stage of learning), and 4) reward or praise language. Marker timing is a four-part rhythm, not always easy to capture, but when you do, it is like a waltz with our dogs. And yes, the better you get, the higher the level of tasks you are training, or the longer the duration of focus you are asking for, the trickier it gets. I can remember watching an instructor coach one of the members of our Schutzhund (IGP) club on the timing of the release of a bumper reward. You could totally see the dog flatten slightly or motivate depending on split-second differences in the timing of the release of the reward. Fascinating, especially when I see so many internet videos with bad timing where the focus is on changing the dog, rather than on a mutual review that includes changing ourselves.

Reward for behavioral reshaping is the most nuanced and will depend totally on assessing microshifts in behavior in the direction of the desired balanced behavior I am shooting for.  That is a topic for another day.

Another component of timing is when and how you are delivering your reward vis-à-vis the concept you are trying to teach. Is the correlation strong enough for the task you are teaching? Are you constructively building the desired behavior, or has the behavior flatlined because the reward timing is off for that dog, that concept, in that moment? Working with another experienced person or coach can really help hone your timing. Timing is a crucial skill to develop, as its use or misuse can radically affect behavior, drive levels and task outcome.

Mechanics: Place of reward

Another piece of the mechanics of reward is the whole concept of place of reward, and its nemesis — consistency.  I see innumerable videos on the internet where the reward is delivered in a variety of places throughout the course of a short video in relation to one specific behavior or task with concomitant varying results.  Yes, our dogs can be moving targets, but we can still strive for consistency of place of reward to facilitate cleaner learning.

There is a vast variance of opinions on “place of reward” depending on what skill is being taught. Left hand/right hand, fingers tucked back, fingers level, fingers cupped/toy hidden, toy out: there are innumerable details that vary with the task taught and the stage of training. Some skill sets do not require as consistent a place of reward as others. The specifics of place of reward for each skill set are again a whole other article. My point here is to be thoughtful about what position you want your dog to be in for a task, and reward accordingly. Think about it, film it, review it, change it, and then, once tweaked, push yourself for consistency before you push your dog for consistency. However you are going to define place of reward for that skill set, you need to be consistent in delivery.

Being consistent with place of reward does not contradict the need that may occur to vary style or type of reward during a session.  Each of those would also have their best place of reward depending on what is being taught.  In addition, place of reward usually evolves as a task is taught.  Knowing how and when to vary place of reward is an art form.

Evolution of reward:

As training progresses, reward needs to remain fluid and changeable in many different aspects. Reward is never static if training is progressing as it should. For example, when I am first starting a very young puppy, I may be rewarding using a treat/food lure like a magnet, and rewarding 100% of the time for a particular shaped behavior. If I am using food up over the head of a pup to get a sit, the pup sits, and food comes directly from the visually close lure position to the mouth. Then I may transition to rewarding after execution and a marker (a verbal marker “yes” or a clicker). Food now is not over the nose of the pup; I simply say the command, mark when executed, then deliver the food. Then I might transition to rewarding a sequence of behaviors rather than one isolated behavior. For example, now I want my pup to sit and stay. He has captured the sit idea, and now I am adding a second concept: sit and hold the sit until I release you. The timing and place of my reward will now shift from the quick reward for the sit to reward on release from the sit. So reward is subject to development along with the development of the dog, physically and mentally.

At some point, once I am teaching sequences of behaviors, I am probably going to introduce variable reward, not rewarding 100% of the time, to build curiosity and what we call drive to problem solve and get the reward. To isolate one factor, whether or when I switch to variable reward might depend on the status of the dog. I might have a dog who is smart and executes the sequence really well, but is behaviorally insecure and might need way more time at 100% reward. Or I may have a super smart dog who rapidly gets bored with 100% reward. Now we are getting into the infinite nuances of reward and shaping behaviors. But these are all considerations to be brought to bear on the training equation. What the reward schedule is for any given dog or task will vary.

Reward: It is a many splendored and ever so complicated thing, and worthy of far more of our attention as canine professionals than a cursory emotional stance on whether you reward or not.