Schedules of reinforcement are rules dictating how and when reinforcement is delivered during operant conditioning training. These schedules outline when to present or remove a reinforcer based on how much time has elapsed or how many responses have occurred.
Each of these schedules of reinforcement produces distinct patterns of responding. The type of reinforcement schedule used can impact how quickly a behavior is learned, the strength and frequency of the response, and how prone the response is to extinction.
In this article, learn more about different schedules of reinforcement and how they are utilized in the operant conditioning process.
Continuous Schedules of Reinforcement
A continuous schedule of reinforcement involves reinforcing a behavior every time it occurs. Because this reinforcement occurs every time the behavior is displayed, the learner can form an association between the behavior and the consequence of that behavior quite quickly.
For example, when training a dog to sit, you would start by providing a treat every single time the dog sits after you give the command.
Continuous reinforcement is often used during the initial stages of teaching behavior. Once the response has been acquired, it is often a good idea to switch to what is known as a partial reinforcement schedule.
Partial Schedules of Reinforcement
Unlike continuous reinforcement, partial (or intermittent) schedules of reinforcement do not reinforce every instance of a behavior. Instead, reinforcement is given periodically. It might be delivered after a certain number of responses have occurred or after a certain amount of time has elapsed.
There are four main partial reinforcement schedules:
Fixed-Ratio Schedule of Reinforcement
In a fixed-ratio schedule of reinforcement, reinforcement is delivered after a fixed number of responses. For example, a rat would have to press a button 10 times to receive a food pellet.
This reinforcement schedule typically leads to high, steady rates of response. There is sometimes a brief pause after a reward is delivered, but responding then quickly resumes.
Fixed-Interval Schedule of Reinforcement
In a fixed-interval schedule of reinforcement, a behavior is reinforced after a fixed period of time has elapsed. For example, a rat would have to wait five minutes before pressing the button would deliver a food pellet.
This reinforcement schedule typically leads to a fairly slow rate of response at the beginning of the interval. Response rates tend to increase as the time of reinforcement draws closer.
Variable-Ratio Schedule of Reinforcement
In a variable-ratio schedule of reinforcement, a behavior is reinforced after a varied, unpredictable number of responses. For example, a rat might be rewarded with a food pellet after 3 responses, then after 8, then 2, then 10.
This reinforcement schedule typically leads to a fairly high and steady rate of response.
Variable-Interval Schedule of Reinforcement
In a variable-interval schedule of reinforcement, a behavior is reinforced after an unpredictable period of time has passed. For example, a rat might be rewarded with a food pellet for the first button press after a random amount of time has elapsed.
This reinforcement schedule leads to a slow and steady rate of response.
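Because each partial schedule is simply a rule for deciding whether a given response earns reinforcement, the four rules above can be sketched in code. This is a minimal illustration, not something from the research literature; the function names and the way the unpredictable targets are randomized are assumptions made for the example.

```python
import random

# Illustrative sketch only: each function returns a "respond" callback that
# answers the question "does this response earn reinforcement?"

def fixed_ratio(n):
    """Reinforce every n-th response (e.g., every 10th button press)."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # deliver reinforcement
        return False
    return respond

def variable_ratio(mean_responses):
    """Reinforce after an unpredictable number of responses."""
    count, target = 0, random.randint(1, 2 * mean_responses)
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, random.randint(1, 2 * mean_responses)
            return True
        return False
    return respond

def fixed_interval(seconds):
    """Reinforce the first response after a fixed time has elapsed."""
    last = 0.0
    def respond(now):
        nonlocal last
        if now - last >= seconds:
            last = now
            return True
        return False
    return respond

def variable_interval(mean_seconds):
    """Reinforce the first response after an unpredictable interval."""
    last, wait = 0.0, random.uniform(0, 2 * mean_seconds)
    def respond(now):
        nonlocal last, wait
        if now - last >= wait:
            last, wait = now, random.uniform(0, 2 * mean_seconds)
            return True
        return False
    return respond

# Example: a fixed-ratio-3 schedule reinforces every third response.
fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

Note how the ratio schedules ignore the clock entirely and count responses, while the interval schedules ignore the response count and reinforce only the first response after time has elapsed; this difference is what produces the distinct response patterns described above.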
Choosing a Reinforcement Schedule
The right reinforcement schedule often depends on the situation and the type of learning taking place. When learning begins, it is often best to start with a continuous reinforcement schedule. Once the desired response has been established, it is often a good idea to switch to a partial reinforcement schedule.
Because continuous reinforcement involves delivering a reward every time a behavior happens, there is a risk that the learner will become satiated. This means that the reinforcement no longer acts as a reward. For example, if you were training a dog to sit using treats, continuous reinforcement may stop working if the animal is no longer hungry.
Behaviors reinforced on a less predictable, partial schedule also tend to be more resistant to extinction. This means that even if a period of time elapses in which there is no reinforcement, the behavior won't suddenly disappear.
History of Reinforcement Schedules
Schedules of reinforcement were first described by psychologist B. F. Skinner as part of his theory of learning known as operant conditioning. In operant conditioning, reinforcement and punishment are utilized to either increase or decrease the likelihood that a behavior will occur again in the future. This theory played an important part in the school of thought called behaviorism, an approach that suggested that all behaviors could be understood by looking at learned associations.
Frequently Asked Questions
A weekly paycheck is an example of which schedule of reinforcement?
A weekly paycheck is an example of a fixed-interval schedule of reinforcement. Because the reinforcement arrives after a fixed period of time (every seven days), it may lead to a higher rate of responding as payday approaches, followed by a brief drop-off as soon as the reinforcement is delivered.
Which schedule of reinforcement is most effective?
The best reinforcement schedule depends on the situation. Continuous reinforcement is often best when a response is first established. Partial reinforcement is a good choice during later stages to avoid satiation from over-rewarding the behavior or extinction when the reward is withdrawn. Variable-ratio schedules tend to lead to the highest and steadiest rate of response.
Gambling is an example of which schedule of reinforcement?
Gambling is an example of a variable-ratio schedule. One of the reasons it can lead to such a high response rate is that gamblers never know when the next response will lead to a reward.