Overexpectation in the context of reward timing

Please cite this article in press as: Ruprecht, C. M., et al. Overexpectation in the context of reward timing. Learning and Motivation (2014), http://dx.doi.org/10.1016/j.lmot.2014.01.004

Chad M. Ruprecht, Haydee S. Izurieta, Joshua E. Wolf, Kenneth J. Leising
Department of Psychology, Texas Christian University, United States

Keywords: Overexpectation; Timing; Pavlovian conditioning; Rescorla–Wagner model; Temporal difference models

Abstract

One of the many effects predicted by the Rescorla–Wagner model is overexpectation (OX). The OX effect is the finding that following compound training with two asymptotic elements, X and A, animals emit less conditioned responding (CR, e.g., nose poking) during tests of X alone compared to animals that did not receive compound training. We investigated the OX effect in the context of reward timing by training rats to expect sucrose at different times during X and recording the CR throughout the duration of X. Experiment 1 examined the OX effect using a traditional delayed conditioning procedure. In Experiment 2, the period during which sucrose was expected occurred either early or late during X. Tests revealed that less CR occurred in the OX group around the period that sucrose was previously overexpected, and that responding was otherwise similar in response functions to the control group that did not receive the compound manipulation. These are the first studies pitting the effects of OX against an animal’s ability to time its expectation of food.

© 2014 Elsevier Inc. All rights reserved.
Previous research has shown that the number and quality (e.g., temporal proximity) of pairings of an unconditioned stimulus (US; e.g., sucrose) with an initially neutral, conditioned stimulus (CS; e.g., a tone) influence the magnitude and timing of the conditioned response (CR; e.g., nose poking for food). The Rescorla–Wagner (R-W) model of learning (Rescorla & Wagner, 1972) proposed an equation to describe how a CS comes to control the CR by couching learning as trial by trial alterations to the associative value of a CS. The R-W model notably accounted for many existing conditioning effects (e.g., blocking), and also anticipated, a priori, a variety of conditioning effects that rely on the summation of associative values from more than one CS (e.g., superconditioning and overexpectation). One such effect occurs when the combination of previously trained CSs (hereafter referred to as elements) results in an overexpectation (hereafter referred to as OX) of the US. In Phase 1 of an OX procedure, two elements are trained on separate trials with a common US to asymptotic levels of responding. In Phase 2, the two elements are presented in compound. Given that initial trials of X and A occurred separately, the R-W model posits a summation rule: when X and A are placed in compound their associative values sum together. It is on the initial compound trial that animals should maximally overexpect the single US. Because of the finite amount of learning that can occur to a US, the model predicts that the associative value of each element should drop during subsequent compound trials until each element predicts the appropriate amount of US (i.e., X and A each equal half of the original US value).
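The R-W account of the OX procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis: the combined learning-rate parameter `alpha_beta`, the asymptote `lam`, the trial counts, and the function name `rescorla_wagner_ox` are all assumed values and names chosen for the example, and both elements are assumed to be equally salient.

```python
def rescorla_wagner_ox(alpha_beta=0.2, lam=1.0, element_trials=100, compound_trials=100):
    """Simulate element training (X+, A+) followed by XA+ compound training."""
    v = {"X": 0.0, "A": 0.0}

    # Phase 1: X+ and A+ on separate trials; the error uses only the cue present.
    for _ in range(element_trials):
        for cue in ("X", "A"):
            v[cue] += alpha_beta * (lam - v[cue])
    after_elements = dict(v)

    # Prediction error on the very first compound trial (summation rule).
    first_error = lam - (v["X"] + v["A"])

    # Phase 2: XA+ trials; both cues share one error based on their summed value.
    for _ in range(compound_trials):
        error = lam - (v["X"] + v["A"])
        for cue in ("X", "A"):
            v[cue] += alpha_beta * error

    return after_elements, first_error, v


after_elements, first_error, final = rescorla_wagner_ox()
print(after_elements, round(first_error, 3), final)
```

Under these assumed parameters, each element approaches the full US value in Phase 1, the first compound trial carries a prediction error close to minus the US value, and X and A each settle near half of the US value after compound training.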
The performance of animals trained with an OX procedure is most often compared with a CTL group that receives additional trials with one of the pre-trained elements in place of compound training (e.g., Kehoe & White, 2004; McNally, Pigg, & Weidemann, 2004; Rescorla, 1970, 1999; Sissons & Miller, 2009). Only in group OX should the prediction error lead to an adjustment of the associative value of X and A. The OX effect has been substantiated in the behavior of rats (Kamin & Gaioni, 1974; Kremer, 1978; Lattal & Nakajima, 1998; Rescorla, 1970), pigeons (Khallad & Moore, 1996) and recently, humans (Collins & Shanks, 2006).

Corresponding author at: Department of Psychology (298920), Texas Christian University, 2800 South University Drive, Fort Worth, TX 76129, United States. E-mail address: [email protected] (K.J. Leising).



The R-W model makes predictions about the magnitude of the CR following the OX procedure. The model treats a stimulus, regardless of its duration or the timing of reward delivery, as a single element that accrues or loses associative value as a whole unit until the US prediction error is minimized. This assumption predicts the OX effect should result in a diminished CR uniformly throughout the duration of an element and independent of the CS–US temporal relationship. However, there is no question that timing of crucial conditioning events, such as the CS–US interval, influences the magnitude, form, and distribution of the conditioned response during the presentation of an element (e.g., Catania, 1970; Gibbon, Malapani, Dale, & Gallistel, 1997; Kirkpatrick & Church, 1998, 2000; Roberts, 1981).

A temporally specific CR is typically evaluated using post-training trials in which the US is omitted and the temporal distribution of responding is analyzed. For example, Kirkpatrick and Church (1998, Experiment 2) found that rats trained with a 15-s CS–US interval increased the rate of nose poking within 1–2 s after CS onset but peaked at approximately 15 s after CS onset. While the original R-W model uses prediction error to modify the associative value of a stimulus as a unified whole, temporal difference (TD) models assume that each moment (time step) during a stimulus is distinctly represented (Ludvig, Sutton, & Kehoe, 2008; Sutton & Barto, 1981; Vogel, Brandon, & Wagner, 2003). The pattern of phasic firing by reward processing dopamine neurons is suggested by TD models to encode prediction error at each moment during a stimulus based on the difference between the discounted value of the US predicted at the current time step and the predicted cumulative sum of discounted US value from the remaining time steps. Unlike the R-W model, TD models assume that US prediction strength (i.e., associative strength) varies throughout the duration of the stimulus. This assumption correctly predicts the response peaks observed at the expected time of the US delivery during conditioning studies that vary the CS–US interval. Furthermore, it suggests that moments of prediction error, such as during an OX procedure, may be isolated to distinct time steps within the duration of a stimulus. Temporally specific adjustments in the associative strength of the CS due to prediction error should result in a temporally specific reduction in the magnitude of the CR.
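The idea that prediction error can be confined to particular time steps can be illustrated with a small TD(0) sketch. It assumes a complete serial compound representation (one weight per 5-s bin of each cue), which is simpler than the representations used in the models cited above, and the discount `GAMMA`, learning rate `ALPHA`, trial counts, and function names are illustrative assumptions. The weights are associative strengths, not response rates. The layout loosely follows a 40-s X with a 10-s A embedded in its last two bins.

```python
GAMMA = 0.9   # assumed discount per 5-s bin
ALPHA = 0.1   # assumed learning rate
N_BINS = 8    # 40-s X divided into 5-s bins
A_ONSET = 6   # in compound trials, A occupies the last two bins of X


def run_trial(weights, cues):
    """Run one reinforced trial; `cues` maps each cue name to its onset bin."""
    def value(step):
        # Summed prediction of every cue that is active at this time step.
        return sum(weights[c][step - onset]
                   for c, onset in cues.items()
                   if 0 <= step - onset < len(weights[c]))

    n_steps = max(onset + len(weights[c]) for c, onset in cues.items())
    for t in range(n_steps):
        reward = 1.0 if t == n_steps - 1 else 0.0  # US arrives as the cues end
        delta = reward + GAMMA * value(t + 1) - value(t)
        for c, onset in cues.items():
            if 0 <= t - onset < len(weights[c]):
                weights[c][t - onset] += ALPHA * delta


def simulate(trials=500):
    weights = {"X": [0.0] * N_BINS, "A": [0.0] * 2}
    for _ in range(trials):          # element training: X+ and A+ separately
        run_trial(weights, {"X": 0})
        run_trial(weights, {"A": 0})
    before = list(weights["X"])
    for _ in range(trials):          # compound training: A embedded late in X
        run_trial(weights, {"X": 0, "A": A_ONSET})
    return before, list(weights["X"])


before, after = simulate()
for i, (b, a) in enumerate(zip(before, after), start=1):
    print(f"bin {i}: before={b:.3f} after={a:.3f}")
```

With these assumed settings, the weights for the first six bins of X end close to their pre-compound values, while the two bins that overlapped A settle at roughly half of their earlier strength, which is the temporally specific decrement the paragraph describes.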

Blaisdell, Denniston, and Miller (2001, Experiment 4) demonstrated the best evidence for temporal relationships between X, A, and the US modulating the strength of the OX effect. Rats were initially trained with a 5-s trace interval between each element (X and A) and the US in a fear conditioning paradigm. The same interval was used in Phase 2 when the elements were presented together in compound training. Rats were then given post-training presentations of A either terminating immediately with the US (Group OX-Diff) or with the US following the same 5-s trace interval from training (Group OX-Same). When tested with X, Group OX-Diff showed a larger CR (i.e., more conditioned suppression) than OX-Same. Blaisdell et al. claimed that A more effectively competed with X when the temporal relations during post-training matched element and compound training. Time was an important factor in OX, but due to their design, Blaisdell et al. were not able to demonstrate whether the decrement due to OX induced a temporally specific drop in the CR. Temporal specificity can be more directly analyzed by measuring fine grained changes in the time course of the CR (e.g., Leising, Sawa, & Blaisdell, 2007; Williams, Johns, & Bindras, 2008).

In the current experiments, we trained rats to nose poke for food and examined whether the OX effect could manifest at a temporally specific time period (Fig. 1). In Experiment 1, we trained two forward-paired elements that differed in duration (40 s vs. 10 s), mirroring the design of recent OX procedures with forward-paired elements (e.g., Rescorla, 2006, 2007; Sissons & Miller, 2009). In Experiment 2, we modified an embedded procedure utilized by Leising et al. (2007) to train rats to expect sucrose either early or late within X and then embedded A directly into these time periods. Following a retraining procedure to further enhance responding to the shorter A (and enhance the decrement during X), we found evidence for timing of the OX effect in the early and late groups, in that the decrement mapped well onto response timing functions observed earlier in element training (left panel of Fig. 3).

Fig. 1. The design of Experiments 1 and 2. Element training has been collapsed across Phases 1 and 2 and trials of Element B have been excluded for simplicity (see Tables 1 and 2 for specific details).

Experiment 1

We tested for a timed OX effect by training rats with 40-s (X) and 10-s (A) elements, both of which terminated with the delivery of sucrose (see Fig. 1), followed by compound trials in which A was embedded into the last 10 s of X. Responding to X alone at test was compared to a group which received the same initial trials but did not receive compound trials.

Hypothesis

We hypothesized that rats would learn the X-US and A-US temporal relationships across training and overexpect the US during compound trials of XA. Behavior congruent with this hypothesis would manifest as fewer nose pokes in the intervals when the US was most expected to occur in group OX during tests of X.

Method

Subjects
The subjects were 16 female and 16 male Long–Evans rats bred in the TCU vivarium from parents obtained from Harlan Laboratories (Indianapolis, IN). Rats were pair-housed in translucent plastic tubs with a substrate of wood shavings in a vivarium maintained on a 12 h dark/12 h light cycle. All experimental manipulations were conducted during the light portion of the cycle. A progressive food restriction schedule was imposed over the week prior to the beginning of the experiment, until each rat received 15 g of food each day. All animals were handled daily for 30 s, during the week prior to the initiation of the study. Rats were randomly assigned to each group; however, one male rat in the CTL group died prior to testing, making the final number of subjects in group OX (n = 16) slightly higher than group CTL (n = 15).

Apparatus
Each of eight experimental chambers measuring 30 cm × 25 cm × 20 cm (l × w × h) was housed in a separate sound and light-attenuating environmental isolation chest (Med Associates). The walls and ceiling of the chamber were constructed of clear Plexiglas, and the floor was constructed of stainless-steel rods measuring 0.5 cm in diameter, spaced 1.5 cm center-to-center. One wall of the chamber was equipped with a dipper that could deliver sucrose solution (16%). When in the raised position, a small well (0.05 cm3) at the end of the dipper arm protruded up into the feeding niche. An infrared photo-detector was positioned across the entrance to the feeding niche. When a rat placed its nose into the feeding niche to lick the sucrose solution (i.e., a nose poke), the photo beam was disrupted measuring responding at a rate of 10 breaks per s (i.e., dSec; converted into seconds for data analysis).

A ventilation fan in the enclosure and a white-noise generator on a shelf outside of the enclosure provided a constant 74-dB (A-Scale) background noise. Three speakers on the outside walls of the chamber delivered a high frequency tone (3000 Hz, 8 dB above background noise) or a low frequency tone (750 Hz, 8 dB above background noise). The 3000 Hz tone and 750 Hz tone served as Elements X and B, and were counterbalanced across groups. A diffuse light was located 13 cm above the floor, on the same wall as the food magazine. A flashing light (0.25 s on/0.25 s off) stimulus could be presented by flashing the diffuse light; 10 s of the flashing light stimulus served as Element A for all subjects. The enclosure was dimly illuminated by a 28-V, 100-mA shielded incandescent house light mounted on the wall opposite of the food magazine. The house light was turned on during the experimental sessions but turned off during the duration of the flashing light. Box assignments were counterbalanced between groups.

Procedure
Table 1 shows the procedural details of Experiment 1.

Magazine training. For 2 sessions, rats received un-signaled presentations of sucrose that lasted 10 s and were delivered on a variable interval (VI) schedule of 20 s (Session 1), followed by a VI-60 s schedule (Session 2). This 10-s presentation of sucrose in the magazine served as the US for all remaining phases (Table 2).

Element training. On sessions 3–6, all rats received 8 trials of 40-s X (reinforced, +) and 40-s B (non-reinforced, −). B− trials were meant to encourage discrimination of X+ and B−. The inter-trial interval (ITI) during Element Training was set to a VI-50 s (range 30–70 s) for the first two sessions followed by a VI-80 s ITI (range 60–100 s) for sessions 5 and 6.

Table 1
The procedural details of Experiment 1.

Group | Timing of US | Element (sessions 7–19) | Compound | Test
OX (n = 15) | After X | X+ (40 s)/B− (40 s)/A+ (10 s) | XA+ (40 s) | X− (40 s)
CTL (n = 16) | After X | X+ (40 s)/B− (40 s)/A+ (10 s) | X+ (40 s) | X− (40 s)

Note: “+” = paired with sucrose, “−” = un-paired with sucrose. Element durations (in seconds) are in parentheses.


Table 2
The procedural details of Experiment 2.

Group | Timing of US | Element (sessions 7–19) | Compound | Test 1 | Reacclimation | Test 2
Early-OX (n = 8) | 5 s after onset of X | X+ (40 s)/B− (40 s)/A+ (10 s) | XA+ (40 s) | X− (40 s) | XA+ (40 s), A+ (10 s) | X− (40 s)
Early-CTL (n = 8) | Same | Same | A+ (40 s) | Same | A+ (40 s) | Same
Late-OX (n = 8) | 25 s after onset of X | Same | XA+ (40 s) | Same | XA+ (40 s), A+ (10 s) | Same
Late-CTL (n = 8) | Same | Same | A+ (40 s) | Same | A+ (40 s) | Same

Note: “+” = paired with sucrose, “−” = un-paired with sucrose. Element durations (in seconds) are in parentheses.

On sessions 7–19, 8 trials of A+ were added to training. Sessions consisted of 24 trials per session, with 8 trials each of 40-s X+, 40-s B−, and 10-s A+. The inter-trial interval (ITI) during sessions 7 and 8 was set to a VI-50 s (range 30–70 s) followed by a VI-80 s ITI (range 60–100 s) for the remainder of training. Rats received 136 trials of X and B, and 104 trials of A throughout element training. Two non-reinforced probe trials of each element (X−, A−, and B−) replaced regularly scheduled presentations of X+, A+, and B− during the final session of Element Training (Session 19). Responding was averaged across the two probes for each element.

Compound training. During sessions 20, 21, and 22, rats in group OX (n = 16) received 8 compound trials (40-s XA+), during which the onset of A occurred 30 s after the onset of X; both stimuli co-terminated with US delivery. Rats in group CTL (n = 15) received 8 trials of X+, which also co-terminated with the US delivery. A VI-80 s ITI (range 60–100 s) was used throughout compound training. A probe of XA− or X− replaced Trials 3 and 6 of XA+ or X+ trials on Session 22 for Groups OX and CTL, respectively.

Test. During Session 23 and Session 24, rats received 8 trials of XA+ (OX) or X+ (CTL) followed by 4 test trials of X−. Responding during each 5-s bin was recorded. A VI-80 s ITI (range 60–100 s) was used throughout the test sessions.

Data collection and analysis. The cumulative time spent nose poking (i.e., Mag Time) during each element and the 40 s preceding each element (i.e., the Pre-CS; collapsed across trials) was recorded in 5-s bins. We approached the test data in two ways: [1] Whole-Cue Analysis: we conducted an ANOVA on all eight time bins of X. Given we predicted a difference only in bins during which the prediction error was maximal (near the end of the cue), we did not expect a main effect of Group or the interaction. [2] Target Bin Analysis: Given our a priori predictions regarding maximal overexpectation during the last 10-s of X (Bins 7, 8) and the 10-s of the US (US-1 and 2) we used independent t-tests to investigate group differences during these bins and used Bins 5 and 6 as comparisons. These two bins were selected as comparisons for the following reasons: 1) they were adjacent to Bins 7 and 8, 2) they did not overlap with Element A in compound training, so they were bins during which we wouldn’t expect the OX decrement and 3) they revealed the temporal specificity of the OX effect. For all t-tests, we utilized Cohen’s d (Rosenthal & Rosnow, 1991) to report effect size.
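The conversion from photobeam samples to Mag Time per 5-s bin can be sketched as follows. This is only an illustration under stated assumptions: the beam state is taken to be available as a list of 0/1 samples at 10 per second, and the function name `mag_time_by_bin` and the example data are hypothetical rather than part of the recording software described here.

```python
def mag_time_by_bin(samples, rate_hz=10, bin_s=5):
    """Return the seconds of beam-break time in each consecutive bin.

    `samples` holds one 0/1 value per sample (1 = beam broken).
    """
    per_bin = rate_hz * bin_s
    return [sum(samples[i:i + per_bin]) / rate_hz
            for i in range(0, len(samples), per_bin)]


# Hypothetical 40-s trial: no poking for 30 s, then continuous poking for 10 s.
example = [0] * 300 + [1] * 100
print(mag_time_by_bin(example))  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.0]
```

Each value is bounded by the bin length, so a bin in which the rat kept its nose in the magazine throughout yields 5 s.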

Results

Element training
Fig. 2 (left panel) displays the mean duration of responding during each bin separately for Element X and the Pre-CS at sessions 5, 10, and 19. To evaluate the strength of excitatory conditioning, we compared responding to Elements X+ and A+ to B− and the Pre-CS during the last session of element training (Session 19). Given that X+ and A+ differed in duration (40-s vs. 10-s), we compared Mag Time averaged across Bins 7 and 8 of 40-s X, and Bins 1 and 2 of 10-s A, as these two time periods shared adjacent, temporal relationships to the US during element training. Mag Time was collapsed across all 8 bins of the 40-s Pre-CS. At the conclusion of element training (Session 19), rats were responding more during 40-s X+ and 10-s A+, than during 40-s B− and the Pre-CS. This observation was supported by a 4 (Element: X vs. A vs. B vs. Pre-CS) × 2 (Stimulus: High Tone vs. Low Tone) × 2 (Gender: Male vs. Female) × 2 (Group: OX vs. CTL) factorial analysis of variance (ANOVA) conducted on Mag Time from Session 19. The ANOVA revealed a main effect of element, F(3, 29) = 54.6, p < .001, η² = .51; all other main effects and interactions were non-significant, Fs < 2.61, p > .05. Post hoc analysis (Tukey’s Honestly Significant Difference; HSD) conducted within the main effect of element, revealed that X (M = 2.1 s, SD = 1.2 s) and A (M = 2.4 s, SD = 1.6 s), which did not differ reliably, p > .05, elicited higher responding than B (M = .76 s, SD = .7 s) or the Pre-CS (M = .43 s, SD = .6 s), p < .05, which did not reliably differ themselves, p > .05.


Compound training
Between group comparisons (OX vs. CTL) were made during Bins 7 and 8 (averaged) of XA+ or X+, respectively. At the conclusion of compound training, rats in Group OX and CTL were responding more to XA+ and X+, respectively, when compared to B− or the Pre-CS. An identical analysis to that above conducted on Mag Time during Session 22 of compound training revealed a marginally nonsignificant main effect of group, F(3, 29) = 3.64, p = .07, η² = .09; all other main effects and interactions were nonsignificant (main effects: stimulus, Fs < 2.19, ps > .05).


Fig. 2. Left: Mag Time during Element X (averaged) on Session 5, Session 10, and Session 19 of element training. The Pre-CS, for comparison, was collapsed across all 3 sessions. Right: Mag Time during Element X (averaged across both test sessions) and the 10-s following X for Groups OX (n = 16) and CTL (n = 15). The y-axis displays time spent poking in the magazine in seconds (s). The grey symbols beneath the X-axis indicate the time bins during X in which A and the US previously occurred during compound training. Data are from non-reinforced probe trials. *p < .05 (OX vs. CTL).

Test of X
Whole-Cue analysis. Fig. 2 (right panel) also shows responding in each 5-s bin during test trials of Element X. These data come from both test sessions with four trials per session (8 total test trials). Given the large overlap of bins during which the OX and CTL groups responded similarly, group averages appear quite small. This observation was confirmed by a Group (OX vs. CTL) × Time (Bins 1–8) mixed design ANOVA, with time as the repeated factor, which revealed a significant main effect of time, F(7, 203) = 3.65, p < .001, η² = .11, and nonsignificant effects of group and the interaction of group and time, Fs < 1.01, ps > .05. The lack of an interaction of group and time indicated that if any OX effect was occurring, the difference between groups would be at similar intervals and confined to only a small subset of intervals.

Target bin analysis. Fig. 2 (right panel, below X-axis) includes symbols that indicate the intervals during which Element A was embedded in Element X (see also Fig. 1 for temporal relationships across phases). Prediction error was expected to be largest at these intervals, as well as during the expected US periods. Independent t-tests conducted at Bin 7 did not reveal a reliable difference, t(29) = .89, p = .37, d = .33. Significant differences between groups were found at Bin 8, t(29) = 2.3, p = .02, d = .85, and during the first 5 s of expected US delivery, US-1 t(29) = 2.7, p = .02, d = 1.02, but not during US-2, t(29) = .06, p = .94, d = .02. There was no reliable difference between groups at comparison Bin 5, t(29) = .71, p = .47, d = .26, or Bin 6, t(29) = .52, p = .6, d = .19.
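The reported effect sizes appear consistent with the conversion d = 2t/√df, which is commonly attributed to Rosenthal and Rosnow. That reading is an inference from the numbers, not something the text states, so the sketch below is only a plausibility check; the function name `d_from_t` and the dictionary layout are assumptions made for the example.

```python
from math import sqrt


def d_from_t(t, df):
    """Convert an independent-samples t statistic to Cohen's d as 2t / sqrt(df)."""
    return 2 * t / sqrt(df)


# (t, reported d) pairs copied from the target bin analysis; df = 29 throughout.
reported = {
    "Bin 5": (0.71, 0.26),
    "Bin 6": (0.52, 0.19),
    "Bin 7": (0.89, 0.33),
    "Bin 8": (2.3, 0.85),
    "US-1": (2.7, 1.02),
    "US-2": (0.06, 0.02),
}

for label, (t, d_reported) in reported.items():
    print(f"{label}: computed d = {d_from_t(t, 29):.2f}, reported d = {d_reported:.2f}")
```

Small discrepancies, such as the one for US-1, are what would be expected if the published t values were rounded before printing.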

Discussion

The left panel of Fig. 2 indicates that the animals were indeed timing, as evidenced by the progressive development of the CR and more responding occurring toward the end of Element X. The overall depressive effects of extinction can also be seen when comparing the left and right panels of Fig. 2. Characteristically, the magnitude of responding during X was reduced in extinction but the temporal specificity of the CR was maintained (cf. Ohyama, Deich, Gibbon, & Balsam, 1999). Data from rats in the OX condition demonstrated a reduction in magnitude that was confined to the last 5-s of X and also during the first 5-s of the expected US delivery.

The decrement in responding during the final 5-s of Element X was the strongest evidence for the overexpectation effect, but responding during US-1 was also informative. The US-1 bin was the interval in which prediction error was expected to be the largest. Though Element X was absent during this interval, the US was also absent at test (US-1 and -2). The CR during the expected US intervals was anticipatory and reflects, at a minimum, two processes. The first was timing of the US from the onset of X. The second was the termination of A as a cue for US delivery. Most experiments investigating temporal control of behavior train rats with a delayed procedure similar to that used in Experiment 1, but their procedures also include probe trials when the duration of the CS is extended and reinforcement withheld (i.e., a peak procedure; e.g., Kirkpatrick & Church, 1998). The results of experiments with the peak procedure reveal a distribution in which responding peaks at the expected time of US delivery, much like that found in Fig. 2 (right panel). Previous research suggests that timing from the onset of Element X was primarily controlling the CR during US-1.

One possible limitation concerns the probes taken from the final session of compound training, during which there was some weak evidence for summation of X and A (a marginally significant difference between probes of XA and X). Summation so late in compound training could be indicative of non-asymptotic readjustment to X and A. Insufficient readjustment, however, would work against the predicted result of diminished responding during X in the OX group. We address the issue of incomplete readjustment, according to the parameters we utilized for compound training in Experiment 2, by examining the effects of reacclimation training.


In sum, the results from Experiment 1 are the first, to our knowledge, to evaluate the CR throughout the duration of a target Element X following OX training. The observed OX effect was confined to a period (Late vs. Earlier in X) during which the US was maximally overexpected, based on the X-US and A-US temporal relationships encoded during training (see Fig. 1). This result mirrors previous reports on matching temporal relations in excitatory (Leising et al., 2007) and inhibitory appetitive conditioning (Williams et al., 2008). However, to verify if the response decrement due to OX was truly sensitive to where A was embedded in X, we sought to manipulate the temporal presentation of A and the US within differing periods of X in Experiment 2.

Experiment 2

Experiment 1 obtained evidence for a timed decrement in responding during a pre-trained Element X, following compound training with another pre-trained Element, A (i.e., the OX effect). Experiment 2 aimed to capture a temporally specific OX effect evidenced by a timed decrement in rats’ nose poking during specific time periods of the target Element X. During element training, the US arrived either early or late during X (Fig. 1).

Leising et al. (2007) demonstrated how an element embedded within another element could be used to study timing in associative conditioning. During sensory preconditioning, a short Element A (10 s in duration) was embedded within a long Element X (40-s), either early (5 s) or late (25 s) after the onset of X. Following these pairings, A was simultaneously paired with access to a sucrose solution. During non-reinforced tests of X, responding was consistent with encoding of an X-A-sucrose temporal map; rats in Group Early checked for sucrose more often in the early portion of X, while rats in Group Late checked more often in the latter portion of X. Borrowing from the logic of Leising et al. (2007), it is possible to examine the response decrement imposed by the OX effect if elements communicating varying temporal information are embedded into one another.

One major change from Experiment 1 to Experiment 2 was that all rats received simultaneous pairings of A with the US. Previous research has demonstrated that simultaneous CS–US pairings produce weak CRs (e.g., Ellison, 1964; Kamin, 1954, 1965; Pavlov, 1927), which has been most commonly interpreted as a failure to learn the CS–US relationship. More recent research has shown that the deficit observed following simultaneous pairings often reflects a performance, rather than a learning deficit (e.g., Cole, Barnet, & Miller, 1995; Leising et al., 2007; Savastano & Miller, 1998, for a review). The use of simultaneous pairings in Experiment 2 was supported by two empirical results: (1) In Experiment 1, the OX effect was spread across Element X (Bins 7 and 8) and the US-1 bin. Simultaneous pairings of Element X and the US allowed us to predict a priori the exact bins during which the prediction error would be maximal. (2) A large body of literature has demonstrated that simultaneous pairings are an effective method for confining temporal expectation to specific bins for studies of associative cue competition and integration (for appetitive see Leising et al., 2007; for aversive conditioning see Savastano & Miller, 1998, for a review). Lastly, placing the competing element and expected US delivery within the same interval allowed for non-overlapping Early and Late periods within Element X, which could be used to assess the temporal specificity of the OX effect.

We replicated the parameters used in Experiment 1 but failed to observe any decrement during X. This failure to observe an OX effect may have been due to simultaneously pairing A and the US, or the fact that A and the US were embedded within X. To enhance the OX effect, we borrowed a reacclimation procedure reported by Rescorla (2007). After two experiments revealing weak OX effects using an appetitive Pavlovian conditioning procedure, Rescorla included separate trials of A+ during the compound training phase (Experiment 3) in order to enhance the magnitude of the OX effect. Consistent with the R-W model, Rescorla interpreted the magnified OX effect as the result of X incurring more of the decrement in excitatory strength during compound trials due to separate trials of A+ rescuing the deficit incurred by A during XA+ compound trials. More specifically, Rescorla (2007) suggests “According to most error correction models, these separate A+ trials should maintain the strength of A+ trials, thereby forcing B [in our case X] to account for all of the decrement that occurred on AB+ trials [in our case AX+]” (p. 17). We utilized this training procedure to magnify the OX effect in Experiment 2.
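The error-correction reasoning behind interleaving A+ trials can be illustrated with a minimal Rescorla–Wagner simulation. This is only a sketch under stated assumptions: the function name `rescorla_wagner`, the single learning rate of 0.1 shared by both cues, and the starting strengths of 1.0 are illustrative choices, not parameters reported in the article. The trial counts (48 XA+ trials, with or without 24 interleaved A+ trials) mirror the totals described for reacclimation.

```python
def rescorla_wagner(trials, v=None, rate=0.1, lam=1.0):
    """Apply Rescorla-Wagner updates over a sequence of reinforced trials.

    trials: list of tuples naming the cues present on each trial, e.g. ("X", "A").
    v:      dict of starting associative strengths (assumed values).
    rate:   combined learning-rate parameter (alpha * beta), assumed equal for all cues.
    lam:    asymptote of learning supported by the US.
    """
    v = dict(v or {})
    for cues in trials:
        # One shared prediction error per trial, based on the summed strength.
        error = lam - sum(v.get(c, 0.0) for c in cues)
        for c in cues:
            v[c] = v.get(c, 0.0) + rate * error
    return v

# Both elements start at asymptote, as after element training (assumption).
start = {"X": 1.0, "A": 1.0}

# 48 compound trials only.
compound_only = rescorla_wagner([("X", "A")] * 48, start)

# 48 compound trials with 24 separate A+ trials interleaved (XA+, XA+, A+).
with_a_alone = rescorla_wagner([("X", "A"), ("X", "A"), ("A",)] * 24, start)

print("XA+ only:  ", compound_only)
print("XA+ and A+:", with_a_alone)
```

Under these assumptions, compound training alone drives X and A toward half of the asymptote each, whereas the interleaved A+ trials keep restoring A, so X ends well below one half. The exact values depend on the assumed learning rate and trial order.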

Hypothesis

We expected clear evidence of response timing (e.g., temporally distinct peaks in conditioned responding; Catania, 1970) during X before subjects were advanced to compound training. During compound training, we anticipated that subjects would overexpect the US during targeted time periods (early or late) of X and A. All rats were then tested on X following compound training. Behavior congruent with a timed OX decrement would manifest as less nose poking during the early period of X for Group Early OX than Early CTL, and less nose poking during the late period of X for Group Late OX than Late CTL.


Method

Subjects

The subjects were 16 male and 16 female Long–Evans rats bred, housed, and fed as described in Experiment 1. Subjects were randomly assigned to each group (EarlyOX, n = 8; LateOX, n = 8; EarlyCTL, n = 8; and LateCTL, n = 8).


Apparatus

The apparatus and stimuli used were identical to those described in Experiments 1 and 2. X and B were 40-s auditory stimuli (high tone vs. low tone), and A was a 10-s flashing light.

Procedure

All ITIs were conducted in each phase as in Experiment 1.

Mag training. The first 2 sessions of magazine training were identical to those described in Experiment 1.

Element training. Element training proceeded identically to Experiment 1, with the exception that the time of US delivery during X differed across the groups; for the early group, the US arrived 5 s after the onset of X, but for the late group, the US arrived 25 s after the onset of X. Starting on Session 7, trials of 10-s A were again added. The 10-s US (sucrose) arrived simultaneously with the 10-s flashing light (A). During Session 19, non-reinforced probes replaced the 5th and 7th regularly scheduled presentation of X+, A+, and B+.

Compound training. On sessions 20, 21, and 22, rats in Group Early OX received 8 compound trials during which the onset of A and the US were embedded 5 s after the onset of X. Subjects in Group Late OX received 8 compound trials, during which A and the US were embedded 25 s after the onset of X. Groups Early CTL and Late CTL received 8 respective trials of X, identical to how they appeared in element training. Probes replaced two of the XA+ or X+ trials on Trial 3 and Trial 6 during Session 22, respectively.

Test 1. During sessions 23 and 24, OX groups received 8 compound trials (XA+), whereas CTL groups received 8 presentations of X+. All subjects then received 4 test trials with X (i.e., X−).

Reacclimation and test 2. During sessions 25–28, all subjects received one additional element training session. This was followed by 3 sessions of compound training identical to previous training with the exception that during the 16 compound trials (XA+), all subjects received 8 additional reinforced trials of A+, randomly inserted with the constraint that A was never presented twice in a row (i.e., XA+/A+; e.g., Rescorla, 2007). All rats received 48 trials of XA+ or X+, respectively, as well as 24 additional trials of A+. Rats were again tested on X (identical to Test 1) on sessions 29 and 30. A VI-80 s ITI was used throughout reacclimation and Test 2.

Data analysis. The dependent measure and main comparisons made during element and compound training did not differ from Experiment 1. At test, we again expected our embedded procedure to produce a temporally specific, rather than a generalized, decrement during X. [1] Whole-Cue Analysis: Given that the 4 groups differed by two factors (time and group), we conducted separate ANOVAs on all eight bins for rats in the early and late conditions. [2] Target Bin Analysis: Given our a priori predictions regarding Bins 2 and 3 for the early rats and Bins 6 and 7 for the late rats, we made between-group comparisons at these target bins for each group, and used the other group’s respective target bins as comparisons (e.g., Group Early: Target: Bin 2 and Bin 3; Comparison: Bin 6 and Bin 7).
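A between-group comparison at a single target bin can be sketched as follows. This is an illustration under assumptions, not the authors' analysis code: the function name `target_bin_comparison` and the magazine-time values are hypothetical, and the effect-size conversion d = 2t/√df is one common formula. The d values reported in this article appear consistent with that conversion (for example, t(14) = 2.71 gives d of about 1.45), but the article does not state which formula was used.

```python
import math
from statistics import mean, variance

def target_bin_comparison(ox, ctl):
    """Independent-samples t test (pooled variance) with an effect size.

    ox, ctl: per-rat magazine time (s) within one 5-s bin for each group.
    Returns (t, df, d), where d = 2t / sqrt(df); using this conversion
    is an assumption about how d was obtained.
    """
    n1, n2 = len(ox), len(ctl)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(ox) + (n2 - 1) * variance(ctl)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Positive t means the CTL group responded more than the OX group.
    t = (mean(ctl) - mean(ox)) / se
    d = 2 * t / math.sqrt(df)
    return t, df, d

# Hypothetical magazine times (s) at a target bin, eight rats per group.
# These numbers are invented for illustration and are not data from the article.
early_ox = [1.0, 1.4, 0.8, 1.9, 1.2, 0.6, 1.5, 1.1]
early_ctl = [2.1, 1.8, 2.9, 1.5, 2.4, 3.0, 1.7, 2.2]

t, df, d = target_bin_comparison(early_ox, early_ctl)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

With eight rats per group the test has 14 degrees of freedom, matching the t(14) values reported for the target and comparison bins.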

Results

Element training

Evidence for early vs. late timing. The left panel of Fig. 3 shows average time spent in the magazine during probes of Elements X-Early, X-Late, B and the Pre-CS (occurring before all 3 trial types) on the last session of element training (Session 19). During element and compound training, we collapsed across groups to calculate a mean Pre-CS period since this measure did not differ reliably across groups. Of importance, we hoped to illustrate sufficient evidence of timing in the early and late conditions. The timing functions of the early and late rats differed reliably, with responses in each condition peaking during bins associated with past US delivery. This was confirmed by a 2 (Group: Early vs. Late) X 8 (Time: Bins 1–8) mixed design ANOVA, with Time as the repeated factor, conducted on data from the last session of element training, which revealed a main effect of time, F(3, 60) = 4.22, p < .001, η² = .12, and an interaction between group and time, F(7, 210) = 4.16, p < .001, η² = .13.

Another important finding from the element training probes was that Elements X+ and A+ elicited far more nose pokes than Elements B− and the Pre-CS. We conducted a 4 (Element: X vs. A vs. B vs. Pre-CS) X 2 (Stimulus: High Tone vs. Low Tone) X 2 (Gender: Male vs. Female) X 2 (Group: OX vs. CTL) factorial ANOVA with Mag Time as the outcome variable. As expected, there was a significant main effect of element, F(3) = 13.4, p < .001, η² = .41, but all other main effects and interactions were nonsignificant, all Fs < 1.20, ps > .05. Within the main effect of element, post hoc analysis (Tukey’s HSD) revealed that both X (M = 3.1 s, SD = 1.7 s) and A (M = 2.8 s, SD = 1.3 s), which did not reliably differ, p > .05, elicited higher responding than B (M = .88 s, SD = .5 s), p < .05, or the Pre-CS (M = .95 s, SD = .65 s), p < .05, which did not reliably differ, p > .05.

Compound training

Again, comparable time periods for early and late, Bins 2–3 and Bins 6–7, respectively, were tested. To test for any preexisting relationships between groups during Phase 2, we conducted a 2 (Stimulus: High Tone vs. Low Tone) X 2 (Gender: Male vs. Female) X 2 (Group: OX vs. CTL) factorial ANOVA, with Mag Time as the outcome variable, on responding during XA (OX group) vs. X (CTL group). All main effects and interactions were nonsignificant, all Fs < 1.74, ps > .05.


Test 1

Whole-Cue analysis. At Test 1 (not illustrated), Early (OX and CTL) and Late (OX and CTL) groups differed in their timing behavior, but not by group (OX vs. CTL). Mag time during X was averaged across test sessions 21 and 22. A 2 (Group: EarlyOX vs. EarlyCTL) X 8 (Time: Bins 1–8) mixed design ANOVA, with time as the repeated factor, revealed a significant main effect of time, F(2, 28) = 65.8, p < .001; a separate 2 (Group: LateOX vs. LateCTL) X 8 (Time: Bins 1–8) ANOVA also revealed a significant main effect of time, F(2, 28) = 17.8, p < .001. There were no main effects of group, nor an interaction of group × time for either early or late groups, all Fs < 1.98, ps > .05.

Fig. 3. Left: Mag Time during X+ for the Early (top; n = 16) and Late (bottom; n = 16) conditions, as well as B− and the Pre-CS, charted across 5-s bins. The probe was taken on the final session of element training. Right: Mag time during X− at Test 2 for groups Early OX (n = 8), Early CTL (n = 8), Late OX (n = 8), and Late CTL (n = 8). *p < .05 (OX vs. CTL). All data are from non-reinforced probe trials.

Target bin analysis. We examined between-group differences at each individual bin. Comparisons at all target bins and comparison bins were nonsignificant, ts < .52, ps > .05, ds < .19.

Test 2

Whole-Cue analysis. The right panel of Fig. 3 illustrates responding seen during X and A during Test 2 after reacclimation training (XA+; A+). Embedding additional trials of A had a marked effect on the decrement seen during X for the OX groups. Mag Time was averaged across test sessions 29 and 30. For the early group (top right of Fig. 3), a 2 (Group: EarlyOX vs. EarlyCTL) X 8 (Time: Bins 1–8) mixed design ANOVA, with time as the repeated measure, conducted on Mag Time, revealed a significant main effect of time, F(1, 21) = 73, p < .001, η² = .84, a nonsignificant main effect of group, F(1, 14) = 2.87, p = .12, η² = .17, and a nonsignificant interaction, F(7, 98) = .70, p = .60, η² = .05. The late group (bottom right of Fig. 3) responded quite dramatically to reacclimation. A 2 (Group: LateOX vs. LateCTL) X 8 (Time: Bins 1–8) ANOVA revealed a significant main effect of group, F(2, 21) = 4.7, p < .05, η² = .25, time, F(7, 98) = 4.29, p < .001, η² = .23, and a significant interaction of group × time, F(7, 98) = 2.6, p < .01, η² = .15.


Target bin analysis. The target bins for the early groups were Bins 2 and 3. For the early rats (top right of Fig. 3), the OX group nose poked less than the CTL group at Bin 2, t(14) = 2.1, p = .05, d = 1.12, but did not differ reliably at target Bin 3, t(14) = 1.58, p = .13, d = .84. The comparison bins were Bin 6 and Bin 7. The early rats did not differ reliably at comparison Bin 6, t(14) = 1.56, p = .13, d = .83, or at Bin 7, t(14) = 1.4, p = .18, d = .74 (see top right panel of Fig. 3 for more comparisons).

The target bins for the late groups were Bins 6 and 7. For the late groups, OX rats nose poked less than CTL rats at target Bin 6, t(14) = 2.71, p = .01, d = 1.44, but did not differ reliably at target Bin 7, t(14) = .71, p = .48, d = .37. The comparison bins were Bin 2 and Bin 3. The two late groups did not differ reliably at comparison Bin 2, t(14) = 1.36, p = .19, d = .72, or Bin 3, t(14) = .84, p = .41, d = .44 (see bottom right panel of Fig. 3).

Discussion

Using the same parameters as Experiment 1, we failed to detect temporally specific decrements to either early or late portions of X at Test 1; moreover, there was no evidence of any decrement whatsoever to X. Reacclimation (cf. Rescorla, 2007) was conducted to enhance the decrement to X. Following retraining, there was clearer evidence of temporally specific deficits to X in the early group, and a more generalized deficit across X in the late group. Experiment 2 demonstrated a temporally specific reduction in responding sensitive to when the US was delivered during X (early vs. late). While the deficit seen in the Late OX group did appear to be less specific (lower right side of Fig. 3), it is important to note that probes of X during element training revealed that the late group was not timing their responses as accurately as the early group was (left panel of Fig. 3), as would be expected by Weber’s law and scalar expectancy theory (Gibbon, 1977; see the general discussion for more details). Moreover, the timing functions recorded during probe trials in training and Test 2 appear to be a reflection of one another’s scalar properties, with the exception of a deficit in responding during intervals of maximal expectancy of the US for the OX groups.

One limitation to Experiment 2 was that the decrement induced by A seemed to be isolated to the first target bin (Bin 2 for Group Early OX, and Bin 6 for Group Late OX), and not the second (Bin 3 and Bin 7, respectively). The decrement, it seemed, quickly dissipated by the portion of X previously overlapping with the second half of 10-s A. It is possible that the decrement to A was generally incomplete, but a more feasible explanation based on Figs. 2 and 3 is that the rats quickly discovered the missing US and abruptly stopped nose poking (this same behavior is evident during Experiment 1; see Bins US-1 and US-2 in Fig. 2). This abrupt cessation of nose poking was not sensitive to training assignment (OX vs. CTL) but, importantly, was sensitive to the rats’ timing assignment. Abrupt drops in nose poking at the later target bins, if anything, are indicative of good timing to X, as the rats readily noticed when the US had failed to be presented.

General discussion

The present studies examined whether an embedded procedure modified for the OX effect would reveal that elements overpredicting sucrose at specific time periods lead to a timed decrement in responding to X after compound training had occurred. To summarize the two experiments: [1] the timing of animals’ expectations reflected learning about stimulus events in past phases, [2] the decrement induced by the OX effect was sensitive to temporal parameters (e.g., Blaisdell et al., 2001), and [3] additional trials of a competing element, A (cf. Rescorla, 2007), during compound training enhanced the decrement to X (Fig. 3).

The central question of what, precisely, animals encode about each pre-trained element during OX becomes even more tantalizing given the present findings. Much like the timed response peaks described by Kirkpatrick and Church (1998), in the two experiments, the decrement to X peaked in the moments surrounding the US delivery. Williams et al. (2008) were also able to capture a similar timed decrement in the period in which the US was maximally expected to be omitted during conditioned inhibition summation tests. While the decrement in responding during conditioned inhibition is thought to differ from the response readjustment assumed to occur during OX, both lines of inquiry suggest that, at the very least, the temporal locations of US deliveries are readily encoded when compounded with other cues and that the temporal placement influences the timing of CR magnitude.

Our behavioral findings are not at odds with simulations from temporal difference models (Ludvig et al., 2008; Sutton & Barto, 1981; Vogel et al., 2003). Ludvig et al. (2008) recently proposed an extension to the basic TD model that not only predicts a temporally specific OX effect, but also predicts generalization of the response decrement to nearby time steps during the stimulus. The microstimulus TD model assumes a coarsely coded memory trace (e.g., Gaussian in form) is activated with every stimulus onset, including the delivery of reward. Each time step in a stimulus is considered a microstimulus made up of the memory traces of cues present on a trial. Each subsequent microstimulus after stimulus onset becomes progressively wider and lower in maximum peak. This property is consistent with the application of Weber’s law to timing (Cheng, 1992; Cheng & Roberts, 1991; Gibbon, 1977), which predicts a lower signal to noise ratio with timing of longer durations. Consequently, response peaks become progressively wider and lower as the CS-US interval is extended. Our results indicate that response decrements show a similar pattern. In the present studies, Experiment 2 revealed a much broader distribution of response decrement during the Late than Early groups. The interval from onset of X to the US delivery was longer in Experiment 1 (40 s vs. 25 s in Experiment 2), but the OX decrement in Experiment 1 was reliable only within a narrow temporal window (the last 5 s of Element X). The difference in the temporal specificity of the OX effect in Group Late-OX vs. the OX group in Experiment 1 could be driven by (1) In Experiment 1, Element A was sequentially paired with sucrose vs. simultaneously paired in Experiment 2, or (2) In Experiment 1 the US arrived after the termination of the 40-s X, whereas in Experiment 2 the US was embedded within the 40-s X (25 s after onset to be specific). If the non-specific decrement was driven by factor 1, it would be difficult to explain why that did not equally disrupt timing in the early time period for Group Early-OX. The second factor accounts better for the results of Experiments 1 and 2.
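The claim that later microstimuli are wider and lower can be illustrated with a small numerical sketch. It assumes the general form of a microstimulus representation (an exponentially decaying memory trace read out through Gaussian basis functions, each scaled by the current trace height); the function names, the decay rate of 0.985, the basis width of 0.08, and the use of 10 microstimuli are illustrative assumptions, not values taken from the present article.

```python
import math

def microstimulus_levels(t, n_micro=10, decay=0.985, sigma=0.08):
    """Activation of each microstimulus t time steps after stimulus onset.

    The memory trace starts at 1 and decays exponentially. Each microstimulus
    is a Gaussian basis function centred on a different trace height and is
    scaled by the current trace height (all parameter values are assumptions).
    """
    trace = decay ** t
    levels = []
    for i in range(1, n_micro + 1):
        centre = i / n_micro
        gauss = math.exp(-((trace - centre) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi)
        levels.append(gauss * trace)
    return levels

def peak_and_width(index, n_steps=600, **kwargs):
    """Peak activation and number of time steps at or above half-peak."""
    course = [microstimulus_levels(t, **kwargs)[index] for t in range(n_steps)]
    peak = max(course)
    width = sum(1 for v in course if v >= peak / 2)
    return peak, width

# A microstimulus centred high on the trace peaks soon after onset;
# one centred low on the trace peaks much later.
early_peak, early_width = peak_and_width(index=8)  # centre 0.9
late_peak, late_width = peak_and_width(index=2)    # centre 0.3

print(f"early: peak = {early_peak:.3f}, width = {early_width} steps")
print(f"late:  peak = {late_peak:.3f}, width = {late_width} steps")
```

Under these assumptions the late microstimulus has a lower peak and stays active over more time steps than the early one, which is the property the text links to broader response functions at longer intervals.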

Regarding the second factor, there is reason to believe timing would be more accurate when the US follows the termination of the element (Experiment 1) than when it is embedded within the element (Experiment 2). The US delivery following the element would garner more attention (i.e., be more salient) and may work backward to enhance timing of US delivery during the element (i.e., a CS) more than if the events co-occurred. For example, the arrival of a bus I’m waiting for may enhance my representation of the last few seconds of a bell that preceded the bus’ arrival more than if the bus had arrived sometime during the bell. This is consistent with the microstimulus TD model with one added assumption: the salience of an overlapping or nearby event (e.g., reward delivery) influences the height and width of microstimuli within a stimulus. Higher peaks predict more accurate timing. Consequently, in Experiment 1 the US would be expected to be more salient and the response decrement (i.e., moments of prediction error) more temporally specific than the Late-OX group in Experiment 2 (when reward delivery was embedded within the element). It is also possible that data from the Late Group simply reflect a whole-cue decrement, like that predicted by the R-W model. As such, that result does not invalidate the temporal specificity found in the results of Experiment 1 and the Early Group of Experiment 2.

The current experiments do not eliminate generalization decrement (GD) as an alternative to the overexpectation effect. Embedding A in X may cause a deficit in later tests with X, irrespective of pairings of A with the US. We posit that a GD explanation does not fit the current results well for the following reasons. Firstly, despite receiving the same number of compound trials as rats in Experiment 1, the first set of tests in Experiment 2 yielded no differences between the OX and CTL groups. If GD was the culprit behind the response drops during X in Experiment 1, then there should have been a difference during the first test of X in Experiment 2. Secondly, all rats were given initial training with the elements in isolation. In a typical GD effect (e.g., overshadowing), an animal is said to fail to recognize CS A in isolation when previously experienced only in compound trials with another cue (e.g., AX+). In our experiments, animals were given 136 trials with X alone (X+) followed by 32 compound trials of XA+. Furthermore, the decrement to X during test was fairly reliable. Responding was measured across 4 trials for two test days. This seems an unusually long amount of time for the OX group to continue treating X differently. If the data were obtained in one single test trial or the ratio of element to compound trials were lower, then GD would seem a better explanation of the data. Thirdly, if GD were the culprit, the decrement would have been highest in the seconds immediately following detection that A was missing. Our bin by bin analysis found the disruption was not reliable in the 5 s before or after the expected onset of A in Experiment 1. Furthermore, a comparison of data from Trial 1 to the remaining trials (Trials 2–4) did not reveal a reliable difference, F < .60, p > .75, η² = .08. Finally, while using the methodologies of others should never be grounds for eliminating alternative possibilities, it is worth mentioning that several recent publications have ruled out GD as an alternative explanation while others include only the control group used in our experiments (Blaisdell et al., 2001; Kehoe & White, 2004; McNally et al., 2004; Rescorla, 1970, 1999; Sissons & Miller, 2009).

The true mechanism driving OX is still being debated. Although the OX effect is thought to be the direct result of an adjustment in US expectation, much attention has also been given to the fact that the first trial of compound training is an instance of prediction error. Instances of prediction error are common in studies targeting learning and memory whether the animal over anticipates pain (e.g., Blaisdell et al., 2001; Garfield & McNally, 2009; McNally et al., 2004) or the arrival of food (e.g., Rescorla, 2006, 2007). This error shares a special link with the Pavlovian phenomenon of extinction. During trial 1 of extinction, for example, animals learn that an element no longer predicts a US (λ = 0), even though a single US (λ = 1.00) was previously anticipated. The commonality between the OX effect and extinction has not been ignored. Rescorla (2006, 2007) was able to show that the OX effect behaves similarly to extinction by being equally susceptible to memory decay phenomena such as spontaneous recovery and renewal.
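The shared prediction error can be made explicit with the R-W update rule. The worked values below are a sketch that assumes each element had reached asymptote (V = 1) before the manipulation; α and β are the standard R-W learning-rate parameters, whose values are not specified in the present article.

```latex
\Delta V_X = \alpha_X \beta \left( \lambda - \sum V \right)

\text{First OX compound trial: } \lambda = 1,\quad \sum V = V_X + V_A = 2
\;\Rightarrow\; \lambda - \sum V = -1

\text{First extinction trial: } \lambda = 0,\quad \sum V = V_X = 1
\;\Rightarrow\; \lambda - \sum V = -1
```

Under that assumption the error term has the same sign and size in both cases, even though the US is still delivered during OX and is omitted during extinction.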

Conclusion

We originally posited two empirical questions: 1) Can the response decrement during OX exhibit temporal specificity? 2) Is the response decrement sensitive to manipulations of when during a stimulus the OX prediction error occurred? The results of Experiment 1 addressed mainly the first question: is the response decrement maximal during the time surrounding past US delivery? In Experiment 2, embedding A at varying periods of X (Early vs. Late) created different response decrement functions for the two timing groups, addressing the second issue of whether the decrement timing was sensitive to when, during X, the compound manipulation occurred. The ability of animals to encode time as a factor, and utilize associative information across time, further extends predictions derived from temporal difference models. Overexpectation, a member of the cue-competition family, continues to be a reliable, yet perplexing phenomenon in Pavlovian conditioning.


References

Blaisdell, A. P., Dennison, J. C., & Miller, R. R. (2001). Recovery from the overexpectation effect: Contrasting performance-focused and acquisition focused models of retrospective revaluation. Animal Learning & Behavior, 29, 367–380.


Catania, A. C. (1970). Reinforcement schedules and psychophysical judgments: A study of some temporal properties of behavior. In W. N. Schoenfeld (Ed.), The theory of reinforcement schedules (pp. 1–42). New York: Appleton-Century-Crofts.
Cheng, K. (1992). Three psychophysical principles in the processing of spatial and temporal information. In W. K. Honig, & J. G. Fetterman (Eds.), Cognitive aspects of stimulus control (pp. 69–88). England: Erlbaum.
Cheng, K., & Roberts, W. A. (1991). Three psychophysical principles of timing in pigeons. Learning and Motivation, 22, 112–128.
Cole, R. P., Barnet, R. C., & Miller, R. R. (1995). Temporal encoding in trace conditioning. Animal Learning & Behavior, 23, 144–153.
Collins, D. J., & Shanks, D. R. (2006). Summation in causal learning: Elemental processing or configural generalization? The Quarterly Journal of Experimental Psychology, 59, 1524–1534.
Ellison, G. D. (1964). Differential salivary conditioning to traces. Journal of Comparative and Physiological Psychology, 57, 373–380.
Garfield, J. B. B., & McNally, G. P. (2009). The effects of FG7142 on overexpectation of Pavlovian fear conditioning. Behavioral Neuroscience, 123, 75–85.
Gibbon, J. (1977). Scalar expectancy theory and Weber’s law in animal timing. Psychological Review, 84, 279–325.
Gibbon, J., Malapani, C., Dale, C. L., & Gallistel, C. R. (1997). Toward a neurobiology of temporal cognition: Advances and challenges. Current Opinion in Neurobiology, 7, 170–184.
Kamin, L. J. (1954). Traumatic avoidance learning: The effects of CS–US interval with a trace conditioning procedure. Journal of Comparative and Physiological Psychology, 47, 65–72.
Kamin, L. J. (1965). Temporal and intensity characteristics of the conditioned stimulus. In W. F. Prokasy (Ed.), Classical conditioning (pp. 118–147). New York: Appleton-Century-Crofts.
Kamin, L. J., & Gaioni, S. J. (1974). Compound conditioned emotional response conditioning with differentially salient elements in rats. Journal of Comparative and Physiological Psychology, 87, 591–597.
Kehoe, E. J., & White, N. E. (2004). Overexpectation: Response loss during sustained stimulus compounding in the rabbit nictitating membrane preparation. Learning & Memory, 11, 476–483.
Khallad, Y., & Moore, J. (1996). Blocking, unblocking, and overexpectation in autoshaping with pigeons. Journal of the Experimental Analysis of Behavior, 65, 575–591.
Kirkpatrick, K., & Church, R. M. (1998). Are separate theories of conditioning and timing necessary? Behavioral Processes, 44, 163–182.
Kirkpatrick, K., & Church, R. M. (2000). Stimulus and temporal cues in classical conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 26, 206–219.
Kremer, E. F. (1978). The Rescorla–Wagner Model: Losses in associative strength in compound conditioned stimuli. Journal of Experimental Psychology: Animal Behavior Processes, 4, 22–36.
Lattal, K. M., & Nakajima, S. (1998). Overexpectation in appetitive Pavlovian and instrumental conditioning. Animal Learning & Behavior, 26(3), 351–360.
Leising, K. J., Sawa, K., & Blaisdell, A. P. (2007). Temporal integration in Pavlovian appetitive conditioning in rats. Learning & Behavior, 35, 11–18.
Ludvig, E. A., Sutton, R. S., & Kehoe, E. J. (2008). Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation, 20, 3034–3054.
McNally, G. P., Pigg, M., & Weidemann, G. (2004). Blocking, unblocking, and overexpectation of fear: A role for opioid receptors in the regulation of Pavlovian association formation. Behavioral Neuroscience, 118, 111–120.
Ohyama, T., Gibbon, J., Deich, J., & Balsam, P. (1999). Temporal control during maintenance and extinction of keypecking in ring doves. Learning & Behavior, 27, 89–99.
Pavlov, I. P. (1927). Conditioned reflexes. London: Oxford University Press.
Rescorla, R. A. (1970). Reduction in the effectiveness of reinforcement after prior excitatory conditioning. Learning & Motivation, 1, 372–381.
Rescorla, R. A. (1999). Summation and overexpectation with qualitatively different outcomes. Animal Learning & Behavior, 27(1), 50–62.
Rescorla, R. A. (2006). Spontaneous recovery from overexpectation. Learning & Behavior, 34, 13–20.
Rescorla, R. A. (2007). Renewal after overexpectation. Learning & Behavior, 35, 19–26.
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black, & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64–99). New York: Appleton-Century-Crofts.
Roberts, S. (1981). Isolation of an internal clock. Journal of Experimental Psychology: Animal Behavioural Processes, 7, 242–268.
Rosenthal, R., & Rosnow, R. L. (1991). Essentials of behavioral research: Methods and data analysis (2nd ed.). New York: McGraw Hill.
Savastano, H. I., & Miller, R. R. (1998). Time as content in Pavlovian conditioning. Behavioural Processes, 44, 147–162.
Sissons, H. T., & Miller, R. R. (2009). Overexpectation and trial massing. Journal of Experimental Psychology: Animal Behavior Processes, 35, 186–196.
Sutton, R. S., & Barto, A. G. (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88, 135–170.
Vogel, E. H., Brandon, S. E., & Wagner, A. R. (2003). Stimulus representation in SOP: II An application to inhibition of delay. Behavioural Processes, 62, 27–48.
Williams, D. A., Johns, J. W., & Brindas, M. (2008). Timing during inhibitory conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 34, 237–246.