Model Building and Model Selection (advanced)


Overview
• What is a model?
• How the use of models drives science
• Models of forgetting
– The model
– The history
– Some problems
• Criterion Variability
• RT
• Solutions
– Ratcliff Diffusion
– DSDT
• Model Selection
– What we look for: "The simplest best explanation of the data"
– Complexity vs Fit
– Fit
• Maximum likelihood / Deviance
• Model selection
– AIC
– BIC

What is a model
• A somewhat simplified explanation of reality.
– Possesses many of the characteristics of the phenomena it represents
– Is, however, of a manageable and examinable "size"
– Think of a model airplane

What is a Quantitative model

• Mathematical instantiations of key assumptions and principles embodied in the theory from which it evolved.

• A formalization of a theory that enables the exploration of its operation.

Why Modeling?
• To infer the underlying structural properties of a mental process from behavioral data thought to have been generated by that process.
• We often entertain multiple models as possible explanations of observed data.

Forgetting
• People forget information over time.
– When we inspect the data over time, we almost ubiquitously see a non-linear, negatively accelerating decline
– Defining this decline has been the quest of experimental psychology for over 100 years
• THE FORGETTING CURVE

The forgetting curve

$R(t, \theta) = a + b\, f(t, \theta)$

Issues to Consider
• 1. The rate at which memories fade over time
– The rate question
• Decay of memory traces or interference
• 2. Do all memories completely fade over time, or are some memories permanent or at least very long lasting?
– The fate question

Models of the Forgetting Curve

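[Equations: two candidate forgetting functions R(t, θ), one with asymptote a in the form a + (1 − a)(…) and one annotated "weight".]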

Models and parameters

$R(t, \theta) = a + (1 - a)\,(\text{scale})\, e^{-(\text{rate})\, t}$

• a = asymptote
• scale
• rate

Typical forgetting models: parameters
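To make the roles of asymptote, scale, and rate concrete, here is a minimal Python sketch of three retention functions. The exact parameterizations are an assumption (common asymptote-plus-scaled-decay forms), and the names `b`, `alpha`, `beta`, and `gamma` are hypothetical labels rather than symbols taken from the slides.

```python
import math

# Hypothetical parameter names: a = asymptote, b = scale,
# alpha/beta/gamma = rate-type parameters (assumed, not taken from the slides).

def exponential(t, a, b, alpha):
    """Exponential retention with asymptote: a + (1 - a) * b * exp(-alpha * t)."""
    return a + (1.0 - a) * b * math.exp(-alpha * t)

def power(t, a, b, beta):
    """Power retention with asymptote: a + (1 - a) * b * (1 + t) ** -beta."""
    return a + (1.0 - a) * b * (1.0 + t) ** (-beta)

def pareto(t, a, b, gamma, beta):
    """Pareto retention with asymptote: a + (1 - a) * b * (1 + gamma * t) ** -beta."""
    return a + (1.0 - a) * b * (1.0 + gamma * t) ** (-beta)

# Illustrative (made-up) parameter values evaluated at a few lags.
lags = [0, 1, 5, 20, 60]
exp_curve = [exponential(t, a=0.2, b=0.9, alpha=0.1) for t in lags]
pow_curve = [power(t, a=0.2, b=0.9, beta=0.5) for t in lags]
par_curve = [pareto(t, a=0.2, b=0.9, gamma=0.3, beta=0.5) for t in lags]
```

In this sketch every curve starts at a + (1 − a)·b when t = 0 and declines towards the asymptote a; with gamma = 1 the Pareto form reduces to the power form.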

Fits to existing data

[Figure: Rubin, Hinton and Wenzel (1999). Retention probability as a function of lag, showing observed data with Pareto, power, and exponential fits.]

[Figure: Wixted and Ebbesen (1997). Proportion correct as a function of time (s) for high-learning and low-learning conditions, showing data with exponential, power, and Pareto fits.]

Model Selection Techniques: 2 types of analysis

Fixed Effect Models: Maximum Likelihood
1. Deviance
2. BIC
3. AIC

Random Effect Models: Hierarchical Bayesian Analysis via MCMC methods
1. Posterior Deviance
2. BICM
3. DIC

Maximum Likelihood Estimation

Model Selection Techniques

p = number of parameters

Deviance (D)

$D = -2 \max_{\theta} \sum_{i=1}^{7} \left[ n_i \ln p_i + (N_i - n_i) \ln(1 - p_i) \right]$

Akaike Information Criterion (AIC)

$AIC = D + 2p$

Bayesian Information Criterion (BIC)

$BIC = D + p \ln \sum_i N_i$
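The deviance, AIC, and BIC for binomial retention data can be sketched in a few lines of Python. This is a minimal illustration, assuming the predicted probabilities passed in are already those of the maximum likelihood fit; the function names and the example counts are hypothetical.

```python
import math

def binomial_deviance(n_correct, n_trials, p_pred):
    """-2 times the binomial log-likelihood (dropping the constant binomial
    coefficient), given predicted probabilities p_pred at each lag."""
    log_lik = 0.0
    for n, N, p in zip(n_correct, n_trials, p_pred):
        log_lik += n * math.log(p) + (N - n) * math.log(1.0 - p)
    return -2.0 * log_lik

def aic(deviance, n_params):
    """AIC = D + 2p."""
    return deviance + 2 * n_params

def bic(deviance, n_params, n_trials):
    """BIC = D + p * ln(total number of observations)."""
    return deviance + n_params * math.log(sum(n_trials))

# Made-up counts at three lags, with predictions assumed to come from an ML fit.
n_correct = [45, 30, 22]
n_trials = [50, 50, 50]
p_pred = [0.9, 0.6, 0.45]

D = binomial_deviance(n_correct, n_trials, p_pred)
print(D, aic(D, 2), bic(D, 2, n_trials))
```

Because BIC weights each parameter by the log of the total number of observations, it penalizes complexity more heavily than AIC once that total exceeds about 7 (ln N > 2).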

Posterior Deviance Measures of Model Choice

pV = effective number of parameters, pV = Var(D)/2

Bayesian Information Criterion Monte Carlo (BICM; Raftery, 2007)

$BIC_M = \bar{D} + p_V \left( \ln \sum_i N_i - 1 \right)$

Akaike Information Criterion Monte Carlo (AICM)

$AIC_M = \bar{D} + p_V$

Deviance Information Criterion

$DIC = \hat{D} + 2 p_D$

$p_D = \bar{D} - \hat{D}$

The Deviance Information Criterion (DIC; Spiegelhalter et al., 2002) subtracts the deviance at the mean of the posterior parameters ($\hat{D}$) from the mean of the deviance ($\bar{D}$) to estimate the effective number of model parameters (pD).
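As a rough illustration of the DIC description above, the following Python sketch computes pD and DIC from a set of posterior deviance samples and the deviance evaluated at the posterior mean of the parameters. The function name and the sample values are hypothetical; in practice both inputs would come from an MCMC run.

```python
from statistics import mean

def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, pD) where pD = mean deviance - deviance at posterior mean
    and DIC = deviance at posterior mean + 2 * pD."""
    d_bar = mean(deviance_samples)            # mean of the posterior deviance
    p_d = d_bar - deviance_at_posterior_mean  # effective number of parameters
    return deviance_at_posterior_mean + 2.0 * p_d, p_d

# Made-up deviance draws and a made-up deviance at the posterior mean.
samples = [101.2, 103.5, 102.1, 104.0, 102.7]
dic_value, p_d = dic(samples, 100.9)
print(dic_value, p_d)
```

The same DIC can be written as the mean deviance plus pD, which makes the "fit plus complexity penalty" structure explicit.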

Measures of Model Choice: Raftery et al. (2007)

Deviance

$D(\theta) = -2 \log p(y \mid \theta)$

pV = effective number of parameters, pV = Var(D)/2

Bayesian Information Criterion Monte Carlo (BICM)

$BICM = -2\,\ell_{\max} + p_V \log(n)$

Akaike Information Criterion Monte Carlo (AICM)

$AICM = -2\,\ell_{\max} + 2 p_V$
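A minimal Python sketch of these posterior deviance measures follows. It assumes that −2 times the maximum log-likelihood is estimated from the samples as the mean deviance minus pV (the posterior-simulation estimate used by Raftery et al.); the function name and inputs are hypothetical.

```python
from math import log
from statistics import mean, variance

def posterior_deviance_measures(deviance_samples, n_obs):
    """Compute pV, BICM and AICM from posterior deviance samples.

    Assumption: -2 * l_max is estimated as mean(D) - pV.
    """
    d_bar = mean(deviance_samples)
    p_v = variance(deviance_samples) / 2.0  # sample variance of D, halved
    neg2_lmax = d_bar - p_v                 # estimated -2 * max log-likelihood
    bicm = neg2_lmax + p_v * log(n_obs)
    aicm = neg2_lmax + 2.0 * p_v
    return {"pV": p_v, "BICM": bicm, "AICM": aicm}

# Made-up deviance draws for a data set with 150 observations.
measures = posterior_deviance_measures([101.2, 103.5, 102.1, 104.0, 102.7], 150)
print(measures)
```

Under that assumption the two ways of writing the criteria agree: BICM equals the mean deviance plus pV·(ln n − 1), and AICM equals the mean deviance plus pV.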

Rubin Hinton and Wenzel (1999)

Rubin, Hinton & Wenzel (1999) Model Selection BICM

[Figure: BICM by model (Exp3, W, Par, Pow2, Pow3, Exp2), with Monte Carlo error.]

Rubin, Hinton & Wenzel (1999) Model Selection AICM

[Figure: AICM by model (Exp3, W, Par, Pow2, Pow3, Exp2).]

Rubin, Hinton & Wenzel (1999) Complexity pV

[Figure: pV by model (Exp3, W, Par, Pow2, Pow3, Exp2).]

Wixted and Ebbesen (1991)

Wixted & Ebbesen (1991) Model Selection BICM

[Figure: BICM by model (Exp3, Pow2, W, Par, Pow3, Exp2).]

Inclusion of asymptote strongly aids the exponential

Wixted & Ebbesen (1991) Complexity pV

[Figure: pV by model (Exp3, Pow2, W, Par, Pow3, Exp2).]

Averell and Heathcote (2006)

Averell and Heathcote (2006) Model Selection BICM

[Figure: BICM by model (Exp3, Pow3, W, Par, Pow2, Exp2).]

Averell and Heathcote (2006) Complexity pV

[Figure: pV by model (Exp3, Pow3, W, Par, Pow2, Exp2).]

Posterior Model Probability

As BICM is a Bayes factor estimate, it can be used to get posterior model probabilities:

$p(M_i) = \dfrac{e^{-BICM_i / 2}}{\sum_m e^{-BICM_m / 2}}$
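The conversion from BICM values to posterior model probabilities can be sketched as below. This is an illustrative Python version that assumes equal prior probabilities across models; the function name is hypothetical, and subtracting the smallest BICM before exponentiating is only a numerical-stability step that leaves the probabilities unchanged.

```python
import math

def posterior_model_probabilities(bicm_values):
    """Turn a list of BICM values (one per model) into probabilities
    proportional to exp(-BICM / 2), assuming equal prior model probabilities."""
    best = min(bicm_values)
    # Shift by the best (smallest) value to avoid underflow in exp().
    weights = [math.exp(-(b - best) / 2.0) for b in bicm_values]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up BICM values for three competing models.
probs = posterior_model_probabilities([26.4, 27.1, 28.9])
print(probs)
```

The model with the lowest BICM receives the highest probability, and a BICM difference of 2 between two models corresponds to odds of about e ≈ 2.7 to 1.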

For individuals and on average, two-parameter power and exponential models are favoured.

[Figure: Implicit: pBIC by model (Eab, Ea, Pab, Pa, Wb, W), for each subject and on average. Subjects with best model (out of 32): Ea 12, Pa 20.]

[Figure: Explicit: pBIC by model (Eab, Ea, Pab, Pa, Wb, W), for each subject and on average. Subjects with best model (out of 32): Ea 17, Pa 11, W 4.]

Average ML Fits

[Figure: Average p(Correct) as a function of minutes, with panels for the asymptote power, asymptote exponential, and Pareto models.]

Misfit was systematic across individuals

Average ML Fits: Pareto and Asymptote Exponential

[Figure: All subjects. p(Correct) as a function of lag (minutes), showing data with exponential and Pareto fits.]

• p(Exponential > Pareto) = 0.78
• pV(Exponential) = 2.83
• pV(Pareto) = 2.37

Asymptote greatly aids the exponential

Averell and Heathcote (2011)


Posterior Predictive Distributions
