Errors in Experimental Measurements

52
1 Errors in Experimental Measurements by tarungehlot Sources of errors Accuracy, precision, resolution A mathematical model of errors Confidence intervals For means For proportions How many measurements are needed for desired error?

Transcript of Errors in Experimental Measurements

1

Errors in Experimental Measurements by tarungehlot Sources of errors Accuracy, precision, resolution A mathematical model of errors Confidence intervals

For means For proportions

How many measurements are needed for desired error?

2

Why do we need statistics?1. Noise, noise, noise, noise, noise!

OK – not really this type of noise

3

Why do we need statistics?

2. Aggregate data into meaningful information.445 446 397 226

388 3445 188 100247762 432 54 1298 345 2245 883977492 472 565 9991 34 882 545 4022827 572 597 364

...x

4

What is a statistic? “A quantity that is computed

from a sample [of data].” Merriam-Webster

→ A single number used to summarize a larger collection of values.

5

What are statistics? “A branch of mathematics dealing

with the collection, analysis, interpretation, and presentation of masses of numerical data.”

Merriam-Webster → We are most interested in analysis and interpretation here.

“Lies, damn lies, and statistics!”

6

Goals Provide intuitive conceptual

background for some standard statistical tools.

Draw meaningful conclusions in presence of noisy measurements.

Allow you to correctly and intelligently apply techniques in new situations.

→ Don’t simply plug and crank from a formula.

7

Goals Present techniques for

aggregating large quantities of data.

Obtain a big-picture view of your results.

Obtain new insights from complex measurement and simulation results.

→ E.g. How does a new feature impact the overall system?

8

Sources of Experimental Errors Accuracy, precision, resolution

9

Experimental errors Errors → noise in measured values Systematic errors

Result of an experimental “mistake” Typically produce constant or slowly varying bias

Controlled through skill of experimenter

Examples Temperature change causes clock drift Forget to clear cache before timing run

10

Experimental errors Random errors

Unpredictable, non-deterministic Unbiased → equal probability of increasing

or decreasing measured value Result of

Limitations of measuring tool Observer reading output of tool Random processes within system

Typically cannot be controlled Use statistical tools to characterize and

quantify

11

Example: Quantization → Random error

12

Quantization error Timer resolution → quantization error

Repeated measurements X ± ΔCompletely unpredictable

13

A Model of ErrorsError Measured

valueProbability

-E x – E ½

+E x + E ½

14

A Model of ErrorsError 1 Error 2 Measured

valueProbability

-E -E x – 2E ¼

-E +E x ¼

+E -E x ¼

+E +E x + 2E ¼

15

A Model of ErrorsProbability

00.10.20.30.40.50.6

x-E x x+EM easured value

16

Probability of Obtaining a Specific Measured Value

17

A Model of Errors Pr(X=xi) = Pr(measure xi)= number of paths from real value to xi

Pr(X=xi) ~ binomial distribution As number of error sources becomes large n ,→ ∞ Binomial → Gaussian (Normal)

Thus, the bell curve

18

Frequency of Measuring Specific Values

Mean of measured values

True valueResolution

Precision

Accuracy

19

Accuracy, Precision, Resolution

Systematic errors → accuracy How close mean of measured values is to true value

Random errors → precision Repeatability of measurements

Characteristics of tools → resolution Smallest increment between measured values

20

Quantifying Accuracy, Precision, Resolution Accuracy

Hard to determine true accuracy Relative to a predefined standard

E.g. definition of a “second” Resolution

Dependent on tools Precision

Quantify amount of imprecision using statistical tools

21

Confidence Interval for the Mean

c1 c2

1-α

α/2 α/2

22

Normalize x

1)(deviation standard

mean

tsmeasuremen ofnumber /

n1i

21

nxx

s

x x

nnsxxz

i

n

ii

23

Confidence Interval for the Mean Normalized z follows a Student’s t distribution (n-1) degrees of freedom Area left of c2 = 1 – α/2 Tabulated values for t

c1 c2

1-α

α/2 α/2

24

Confidence Interval for the Mean As n → ∞, normalized distribution becomes Gaussian (normal)

c1 c2

1-α

α/2 α/2

25

Confidence Interval for the Mean

1)Pr(Then,

21

1;2/12

1;2/11

cxc

nstxc

nstxc

n

n

26

An ExampleExperiment Measured value

1 8.0 s2 7.0 s3 5.0 s4 9.0 s5 9.5 s6 11.3 s7 5.2 s8 8.5 s

27

An Example (cont.)

14.2deviation standard sample

94.71

sn

xx

ni i

28

An Example (cont.)

90% CI → 90% chance actual value in interval

90% CI → α = 0.10 1 - α /2 = 0.95

n = 8 → 7 degrees of freedom

c1 c2

1-α

α/2 α/2

29

90% Confidence Intervala

n 0.90 0.95 0.975

… … … …5 1.47

62.015

2.571

6 1.440

1.943

2.447

7 1.415

1.895

2.365

… … … …∞ 1.28

21.645

1.960

4.98

)14.2(895.194.7

5.68

)14.2(895.194.7

895.195.02/10.012/1

2

1

7;95.01;

c

c

tta

na

30

95% Confidence Intervala

n 0.90 0.95 0.975

… … … …5 1.47

62.015

2.571

6 1.440

1.943

2.447

7 1.415

1.895

2.365

… … … …∞ 1.28

21.645

1.960

7.98

)14.2(365.294.7

1.68

)14.2(365.294.7

365.2975.02/10.012/1

2

1

7;975.01;

c

c

tta

na

31

What does it mean? 90% CI = [6.5, 9.4]

90% chance real value is between 6.5, 9.4

95% CI = [6.1, 9.7] 95% chance real value is between 6.1, 9.7

Why is interval wider when we are more confident?

32

Higher Confidence → Wider Interval?

6.5 9.4

90%

6.1 9.7

95%

33

Key Assumption Measurement errors are Normally distributed.

Is this true for most measurements on real computer systems?

c1 c2

1-α

α/2

α/2

34

Key Assumption Saved by the Central Limit TheoremSum of a “large number” of values from any

distribution will be Normally (Gaussian) distributed.

What is a “large number?” Typically assumed to be >≈ 6 or 7.

35

How many measurements? Width of interval inversely proportional to √n

Want to minimize number of measurements

Find confidence interval for mean, such that: Pr(actual mean in interval) = (1 – α) xexecc )1(,)1(),( 21

36

How many measurements?

22/1

2/1

2/1

21 )1(),(

exszn

exnsz

nszx

xecc

37

How many measurements? But n depends on knowing mean and standard deviation!

Estimate s with small number of measurements

Use this s to find n needed for desired interval width

38

How many measurements? Mean = 7.94 s Standard deviation = 2.14 s Want 90% confidence mean is within 7% of actual mean.

39

How many measurements? Mean = 7.94 s Standard deviation = 2.14 s Want 90% confidence mean is within 7% of actual mean.

α = 0.90 (1-α/2) = 0.95 Error = ± 3.5% e = 0.035

40

How many measurements?

9.212)94.7(035.0)14.2(895.12

2/1

exszn

213 measurements→ 90% chance true mean is within ± 3.5% interval

41

Proportions p = Pr(success) in n trials of binomial experiment

Estimate proportion: p = m/n m = number of successes n = total number of trials

42

Proportions

nppzpc

nppzpc

)1(

)1(

2/12

2/11

43

Proportions How much time does processor spend in OS?

Interrupt every 10 ms Increment counters

n = number of interrupts m = number of interrupts when PC within OS

44

Proportions How much time does processor spend in OS?

Interrupt every 10 ms Increment counters

n = number of interrupts m = number of interrupts when PC within OS

Run for 1 minute n = 6000 m = 658

45

Proportions

)1176.0,1018.0(6000)1097.01(1097.096.11097.0

)1(),( 2/121

n

ppzpcc

95% confidence interval for proportion So 95% certain processor spends 10.2-11.8% of its time in OS

46

Number of measurements for proportions

2

22/1

2/1

2/1

)()1(

)1(

)1()1(

peppzn

nppzpe

nppzppe

47

Number of measurements for proportions How long to run OS experiment? Want 95% confidence ± 0.5%

48

Number of measurements for proportions How long to run OS experiment? Want 95% confidence ± 0.5% e = 0.005 p = 0.1097

49

Number of measurements for proportions

102,247,1

)1097.0(005.0)1097.01)(1097.0()960.1(

)()1(

2

2

2

22/1

peppzn

10 ms interrupts→ 3.46 hours

50

Important Points Use statistics to

Deal with noisy measurements Aggregate large amounts of data

Errors in measurements are due to: Accuracy, precision, resolution of tools

Other sources of noise→ Systematic, random errors

51

Important Points: Model errors with bell curve

True value

Precision

Mean of measured valuesResolution

Accuracy

52

Important Points Use confidence intervals to quantify

precision Confidence intervals for

Mean of n samples Proportions

Confidence level Pr(actual mean within computed interval)

Compute number of measurements needed for desired interval width