Identifying sources of Operating System Jitter through fine-grained kernel instrumentation

September 19, 2007 © 2007 IBM Corporation

Pradipta De, Ravi Kothari, Vijay Mann
IBM Research, New Delhi, India

IEEE Cluster 2007


IBM India Research Laboratory

Measuring OS Jitter: Problem

OS Jitter: interference due to scheduling of daemons and handling of interrupts

Can cause up to 100% performance degradation at 4,096 processors (Petrini et al., SC '03)

Low jitter systems: Use of specialized kernels on compute nodes

Identification of jitter sources is important for

creation of lightweight versions of commodity operating systems

tuning “out of the box” commodity OSes for HPC applications

detecting new jitter sources that get introduced with software upgrades

Very little information available about the biggest contributors to OS Jitter

Few tools available that can measure impact of various sources of OS Jitter

Administrators resort to existing knowledge for tuning systems

this is error prone, as new sources of OS Jitter get introduced when systems are patched


Contributions of this paper

Design and implementation of a tool that can be used to

identify sources of OS jitter and measure their impact

compare various system configurations in terms of their jitter impact

detect new sources of jitter that get introduced with time

detect patterns of scheduling (that can lead to jitter)

Experimental results that point to the biggest contributors of OS jitter on off-the-shelf Linux run level 3 (Fedora Core 5)

Validation of the methodology through introduction of synthetic daemons


Methodology

1. Instrument the kernel to record start and end times (in memory) for all processes and interrupts – a kernel patch (Linux 2.6.17.7 and 2.6.20.7 kernels)

2. Expose the kernel data structures to user-level applications through a device driver memory map

3. Run a user-level micro-benchmark that executes several rounds

4. Analyze the user-level histogram and the scheduler/interrupt trace data

generate a master histogram whose samples consist of:

- the runtime of each process that caused the user-level benchmark to be descheduled

- the runtime of each interrupt that occurred while the user-level benchmark was running

Ideally, the user-level histogram and the master histogram should match (if every interruption experienced by the user-level benchmark is due to a context switch or an interrupt being handled)


User level micro benchmark (pseudocode)

/* Step 1: start of kernel-level tracing.
   iptr, sptr => memory-mapped pointers to the interrupt and scheduler
   device driver files */
iptr = mmap(interrupt_device_file);
sptr = mmap(scheduler_device_file);
start_scheduler_index = sptr->current_index;
start_interrupt_index = iptr->current_index;

/* Step 2: tight timestamp loop (critical section) */
for i = 0 to N do
    ts[i] = rdtsc();
end for

end_scheduler_index = sptr->current_index;
end_interrupt_index = iptr->current_index;

/* dump the scheduler and interrupt traces recorded during the loop */
for start_scheduler_index : end_scheduler_index do
    read_and_print_to_file(start time, end time, process name);
end for

for start_interrupt_index : end_interrupt_index do
    read_and_print_to_file(start time, end time, interrupt name);
end for

/* Step 3: differences of successive samples - timestamp deltas */
for i = 0 to N-1 do
    ts[i] = ts[i+1] - ts[i];
end for

/* generate the user level histogram from the user-level delay data */
add_to_distribution(ts);


Experiments – 2.8 GHz Intel Xeon, 512 KB Cache, 1GB RAM

Experiment 1 – identifying all sources of jitter on Linux run level 3

x axis – interruption duration in us; y axis – log10 of the number of samples in a bucket (frequency)

[Figure: Parzen window distributions, log10[F(X)] vs. X (time in us, 1-10000): master_from_tracing (top) and user_level_benchmark (bottom)]


Experiment 1 – overall picture

Noise Source  | Lowest Interruption (us) | Highest Interruption (us) | Frequency | Mean (us) | Total jitter (us) | Std dev (us) | Total jitter %
xfs           | 36.34   | 36.86   | 1     | 36.49  | 36.49      | 36.49  | 0.0020
runparts      | 160.76  | 164.43  | 1     | 164.24 | 164.24     | 64.24  | 0.0101
smartd        | 57.44   | 81.97   | 3     | 73.08  | 219.25     | 81.35  | 0.0147
atd           | 30.07   | 134.31  | 3     | 132.09 | 396.28     | 45.53  | 0.0285
cupsd         | 117.92  | 349.48  | 2     | 236.06 | 472.12     | 563.9  | 0.0219
syslogd       | 48.1    | 58.32   | 24    | 51.36  | 1232.68    | 73.01  | 0.0627
crond         | 6.01    | 164.43  | 18    | 91.21  | 1641.8     | 60.71  | 0.0954
watchdog0     | 0.92    | 7.67    | 735   | 5.47   | 4022.33    | 1.58   | 0.2229
hald          | 348.69  | 363.31  | 13    | 353.48 | 4595.28    | 435.05 | 0.2519
kedac         | 4.7     | 110.16  | 705   | 9.71   | 6843.7     | 7.98   | 0.3878
kjournald     | 0.92    | 154.15  | 181   | 39.61  | 7169.62    | 31.18  | 0.4035
kblockd0      | 5.83    | 220.17  | 82    | 99.11  | 8127.18    | 98.97  | 0.4562
init          | 5.67    | 160.75  | 156   | 56.65  | 8836.82    | 10.1   | 0.4932
idmapd        | 6.27    | 364.31  | 147   | 71.07  | 10446.99   | 64.51  | 0.5850
pdflush       | 4.93    | 220.17  | 161   | 100.7  | 16213.35   | 89.97  | 0.9063
sendmail      | 5.19    | 364.31  | 159   | 146.69 | 23323.57   | 14.47  | 1.3087
automount     | 4.98    | 173.85  | 156   | 162.72 | 25383.71   | 703.52 | 1.418
eth0          | 7.12    | 80.35   | 1122  | 24.17  | 27115.4    | 84.66  | 1.5115
events0       | 4.65    | 101.33  | 433   | 81.18  | 35152.02   | 52.23  | 1.9644
ide1          | 10.92   | 64.74   | 3364  | 21.57  | 72569.8    | 56.24  | 4.0512
haldaddonstor | 4.52    | 364.31  | 1522  | 61.03  | 92892.55   | 708.57 | 5.193
python        | 4.52    | 220.17  | 1337  | 106.58 | 142494.64  | 85.53  | 7.9659
hidd          | 4.52    | 364.31  | 3404  | 49.17  | 167375.21  | 50.97  | 9.3534
timer         | 9.74    | 1042.05 | 76997 | 14.7   | 1131522.36 | 828.1  | 63.27


Experiment 1 – who contributes where

zooming in on peaks around 11-13 us

[Figure: stacked bar chart of Total Events vs. mean time for a bucket (11.1-12.9 us); sources: others, events0, ide1, timer, haldaddonstor, hidd, python]


Experiment 1 – who contributes where

zooming in on peaks around 100-110 us

[Figure: stacked bar chart of Total Events vs. mean time for a bucket (101-107 us); sources: others, events0, ide1, timer, haldaddonstor, hidd, python]


Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM

Experiment 2 – introduction of synthetic daemons to verify the methodology

A. one synthetic daemon with a period of 10 seconds and an execution time of ~2300 us

B. two synthetic daemons – one with a period of 10 seconds and one with a period of 10.5 seconds, each with an execution time of ~2300 us

C. two synthetic daemons – both with a period of 10 seconds and an execution time of ~1100 us each


Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM

Experiment 2A – introduction of synthetic daemons to verify the methodology

one synthetic daemon with a period of 10 seconds and an execution time of ~2300 us

comparison of the master distribution with that of the default run level 3 reveals the synthetic daemon

[Figure: Parzen window distributions, log10[F(X)] vs. X (time in us, 10-1000): 1_synthetic_daemon (top) and def_run_level3 (bottom)]


Experiment 2A – zooming in on the peak around 2000-3000 us

[Figure: stacked bar chart of Total Events vs. mean time for a bucket (2,308-2,480 us); the peak is dominated by dummydaemon_1, with contributions from python, hidd, rpc.idmapd, init, and kedac]


Experiment 2B – introduction of synthetic daemons to verify the methodology

two synthetic daemons – one with a period of 10 seconds and one with a period of 10.5 seconds, each with an execution time of ~2300 us

comparison of the master distribution with that of the default run level 3 reveals the synthetic daemons

[Figure: Parzen window distributions, log10[F(X)] vs. X (time in us, 10-1000): 2_daemons_diff_period (top) and def_run_level3 (bottom)]


Experiment 2B – zooming in on the peak around 2000-3000 us

[Figure: stacked bar chart of Total Events vs. mean time for a bucket (2,185-2,416 us); the peak is dominated by dummydaemon_1 and dummydaemon_2, with contributions from hidd, python, sendmail, watchdog0, and events0]


Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM

Experiment 2C – introduction of synthetic daemons to verify the methodology

two synthetic daemons – both with a period of 10 seconds and an execution time of ~1100 us each

[Figure: Parzen window distributions, log10[F(X)] vs. X (time in us, 10-1000): 2_daemons_same_period (top) and def_run_level3 (bottom)]


Experiment 2C – zooming in on the peak around 2000-3000 us

[Figure: stacked bar chart of Total Events vs. mean time for a bucket (2,148-2,359 us); buckets correspond to dummydaemon_1 and dummydaemon_2 running back to back, together with init, rpc.idmapd, haldaddonstor, hidd, kedac, and watchdog0]


Conclusions and Future Work

Design and implementation of a tool that can be used to

identify sources of OS jitter and measure their impact

compare various system configurations in terms of their jitter impact

detect new sources of jitter that get introduced with time

detect patterns of scheduling (that can lead to jitter)

Future Work and Work in Progress

The jitter traces collected using the tool provide valuable information:

they can be used to build a jitter model that is representative of a particular configuration

they can help predict the scaling behavior of a particular cluster running a particular OS (and a particular configuration)


Thank you!