September 19, 2007 © 2007 IBM Corporation
Identifying Sources of Operating System Jitter Through Fine-Grained Kernel Instrumentation
Pradipta De, Ravi Kothari, Vijay Mann
IBM Research, New Delhi, India
IEEE Cluster 2007
IBM India Research Laboratory
Measuring OS Jitter: Problem
OS jitter: interference caused by the scheduling of daemons and the handling of interrupts
Can cause up to 100% performance degradation at 4096 processors (Petrini et al., SC '03)
Low-jitter systems: use of specialized kernels on compute nodes
Identification of jitter sources is important for
  creating lightweight versions of commodity operating systems
  tuning "out of the box" commodity OSes for HPC applications
  detecting new jitter sources that get introduced with software upgrades
Very little information is available about the biggest contributors to OS jitter
Few tools are available that can measure the impact of the various sources of OS jitter
Administrators rely on existing knowledge when tuning systems
  error-prone, as new sources of OS jitter get introduced when systems are patched
Contributions of this paper
Design and implementation of a tool that can be used to
identify sources of OS jitter and measure their impact
compare various system configurations in terms of their jitter impact
detect new sources of jitter that get introduced with time
detect patterns of scheduling (that can lead to jitter)
Experimental results that point to the biggest contributors to OS jitter on off-the-shelf Linux at run level 3 (Fedora Core 5)
Validation of the methodology through introduction of synthetic daemons
Methodology
1. Instrument the kernel to record start and end times (in memory) for all processes and interrupts, via a kernel patch (Linux 2.6.17.7 and 2.6.20.7 kernels)
2. Expose the kernel data structures to user-level applications through a device driver memory map
3. Run a user-level micro benchmark that executes several rounds
4. Analyze the user-level histogram and the scheduler/interrupt trace data
   - generate a master histogram whose samples consist of
     - the runtime of every process that caused the user-level benchmark to be descheduled
     - the runtime of every interrupt that occurred while the user-level benchmark was running
   - ideally, the user-level histogram and the master histogram should match (if every interruption experienced by the user-level benchmark is due to a context switch or an interrupt being handled)
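The analysis in step 4 can be sketched in a few lines of Python. This is only an illustration under assumptions, not the authors' tool: it assumes each trace record is a `(start_us, end_us, name)` tuple, that the benchmark appears in the scheduler trace under a known name, and that every other process seen during the run counts as an interruption. The names `master_samples` and `histogram` are hypothetical.

```python
from collections import Counter


def master_samples(sched_trace, irq_trace, benchmark_name):
    """Collect (source, duration_us) samples for the master histogram.

    Assumption: trace records are (start_us, end_us, name) tuples and cover
    only the interval in which the benchmark was running.
    """
    samples = []
    # Every process other than the benchmark descheduled the benchmark.
    for start, end, name in sched_trace:
        if name != benchmark_name:
            samples.append((name, end - start))
    # Every interrupt handled during the run also interrupted the benchmark.
    for start, end, name in irq_trace:
        samples.append((name, end - start))
    return samples


def histogram(durations_us, bucket_us=1.0):
    """Count samples per bucket; keys are bucket indices (duration // width)."""
    counts = Counter(int(d // bucket_us) for d in durations_us)
    return dict(sorted(counts.items()))
```

Comparing the output of `histogram` for the master samples with the same function applied to the user-level timestamp deltas is one way to check how closely the two distributions match.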
/* iptr and sptr are memory-mapped pointers to the interrupt and scheduler device driver files */
iptr = mmap(interrupt_device_file);
sptr = mmap(scheduler_device_file);

/* start of kernel-level tracing: note the current position in each trace buffer */
start_scheduler_index = sptr->current_index;
start_interrupt_index = iptr->current_index;

/* critical section: take N+1 back-to-back timestamps */
for i = 0 to N do
    ts[i] = rdtsc();
end for

/* end of kernel-level tracing */
end_scheduler_index = sptr->current_index;
end_interrupt_index = iptr->current_index;

/* dump the trace records collected during the critical section */
for each record in start_scheduler_index : end_scheduler_index do
    read_and_print_to_file(start time, end time, process name);
end for
for each record in start_interrupt_index : end_interrupt_index do
    read_and_print_to_file(start time, end time, interrupt name);
end for

/* timestamp deltas: difference of successive samples, N deltas stored in ts[0..N-1] */
for i = 0 to N-1 do
    ts[i] = ts[i+1] - ts[i];
end for

/* generate the user-level histogram from the user-level delay data */
add_to_distribution(ts);
Listing: User-level micro benchmark (slide callouts on the listing: Step 1, Step 2, Step 3)
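The timestamp-delta part of the benchmark can be tried without the kernel patch. The sketch below is an approximation, not the benchmark from the slides: it substitutes Python's `time.perf_counter_ns()` for `rdtsc()`, so interpreter overhead dominates the smallest deltas, and the helper names and the threshold parameter are assumptions made for illustration.

```python
import time


def sample_timestamps(n):
    """Take n+1 back-to-back timestamps (stand-in for the rdtsc() loop)."""
    ts = [0] * (n + 1)
    for i in range(n + 1):
        ts[i] = time.perf_counter_ns()
    return ts


def deltas_us(ts):
    """Differences of successive samples, converted from ns to microseconds."""
    return [(b - a) / 1000.0 for a, b in zip(ts, ts[1:])]


def interruptions(deltas, threshold_us):
    """Keep only deltas longer than the loop's normal iteration time."""
    return [d for d in deltas if d > threshold_us]
```

A typical use would be `interruptions(deltas_us(sample_timestamps(1_000_000)), threshold_us=10.0)`, with the threshold chosen from the dominant peak of the delta distribution on the machine at hand.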
Experiments – 2.8 GHz Intel Xeon, 512 KB Cache, 1GB RAM
Experiment 1 – identifying all sources of jitter on Linux run level 3
x axis – interruptions in us, y axis – logarithmic function of the number of samples in a bucket (frequency)
[Figure: two panels, each titled "Parzen Window distribution for multiple processes": master_from_tracing and user_level_benchmark. x axis: X, time in us (log scale, 1 to 10000); y axis: log10[F(X)] (-4 to 1)]
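For readers unfamiliar with Parzen window estimates, the sketch below shows one common form. The slides do not state the kernel or the window width, so the Gaussian kernel, the bandwidth `h`, and the floor used before taking the logarithm are all assumptions.

```python
import math


def parzen_density(samples, x, h):
    """Parzen window estimate of the density at x.

    Assumption: Gaussian kernel with bandwidth h (same units as the samples).
    """
    n = len(samples)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)


def log10_curve(samples, xs, h, floor=1e-4):
    """log10 of the estimate at each x, clipped at `floor` to avoid log(0)."""
    return [math.log10(max(parzen_density(samples, x, h), floor)) for x in xs]
```

Evaluating `log10_curve` on logarithmically spaced points between 1 and 10000 us gives a curve with the same axes as the plots on this slide.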
Experiment 1 – overall picture
Noise Source   Lowest Interruption (us)  Highest Interruption (us)  Frequency  Mean (us)  Total jitter (us)  Std dev (us)  Total jitter %
timer          9.74                      1042.05                    76997      14.7       1131522.36         828.1         63.27
hidd           4.52                      364.31                     3404       49.17      167375.21          3450.97       9.35
python         4.52                      220.17                     1337       106.58     142494.64          5985.53       7.96
haldaddonstor  4.52                      364.31                     1522       61.03      92892.55           3708.57       5.19
ide1           10.92                     64.74                      3364       21.57      72569.8            1256.24       4.05
events0        4.65                      101.33                     433        81.18      35152.02           4452.23       1.96
eth0           7.12                      80.35                      1122       24.17      27115.4            1584.66       1.51
automount      4.98                      173.85                     156        162.72     25383.71           8703.52       1.41
sendmail       5.19                      364.31                     159        146.69     23323.57           8714.47       1.30
pdflush        4.93                      220.17                     161        100.7      16213.35           6389.97       0.90
idmapd         6.27                      364.31                     147        71.07      10446.99           5064.51       0.58
init           5.67                      160.75                     156        56.65      8836.82            3210.1        0.49
kblockd0       5.83                      220.17                     82         99.11      8127.18            6298.97       0.45
kjournald      0.92                      154.15                     181        39.61      7169.62            3531.18       0.40
kedac          4.7                       110.16                     705        9.71       6843.7             787.98        0.38
hald           348.69                    363.31                     13         353.48     4595.28            19435.05      0.25
watchdog0      0.92                      7.67                       735        5.47       4022.33            291.58        0.22
crond          6.01                      164.43                     18         91.21      1641.8             5460.71       0.09
syslogd        48.1                      58.32                      24         51.36      1232.68            2773.01       0.06
cupsd          117.92                    349.48                     2          236.06     472.12             19563.9       0.02
atd            30.07                     134.31                     3          132.09     396.28             8545.5        0.02
smartd         57.44                     81.97                      3          73.08      219.25             4781.35       0.01
runparts       160.76                    164.43                     1          164.24     164.24             0             0.01
xfs            36.34                     36.86                      1          36.49      36.49              0             0.002
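A per-source summary with these columns can be derived from the trace samples along the following lines. This is a sketch under assumptions: the samples are `(source, duration_us)` pairs, the percentage is each source's share of the summed durations, and the standard deviation is the population form. The slides do not say how their figures were computed, so the output need not reproduce the table exactly.

```python
from collections import defaultdict
import statistics


def summarize(samples):
    """Build one summary row per noise source, largest contributor first."""
    by_source = defaultdict(list)
    for name, duration_us in samples:
        by_source[name].append(duration_us)
    grand_total = sum(duration_us for _, duration_us in samples)
    rows = []
    for name, durations in by_source.items():
        total = sum(durations)
        rows.append({
            "source": name,
            "lowest": min(durations),
            "highest": max(durations),
            "frequency": len(durations),
            "mean": total / len(durations),
            "total": total,
            "std_dev": statistics.pstdev(durations),  # assumption: population std dev
            "percent": 100.0 * total / grand_total,
        })
    rows.sort(key=lambda row: row["percent"], reverse=True)
    return rows
```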
Experiment 1 – who contributes where
zooming in on peaks around 11-13 us
[Figure: total events per bucket, broken down by source. x axis: mean time for a bucket (microsec), 11.1 to 12.9; y axis: total events, 0 to 7,000; legend: others, events0, ide1, timer, haldaddonstor, hidd, python]
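The per-bucket breakdown behind a zoomed view like this one can be sketched as follows. The bucket width, the `(source, duration_us)` sample layout, and the function name are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def contributors_by_bucket(samples, lo_us, hi_us, width_us):
    """Count events per source in each bucket of the range [lo_us, hi_us).

    Returns {bucket_start_us: {source: count}} for non-empty buckets only.
    """
    buckets = defaultdict(Counter)
    for name, duration_us in samples:
        if lo_us <= duration_us < hi_us:
            index = int((duration_us - lo_us) // width_us)
            buckets[index][name] += 1
    return {lo_us + i * width_us: dict(c) for i, c in sorted(buckets.items())}
```

Calling it with `lo_us=11.0, hi_us=13.0` and a small bucket width would give the data for the 11-13 us view; the same call with `100.0, 110.0` would cover the peaks around 100-110 us.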
Experiment 1 – who contributes where
zooming in on peaks around 100-110 us
[Figure: total events per bucket, broken down by source. x axis: mean time for a bucket (microsec), 101 to 107; y axis: total events, 0 to 80; legend: others, events0, ide1, timer, haldaddonstor, hidd, python]
Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM
Experiment 2 – Introduction of synthetic daemons to verify methodology
A. one synthetic daemon with a period of 10 seconds and an execution time of ~2300 us
B. two synthetic daemons, one with a period of 10 seconds and another with a period of 10.5 seconds, each having an execution time of ~2300 us
C. two synthetic daemons, both with a period of 10 seconds and an execution time of ~1100 us each
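A synthetic daemon of this kind can be approximated with a busy loop and a sleep. The slides do not show how the authors built theirs, so the sketch below is an assumption throughout: it spins on `time.perf_counter_ns()` for the requested execution time and sleeps for the rest of the period, and the function names are hypothetical.

```python
import time


def busy_for_us(duration_us):
    """Spin on the CPU for at least duration_us; return the time actually spent (us)."""
    start = time.perf_counter_ns()
    target_ns = duration_us * 1000
    while time.perf_counter_ns() - start < target_ns:
        pass
    return (time.perf_counter_ns() - start) / 1000.0


def run_daemon(period_s, exec_us, rounds):
    """Run `rounds` bursts of work, one per period; return the burst lengths (us)."""
    spent = []
    for _ in range(rounds):
        spent.append(busy_for_us(exec_us))
        # Sleep for the remainder of the period before the next burst.
        time.sleep(max(0.0, period_s - exec_us / 1e6))
    return spent
```

Configuration A would correspond roughly to `run_daemon(10.0, 2300, rounds)`; configurations B and C would run two such processes with the periods and execution times listed above.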
Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM
Experiment 2A – Introduction of synthetic daemons to verify methodology
one synthetic daemon with a period of 10 seconds and execution time of ~2300 us
[Figure: two panels, each titled "Parzen Window distribution for multiple processes": 1_synthetic_daemon and def_run_level3. x axis: X, time in us (log scale, 10 to 1000); y axis: log10[F(X)] (-4 to 1)]
comparison of the master distribution with that of the default run level 3 shows the synthetic daemon
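The comparison between a baseline configuration and a modified one can also be automated. The following sketch is a simplified stand-in for the visual comparison on the slide: it flags buckets that are populated in the candidate run but empty in the baseline. The bucket width and the minimum count are arbitrary assumptions.

```python
from collections import Counter


def bucketize(durations_us, width_us):
    """Map each duration to the start of its bucket and count occurrences."""
    return Counter(int(d // width_us) * width_us for d in durations_us)


def new_peaks(baseline_us, candidate_us, width_us=100.0, min_count=5):
    """Bucket starts that hold at least min_count samples only in the candidate."""
    base = bucketize(baseline_us, width_us)
    cand = bucketize(candidate_us, width_us)
    return sorted(b for b, c in cand.items()
                  if c >= min_count and base.get(b, 0) == 0)
```

With a default run level 3 trace as the baseline, a daemon with an execution time of ~2300 us would be expected to show up as a new bucket near 2300 us.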
Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM
Experiment 2A – Introduction of synthetic daemons to verify methodology
one synthetic daemon with a period of 10 seconds and execution time of ~2300 us
zooming in on the peak around 2000-3000 us
[Figure: total events per bucket. x axis: mean time for a bucket (microsec), 2,308.281 to 2,479.753; y axis: total events, 0 to 30; legend (as extracted): dummydaemon_1_python_dummydaemon_1_hidd_dummydaemon_1_rpc.idmapd_init_hidd_kedac_dummydaemon_1_dummydaemon_1_]
Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM
Experiment 2B – Introduction of synthetic daemons to verify methodology
two synthetic daemons, one with a period of 10 seconds and another with a period of 10.5 seconds, each having an execution time of ~2300 us
[Figure: two panels, each titled "Parzen Window distribution for multiple processes": 2_daemons_diff_period and def_run_level3. x axis: X, time in us (log scale, 10 to 1000); y axis: log10[F(X)] (-4 to 0)]
comparison of the master distribution with that of the default run level 3 shows the synthetic daemon
Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM
Experiment 2B – Introduction of synthetic daemons to verify methodology
two synthetic daemons, one with a period of 10 seconds and another with a period of 10.5 seconds, each having an execution time of ~2300 us
zooming in on the peak around 2000-3000 us
[Figure: total events per bucket. x axis: mean time for a bucket (microsec), 2,184.851 to 2,416.380; y axis: total events, 0 to 30; legend (as extracted): hidd_python_sendmail_sendmail_sendmail_dummydaemon_1_hidd_watchdog0_dummydaemon_1_events0_dummydaemon_1_dummydaemon_1_python_dummydaemon_2_dummydaemon_1_]
Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM
Experiment 2C – Introduction of synthetic daemons to verify methodology
two synthetic daemons, both with a period of 10 seconds and an execution time of ~1100 us each
[Figure: two panels, each titled "Parzen Window distribution for multiple processes": 2_daemons_same_period and def_run_level3. x axis: X, time in us (log scale, 10 to 1000); y axis: log10[F(X)] (-4 to 0)]
Experiments – 2.8 GHz Xeon, 512 KB Cache, 1GB RAM
Experiment 2C – Introduction of synthetic daemons to verify methodology
two synthetic daemons, both with a period of 10 seconds and an execution time of ~1100 us each
[Figure: total events per bucket. x axis: mean time for a bucket (microsec), 2,148.49 to 2,358.83; y axis: total events, 0 to 16; legend (as extracted): dummydaemon_1_dummydaemon_2_init_rpc.idmapd_haldaddonstor_hidd_kedac_dummydaemon_1_dummydaemon_2_dummydaemon_1_dummydaemon_2_hidd_watchdog0_dummydaemon_1_dummydaemon_2_dummydaemon_1_dummydaemon_2_]
Conclusions and Future Work
Design and implementation of a tool that can be used to
identify sources of OS jitter and measure their impact
compare various system configurations in terms of their jitter impact
detect new sources of jitter that get introduced with time
detect patterns of scheduling (that can lead to jitter)
Future Work and Work in Progress
The jitter traces collected using the tool provide valuable information
  can be used to model jitter that is representative of a particular configuration
  can help predict the scaling behavior of a particular cluster running a particular OS (and a particular configuration)