Red Hat Customer Convergence
RED HAT CONFIDENTIAL | DOUGLAS SHAKSHOBER#rhconvergence
RED HAT ENTERPRISE LINUX:
PERFORMANCE ENGINEERING
PERFORMANCE UPDATE RHEL 6/7
Douglas Shakshober
Senior Consulting Software Engineer
February 6, 2014
Red Hat Performance Engineering
Benchmarks – code path coverage
CPU – linpack, lmbench
Memory – lmbench, McCalpin Streams
Disk IO – IOzone, aiostress – SCSI, FC, iSCSI
Filesystem – IOzone, postmark – ext3/4, xfs, gfs2, gluster
Network – Netperf – 10 Gbit, 40 Gbit IB, PCI3
Bare Metal, RHEL6/7 KVM
White box AMD/Intel, with our OEM partners
Red Hat Performance Engineering
Application Performance
Linpack MPI, SPECcpu (omp) – single systems, clusters
AIM 7 – single systems, large smp
Database – DB2, Oracle 11G, Sybase 15.x, MySQL, Postgres, Mongo
OLTP – metal/kvm/RHEV-M clusters - TPC-C/virt
DSS – metal/kvm/RHEV-M, IQ, TPC-H/virt
SPECsfs NFS, Postmark
SAP – SLCS, SD
STAC = FSI – trading AMQP, Reuters, Tibco, etc.
Red Hat Performance R7 beta vs R6.5
● RHEL7 partner beta
− Intel in intel_idle driver - control cstate to 1 or 0
− NUMA (numa_balance), scheduler w/ large memory - 12 TB
● Testing:
− CPU Performance Linpack/Stream, Java - SPECjbb
− Iozone Performance w/ various filesystems +/- 3, EXT4 write issue
− Databases (Oracle, Sybase, DB2, MySQL, Postgres, SAP)
● Advanced Performance Tools
− Tuna / Tuned / Perf
− ISV support/request
● KVM new virtualization features
RHEL NUMA Scheduler
● RHEL6
− numactl, numastat enhancements
− numad – usermode tool, dynamically monitor, auto-tune
● RHEL7 beta – numabalance
− 3.10-35 checked in by Rik van Riel
− Derived from Andrea Arcangeli, Mel Gorman, Peter Zijlstra, Ingo M
● Enable / Disable
− echo NUMA > /sys/kernel/debug/sched_features
− echo NO_NUMA > /sys/kernel/debug/sched_features
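Before toggling the feature it can help to read back its current state. The following is a minimal sketch, assuming the sched_features file is a list of space-separated names in which a disabled feature carries a NO_ prefix (as the two echo commands imply). The function name and the sample input are illustrative, not taken from a real system.

```shell
#!/bin/sh
# Report whether the NUMA scheduler feature is on, given the contents of
# /sys/kernel/debug/sched_features on stdin.
# Assumption: features are space-separated; a disabled one is prefixed NO_.
numa_feature_state() {
    if tr ' ' '\n' | grep -qx 'NUMA'; then
        echo enabled
    else
        echo disabled
    fi
}

# Illustrative input (not captured from a real system):
echo 'GENTLE_FAIR_SLEEPERS NO_NUMA' | numa_feature_state   # prints: disabled
```

On a live system the call would be `numa_feature_state < /sys/kernel/debug/sched_features`, which needs root and a mounted debugfs.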
Non-Uniform Memory Access - NUMA
● The Linux system scheduler is very good at maintaining responsiveness and optimizing for CPU utilization
● Tries to use idle CPUs, regardless of where process memory is located.... Using remote memory degrades performance!
● Red Hat is working with the upstream community to increase NUMA awareness of the scheduler and to implement automatic NUMA balancing.
● Remote memory latency matters most for long-running, significant processes, e.g., HPTC, VMs, etc.
How to manage NUMA manually - Checklist
● Research NUMA topology of each system
● Make a resource plan for each system
● Bind both CPUs and Memory
− Might also consider devices and IRQs
● Use numactl for native jobs:
− numactl -N <nodes> -m <nodes> <workload>
● Use numatune for libvirt started guests
− Edit xml: <numatune> <memory mode="strict" nodeset="1-2"/> </numatune>
● Use Cgroups w/ apps to bind cpu/mem to numa nodes
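The numactl step of the checklist can be sketched as a small helper that prints the bind command for review rather than running it. This is only an illustration: the function name and `./my_workload` are hypothetical, and the node list would come from the topology research step (for example from `numactl --hardware`).

```shell
#!/bin/sh
# Print (do not run) a numactl command that binds CPUs and memory to the
# same NUMA nodes, mirroring "numactl -N <nodes> -m <nodes> <workload>".
numa_bind_cmd() {
    nodes=$1
    shift
    printf 'numactl -N %s -m %s' "$nodes" "$nodes"
    printf ' %s' "$@"
    printf '\n'
}

# ./my_workload is a hypothetical program name.
numa_bind_cmd 1 ./my_workload --threads 8
# prints: numactl -N 1 -m 1 ./my_workload --threads 8
```

Printing the command first keeps the resource plan reviewable before anything is pinned.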
Know Your Hardware (hwloc)
Solarflare SFN6322
Numa Performance – Specjbb
[Chart: Multi-instance Java peak SPECjbb2005; multi-instance Java loads fit within 1-node. Series: 3.10-54 nonuma, 3.10-54 numa, numactl. Axes: bops (total), %gain vs noauto.]
Use numastat to see memory layout
● Rewritten for RHEL to show per-node system and process memory information
● 100% compatible with prior version by default, displaying /sys...node<n>/numastat memory allocation statistics
● Any command options invoke new functionality
− -m for per-node system memory info
− <pattern> for per-node process memory info
● See numastat(8)
numastat - java processes w/NUMA-balance on
# numastat -c java (default scheduler – non-optimal)
Per-node process memory usage (in MBs)
PID          Node 0 Node 1 Node 2 Node 3 Total
------------ ------ ------ ------ ------ -----
57501 (java)    755   1121    480    698  3054
57502 (java)   1068    702    573    723  3067
57503 (java)    649   1129    687    606  3071
57504 (java)   1202    678   1043    150  3073
------------ ------ ------ ------ ------ -----
Total          3674   3630   2783   2177 12265

# numastat -c java (numabalance close to opt)
Per-node process memory usage (in MBs)
PID          Node 0 Node 1 Node 2 Node 3 Total
------------ ------ ------ ------ ------ -----
56918 (java)     49   2791     56     37  2933
56919 (java)   2769     76     55     32  2932
56920 (java)     19     55     77   2780  2932
56921 (java)     97     65   2727     47  2936
------------ ------ ------ ------ ------ -----
Total          2935   2987   2916   2896 11734
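The difference between the two listings can be put into one number per process: the share of its memory that sits on its largest node. The sketch below assumes the `numastat -c` layout shown above (PID, name in parentheses, one column per node, then Total). The function name is hypothetical.

```shell
#!/bin/sh
# For each process row of `numastat -c` output on stdin, print the PID,
# the node holding most of its memory, and that node's share of the total.
numastat_locality() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^\(/ {
        best = 3                                  # first node column
        for (i = 4; i < NF; i++)                  # last field is Total
            if ($i + 0 > $best + 0) best = i
        printf "%s node%d %.0f%%\n", $1, best - 3, 100 * $best / $NF
    }'
}

# One row from each listing above:
numastat_locality <<'EOF'
57501 (java)    755   1121    480    698  3054
56918 (java)     49   2791     56     37  2933
EOF
# prints:
# 57501 node1 37%
# 56918 node1 95%
```

With the default scheduler the largest node holds about a third of the process memory. With NUMA balancing it holds about 95%.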
NUMA Performance – Database Single Large DB
[Chart: Postgres Sysbench OLTP, 2-socket Westmere EP 24p/48 GB. Series: 3.10-54 base, 3.10-54 numa, NumaD %. Axes: threads (10, 20, 30), trans/sec.]
Numa Performance – Single Oracle Database
[Chart: RHEL7 vs RHEL6 Oracle OLTP Performance; minimize impact on large single app. Series: RHEL6.4, RHEL6.4 – numad, 3.10-54 numa, 3.10-54 no numa.]
RHEL7 beta Performance Tuning
● RHEL 7 beta potential tuning
− tuned-adm profile throughput-performance
− tuned-adm profile latency-performance (to turn cstate=1)
● NUMAbalance scheduler via
− echo NO_NUMA > /sys/kernel/debug/sched_features
● Adjust dirty ratios back to rhel6 40 and 10
− vm.dirty_ratio = 40
− vm.dirty_background_ratio = 10
RHEL7 Network Features
• Overview of new Networking Features in RHEL7
• Adaptive Tickless (dynticks) Patchset
• BUSY_POLL Socket Option
• Power Management
• Tunable Workqueues
RHEL7 Networks 1/3
● IPv4 Routing Cache, bye-bye
− Reduce overhead for route lookups
● Socket BUSY_POLL (aka low latency sockets)
− Performance numbers later
● 127/8 is (optionally) routable now – for cloud stuff
● 40G NIC support, bottleneck moves back to CPU :-)
● RFS, aRFS, XPS etc
● ipset is included, accelerates complex iptables rules
● netsniff-ng included ... ifpps awesome
RHEL7 Networks 2/3
● SO_REUSEPORT socket option
− Multiple sockets listen on same port, TCP & UDP
● Bufferbloat Avoidance – non-LAN-latency situations
− TCP Small Queues (tcp_limit_output_bytes)
− CoDel and FQ CoDel Packet Schedulers
● TCP Proportional Rate Reduction (PRR)
− Improves reaction time of window scaling, 3-10% range
● TCP connection repair
− To support LXC, stop TCP connection and restart on another host
RHEL7 Networks 3/3
● Performance Co-Pilot Support
− pmatop awesome, also pmcollectl
● Per-cgroup TCP Buffer Limits
− Memory pressure controls for TCP
● Stacked VLANs 802.1ad QinQ Support
− Frame header includes > 1 VLAN tag
● PTP full support in 6.5 and 7.0
− Requires NIC driver enablement
● Chrony offered instead of ntpd (ntpd still included)
New Networking Features in RHEL7
● Linux Containers (LXC) Network Namespaces
− Per-namespace sysctl tunables
● TCP Fast Open socket option
− Combines first 2 steps of handshake
● TCP Tail Loss Probe
− Reduce impact of lost packets (RTO ~ 15%)
RHEL “tuned” package
# yum install tune*
# tuned-adm profile latency-performance
# tuned-adm list
Available profiles:
- latency-performance
- default
- enterprise-storage
- virtual-guest
- throughput-performance
- virtual-host
Current active profile: latency-performance
# tuned-adm profile default (to disable)
“tuned” Profile Summary
Columns, in slide order: default | enterprise-storage | virtual-host | virtual-guest | latency-performance | throughput-performance
Rows that list fewer than six values had blank cells on the slide.
kernel.sched_min_granularity_ns: 4ms | 10ms | 10ms | 10ms | 10ms
kernel.sched_wakeup_granularity_ns: 4ms | 15ms | 15ms | 15ms | 15ms
vm.dirty_ratio: 20% RAM | 40% | 10% | 40% | 40%
vm.dirty_background_ratio: 10% RAM | 5%
vm.swappiness: 60 | 10 | 30
I/O Scheduler (Elevator): CFQ | deadline | deadline | deadline | deadline | deadline
Filesystem Barriers: On | Off | Off | Off
CPU Governor: ondemand | performance | performance | performance
Disk Read-ahead: 4x
Disable THP: Yes
CPU C-States: Locked @ 1
Impact of Power Management on Latency and High Context-Switching Workloads (storage/network)
[Chart: latency (microseconds) by C-state: C6, C3, C1, C0.]
Current status: Network off +/-3%, Storage +/-5%
Future Plans, Impact on Customers
[Chart: netperf RR trans/s. Series: R6 and R7, UDP and TCP, baseline vs lat-perf.]
Adaptive Tickless (DynTicks) Patchset
● Goal of this patchset is to stop interrupting userspace when nr_running=1 (see /proc/sched_debug)
● Idea being that if runqueue depth is 1, then the scheduler should have nothing to do on that core
● Move all timekeeping to non-latency-sensitive cores
● Mark certain cores as full_nohz cores
● In addition to cmdline options full_nohz and rcu_nocbs
− Also need to move RCU threads yourself (pgrep, taskset, tuna)
Precision Time Protocol (IEEE-1588v2)
● Tech Preview in RHEL 6.4, Full Support in 6.5
− Limited driver enablement in 6.4
− 6.5: bnx2x, tg3, e1000e, igb, ixgbe, and sfc
● Improved synchronization accuracy over NTP
− PTP Hardware timestamping most accurate
● Query your NIC's PTP capabilities: ethtool -T p1p1
● Improve time sync by disabling tickless kernel
− nohz=off
− Increased power consumption
Precision Time Protocol (IEEE-1588v2)
[Figure: two panels labeled nohz=off and nohz=on.]
Adaptive Tickless (DynTicks) Patchset
● Reading:
− https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt
− http://lwn.net/Articles/549580/
− http://www.youtube.com/watch?v=G3jHP9kNjwc
Timeline of a tick...tick...tick... RHEL5
[Diagram: timeline from jiffies to jiffies+4. Legend: Userspace Task, Timer Interrupt.]
Timeline of a tick...tick...tick... RHEL6 and 7 CONFIG_NO_HZ
[Diagram: timeline from jiffies to jiffies+4. Legend: Userspace Task, Timer Interrupt, Idle.]
Timeline of a tick...tick...tick... RHEL7 CONFIG_NO_HZ_FULL
[Diagram: timeline from jiffies to jiffies+4. Legend: Userspace Task, Timer Interrupt.]
Tickless doesn't require idle
Examining the tick 1/3
# egrep 'CPU|LOC' /proc/interrupts
# perf list|grep local_timer
irq_vectors:local_timer_entry [Tracepoint event]
irq_vectors:local_timer_exit [Tracepoint event]
Examining the tick 2/3
# perf stat -C 1 -e irq_vectors:local_timer_entry sleep 1
9 irq_vectors:local_timer_entry
# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 /root/pig -s 1
1,002 irq_vectors:local_timer_entry
Reboot with full_nohz=1 rcu_nocbs=1 (upstream 3.10 kernels spell the first option nohz_full=)
# tuna -c 1 -i ; tuna -q \* -c 1 -i
# perf stat -C 1 -e irq_vectors:local_timer_entry taskset -c 1 /root/pig -s 1
5 irq_vectors:local_timer_entry
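Where perf tracepoints are not available, the same tick count can be approximated from the LOC row of /proc/interrupts. This is a rough sketch, assuming the usual layout where the first field is the `LOC:` label followed by one counter per CPU. The function name and the sample counters are illustrative.

```shell
#!/bin/sh
# Print the local-timer (LOC) interrupt count for one CPU.
# $1: CPU index; stdin: contents of /proc/interrupts.
# Field 1 is the "LOC:" label, so CPU n is field n + 2.
loc_count() {
    awk -v cpu="$1" '$1 == "LOC:" { print $(cpu + 2) }'
}

# Illustrative two-CPU sample (counters are made up):
sample='           CPU0       CPU1
LOC:        100       1102   Local timer interrupts'
echo "$sample" | loc_count 1   # prints: 1102
```

Sampling `loc_count 1 < /proc/interrupts` before and after a one-second run and subtracting gives a figure comparable to the perf stat counts above.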
Examining the tick 3/3 (debugfs)
# mount -t debugfs nodev /sys/kernel/debug
# cd /sys/kernel/debug/tracing
# echo 1 > events/irq_vectors/enable
# cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 432/432   #P:8
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
          <idle>-0     [007] dNh. 22793.558298: reschedule_entry: vector=253
          <idle>-0     [007] dNh. 22793.558299: reschedule_exit: vector=253
          <idle>-0     [000] d.h. 22793.558969: local_timer_entry: vector=239
          <idle>-0     [000] d.h. 22793.558977: local_timer_exit: vector=239
          <idle>-0     [000] d.H. 22793.558980: irq_work_entry: vector=246
          <idle>-0     [000] dNH. 22793.558983: irq_work_exit: vector=246
          <idle>-0     [000] d.h. 22793.559970: local_timer_entry: vector=239
          <idle>-0     [000] d.h. 22793.559977: local_timer_exit: vector=239
...
NUMA Topology and PCI Bus
● Servers may have more than 1 PCI bus.
● Install adapters “close” to the CPU that will run the performance critical application.
● When BIOS reports locality, irqbalance handles NUMA/IRQ affinity automatically.
42:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
# cat /sys/devices/pci0000\:40/0000\:40\:03.0/0000\:42\:00.0/local_cpulist
1,3,5,7,9,11,13,15
# dmesg | grep "NUMA node"
pci_bus 0000:00: on NUMA node 0 (pxm 1)
pci_bus 0000:40: on NUMA node 1 (pxm 2)
pci_bus 0000:3f: on NUMA node 0 (pxm 1)
pci_bus 0000:7f: on NUMA node 1 (pxm 2)
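To check whether a planned CPU is "close" to an adapter, the local_cpulist value can be expanded and searched. This is a small sketch with hypothetical function names. It assumes the kernel's cpulist syntax of comma-separated numbers and ranges such as `0-3,8`.

```shell
#!/bin/sh
# Expand a kernel cpulist such as "0-3,8" into space-separated CPU numbers.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        seq "$lo" "${hi:-$lo}"       # a single number is a range of one
    done | paste -sd' ' -
}

# Succeed if CPU $1 appears in cpulist $2.
cpu_is_local() {
    for c in $(expand_cpulist "$2"); do
        [ "$c" -eq "$1" ] && return 0
    done
    return 1
}

# local_cpulist value shown above for the ConnectX-3 adapter:
expand_cpulist "1,3,5,7,9,11,13,15"   # prints: 1 3 5 7 9 11 13 15
cpu_is_local 4 "1,3,5,7,9,11,13,15" || echo "CPU 4 is remote to the NIC"
```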
Performance Projects / Tooling
● RHEL6.5 “numad” “tuna”, and “tuned”
● Tuna used to bind IRQ's / real-time like isolation
● Profiling challenges
− Data address profiling (cache-2-cache detection), providing:
• the hottest contended cachelines
• the process names, addresses, pids, tids causing that contention
• the cpus they ran on
• how the cacheline is being accessed (read or write)
Iozone Performance Effect of TUNED
[Chart: RHEL6.4 File System In Cache Performance, Intel Large File I/O (iozone). Throughput in MB/sec for ext3, ext4, xfs, gfs2; series: not tuned, tuned.]
[Chart: RHEL6.4 File System Out of Cache Performance, Intel Large File I/O (iozone). Throughput in MB/sec for ext3, ext4, xfs, gfs2; series: not tuned, tuned.]
System Tuning Tool - tuna
• Tool for fine grained control
• Display applications / processes
• Displays CPU enumeration
• Socket (useful for NUMA tuning)
• Dynamic control of tuning
• Process affinity
• Parent & threads
• Scheduling policy
• Device IRQ priorities, etc
Network Tuning: IRQ affinity
● irqbalance for the common case – disable to tune
● New irqbalance automates NUMA affinity for IRQs
● Flow-Steering Technologies
● Move 'p1p1*' IRQs to Socket 1:
# service irqbalance stop
# tuna -q p1p1* -S1 -m -x
# tuna -Q | grep p1p1
● Manual IRQ pinning for the last X percent/determinism
● Guide on Red Hat Customer Portal
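For manual pinning without tuna, the first step is finding which IRQ numbers belong to the interface. This is a minimal sketch, assuming the device name is the last column of each /proc/interrupts row. The function name and the sample rows are illustrative.

```shell
#!/bin/sh
# List IRQ numbers whose last column in /proc/interrupts starts with a
# device prefix such as "p1p1". stdin: contents of /proc/interrupts.
irqs_for_device() {
    awk -v dev="$1" 'index($NF, dev) == 1 { sub(/:$/, "", $1); print $1 }'
}

# Illustrative rows (counters and IRQ numbers are made up):
sample=' 98:   1200      0   PCI-MSI-edge   p1p1-TxRx-0
 99:      0   3400   PCI-MSI-edge   p1p1-TxRx-1
100:     55      0   PCI-MSI-edge   em1'
echo "$sample" | irqs_for_device p1p1
# prints 98 and 99, one per line
```

Each number can then be pinned by writing a CPU list to /proc/irq/<n>/smp_affinity_list, with irqbalance stopped so the setting is not overwritten.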
Tuna IRQ/CPU affinity context menus
[Screenshot callouts: CPU affinity for IRQs, CPU affinity for PIDs, Scheduler Policy, Scheduler Priority.]
RHEL6.5 and RHEL7 Virt Performance
RHEL 6.5
● Virtio dataplane, 4TB mem limit
RHEL 7
● NUMA balance code
● KVM pvticketed_spinlocks, APICv
Large Guest Perf
● NUMA in a guest, APICv, New 4TB mem limit
RHEV 3.3 (Based on RHEL 6.5)
● New memory overcommit manager – MOM
● Network QOS, Native Gluster (libgfapi)
RHEL7 w/ ticketed spinlocks 3.10.0-12.el7 pvticketlocks.x86_64 – note R6 unfair-locks
[Chart: Linpack NxN 20000x20016, Westmere 12core, 64 GB mem, pvticketlocks. gflops at 1, 2, 4, 8, 12; series: Bare-metal, noticketed, pvticketed, %diff.]
RH 3.10 OLTP Performance
[Chart: R7 / F17 OLTP w/ spinlock backoff (perf74, 4-socket, 512 GB, 2 FC clarion). TPM for RHEL63 – all nodes, 3.6.0-0.24.autonuma28fast.test.x86_64, 3.6.10-2.tlw16upstream.fc17.x86_64; series: 80U, 100U, delta.]
RH/IBM Top virtualized benchmarks
● SPECvirt2010/2012
● IBM SAP SD 2-tier bare metal / virtualized results
− IBM System x3850 X5, 4 socket 40 core 80 thread system
− Bare metal 12,560 SD users, KVM (80 CPU guest) 10,700
− 85% of bare metal
● IBM TPC-C – World Record w/ DB2
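The 85% figure follows directly from the two SD user counts. The arithmetic as a small sketch (the function name is hypothetical):

```shell
#!/bin/sh
# Percent of bare-metal throughput retained under virtualization.
# $1: virtualized result; $2: bare-metal result.
pct_of_bare_metal() {
    awk -v v="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * v / b }'
}

pct_of_bare_metal 10700 12560   # prints: 85.2
```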
Virtualization Benchmarks
SPECvirt_sc2013
− Increased workload injection rates
− Multi vcpu guests
• All one vcpu guests in SPECvirt_sc2010
− Up to four tiles using the same database VM
TPC-VMS
− Three independent TPC-C, TPC-H, TPC-E, or TPC-DS benchmarks running simultaneously
− Metric is lowest of the three scores
− Large vcpu count guests
− Large disk IO requirements
SPECvirt2010: RHEL 6 KVM Post Industry Leading Results
http://www.spec.org/virt_sc2010/results/
[Diagram: Client Hardware and System Under Test (SUT), virtualization layer and hardware; blue = disk I/O, green = network I/O. > 1 SPECvirt Tile/core.]
Key Enablers: SR-IOV, Huge Pages, NUMA, Node Binding
Best SPECvirt_sc2010 Scores by CPU Cores (As of May 30, 2013)
System                               Cores   VMs   SPECvirt_sc2010 score
VMware ESX 4.1, HP DL380 G7             12    78   1,221
RHEL 6 (KVM), IBM HS22V                 12    84   1,367
VMware ESXi 5.0, HP DL385 G7            16   102   1,570
RHEV 3.1, HP DL380p gen8                16   150   2,442
VMware ESXi 4.1, HP BL620c G7           20   120   1,878
RHEL 6 (KVM), IBM HX5 w/ MAX5           20   132   2,144
VMware ESXi 4.1, HP DL380 G7            12   168   2,742
VMware ESXi 4.1, IBM x3850 X5           40   234   3,824
RHEL 6 (KVM), HP DL580 G7               40   288   4,682
RHEL 6 (KVM), IBM x3850 X5              64   336   5,467
RHEL 6 (KVM), HP DL980 G7               80   552   8,956
Chart groups: 2-socket 16, 2-socket 12, 2-socket 20, 4-socket 40, 8-socket 64/80
Comparison based on best performing Red Hat and VMware solutions by cpu core count published at www.spec.org as of May 17, 2013. SPEC® and the benchmark name SPECvirt_sc® are registered trademarks of the Standard Performance Evaluation Corporation. For more information about SPECvirt_sc2010, see www.spec.org/virt_sc2010/.
KVM / RHS Tuning
● gluster volume set <volume> group virt
● XFS mkfs -n size=8192, mount inode64, noatime
● RHS server: tuned-adm profile rhs-virtualization
− Increase in readahead, lower dirty ratios
● KVM host: tuned-adm profile virtual-host
● Better response time: shrink guest block device queue
− /sys/block/vda/queue/nr_requests (16 or 8)
● Best sequential read throughput: raise VM read-ahead
− /sys/block/vda/queue/read_ahead_kb (4096/8192)
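The two guest queue settings can be applied with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, and `SYSFS_ROOT` is an invented override that lets the helper be tried against a scratch directory before it touches the real /sys. The two goals on the slide pull in different directions, so the values passed should match the workload being tuned.

```shell
#!/bin/sh
# Set queue depth and read-ahead for a guest block device.
# $1: device (e.g. vda); $2: nr_requests; $3: read_ahead_kb.
# SYSFS_ROOT is a hypothetical override for dry runs; default is /sys.
tune_guest_queue() {
    q="${SYSFS_ROOT:-/sys}/block/$1/queue"
    echo "$2" > "$q/nr_requests" && echo "$3" > "$q/read_ahead_kb"
}

# Dry run against a temporary tree:
SYSFS_ROOT=$(mktemp -d)
mkdir -p "$SYSFS_ROOT/block/vda/queue"
tune_guest_queue vda 16 4096
cat "$SYSFS_ROOT/block/vda/queue/nr_requests"     # prints: 16
cat "$SYSFS_ROOT/block/vda/queue/read_ahead_kb"   # prints: 4096
```

Inside a real guest the call would be run as root with `SYSFS_ROOT` unset. Settings written this way do not persist across reboots.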
Iozone Performance Comparison RHS2.1/XFS w RHEV
[Chart: rnd-write, rnd-read, seq-write, seq-read; series: Out-of-the-box, tuned rhs-virtualization.]
RHEL6 Performance Tuning Summary
● Use “Tuned”, “NumaD” and “Tuna” in RHEL6.x
− Tuned selects the deadline IO elevator
− Power savings mode (performance), locked (latency)
− Transparent Hugepages for anon memory (monitor it)
− Multi-instance: consider NUMAD
− Virtualization – virtio drivers, consider SR-IOV
● Manually Tune
− NUMA – via numactl, monitor numastat -c pid
− Huge Pages – static hugepages for pinned shared-memory
− Managing VM, dirty ratio and swappiness tuning
− Use cgroups for further access control
● Perf and Tuna examples in appendix
Helpful Links
● Red Hat Low Latency Performance Tuning Guide
● Optimizing RHEL Performance by Tuning IRQ Affinity
● Red Hat Performance Tuning Guide
● Red Hat Virtualization Tuning Guide
● STAC Network I/O SIG
● Finteligent Low Latency Tuning w/KVM
Questions