VMware ESX Guest Disk IO
Knowing the state of your disk IO latency in VMware ESX can help you pre-empt performance and capacity issues before they occur. There are a few guidelines you should keep in mind. These notes are directed at people using directly attached storage.
- Write latency should be close to 0 ms, because you have that fancy battery-backed controller caching writes, right?
- Read latency should be under 8ms.
- Use the smallest stripe size possible for your RAID array setting. This helps keep random IO performance acceptable at the cost of some sequential performance.
- Do not virtualise very heavy random IO workloads on shared arrays; other guest VMs won't like you for it.
- Unless you have a very compelling reason not to, use RAID 10.
Some other notes, specific to Linux guests:
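If you already collect latency numbers somewhere, the two latency guidelines above can be turned into a simple check. This is only a minimal sketch: the function name is made up for illustration, and the 1 ms write limit is an assumption standing in for "should be 0".

```python
# Thresholds based on the guidelines above.
READ_LATENCY_LIMIT_MS = 8.0
# Assumption: "write latency should be 0" is treated as "under 1 ms".
WRITE_LATENCY_LIMIT_MS = 1.0


def check_latency(read_ms, write_ms):
    """Return a list of warning strings for one latency sample (hypothetical helper)."""
    warnings = []
    if read_ms >= READ_LATENCY_LIMIT_MS:
        warnings.append(
            f"read latency {read_ms:.1f} ms is at or above {READ_LATENCY_LIMIT_MS:.0f} ms"
        )
    if write_ms >= WRITE_LATENCY_LIMIT_MS:
        warnings.append(
            f"write latency {write_ms:.1f} ms suggests the write cache is not absorbing writes"
        )
    return warnings


# Example: one healthy sample and one unhealthy sample.
print(check_latency(3.0, 0.0))
print(check_latency(12.5, 2.0))
```

An empty list means the sample is within both limits.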
- Mount file systems with noatime and nodiratime; this stops every read from triggering an access-time write, which helps reduce random IO.
- Allocate enough memory to have some buffers.
- Do anything possible to stop your VM swapping heavily (see point above).
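To see which file systems in a guest still update access times, you can scan the mount table. The sketch below parses text in the format of /proc/mounts; the function name, the list of file system types and the sample data are all assumptions chosen for illustration. On Linux, noatime also covers directories, so the check only looks for that option.

```python
def mounts_without_noatime(mounts_text, fs_types=("ext2", "ext3", "ext4", "xfs", "reiserfs")):
    """Return mount points of disk file systems that lack the noatime option.

    mounts_text is expected to look like /proc/mounts:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in fs_types and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """\
/dev/sda1 / ext3 rw,noatime,nodiratime 0 0
/dev/sdb1 /var ext3 rw 0 0
proc /proc proc rw 0 0
"""

print(mounts_without_noatime(SAMPLE))
```

On a real guest you would pass in the contents of /proc/mounts instead of the sample string, then add noatime to the matching entries in /etc/fstab.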
As with any system, great monitoring and performance trending gives you an excellent overview of your infrastructure. Even if you don’t have external systems for performance trending, the VMware Infrastructure client with a few tweaks will display the data you want to see.
- Log in to the VI Client.
- Click on an object in the left navigation tree.
- Click on the Performance tab at the top of the main display pane.
- Click the “Change Chart Options” button.
- Select the Disk chart option from the left expanding menu.
- Now change the counters: tick the Latency counters and Number counters, and un-tick the KBps counters.
- Save the chart settings as disk-latency.
Now you can view in real time what is happening with your disk IO on the VMware ESX server. If you are more familiar with the command line and Linux, you can SSH into the ESX service console (COS) and use the esxtop command to view disk performance information.
- Launch esxtop (as root)
- Press “v”
- Press “s” and then “1”
Now you can see the per-VM disk usage counters, with a 1-second sample period.
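If you want to keep these numbers rather than just watch them, esxtop also has a batch mode: something like `esxtop -b -d 1 -n 60 > esxtop.csv` writes 60 one-second samples as CSV. The sketch below averages every column whose header contains a marker string. The marker "MilliSec/" and the column names in the sample are assumptions, not taken from real output, so check the header row of your own file and adjust them.

```python
import csv
import io


def average_latency_columns(csv_text, marker="MilliSec/"):
    """Average every CSV column whose header contains marker (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return {}
    wanted = [i for i, name in enumerate(header) if marker in name]
    totals = {header[i]: 0.0 for i in wanted}
    rows = 0
    for row in reader:
        if len(row) < len(header):
            continue  # skip truncated lines
        rows += 1
        for i in wanted:
            totals[header[i]] += float(row[i])
    if rows == 0:
        return {}
    return {name: total / rows for name, total in totals.items()}


# Made-up sample data; real esxtop headers are longer and will differ.
SAMPLE_CSV = (
    '"Time","vm1 MilliSec/Read","vm1 Reads/sec"\n'
    '"00:00:01","2.0","150"\n'
    '"00:00:02","4.0","170"\n'
)

print(average_latency_columns(SAMPLE_CSV))
```

Feeding the averages into whatever trending system you have gives you a history to compare against the 8 ms read guideline.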
These rules of thumb are also applicable to Xen and Hyper-V.