What is iowait and how does it affect Linux performance?

I/O wait, also shown as iowait, wait, wa, %iowait, or wait%, is frequently reported by command-line Linux system monitoring tools such as top, sar, atop, and others. On its own, it is one of many performance statistics that give us insight into Linux system performance.

I/O wait came up in a recent discussion with a new client of mine. During our support call, they reported load spikes of 60 to 80 on their 32 CPU core system. These resulted in slow page loads, timeouts, and intermittent outages. The cause? A storage I/O bottleneck, initially hinted at by consistently high iowait and later confirmed with additional investigation.

What is I/O wait? How does I/O wait affect Linux server performance? How can we monitor and reduce I/O wait related issues? Continue reading for the answers to these questions.

Figure: iostat iowait example

What is I/O Wait?

I/O wait applies to Unix and all Unix-based systems, including macOS, FreeBSD, Solaris, and Linux.

I/O wait (iowait) is the percentage of time that the CPU (or CPUs) were idle during which the system had pending disk I/O requests. (Source: man sar.) The top man page gives a simpler explanation: “I/O wait = time waiting for I/O completion.” In other words, the presence of I/O wait tells us that the system is idle at a time when it could be processing outstanding requests.

“iowait shows the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.” (Source: iostat man page.)

When using Linux top and other tools, you will notice that a CPU (and its cores) operates in the following states: us (user), sy (system), id (idle), ni (nice), si (software interrupts), hi (hardware interrupts), st (steal), and wa (wait). Together, these states add up to 100%, and on most systems the user, system, idle, and wait values account for nearly all of it. Note that “idle” and “wait” are not the same. An “idle” CPU means that no workload is present, whereas “wait” (iowait) indicates that the CPU is idle while waiting on outstanding requests.

If the CPU is idle, the kernel checks for any pending I/O requests (e.g., to SSD or NFS) originating from that CPU. If there are any, the ‘iowait’ counter is incremented. If nothing is pending, the ‘idle’ counter is incremented instead.
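The counters described above are exposed on the aggregate "cpu" line of /proc/stat, so the iowait percentage can be approximated by sampling that line twice and comparing the deltas. The following is a minimal sketch rather than how top or iostat are actually implemented; the function names and the two-second default interval are illustrative assumptions.

```python
import time

# Order of the first eight counters on the "cpu" line of /proc/stat.
# The trailing guest fields are skipped because they are already part of "user".
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def state_percentages(before, after):
    """Percentage of CPU time spent in each state between two samples."""
    deltas = {name: after.get(name, 0) - before.get(name, 0) for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_line(path="/proc/stat"):
    """Read and parse the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def sample_iowait(interval=2.0):
    """Hypothetical helper: iowait percentage over `interval` seconds."""
    before = read_cpu_line()
    time.sleep(interval)
    after = read_cpu_line()
    return state_percentages(before, after)["iowait"]
```

On a Linux host, calling sample_iowait() should return a figure broadly comparable to the wa value shown by top over the same interval.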

I/O Wait and Linux Server Performance

It is important to note that iowait can, at times, indicate a throughput bottleneck, while at other times it can be completely meaningless. It is possible to have a healthy system with high iowait, but it is also possible to have a bottlenecked system with no iowait at all.

I/O wait is simply one of the reported states of your CPU or CPU cores. As such, high iowait means that your CPU is idle while waiting on requests, but you will need to investigate further to confirm the source and the effect.

For example, server storage (SSD, NVMe, NFS, etc.) is almost always slower than the CPU. Because of this, I/O wait can be misleading, especially when it comes to random read/write workloads. This is because iowait only measures the state of the CPU, not storage I/O performance.

Although iowait indicates that the CPU could handle more work, depending on your server's workload and how that load performs computations or uses storage I/O, it is not always possible to eliminate I/O wait, or even to bring it close to zero.

You must decide, based on end-user experience, database query health, transaction throughput, and overall application health, whether the reported iowait indicates poor Linux system performance.

For example, if you are seeing a low iowait of 1 to 4 percent, and you then upgrade the CPU to 2x the performance, the iowait will also increase: 2x faster CPU = ~2x the wait, with the same storage performance. You’ll want to consider your workload to determine which hardware you should address first.
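The arithmetic behind that rule of thumb can be sketched with a deliberately simple model. It assumes the CPU is either computing or waiting on storage (no plain idle time) and that the storage wait stays constant when the CPU gets faster. Both assumptions, and the function name, are illustrative rather than taken from any tool.

```python
def iowait_share(cpu_time, io_wait_time, cpu_speedup=1.0):
    """Percentage of wall time spent waiting on I/O in a toy busy-or-waiting model.

    cpu_time and io_wait_time share the same unit (for example milliseconds);
    cpu_speedup=2.0 models a CPU that finishes the same computation in half the time.
    """
    busy = cpu_time / cpu_speedup
    return 100.0 * io_wait_time / (busy + io_wait_time)


# 96 units of computation and 4 units of storage wait: 4% iowait.
print(round(iowait_share(96, 4), 2))        # 4.0
# Same storage, 2x faster CPU: the wait is now a larger share of a shorter total.
print(round(iowait_share(96, 4, 2.0), 2))   # 7.69
# Starting from 1% iowait instead.
print(round(iowait_share(99, 1, 2.0), 2))   # 1.98
```

In this model the storage has not become any slower; only the denominator shrank, which is why a rising iowait after a CPU upgrade is not necessarily a regression.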

Monitoring and Reducing I/O Wait Related Issues

Figure: iostat -xm 2 (check for iowait)

Let’s look at some valuable tools used to monitor I/O wait on Linux.

  • atop – run it with the -d option, or press d to toggle the disk stats view.
  • iostat – try it with the -xm 2 options for extended statistics, in megabytes and at two-second intervals.
  • iotop – a top-like I/O monitor. Try it with the -oPa options to show the accumulated I/O of active processes only.
  • ps – use auxf; a “D” under the “STAT” column usually indicates disk iowait.
  • strace – view the actual operations issued by a process. Read the strace man page.
  • lsof – after you have identified the responsible process, use -p [PID] to find the specific files.

Minimizing I/O Wait Issues

To reduce I/O wait related issues, work through the following steps.

  • Optimize your application’s code and database queries. This can go a long way in reducing the frequency of disk reads and writes. This should be your first approach because the more efficient your application is, the less you will have to spend on hardware. See also: 100 Application Performance Monitoring (APM) and Observability Solutions.
  • Keep your Linux system and software versions up-to-date. Not only is it better for security, but more often than not, the latest supported versions provide remarkable performance improvements, be it Nginx, Node.js, PHP, Python, or MySQL.
  • Make sure you have free memory available: enough that about half of the server’s memory can be used for in-memory buffers and cache, rather than swapping and paging to disk. Of course, this ratio will vary from case to case. The key is to make sure you are not swapping and that the kernel cache is not under pressure from a lack of free memory.
  • Tune your system, storage device(s), and Linux kernel for better storage performance and lifetime.
  • Finally, if all else fails: upgrade the storage device to a faster SSD, NVMe, or other high throughput storage device.
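For the memory step in the list above, one way to check for swapping and cache pressure is to read /proc/meminfo. The sketch below only summarizes a few standard fields (MemTotal, MemAvailable, Buffers, Cached, SwapTotal, SwapFree). The function names and the choice of summary values are assumptions for illustration, and the right thresholds depend on your workload.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Summarize swap use and how much memory is available or used as cache."""
    total = info.get("MemTotal", 0)
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return {
        "swap_used_kb": info.get("SwapTotal", 0) - info.get("SwapFree", 0),
        "available_pct": 100.0 * info.get("MemAvailable", 0) / total if total else 0.0,
        "cache_pct": 100.0 * cache / total if total else 0.0,
    }


def read_memory_summary(path="/proc/meminfo"):
    """Hypothetical helper: summarize the live values on a Linux host."""
    with open(path) as f:
        return memory_summary(parse_meminfo(f.read()))
```

A steadily growing swap_used_kb together with a shrinking cache_pct would suggest that the kernel cache is under memory pressure, which in turn tends to push more reads to disk.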

Conclusion

The iowait statistic is a helpful performance stat for monitoring the health of CPU utilization. It tells the sysadmin when the CPU is idle and could possibly be doing more computation. From there, we can use the observability, benchmarking, and tracing tools listed above to put together a complete picture of the system's overall I/O performance. Your main goal should be to eliminate any iowait that results directly from waiting on disk, NFS, or other storage-related I/O.

Published: August 19, 2020 | Last Updated: 28 January 2022.
