How to Monitor I/O Performance and Disk Activity on Solaris

Usually, the quickest way to check overall I/O activity on Solaris is "top". The "iowait" figure on its "CPU states" line is a good indicator to watch:

$ top
load averages:  5.47,  4.94,  5.10             orcl              19:49:02
475 processes: 147 sleeping, 1 stopped, 7 on cpu
CPU states: 38.0% idle, 37.6% user, 20.5% kernel, 13.9% iowait,  0.0% swap
Memory: 10.0G real, 1.6G free, 4.8G swap in use, 5.5G swap free
   PID USERNAME THR PR NCE  SIZE   RES STATE   TIME FLTS    CPU COMMAND
  9398 oracle     2  0   0  6.1G 50.3M sleep 124.0H    0  6.25% oracle
  9400 oracle     2  0   0  6.1G 52.0M cpu45  71.7H    0  5.42% oracle
  9402 oracle     2  0   0  6.1G 50.5M sleep  37.5H    0  4.68% oracle
...

On Linux, the same figure is labeled "wa". Here is the definition of "wa" from the Ubuntu manual page for top:

wa -- iowait
Amount of time the CPU has been waiting for I/O to complete.

That is, 50% iowait means that the CPUs spent half of the sampled time idle while waiting for outstanding I/O to complete. It does not mean that half of the processes are blocked on I/O.

For finer-grained monitoring, use "iostat". The "-x" option (or "-xn", which adds descriptive device names) shows how busy each disk is. For example:

$ iostat -x 5
                  extended device statistics
device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
...
sd10       56.0    4.9 2514.7   80.3  0.0  0.1    1.1   0   5
sd11      109.0    5.4 3585.9  109.6  0.0  0.1    0.7   0   6
sd12       76.3    4.0 2911.4   82.0  0.0  0.1    0.9   0   5
sd13       37.2    5.7 2000.3   87.3  0.0  0.1    1.6   0   4
...
                  extended device statistics
device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
...
sd10        2.6   10.4   20.8   39.9  0.0  0.0    0.6   0   1
sd11       77.0    0.6 3123.2    4.8  0.0  0.0    0.5   0   3
sd12      379.4    0.2 4562.5    1.6  0.0  0.1    0.3   0   5
sd13        5.8    9.6   46.4   33.5  0.0  0.0    0.6   0   1
...
                  extended device statistics
device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
...
sd10        1.6    4.4   12.8   42.1  0.0  0.0    1.3   0   1
sd11      206.8    0.8 6059.5    6.4  0.0  0.1    0.6   0   8
sd12      140.6    0.0 3659.4    0.0  0.0  0.1    0.4   0   4
sd13        2.0    4.8   16.0   45.3  0.0  0.0    1.4   0   1
...
                  extended device statistics
device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
...
sd10        1.6    7.6   12.8   37.7  0.0  0.0    1.5   0   1
sd11      174.2    0.4 6462.5    3.2  0.0  0.1    0.5   0   6
sd12      613.6    0.0 7833.8    0.0  0.0  0.2    0.4   0   9
sd13        1.8    8.0   14.4   40.9  0.0  0.0    0.9   0   1
...
                  extended device statistics
device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
...
sd10        1.6   16.6   12.8   52.4  0.0  0.0    0.7   0   1
sd11       33.6    0.6 1513.5    4.8  0.0  0.0    0.5   0   1
sd12      190.6    0.2 2255.9    1.6  0.0  0.1    0.4   0   3
sd13        0.6   15.6    4.8   44.4  0.0  0.0    0.4   0   1
...

The report refreshes every 5 seconds. The first report shows averages since boot, and each later report covers the preceding interval. The next question is how to interpret the columns. According to the Oracle documentation (http://docs.oracle.com/cd/E23824_01/html/821-1451/spmonitor-4.html), the columns in "iostat -x" are defined as follows:

r/s     Reads per second
w/s     Writes per second
kr/s    Kbytes read per second
kw/s    Kbytes written per second
wait    Average number of transactions that are waiting for service (queue length)
actv    Average number of transactions that are actively being serviced
svc_t   Average service time, in milliseconds
%w      Percentage of time that the queue is not empty
%b      Percentage of time that the disk is busy
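
As a rough illustration of how these columns relate to each other, dividing kr/s by r/s (and kw/s by w/s) gives the average size of each read or write. The sketch below assumes the 10-column "iostat -x" layout shown above, with the device name first. The function name "io_size" is hypothetical, and on older Solaris releases "nawk" may be needed in place of "awk".

```shell
# Hypothetical helper: average KB per read and per write for each device row.
# Reads "iostat -x" output on stdin (assumes the 10-column layout shown above).
io_size() {
    awk '
        # Data rows have 10 fields and a numeric last column (%b);
        # banner, header and "..." lines do not match and are skipped.
        NF == 10 && $NF ~ /^[0-9]+$/ {
            rd = ($2 > 0) ? $4 / $2 : 0   # kr/s divided by r/s
            wr = ($3 > 0) ? $5 / $3 : 0   # kw/s divided by w/s
            printf "%s read=%.1fKB write=%.1fKB\n", $1, rd, wr
        }
    '
}

# Usage: iostat -x 5 | io_size
```

For the first sd10 sample above (56.0 r/s, 2514.7 kr/s), this works out to about 44.9 KB per read, which suggests fairly large reads on that disk.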

This is how to trace an I/O bottleneck down to specific disks for tuning.
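
To make the busy disks stand out in a long report, the output can be filtered on the %b column. This is only a minimal sketch: the function name "busy_disks" and the default threshold of 5 are assumptions chosen for illustration, it expects the plain "-x" column layout (device name first, %b last), and on older Solaris releases "nawk" may be needed in place of "awk" for the "-v" option.

```shell
# Hypothetical helper: print devices whose %b (last column) meets a threshold.
# Reads "iostat -x" output on stdin; the default threshold of 5 is arbitrary.
busy_disks() {
    awk -v limit="${1:-5}" '
        # Only data rows have 10 fields and a numeric last column,
        # so the banner, header and "..." lines are skipped.
        NF == 10 && $NF ~ /^[0-9]+$/ && $NF + 0 >= limit + 0 {
            printf "%s %%b=%s svc_t=%s\n", $1, $NF, $8
        }
    '
}

# Usage: iostat -x 5 | busy_disks 5
```

With the first sample above and a threshold of 5, sd10 and sd11 would be reported and sd13 (4% busy) would not. A suitable threshold depends on the workload and the storage behind each device.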
