[gpfsug-discuss] gpfs performance monitoring

Orlando Richards orlando.richards at ed.ac.uk
Thu Sep 4 14:54:37 BST 2014



On 04/09/14 14:32, Salvatore Di Nardo wrote:
> Sorry to bother you again, but dstat has some issues with the plugin:
>
>         [root@gss01a util]# dstat --gpfs
>         /usr/bin/dstat:1672: DeprecationWarning: os.popen3 is deprecated.  Use the subprocess module.
>            pipes[cmd] = os.popen3(cmd, 't', 0)
>         Module dstat_gpfs failed to load. (global name 'select' is not defined)
>         None of the stats you selected are available.
>
> I found this solution, but it involves recompiling dstat...
>
> https://github.com/dagwieers/dstat/issues/44
>
> Are you aware of any easier solution (we use RHEL 6.3)?
>

This worked for me the other day on a dev box I was poking at:

# rm /usr/share/dstat/dstat_gpfsops*

# cp /usr/lpp/mmfs/samples/util/dstat_gpfsops.py.dstat.0.7 /usr/share/dstat/dstat_gpfsops.py

# dstat --gpfsops
/usr/bin/dstat:1672: DeprecationWarning: os.popen3 is deprecated.  Use the subprocess module.
   pipes[cmd] = os.popen3(cmd, 't', 0)
---------------------------gpfs-vfs-ops--------------------------#-----------------------------gpfs-disk-i/o-----------------------------
   cr   del op/cl    rd    wr trunc fsync looku gattr sattr other mb_rd mb_wr  pref wrbeh steal clean  sync revok logwr logda oth_r oth_w
    0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0

...



>
> Regards,
> Salvatore
>
> On 04/09/14 01:50, Sven Oehme wrote:
>> > Hello everybody,
>>
>> Hi
>>
>> > here I come again, this time to ask for some hints about how to
>> > monitor GPFS.
>> >
>> > I know about mmpmon, but the issue with its "fs_io_s" and "io_s" is
>> > that they return numbers based only on the requests done on the
>> > current host, so I would have to run them on all the clients (over
>> > 600 nodes), which is quite impractical. Instead I would like to know
>> > from the servers what's going on, and I came across the vio_s
>> > statistics, which are less documented, and I don't know exactly what
>> > they mean. There is also the script
>> > "/usr/lpp/mmfs/samples/vdisk/viostat", which runs VIO_S.
>> >
>> > My problem with the output of this command:
>> >  echo "vio_s" | /usr/lpp/mmfs/bin/mmpmon -r 1
>> >
>> > mmpmon> mmpmon node 10.7.28.2 name gss01a vio_s OK VIOPS per second
>> > timestamp: 1409763206/477366
>> > recovery group: *
>> > declustered array: *
>> > vdisk: *
>> > client reads: 2584229
>> > client short writes: 55299693
>> > client medium writes: 190071
>> > client promoted full track writes:      465145
>> > client full track writes: 9249
>> > flushed update writes: 4187708
>> > flushed promoted full track writes: 123
>> > migrate operations: 114
>> > scrub operations: 450590
>> > log writes: 28509602
>> >
>> > it says "VIOPS per second", but they look to me like plain
>> > counters, as every time I re-run the command the numbers increase a
>> > bit. Can anyone confirm whether those numbers are counters or OPS/sec?
>>
>> the numbers are cumulative, so every time you run the command you just
>> see the totals since start (or last reset) time.
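>>
>> if you want rates, the simplest thing is to sample the counters twice
>> and diff them. here is a rough sketch (hypothetical, not a supported
>> tool; Python 2 to match the RHEL 6 default) that assumes the
>> "name: value" layout shown above and the same piped mmpmon invocation:
>>
>> #!/usr/bin/env python
>> # hypothetical sketch: turn the cumulative vio_s counters into
>> # per-second rates by sampling mmpmon twice and diffing.
>> import re
>> import subprocess
>> import time
>>
>> MMPMON = "/usr/lpp/mmfs/bin/mmpmon"
>>
>> def vio_s_counters():
>>     # same as: echo "vio_s" | /usr/lpp/mmfs/bin/mmpmon -r 1
>>     p = subprocess.Popen([MMPMON, "-r", "1"], stdin=subprocess.PIPE,
>>                          stdout=subprocess.PIPE)
>>     out = p.communicate("vio_s\n")[0]
>>     counters = {}
>>     for line in out.splitlines():
>>         # only "some name: 12345" lines are counters; skip the header
>>         m = re.match(r"\s*([a-z][a-z ]+?):\s+(\d+)\s*$", line)
>>         if m:
>>             counters[m.group(1)] = int(m.group(2))
>>     return counters
>>
>> INTERVAL = 10.0
>> before = vio_s_counters()
>> time.sleep(INTERVAL)
>> after = vio_s_counters()
>> for name in sorted(after):
>>     rate = (after[name] - before.get(name, 0)) / INTERVAL
>>     print("%-40s %10.1f ops/s" % (name, rate))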
>>
>> >
>> > On a closer look, I don't understand what most of those values
>> > mean. For example, what exactly is a "flushed promoted full track
>> > write"? I tried to find documentation about this output, but could
>> > not find any. Can anyone point me to a link where the output of
>> > vio_s is explained?
>> >
>> > Another thing I don't understand about those numbers is whether
>> > they are just operation counts, or counts of the blocks that were
>> > read/written/etc.
>>
>> it's just operations, and if I explained what the numbers mean I
>> might confuse you even more, because this is not what you are really
>> looking for.
>> what you are looking for is how the client I/Os look on the server
>> side, while the VIO layer is the server's side towards the disks, so
>> one level lower than what you are looking for, from what I could read
>> out of the description above.
>>
>> so the layer you care about is the NSD server layer, which sits on top
>> of the VIO layer (which is essentially the SW RAID layer in GNR).
>>
>> > I'm asking because if they are just ops, I don't know how useful
>> > they could be. For example, one write operation could mean writing
>> > one block or writing a 100 GB file. If those are operations, is
>> > there a way to get the output in bytes or blocks?
>>
>> there are multiple ways to get information on the NSD layer. one
>> would be to use the dstat plugin (see /usr/lpp/mmfs/samples/util), but
>> those are counters again.
>>
>> the alternative option is to use mmdiag --iohist. this shows you a
>> history of the last X I/O operations on either the client or the
>> server side, e.g. on a client:
>>
>> # mmdiag --iohist
>>
>> === mmdiag: iohist ===
>>
>> I/O history:
>>
>> I/O start time   RW  Buf type     disk:sectorNum  nSec  time ms  qTime ms  RpcTimes ms     Type  Device/NSD ID      NSD server
>> ---------------  --  -----------  --------------  ----  -------  --------  --------------  ----  -----------------  ---------------
>> 14:25:22.169617  R   LLIndBlock   1:1075622848      64   13.073     0.000  12.959   0.063  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:22.182723  R   inode        1:1071252480       8    6.970     0.000   6.908   0.038  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:53.659918  R   LLIndBlock   1:1081202176      64    8.309     0.000   8.210   0.046  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:53.668262  R   inode        2:1081373696       8   14.117     0.000  14.032   0.058  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:53.682750  R   LLIndBlock   1:1065508736      64    9.254     0.000   9.180   0.038  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:53.692019  R   inode        2:1064356608       8   14.899     0.000  14.847   0.029  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:53.707100  R   inode        2:1077830152       8   16.499     0.000  16.449   0.025  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:53.723788  R   LLIndBlock   1:1081202432      64    4.280     0.000   4.203   0.040  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:53.728082  R   inode        2:1081918976       8    7.760     0.000   7.710   0.027  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:57.877416  R   metadata     2:678978560       16   13.343     0.000  13.254   0.053  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:57.891048  R   LLIndBlock   1:1065508608      64   15.491     0.000  15.401   0.058  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:57.906556  R   inode        2:1083476520       8   11.723     0.000  11.676   0.029  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:57.918516  R   LLIndBlock   1:1075622720      64    8.062     0.000   8.001   0.032  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:57.926592  R   inode        1:1076503480       8    8.087     0.000   8.043   0.026  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:57.934856  R   LLIndBlock   1:1071088512      64    6.572     0.000   6.510   0.033  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:57.941441  R   inode        2:1069885984       8   11.686     0.000  11.641   0.024  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:57.953294  R   inode        2:1083476936       8    8.951     0.000   8.912   0.021  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:57.965475  R   inode        1:1076503504       8    0.477     0.000   0.053   0.000  cli   C0A70401:53BEEA7F  192.167.4.1
>> 14:25:57.965755  R   inode        2:1083476488       8    0.410     0.000   0.061   0.321  cli   C0A70402:53BEEA5E  192.167.4.2
>> 14:25:57.965787  R   inode        2:1083476512       8    0.439     0.000   0.053   0.342  cli   C0A70402:53BEEA5E  192.167.4.2
>>
>> you basically see whether it is an inode or a data block, what size
>> it has (in sectors), which NSD server the request was sent to, etc.
>>
>> on the server side you see the type, which physical disk the request
>> goes to, and also what size of disk I/O it causes, like:
>>
>> 14:26:50.129995  R   inode     12:3211886376    64   14.261   0.000   0.000   0.000  pd   sdis
>> 14:26:50.137102  R   inode     19:3003969520    64    9.004   0.000   0.000   0.000  pd   sdad
>> 14:26:50.136116  R   inode     55:3591710992    64   11.057   0.000   0.000   0.000  pd   sdoh
>> 14:26:50.141510  R   inode     21:3066810504    64    5.909   0.000   0.000   0.000  pd   sdaf
>> 14:26:50.130529  R   inode     89:2962370072    64   17.437   0.000   0.000   0.000  pd   sddi
>> 14:26:50.131063  R   inode     78:1889457000    64   17.062   0.000   0.000   0.000  pd   sdsj
>> 14:26:50.143403  R   inode     36:3323035688    64    4.807   0.000   0.000   0.000  pd   sdmw
>> 14:26:50.131044  R   inode     37:2513579736   128   17.181   0.000   0.000   0.000  pd   sddv
>> 14:26:50.138181  R   inode     72:3868810400    64   10.951   0.000   0.000   0.000  pd   sdbz
>> 14:26:50.138188  R   inode    131:2443484784   128   11.792   0.000   0.000   0.000  pd   sdug
>> 14:26:50.138003  R   inode    102:3696843872    64   11.994   0.000   0.000   0.000  pd   sdgp
>> 14:26:50.137099  R   inode    145:3370922504    64   13.225   0.000   0.000   0.000  pd   sdmi
>> 14:26:50.141576  R   inode     62:2668579904    64    9.313   0.000   0.000   0.000  pd   sdou
>> 14:26:50.134689  R   inode    159:2786164648    64   16.577   0.000   0.000   0.000  pd   sdpq
>> 14:26:50.145034  R   inode     34:2097217320    64    7.409   0.000   0.000   0.000  pd   sdmt
>> 14:26:50.138140  R   inode    139:2831038792    64   14.898   0.000   0.000   0.000  pd   sdlw
>> 14:26:50.130954  R   inode    164:282120312     64   22.274   0.000   0.000   0.000  pd   sdzd
>> 14:26:50.137038  R   inode     41:3421909608    64   16.314   0.000   0.000   0.000  pd   sdef
>> 14:26:50.137606  R   inode    104:1870962416    64   16.644   0.000   0.000   0.000  pd   sdgx
>> 14:26:50.141306  R   inode     65:2276184264    64   16.593   0.000   0.000   0.000  pd   sdrk
>>
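>> if you want more than eyeballing the list, those lines are easy to
>> post-process. here is a hypothetical sketch (assuming the column
>> layout above and 512-byte sectors) that aggregates mmdiag --iohist
>> per target, i.e. per physical disk on a server and per NSD server on
>> a client:
>>
>> #!/usr/bin/env python
>> # hypothetical sketch: aggregate "mmdiag --iohist" output per target
>> # into I/O count, MB moved and average service time.
>> import subprocess
>> from collections import defaultdict
>>
>> out = subprocess.Popen(["/usr/lpp/mmfs/bin/mmdiag", "--iohist"],
>>                        stdout=subprocess.PIPE).communicate()[0]
>>
>> stats = defaultdict(lambda: [0, 0.0, 0.0])   # target -> [ios, MB, ms]
>> for line in out.splitlines():
>>     f = line.split()
>>     # data lines start with an HH:MM:SS.uuuuuu timestamp
>>     if len(f) < 7 or f[0].count(":") != 2:
>>         continue
>>     nsec, ms, target = int(f[4]), float(f[5]), f[-1]
>>     s = stats[target]
>>     s[0] += 1
>>     s[1] += nsec * 512 / 1048576.0           # sectors -> MB
>>     s[2] += ms
>>
>> for target, (ios, mb, ms) in sorted(stats.items()):
>>     print("%-20s %6d ios  %8.1f MB  %6.2f ms avg" %
>>           (target, ios, mb, ms / ios))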
>>
>> >
>> > Last but not least, and this is what I would really like to
>> > accomplish: I would like to be able to monitor the latency of
>> > metadata operations.
>>
>> you can't do this on the server side, as you don't know how much time
>> is spent on the client, the network, or anything else between the
>> application and the physical disk, so you can only reliably look at
>> this from the client. the iohist output only shows you the server's
>> disk I/O processing time, but that can be a fraction of the overall
>> time (in other cases it can obviously also be the dominant part,
>> depending on your workload).
>>
>> the easiest way on the client is to run
>>
>> mmfsadm vfsstats enable
>>
>> from now on vfs stats are collected until you restart GPFS.
>>
>> then run:
>>
>> vfs statistics currently enabled
>> started at: Fri Aug 29 13:15:05.380 2014
>>   duration: 448446.970 sec
>>
>>  name                     calls  time per call      total time
>>  -------------------- ---------  -------------  --------------
>>  statfs                       9       0.000002        0.000021
>>  startIO              246191176       0.005853  1441049.976740
>>
>> to dump whatever has been collected so far on this node.
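>>
>> once a node has been running with vfs stats enabled for a while, a
>> trivial filter makes the slow calls stand out. a hypothetical sketch
>> that reads the dumped table above on stdin (assuming the four-column
>> rows shown) and sorts the operations by time per call, slowest first:
>>
>> #!/usr/bin/env python
>> # hypothetical sketch: sort a dumped vfs statistics table by the
>> # "time per call" column, slowest operations first.
>> import sys
>>
>> ops = []
>> for line in sys.stdin:
>>     f = line.split()
>>     # data rows: name, calls, time per call, total time
>>     if len(f) == 4 and f[1].isdigit():
>>         ops.append((float(f[2]), int(f[1]), f[0]))
>>
>> for per_call, calls, name in sorted(ops, reverse=True):
>>     print("%-20s %12d calls  %10.6f s/call" % (name, calls, per_call))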
>>
>> > In my environment there are users that literally overwhelm our
>> > storage with metadata requests, so even if there is no massive
>> > throughput and there are no huge waiters, any "ls" can take ages. I
>> > would like to be able to monitor the metadata behaviour. Is there a
>> > way to do that from the NSD servers?
>>
>> not in such a simple way, for the reasons described above.
>>
>> >
>> > Thanks in advance for any tip/help.
>> >
>> > Regards,
> Salvatore
> _______________________________________________
>> > gpfsug-discuss mailing list
>> > gpfsug-discuss at gpfsug.org
>> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at gpfsug.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>

-- 
             --
        Dr Orlando Richards
Research Facilities (ECDF) Systems Leader
        Information Services
    IT Infrastructure Division
        Tel: 0131 650 4994
      skype: orlando.richards

The University of Edinburgh is a charitable body, registered in 
Scotland, with registration number SC005336.


