[gpfsug-discuss] gpfs performance monitoring

Salvatore Di Nardo sdinardo at ebi.ac.uk
Thu Sep 4 14:32:15 BST 2014


Sorry to bother you again, but dstat has some issues with the GPFS plugin:

        [root at gss01a util]# dstat --gpfs
        /usr/bin/dstat:1672: DeprecationWarning: os.popen3 is deprecated.  Use the subprocess module.
           pipes[cmd] = os.popen3(cmd, 't', 0)
        Module dstat_gpfs failed to load. (global name 'select' is not defined)
        None of the stats you selected are available.

I found this solution, but it involves rebuilding dstat...

https://github.com/dagwieers/dstat/issues/44

Are you aware of any easier solution (we use RHEL 6.3)?
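
In the meantime, as a crude stop-gap, I am thinking of simply polling mmpmon
from a shell loop (just a rough sketch, nothing tested):

        # crude stop-gap while the dstat plugin is broken:
        # print the per-node fs_io_s counters every 10 seconds
        # (the values are cumulative totals, not rates)
        while true; do
            echo "fs_io_s" | /usr/lpp/mmfs/bin/mmpmon -r 1
            sleep 10
        done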


Regards,
Salvatore

On 04/09/14 01:50, Sven Oehme wrote:
> > Hello everybody,
>
> Hi
>
> > here I come again, this time to ask for some hints about how to
> > monitor GPFS.
> >
> > I know about mmpmon, but the issue with its "fs_io_s" and "io_s" is
> > that they return numbers based only on the requests done on the
> > current host, so I would have to run them on all the clients (over 600
> > nodes), which is quite impractical. Instead I would like to know from
> > the servers what is going on, and I came across the vio_s statistics,
> > which are less documented, and I don't know exactly what they mean.
> > There is also this script "/usr/lpp/mmfs/samples/vdisk/viostat" that
> > runs VIO_S.
> >
> > My problem is with the output of this command:
> >  echo "vio_s" | /usr/lpp/mmfs/bin/mmpmon -r 1
> >
> > mmpmon> mmpmon node 10.7.28.2 name gss01a vio_s OK VIOPS per second
> > timestamp: 1409763206/477366
> > recovery group: *
> > declustered array: *
> > vdisk: *
> > client reads: 2584229
> > client short writes: 55299693
> > client medium writes: 190071
> > client promoted full track writes:      465145
> > client full track writes: 9249
> > flushed update writes: 4187708
> > flushed promoted full track writes: 123
> > migrate operations: 114
> > scrub operations: 450590
> > log writes: 28509602
> >
> > it says "VIOPS per second", but they look to me like plain counters, as
> > every time I re-run the command the numbers increase a bit.
> > Can anyone confirm whether those numbers are counters or ops/sec?
>
> the numbers are cumulative, so every time you run the command they show
> the totals since start (or since the last reset).
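>
> if you want a rough ops/sec figure you have to sample the counters twice
> and diff them yourself. A minimal sketch, using just the "client reads"
> counter from the vio_s output above (the field name would need adjusting
> for the other counters):
>
>     # sample the cumulative vio_s counters twice, INTERVAL seconds apart,
>     # and print the difference as a per-second rate
>     INTERVAL=10
>     snap() { echo "vio_s" | /usr/lpp/mmfs/bin/mmpmon -r 1 | awk -F: '/client reads/ {print $2}'; }
>     a=$(snap); sleep "$INTERVAL"; b=$(snap)
>     echo "client reads/sec: $(( (b - a) / INTERVAL ))"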
>
> >
> > On closer inspection, I don't understand what most of those values
> > mean. For example, what exactly is a "flushed promoted full track
> > write"?
> > I tried to find documentation about this output, but could not find
> > any. Can anyone point me to a link where the vio_s output is explained?
> >
> > Another thing I don't understand about those numbers is whether they
> > are just operation counts, or the number of blocks read/written/etc.
>
> they are just operation counts, and if I explained what each number
> means I might confuse you even more, because this is not what you are
> really looking for.
> what you are looking for is what the client I/Os look like on the server
> side, while the VIO layer sits between the server and the disks, so one
> level lower than what you are after, from what I could read out of the
> description above.
>
> so the layer you care about is the NSD server layer, which sits on top
> of the VIO layer (which is essentially the software RAID layer in GNR)
>
> > I'm asking because if they are just ops, I don't know how useful they
> > could be. For example, one write operation could mean writing a single
> > block or writing a 100GB file. If those are operations, is there a way
> > to get the output in bytes or blocks?
>
> there are multiple ways to get information at the NSD layer. One would
> be to use the dstat plugin (see /usr/lpp/mmfs/sample/util), but those
> are counters again.
>
> the alternative is to use mmdiag --iohist. This shows you a history of
> the last X I/O operations on either the client or the server side. On a
> client it looks like this:
>
> # mmdiag --iohist
>
> === mmdiag: iohist ===
>
> I/O history:
>
>  I/O start time  RW    Buf type    disk:sectorNum   nSec  time ms  qTime ms    RpcTimes ms  Type  Device/NSD ID      NSD server
> ---------------  --  -----------  ----------------  ----  -------  --------  -------------  ----  -----------------  ---------------
> 14:25:22.169617   R   LLIndBlock    1:1075622848      64   13.073     0.000  12.959  0.063   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:22.182723   R        inode    1:1071252480       8    6.970     0.000   6.908  0.038   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:53.659918   R   LLIndBlock    1:1081202176      64    8.309     0.000   8.210  0.046   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:53.668262   R        inode    2:1081373696       8   14.117     0.000  14.032  0.058   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:53.682750   R   LLIndBlock    1:1065508736      64    9.254     0.000   9.180  0.038   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:53.692019   R        inode    2:1064356608       8   14.899     0.000  14.847  0.029   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:53.707100   R        inode    2:1077830152       8   16.499     0.000  16.449  0.025   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:53.723788   R   LLIndBlock    1:1081202432      64    4.280     0.000   4.203  0.040   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:53.728082   R        inode    2:1081918976       8    7.760     0.000   7.710  0.027   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:57.877416   R     metadata    2:678978560       16   13.343     0.000  13.254  0.053   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:57.891048   R   LLIndBlock    1:1065508608      64   15.491     0.000  15.401  0.058   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:57.906556   R        inode    2:1083476520       8   11.723     0.000  11.676  0.029   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:57.918516   R   LLIndBlock    1:1075622720      64    8.062     0.000   8.001  0.032   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:57.926592   R        inode    1:1076503480       8    8.087     0.000   8.043  0.026   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:57.934856   R   LLIndBlock    1:1071088512      64    6.572     0.000   6.510  0.033   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:57.941441   R        inode    2:1069885984       8   11.686     0.000  11.641  0.024   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:57.953294   R        inode    2:1083476936       8    8.951     0.000   8.912  0.021   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:57.965475   R        inode    1:1076503504       8    0.477     0.000   0.053  0.000   cli   C0A70401:53BEEA7F  192.167.4.1
> 14:25:57.965755   R        inode    2:1083476488       8    0.410     0.000   0.061  0.321   cli   C0A70402:53BEEA5E  192.167.4.2
> 14:25:57.965787   R        inode    2:1083476512       8    0.439     0.000   0.053  0.342   cli   C0A70402:53BEEA5E  192.167.4.2
>
> you basically see whether it is an inode or a data block, what size it
> has (in sectors), which NSD server the request was sent to, and so on.
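>
> if you want a quick summary instead of eyeballing the raw list, a bit of
> awk over that output works. This is only a sketch and assumes the column
> layout shown above (buffer type in column 3, size in sectors in column 5,
> time in ms in column 6):
>
>     # I/O count, total sectors and average latency per buffer type
>     mmdiag --iohist | awk '$2 ~ /^[RW]$/ { n[$3]++; sec[$3]+=$5; ms[$3]+=$6 }
>         END { for (t in n) printf "%-12s %8d ios %10d sectors %8.3f ms avg\n", t, n[t], sec[t], ms[t]/n[t] }'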
>
> on the server side you see the type, which physical disk it goes to, and
> also what size of disk I/O it causes, like this (see also the summary
> sketch after the listing):
>
> 14:26:50.129995   R        inode   12:3211886376      64   14.261    0.000    0.000    0.000   pd   sdis
> 14:26:50.137102   R        inode   19:3003969520      64    9.004    0.000    0.000    0.000   pd   sdad
> 14:26:50.136116   R        inode   55:3591710992      64   11.057    0.000    0.000    0.000   pd   sdoh
> 14:26:50.141510   R        inode   21:3066810504      64    5.909    0.000    0.000    0.000   pd   sdaf
> 14:26:50.130529   R        inode   89:2962370072      64   17.437    0.000    0.000    0.000   pd   sddi
> 14:26:50.131063   R        inode   78:1889457000      64   17.062    0.000    0.000    0.000   pd   sdsj
> 14:26:50.143403   R        inode   36:3323035688      64    4.807    0.000    0.000    0.000   pd   sdmw
> 14:26:50.131044   R        inode   37:2513579736     128   17.181    0.000    0.000    0.000   pd   sddv
> 14:26:50.138181   R        inode   72:3868810400      64   10.951    0.000    0.000    0.000   pd   sdbz
> 14:26:50.138188   R        inode  131:2443484784     128   11.792    0.000    0.000    0.000   pd   sdug
> 14:26:50.138003   R        inode  102:3696843872      64   11.994    0.000    0.000    0.000   pd   sdgp
> 14:26:50.137099   R        inode  145:3370922504      64   13.225    0.000    0.000    0.000   pd   sdmi
> 14:26:50.141576   R        inode   62:2668579904      64    9.313    0.000    0.000    0.000   pd   sdou
> 14:26:50.134689   R        inode  159:2786164648      64   16.577    0.000    0.000    0.000   pd   sdpq
> 14:26:50.145034   R        inode   34:2097217320      64    7.409    0.000    0.000    0.000   pd   sdmt
> 14:26:50.138140   R        inode  139:2831038792      64   14.898    0.000    0.000    0.000   pd   sdlw
> 14:26:50.130954   R        inode  164:282120312       64   22.274    0.000    0.000    0.000   pd   sdzd
> 14:26:50.137038   R        inode   41:3421909608      64   16.314    0.000    0.000    0.000   pd   sdef
> 14:26:50.137606   R        inode  104:1870962416      64   16.644    0.000    0.000    0.000   pd   sdgx
> 14:26:50.141306   R        inode   65:2276184264      64   16.593    0.000    0.000    0.000   pd   sdrk
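>
> the same kind of awk summary works on the server side to see how much
> each physical disk is being asked to do (again only a sketch, based on
> the layout above where "pd" and the device name are the last two fields):
>
>     # I/O count, total sectors and average latency per physical disk
>     mmdiag --iohist | awk '$(NF-1) == "pd" { n[$NF]++; sec[$NF]+=$5; ms[$NF]+=$6 }
>         END { for (d in n) printf "%-6s %6d ios %9d sectors %8.3f ms avg\n", d, n[d], sec[d], ms[d]/n[d] }'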
>
>
> >
> > Last but not least, and this is what I would really like to
> > accomplish: I would like to be able to monitor the latency of metadata
> > operations.
>
> you can't do this on the server side, as you don't know how much time is
> spent on the client, the network, or anything else between the
> application and the physical disk, so you can only reliably look at this
> from the client. The iohist output only shows you the server's disk I/O
> processing time, and that can be a small fraction of the overall time
> (in other cases it can obviously also be the dominant part, depending on
> your workload).
>
> the easiest way on the client is to run
>
> mmfsadm vfsstats enable
> From now on, VFS stats are collected until you restart GPFS.
>
> then run :
>
> vfs statistics currently enabled
> started at: Fri Aug 29 13:15:05.380 2014
>   duration: 448446.970 sec
>
>  name        calls  time per call     total time
>  -------------------- -------- -------------- --------------
>  statfs          9       0.000002     0.000021
>  startIO  246191176       0.005853 1441049.976740
>
> to dump whatever has been collected so far on this node.
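>
> if you need this from all 600 clients, you can wrap it with mmdsh or a
> plain ssh loop. A rough sketch only -- it assumes a node class called
> "clients" exists in your cluster and that "mmfsadm dump vfsstats" is the
> dump command on your code level, so adjust both to your environment:
>
>     # enable vfs statistics everywhere, then later collect a snapshot from each node
>     /usr/lpp/mmfs/bin/mmdsh -N clients "/usr/lpp/mmfs/bin/mmfsadm vfsstats enable"
>     /usr/lpp/mmfs/bin/mmdsh -N clients "/usr/lpp/mmfs/bin/mmfsadm dump vfsstats" > vfsstats.$(date +%s).txt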
>
> > In my environment there are users that literally overwhelm our
> > storage with metadata requests, so even if there is no massive
> > throughput and no huge waiters, a simple "ls" can take ages. I would
> > like to be able to monitor metadata behaviour. Is there a way to do
> > that from the NSD servers?
>
> not really; it is not that simple, for the reasons described above.
>
> >
> > Thanks in advance for any tip/help.
> >
> > Regards,
> > Salvatore
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
