[gpfsug-discuss] Baseline testing GPFS with gpfsperf

Kumaran Rajaram kums at us.ibm.com
Wed Jul 26 18:37:45 BST 2017


Hi Scott,

>>- Should the number of threads equal the number of NSDs for the file 
system? or equal to the number of nodes? 
>>- If I execute a large multi-threaded run of this tool from a single 
node in the cluster, will that give me an accurate result of the 
performance of the file system?  

To add to Valdis's note, the answer to the above also depends on the 
capabilities of the nodes, the network used for GPFS communication 
between clients and servers, and the performance of the storage 
subsystem, which together make up the GPFS cluster/network/storage stack.

As an example, suppose the storage subsystem (controllers + disks) 
hosting the file system can deliver ~20 GB/s, and the network between NSD 
clients and servers is FDR 56 Gb/s InfiniBand (~6 GB/s per link with 
verbsRdma). Assuming one FDR-IB link (verbsPorts) is configured per NSD 
server as well as per client, you would need a minimum of 4 NSD servers 
(4 x 6 GB/s ==> 24 GB/s) to saturate the backend storage. Likewise, you 
would need to run gpfsperf (or any other parallel I/O benchmark) across a 
minimum of 4 GPFS NSD clients to saturate the backend storage. You can 
scale the gpfsperf thread count (-th parameter) depending on the access 
pattern (buffered, direct I/O, etc.), but a single invocation can only 
drive load from one NSD client node. If you would like to drive I/O load 
from multiple NSD client nodes and synchronize the parallel runs across 
nodes for accuracy, then gpfsperf-mpi is strongly recommended: use MPI to 
launch gpfsperf-mpi across multiple NSD client nodes and scale the number 
of MPI processes (one or more per NSD client) to drive the I/O load for 
good performance.
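
For illustration, a minimal sketch of both approaches (the file-system 
path /gpfs/fs0, the host file name, the sizes, and the install path are 
assumptions; gpfsperf is typically built from the GPFS samples directory, 
so adjust for your cluster):

  # Single NSD client: sequential create of a 64 GiB file with 8 threads
  /usr/lpp/mmfs/samples/perf/gpfsperf create seq /gpfs/fs0/perftest/f1 \
      -n 64g -r 16m -th 8 -fsync

  # Synchronized run across 4 NSD clients (4 MPI processes per node);
  # gpfsperf-mpi takes the same arguments, with parallelism from MPI
  mpirun -np 16 --hostfile nsd_clients.txt \
      /usr/lpp/mmfs/samples/perf/gpfsperf-mpi read seq \
      /gpfs/fs0/perftest/f1 -n 64g -r 16m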

>>The cluster that I will be running this tool on will not have MPI 
installed and will have multiple file systems in the cluster. 

Without MPI, an alternative would be to use ssh or pdsh to launch 
gpfsperf across multiple nodes; however, if there are slow NSD clients 
the results may not be accurate (slow clients take longer to finish, and 
once the faster clients are done the stragglers get all of the 
network/storage resources to themselves, skewing the performance 
analysis). A rough sketch of this approach follows. You may also consider 
parallel Iozone, as it can be run across multiple nodes using rsh/ssh 
with a combination of the "-+m" and "-t" options; see the excerpt and 
example below.
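
For example (node names and paths are placeholders; each node writes its 
own file so the runs do not serialize on a single file, and the caveat 
above about fast/slow clients still applies):

  # Start the same gpfsperf run on 4 clients at roughly the same time
  pdsh -w client[01-04] \
      '/usr/lpp/mmfs/samples/perf/gpfsperf create seq \
       /gpfs/fs0/perftest/file.$(hostname) -n 64g -r 16m -th 8 -fsync'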

The relevant IOzone options are described in 
http://iozone.org/docs/IOzone_msword_98.pdf:

##
-+m filename
Use this file to obtain the configuration information of the clients for 
cluster testing. The file contains one line for each client. Each line 
has three fields. The fields are space delimited. A # sign in column 
zero is a comment line. The first field is the name of the client. The 
second field is the path, on the client, for the working directory where 
Iozone will execute. The third field is the path, on the client, for the 
executable Iozone. To use this option one must be able to execute 
commands on the clients without being challenged for a password. Iozone 
will start remote execution by using "rsh".

To use ssh, export RSH=/usr/bin/ssh

-t #
Run Iozone in a throughput mode. This option allows the user to specify 
how many threads or processes to have active during the measurement.
##
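
Putting those two options together, a hedged example of a parallel 
IOzone throughput run (the client names, working directory, and IOzone 
path are assumptions for illustration):

  # clients.txt -- one line per client: hostname, working dir, iozone path
  #   client01  /gpfs/fs0/perftest  /usr/bin/iozone
  #   client02  /gpfs/fs0/perftest  /usr/bin/iozone
  #   client03  /gpfs/fs0/perftest  /usr/bin/iozone
  #   client04  /gpfs/fs0/perftest  /usr/bin/iozone

  export RSH=/usr/bin/ssh    # use ssh instead of the default rsh
  iozone -+m clients.txt -t 4 -s 64g -r 16m -i 0 -i 1   # write+read tests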

Hope this helps,
-Kums





From:   valdis.kletnieks at vt.edu
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   07/25/2017 07:59 PM
Subject:        Re: [gpfsug-discuss] Baseline testing GPFS with gpfsperf
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



On Tue, 25 Jul 2017 15:46:45 -0500, "Scott C Batchelder" said:

> - Should the number of threads equal the number of NSDs for the file
> system? or equal to the number of nodes?

Depends on what definition of "throughput" you are interested in. If your
configuration has 50 clients banging on 5 NSD servers, your numbers for 5
threads and 50 threads are going to tell you subtly different things...

(Basically, one thread per NSD is going to tell you the maximum that
one client can expect to get with little to no contention, while one
per client will tell you about the maximum *aggregate* that all 50
can get together - which is probably still giving each individual client
less throughput than one-to-one....)

We usually test with "exactly one thread total", "one thread per server",
and "keep piling the clients on till the total number doesn't get any 
bigger".

Also be aware that it only gives you insight into your workload 
performance if your workload consists of large-file access - if your 
users are actually doing a lot of medium or small files, that changes 
the results dramatically, as you possibly end up pounding on metadata 
more than the actual data....


