[gpfsug-discuss] Baseline testing GPFS with gpfsperf
Kumaran Rajaram
kums at us.ibm.com
Wed Jul 26 18:37:45 BST 2017
Hi Scott,
>>- Should the number of threads equal the number of NSDs for the file
system? or equal to the number of nodes?
>>- If I execute a large multi-threaded run of this tool from a single
node in the cluster, will that give me an accurate result of the
performance of the file system?
To add to Valdis's note, the answer to the above also depends on the nodes,
the network used for GPFS communication between clients and servers, and
the performance capabilities of the storage constituting the GPFS
cluster/network/storage stack.
As an example, suppose the storage subsystem (controller + disks) hosting
the file system can deliver ~20 GB/s, and the networking between NSD
clients and servers is FDR 56 Gb/s InfiniBand (~6 GB/s with verbsRdma).
Assuming one FDR-IB link (verbsPorts) is configured per NSD server and per
client, you would need a minimum of 4 NSD servers (4 x 6 GB/s ==> 24 GB/s)
to saturate the backend storage. Likewise, you would need to run gpfsperf
(or any other parallel I/O benchmark) across a minimum of 4 GPFS NSD
clients to saturate the backend storage. You can scale the gpfsperf thread
count (-th parameter) depending on the access pattern (buffered/DIO, etc.),
but that only drives load from a single NSD client node. If you would like
to drive I/O load from multiple NSD client nodes and synchronize the
parallel runs across those nodes for accuracy, then gpfsperf-mpi is
strongly recommended. You would use MPI to launch gpfsperf-mpi across
multiple NSD client nodes and scale the number of MPI processes (one or
more per NSD client) accordingly to drive the I/O load for good
performance.
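As a sketch of the sizing arithmetic and the gpfsperf-mpi launch described above (hostnames, the file-system path, and the gpfsperf-mpi location are assumptions for your cluster; the command is only echoed here, not executed):

```shell
# Back-of-envelope sizing from the example above: how many ~6 GB/s
# FDR-IB links are needed to saturate ~20 GB/s of backend storage.
STORAGE_GBS=20
LINK_GBS=6
MIN_CLIENTS=$(( (STORAGE_GBS + LINK_GBS - 1) / LINK_GBS ))   # ceiling division

# Hypothetical launch: one gpfsperf-mpi process per NSD client; the MPI
# ranks synchronize start/stop so the aggregate result is meaningful.
# Paths and hostnames below are placeholders -- adjust for your cluster.
CMD="mpirun -np ${MIN_CLIENTS} -host client1,client2,client3,client4 \
/usr/lpp/mmfs/samples/perf/gpfsperf-mpi create seq /gpfs/fs1/perftest \
-n 64g -r 16m -th 8 -fsync"
echo "minimum clients: ${MIN_CLIENTS}"
echo "would run: ${CMD}"
```

On a real cluster you would drop the `echo` and run `${CMD}` directly, scaling `-np` and `-th` until the aggregate throughput stops growing.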
>>The cluster that I will be running this tool on will not have MPI
installed and will have multiple file systems in the cluster.
Without MPI, an alternative is to use ssh or pdsh to launch gpfsperf
across multiple nodes. However, if there are slow NSD clients, the results
may not be accurate: the slow clients take longer to finish, and once the
faster clients have completed, the stragglers get all the network/storage
resources to themselves, skewing the performance analysis. You may also
consider parallel IOzone, which can be run across multiple nodes using
rsh/ssh with a combination of the "-+m" and "-t" options:
http://iozone.org/docs/IOzone_msword_98.pdf
##
-+m filename
    Use this file to obtain the configuration information of the clients
    for cluster testing. The file contains one line for each client. Each
    line has three fields. The fields are space delimited. A # sign in
    column zero is a comment line. The first field is the name of the
    client. The second field is the path, on the client, for the working
    directory where Iozone will execute. The third field is the path, on
    the client, for the executable Iozone. To use this option one must be
    able to execute commands on the clients without being challenged for a
    password. Iozone will start remote execution by using "rsh". To use
    ssh, export RSH=/usr/bin/ssh.
-t #
    Run Iozone in a throughput mode. This option allows the user to
    specify how many threads or processes to have active during the
    measurement.
##
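As a sketch, the two options above could be combined like this (hostnames and paths are hypothetical; the command is only echoed here, not executed):

```shell
# Build an IOzone client-configuration file for -+m. Fields per line:
# client hostname, working directory on that client, path to the iozone
# executable on that client (all values below are placeholders).
cat > iozone_clients.cfg <<'EOF'
client1 /gpfs/fs1/iozone /usr/bin/iozone
client2 /gpfs/fs1/iozone /usr/bin/iozone
client3 /gpfs/fs1/iozone /usr/bin/iozone
client4 /gpfs/fs1/iozone /usr/bin/iozone
EOF

export RSH=/usr/bin/ssh   # make IOzone use ssh instead of the default rsh

# -t 4 => four throughput processes, spread over the clients listed above;
# -i 0 -i 1 => sequential write and read tests; -r/-s => record/file size.
CMD="iozone -+m iozone_clients.cfg -t 4 -r 16m -s 8g -i 0 -i 1"
echo "would run: ${CMD}"
```

Record size (`-r`) and per-process file size (`-s`) would be tuned to your file-system block size and to exceed client page-cache sizes.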
Hope this helps,
-Kums
From: valdis.kletnieks at vt.edu
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 07/25/2017 07:59 PM
Subject: Re: [gpfsug-discuss] Baseline testing GPFS with gpfsperf
Sent by: gpfsug-discuss-bounces at spectrumscale.org
On Tue, 25 Jul 2017 15:46:45 -0500, "Scott C Batchelder" said:
> - Should the number of threads equal the number of NSDs for the file
> system? or equal to the number of nodes?
Depends on what definition of "throughput" you are interested in. If your
configuration has 50 clients banging on 5 NSD servers, your numbers for 5
threads and 50 threads are going to tell you subtly different things...
(Basically, one thread per NSD is going to tell you the maximum that
one client can expect to get with little to no contention, while one
per client will tell you about the maximum *aggregate* that all 50
can get together - which is probably still giving each individual client
less throughput than one-to-one....)
We usually test with "exactly one thread total", "one thread per server",
and "keep piling the clients on till the total number doesn't get any
bigger".
Also be aware that it only gives you insight into your workload
performance if your workload is comprised of large-file access - if your
users are actually doing a lot of medium or small files, that changes the
results dramatically, as you end up possibly pounding on metadata more
than the actual data....
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss