[gpfsug-discuss] Performance problems + (MultiThreadWorkInstanceCond), reason 'waiting for helper threads'

IBM Spectrum Scale scale at us.ibm.com
Thu Apr 18 21:55:25 BST 2019


Thanks for the information.  Since the waiters information is from one of 
the IO servers, the threads waiting for IO should be waiting on actual IO 
requests to the storage.  Seeing IO operations take seconds to complete 
generally indicates your storage is not working optimally.  We would 
expect IOs to complete in sub-second time, that is, in some number of 
milliseconds.
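
If you want to confirm where that latency accumulates, one option (a 
suggestion only; run it on the IO servers while the test is active) is to 
look at the recent I/O history that mmdiag reports, for example:

   # on an IO server, while gpfsperf is running
   mmdiag --iohist      # recent I/O history with per-I/O service times
   mmdiag --waiters     # the waiters you already captured

I/O service times consistently in the hundreds of milliseconds or more 
would point at the storage path rather than at GPFS itself.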

You are using a record size of 16M, yet you stated the file system block 
size is 1M.  Is that really what you wanted to test?  Also, you have 
included the -fsync option to gpfsperf, which will impact the results.
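
As an illustration only (the parameters mirror your original command, 
with the record size matched to the 1M file system block size and -fsync 
dropped), a run along these lines should give a cleaner picture of the 
streaming throughput:

   /usr/lpp/mmfs/samples/perf/gpfsperf create seq \
       /gpfs/home/caubet_m/gpfsperf/$(hostname).txt -n 24g -r 1m -th 8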

Have you considered using the nsdperf program instead of the gpfsperf 
program?  You can find nsdperf in the samples/net directory.
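
As a rough sketch only (the exact build steps and command set may differ 
by release; see the comments at the top of nsdperf.C), the workflow looks 
roughly like this, using your node names as placeholders:

   cd /usr/lpp/mmfs/samples/net
   # build per the instructions in the nsdperf.C header, e.g.:
   g++ -O2 -o nsdperf nsdperf.C -lpthread -lrt

   # start in server mode on the nodes to be measured
   ./nsdperf -s

   # then, from a control node, drive the test interactively
   ./nsdperf
   > server merlindssio01 merlindssio02
   > client merlin-c-001
   > test
   > quit

This measures raw network throughput between the nodes, which helps 
separate network problems from storage problems.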

One last thing I noticed was in the configuration of your management node. 
 It showed the following.

[merlindssmgt01,dssg]
prefetchPct 20
nsdRAIDTracks 128k
nsdMaxWorkerThreads 3k
nsdMinWorkerThreads 3k

To my understanding, the management node has no direct access to the 
storage; that is, any IO requests to the file system from the management 
node go through the IO nodes.  That being true, GPFS will not make use of 
NSD worker threads on the management node.  As you can see, your 
configuration creates 3K NSD worker threads, none of which will be used, 
so you might want to consider changing that value to 1.  It will not 
change your performance numbers, but it should free up a bit of memory on 
the management node.
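
If you decide to do that, something along these lines should work (just a 
sketch; I believe the change takes effect after GPFS is restarted on that 
node):

   mmchconfig nsdMaxWorkerThreads=1,nsdMinWorkerThreads=1 -N merlindssmgt01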

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
. 

If your query concerns a potential software error in Spectrum Scale (GPFS) 
and you have an IBM software maintenance contract please contact 
1-800-237-5511 in the United States or your local IBM Service Center in 
other countries. 

The forum is informally monitored as time permits and should not be used 
for priority messages to the Spectrum Scale (GPFS) team.



From:   "Caubet Serrabou Marc (PSI)" <marc.caubet at psi.ch>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Cc:     "gpfsug-discuss-bounces at spectrumscale.org" 
<gpfsug-discuss-bounces at spectrumscale.org>
Date:   04/18/2019 01:45 PM
Subject:        Re: [gpfsug-discuss] Performance problems + 
(MultiThreadWorkInstanceCond), reason 'waiting for helper threads'
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hi,

thanks a lot. About the requested information:

* Waiters were captured with the command 'mmdiag --waiters', run on one 
of the IO (NSD) nodes.
* The connection between the storage and client clusters is InfiniBand 
EDR.  The GPFS client cluster has 3 chassis, each with 24 blades and an 
unmanaged EDR switch (24 ports for the blades, 12 external); currently 10 
external EDR ports are connected for external connectivity.  The GPFS 
storage cluster has 2 IO nodes (as mentioned in the previous e-mail, a 
DSS G240), and each IO node has 4 x EDR ports connected.  Regarding the 
InfiniBand connectivity, my network contains 2 top-level managed EDR 
switches configured with up/down routing, connecting the unmanaged 
switches from the chassis and the 2 managed InfiniBand switches for the 
storage (for redundancy).

Whenever needed, I can go through a PMR if this would ease the debugging, 
no problem for me.  I was wondering about the meaning of 'waiting for 
helper threads' and what could be the reason for it.
Thanks a lot for your help and best regards,
Marc 
_________________________________________
Paul Scherrer Institut 
High Performance Computing
Marc Caubet Serrabou
Building/Room: WHGA/019A
Forschungsstrasse, 111
5232 Villigen PSI
Switzerland

Telephone: +41 56 310 46 67
E-Mail: marc.caubet at psi.ch

From: gpfsug-discuss-bounces at spectrumscale.org 
[gpfsug-discuss-bounces at spectrumscale.org] on behalf of IBM Spectrum Scale 
[scale at us.ibm.com]
Sent: Thursday, April 18, 2019 5:54 PM
To: gpfsug main discussion list
Cc: gpfsug-discuss-bounces at spectrumscale.org
Subject: Re: [gpfsug-discuss] Performance problems + 
(MultiThreadWorkInstanceCond), reason 'waiting for helper threads'

We can try to provide some guidance on what you are seeing, but generally, 
for a true analysis of performance issues, customers should contact IBM 
lab based services (LBS).  We need some additional information to 
understand what is happening.
On which node did you collect the waiters and what command did you run to 
capture the data?
What is the network connection between the remote cluster and the storage 
cluster?

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
. 

If your query concerns a potential software error in Spectrum Scale (GPFS) 
and you have an IBM software maintenance contract please contact 
1-800-237-5511 in the United States or your local IBM Service Center in 
other countries. 

The forum is informally monitored as time permits and should not be used 
for priority messages to the Spectrum Scale (GPFS) team.



From:        "Caubet Serrabou Marc (PSI)" <marc.caubet at psi.ch>
To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:        04/18/2019 11:41 AM
Subject:        [gpfsug-discuss] Performance problems + 
(MultiThreadWorkInstanceCond), reason 'waiting for helper threads'
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hi all,

I would like to have some hints about the following problem:

Waiting 26.6431 sec since 17:18:32, ignored, thread 38298 
NSPDDiscoveryRunQueueThread: on ThCond 0x7FC98EB6A2B8 
(MultiThreadWorkInstanceCond), reason 'waiting for helper threads'
Waiting 2.7969 sec since 17:18:55, monitored, thread 39736 NSDThread: for 
I/O completion
Waiting 2.8024 sec since 17:18:55, monitored, thread 39580 NSDThread: for 
I/O completion
Waiting 3.0435 sec since 17:18:55, monitored, thread 39448 NSDThread: for 
I/O completion

I am testing a new GPFS cluster (a GPFS client cluster whose computing 
nodes remotely mount the GPFS storage cluster) and I am running 65 
gpfsperf commands (1 command per client, in parallel) as follows:

/usr/lpp/mmfs/samples/perf/gpfsperf create seq 
/gpfs/home/caubet_m/gpfsperf/$(hostname).txt -fsync -n 24g -r 16m -th 8 

I am unable to reach more than 6.5 GB/s (Lenovo DSS G240, GPFS 5.0.2-1, 
testing on a 'home' filesystem with a 1MB block size and 8KB subblocks). 
After several seconds I see many waiters for I/O completion (up to 5 
seconds), as well as the 'waiting for helper threads' message shown 
above.  Can somebody explain the meaning of this message?  How could I 
improve this?

Current config in the storage cluster is:

[root at merlindssio02 ~]# mmlsconfig 
Configuration data for cluster merlin.psi.ch:
---------------------------------------------
clusterName merlin.psi.ch
clusterId 1511090979434548295
autoload no
dmapiFileHandleSize 32
minReleaseLevel 5.0.2.0
ccrEnabled yes
nsdRAIDFirmwareDirectory /opt/lenovo/dss/firmware
cipherList AUTHONLY
maxblocksize 16m
[merlindssmgt01]
ignorePrefetchLUNCount yes
[common]
pagepool 4096M
[merlindssio01,merlindssio02]
pagepool 270089M
[merlindssmgt01,dssg]
pagepool 57684M
maxBufferDescs 2m
numaMemoryInterleave yes
[common]
prefetchPct 50
[merlindssmgt01,dssg]
prefetchPct 20
nsdRAIDTracks 128k
nsdMaxWorkerThreads 3k
nsdMinWorkerThreads 3k
nsdRAIDSmallThreadRatio 2
nsdRAIDThreadsPerQueue 16
nsdClientCksumTypeLocal ck64
nsdClientCksumTypeRemote ck64
nsdRAIDFlusherFWLogHighWatermarkMB 1000
nsdRAIDBlockDeviceMaxSectorsKB 0
nsdRAIDBlockDeviceNrRequests 0
nsdRAIDBlockDeviceQueueDepth 0
nsdRAIDBlockDeviceScheduler off
nsdRAIDMaxPdiskQueueDepth 128
nsdMultiQueue 512
verbsRdma enable
verbsPorts mlx5_0/1 mlx5_1/1
verbsRdmaSend yes
scatterBufferSize 256K
maxFilesToCache 128k
maxMBpS 40000
workerThreads 1024
nspdQueues 64
[common]
subnets 192.168.196.0/merlin-hpc.psi.ch;merlin.psi.ch
adminMode central

File systems in cluster merlin.psi.ch:
--------------------------------------
/dev/home
/dev/t16M128K
/dev/t16M16K
/dev/t1M8K
/dev/t4M16K
/dev/t4M32K
/dev/test

And for the computing cluster:

[root at merlin-c-001 ~]# mmlsconfig 
Configuration data for cluster merlin-hpc.psi.ch:
-------------------------------------------------
clusterName merlin-hpc.psi.ch
clusterId 14097036579263601931
autoload yes
dmapiFileHandleSize 32
minReleaseLevel 5.0.2.0
ccrEnabled yes
cipherList AUTHONLY
maxblocksize 16M
numaMemoryInterleave yes
maxFilesToCache 128k
maxMBpS 20000
workerThreads 1024
verbsRdma enable
verbsPorts mlx5_0/1
verbsRdmaSend yes
scatterBufferSize 256K
ignorePrefetchLUNCount yes
nsdClientCksumTypeLocal ck64
nsdClientCksumTypeRemote ck64
pagepool 32G
subnets 192.168.196.0/merlin-hpc.psi.ch;merlin.psi.ch
adminMode central

File systems in cluster merlin-hpc.psi.ch:
------------------------------------------
(none)

Thanks a lot and best regards,
Marc 
_________________________________________
Paul Scherrer Institut 
High Performance Computing
Marc Caubet Serrabou
Building/Room: WHGA/019A
Forschungsstrasse, 111
5232 Villigen PSI
Switzerland

Telephone: +41 56 310 46 67
E-Mail: marc.caubet at psi.ch
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss








