[gpfsug-discuss] bizarre performance behavior
Aaron Knister
aaron.s.knister at nasa.gov
Fri Feb 17 15:52:19 GMT 2017
This is a good one. I've got an NSD server with 4x 16GB fibre
connections coming in and 1x FDR10 and 1x QDR connection going out to
the clients. I was having a really hard time getting anything resembling
sensible performance out of it (4-5Gb/s writes but maybe 1.2Gb/s for
reads). The back-end is a DDN SFA12K and I *know* it can do better than
that.
I don't remember quite how I figured this out but simply by running
"openssl speed -multi 16" on the nsd server to drive up the load I saw
an almost 4x performance jump, which pretty much goes against every
sysadmin fiber in me (i.e. "drive up the cpu load with unrelated crap to
quadruple your i/o performance").
This feels like some type of C-state or frequency-scaling shenanigans that
I haven't quite nailed down yet. I booted the box with the following
kernel parameters "intel_idle.max_cstate=0 processor.max_cstate=0" which
didn't seem to make much of a difference. I also tried setting the
frequency governor to userspace and setting the minimum frequency to
2.6GHz (it's a 2.6GHz CPU). None of that really matters-- I still have
to run something to drive up the CPU load and then performance improves.
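One way to confirm that the governor and minimum-frequency settings actually took effect on every core is to read them back from sysfs. The following is a minimal sketch, assuming the standard Linux cpufreq layout under /sys/devices/system/cpu. The name cpufreq_report is a hypothetical helper, not something from the original setup, and the optional directory argument is only there so it can be pointed at a copy of the tree.

```shell
#!/bin/sh
# cpufreq_report [CPU_DIR]  (hypothetical helper name)
# Prints one line per CPU: name, governor, and min/cur/max frequency in kHz.
# CPU_DIR defaults to the usual sysfs location; CPUs without a cpufreq
# directory (no frequency-scaling driver loaded) are skipped.
cpufreq_report() {
    cpu_dir=${1:-/sys/devices/system/cpu}
    for d in "$cpu_dir"/cpu[0-9]*; do
        f=$d/cpufreq
        [ -r "$f/scaling_governor" ] || continue
        printf '%s governor=%s min=%s cur=%s max=%s\n' \
            "${d##*/}" \
            "$(cat "$f/scaling_governor")" \
            "$(cat "$f/scaling_min_freq")" \
            "$(cat "$f/scaling_cur_freq")" \
            "$(cat "$f/scaling_max_freq")"
    done
}
```

Calling cpufreq_report with no arguments once while the box is idle and once while "openssl speed -multi 16" is running would show whether the reported frequency moves with load. Keep in mind that scaling_cur_freq is what the kernel reports, which may not match what the hardware is doing at that instant.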
I'm wondering if this could be an issue with the C1E state? I'm curious
if anyone has seen anything like this. The node is a dx360 M4
(Sandy Bridge) with 16 2.6GHz cores and 32GB of RAM.
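To see which idle states the kernel is actually using, the cpuidle entries in sysfs can be listed. This is a minimal sketch, assuming the standard Linux cpuidle layout (stateN directories with name, latency and usage files). cpuidle_report is a hypothetical helper name, and the directory argument is only there so another CPU or a copy of the tree can be inspected.

```shell
#!/bin/sh
# cpuidle_report [CPUIDLE_DIR]  (hypothetical helper name)
# Prints one line per idle state: state directory, state name,
# exit latency in microseconds, and how many times the state was entered.
# CPUIDLE_DIR defaults to cpu0's cpuidle directory in sysfs.
cpuidle_report() {
    idle_dir=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    for s in "$idle_dir"/state[0-9]*; do
        # If the glob matched nothing, the literal pattern fails this check.
        [ -r "$s/name" ] || continue
        printf '%s name=%s latency_us=%s usage=%s\n' \
            "${s##*/}" \
            "$(cat "$s/name")" \
            "$(cat "$s/latency")" \
            "$(cat "$s/usage")"
    done
}
```

If a state with C1E in its name shows a usage count that keeps climbing during the slow reads, that would point toward C1E. If nothing is listed at all, the kernel is not managing idle states on this box, and C1E may still be a firmware setting that this check cannot see.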
-Aaron
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776