[gpfsug-discuss] Problem Determination

Wahl, Edward ewahl at osc.edu
Fri Oct 2 19:00:46 BST 2015


I'm not yet in the 4.x release stream so this may be taken with a grain (or more) of salt as we say.

PLEASE keep the ability of commands to set -x or dump debug when the env DEBUG=1 is set. This has been extremely useful over the years. Granted, I've never worked out why we sometimes see odd little things, like machines deciding they suddenly need an FPO license, or one NSD server suddenly deciding its name is part of the FQDN instead of just its hostname, and only for certain commands, but it's DAMN useful. Minor issues especially can be tracked down with it.
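
For anyone who hasn't used it, here is a rough sketch of the general pattern I mean: a script that turns on shell tracing only when DEBUG=1 is in the environment. The short_name helper and the nsd01.example.com host are made up for illustration and are not part of GPFS; the real mm* scripts may implement this differently.

```shell
#!/bin/sh
# Hypothetical helper illustrating the DEBUG=1 pattern; not an actual GPFS command.
short_name() {
    (
        # Trace every command to stderr only when the caller sets DEBUG=1.
        # The subshell keeps "set -x" from leaking into the calling shell.
        if [ "${DEBUG:-0}" = "1" ]; then
            set -x
        fi
        fqdn=$1
        # Strip everything from the first dot onward to get the short hostname.
        echo "${fqdn%%.*}"
    )
}

# Normal run: only the result is printed.
short_name nsd01.example.com

# Debug run: the trace goes to stderr, so it can be captured separately.
DEBUG=1 short_name nsd01.example.com 2> /tmp/short_name.trace
```

Against a real command the same idea would be something like "DEBUG=1 mmgetstate -a 2> trace.log", assuming the command honors the variable as described above.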

Undocumented features and logged items abound. I'd say start there. This is one area where it is definitely more art than science with Spectrum Scale. (Meh, GPFS still sounds better. So does Shark. Can we go back to calling it the Shark Server Project?)

Complete failure of the verbs layer and fallback to other defined networks would be nice to know about during operation. GPFS is excellent about telling you at startup but not so much during operation, at least in 3.5.

I imagine with the 'automated compatibility layer building' I'll be looking for some serious amounts of PD for the issues we _will_ see there. We frequently build against kernels we are not yet running at this site, so this needs well-documented PD and resolution.

Ed Wahl
OSC


________________________________
From: gpfsug-discuss-bounces at gpfsug.org [gpfsug-discuss-bounces at gpfsug.org] on behalf of Patrick Byrne [PATBYRNE at uk.ibm.com]
Sent: Thursday, October 01, 2015 6:09 AM
To: gpfsug-discuss at gpfsug.org
Subject: [gpfsug-discuss] Problem Determination

Hi all,

As I'm sure some of you are aware, problem determination is an area that we are looking to try and make significant improvements to over the coming releases of Spectrum Scale. To help us target the areas we work to improve and make it as useful as possible I am trying to get as much feedback as I can about different problems users have, and how people go about solving them.

I am interested in hearing everything from day to day annoyances to problems that have caused major frustration in trying to track down the root cause. Where possible it would be great to hear how the problems were dealt with as well, so that others can benefit from your experience. Feel free to reply to the mailing list - maybe others have seen similar problems and could provide tips for the future - or to me directly if you'd prefer (patbyrne at uk.ibm.com).

On a related note, in 4.1.1 there was a component added that monitors the state of the various protocols that are now supported (NFS, SMB, Object). The output from this is available with the 'mmces state' and 'mmces events' CLIs and I would like to get feedback from anyone who has had the chance to make use of this. Is it useful? How could it be improved? We are looking at the possibility of extending this component to cover more than just protocols, so any feedback would be greatly appreciated.

Thanks in advance,

Patrick Byrne
IBM Spectrum Scale - Development Engineer
IBM Systems - Manchester Lab
IBM UK Limited
