[gpfsug-discuss] GPFS GUI - DataPool_capUtil error
Buterbaugh, Kevin L
Kevin.Buterbaugh at Vanderbilt.Edu
Mon Apr 9 18:17:52 BST 2018
Hi All,
I’m pretty new to using the GPFS GUI for health and performance monitoring, but am finding it very useful. I’ve got an issue that I can’t figure out. In my events I see:
Event name: pool-data_high_error
Component: File System
Entity type: Pool
Entity name: <redacted>
Event time: 3/26/18 4:44:10 PM
Message: The pool <redacted> of file system <redacted> reached a nearly exhausted data level. DataPool_capUtil
Description: The pool reached a nearly exhausted level.
Cause: The pool reached a nearly exhausted level.
User action: Add more capacity to pool or move data to different pool or delete data and/or snapshots.
Reporting node: <redacted>
Event type: Active health state of an entity which is monitored by the system.
Now this is for a “capacity” pool … i.e. one that mmapplypolicy is going to fill up to 97% full. Therefore, I’ve modified the thresholds:
### Threshold Rules ###
rule_name             metric                error  warn    direction  filterBy  groupBy                                             sensitivity
--------------------------------------------------------------------------------------------------------------------------------------------------
InodeCapUtil_Rule     Fileset_inode         90.0   80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_fset_name       300
MemFree_Rule          mem_memfree           50000  100000  low                  node                                                300
MetaDataCapUtil_Rule  MetaDataPool_capUtil  90.0   80.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300
DataCapUtil_Rule      DataPool_capUtil      99.0   90.0    high                 gpfs_cluster_name,gpfs_fs_name,gpfs_diskpool_name   300
But it’s still in an “Error” state. I see that the time of the event is March 26th at 4:44 PM, so I’m thinking this is something that’s just stale, but I can’t figure out how to clear it. The mmhealth command shows the error, too, and from that message it appears as if the event was triggered prior to my adjusting the thresholds:
Event Parameter Severity Active Since Event Message
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
pool-data_high_error redacted ERROR 2018-03-26 16:44:10 The pool redacted of file system redacted reached a nearly exhausted data level. 90.0
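To make the comparison concrete, here is a minimal Python sketch of how a two-level threshold rule with a direction might classify a sample. It is an illustration only, not how mmhealth is implemented: the ThresholdRule class, the classify method, the inclusive comparisons, and the previous 90.0/80.0 levels for DataCapUtil_Rule (inferred from the 90.0 at the end of the event message) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """Hypothetical model of one row of the threshold rules table."""
    name: str
    metric: str
    error: float
    warn: float
    direction: str  # "high": larger values are worse; "low": smaller values are worse

    def __post_init__(self):
        if self.direction not in ("high", "low"):
            raise ValueError(f"unknown direction: {self.direction!r}")

    def classify(self, value: float) -> str:
        """Return ERROR, WARNING or HEALTHY for a single sample.

        Assumption: the comparisons are inclusive. The real product may differ.
        """
        if self.direction == "high":
            if value >= self.error:
                return "ERROR"
            if value >= self.warn:
                return "WARNING"
        else:  # direction == "low"
            if value <= self.error:
                return "ERROR"
            if value <= self.warn:
                return "WARNING"
        return "HEALTHY"


# Assumed previous levels (90.0 / 80.0), inferred from the event message.
old_rule = ThresholdRule("DataCapUtil_Rule", "DataPool_capUtil", 90.0, 80.0, "high")
# Modified levels, taken from the threshold rules listing.
new_rule = ThresholdRule("DataCapUtil_Rule", "DataPool_capUtil", 99.0, 90.0, "high")
# A "low" direction rule from the same listing, for contrast.
mem_rule = ThresholdRule("MemFree_Rule", "mem_memfree", 50000, 100000, "low")

# 97.0 is the fill level mmapplypolicy targets for the capacity pool.
print("old rule at 97%:", old_rule.classify(97.0))
print("new rule at 97%:", new_rule.classify(97.0))
print("mem rule at 75000:", mem_rule.classify(75000))
```

Under these assumptions, a pool at 97% is an ERROR against the old levels but only a WARNING against the new ones, which is consistent with the event having been raised before the thresholds were changed. It also suggests that a pool filled to 97% could still report a warning with a 90.0 warn level, even once the stale error is gone.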
What do I need to do to get the GUI / mmhealth to recognize the new thresholds and clear this error? I’ve searched and searched in the GUI for a way to clear it. I’ve read the “Monitoring and Managing IBM Spectrum Scale Using the GUI” Redbook pretty much cover to cover and haven’t found anything there about how to clear this. Thanks...
Kevin
—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633