[gpfsug-discuss] Unexpected data in message/Bad message

Aaron Knister aaron.s.knister at nasa.gov
Wed Nov 7 23:37:37 GMT 2018


We're experiencing client nodes falling out of the cluster with errors 
that look like this:

Tue Nov  6 15:10:34.939 2018: [E] Unexpected data in message. Header dump: 00000000 0000 0000 00000047 00000000 00 00 0000 00000000 00000000 0000 0000
Tue Nov  6 15:10:34.942 2018: [E] [0/0] 512 more bytes were available:
Tue Nov  6 15:10:34.965 2018: [N] Close connection to 10.100.X.X nsdserver1 <c0n71> (Unexpected error 120)
Tue Nov  6 15:10:34.966 2018: [E] Network error on 10.100.X.X nsdserver1 <c0n71>, Check connectivity
Tue Nov  6 15:10:36.726 2018: [N] Restarting mmsdrserv
Tue Nov  6 15:10:38.850 2018: [E] Bad message
Tue Nov  6 15:10:38.851 2018: [X] The mmfs daemon is shutting down abnormally.
Tue Nov  6 15:10:38.852 2018: [N] mmfsd is shutting down.
Tue Nov  6 15:10:38.853 2018: [N] Reason for shutdown: LOGSHUTDOWN called

The cluster is running various PTF levels of GPFS 4.1.1.

Has anyone seen this before? I'm struggling to understand what it means 
from a technical point of view. Was GPFS expecting a larger message than 
it received? Or did it receive all of the bytes it expected, but some of 
them were corrupt? The log says "512 more bytes were available" but then 
doesn't show any additional bytes.

Thanks!

-Aaron

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776


