[gpfsug-discuss] GPFS autoload - wait for IB ports to become active

Jan-Frode Myklebust janfrode at tanso.net
Fri Apr 27 09:40:44 BST 2018


Alternative solution we're trying...

Create the file /etc/systemd/system/gpfs.service.d/delay.conf containing:

[Service]
  ExecStartPre=/bin/sleep 60


Then I expect we should have a long enough delay for InfiniBand to come up
before GPFS starts.
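
A fixed sleep is either longer than needed or too short on a slow boot. As a
rough, untested-on-real-hardware sketch, the sleep could be replaced by a poll
of the same sysfs state files that the callback script quoted below reads. The
function name wait_for_ib_ports, the SYSFS_IB variable and the timeout handling
are assumptions made for illustration, and the port names are passed in as
arguments instead of being read with mmdiag:

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): block until every listed
# InfiniBand port reports ACTIVE in sysfs, or give up after a timeout.
# Usage: wait_for_ib_ports <timeout_seconds> <device/port> [<device/port> ...]
# SYSFS_IB is overridable only so the function can be tried without hardware.
SYSFS_IB="${SYSFS_IB:-/sys/class/infiniband}"

wait_for_ib_ports() {
    local timeout=$1; shift
    local spec dev port rest pending
    local deadline=$(( SECONDS + timeout ))
    while :; do
        pending=0
        for spec in "$@"; do
            # Split "device/port"; anything after a second slash is ignored.
            IFS=/ read -r dev port rest <<< "$spec"
            grep -q ACTIVE "$SYSFS_IB/$dev/ports/$port/state" 2>/dev/null || pending=1
        done
        if (( pending == 0 )); then return 0; fi
        if (( SECONDS >= deadline )); then return 1; fi
        sleep 1
    done
}

# When run as a script with arguments, act as the pre-start check.
if [[ "${BASH_SOURCE[0]}" == "$0" && $# -gt 0 ]]; then
    wait_for_ib_ports "$@"
    exit
fi
```

Saved under some path of your choosing and made executable, it could be called
from the same drop-in as "ExecStartPre=<path> 120 mlx5_0/1", where mlx5_0/1 is
only an example port name. A non-zero exit from ExecStartPre makes systemd
treat the start as failed, so a timeout here stops GPFS from starting rather
than letting it start without IB.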



  -jf


On Fri, Mar 16, 2018 at 1:05 PM, Frederick Stock <stockf at us.ibm.com> wrote:

> I have my doubts that mmdiag can be used in this script.  In general the
> guidance is to avoid or be very careful with mm* commands in a callback due
> to the potential for deadlock.
>
> Fred
> __________________________________________________
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> stockf at us.ibm.com
>
>
>
> From:        Jan-Frode Myklebust <janfrode at tanso.net>
> To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date:        03/16/2018 04:30 AM
>
> Subject:        Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> Thanks Olaf, but we don't use NetworkManager on this cluster..
>
> I now created this simple script:
>
>
> ------------------------------------------------------------
> #! /bin/bash -
> #
> # Fail mmstartup if not all configured IB ports are active.
> #
> # Install with:
> #
> # mmaddcallback fail-if-ibfail --command /var/mmfs/etc/fail-if-ibfail --event preStartup --sync --onerror shutdown
> #
>
> for port in $(/usr/lpp/mmfs/bin/mmdiag --config|grep verbsPorts | cut -f 4- -d " ")
> do
> grep  -q ACTIVE /sys/class/infiniband/${port%/*}/ports/${port##*/}/state || exit 1
> done
> ------------------------------------------------------------
>
> which I haven't tested, but assume should work. Suggestions for
> improvements would be much appreciated!
>
>
>
>   -jf
>
>
> On Thu, Mar 15, 2018 at 6:30 PM, Olaf Weiser <olaf.weiser at de.ibm.com> wrote:
>
> You can try:
> systemctl enable NetworkManager-wait-online
> ln -s '/usr/lib/systemd/system/NetworkManager-wait-online.service' '/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online.service'
>
> In many cases it helps.
>
>
>
>
>
> From:        Jan-Frode Myklebust <janfrode at tanso.net>
> To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date:        03/15/2018 06:18 PM
> Subject:        Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> I found some discussion on this at
> https://www.ibm.com/developerworks/community/forums/html/threadTopic?id=77777777-0000-0000-0000-000014471957&ps=25
> and there it's claimed that none of the callback events are early enough to
> resolve this, and that we need a pre-preStartup trigger. Any idea if this has
> changed -- or is the only callback option then to do a "--onerror
> shutdown" if it has failed to connect IB?
>
>
> On Thu, Mar 8, 2018 at 1:42 PM, Frederick Stock <stockf at us.ibm.com> wrote:
> You could also use the GPFS preStartup callback (mmaddcallback) to execute
> a script synchronously that waits for the IB ports to become available
> before returning and allowing GPFS to continue.  Not systemd integrated but
> it should work.
>
> Fred
> __________________________________________________
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> stockf at us.ibm.com
>
>
>
> From:        david_johnson at brown.edu
> To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date:        03/08/2018 07:34 AM
> Subject:        Re: [gpfsug-discuss] GPFS autoload - wait for IB ports to become active
> Sent by:        gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
>
> Until IBM provides a solution, here is my workaround. Add it so it runs
> before the gpfs script; I call it from our custom xCAT diskless boot
> scripts. Based on RHEL 7, not fully systemd integrated. YMMV!
>
> Regards,
>  — ddj
> ——-
> [ddj at storage041 ~]$ cat /etc/init.d/ibready
> #! /bin/bash
> #
> # chkconfig: 2345 06 94
> # /etc/rc.d/init.d/ibready
> # written in 2016 David D Johnson (ddj <at> brown.edu)
> #
> ### BEGIN INIT INFO
> # Provides:             ibready
> # Required-Start:
> # Required-Stop:
> # Default-Stop:
> # Description: Block until infiniband is ready
> # Short-Description: Block until infiniband is ready
> ### END INIT INFO
>
> RETVAL=0
> if [[ -d /sys/class/infiniband ]]
> then
>         IBDEVICE=$(dirname $(grep -il infiniband /sys/class/infiniband/*/ports/1/link* | head -n 1))
> fi
> # See how we were called.
> case "$1" in
>   start)
>         if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
>         then
>                 echo -n "Polling for InfiniBand link up: "
>                 for (( count = 60; count > 0; count-- ))
>                 do
>                         if grep -q ACTIVE $IBDEVICE/state
>                         then
>                                 echo ACTIVE
>                                 break
>                         fi
>                         echo -n "."
>                         sleep 5
>                 done
>                 if (( count <= 0 ))
>                 then
>                         echo DOWN - $0 timed out
>                 fi
>         fi
>         ;;
>   stop|restart|reload|force-reload|condrestart|try-restart)
>         ;;
>   status)
>         if [[ -n $IBDEVICE && -f $IBDEVICE/state ]]
>         then
>                 echo "$IBDEVICE is $(< $IBDEVICE/state) $(< $IBDEVICE/rate)"
>         else
>                 echo "No IBDEVICE found"
>         fi
>         ;;
>   *)
>         echo "Usage: ibready {start|stop|status|restart|reload|force-reload|condrestart|try-restart}"
>         exit 2
> esac
> exit ${RETVAL}
> ————
>
>   -- ddj
> Dave Johnson
>
> On Mar 8, 2018, at 6:10 AM, Caubet Serrabou Marc (PSI) <marc.caubet at psi.ch> wrote:
>
> Hi all,
>
> With autoload = yes, nothing ensures that GPFS is started only after the
> IB link comes up. Is there a way to make GPFS wait until the IB ports are
> up before it starts? This can probably be done by adding something like
> After=network-online.target and Wants=network-online.target to the systemd
> unit file, but I would like to know if this is natively possible from the
> GPFS configuration.
>
> Thanks a lot,
> Marc
> _________________________________________
> Paul Scherrer Institut
> High Performance Computing
> Marc Caubet Serrabou
> WHGA/036
> 5232 Villigen PSI
> Switzerland
>
> Telephone: +41 56 310 46 67
> E-Mail: marc.caubet at psi.ch
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss