[gpfsug-discuss] Preferred NSD

Wed Mar 14 19:23:18 GMT 2018

My understanding is that with Spectrum Scale 5.0 there is no longer a 
standard edition, only data management and advanced, and the pricing is 
all done via storage, not sockets.  Now there may be some grandfathering 
for those with existing socket licenses but I really do not know.  My 
point is that data management is not the same as advanced edition.  Again 
I could be wrong because I tend not to concern myself with how the product 
is licensed.

Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
stockf at us.ibm.com

From:   Stephen Ulmer <ulmer at ulmer.org>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   03/14/2018 03:06 PM
Subject:        Re: [gpfsug-discuss] Preferred NSD
Sent by:        gpfsug-discuss-bounces at spectrumscale.org

Depending on the size... I just quoted something both ways and DME (which 
is Advanced Edition equivalent) was about $400K cheaper than Standard 
Edition socket pricing for this particular customer and use case. It all 
depends.

Also, for the case where the OP wants to distribute the file system around 
on NVMe in *every* node, there is always the FPO license. The FPO license 
can share NSDs with other FPO licensed nodes and servers (just not 
clients).

-- 
Stephen

On Mar 14, 2018, at 1:33 PM, Sobey, Richard A <r.sobey at imperial.ac.uk> wrote:

2. Have data management edition and capacity license the amount of 
storage.
There goes the budget 😉

Richard

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Simon Thompson (IT Research Support)
Sent: 14 March 2018 16:54
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Preferred NSD

Not always true.

1. Use them with socket licenses as HAWC or LROC is OK on a client.
2. Have data management edition and capacity license the amount of 
storage.

Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jeffrey R. Lang [JRLang at uwyo.edu]
Sent: 14 March 2018 14:11
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Preferred NSD

Something I haven't heard in this discussion is the licensing of 
GPFS.

I believe that once you export disks from a node it then becomes a server 
node and the license may need to be changed, from client to server.  There 
goes the budget.

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Lukas Hejtmanek
Sent: Wednesday, March 14, 2018 4:28 AM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Preferred NSD

Hello,

thank you for the insight. Well, the point is that I will get ~60 nodes 
with 120 NVMe disks in them, each about 2 TB in size. That means I will 
have 240 TB of NVMe SSD that could make a nice shared scratch. Moreover, 
I have no other hardware or place to put these SSDs. They have to be in 
the compute nodes.
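
As a rough sanity check of the figures above, here is a minimal sketch of the capacity arithmetic. The node count, drives per node, and drive size come from this thread; the function names are made up for illustration, and the model ignores metadata and file system overhead. It only shows how much of the 240 TB stays usable if every block is stored 1, 2, or 3 times.

```python
# Back-of-the-envelope capacity for the setup described in this thread.
# Figures from the thread: ~60 nodes, two NVMe SSDs per node, ~2 TB each.
NODES = 60
DRIVES_PER_NODE = 2
DRIVE_TB = 2.0


def raw_capacity_tb(nodes, drives_per_node, drive_tb):
    """Total raw capacity in TB across all drives."""
    return nodes * drives_per_node * drive_tb


def usable_capacity_tb(raw_tb, replicas):
    """Usable capacity when every block is stored `replicas` times.

    Simplification: ignores metadata, reserved space and other overhead.
    """
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return raw_tb / replicas


if __name__ == "__main__":
    raw = raw_capacity_tb(NODES, DRIVES_PER_NODE, DRIVE_TB)
    for replicas in (1, 2, 3):
        usable = usable_capacity_tb(raw, replicas)
        print(f"replicas={replicas}: {usable:.0f} TB usable of {raw:.0f} TB raw")
```

With three copies of everything, only a third of the raw NVMe capacity is left for scratch data.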

On Tue, Mar 13, 2018 at 10:48:21AM -0700, Alex Chekholko wrote:
I would like to discourage you from building a large distributed 
clustered filesystem made of many unreliable components.  You will 
need to overprovision your interconnect and will also spend a lot of 
time in "healing" or "degraded" state.

It is typically cheaper to centralize the storage into a subset of 
nodes and configure those to be more highly available.  E.g. of your
60 nodes, take 8 and put all the storage into those and make that a 
dedicated GPFS cluster with no compute jobs on those nodes.  Again, 
you'll still need really beefy and reliable interconnect to make this 
work.

Stepping back; what is the actual problem you're trying to solve?  I 
have certainly been in that situation before, where the problem is 
more like: "I have a fixed hardware configuration that I can't change, 
and I want to try to shoehorn a parallel filesystem onto that."

I would recommend looking closer at your actual workloads.  If this is 
a "scratch" filesystem and file access is mostly from one node at a 
time, it's not very useful to make two additional copies of that data 
on other nodes, and it will only slow you down.

Regards,
Alex

On Tue, Mar 13, 2018 at 7:16 AM, Lukas Hejtmanek <xhejtman at ics.muni.cz> wrote:

On Tue, Mar 13, 2018 at 10:37:43AM +0000, John Hearns wrote:
Lukas,
It looks like you are proposing a setup which uses your compute servers 
as storage servers also?

yes, exactly. I would like to utilise the NVMe SSDs that are in every 
compute server. Using them as a shared scratch area with GPFS is 
one of the options.

I'm thinking about the following setup:
~ 60 nodes, each with two enterprise NVMe SSDs, FDR IB interconnected

There is nothing wrong with this concept, for instance see 
https://www.beegfs.io/wiki/BeeOND

I have an NVMe filesystem which uses 60 drives, but there are 10 servers.
You should look at "failure zones" also.

So you still need the storage servers, and the local SSDs are used only 
for caching, do I understand correctly?

From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP]
Sent: Monday, March 12, 2018 4:14 PM
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Preferred NSD

Hi Lukas,

Check out FPO mode. That mimics Hadoop's data placement features. You 
can have up to 3 replicas of both data and metadata, but the downside, 
as you say, is that the wrong node failures will still take your 
cluster down.
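
To illustrate what "the wrong node failures" means, here is a toy model, not GPFS's actual placement algorithm. It assumes a hypothetical round-robin placement in which each block's replicas land on consecutive nodes, and checks which sets of failed nodes hold every copy of some block. All names are invented for this sketch.

```python
from itertools import combinations


def place_replicas(num_blocks, nodes, replicas):
    """Toy placement (an assumption, not GPFS behaviour):
    block b is stored on `replicas` consecutive nodes, wrapping around."""
    n = len(nodes)
    return [
        {nodes[(b + i) % n] for i in range(replicas)}
        for b in range(num_blocks)
    ]


def unavailable_blocks(placement, failed_nodes):
    """Return indices of blocks whose every replica sits on a failed node."""
    failed = set(failed_nodes)
    return [b for b, holders in enumerate(placement) if holders <= failed]


def fatal_failure_sets(placement, nodes, num_failures):
    """All combinations of `num_failures` nodes that lose at least one block."""
    return [
        combo
        for combo in combinations(nodes, num_failures)
        if unavailable_blocks(placement, combo)
    ]


if __name__ == "__main__":
    nodes = [f"node{i:02d}" for i in range(6)]
    placement = place_replicas(12, nodes, replicas=2)
    singles = fatal_failure_sets(placement, nodes, 1)
    pairs = fatal_failure_sets(placement, nodes, 2)
    print(f"single failures that lose data: {len(singles)}")
    print(f"double failures that lose data: {len(pairs)}")
```

In this model two replicas survive any single node failure, but some pairs of nodes hold both copies of a block, so losing exactly that pair makes data unavailable. The more widely blocks are spread, the more such pairs exist.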

You might want to check out something like Excelero's NVMesh (note: not 
an endorsement since I can't give such things) which can create 
logical volumes across all your NVMe drives. The product has erasure 
coding on their roadmap. I'm not sure if they've released that 
feature yet but in theory it will give better fault tolerance *and* 
you'll get more efficient usage of your SSDs.

I'm sure there are other ways to skin this cat too.

-Aaron

On March 12, 2018 at 10:59:35 EDT, Lukas Hejtmanek <xhejtman at ics.muni.cz> wrote:
Hello,

I'm thinking about the following setup:
~ 60 nodes, each with two enterprise NVMe SSDs, FDR IB interconnected

I would like to set up a shared scratch area using GPFS and those NVMe 
SSDs, with each SSD as one NSD.
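
A minimal sketch of what "each SSD as one NSD" could look like as input for mmcrnsd, written as a small Python generator. The stanza keywords (%nsd, device, nsd, servers, usage, failureGroup, pool) follow the documented NSD stanza format as I understand it; the host names, device paths, NSD naming scheme, and the choice of one failure group per node are assumptions for illustration only. Check them against the mmcrnsd documentation for your release before using anything like this.

```python
def nsd_stanzas(nodes, devices, pool="system"):
    """Build one NSD stanza per (node, device) pair.

    Assumptions for this sketch: each node serves only its own local
    drives, and each node gets its own failure group number.
    """
    stanzas = []
    for group, node in enumerate(nodes, start=1):
        for idx, device in enumerate(devices):
            stanzas.append(
                f"%nsd: device={device}\n"
                f"  nsd={node}_nvme{idx}\n"   # hypothetical naming scheme
                f"  servers={node}\n"
                f"  usage=dataAndMetadata\n"
                f"  failureGroup={group}\n"
                f"  pool={pool}\n"
            )
    return stanzas


if __name__ == "__main__":
    # Hypothetical host names and device paths.
    nodes = [f"node{i:02d}" for i in range(1, 61)]
    devices = ["/dev/nvme0n1", "/dev/nvme1n1"]
    stanzas = nsd_stanzas(nodes, devices)
    print(f"{len(stanzas)} stanzas generated; first one:")
    print(stanzas[0])
```

Putting both drives of a node in the same failure group means replicas of a block should not end up on two drives that fail together when that node goes down.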

I don't think 5 or more data/metadata replicas are practical here. On the 
other hand, multiple node failures are really to be expected.

Is there a way to specify that the local NSD is strongly preferred for 
storing data, i.e. so that a node failure most probably does not result 
in unavailable data for the other nodes?

Or is there any other recommendation/solution for building a shared 
scratch with GPFS in such a setup? ("Do not do it" included.)

--
Lukáš Hejtmánek
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org 
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

--
Lukáš Hejtmánek

Linux Administrator only because
 Full Time Multitasking Ninja
 is not an official job title