Discussion:
[ceph-users] Benchmark performance when using SSD as the journal
D***@Dell.com
2018-11-14 04:21:02 UTC
Permalink
Hi all,

We want to compare the performance of an HDD partition as the journal (inline, on the same disk as the OSD) versus an SSD partition as the journal. Here is what we have done: we have 3 nodes used as Ceph OSD hosts, each with 3 OSDs on it. First, we created the OSDs with the journal on a partition of the OSD disk and ran the "rados bench" utility to test the performance; then we migrated the journal from the HDD to an SSD (Intel S4500) and ran "rados bench" again. The expected result was that the SSD journal would be much better than the HDD, but the results show nearly no change.

The configuration of Ceph is as below,
pool size: 3
osd count: 3*3 (3 OSDs on each of 3 nodes)
pg (pgp) num: 300
OSDs are spread across three different nodes
rbd image size: 10G (10240M)

The utility I used is,
rados bench -p rbd $duration write
rados bench -p rbd $duration seq
rados bench -p rbd $duration rand

Is there anything wrong with what I did? Could anyone give me some suggestions?


Best Regards,
Dave Chen
Ashley Merrick
2018-11-14 04:29:39 UTC
Permalink
Only certain SSDs are good for Ceph journals, as can be seen at
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

The SSD you're using isn't listed, but from a quick search online it appears
to be an SSD designed for read workloads as an "upgrade" from an HDD, so it
probably is not designed for the high write requirements a journal demands.
Therefore, when it's being hit by the workload of 3 OSDs, you're not going to
get much more performance out of it than you would just using the disk, as
you're seeing.
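A quick way to check this yourself is the kind of single-threaded synchronous 4k write test described in that post, roughly along these lines (the device path is only an example, and note this writes directly to the raw device, so only run it on a disk whose data you can lose):

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 \
    --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test

A journal-class SSD sustains a high rate of such synchronous writes; read-oriented drives often collapse to near-HDD speeds under them.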
D***@Dell.com
2018-11-14 04:45:26 UTC
Permalink
Thanks Merrick!

I checked the Intel spec [1]; the performance Intel quotes is:

• Sequential Read (up to) 500 MB/s
• Sequential Write (up to) 330 MB/s
• Random Read (100% Span) 72000 IOPS
• Random Write (100% Span) 20000 IOPS

I think these figures should be much better than a typical HDD, and I ran the read and write commands with "rados bench" separately, so there should be some difference.

Is there any kind of configuration that could give us a performance gain with this SSD (Intel S4500)?

[1] https://ark.intel.com/products/120521/Intel-SSD-DC-S4500-Series-480GB-2-5in-SATA-6Gb-s-3D1-TLC-

Best Regards,
Dave Chen

Ashley Merrick
2018-11-14 04:49:05 UTC
Permalink
Well, since you mentioned journals I guess you were using FileStore in your test?

You could go down the route of BlueStore and put the WAL + DB onto the SSD
and the BlueStore data onto the HDD; you should notice an increase in
performance over both methods you have tried on FileStore.
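For example, on a release that ships ceph-volume (Luminous or newer), an OSD laid out that way can be created roughly like this (device paths are placeholders):

# HDD (/dev/sdb) holds the BlueStore data; the SSD partitions hold the
# RocksDB DB and the WAL.
ceph-volume lvm create --bluestore \
    --data /dev/sdb \
    --block.db /dev/sdc1 \
    --block.wal /dev/sdc2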
D***@Dell.com
2018-11-14 06:15:20 UTC
Permalink
Thanks Merrick!

I haven't tried BlueStore yet, but I believe what you said. I tried again with "rbd bench-write" on FileStore, and the result shows more than a 50% performance increase with the SSD as the journal, so I still cannot understand why "rados bench" does not show any difference. What's the rationale behind it? Do you know?
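For reference, an "rbd bench-write" invocation of that kind looks roughly like this (the image name and parameters are only placeholders):

rbd bench-write rbd/test-img --io-size 4096 --io-threads 16 --io-pattern rand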


Best Regards,
Dave Chen

Martin Verges
2018-11-14 05:48:42 UTC
Permalink
Please never use the datasheet values to select your SSD. We have never had a
single one that delivers the advertised performance in a Ceph journal use
case.

However, do not use FileStore anymore, especially with newer kernel
versions. Use BlueStore instead.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: ***@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx
D***@Dell.com
2018-11-14 06:27:59 UTC
Permalink
Thanks Martin for your suggestion!
I will definitely try BlueStore later. The version of Ceph I am using is v10.2.10 (Jewel); do you think BlueStore is stable enough on Jewel, or should I upgrade Ceph to Luminous?


Best Regards,
Dave Chen

Joe Comeau
2018-11-14 20:56:15 UTC
Permalink
Hi Dave

Have you looked at the Intel P4600 vs. the P4500?

The P4600 has better random writes and better drive writes per day, I
believe.

Thanks Joe
Maged Mokhtar
2018-11-14 07:36:00 UTC
Permalink
Hi Dave,

The SSD journal will help boost IOPS and latency, which will be more
apparent with small block sizes. The rados bench default block size
is 4M; use the -b option to specify the size. Try 4k, 32k, 64k, etc.
As a side note, this is a RADOS-level test, so the RBD image size is not
relevant here.
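For example, something along these lines (pool name, duration, and thread count are only illustrative; --no-cleanup keeps the written objects so the seq/rand reads have data to work on):

rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup
rados bench -p rbd 60 seq -t 16
rados bench -p rbd 60 rand -t 16
rados -p rbd cleanup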

Maged.
D***@Dell.com
2018-11-14 09:19:06 UTC
Permalink
Thanks Mokhtar! This is what I was looking for; thanks for your explanation!


Best Regards,
Dave Chen

Marc Roos
2018-11-14 08:36:33 UTC
Permalink
Try comparing results from something like this fio test:


[global]
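# fio job file: sequential and random read/write/mixed tests at 4k, 128k,
# 1M and 4M block sizes against a file on a CephFS mount, iodepth 1,
# 180s per job after a 30s ramp.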
ioengine=posixaio
invalidate=1
ramp_time=30
iodepth=1
runtime=180
time_based
direct=1
filename=/mnt/cephfs/ssd/fio-bench.img

[write-4k-seq]
stonewall
bs=4k
rw=write
#write_bw_log=sdx-4k-write-seq.results
#write_iops_log=sdx-4k-write-seq.results

[randwrite-4k-seq]
stonewall
bs=4k
rw=randwrite
#write_bw_log=sdx-4k-randwrite-seq.results
#write_iops_log=sdx-4k-randwrite-seq.results

[read-4k-seq]
stonewall
bs=4k
rw=read
#write_bw_log=sdx-4k-read-seq.results
#write_iops_log=sdx-4k-read-seq.results

[randread-4k-seq]
stonewall
bs=4k
rw=randread
#write_bw_log=sdx-4k-randread-seq.results
#write_iops_log=sdx-4k-randread-seq.results

[rw-4k-seq]
stonewall
bs=4k
rw=rw
#write_bw_log=sdx-4k-rw-seq.results
#write_iops_log=sdx-4k-rw-seq.results

[randrw-4k-seq]
stonewall
bs=4k
rw=randrw
#write_bw_log=sdx-4k-randrw-seq.results
#write_iops_log=sdx-4k-randrw-seq.results

[write-128k-seq]
stonewall
bs=128k
rw=write
#write_bw_log=sdx-128k-write-seq.results
#write_iops_log=sdx-128k-write-seq.results

[randwrite-128k-seq]
stonewall
bs=128k
rw=randwrite
#write_bw_log=sdx-128k-randwrite-seq.results
#write_iops_log=sdx-128k-randwrite-seq.results

[read-128k-seq]
stonewall
bs=128k
rw=read
#write_bw_log=sdx-128k-read-seq.results
#write_iops_log=sdx-128k-read-seq.results

[randread-128k-seq]
stonewall
bs=128k
rw=randread
#write_bw_log=sdx-128k-randread-seq.results
#write_iops_log=sdx-128k-randread-seq.results

[rw-128k-seq]
stonewall
bs=128k
rw=rw
#write_bw_log=sdx-128k-rw-seq.results
#write_iops_log=sdx-128k-rw-seq.results

[randrw-128k-seq]
stonewall
bs=128k
rw=randrw
#write_bw_log=sdx-128k-randrw-seq.results
#write_iops_log=sdx-128k-randrw-seq.results

[write-1024k-seq]
stonewall
bs=1024k
rw=write
#write_bw_log=sdx-1024k-write-seq.results
#write_iops_log=sdx-1024k-write-seq.results

[randwrite-1024k-seq]
stonewall
bs=1024k
rw=randwrite
#write_bw_log=sdx-1024k-randwrite-seq.results
#write_iops_log=sdx-1024k-randwrite-seq.results

[read-1024k-seq]
stonewall
bs=1024k
rw=read
#write_bw_log=sdx-1024k-read-seq.results
#write_iops_log=sdx-1024k-read-seq.results

[randread-1024k-seq]
stonewall
bs=1024k
rw=randread
#write_bw_log=sdx-1024k-randread-seq.results
#write_iops_log=sdx-1024k-randread-seq.results

[rw-1024k-seq]
stonewall
bs=1024k
rw=rw
#write_bw_log=sdx-1024k-rw-seq.results
#write_iops_log=sdx-1024k-rw-seq.results

[randrw-1024k-seq]
stonewall
bs=1024k
rw=randrw
#write_bw_log=sdx-1024k-randrw-seq.results
#write_iops_log=sdx-1024k-randrw-seq.results

[write-4096k-seq]
stonewall
bs=4096k
rw=write
#write_bw_log=sdx-4096k-write-seq.results
#write_iops_log=sdx-4096k-write-seq.results

[randwrite-4096k-seq]
stonewall
bs=4096k
rw=randwrite
#write_bw_log=sdx-4096k-randwrite-seq.results
#write_iops_log=sdx-4096k-randwrite-seq.results

[read-4096k-seq]
stonewall
bs=4096k
rw=read
#write_bw_log=sdx-4096k-read-seq.results
#write_iops_log=sdx-4096k-read-seq.results

[randread-4096k-seq]
stonewall
bs=4096k
rw=randread
#write_bw_log=sdx-4096k-randread-seq.results
#write_iops_log=sdx-4096k-randread-seq.results

[rw-4096k-seq]
stonewall
bs=4096k
rw=rw
#write_bw_log=sdx-4096k-rw-seq.results
#write_iops_log=sdx-4096k-rw-seq.results

[randrw-4096k-seq]
stonewall
bs=4096k
rw=randrw
#write_bw_log=sdx-4096k-randrw-seq.results
#write_iops_log=sdx-4096k-randrw-seq.results
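
To run it, save the above as a job file and point fio at it (the file name is arbitrary):

fio ceph-bench.fio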



D***@Dell.com
2018-11-14 09:20:28 UTC
Permalink
Hi Roos,

I will try this configuration, thank you very much!

Best Regards,
Dave Chen

v***@yourcmc.ru
2018-11-14 11:35:48 UTC
Permalink
Hi Dave,

The main line in the SSD spec that you should look at is:
Enhanced Power Loss Data Protection: Yes
This makes the SSD cache non-volatile, so the SSD can safely ignore
fsync()s and transactional performance becomes equal to non-transactional.
So your SSDs should be OK for the journal.

rados bench is a poor tool for this kind of testing because of its 4M
default block size and the very small number of objects it creates. Better
to test with fio -ioengine=rbd -bs=4k -rw=randwrite, with -sync=1 -iodepth=1
for latency or -iodepth=128 for maximum random load.
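A concrete form of that test might look like this (pool and image names are placeholders, and the image must already exist):

fio --name=rbd-4k-randwrite --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=testimg --bs=4k --rw=randwrite --sync=1 --iodepth=1 \
    --runtime=60 --time_based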

Another thing I recently discovered is that turning off the write cache
on all drives (for i in /dev/sd*; do hdparm -W 0 $i; done)
increased write IOPS by an order of magnitude.