[ceph-users] Performance Problems
Scharfenberg, Buddy
2018-12-07 17:33:54 UTC
Hello all,

I'm new to Ceph management, and we're having some performance issues with a basic cluster we've set up.

We have 3 nodes set up: 1 with several large drives, 1 with a handful of small SSDs, and 1 with several NVMe drives. We have 46 OSDs in total, a healthy FS being served out, and 1024 PGs split over the metadata and data pools. I'm having performance problems on the clients which I haven't been able to pin down to the cluster itself, and I could use some guidance. I see around 600 MB/s out of each pool using rados bench, but only around 6 MB/s of direct transfer from clients using ceph-fuse and 30 MB/s using the kernel client. I've asked in IRC and was told, essentially, that my client performance will be tied to my lowest-performing OSD speed / (2 * ${num_rep}). My numbers reflect that: my slowest disks do 180 MB/s according to osd bench, replication is 3, and my writes top out around 30 MB/s (180 / (2 * 3) = 30).

What I was wondering is what, if anything, I can do to get per-client performance at least up to the write speed of my slowest OSDs. Also, given the constraints I have on most of my clients, how can I get better performance out of the ceph-fuse client?
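
For reference, the numbers above came from commands along these lines (pool names and durations are placeholders for what I actually used):

  # per-pool throughput, run from a client node
  rados bench -p cephfs_data 60 write --no-cleanup
  rados bench -p cephfs_metadata 60 write --no-cleanup
  # per-OSD write speed (where the 180 MB/s figure for the slowest disks comes from)
  ceph tell osd.0 bench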

Thanks,
Buddy.
Paul Emmerich
2018-12-07 17:51:41 UTC
How are you measuring the performance when using CephFS?

Paul
--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
Scharfenberg, Buddy
2018-12-07 18:04:27 UTC
I'm measuring with dd, writing 1 MB blocks from /dev/zero 1000 times, to get client write speeds.

-----Original Message-----
From: Paul Emmerich [mailto:***@croit.io]
Sent: Friday, December 07, 2018 11:52 AM
To: Scharfenberg, Buddy <***@mst.edu>
Cc: Ceph Users <ceph-***@lists.ceph.com>
Subject: Re: [ceph-users] Performance Problems

How are you measuring the performance when using CephFS?

Paul

Paul Emmerich
2018-12-07 18:30:46 UTC
What are the exact parameters you are using? I often see people using
dd in a way that effectively just measures write latency instead of
throughput.
Check out fio as a better/more realistic benchmarking tool.
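
For example, something along these lines gives a throughput-oriented number (the path and sizes are just placeholders, adjust them to your CephFS mount, and drop --direct=1 if the fuse client refuses it):

  fio --name=cephfs-write --directory=/mnt/cephfs --rw=write --bs=4M --size=4G \
      --ioengine=libaio --direct=1 --iodepth=16 --numjobs=1 --group_reporting

Comparing that against a run with --iodepth=1 will show you how much of the gap is just per-IO latency.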

Paul
--
Paul Emmerich

Post by Scharfenberg, Buddy
I'm measuring with dd, writing 1 MB blocks from /dev/zero 1000 times, to get client write speeds.
Scharfenberg, Buddy
2018-12-07 18:38:13 UTC
`dd if=/dev/zero of=/mnt/test/writetest bs=1M count=1000 oflag=dsync`

-----Original Message-----
From: Paul Emmerich [mailto:***@croit.io]
Sent: Friday, December 07, 2018 12:31 PM
To: Scharfenberg, Buddy <***@mst.edu>
Cc: Ceph Users <ceph-***@lists.ceph.com>
Subject: Re: [ceph-users] Performance Problems

What are the exact parameters you are using? I often see people using dd in a way that effectively just measures write latency instead of throughput.
Check out fio as a better/more realistic benchmarking tool.

Paul
Paul Emmerich
2018-12-07 18:59:20 UTC
That creates IO with a queue depth of 1, so you are effectively
measuring latency and not bandwidth.

30 MB/s works out to ~33 ms per 1 MB write on average. Assuming the
data is distributed across all 3 servers, each IO has to wait for one
of your "large drives": of those ~33 ms, only ~5-6 ms is the actual
write (from your 180 MB/s), and the remaining ~28 ms is latency.
Which seems like a reasonable result.
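
Spelled out with the numbers from this thread, as a quick back-of-the-envelope check:

  echo "scale=1; 1000/30" | bc              # ~33.3 ms total per 1 MB synchronous write at 30 MB/s
  echo "scale=1; 1000/180" | bc             # ~5.5 ms of that is the raw write at 180 MB/s
  echo "scale=1; 1000/30 - 1000/180" | bc   # ~27.8 ms left over as round-trip/commit latency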


Paul
--
Paul Emmerich

Post by Scharfenberg, Buddy
`dd if=/dev/zero of=/mnt/test/writetest bs=1M count=1000 oflag=dsync`
Stefan Kooman
2018-12-10 09:09:50 UTC
Post by Scharfenberg, Buddy
We have 3 nodes set up: 1 with several large drives, 1 with a handful of small SSDs, and 1 with several NVMe drives.
Post by Robert Sander
This is a very unusual setup. Do you really have all your HDDs in one node, the SSDs in another and NVMe in the third?
How do you guarantee redundancy?
Disk type != redundancy.
Post by Robert Sander
You should evenly distribute your storage devices across your nodes; this may already be a performance boost as it distributes the requests.
If performance is indeed important, it makes sense to do what Robert suggests. If you want to reduce the chance of the drives in all three hosts dying at the same time, the current layout (one drive type per host) can also make sense, assuming you have 3 replicas and host as the failure domain.
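
A quick way to sanity-check the layout and the failure domain, using just the standard commands (the pool name below is a guess, use yours):

  ceph osd tree                              # shows which OSDs sit under which host
  ceph osd pool get cephfs_data crush_rule   # which CRUSH rule the data pool uses
  ceph osd crush rule dump                   # confirms the rule's failure domain ("type": "host" in the chooseleaf step)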

Gr. Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / ***@bit.nl
Marc Roos
2018-12-10 08:54:16 UTC
I think this is an April Fools' Day joke from someone who did not set his clock correctly.



-----Original Message-----
From: Robert Sander [mailto:***@heinlein-support.de]
Sent: 10 December 2018 09:49
To: ceph-***@lists.ceph.com
Subject: Re: [ceph-users] Performance Problems
Post by Scharfenberg, Buddy
We have 3 nodes set up: 1 with several large drives, 1 with a handful of small SSDs, and 1 with several NVMe drives.
This is a very unusual setup. Do you really have all your HDDs in one
node, the SSDs in another and NVMe in the third?

How do you guarantee redundancy?

You should evenly distribute your storage devices across your nodes;
this may already be a performance boost as it distributes the requests.

Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
