Scharfenberg, Buddy
2018-12-07 17:33:54 UTC
Hello all,
I'm new to Ceph management, and we're having some performance issues with a basic cluster we've set up.
We have 3 nodes set up: 1 with several large drives, 1 with a handful of small SSDs, and 1 with several NVMe drives. We have 46 OSDs in total, a healthy FS being served out, and 1024 PGs split over the metadata and data pools. I am having performance problems on the clients which I've been unable to trace to the cluster itself, and I could use some guidance. I am seeing around 600MB/s out of each pool using rados bench, but only around 6MB/s direct transfer from clients using ceph-fuse and 30MB/s using the kernel client. I've asked over in IRC and was told, essentially, that my performance will be tied to our lowest performing OSD speed / ( 2 * ${num_rep} ). My numbers reflect that: my lowest performing disks are 180MB/s according to osd bench, and my writes are around 30MB/s at best with replication at 3. (180/(2*3)=30)
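To make that arithmetic explicit, here is a minimal Python sketch of the rule of thumb as it was relayed on IRC. The function name and parameter names are hypothetical, chosen only for illustration, and the formula itself is the IRC estimate rather than anything confirmed against Ceph documentation.

```python
def estimated_client_write_mbps(slowest_osd_mbps: float, num_rep: int) -> float:
    """Rule-of-thumb estimate quoted on IRC (an assumption, not a Ceph guarantee):
    slowest OSD speed / (2 * replication factor)."""
    if num_rep < 1:
        raise ValueError("replication factor must be at least 1")
    return slowest_osd_mbps / (2 * num_rep)


# Numbers from this post: slowest disks at 180 MB/s (osd bench), replication 3.
estimate = estimated_client_write_mbps(180, 3)
print(f"Estimated client write throughput: {estimate:.0f} MB/s")
```

With the figures above this gives 30 MB/s, which matches what the kernel client reaches but not the roughly 6MB/s seen with ceph-fuse.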
What I was wondering is what, if anything, I can do to get performance for the individual clients at least near the write performance of my slowest OSDs. Also, given the constraints I have on most of my clients, how can I get better performance out of the ceph-fuse client?
Thanks,
Buddy.