[ceph-users] SLOW SSD's after moving to Bluestore

Discussion:

Tyler Bishop

2018-12-11 00:09:53 UTC

Hi,

I have an SSD only cluster that I recently converted from filestore to
bluestore and performance has totally tanked. It was fairly decent before,
only having a little additional latency than expected. Now since
converting to bluestore the latency is extremely high, SECONDS. I am
trying to determine if it an issue with the SSD's or Bluestore treating
them differently than filestore... potential garbage collection? 24+ hrs ???

I am now seeing constant 100% IO utilization on ALL of the devices and
performance is terrible!

IOSTAT

avg-cpu: %user %nice %system %iowait %steal %idle
1.37 0.00 0.34 18.59 0.00 79.70

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz
avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 9.50 0.00 64.00 13.47
0.01 1.16 0.00 1.16 1.11 1.05
sdb 0.00 96.50 4.50 46.50 34.00 11776.00 463.14
132.68 1174.84 782.67 1212.80 19.61 100.00
dm-0 0.00 0.00 5.50 128.00 44.00 8162.00 122.94
507.84 1704.93 674.09 1749.23 7.49 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.85 0.00 0.30 23.37 0.00 75.48

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz
avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 3.00 0.00 17.00 11.33
0.01 2.17 0.00 2.17 2.17 0.65
sdb 0.00 24.50 9.50 40.50 74.00 10000.00 402.96
83.44 2048.67 1086.11 2274.46 20.00 100.00
dm-0 0.00 0.00 10.00 33.50 78.00 2120.00 101.06
287.63 8590.47 1530.40 10697.96 22.99 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.81 0.00 0.30 11.40 0.00 87.48

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz
avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 6.00 0.00 40.25 13.42
0.01 1.33 0.00 1.33 1.25 0.75
sdb 0.00 314.50 15.50 72.00 122.00 17264.00 397.39
61.21 1013.30 740.00 1072.13 11.41 99.85
dm-0 0.00 0.00 10.00 427.00 78.00 27728.00 127.26
224.12 712.01 1147.00 701.82 2.28 99.85

avg-cpu: %user %nice %system %iowait %steal %idle
1.22 0.00 0.29 4.01 0.00 94.47

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz
avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 3.50 0.00 17.00 9.71
0.00 1.29 0.00 1.29 1.14 0.40
sdb 0.00 0.00 1.00 39.50 8.00 10112.00 499.75
78.19 1711.83 1294.50 1722.39 24.69 100.00

Mark Nelson

2018-12-11 00:43:31 UTC

Permalink

Hi Tyler,

I think we had a user a while back that reported they had background
deletion work going on after upgrading their OSDs from filestore to
bluestore due to PGs having been moved around. Is it possible that your
cluster is doing a bunch of work (deletion or otherwise) beyond the
regular client load? I don't remember how to check for this off the top
of my head, but it might be something to investigate. If that's what it
is, we just recently added the ability to throttle background deletes:

https://github.com/ceph/ceph/pull/24749

If the logs/admin socket don't tell you anything, you could also try
using our wallclock profiler to see what the OSD is spending it's time
doing:

https://github.com/markhpc/gdbpmp/

./gdbpmp -t 1000 -p`pidof ceph-osd` -o foo.gdbpmp

./gdbpmp -i foo.gdbpmp -t 1

Mark

Post by Tyler Bishop
Hi,
I have an SSD only cluster that I recently converted from filestore to
bluestore and performance has totally tanked. It was fairly decent
before, only having a little additional latency than expected. Now
since converting to bluestore the latency is extremely high, SECONDS.
I am trying to determine if it an issue with the SSD's or Bluestore
treating them differently than filestore... potential garbage
collection? 24+ hrs ???
I am now seeing constant 100% IO utilization on ALL of the devices and
performance is terrible!
IOSTAT
avg-cpu: %user %nice %system %iowait %steal %idle
1.37 0.00 0.34 18.59 0.00 79.70
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 9.50 0.00 64.00
13.47 0.01 1.16 0.00 1.16 1.11 1.05
sdb 0.00 96.50 4.50 46.50 34.00 11776.00
463.14 132.68 1174.84 782.67 1212.80 19.61 100.00
dm-0 0.00 0.00 5.50 128.00 44.00 8162.00
122.94 507.84 1704.93 674.09 1749.23 7.49 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.85 0.00 0.30 23.37 0.00 75.48
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 3.00 0.00 17.00
11.33 0.01 2.17 0.00 2.17 2.17 0.65
sdb 0.00 24.50 9.50 40.50 74.00 10000.00
402.96 83.44 2048.67 1086.11 2274.46 20.00 100.00
dm-0 0.00 0.00 10.00 33.50 78.00 2120.00
101.06 287.63 8590.47 1530.40 10697.96 22.99 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.81 0.00 0.30 11.40 0.00 87.48
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 6.00 0.00 40.25
13.42 0.01 1.33 0.00 1.33 1.25 0.75
sdb 0.00 314.50 15.50 72.00 122.00 17264.00
397.39 61.21 1013.30 740.00 1072.13 11.41 99.85
dm-0 0.00 0.00 10.00 427.00 78.00 27728.00
127.26 224.12 712.01 1147.00 701.82 2.28 99.85
avg-cpu: %user %nice %system %iowait %steal %idle
1.22 0.00 0.29 4.01 0.00 94.47
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 3.50 0.00 17.00
9.71 0.00 1.29 0.00 1.29 1.14 0.40
sdb 0.00 0.00 1.00 39.50 8.00 10112.00
499.75 78.19 1711.83 1294.50 1722.39 24.69 100.00
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Tyler Bishop

2018-12-11 01:43:40 UTC

Permalink

I don't think thats my issue here because I don't see any IO to justify the
latency. Unless the IO is minimal and its ceph issuing a bunch of discards
to the ssd and its causing it to slow down while doing that.

Log isn't showing anything useful and I have most debugging disabled.

Post by Mark Nelson
Hi Tyler,
I think we had a user a while back that reported they had background
deletion work going on after upgrading their OSDs from filestore to
bluestore due to PGs having been moved around. Is it possible that your
cluster is doing a bunch of work (deletion or otherwise) beyond the
regular client load? I don't remember how to check for this off the top
of my head, but it might be something to investigate. If that's what it
https://github.com/ceph/ceph/pull/24749
If the logs/admin socket don't tell you anything, you could also try
using our wallclock profiler to see what the OSD is spending it's time
https://github.com/markhpc/gdbpmp/
./gdbpmp -t 1000 -p`pidof ceph-osd` -o foo.gdbpmp
./gdbpmp -i foo.gdbpmp -t 1
Mark

_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Christian Balzer

2018-12-11 01:57:19 UTC

Permalink

Hello,

Post by Tyler Bishop
I don't think thats my issue here because I don't see any IO to justify the
latency. Unless the IO is minimal and its ceph issuing a bunch of discards
to the ssd and its causing it to slow down while doing that.

Post by Tyler Bishop
Log isn't showing anything useful and I have most debugging disabled.

_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Christian Balzer Network/Systems Engineer
***@gol.com Rakuten Communications

Tyler Bishop

2018-12-11 01:58:07 UTC

Permalink

Older Crucial/Micron M500/M600
_____________________________________________

*Tyler Bishop*
EST 2007

O: 513-299-7108 x1000
M: 513-646-5809
http://BeyondHosting.net <http://beyondhosting.net/>

This email is intended only for the recipient(s) above and/or
otherwise authorized personnel. The information contained herein and
attached is confidential and the property of Beyond Hosting. Any
unauthorized copying, forwarding, printing, and/or disclosing
any information related to this email is prohibited. If you received this
message in error, please contact the sender and destroy all copies of this
email and any attachment(s).

Post by Christian Balzer
Hello,

Post by Tyler Bishop
I don't think thats my issue here because I don't see any IO to justify

the

Post by Tyler Bishop
latency. Unless the IO is minimal and its ceph issuing a bunch of

discards

Post by Tyler Bishop
to the ssd and its causing it to slow down while doing that.

Post by Tyler Bishop
Log isn't showing anything useful and I have most debugging disabled.

your

Post by Tyler Bishop

Post by Mark Nelson
cluster is doing a bunch of work (deletion or otherwise) beyond the
regular client load? I don't remember how to check for this off the

top

Post by Tyler Bishop

Post by Mark Nelson
of my head, but it might be something to investigate. If that's what

Post by Tyler Bishop

Post by Mark Nelson
https://github.com/ceph/ceph/pull/24749
If the logs/admin socket don't tell you anything, you could also try
using our wallclock profiler to see what the OSD is spending it's time
https://github.com/markhpc/gdbpmp/
./gdbpmp -t 1000 -p`pidof ceph-osd` -o foo.gdbpmp
./gdbpmp -i foo.gdbpmp -t 1
Mark

Post by Tyler Bishop
Hi,
I have an SSD only cluster that I recently converted from filestore

Post by Tyler Bishop

Post by Mark Nelson

Post by Tyler Bishop
bluestore and performance has totally tanked. It was fairly decent
before, only having a little additional latency than expected. Now
since converting to bluestore the latency is extremely high, SECONDS.
I am trying to determine if it an issue with the SSD's or Bluestore
treating them differently than filestore... potential garbage
collection? 24+ hrs ???
I am now seeing constant 100% IO utilization on ALL of the devices

and

Post by Tyler Bishop

Post by Mark Nelson

Post by Tyler Bishop
performance is terrible!
IOSTAT
avg-cpu: %user %nice %system %iowait %steal %idle
1.37 0.00 0.34 18.59 0.00 79.70
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 9.50 0.00 64.00
13.47 0.01 1.16 0.00 1.16 1.11 1.05
sdb 0.00 96.50 4.50 46.50 34.00 11776.00
463.14 132.68 1174.84 782.67 1212.80 19.61 100.00
dm-0 0.00 0.00 5.50 128.00 44.00 8162.00
122.94 507.84 1704.93 674.09 1749.23 7.49 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.85 0.00 0.30 23.37 0.00 75.48
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 3.00 0.00 17.00
11.33 0.01 2.17 0.00 2.17 2.17 0.65
sdb 0.00 24.50 9.50 40.50 74.00 10000.00
402.96 83.44 2048.67 1086.11 2274.46 20.00 100.00
dm-0 0.00 0.00 10.00 33.50 78.00 2120.00
101.06 287.63 8590.47 1530.40 10697.96 22.99 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
0.81 0.00 0.30 11.40 0.00 87.48
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 6.00 0.00 40.25
13.42 0.01 1.33 0.00 1.33 1.25 0.75
sdb 0.00 314.50 15.50 72.00 122.00 17264.00
397.39 61.21 1013.30 740.00 1072.13 11.41 99.85
dm-0 0.00 0.00 10.00 427.00 78.00 27728.00
127.26 224.12 712.01 1147.00 701.82 2.28 99.85
avg-cpu: %user %nice %system %iowait %steal %idle
1.22 0.00 0.29 4.01 0.00 94.47
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s
avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 3.50 0.00 17.00
9.71 0.00 1.29 0.00 1.29 1.14 0.40
sdb 0.00 0.00 1.00 39.50 8.00 10112.00
499.75 78.19 1711.83 1294.50 1722.39 24.69 100.00
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Christian Balzer Network/Systems Engineer

Tyler Bishop

2018-12-11 02:01:31 UTC

Permalink

Post by Christian Balzer
Hello,

Post by Tyler Bishop
Log isn't showing anything useful and I have most debugging disabled.

_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Christian Balzer Network/Systems Engineer