Discussion:
[ceph-users] What could cause mon_osd_full_ratio to be exceeded?
Vladimir Brik
2018-11-26 15:28:38 UTC
Hello

I am doing some Ceph testing on a near-full cluster, and I noticed that,
after I brought down a node, some OSDs' utilization reached
osd_failsafe_full_ratio (97%). Why didn't it stop at mon_osd_full_ratio
(90%) if mon_osd_backfillfull_ratio is 90%?
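
For reference, while the node is down I'm watching per-OSD utilization with something like the sketch below (a minimal sketch; it assumes the ceph CLI is available and uses its JSON output):

    import json
    import subprocess

    # Print each OSD's fill percentage from `ceph osd df`.
    df = json.loads(subprocess.check_output(
        ["ceph", "osd", "df", "--format", "json"]))
    for node in df["nodes"]:
        print(node["name"], node["utilization"])  # utilization is a percentage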


Thanks,

Vlad
Vladimir Brik
2018-11-26 15:55:17 UTC
Post by Vladimir Brik
Why didn't it stop at mon_osd_full_ratio (90%)
Should be 95% -- mon_osd_full_ratio defaults to 0.95; the 90% default belongs to mon_osd_backfillfull_ratio.
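
For what it's worth, this is how the effective ratios can be read off the cluster (a quick sketch; assumes Luminous or later, where these are OSDMap fields that the mon_osd_* options only seed at cluster creation):

    import json
    import subprocess

    # Read the effective full/backfillfull/nearfull ratios from the OSDMap.
    osdmap = json.loads(subprocess.check_output(
        ["ceph", "osd", "dump", "--format", "json"]))
    for key in ("full_ratio", "backfillfull_ratio", "nearfull_ratio"):
        print(key, osdmap[key])  # defaults: 0.95, 0.90, 0.85

They can be adjusted at runtime with ceph osd set-full-ratio, set-backfillfull-ratio, and set-nearfull-ratio.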

Vlad
Gregory Farnum
2018-11-26 16:27:21 UTC
On Mon, Nov 26, 2018 at 10:28 AM Vladimir Brik wrote:
Post by Vladimir Brik
Hello
I am doing some Ceph testing on a near-full cluster, and I noticed that,
after I brought down a node, some OSDs' utilization reached
osd_failsafe_full_ratio (97%). Why didn't it stop at mon_osd_full_ratio
(90%) if mon_osd_backfillfull_ratio is 90%?
While I believe the very newest Ceph source will do this, it can be
surprisingly difficult to predict the exact size a PG will take up on
disk (thanks to omap/RocksDB data), so for a long time we pretty much
didn't try: these ratios were checked when starting a backfill, but we
didn't try to predict where utilization would end up and limit
ourselves based on that.
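
To make that concrete, here is a much-simplified model of the behavior (a sketch of the idea, not the actual OSD code; 0.90/0.95/0.97 are the default ratios):

    # The backfillfull check is a gate at backfill *start*, against the
    # target's current utilization; nothing projects the end state, so a
    # backfill can carry an OSD past full_ratio until the failsafe stops it.
    BACKFILLFULL = 0.90  # backfillfull_ratio (default)
    FULL = 0.95          # full_ratio (default)
    FAILSAFE = 0.97      # osd_failsafe_full_ratio (default)

    def may_start_backfill(target_util: float) -> bool:
        # Checked once, when the backfill is scheduled.
        return target_util < BACKFILLFULL

    def run_backfill(util: float, pg_frac: float) -> float:
        # Once started, data keeps flowing; only the failsafe is a hard stop.
        return min(util + pg_frac, FAILSAFE)

    util = 0.89  # just under backfillfull, so the gate passes
    if may_start_backfill(util):
        util = run_backfill(util, 0.10)
    print(util)  # 0.97 -- past full_ratio, stopped only at the failsafe

So an OSD sitting just under backfillfull_ratio can accept a backfill whose PG carries it well past full_ratio, which is what you're seeing.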
-Greg