Discussion:
[ceph-users] What could cause mon_osd_full_ratio to be exceeded?
Vladimir Brik
2018-11-26 15:28:38 UTC
Hello

I am doing some Ceph testing on a near-full cluster, and I noticed that,
after I brought down a node, some OSDs' utilization reached
osd_failsafe_full_ratio (97%). Why didn't it stop at mon_osd_full_ratio
(90%) if mon_osd_backfillfull_ratio is 90%?
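
For reference, while the node is down I'm watching per-OSD utilization with something like the sketch below (a minimal sketch; it assumes the ceph CLI is available and uses its JSON output):

    import json
    import subprocess

    # Print each OSD's fill percentage from `ceph osd df`.
    df = json.loads(subprocess.check_output(
        ["ceph", "osd", "df", "--format", "json"]))
    for node in df["nodes"]:
        print(node["name"], node["utilization"])  # utilization is a percentage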


Thanks,

Vlad
Vladimir Brik
2018-11-26 15:55:17 UTC
Post by Vladimir Brik
Why didn't it stop at mon_osd_full_ratio (90%)
Should be 95% -- mon_osd_full_ratio defaults to 0.95; the 90% default belongs to mon_osd_backfillfull_ratio.
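
For what it's worth, this is how the effective ratios can be read off the cluster (a quick sketch; assumes Luminous or later, where these are OSDMap fields that the mon_osd_* options only seed at cluster creation):

    import json
    import subprocess

    # Read the effective full/backfillfull/nearfull ratios from the OSDMap.
    osdmap = json.loads(subprocess.check_output(
        ["ceph", "osd", "dump", "--format", "json"]))
    for key in ("full_ratio", "backfillfull_ratio", "nearfull_ratio"):
        print(key, osdmap[key])  # defaults: 0.95, 0.90, 0.85

They can be adjusted at runtime with ceph osd set-full-ratio, set-backfillfull-ratio, and set-nearfull-ratio.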

Vlad
Gregory Farnum
2018-11-26 16:27:21 UTC
On Mon, Nov 26, 2018 at 10:28 AM Vladimir Brik wrote:
Post by Vladimir Brik
Hello
I am doing some Ceph testing on a near-full cluster, and I noticed that,
after I brought down a node, some OSDs' utilization reached
osd_failsafe_full_ratio (97%). Why didn't it stop at mon_osd_full_ratio
(90%) if mon_osd_backfillfull_ratio is 90%?
While I believe the very newest Ceph source will do this, it can be
surprisingly difficult to predict the exact size a PG will take up on
disk (thanks to omap/RocksDB data), so for a long time we pretty much
didn't try: these ratios were checked when starting a backfill, but we
didn't try to predict where utilization would end up and limit
ourselves based on that.
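
To make that concrete, here is a much-simplified model of the behavior (a sketch of the idea, not the actual OSD code; 0.90/0.95/0.97 are the default ratios):

    # The backfillfull check is a gate at backfill *start*, against the
    # target's current utilization; nothing projects the end state, so a
    # backfill can carry an OSD past full_ratio until the failsafe stops it.
    BACKFILLFULL = 0.90  # backfillfull_ratio (default)
    FULL = 0.95          # full_ratio (default)
    FAILSAFE = 0.97      # osd_failsafe_full_ratio (default)

    def may_start_backfill(target_util: float) -> bool:
        # Checked once, when the backfill is scheduled.
        return target_util < BACKFILLFULL

    def run_backfill(util: float, pg_frac: float) -> float:
        # Once started, data keeps flowing; only the failsafe is a hard stop.
        return min(util + pg_frac, FAILSAFE)

    util = 0.89  # just under backfillfull, so the gate passes
    if may_start_backfill(util):
        util = run_backfill(util, 0.10)
    print(util)  # 0.97 -- past full_ratio, stopped only at the failsafe

So an OSD sitting just under backfillfull_ratio can accept a backfill whose PG carries it well past full_ratio, which is what you're seeing.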
-Greg