[ceph-users] benchmarking ceph mds
Qing Zheng
2014-06-07 01:02:51 UTC
Hi,

I'm not sure if this question makes sense, but would client-side rate
control (limiting the number of requests sent per second) help avoid an
MDS crash?

I'm currently trying to measure the baseline metadata performance of CephFS
with multiple *active* MDS servers and directory splitting. The plan is to
increase the number of MDS servers from 8 to 16, or more. The problems I'm
having right now are:

1) My test program keeps creating empty files under multiple directories in
CephFS, and some MDS servers may crash in the middle of a test run. I think
it would be unfair to report the throughput I get before the crash, since at
the beginning only one MDS server is doing the work, and it takes time for
Ceph to balance load across all of its *active* MDS servers. The problem is
that CephFS might crash before load balance is achieved. So if we let the
clients create files slowly until every active MDS server has a share of the
file system tree, would that reduce the odds of a crash?

2) How do we know whether a set of active MDS servers is load balanced?
What I'm doing currently is checking CPU usage on these MDS servers. Are
there any better ways to do this?
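
To make the rate control in 1) concrete, here is a minimal sketch of a
client that paces its file creates at a fixed rate. It is a simple
fixed-interval pacer, not anything Ceph provides; the names RateLimiter and
create_empty_files are hypothetical, and the demo writes to a temporary
local directory as a stand-in for a CephFS mount point.

```python
import os
import tempfile
import time


class RateLimiter:
    """Fixed-interval pacer: allows at most `rate` operations per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self.next_time = clock()  # earliest time the next operation may start

    def acquire(self):
        """Block until the next operation is allowed to start."""
        now = self.clock()
        if now < self.next_time:
            self.sleep(self.next_time - now)
        # Schedule the following slot; if we are running late, do not try
        # to catch up with a burst.
        self.next_time = max(now, self.next_time) + self.interval


def create_empty_files(directory, count, limiter):
    """Create `count` empty files under `directory`, one per limiter slot."""
    for i in range(count):
        limiter.acquire()
        path = os.path.join(directory, "f%08d" % i)
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
        os.close(fd)


if __name__ == "__main__":
    # Stand-in target directory (assumption): replace with a directory on
    # the file system under test.
    with tempfile.TemporaryDirectory() as target:
        create_empty_files(target, 20, RateLimiter(rate=100))
        print(len(os.listdir(target)), "files created")
```

Starting with a low rate and raising it once every active MDS server holds
part of the tree would match the idea above; whether that actually avoids
the crash is exactly the open question.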

Cheers,

-- Qing
Yan, Zheng
2014-06-07 03:09:23 UTC
In my experience, the dynamic load balancer does not work well. One option
for benchmarking multiple MDSs is to disable it (make
MDBalancer::try_rebalance() return at the very beginning) and use
'ceph mds tell \* export_dir ...' to statically distribute directories
across the MDSs.
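
As a rough sketch of the planning step for such a static distribution, the
snippet below assigns benchmark directories to MDS ranks round-robin. It
only computes the plan and does not call Ceph; the function name
assign_directories and the /bench/... paths are hypothetical, and each
entry would still have to be exported with the export_dir command
mentioned above.

```python
def assign_directories(directories, num_mds):
    """Return {rank: [directories]} using a round-robin assignment.

    Directories are sorted first so the plan is reproducible between runs.
    """
    if num_mds < 1:
        raise ValueError("num_mds must be at least 1")
    plan = {rank: [] for rank in range(num_mds)}
    for i, directory in enumerate(sorted(directories)):
        plan[i % num_mds].append(directory)
    return plan


if __name__ == "__main__":
    # Hypothetical layout: 16 test directories spread over 8 active MDSs.
    dirs = ["/bench/d%02d" % i for i in range(16)]
    for rank, owned in assign_directories(dirs, 8).items():
        print(rank, owned)
```

With one file-creating client per directory, an even plan like this gives
each MDS the same number of directories from the start of the run.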

Regards
Yan, Zheng