Lihang
2016-07-03 07:06:21 UTC
***@BoreNode2:~# ceph -v
ceph version 10.2.0
From: lihang 12398 (RD)
Sent: 3 July 2016 14:47
To: ceph-***@lists.ceph.com
Cc: Ceph Development; '***@gmail.com'; zhengbin 08747 (RD); xusangdi 11976 (RD)
Subject: how to fix the mds damaged issue
Hi, the MDS of my Ceph cluster is damaged and the cluster is degraded after our machine room suddenly lost power. The cluster is now "HEALTH_ERR" and cannot recover to a healthy state by itself, even after I reboot the storage node systems or restart the Ceph cluster. I then used the following commands to remove the damaged MDS, but the removal failed and the issue still exists. The other two MDS daemons are in standby state. Can anyone tell me how to fix this issue and find out what happened in my cluster?
The process I used to remove the damaged MDS on the storage node is as follows.
1> Execute "stop ceph-mds-all" on the damaged MDS node
2> ceph mds rmfailed 0 --yes-i-really-mean-it
3> ***@BoreNode2:~# ceph mds rm 0
mds gid 0 dne
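As far as I understand, "ceph mds rm" expects an MDS daemon GID rather than a rank number, which is why passing rank 0 returns "dne" here, and I guess "rmfailed" does not help either because the rank is in the damaged set rather than the failed set. If the on-disk metadata is still intact, I believe the damaged flag could be cleared with the command below so that one of the standbys takes over rank 0, but I am not sure whether it is safe to run before the journal has been checked:
***@BoreNode2:~# ceph mds repaired 0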
The detailed status of my cluster is as follows:
***@BoreNode2:~# ceph -s
cluster 98edd275-5df7-414f-a202-c3d4570f251c
health HEALTH_ERR
mds rank 0 is damaged
mds cluster is degraded
monmap e1: 3 mons at {BoreNode2=172.16.65.141:6789/0,BoreNode3=172.16.65.142:6789/0,BoreNode4=172.16.65.143:6789/0}
election epoch 1010, quorum 0,1,2 BoreNode2,BoreNode3,BoreNode4
fsmap e168: 0/1/1 up, 3 up:standby, 1 damaged
osdmap e338: 8 osds: 8 up, 8 in
flags sortbitwise
pgmap v17073: 1560 pgs, 5 pools, 218 kB data, 32 objects
423 MB used, 3018 GB / 3018 GB avail
1560 active+clean
***@BoreNode2:~# ceph mds dump
dumped fsmap epoch 168
fs_name TudouFS
epoch 156
flags 0
created 2016-04-02 02:48:11.150539
modified 2016-04-03 03:04:57.347064
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 0
last_failure_osd_epoch 83
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
max_mds 1
in 0
up {}
failed
damaged 0
stopped
data_pools 4
metadata_pool 3
inline_data disabled
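From what I can find about CephFS disaster recovery on Jewel, I guess the next steps might be to back up and inspect the MDS journal, salvage whatever entries are readable, reset the journal only if it turns out to be unrecoverable, and only after that mark rank 0 repaired as above. The file name backup.bin below is just a placeholder I chose:
***@BoreNode2:~# cephfs-journal-tool journal export backup.bin
***@BoreNode2:~# cephfs-journal-tool journal inspect
***@BoreNode2:~# cephfs-journal-tool event recover_dentries summary
***@BoreNode2:~# cephfs-journal-tool journal reset
Is this the right direction, or is something else needed in my case?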
-------------------------------------------------------------------------------------------------------------------------------------
This e-mail and its attachments contain confidential information from H3C, which is
intended only for the person or entity whose address is listed above. Any use of the
information contained herein in any way (including, but not limited to, total or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender
by phone or email immediately and delete it!