Rodrigo Embeita
2018-11-21 18:04:42 UTC
Hi guys, maybe someone can help me.
I'm new to CephFS and I was testing an installation of Ceph Mimic with
ceph-deploy on 2 Ubuntu 16.04 nodes.
These two nodes have 6 OSD disks each.
I've installed CephFS and 2 MDS services.
The problem is that I copied a lot of data (15 million small files) and
then I lost one of these 2 nodes.
The lost node was running the active MDS service, which was supposed to
fail over to the other Ceph host, but the MDS service got stuck in the
rejoin state.
As a result, the cluster seems to be down and I'm not able to connect to
CephFS.
***@pf-us1-dfs1:/var/log/ceph# ceph status
  cluster:
    id:     459cdedc-488e-49ed-8b16-36cf843cef76
    health: HEALTH_WARN
            1 filesystem is degraded
            1 MDSs report slow metadata IOs
            3 osds down
            1 host (6 osds) down
            5313/50445780 objects misplaced (0.011%)
            Reduced data availability: 7 pgs inactive, 7 pgs down
            Degraded data redundancy: 25192943/50445780 objects degraded (49.941%), 265 pgs degraded, 283 pgs undersized
            1/3 mons down, quorum pf-us1-dfs3,pf-us1-dfs1

  services:
    mon: 3 daemons, quorum pf-us1-dfs3,pf-us1-dfs1, out of quorum: pf-us1-dfs2
    mgr: pf-us1-dfs3(active)
    mds: cephfs-1/1/1 up {0=pf-us1-dfs1=up:rejoin}, 1 up:standby
    osd: 13 osds: 6 up, 9 in; 6 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   7 pools, 296 pgs
    objects: 25.22 M objects, 644 GiB
    usage:   2.0 TiB used, 42 TiB / 44 TiB avail
    pgs:     2.365% pgs not active
             25192943/50445780 objects degraded (49.941%)
             5313/50445780 objects misplaced (0.011%)
             265 active+undersized+degraded
             18  active+undersized
             7   down
             6   active+clean+remapped
The MDS service has been writing the following to its log for over 14 hours
and never stops:
2018-11-21 10:06:12.585 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18421 from mon.2
2018-11-21 10:06:16.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18422 from mon.2
2018-11-21 10:06:20.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18423 from mon.2
2018-11-21 10:06:24.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18424 from mon.2
2018-11-21 10:06:32.590 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18425 from mon.2
2018-11-21 10:06:36.594 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18426 from mon.2
2018-11-21 10:06:40.606 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18427 from mon.2
2018-11-21 10:06:44.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18428 from mon.2
2018-11-21 10:06:52.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18429 from mon.2
2018-11-21 10:06:56.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18430 from mon.2
2018-11-21 10:07:00.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18431 from mon.2
2018-11-21 10:07:04.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18432 from mon.2
2018-11-21 10:07:12.590 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18433 from mon.2
2018-11-21 10:07:16.602 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18434 from mon.2
2018-11-21 10:07:20.602 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18435 from mon.2
2018-11-21 10:07:24.586 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18436 from mon.2
2018-11-21 10:07:32.590 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18437 from mon.2
2018-11-21 10:07:36.614 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18438 from mon.2
2018-11-21 10:07:40.626 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18439 from mon.2
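To put a number on how fast the MDS map is changing, here is a minimal Python sketch that reads "Updating MDS map" entries like the ones above and reports the average time between map versions. It assumes each entry sits on a single line, as it does in the log file itself (the mail client wrapped them here), and the function names `parse_updates` and `seconds_per_epoch` are made up for illustration; they are not part of any Ceph tooling.

```python
import re
from datetime import datetime

# Matches entries such as:
# 2018-11-21 10:06:12.585 7f1b80873700 1 mds.pf-us1-dfs1 Updating MDS map to version 18421 from mon.2
# The field layout is taken from the excerpt above; other Ceph releases may differ.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+\S+\s+\d+\s+\S+\s+"
    r"Updating MDS map to version (\d+) from \S+$"
)


def parse_updates(lines):
    """Return a list of (timestamp, map_version) for every matching line."""
    updates = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            updates.append((stamp, int(match.group(2))))
    return updates


def seconds_per_epoch(updates):
    """Average seconds between map versions, or None if it cannot be computed."""
    if len(updates) < 2:
        return None
    (first_time, first_version) = updates[0]
    (last_time, last_version) = updates[-1]
    if last_version == first_version:
        return None
    elapsed = (last_time - first_time).total_seconds()
    return elapsed / (last_version - first_version)
```

Fed the entries above (version 18421 at 10:06:12.585 through version 18439 at 10:07:40.626), this works out to roughly 4.9 seconds per new map version.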
Could someone please help me?