Discussion: [ceph-users] Slow requests blocked. No rebalancing
Jaime Ibar
2018-09-20 11:25:42 UTC
Hi all,

we recently upgraded from Jewel 10.2.10 to Luminous 12.2.7 and are now
trying to migrate the OSDs to BlueStore following this document[0].
However, when I mark an OSD as out, I get warnings similar to these:

2018-09-20 09:32:46.079630 mon.dri-ceph01 [WRN] Health check failed: 2
slow requests are blocked > 32 sec. Implicated osds 16,28 (REQUEST_SLOW)
2018-09-20 09:32:52.841123 mon.dri-ceph01 [WRN] Health check update: 7
slow requests are blocked > 32 sec. Implicated osds 10,16,28,32,59
(REQUEST_SLOW)
2018-09-20 09:32:57.842230 mon.dri-ceph01 [WRN] Health check update: 15
slow requests are blocked > 32 sec. Implicated osds
10,16,28,31,32,59,78,80 (REQUEST_SLOW)

2018-09-20 09:32:58.851142 mon.dri-ceph01 [WRN] Health check update:
244944/40100780 objects misplaced (0.611%) (OBJECT_MISPLACED)
2018-09-20 09:32:58.851160 mon.dri-ceph01 [WRN] Health check update: 249
PGs pending on creation (PENDING_CREATING_PGS)

These warnings prevent Ceph from rebalancing, the VMs running on Ceph
start hanging, and we have to mark the OSD back in.

I tried reweighting the OSD to 0.90 to minimize the impact on the
cluster, but the warnings are the same.
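
For reference, these are roughly the commands involved (osd 16 here is
just one of the implicated OSDs, used as an example):

ceph osd out 16              # triggers the slow request warnings above
ceph osd reweight 16 0.90    # what I tried instead of a full out
ceph osd in 16               # what we end up doing to unblock the VMs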

I also tried increasing these settings:

mds cache memory limit = 2147483648
rocksdb cache size = 2147483648

but with no luck, same warnings.

We also have CephFS for storing files from different projects (no
directory fragmentation enabled).

The problem here is that if one OSD dies, all the services will be
blocked, as Ceph won't be able to start rebalancing.

The cluster is:

- 3 mons
- 3 MDS (running on the same hosts as the mons), 2 active and 1 standby
- 3 mgr (running on the same hosts as the mons)
- 6 servers, 12 OSDs each
- 1GB private network


Does anyone know how to fix this, or where the problem could be?

Thanks a lot in advance.

Jaime


[0] http://docs.ceph.com/docs/luminous/rados/operations/bluestore-migration/
--
Jaime Ibar
High Performance & Research Computing, IS Services
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | ***@tchpc.tcd.ie
Tel: +353-1-896-3725
Darius Kasparavičius
2018-09-20 11:43:28 UTC
Hello,


2018-09-20 09:32:58.851160 mon.dri-ceph01 [WRN] Health check update:
249 PGs pending on creation (PENDING_CREATING_PGS)

This error might indicate that you are hitting the PG-per-OSD limit.
Here is some information on it:
https://ceph.com/community/new-luminous-pg-overdose-protection/
You might need to increase mon_max_pg_per_osd for the OSDs to start
balancing out.
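
For example, something like this (300 is just an illustrative value;
as far as I know this is a mon-side option in Luminous, so set it on
the monitors):

ceph tell mon.* injectargs '--mon_max_pg_per_osd 300'

and, to make it persistent across restarts, in ceph.conf on the mon
hosts:

[global]
mon max pg per osd = 300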
Eugen Block
2018-09-20 11:51:04 UTC
Hi,

To reduce the impact on clients during migration, I would set the
OSD's primary-affinity to 0 beforehand. This should prevent the slow
requests; at least, this setting has helped us a lot with problematic
OSDs.
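
For example (osd 16 is just a placeholder id), before taking the OSD
out, and restoring the default afterwards:

ceph osd primary-affinity 16 0
# ... migrate the OSD ...
ceph osd primary-affinity 16 1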

Regards
Eugen
Paul Emmerich
2018-09-20 13:25:49 UTC
You can prevent creation of the PGs on the old filestore OSDs (which
seems to be the culprit here) during replacement by replacing the
disks the hard way:

* ceph osd destroy osd.X
* re-create with bluestore under the same id (ceph volume ... --osd-id X)

It will then just backfill onto the same disk without moving any PGs.

Keep in mind that this means you are running with one missing copy
during the recovery, so it's not the recommended way to do it.
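
A sketch of what that sequence might look like (osd id 12 and /dev/sdX
are placeholders; this loosely follows the replacement procedure in
the migration doc, just without marking the OSD out first):

systemctl stop ceph-osd@12                      # stop the old filestore OSD
ceph osd destroy 12 --yes-i-really-mean-it      # keeps the id and CRUSH entry
ceph-volume lvm zap /dev/sdX                    # wipe the old data
ceph-volume lvm create --bluestore --data /dev/sdX --osd-id 12

Because the destroyed id keeps its CRUSH position, the new BlueStore
OSD just backfills the same PGs from the other replicas.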

Paul
--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
Jaime Ibar
2018-09-20 13:55:32 UTC
Hi all,

after increasing mon_max_pg_per_osd, Ceph starts rebalancing as usual.

However, the slow request warnings are still there, even after setting
primary-affinity to 0 beforehand.

On the other hand, if I destroy the OSD, Ceph will start rebalancing
unless the noout flag is set, am I right?
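
That is, whether I would need to wrap the replacement with something
like:

ceph osd set noout
# ... destroy and re-create the OSD ...
ceph osd unset noout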

Thanks

Jaime
--
Jaime Ibar
High Performance & Research Computing, IS Services
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | ***@tchpc.tcd.ie
Tel: +353-1-896-3725