[ceph-users] Automated Deep Scrub always inconsistent
Ashley Merrick
2018-11-08 08:23:18 UTC
Over the past few days I have noticed that every single automated deep scrub
comes back as inconsistent. Once I run a manual deep-scrub, it finishes fine
and the PG is marked as clean.

I am running the latest mimic, but I have noticed someone else under luminous
is facing the same issue:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-November/031166.html

I don't believe this is any form of hardware failure, as every time it is
different OSDs, and every time a manually started deep-scrub finishes without
issue.

Is there something that was released in the most recent mimic and luminous
patches that could be linked to this? Is it somehow linked with the main
issue with the 12.2.9 release?
Ashley Merrick
2018-11-12 15:39:17 UTC
Is anyone else seeing this?

I have just set up another cluster on completely different hardware to check,
still with everything running EC.

I am again getting inconsistent PGs flagged after an automatic deep scrub,
which can be fixed by just running another deep-scrub.
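
The manual workaround described above (find the PGs flagged inconsistent, then
deep-scrub each one again) could be scripted roughly as follows. This is only a
sketch: the helper names are hypothetical, and the assumed line format of
`ceph health detail` ("pg <id> is <state>, acting [...]") may differ between
releases, so check it against your own cluster output before relying on it.

```python
import re
import subprocess

# Assumed format of a damaged-PG line in `ceph health detail`, for example:
#   pg 2.1f is active+clean+inconsistent, acting [3,7,12]
# This pattern is an assumption and may need adjusting for your release.
PG_LINE = re.compile(
    r"^[ \t]*pg[ \t]+(\d+\.[0-9a-f]+)[ \t]+is[ \t]+(\S*inconsistent\S*)",
    re.MULTILINE,
)


def inconsistent_pgs(health_detail):
    """Return the ids of PGs whose state contains 'inconsistent'."""
    return [m.group(1) for m in PG_LINE.finditer(health_detail)]


def deep_scrub_commands(pgids):
    """Build one `ceph pg deep-scrub <pgid>` command per PG id."""
    return [["ceph", "pg", "deep-scrub", pgid] for pgid in pgids]


def rescrub_inconsistent(dry_run=True):
    """Hypothetical helper: list (or run, if dry_run is False) the deep-scrubs."""
    detail = subprocess.run(
        ["ceph", "health", "detail"],
        capture_output=True, text=True, check=True,
    ).stdout
    commands = deep_scrub_commands(inconsistent_pgs(detail))
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))  # only show what would be run
        else:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `rescrub_inconsistent()` with the default `dry_run=True` only prints the
commands, which is the safer way to try it first. Note that this just automates
the re-scrub and does nothing about the underlying cause.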
Jonas Jelten
2018-11-12 15:56:44 UTC
Maybe you are hitting the kernel bug worked around by https://github.com/ceph/ceph/pull/23273

-- Jonas
Ashley Merrick
2018-11-12 16:04:50 UTC
Thanks, that does look like it ticks all the boxes.

As it's been merged, I'll hold off until the next release rather than
rebuilding from source. From what I can see, it won't cause an issue beyond
having to re-run the deep-scrub manually, which is basically what the fix does
(but isolated to just the failed read).

Thanks!
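
For readers wondering what "isolated to just the failed read" might look like,
here is a minimal sketch of a retry-on-failed-read loop. It is not the code
from the pull request: the function name, the `max_retries` parameter, and the
use of a checksum mismatch to detect a bad read are all assumptions made for
illustration, and `zlib.crc32` is only a stand-in checksum.

```python
import zlib


def read_with_retry(read_block, expected_crc, max_retries=3):
    """Re-issue a read whose checksum does not match (illustrative only).

    read_block   -- hypothetical callable returning the block as bytes
    expected_crc -- checksum stored when the block was written
    max_retries  -- extra attempts allowed after the first read

    Returns (data, attempts_used); raises IOError if no attempt verifies.
    """
    for attempt in range(1, max_retries + 2):
        data = read_block()
        if zlib.crc32(data) == expected_crc:
            return data, attempt  # read verified, no scrub error reported
    raise IOError("checksum mismatch after %d attempts" % (max_retries + 1))
```

A transient bad read then costs one extra read of that block instead of
flagging the whole PG as inconsistent and needing a second deep-scrub.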