Discussion:
Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK
Marco Baldini - H.S. Amiata
2018-02-28 10:43:33 UTC
Hello

I have a small Ceph cluster with 3 nodes, each with 3x1TB HDD and
1x240GB SSD. I created this cluster after the Luminous release, so all
OSDs are Bluestore. In my CRUSH map I have two rules, one targeting the
SSDs and one targeting the HDDs. I have 4 pools, one using the SSD rule
and the others using the HDD rule; three pools are size=3 min_size=2,
one is size=2 min_size=1 (this one holds content that is OK to lose).

For the last 3 months I've been having a strange random problem. I
scheduled my OSD scrubs during the night (osd scrub begin hour = 20,
osd scrub end hour = 7), when the office is closed, so there is low
impact on the users. Some mornings, when I check the cluster health, I
find:

HEALTH_ERR X scrub errors; Possible data damage: Y pgs inconsistent
OSD_SCRUB_ERRORS X scrub errors
PG_DAMAGED Possible data damage: Y pg inconsistent

X and Y are sometimes 1, sometimes 2.

I issue ceph health detail, check the damaged PGs, and run ceph pg
repair on each of them; I get

instructing pg PG on osd.N to repair

The PG is different each time, the OSD instructed to repair it is
different, even the node hosting the OSD is different; I made a list of
all the PGs and OSDs. This is from this morning:
ceph health detail
HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
OSD_SCRUB_ERRORS 2 scrub errors
PG_DAMAGED Possible data damage: 2 pgs inconsistent
pg 13.65 is active+clean+inconsistent, acting [4,2,6]
pg 14.31 is active+clean+inconsistent, acting [8,3,1]
ceph pg repair 13.65
instructing pg 13.65 on osd.4 to repair

(node-2)> tail /var/log/ceph/ceph-osd.4.log
2018-02-28 08:38:47.593447 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair starts
2018-02-28 08:39:37.573342 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair ok, 0 fixed
ceph pg repair 14.31
instructing pg 14.31 on osd.8 to repair

(node-3)> tail /var/log/ceph/ceph-osd.8.log
2018-02-28 08:52:37.297490 7f4dd0816700 0 log_channel(cluster) log [DBG] : 14.31 repair starts
2018-02-28 08:53:00.704020 7f4dd0816700 0 log_channel(cluster) log [DBG] : 14.31 repair ok, 0 fixed
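
As a side note, since the inconsistent PGs show up in ceph health detail
exactly as above, something like this loop (just a sketch that parses
that output) repairs them all in one go:

ceph health detail | grep 'is active+clean+inconsistent' | awk '{print $2}' | while read pg; do
    ceph pg repair "$pg"
done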


I made a list of when I got OSD_SCRUB_ERRORS, which PG was affected, and
which OSD had to repair it. Dates are dd/mm/yyyy.

21/12/2017 -- pg 14.29 is active+clean+inconsistent, acting [6,2,4]

18/01/2018 -- pg 14.5a is active+clean+inconsistent, acting [6,4,1]

22/01/2018 -- pg 9.3a is active+clean+inconsistent, acting [2,7]

29/01/2018 -- pg 13.3e is active+clean+inconsistent, acting [4,6,1]
instructing pg 13.3e on osd.4 to repair

07/02/2018 -- pg 13.7e is active+clean+inconsistent, acting [8,2,5]
instructing pg 13.7e on osd.8 to repair

09/02/2018 -- pg 13.30 is active+clean+inconsistent, acting [7,3,2]
instructing pg 13.30 on osd.7 to repair

15/02/2018 -- pg 9.35 is active+clean+inconsistent, acting [1,8]
instructing pg 9.35 on osd.1 to repair

pg 13.3e is active+clean+inconsistent, acting [4,6,1]
instructing pg 13.3e on osd.4 to repair

17/02/2018 -- pg 9.2d is active+clean+inconsistent, acting [7,5]
instructing pg 9.2d on osd.7 to repair

22/02/2018 -- pg 9.24 is active+clean+inconsistent, acting [5,8]
instructing pg 9.24 on osd.5 to repair

28/02/2018 -- pg 13.65 is active+clean+inconsistent, acting [4,2,6]
instructing pg 13.65 on osd.4 to repair

pg 14.31 is active+clean+inconsistent, acting [8,3,1]
instructing pg 14.31 on osd.8 to repair



In case it's useful, my ceph.conf is here:

[global]
auth client required = none
auth cluster required = none
auth service required = none
fsid = 24d5d6bc-0943-4345-b44e-46c19099004b
cluster network = 10.10.10.0/24
public network = 10.10.10.0/24
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
bluestore_block_db_size = 64424509440

debug asok = 0/0
debug auth = 0/0
debug buffer = 0/0
debug client = 0/0
debug context = 0/0
debug crush = 0/0
debug filer = 0/0
debug filestore = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug journal = 0/0
debug journaler = 0/0
debug lockdep = 0/0
debug mds = 0/0
debug mds balancer = 0/0
debug mds locker = 0/0
debug mds log = 0/0
debug mds log expire = 0/0
debug mds migrator = 0/0
debug mon = 0/0
debug monc = 0/0
debug ms = 0/0
debug objclass = 0/0
debug objectcacher = 0/0
debug objecter = 0/0
debug optracker = 0/0
debug osd = 0/0
debug paxos = 0/0
debug perfcounter = 0/0
debug rados = 0/0
debug rbd = 0/0
debug rgw = 0/0
debug throttle = 0/0
debug timer = 0/0
debug tp = 0/0


[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd max backfills = 1
osd recovery max active = 1

osd scrub begin hour = 20
osd scrub end hour = 7
osd scrub during recovery = false
osd scrub load threshold = 0.3

[client]
rbd cache = true
rbd cache size = 268435456 # 256MB
rbd cache max dirty = 201326592 # 192MB
rbd cache max dirty age = 2
rbd cache target dirty = 33554432 # 32MB
rbd cache writethrough until flush = true


#[mgr]
#debug_mgr = 20


[mon.pve-hs-main]
host = pve-hs-main
mon addr = 10.10.10.251:6789

[mon.pve-hs-2]
host = pve-hs-2
mon addr = 10.10.10.252:6789

[mon.pve-hs-3]
host = pve-hs-3
mon addr = 10.10.10.253:6789


My ceph versions:

{
    "mon": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
    },
    "osd": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 12
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 18
    }
}



My ceph osd tree:

ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       8.93686 root default
-6       2.94696     host pve-hs-2
 3   hdd 0.90959         osd.3            up  1.00000 1.00000
 4   hdd 0.90959         osd.4            up  1.00000 1.00000
 5   hdd 0.90959         osd.5            up  1.00000 1.00000
10   ssd 0.21819         osd.10           up  1.00000 1.00000
-3       2.86716     host pve-hs-3
 6   hdd 0.85599         osd.6            up  1.00000 1.00000
 7   hdd 0.85599         osd.7            up  1.00000 1.00000
 8   hdd 0.93700         osd.8            up  1.00000 1.00000
11   ssd 0.21819         osd.11           up  1.00000 1.00000
-7       3.12274     host pve-hs-main
 0   hdd 0.96819         osd.0            up  1.00000 1.00000
 1   hdd 0.96819         osd.1            up  1.00000 1.00000
 2   hdd 0.96819         osd.2            up  1.00000 1.00000
 9   ssd 0.21819         osd.9            up  1.00000 1.00000

My pools:

pool 9 'cephbackup' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 5665 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
pool 13 'cephwin' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16454 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~5]
pool 14 'cephnix' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16482 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~227]
pool 17 'cephssd' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 8601 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]


I can't understand where the problem comes from. I don't think it's
hardware: if I had a failing disk, I would expect problems always on
the same OSD. Any ideas?

Thanks
--
*Marco Baldini*
*H.S. Amiata Srl*
Office: 0577-779396
Mobile: 335-8765169
WEB: www.hsamiata.it <https://www.hsamiata.it>
EMAIL: ***@hsamiata.it <mailto:***@hsamiata.it>
Paul Emmerich
2018-02-28 10:59:30 UTC
Hi,

might be http://tracker.ceph.com/issues/22464

Can you check the OSD log file to see if the reported checksum is 0x6706be76?
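
For example (just a sketch, assuming the default log location under
/var/log/ceph on the node whose OSD reported the error), something like

grep -E 'candidate had a read error|0x6706be76' /var/log/ceph/ceph-osd.*.log

should show the relevant lines if they were logged.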


Paul
--
Mit freundlichen Grüßen / Best Regards
Paul Emmerich

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

Managing Director: Martin Verges
Commercial Register: Amtsgericht München
VAT ID: DE310638492
Marco Baldini - H.S. Amiata
2018-02-28 13:48:00 UTC
Hi

I read the bug tracker issue and it seems a lot like my problem, even
though I can't check the reported checksum because it doesn't appear in
my logs, perhaps because of debug osd = 0/0 in ceph.conf.

I just raised the OSD log level

ceph tell osd.* injectargs --debug-osd 5/5

I'll check the OSD logs over the next few days...
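
For reference, this is roughly what I plan to grep for (just a sketch,
assuming the default log paths):

grep '\[ERR\]' /var/log/ceph/ceph-osd.*.log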

Thanks
--
*Marco Baldini*
*H.S. Amiata Srl*
Office: 0577-779396
Mobile: 335-8765169
WEB: www.hsamiata.it <https://www.hsamiata.it>
EMAIL: ***@hsamiata.it <mailto:***@hsamiata.it>
Marco Baldini - H.S. Amiata
2018-03-05 11:21:52 UTC
Hi

After some days with debug_osd 5/5 I found [ERR] lines on different
days, for different PGs, on different OSDs, on different hosts. This is
what I get in the OSD logs:

*OSD.5 (host 3)*
2018-03-01 20:30:02.702269 7fdf4d515700  2 osd.5 pg_epoch: 16486 pg[9.1c( v 16486'51798 (16431'50251,16486'51798] local-lis/les=16474/16475 n=3629 ec=1477/1477 lis/c 16474/16474 les/c/f 16475/16477/0 16474/16474/16474) [5,6] r=0 lpr=16474 crt=16486'51798 lcod 16486'51797 mlcod 16486'51797 active+clean+scrubbing+deep] 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
2018-03-01 20:30:02.702278 7fdf4d515700 -1 log_channel(cluster) log [ERR] : 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error

*OSD.4 (host 3)*
2018-02-28 00:03:33.458558 7f112cf76700 -1 log_channel(cluster) log [ERR] : 13.65 shard 2: soid 13:a719ecdf:::rbd_data.5f65056b8b4567.000000000000f8eb:head candidate had a read error

*OSD.8 (host 2)*
2018-02-27 23:55:15.100084 7f4dd0816700 -1 log_channel(cluster) log [ERR] : 14.31 shard 1: soid 14:8cc6cd37:::rbd_data.30b15b6b8b4567.00000000000081a1:head candidate had a read error

I don't know what this error means, and as always a ceph pg repair
fixes it. I don't think this is normal.

Ideas?

Thanks
--
*Marco Baldini*
*H.S. Amiata Srl*
Office: 0577-779396
Mobile: 335-8765169
WEB: www.hsamiata.it <https://www.hsamiata.it>
EMAIL: ***@hsamiata.it <mailto:***@hsamiata.it>
Paul Emmerich
2018-03-05 12:36:51 UTC
Hi,

Yeah, the cluster that I'm seeing this on also has only one host that
reports that specific checksum. The two other hosts only report the
same error that you are seeing.

Could you post to the tracker issue that you are also seeing this?

Paul
--
Paul Emmerich

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
Marco Baldini - H.S. Amiata
2018-03-05 15:53:48 UTC
Hi

I just posted to the ceph tracker with my logs and the details of my issue

Let's hope this will be fixed

Thanks
--
*Marco Baldini*
*H.S. Amiata Srl*
Office: 0577-779396
Mobile: 335-8765169
WEB: www.hsamiata.it <https://www.hsamiata.it>
EMAIL: ***@hsamiata.it <mailto:***@hsamiata.it>
Vladimir Prokofev
2018-03-05 12:40:27 UTC
Post by Marco Baldini - H.S. Amiata
candidate had a read error
That speaks for itself - while scrubbing, the OSD couldn't read the data.
I had a similar issue, and it was just an OSD dying - errors and relocated
sectors in SMART; I just replaced the disk. But in your case it seems that
the errors are on different OSDs? Are your OSDs all healthy?
You can use this command to see some details:
rados list-inconsistent-obj <pg.id> --format=json-pretty
pg.id is the PG that is reporting as inconsistent. My guess is that you'll
see read errors in this output, along with the number of the OSD that hit
the error. After that you have to check that OSD's health - SMART details, etc.
It's not always the disk itself that causes the problems - for example, we
had read errors because of a faulty backplane interface in a server;
changing the chassis resolved that issue.
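
For example (just a sketch: PG 13.65 is taken from the health output
earlier in the thread, and /dev/sdX is a placeholder for whatever device
backs the affected OSD):

rados list-inconsistent-obj 13.65 --format=json-pretty   # shows which shard/OSD hit the read error
smartctl -a /dev/sdX        # on that OSD's node: check Reallocated_Sector_Ct, Current_Pending_Sector, UDMA_CRC_Error_Count
dmesg | grep -i 'i/o error' # kernel-level read errors on that node, if any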
Post by Marco Baldini - H.S. Amiata
Hi
After some days with debug_osd 5/5 I found [ERR] in different days,
different PGs, different OSDs, different hosts. This is what I get in the
*OSD.5 (host 3)*
2018-03-01 20:30:02.702269 7fdf4d515700 2 osd.5 pg_epoch: 16486 pg[9.1c( v 16486'51798 (16431'50251,16486'51798] local-lis/les=16474/16475 n=3629 ec=1477/1477 lis/c 16474/16474 les/c/f 16475/16477/0 16474/16474/16474) [5,6] r=0 lpr=16474 crt=16486'51798 lcod 16486'51797 mlcod 16486'51797 active+clean+scrubbing+deep] 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
2018-03-01 20:30:02.702278 7fdf4d515700 -1 log_channel(cluster) log [ERR] : 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
*
OSD.4 (host 3)*
2018-02-28 00:03:33.458558 7f112cf76700 -1 log_channel(cluster) log [ERR] : 13.65 shard 2: soid 13:a719ecdf:::rbd_data.5f65056b8b4567.000000000000f8eb:head candidate had a read error
*OSD.8 (host 2)*
2018-02-27 23:55:15.100084 7f4dd0816700 -1 log_channel(cluster) log [ERR] : 14.31 shard 1: soid 14:8cc6cd37:::rbd_data.30b15b6b8b4567.00000000000081a1:head candidate had a read error
I don't know what this error is meaning, and as always a ceph pg repair
fixes it. I don't think this is normal.
Ideas?
Thanks
Marco Baldini - H.S. Amiata
2018-03-05 15:26:28 UTC
Permalink
Hi, and thanks for the reply

The OSDs are all healthy: in fact, after a ceph pg repair <PG> the ceph
health is back to OK, and in the OSD log I see "<PG> repair ok, 0 fixed".

The SMART data of the 3 OSDs seems fine.

*OSD.5*
# ceph-disk list | grep osd.5
 /dev/sdd1 ceph data, active, cluster ceph, osd.5, block /dev/sdd2

# smartctl -a /dev/sdd
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A1MA1V
LU WWN Device Id: 5 000c50 090c7028b
Firmware Version: CC43
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:17:22 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 109) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 082 063 006 Pre-fail Always - 193297722
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 60
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 091 060 045 Pre-fail Always - 1451132477
9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13283
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 61
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 086 086 000 Old_age Always - 14
190 Airflow_Temperature_Cel 0x0022 071 055 040 Old_age Always - 29 (Min/Max 23/32)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 607
194 Temperature_Celsius 0x0022 029 014 000 Old_age Always - 29 (0 14 0 0 0)
195 Hardware_ECC_Recovered 0x001a 004 001 000 Old_age Always - 193297722
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 13211h+23m+08.363s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 53042120064
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 170788993187


*OSD.4*

# ceph-disk list | grep osd.4
/dev/sdc1 ceph data, active, cluster ceph, osd.4, block /dev/sdc2

# smartctl -a /dev/sdc
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A1M1BW
LU WWN Device Id: 5 000c50 090c78d27
Firmware Version: CC43
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:20:46 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 109) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 082 063 006 Pre-fail Always - 194906537
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 64
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 091 060 045 Pre-fail Always - 1485899434
9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13390
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 65
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 095 095 000 Old_age Always - 5
190 Airflow_Temperature_Cel 0x0022 074 051 040 Old_age Always - 26 (Min/Max 19/29)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 616
194 Temperature_Celsius 0x0022 026 014 000 Old_age Always - 26 (0 14 0 0 0)
195 Hardware_ECC_Recovered 0x001a 004 001 000 Old_age Always - 194906537
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 13315h+20m+30.974s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 52137467719
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 177227508503



*OSD.8*

# ceph-disk list | grep osd.8
/dev/sda1 ceph data, active, cluster ceph, osd.8, block /dev/sda2

# smartctl -a /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A2BEF2
LU WWN Device Id: 5 000c50 0910f5427
Firmware Version: CC43
User Capacity: 1,000,203,804,160 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:22:47 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 110) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 083 063 006 Pre-fail Always - 224621855
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 275
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 081 060 045 Pre-fail Always - 149383284
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 6210
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 265
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 098 098 000 Old_age Always - 2
190 Airflow_Temperature_Cel 0x0022 069 058 040 Old_age Always - 31 (Min/Max 21/35)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 516
194 Temperature_Celsius 0x0022 031 017 000 Old_age Always - 31 (0 17 0 0 0)
195 Hardware_ECC_Recovered 0x001a 005 001 000 Old_age Always - 224621855
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 6154h+03m+35.126s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 24333847321
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 50261005553


However it's not only these 3 OSD to have PG with errors, these are
onlyl the most recent, in the last 3 months I had often OSD_SCRUB_ERRORS
in various OSDs, always solved by ceph pg repair <PG>, I don't think
it's an hardware issue.
--
*Marco Baldini*
*H.S. Amiata Srl*
Ufficio: 0577-779396
Cellulare: 335-8765169
WEB: www.hsamiata.it <https://www.hsamiata.it>
EMAIL: ***@hsamiata.it <mailto:***@hsamiata.it>
Vladimir Prokofev
2018-03-05 20:45:18 UTC
Permalink
Post by Marco Baldini - H.S. Amiata
always solved by ceph pg repair <PG>
That doesn't necessarily mean that there's no hardware issue. In my case
repair also worked fine and returned the cluster to an OK state every time,
but in time the faulty disk failed another scrub operation, and this repeated
multiple times before we replaced that disk.
One last thing to look into is dmesg at your OSD nodes. If there's a
hardware read error it will be logged in dmesg.
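For example, something along these lines on each OSD node (the grep patterns
and time window are only examples, adjust them to your devices):

# kernel ring buffer - look for medium/read errors on the OSD data disks
dmesg -T | egrep -i 'medium error|i/o error|blk_update_request'

# persistent journal, in case the ring buffer has already rotated
journalctl -k --since "7 days ago" | egrep -i 'medium error|i/o error|blk_update_request'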
Marco Baldini - H.S. Amiata
2018-03-06 07:26:03 UTC
Permalink
Hi

I monitor dmesg on each of the 3 nodes, and no hardware issues are reported.
The problem also happens on different OSDs in different nodes, so to me it is
clear it's not a hardware problem.

Thanks for the reply
Post by Vladimir Prokofev
Post by Marco Baldini - H.S. Amiata
always solved by ceph pg repair <PG>
That doesn't necessarily means that there's no hardware issue. In my
case repair also worked fine and returned cluster to OK state every
time, but in time faulty disk fail another scrub operation, and this
repeated multiple times before we replaced that disk.
One last thing to look into is dmesg at your OSD nodes. If there's a
hardware read error it will be logged in dmesg.
2018-03-05 18:26 GMT+03:00 Marco Baldini - H.S. Amiata
Hi and thanks for reply
The OSDs are all healthy, in fact after a ceph pg repair <PG> the
ceph health is back to OK and in the OSD log I see  <PG> repair
ok, 0 fixed
The SMART data of the 3 OSDs seems fine
*OSD.5*
# ceph-disk list | grep osd.5
 /dev/sdd1 ceph data, active, cluster ceph, osd.5, block /dev/sdd2
# smartctl -a /dev/sdd
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke,www.smartmontools.org <http://www.smartmontools.org>
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A1MA1V
LU WWN Device Id: 5 000c50 090c7028b
Firmware Version: CC43
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:17:22 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 109) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 082 063 006 Pre-fail Always - 193297722
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 60
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 091 060 045 Pre-fail Always - 1451132477
9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13283
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 61
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 086 086 000 Old_age Always - 14
190 Airflow_Temperature_Cel 0x0022 071 055 040 Old_age Always - 29 (Min/Max 23/32)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 607
194 Temperature_Celsius 0x0022 029 014 000 Old_age Always - 29 (0 14 0 0 0)
195 Hardware_ECC_Recovered 0x001a 004 001 000 Old_age Always - 193297722
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 13211h+23m+08.363s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 53042120064
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 170788993187
*OSD.4*
# ceph-disk list | grep osd.4
/dev/sdc1 ceph data, active, cluster ceph, osd.4, block /dev/sdc2
# smartctl -a /dev/sdc
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A1M1BW
LU WWN Device Id: 5 000c50 090c78d27
Firmware Version: CC43
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:20:46 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 109) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 082 063 006 Pre-fail Always - 194906537
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 64
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 091 060 045 Pre-fail Always - 1485899434
9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13390
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 65
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 095 095 000 Old_age Always - 5
190 Airflow_Temperature_Cel 0x0022 074 051 040 Old_age Always - 26 (Min/Max 19/29)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 616
194 Temperature_Celsius 0x0022 026 014 000 Old_age Always - 26 (0 14 0 0 0)
195 Hardware_ECC_Recovered 0x001a 004 001 000 Old_age Always - 194906537
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 13315h+20m+30.974s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 52137467719
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 177227508503
*OSD.8*
# ceph-disk list | grep osd.8
/dev/sda1 ceph data, active, cluster ceph, osd.8, block /dev/sda2
# smartctl -a /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A2BEF2
LU WWN Device Id: 5 000c50 0910f5427
Firmware Version: CC43
User Capacity: 1,000,203,804,160 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:22:47 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 110) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 083 063 006 Pre-fail Always - 224621855
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 275
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 081 060 045 Pre-fail Always - 149383284
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 6210
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 265
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 098 098 000 Old_age Always - 2
190 Airflow_Temperature_Cel 0x0022 069 058 040 Old_age Always - 31 (Min/Max 21/35)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 516
194 Temperature_Celsius 0x0022 031 017 000 Old_age Always - 31 (0 17 0 0 0)
195 Hardware_ECC_Recovered 0x001a 005 001 000 Old_age Always - 224621855
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 6154h+03m+35.126s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 24333847321
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 50261005553
However, it's not only these 3 OSDs that have PGs with errors; these
are only the most recent. In the last 3 months I have often had
OSD_SCRUB_ERRORS on various OSDs, always solved by ceph pg repair
<PG>, so I don't think it's a hardware issue.
Post by Marco Baldini - H.S. Amiata
 candidate had a read error
speaks for itself - while scrubbing it couldn't read the data.
I had a similar issue, and it was just an OSD dying - errors and
reallocated sectors in SMART - so we just replaced the disk. But in your
case it seems that the errors are on different OSDs? Are your OSDs
all healthy?
You can use this command to see some details.
rados list-inconsistent-obj <pg.id> --format=json-pretty
pg.id is the PG that's reporting as inconsistent.
My guess is that you'll see read errors in this output, with the OSD
number that encountered the error. After that you have to check that
OSD's health - SMART details, etc.
It's not always the disk itself that causes problems - for
example, we had read errors because of a faulty backplane
interface in a server; changing the chassis resolved this issue.
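
As an illustration of that suggestion (using pg 13.65 from earlier in this
thread purely as an example PG id):

# dump scrub inconsistency details for the example PG
rados list-inconsistent-obj 13.65 --format=json-pretty

A shard that failed the read during scrub typically shows up with a
"read_error" entry in its errors list, together with the id of the OSD
holding that shard.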
2018-03-05 14:21 GMT+03:00 Marco Baldini - H.S. Amiata
Hi
After some days with debug_osd 5/5 I found [ERR] on different
days, in different PGs, different OSDs, different hosts. This is
what I get in the OSD logs:
*OSD.5 (host 3)*
2018-03-01 20:30:02.702269 7fdf4d515700  2 osd.5 pg_epoch: 16486 pg[9.1c( v 16486'51798 (16431'50251,16486'51798] local-lis/les=16474/16475 n=3629 ec=1477/1477 lis/c 16474/16474 les/c/f 16475/16477/0 16474/16474/16474) [5,6] r=0 lpr=16474 crt=16486'51798 lcod 16486'51797 mlcod 16486'51797 active+clean+scrubbing+deep] 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
2018-03-01 20:30:02.702278 7fdf4d515700 -1 log_channel(cluster) log [ERR] : 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
*OSD.4 (host 3)*
2018-02-28 00:03:33.458558 7f112cf76700 -1 log_channel(cluster) log [ERR] : 13.65 shard 2: soid 13:a719ecdf:::rbd_data.5f65056b8b4567.000000000000f8eb:head candidate had a read error
*OSD.8 (host 2)*
2018-02-27 23:55:15.100084 7f4dd0816700 -1 log_channel(cluster) log [ERR] : 14.31 shard 1: soid 14:8cc6cd37:::rbd_data.30b15b6b8b4567.00000000000081a1:head candidate had a read error
I don't know what this error means, and as always a ceph
pg repair fixes it. I don't think this is normal.
Ideas?
Thanks
Hi
I read the bugtracker issue and it seems a lot like my
problem, even if I can't check the reported checksum because
I don't have it in my logs, perhaps it's because of debug
osd = 0/0 in ceph.conf
I just raised the OSD log level
ceph tell osd.* injectargs --debug-osd 5/5
I'll check OSD logs in the next days...
Thanks
Post by Paul Emmerich
Hi,
might be http://tracker.ceph.com/issues/22464
Can you check the OSD log file to see if the reported
checksum is 0x6706be76?
Paul
On 28.02.2018 at 11:43, Marco Baldini - H.S. Amiata wrote
Hello
I have a little ceph cluster with 3 nodes, each with 3x1TB
HDD and 1x240GB SSD. I created this cluster after Luminous
release, so all OSDs are Bluestore. In my crush map I have
two rules, one targeting the SSDs and one targeting the
HDDs. I have 4 pools, one using the SSD rule and the
others using the HDD rule, three pools are size=3
min_size=2, one is size=2 min_size=1 (this one have
content that it's ok to lose)
In the last 3 month I'm having a strange random problem. I
planned my osd scrubs during the night (osd scrub begin
hour = 20, osd scrub end hour = 7) when office is closed
so there is low impact on the users. Some mornings, when I
HEALTH_ERR X scrub errors; Possible data damage: Y pgs inconsistent
OSD_SCRUB_ERRORS X scrub errors
PG_DAMAGED Possible data damage: Y pg inconsistent
X and Y sometimes are 1, sometimes 2.
I issue a ceph health detail, check the damaged PGs, and
run a ceph pg repair for the damaged PGs, I get
instructing pg PG on osd.N to repair
PG are different, OSD that have to repair PG is different,
even the node hosting the OSD is different, I made a list
Post by Marco Baldini - H.S. Amiata
ceph health detail
HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
OSD_SCRUB_ERRORS 2 scrub errors
PG_DAMAGED Possible data damage: 2 pgs inconsistent
pg 13.65 is active+clean+inconsistent, acting [4,2,6]
pg 14.31 is active+clean+inconsistent, acting [8,3,1]
Post by Marco Baldini - H.S. Amiata
ceph pg repair 13.65
instructing pg 13.65 on osd.4 to repair
(node-2)> tail /var/log/ceph/ceph-osd.4.log
2018-02-28 08:38:47.593447 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair starts
2018-02-28 08:39:37.573342 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair ok, 0 fixed
Post by Marco Baldini - H.S. Amiata
ceph pg repair 14.31
instructing pg 14.31 on osd.8 to repair
(node-3)> tail /var/log/ceph/ceph-osd.8.log
2018-02-28 08:52:37.297490 7f4dd0816700 0 log_channel(cluster) log [DBG] : 14.31 repair starts
2018-02-28 08:53:00.704020 7f4dd0816700 0 log_channel(cluster) log [DBG] : 14.31 repair ok, 0 fixed
I made a list of when I got OSD_SCRUB_ERRORS, what PG and
what OSD had to repair PG. Date is dd/mm/yyyy
21/12/2017 -- pg 14.29 is active+clean+inconsistent, acting [6,2,4]
18/01/2018 -- pg 14.5a is active+clean+inconsistent, acting [6,4,1]
22/01/2018 -- pg 9.3a is active+clean+inconsistent, acting [2,7]
29/01/2018 -- pg 13.3e is active+clean+inconsistent, acting [4,6,1]
instructing pg 13.3e on osd.4 to repair
07/02/2018 -- pg 13.7e is active+clean+inconsistent, acting [8,2,5]
instructing pg 13.7e on osd.8 to repair
09/02/2018 -- pg 13.30 is active+clean+inconsistent, acting [7,3,2]
instructing pg 13.30 on osd.7 to repair
15/02/2018 -- pg 9.35 is active+clean+inconsistent, acting [1,8]
instructing pg 9.35 on osd.1 to repair
pg 13.3e is active+clean+inconsistent, acting [4,6,1]
instructing pg 13.3e on osd.4 to repair
17/02/2018 -- pg 9.2d is active+clean+inconsistent, acting [7,5]
instructing pg 9.2d on osd.7 to repair
22/02/2018 -- pg 9.24 is active+clean+inconsistent, acting [5,8]
instructing pg 9.24 on osd.5 to repair
28/02/2018 -- pg 13.65 is active+clean+inconsistent, acting [4,2,6]
instructing pg 13.65 on osd.4 to repair
pg 14.31 is active+clean+inconsistent, acting [8,3,1]
instructing pg 14.31 on osd.8 to repair
[global]
auth client required = none
auth cluster required = none
auth service required = none
fsid = 24d5d6bc-0943-4345-b44e-46c19099004b
cluster network = 10.10.10.0/24
public network = 10.10.10.0/24
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
bluestore_block_db_size = 64424509440
debug asok = 0/0
debug auth = 0/0
debug buffer = 0/0
debug client = 0/0
debug context = 0/0
debug crush = 0/0
debug filer = 0/0
debug filestore = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug journal = 0/0
debug journaler = 0/0
debug lockdep = 0/0
debug mds = 0/0
debug mds balancer = 0/0
debug mds locker = 0/0
debug mds log = 0/0
debug mds log expire = 0/0
debug mds migrator = 0/0
debug mon = 0/0
debug monc = 0/0
debug ms = 0/0
debug objclass = 0/0
debug objectcacher = 0/0
debug objecter = 0/0
debug optracker = 0/0
debug osd = 0/0
debug paxos = 0/0
debug perfcounter = 0/0
debug rados = 0/0
debug rbd = 0/0
debug rgw = 0/0
debug throttle = 0/0
debug timer = 0/0
debug tp = 0/0
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd max backfills = 1
osd recovery max active = 1
osd scrub begin hour = 20
osd scrub end hour = 7
osd scrub during recovery = false
osd scrub load threshold = 0.3
[client]
rbd cache = true
rbd cache size = 268435456 # 256MB
rbd cache max dirty = 201326592 # 192MB
rbd cache max dirty age = 2
rbd cache target dirty = 33554432 # 32MB
rbd cache writethrough until flush = true
#[mgr]
#debug_mgr = 20
[mon.pve-hs-main]
host = pve-hs-main
mon addr = 10.10.10.251:6789
[mon.pve-hs-2]
host = pve-hs-2
mon addr = 10.10.10.252:6789
[mon.pve-hs-3]
host = pve-hs-3
mon addr = 10.10.10.253:6789
{
"mon": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
},
"mgr": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
},
"osd": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 12
},
"mds": {},
"overall": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 18
}
}
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 8.93686 root default
-6 2.94696 host pve-hs-2
3 hdd 0.90959 osd.3 up 1.00000 1.00000
4 hdd 0.90959 osd.4 up 1.00000 1.00000
5 hdd 0.90959 osd.5 up 1.00000 1.00000
10 ssd 0.21819 osd.10 up 1.00000 1.00000
-3 2.86716 host pve-hs-3
6 hdd 0.85599 osd.6 up 1.00000 1.00000
7 hdd 0.85599 osd.7 up 1.00000 1.00000
8 hdd 0.93700 osd.8 up 1.00000 1.00000
11 ssd 0.21819 osd.11 up 1.00000 1.00000
-7 3.12274 host pve-hs-main
0 hdd 0.96819 osd.0 up 1.00000 1.00000
1 hdd 0.96819 osd.1 up 1.00000 1.00000
2 hdd 0.96819 osd.2 up 1.00000 1.00000
9 ssd 0.21819 osd.9 up 1.00000 1.00000
pool 9 'cephbackup' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 5665 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
pool 13 'cephwin' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16454 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~5]
pool 14 'cephnix' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16482 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~227]
pool 17 'cephssd' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 8601 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
I can't understand where the problem comes from. I don't
think it's hardware: if I had a failed disk, then I should
always have problems on the same OSD. Any ideas?
Thanks
--
*Marco Baldini*
*H.S. Amiata Srl*
Ufficio: 0577-779396
Cellulare: 335-8765169
WEB: www.hsamiata.it <https://www.hsamiata.it>
EMAIL: ***@hsamiata.it
Brad Hubbard
2018-03-06 09:10:25 UTC
Permalink
On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <
Post by Marco Baldini - H.S. Amiata
Hi
I monitor dmesg in each of the 3 nodes, no hardware issue reported. And
the problem happens with various different OSDs in different nodes, for me
it is clear it's not an hardware problem.
If you have osd_debug set to 25 or greater when you run the deep scrub you
should get more information about the nature of the read error in the
ReplicatedBackend::be_deep_scrub() function (assuming this is a replicated
pool).

This may create large logs so watch they don't exhaust storage.
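
A minimal sketch of that procedure, assuming osd.4 is the primary of the
inconsistent pg 13.65 as in the earlier reports (both ids are only examples;
use the PG and OSD shown by ceph health detail):

# raise scrub-related logging on the primary OSD of the inconsistent PG
ceph tell osd.4 injectargs --debug-osd 25/25
# force a deep scrub of that PG and wait for it to finish
ceph pg deep-scrub 13.65
# look for the detailed read-error lines produced during the deep scrub
grep -i 'be_deep_scrub\|read error' /var/log/ceph/ceph-osd.4.log
# drop the debug level again so the log does not fill the disk
ceph tell osd.4 injectargs --debug-osd 0/0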
--
Cheers,
Brad
Brad Hubbard
2018-03-06 09:11:11 UTC
Permalink
debug_osd that is... :)
Post by Brad Hubbard
On Tue, Mar 6, 2018 at 5:26 PM, Marco Baldini - H.S. Amiata <
Post by Marco Baldini - H.S. Amiata
Hi
I monitor dmesg in each of the 3 nodes, no hardware issue reported. And
the problem happens with various different OSDs in different nodes, for me
it is clear it's not an hardware problem.
If you have osd_debug set to 25 or greater when you run the deep scrub you
should get more information about the nature of the read error in the
ReplicatedBackend::be_deep_scrub() function (assuming this is a
replicated pool).
This may create large logs so watch they don't exhaust storage.
Post by Marco Baldini - H.S. Amiata
Thanks for reply
Post by Marco Baldini - H.S. Amiata
always solved by ceph pg repair <PG>
That doesn't necessarily means that there's no hardware issue. In my case
repair also worked fine and returned cluster to OK state every time, but in
time faulty disk fail another scrub operation, and this repeated multiple
times before we replaced that disk.
One last thing to look into is dmesg at your OSD nodes. If there's a
hardware read error it will be logged in dmesg.
2018-03-05 18:26 GMT+03:00 Marco Baldini - H.S. Amiata <
Post by Marco Baldini - H.S. Amiata
Hi and thanks for reply
The OSDs are all healthy, in fact after a ceph pg repair <PG> the ceph
health is back to OK and in the OSD log I see <PG> repair ok, 0 fixed
The SMART data of the 3 OSDs seems fine
*OSD.5*
# ceph-disk list | grep osd.5
/dev/sdd1 ceph data, active, cluster ceph, osd.5, block /dev/sdd2
# smartctl -a /dev/sdd
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A1MA1V
LU WWN Device Id: 5 000c50 090c7028b
Firmware Version: CC43
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:17:22 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 109) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 082 063 006 Pre-fail Always - 193297722
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 60
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 091 060 045 Pre-fail Always - 1451132477
9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13283
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 61
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 086 086 000 Old_age Always - 14
190 Airflow_Temperature_Cel 0x0022 071 055 040 Old_age Always - 29 (Min/Max 23/32)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 607
194 Temperature_Celsius 0x0022 029 014 000 Old_age Always - 29 (0 14 0 0 0)
195 Hardware_ECC_Recovered 0x001a 004 001 000 Old_age Always - 193297722
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 13211h+23m+08.363s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 53042120064
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 170788993187
*OSD.4*
# ceph-disk list | grep osd.4
/dev/sdc1 ceph data, active, cluster ceph, osd.4, block /dev/sdc2
# smartctl -a /dev/sdc
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A1M1BW
LU WWN Device Id: 5 000c50 090c78d27
Firmware Version: CC43
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:20:46 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 109) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 082 063 006 Pre-fail Always - 194906537
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 64
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 091 060 045 Pre-fail Always - 1485899434
9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13390
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 65
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 095 095 000 Old_age Always - 5
190 Airflow_Temperature_Cel 0x0022 074 051 040 Old_age Always - 26 (Min/Max 19/29)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 616
194 Temperature_Celsius 0x0022 026 014 000 Old_age Always - 26 (0 14 0 0 0)
195 Hardware_ECC_Recovered 0x001a 004 001 000 Old_age Always - 194906537
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 13315h+20m+30.974s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 52137467719
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 177227508503
*OSD.8*
# ceph-disk list | grep osd.8
/dev/sda1 ceph data, active, cluster ceph, osd.8, block /dev/sda2
# smartctl -a /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.13-6-pve] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST1000DM003-1SB10C
Serial Number: Z9A2BEF2
LU WWN Device Id: 5 000c50 0910f5427
Firmware Version: CC43
User Capacity: 1,000,203,804,160 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Mar 5 16:22:47 2018 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 110) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 083 063 006 Pre-fail Always - 224621855
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 275
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 081 060 045 Pre-fail Always - 149383284
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 6210
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 265
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 098 098 000 Old_age Always - 2
190 Airflow_Temperature_Cel 0x0022 069 058 040 Old_age Always - 31 (Min/Max 21/35)
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 516
194 Temperature_Celsius 0x0022 031 017 000 Old_age Always - 31 (0 17 0 0 0)
195 Hardware_ECC_Recovered 0x001a 005 001 000 Old_age Always - 224621855
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 6154h+03m+35.126s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 24333847321
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 50261005553
However it's not only these 3 OSDs that have PGs with errors, these are
only the most recent. In the last 3 months I have often had OSD_SCRUB_ERRORS on
various OSDs, always solved by ceph pg repair <PG>, so I don't think it's a
hardware issue.
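(Not part of the original mail: one quick way to rule out the obvious disk-level
causes is to compare the usual failure indicators across all drives of a node -
a minimal sketch, assuming the OSD data disks are /dev/sda through /dev/sdd;
adjust the device list to your layout.)

for dev in /dev/sd{a..d}; do
    echo "== $dev =="
    # only the attributes that normally point at a dying disk
    smartctl -A "$dev" | grep -E 'Reallocated_Sector|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error|Reported_Uncorrect'
done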
Post by Marco Baldini - H.S. Amiata
candidate had a read error
speaks for itself - while scrubbing it couldn't read the data.
I had a similar issue, and it was just an OSD dying - errors and reallocated
sectors in SMART - so I just replaced the disk. But in your case it seems that
the errors are on different OSDs? Are your OSDs all healthy?
You can use this command to see some details:
rados list-inconsistent-obj <pg.id> --format=json-pretty
pg.id is the PG that is being reported as inconsistent. My guess is that
you'll see read errors in this output, with the OSD number that encountered
the error. After that you have to check that OSD's health - SMART details, etc.
It's not always the disk itself that's causing the problems - for example, we
had read errors because of a faulty backplane interface in a server;
changing the chassis resolved that issue.
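(A minimal sketch of that workflow, with the pool name, PG id and device as
placeholders rather than values taken from this thread:)

# find the inconsistent PGs, either from the health output or per pool
ceph health detail | grep inconsistent
rados list-inconsistent-pg <pool-name>
# inspect the objects and note which shard/OSD reported the read error
rados list-inconsistent-obj <pg.id> --format=json-pretty
# check the disk behind that OSD and, if it looks clean, repair the PG
smartctl -a /dev/sdX
ceph pg repair <pg.id>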
2018-03-05 14:21 GMT+03:00 Marco Baldini - H.S. Amiata <
Post by Marco Baldini - H.S. Amiata
Hi
After some days with debug_osd 5/5 I found [ERR] entries on different days,
in different PGs, on different OSDs, on different hosts. This is what I get in the logs:
*OSD.5 (host 3)*
2018-03-01 20:30:02.702269 7fdf4d515700 2 osd.5 pg_epoch: 16486 pg[9.1c( v 16486'51798 (16431'50251,16486'51798] local-lis/les=16474/16475 n=3629 ec=1477/1477 lis/c 16474/16474 les/c/f 16475/16477/0 16474/16474/16474) [5,6] r=0 lpr=16474 crt=16486'51798 lcod 16486'51797 mlcod 16486'51797 active+clean+scrubbing+deep] 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
2018-03-01 20:30:02.702278 7fdf4d515700 -1 log_channel(cluster) log [ERR] : 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
*
OSD.4 (host 3)*
2018-02-28 00:03:33.458558 7f112cf76700 -1 log_channel(cluster) log [ERR] : 13.65 shard 2: soid 13:a719ecdf:::rbd_data.5f65056b8b4567.000000000000f8eb:head candidate had a read error
*OSD.8 (host 2)*
2018-02-27 23:55:15.100084 7f4dd0816700 -1 log_channel(cluster) log [ERR] : 14.31 shard 1: soid 14:8cc6cd37:::rbd_data.30b15b6b8b4567.00000000000081a1:head candidate had a read error
I don't know what this error means, and as always a ceph pg repair
fixes it. I don't think this is normal.
Ideas?
Thanks
Hi
I read the bugtracker issue and it looks a lot like my problem, even if
I can't check the reported checksum because it's not in my logs -
perhaps because of debug osd = 0/0 in ceph.conf.
I just raised the OSD log level
ceph tell osd.* injectargs --debug-osd 5/5
I'll check OSD logs in the next days...
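(Note that injectargs only changes the running daemons; roughly, to keep the
level across restarts the setting has to go into ceph.conf, and it can be
turned back down the same way once the investigation is over:)

# persist across restarts: add under [osd] in ceph.conf
#   debug osd = 5/5
# revert at runtime when done
ceph tell osd.* injectargs --debug-osd 0/0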
Thanks
Hi,
might be http://tracker.ceph.com/issues/22464
Can you check the OSD log file to see if the reported checksum is 0x6706be76?
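(For example, something along these lines should find it, assuming the default
log location already used elsewhere in this thread:)

grep -E '0x6706be76|candidate had a read error' /var/log/ceph/ceph-osd.*.log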
Paul
On 28.02.2018 at 11:43, Marco Baldini - H.S. Amiata wrote <
Hello
I have a little ceph cluster with 3 nodes, each with 3x1TB HDD and
1x240GB SSD. I created this cluster after Luminous release, so all OSDs are
Bluestore. In my crush map I have two rules, one targeting the SSDs and one
targeting the HDDs. I have 4 pools, one using the SSD rule and the others
using the HDD rule, three pools are size=3 min_size=2, one is size=2
min_size=1 (this one have content that it's ok to lose)
In the last 3 month I'm having a strange random problem. I planned my
osd scrubs during the night (osd scrub begin hour = 20, osd scrub end hour
= 7) when office is closed so there is low impact on the users. Some
HEALTH_ERR X scrub errors; Possible data damage: Y pgs inconsistent
OSD_SCRUB_ERRORS X scrub errors
PG_DAMAGED Possible data damage: Y pg inconsistent
X and Y sometimes are 1, sometimes 2.
I issue a ceph health detail, check the damaged PGs, and run a ceph pg
repair for the damaged PGs, I get
instructing pg PG on osd.N to repair
PG are different, OSD that have to repair PG is different, even the
node hosting the OSD is different, I made a list of all PGs and OSDs. This
Post by Marco Baldini - H.S. Amiata
ceph health detail
HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
OSD_SCRUB_ERRORS 2 scrub errors
PG_DAMAGED Possible data damage: 2 pgs inconsistent
pg 13.65 is active+clean+inconsistent, acting [4,2,6]
pg 14.31 is active+clean+inconsistent, acting [8,3,1]
Post by Marco Baldini - H.S. Amiata
ceph pg repair 13.65
instructing pg 13.65 on osd.4 to repair
(node-2)> tail /var/log/ceph/ceph-osd.4.log
2018-02-28 08:38:47.593447 7f112cf76700 0 log_channel(cluster) log [DBG] : 13.65 repair starts
2018-02-28 08:39:37.573342 7f112cf76700 0 log_channel(cluster) log [DBG] : 13.65 repair ok, 0 fixed
Post by Marco Baldini - H.S. Amiata
ceph pg repair 14.31
instructing pg 14.31 on osd.8 to repair
(node-3)> tail /var/log/ceph/ceph-osd.8.log
2018-02-28 08:52:37.297490 7f4dd0816700 0 log_channel(cluster) log [DBG] : 14.31 repair starts
2018-02-28 08:53:00.704020 7f4dd0816700 0 log_channel(cluster) log [DBG] : 14.31 repair ok, 0 fixed
I made a list of when I got OSD_SCRUB_ERRORS, what PG and what OSD had
to repair PG. Date is dd/mm/yyyy
21/12/2017 -- pg 14.29 is active+clean+inconsistent, acting [6,2,4]
18/01/2018 -- pg 14.5a is active+clean+inconsistent, acting [6,4,1]
22/01/2018 -- pg 9.3a is active+clean+inconsistent, acting [2,7]
29/01/2018 -- pg 13.3e is active+clean+inconsistent, acting [4,6,1]
instructing pg 13.3e on osd.4 to repair
07/02/2018 -- pg 13.7e is active+clean+inconsistent, acting [8,2,5]
instructing pg 13.7e on osd.8 to repair
09/02/2018 -- pg 13.30 is active+clean+inconsistent, acting [7,3,2]
instructing pg 13.30 on osd.7 to repair
15/02/2018 -- pg 9.35 is active+clean+inconsistent, acting [1,8]
instructing pg 9.35 on osd.1 to repair
pg 13.3e is active+clean+inconsistent, acting [4,6,1]
instructing pg 13.3e on osd.4 to repair
17/02/2018 -- pg 9.2d is active+clean+inconsistent, acting [7,5]
instructing pg 9.2d on osd.7 to repair
22/02/2018 -- pg 9.24 is active+clean+inconsistent, acting [5,8]
instructing pg 9.24 on osd.5 to repair
28/02/2018 -- pg 13.65 is active+clean+inconsistent, acting [4,2,6]
instructing pg 13.65 on osd.4 to repair
pg 14.31 is active+clean+inconsistent, acting [8,3,1]
instructing pg 14.31 on osd.8 to repair
[global]
auth client required = none
auth cluster required = none
auth service required = none
fsid = 24d5d6bc-0943-4345-b44e-46c19099004b
cluster network = 10.10.10.0/24
public network = 10.10.10.0/24
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
bluestore_block_db_size = 64424509440
debug asok = 0/0
debug auth = 0/0
debug buffer = 0/0
debug client = 0/0
debug context = 0/0
debug crush = 0/0
debug filer = 0/0
debug filestore = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug journal = 0/0
debug journaler = 0/0
debug lockdep = 0/0
debug mds = 0/0
debug mds balancer = 0/0
debug mds locker = 0/0
debug mds log = 0/0
debug mds log expire = 0/0
debug mds migrator = 0/0
debug mon = 0/0
debug monc = 0/0
debug ms = 0/0
debug objclass = 0/0
debug objectcacher = 0/0
debug objecter = 0/0
debug optracker = 0/0
debug osd = 0/0
debug paxos = 0/0
debug perfcounter = 0/0
debug rados = 0/0
debug rbd = 0/0
debug rgw = 0/0
debug throttle = 0/0
debug timer = 0/0
debug tp = 0/0
[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd max backfills = 1
osd recovery max active = 1
osd scrub begin hour = 20
osd scrub end hour = 7
osd scrub during recovery = false
osd scrub load threshold = 0.3
[client]
rbd cache = true
rbd cache size = 268435456 # 256MB
rbd cache max dirty = 201326592 # 192MB
rbd cache max dirty age = 2
rbd cache target dirty = 33554432 # 32MB
rbd cache writethrough until flush = true
#[mgr]
#debug_mgr = 20
[mon.pve-hs-main]
host = pve-hs-main
mon addr = 10.10.10.251:6789
[mon.pve-hs-2]
host = pve-hs-2
mon addr = 10.10.10.252:6789
[mon.pve-hs-3]
host = pve-hs-3
mon addr = 10.10.10.253:6789
{
"mon": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
},
"mgr": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
},
"osd": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 12
},
"mds": {},
"overall": {
"ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 18
}
}
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 8.93686 root default
-6 2.94696 host pve-hs-2
3 hdd 0.90959 osd.3 up 1.00000 1.00000
4 hdd 0.90959 osd.4 up 1.00000 1.00000
5 hdd 0.90959 osd.5 up 1.00000 1.00000
10 ssd 0.21819 osd.10 up 1.00000 1.00000
-3 2.86716 host pve-hs-3
6 hdd 0.85599 osd.6 up 1.00000 1.00000
7 hdd 0.85599 osd.7 up 1.00000 1.00000
8 hdd 0.93700 osd.8 up 1.00000 1.00000
11 ssd 0.21819 osd.11 up 1.00000 1.00000
-7 3.12274 host pve-hs-main
0 hdd 0.96819 osd.0 up 1.00000 1.00000
1 hdd 0.96819 osd.1 up 1.00000 1.00000
2 hdd 0.96819 osd.2 up 1.00000 1.00000
9 ssd 0.21819 osd.9 up 1.00000 1.00000
pool 9 'cephbackup' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 5665 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
pool 13 'cephwin' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16454 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~5]
pool 14 'cephnix' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16482 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~227]
pool 17 'cephssd' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 8601 flags hashpspool stripe_width 0 application rbd
removed_snaps [1~3]
I can't understand where the problem comes from. I don't think it's
hardware: if I had a failed disk, I should always have problems on the
same OSD. Any ideas?
Thanks
--
*Marco Baldini*
*H.S. Amiata Srl*
Office: 0577-779396
Mobile: 335-8765169
WEB: www.hsamiata.it
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
--
Mit freundlichen Grüßen / Best Regards
Paul Emmerich
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
Geschäftsführer: Martin Verges
Handelsregister: Amtsgericht München
USt-IdNr: DE310638492
--
Cheers,
Brad