Gökhan Kocak
2018-11-14 15:32:39 UTC
Hello everyone,
we encountered an error with the Prometheus plugin for Ceph mgr.
One OSD was down and (therefore) had no device class:
```
sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME STATUS REWEIGHT PRI-AFF
28   hdd 7.27539      osd.28   up  1.00000 1.00000
 6             0      osd.6  down        0 1.00000
```
When we tried to curl the metrics, the request failed because the OSD
had no class (see "KeyError: 'class'" below).
Has anybody experienced the same?
Isn't this a bug in the Prometheus plugin? When an OSD is down, the plugin should not stop working, imo.
```
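~> curl -v 127.0.0.1:9283/metrics
* Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 9283 (#0)
> GET /metrics HTTP/1.1
> Host: 127.0.0.1:9283
> User-Agent: curl/7.47.0
> Accept: */*
< HTTP/1.1 500 Internal Server Error
< Date: Wed, 14 Nov 2018 13:59:59 GMT
< Content-Length: 1663
< Content-Type: text/html;charset=utf-8
< Server: CherryPy/3.5.0
<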
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=utf-8"></meta>
<title>500 Internal Server Error</title>
<style type="text/css">
#powered_by {
margin-top: 20px;
border-top: 2px solid black;
font-style: italic;
}
#traceback {
color: red;
}
</style>
</head>
<body>
<h2>500 Internal Server Error</h2>
<p>The server encountered an unexpected condition which
prevented it from fulfilling the request.</p>
<pre id="traceback">Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 414, in metrics
    metrics = global_instance().collect()
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 351, in collect
    self.get_metadata_and_osd_status()
  File "/usr/lib/x86_64-linux-gnu/ceph/mgr/prometheus/module.py", line 310, in get_metadata_and_osd_status
    dev_class['class'],
KeyError: 'class'
</pre>
<div id="powered_by">
<span>
Powered by <a href="http://www.cherrypy.org">CherryPy 3.5.0</a>
</span>
</div>
</body>
</html>
* Connection #0 to host 127.0.0.1 left intact
```
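For illustration, here is a minimal sketch of the kind of defensive lookup that I would expect to avoid the crash. It assumes the plugin reads the device class from a dict that may lack the 'class' key, as the traceback suggests. The function name `osd_device_class` and the sample data are hypothetical and not taken from the actual module.py:

```python
def osd_device_class(dev_class):
    """Return the OSD's device class, or '' if none is assigned."""
    # dict.get() returns the default instead of raising KeyError,
    # which is what dev_class['class'] does for an OSD without a class.
    return dev_class.get('class', '')


# Hypothetical sample data mirroring the two OSDs in the tree output above.
osds = [
    {'id': 28, 'class': 'hdd'},  # up, class assigned
    {'id': 6},                   # down, no class key at all
]

labels = [(osd['id'], osd_device_class(osd)) for osd in osds]
print(labels)
```

With something like this, an OSD without a class would just get an empty label instead of breaking the whole /metrics endpoint.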
Kind regards,
Gökhan