Discussion:
qemu-1.4.0 and onwards, Linux kernel 3.2.x, ceph-RBD: heavy I/O leads to a kernel hung_task_timeout_secs message and an unresponsive qemu process
Oliver Francke
2013-08-02 10:22:55 UTC
Well,

I believe I'm the winner of buzzword bingo for today.

But seriously speaking... as I don't have this particular problem with
qcow2 on kernel 3.2, nor with qemu-1.2.2 or newer kernels, I hope I'm not
alone here?
We have a rising number of tickets from people reinstalling from ISOs
with a 3.2 kernel.

A fast fallback is to start all VMs with qemu-1.2.2, but we then lose
some features à la the latency-free RBD cache ;)

I just opened a bug for qemu:

https://bugs.launchpad.net/qemu/+bug/1207686

with all dirty details.

Installing a 3.9.x backport kernel or upgrading the Ubuntu kernel to 3.8.x
"fixes" it. So we have a bad combination on all distros with a 3.2 kernel
and rbd as the storage backend, I assume.

Any similar findings?
Any ideas for tracing/debugging ( Josh? ;) ) are very welcome,

Oliver.
--
Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Mike Dawson
2013-08-02 21:47:47 UTC
Oliver,

We've had a similar situation occur. For about three months, we've run
several Windows 2008 R2 guests with virtio drivers that record video
surveillance. We have long suffered an issue where the guest appears to
hang indefinitely (or until we intervene). For the sake of this
conversation, we call this state "wedged", because it appears something
(rbd, qemu, virtio, etc.) gets stuck in a deadlock. When a guest gets
wedged, we see the following:

- the guest will not respond to pings
- the qemu-system-x86_64 process drops to 0% cpu
- graphite graphs show the interface traffic dropping to 0bps
- the guest will stay wedged forever (or until we intervene)
- strace of qemu-system-x86_64 shows QEMU is making progress [1][2]

We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs as
expected. At that point we can examine the guest. Each time we'll see:

- No Windows error logs whatsoever while the guest is wedged
- A time sync typically occurs right after the guest gets un-wedged
- Scheduled tasks do not run while wedged
- Windows error logs do not show any evidence of suspend, sleep, etc

We had so many issues with guests becoming wedged that we wrote a script
to 'virsh screenshot' them via cron. Then we installed some updates and
had a month or so of higher stability (wedging happened maybe 1/10th as
often). Until today we couldn't figure out why.
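
A minimal sketch of what such a cron-driven workaround can look like,
assuming the guests are libvirt domains (the script name and screenshot
path are made up):

#!/bin/sh
# un-wedge-guests.sh: run 'virsh screenshot' against every running
# domain; as described above, this un-wedges stuck guests.
# Example cron entry: */5 * * * * /usr/local/sbin/un-wedge-guests.sh
for dom in $(virsh list --name); do
    virsh screenshot "$dom" "/tmp/${dom}.ppm" >/dev/null 2>&1
done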

Yesterday, I realized qemu was starting the instances without specifying
cache=writeback. We corrected that, and let them run overnight. With RBD
writeback re-enabled, wedging came back as often as we had seen in the
past. I've counted ~40 occurrences in the past 12-hour period. So I feel
like writeback caching in RBD certainly makes the deadlock more likely
to occur.

Joshd asked us to gather RBD client logs:

"joshd> it could very well be the writeback cache not doing a callback
at some point - if you could gather logs of a vm getting stuck with
debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be
great"

We'll do that over the weekend. If you could as well, we'd love the help!
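
For reference, those options would typically live in the [client]
section of ceph.conf on the qemu host, along these lines (the log file
path is just an assumption):

[client]
    debug rbd = 20
    debug ms = 1
    debug objectcacher = 30
    log file = /var/log/ceph/client.$name.$pid.log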

[1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt
[2] http://www.gammacode.com/kvm/not-wedged.txt

Thanks,

Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250
Post by Oliver Francke
Well,
I believe I'm the winner of buzzword bingo for today.
But seriously speaking... as I don't have this particular problem with
qcow2 on kernel 3.2, nor with qemu-1.2.2 or newer kernels, I hope I'm not
alone here?
We have a rising number of tickets from people reinstalling from ISOs
with a 3.2 kernel.
A fast fallback is to start all VMs with qemu-1.2.2, but we then lose
some features à la the latency-free RBD cache ;)
https://bugs.launchpad.net/qemu/+bug/1207686
with all dirty details.
Installing a 3.9.x backport kernel or upgrading the Ubuntu kernel to 3.8.x
"fixes" it. So we have a bad combination on all distros with a 3.2 kernel
and rbd as the storage backend, I assume.
Any similar findings?
Any ideas for tracing/debugging ( Josh? ;) ) are very welcome,
Oliver.
Oliver Francke
2013-08-04 13:36:52 UTC
Hi Mike,

you might be the guy StefanHa was referring to on the qemu-devel mailing list.

I just ran some more tests, so...
Post by Mike Dawson
Oliver,
- the guest will not respond to pings
When the hung_task message shows up, I can still ping and establish new ssh sessions; just the session with the while loop no longer accepts any keyboard input.
Post by Mike Dawson
- the qemu-system-x86_64 process drops to 0% cpu
- graphite graphs show the interface traffic dropping to 0bps
- the guest will stay wedged forever (or until we intervene)
- strace of qemu-system-x86_64 shows QEMU is making progress [1][2]
nothing special here:

5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=19, events=POLLIN}, {fd=15, events=POLLIN}, {fd=4, events=POLLIN}], 11, -1) = 1 ([{fd=12, revents=POLLIN}])
[pid 11793] read(5, 0x7fff16b61f00, 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 11793] read(12, "\2\0\0\0\0\0\0\0\0\0\0\0\0\361p\0\252\340\374\373\373!gH\10\0E\0\0Yq\374"..., 69632) = 115
[pid 11793] read(12, 0x7f0c1737fcec, 69632) = -1 EAGAIN (Resource temporarily unavailable)
[pid 11793] poll([{fd=27, events=POLLIN|POLLERR|POLLHUP}, {fd=26, events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=3, events=POLLIN|POLLERR|POLLHUP}, {fd=

and it's like that for many, many threads.
Inside the VM I see 75% I/O wait, but I can restart the spew test in a second session.

All that tested with rbd_cache=false,cache=none.
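
For the record, that corresponds to a qemu drive spec roughly like the
following (the pool/image name is hypothetical):

qemu-system-x86_64 ... \
  -drive file=rbd:rbd/vm-disk:rbd_cache=false,format=raw,cache=none,if=virtio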

I also test every qemu version with a 2-vCPU, 2 GiB mem Windows 7 VM under some high load, encountering no problems ATM. Running smooth and fast.
Post by Mike Dawson
- No Windows error logs whatsoever while the guest is wedged
- A time sync typically occurs right after the guest gets un-wedged
- Scheduled tasks do not run while wedged
- Windows error logs do not show any evidence of suspend, sleep, etc
We had so many issues with guests becoming wedged that we wrote a script to 'virsh screenshot' them via cron. Then we installed some updates and had a month or so of higher stability (wedging happened maybe 1/10th as often). Until today we couldn't figure out why.
Yesterday, I realized qemu was starting the instances without specifying cache=writeback. We corrected that, and let them run overnight. With RBD writeback re-enabled, wedging came back as often as we had seen in the past. I've counted ~40 occurrences in the past 12-hour period. So I feel like writeback caching in RBD certainly makes the deadlock more likely to occur.
"joshd> it could very well be the writeback cache not doing a callback at some point - if you could gather logs of a vm getting stuck with debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be great"
We'll do that over the weekend. If you could as well, we'd love the help!
[1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt
[2] http://www.gammacode.com/kvm/not-wedged.txt
As I wrote above, no cache is in use so far, so I'm omitting the verbose debugging for the moment. But I will if requested.

Thnx for your report,

Oliver.
Post by Mike Dawson
Thanks,
Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250
Post by Oliver Francke
Well,
I believe, I'm the winner of buzzwords-bingo for today.
But seriously speaking... as I don't have this particular problem with
qcow2 with kernel 3.2 nor qemu-1.2.2 nor newer kernels, I hope I'm not
alone here?
We have a raising number of tickets from people reinstalling from ISO's
with 3.2-kernel.
Fast fallback is to start all VM's with qemu-1.2.2, but we then lose
some features ala latency-free-RBD-cache ;)
https://bugs.launchpad.net/qemu/+bug/1207686
with all dirty details.
Installing a backport-kernel 3.9.x or upgrade Ubuntu-kernel to 3.8.x
"fixes" it. So we have a bad combination for all distros with 3.2-kernel
and rbd as storage-backend, I assume.
Any similar findings?
Any idea of tracing/debugging ( Josh? ;) ) very welcome,
Oliver.
Stefan Hajnoczi
2013-08-05 07:48:35 UTC
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.

This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.

Now here is where the behavior varies between configurations:

On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.

On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
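
A quick way to confirm this from inside a hung Linux guest is to have
the kernel dump its blocked (uninterruptible) tasks, e.g.:

# dump all tasks in uninterruptible (D) state to the kernel log
echo w > /proc/sysrq-trigger
dmesg | tail -n 50
# the hung-task watchdog timeout itself is tunable
sysctl kernel.hung_task_timeout_secs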

Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.

I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).

Stefan
Mike Dawson
2013-08-05 20:08:47 UTC
Josh,

Logs are uploaded to cephdrop with the file name
mikedawson-rbd-qemu-deadlock.

- At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0
- At about 2013-08-05 19:53:51, ran a 'virsh screenshot'


Environment is:

- Ceph 0.61.7 (client is co-mingled with three OSDs)
- rbd cache = true and cache=writeback
- qemu 1.4.0 (1.4.0+dfsg-1expubuntu4)
- Ubuntu Raring with 3.8.0-25-generic

This issue is reproducible in my environment, and I'm willing to run any
wip branch you need. What else can I provide to help?

Thanks,
Mike Dawson
Post by Stefan Hajnoczi
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
Sage Weil
2013-08-13 21:26:07 UTC
Post by Mike Dawson
Josh,
Logs are uploaded to cephdrop with the file name mikedawson-rbd-qemu-deadlock.
- At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0
- At about 2013-08-05 19:53:51, ran a 'virsh screenshot'
- Ceph 0.61.7 (client is co-mingled with three OSDs)
- rbd cache = true and cache=writeback
- qemu 1.4.0 (1.4.0+dfsg-1expubuntu4)
- Ubuntu Raring with 3.8.0-25-generic
This issue is reproducible in my environment, and I'm willing to run any wip
branch you need. What else can I provide to help?
This looks like a different issue than Oliver's. I see one anomaly in the
log, where an rbd io completion is triggered a second time for no apparent
reason. I opened a separate bug

http://tracker.ceph.com/issues/5955

and pushed wip-5955 that will hopefully shine some light on the weird
behavior I saw. Can you reproduce with this branch and

debug objectcacher = 20
debug ms = 1
debug rbd = 20
debug finisher = 20
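
For anyone reproducing this: in the bobtail era a wip branch was
typically tested by building from source with the autotools flow ceph
used at the time, roughly:

git clone --recursive https://github.com/ceph/ceph.git
cd ceph
git checkout wip-5955
./autogen.sh && ./configure && make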

Thanks!
sage
Thanks,
Mike Dawson
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs as
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
James Harper
2013-08-13 22:00:03 UTC
Post by Sage Weil
This looks like a different issue than Oliver's. I see one anomaly in the
log, where a rbd io completion is triggered a second time for no apparent
reason. I opened a separate bug
http://tracker.ceph.com/issues/5955
and pushed wip-5955 that will hopefully shine some light on the weird
behavior I saw. Can you reproduce with this branch and
Do you think this could be a bug in rbd? I'm seeing a bug in the tapdisk rbd code, and if the completion was called twice it could cause the crash I'm seeing too.

Unfortunately I can't get gdb to work with pthreads, so I can't get a backtrace.
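
In case it helps: attaching from outside and dumping every thread's
stack sometimes works even when interactive pthread debugging does not,
e.g.:

# attach to the running tapdisk process and print all thread backtraces
gdb -p "$(pidof tapdisk)" -batch -ex 'set pagination off' -ex 'thread apply all bt'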

James
Oliver Francke
2013-08-08 12:40:49 UTC
Hi Josh,

I have a session logged with:

debug_ms=1:debug_rbd=20:debug_objectcacher=30

as you requested from Mike, even though I think we have another story
here anyway.
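
That colon-separated string matches the form qemu's rbd driver accepts
inside the drive filename, where extra key=value pairs are passed
through to librados; a sketch with a hypothetical image name:

-drive file=rbd:rbd/vm-disk:debug_ms=1:debug_rbd=20:debug_objectcacher=30,format=raw,if=virtio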

Host kernel is 3.10.0-rc7, the qemu client is 1.6.0-rc2, and the guest kernel is
3.2.0-51-amd...

Do you want me to open a ticket for that stuff? I have about a 5 MB
compressed logfile waiting for you ;)

Thnx in advance,

Oliver.
Post by Stefan Hajnoczi
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Josh Durgin
2013-08-08 17:01:24 UTC
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even though I think we have another story
here anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang, that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.

Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
Oliver Francke
2013-08-09 09:22:00 UTC
Hi Josh,

just opened

http://tracker.ceph.com/issues/5919

with all collected information, incl. the debug log.

Hope it helps,

Oliver.
Post by Josh Durgin
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even though I think we have another story
here anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang, that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Geschäftsführer: J.Rehpöhler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Sage Weil
2013-08-13 21:34:45 UTC
Hi Oliver,

(Posted this on the bug too, but:)

Your last log revealed a bug in the librados aio flush. A fix is pushed
to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you please
retest (with caching off again)?

Thanks!
sage
Post by Oliver Francke
Hi Josh,
just opened
http://tracker.ceph.com/issues/5919
with all collected information, incl. the debug log.
Hope it helps,
Oliver.
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even though I think we have another story
here anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the guest
hang, that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke
filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh
Geschäftsführer: J.Rehpöhler | C.Kunz
Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Sage Weil
2013-08-13 21:34:45 UTC
Permalink
Hi Oliver,

(Posted this on the bug too, but:)

Your last log revealed a bug in the librados aio flush. A fix is pushed
to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you retest
please (with caching off again)?

Thanks!
sage
Post by Oliver Francke
Hi Josh,
just opened
http://tracker.ceph.com/issues/5919
with all collected information incl. debug-log.
Hope it helps,
Oliver.
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the guest
hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke
filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh
Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz
Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Sage Weil
2013-08-13 21:34:45 UTC
Permalink
Hi Oliver,

(Posted this on the bug too, but:)

Your last log revealed a bug in the librados aio flush. A fix is pushed
to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you retest
please (with caching off again)?

Thanks!
sage
Post by Oliver Francke
Hi Josh,
just opened
http://tracker.ceph.com/issues/5919
with all collected information incl. debug-log.
Hope it helps,
Oliver.
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the guest
hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke
filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh
Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz
Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Sage Weil
2013-08-13 21:34:45 UTC
Permalink
Hi Oliver,

(Posted this on the bug too, but:)

Your last log revealed a bug in the librados aio flush. A fix is pushed
to wip-librados-aio-flush (bobtail) and wip-5919 (master); can you retest
please (with caching off again)?

Thanks!
sage
Post by Oliver Francke
Hi Josh,
just opened
http://tracker.ceph.com/issues/5919
with all collected information incl. debug-log.
Hope it helps,
Oliver.
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the guest
hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke
filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh
Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz
Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
_______________________________________________
ceph-users mailing list
ceph-users at lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Oliver Francke
2013-08-09 09:22:00 UTC
Permalink
Hi Josh,

just opened

http://tracker.ceph.com/issues/5919

with all collected information incl. debug-log.

Hope it helps,

Oliver.
Post by Josh Durgin
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh

Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Oliver Francke
2013-08-09 09:22:00 UTC
Permalink
Hi Josh,

just opened

http://tracker.ceph.com/issues/5919

with all collected information incl. debug-log.

Hope it helps,

Oliver.
Post by Josh Durgin
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh

Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Oliver Francke
2013-08-09 09:22:00 UTC
Permalink
Hi Josh,

just opened

http://tracker.ceph.com/issues/5919

with all collected information incl. debug-log.

Hope it helps,

Oliver.
Post by Josh Durgin
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh

Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Josh Durgin
2013-08-08 17:01:24 UTC
Permalink
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.

Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
Josh Durgin
2013-08-08 17:01:24 UTC
Permalink
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.

Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
Josh Durgin
2013-08-08 17:01:24 UTC
Permalink
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.

Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
Mike Dawson
2013-08-05 20:08:47 UTC
Permalink
Josh,

Logs are uploaded to cephdrop with the file name
mikedawson-rbd-qemu-deadlock.

- At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0
- At about 2013-08-05 19:53:51, ran a 'virsh screenshot'


Environment is:

- Ceph 0.61.7 (client is co-mingled with three OSDs)
- rbd cache = true and cache=writeback
- qemu 1.4.0 1.4.0+dfsg-1expubuntu4
- Ubuntu Raring with 3.8.0-25-generic

This issue is reproducible in my environment, and I'm willing to run any
wip branch you need. What else can I provide to help?

Thanks,
Mike Dawson
Post by Stefan Hajnoczi
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
Oliver Francke
2013-08-08 12:40:49 UTC
Permalink
Hi Josh,

I have a session logged with:

debug_ms=1:debug_rbd=20:debug_objectcacher=30

as you requested from Mike, even if I think, we do have another story
here, anyway.

Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...

Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)

Thnx in advance,

Oliver.
Post by Stefan Hajnoczi
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh

Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Mike Dawson
2013-08-05 20:08:47 UTC
Permalink
Josh,

Logs are uploaded to cephdrop with the file name
mikedawson-rbd-qemu-deadlock.

- At about 2013-08-05 19:46 or 47, we hit the issue, traffic went to 0
- At about 2013-08-05 19:53:51, ran a 'virsh screenshot'


Environment is:

- Ceph 0.61.7 (client is co-mingled with three OSDs)
- rbd cache = true and cache=writeback
- qemu 1.4.0 1.4.0+dfsg-1expubuntu4
- Ubuntu Raring with 3.8.0-25-generic

This issue is reproducible in my environment, and I'm willing to run any
wip branch you need. What else can I provide to help?

Thanks,
Mike Dawson
Post by Stefan Hajnoczi
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
Oliver Francke
2013-08-08 12:40:49 UTC
Permalink
Hi Josh,

I have a session logged with:

debug_ms=1:debug_rbd=20:debug_objectcacher=30

as you requested from Mike, even if I think, we do have another story
here, anyway.

Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...

Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)

Thnx in advance,

Oliver.
Post by Stefan Hajnoczi
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestra?e 25a
33330 G?tersloh
HRB4355 AG G?tersloh

Gesch?ftsf?hrer: J.Rehp?hler | C.Kunz

Folgen Sie uns auf Twitter: http://twitter.com/filoogmbh
Stefan Hajnoczi
2013-08-05 07:48:35 UTC
Permalink
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.

This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.

Now here is where the behavior varies between configurations:

On a Windows guest with 1 vCPU, you may see the symptom that the guest no
longer responds to ping.

On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
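
As an aside, a generic way to confirm this from inside a Linux guest (not
specific to this bug) is to look for tasks stuck in uninterruptible sleep and
dump their kernel stacks via sysrq:

# list tasks in D (uninterruptible) state
ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /^D/'
# write the kernel stacks of all blocked tasks to the kernel log
echo w > /proc/sysrq-trigger
dmesg | tail -n 60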

Basically, the symptoms depend not just on how QEMU is behaving but also
on the guest kernel and how many vCPUs you have configured.

I think this can explain how both problems you are observing, Oliver and
Mike, are a result of the same bug. At least I hope they are :).

Stefan
Oliver Francke
2013-08-04 13:36:52 UTC
Permalink
Hi Mike,

you might be the guy StefanHa was referring to on the qemu-devel mailing-list.

I just made some more tests, so…
Post by Mike Dawson
Oliver,
- the guest will not respond to pings
When the hung_task message shows up, I can still ping and establish new ssh sessions; only the session running the while loop stops accepting keyboard input.
Post by Mike Dawson
- the qemu-system-x86_64 process drops to 0% cpu
- graphite graphs show the interface traffic dropping to 0bps
- the guest will stay wedged forever (or until we intervene)
- strace of qemu-system-x86_64 shows QEMU is making progress [1][2]
nothing special here:

5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=19, events=POLLIN}, {fd=15, events=POLLIN}, {fd=4, events=POLLIN}], 11, -1) = 1 ([{fd=12, revents=POLLIN}])
[pid 11793] read(5, 0x7fff16b61f00, 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 11793] read(12, "\2\0\0\0\0\0\0\0\0\0\0\0\0\361p\0\252\340\374\373\373!gH\10\0E\0\0Yq\374"..., 69632) = 115
[pid 11793] read(12, 0x7f0c1737fcec, 69632) = -1 EAGAIN (Resource temporarily unavailable)
[pid 11793] poll([{fd=27, events=POLLIN|POLLERR|POLLHUP}, {fd=26, events=POLLIN|POLLERR|POLLHUP}, {fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=12, events=POLLIN|POLLERR|POLLHUP}, {fd=3, events=POLLIN|POLLERR|POLLHUP}, {fd=

and it looks like that for many, many threads.
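
(FWIW, the excerpt above was captured with something along the lines of
strace -f -tt -p <qemu-pid>; the exact flags are a guess.)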
Inside the VM I see 75% I/O wait, but I can restart the spew test in a second session.

All that tested with rbd_cache=false,cache=none.

I also test every qemu version with a 2-vCPU, 2 GiB Windows 7 VM under some high load, encountering no problems at the moment. It runs smooth and fast.
Post by Mike Dawson
- No Windows error logs whatsoever while the guest is wedged
- A time sync typically occurs right after the guest gets un-wedged
- Scheduled tasks do not run while wedged
- Windows error logs do not show any evidence of suspend, sleep, etc
We had so many issue with guests becoming wedged, we wrote a script to 'virsh screenshot' them via cron. Then we installed some updates and had a month or so of higher stability (wedging happened maybe 1/10th as often). Until today we couldn't figure out why.
Yesterday, I realized qemu was starting the instances without specifying cache=writeback. We corrected that, and let them run overnight. With RBD writeback re-enabled, wedging came back as often as we had seen in the past. I've counted ~40 occurrences in the past 12-hour period. So I feel like writeback caching in RBD certainly makes the deadlock more likely to occur.
"joshd> it could very well be the writeback cache not doing a callback at some point - if you could gather logs of a vm getting stuck with debug rbd = 20, debug ms = 1, and debug objectcacher = 30 that would be great"
We'll do that over the weekend. If you could as well, we'd love the help!
[1] http://www.gammacode.com/kvm/wedged-with-timestamps.txt
[2] http://www.gammacode.com/kvm/not-wedged.txt
As I wrote above, no cache is involved so far, so I'm omitting the verbose debugging for the moment. But I will if requested.

Thnx for your report,

Oliver.
Post by Mike Dawson
Thanks,
Mike Dawson
Co-Founder & Director of Cloud Architecture
Cloudapt LLC
6330 East 75th Street, Suite 170
Indianapolis, IN 46250
Post by Oliver Francke
Well,
I believe, I'm the winner of buzzwords-bingo for today.
But seriously speaking... as I don't have this particular problem with
qcow2 with kernel 3.2 nor qemu-1.2.2 nor newer kernels, I hope I'm not
alone here?
We have a raising number of tickets from people reinstalling from ISO's
with 3.2-kernel.
Fast fallback is to start all VM's with qemu-1.2.2, but we then lose
some features ala latency-free-RBD-cache ;)
https://bugs.launchpad.net/qemu/+bug/1207686
with all dirty details.
Installing a backport-kernel 3.9.x or upgrade Ubuntu-kernel to 3.8.x
"fixes" it. So we have a bad combination for all distros with 3.2-kernel
and rbd as storage-backend, I assume.
Any similar findings?
Any idea of tracing/debugging ( Josh? ;) ) very welcome,
Oliver.
Andrei Mikhailovsky
2013-08-09 14:05:22 UTC
Permalink
I can confirm that I am having similar issues with Ubuntu VM guests running fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally I see hung tasks, occasionally the guest VM stops responding without leaving anything in the logs, and sometimes I see a kernel panic on the console. I typically set the runtime of the fio test to 60 minutes, and it tends to stop responding after about 10-30 minutes.
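
For concreteness, the invocation is along these lines (the I/O pattern, target
device, and I/O engine here are reconstructions; bs/direct/numjobs/iodepth and
the 60-minute runtime are the real parameters):

fio --name=stress --rw=randwrite --bs=4k --direct=1 --numjobs=4 \
    --iodepth=16 --ioengine=libaio --filename=/dev/vdb \
    --runtime=3600 --time_based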

I am on Ubuntu 12.04 with the 3.5 backport kernel, using Ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2

Andrei
----- Original Message -----

From: "Oliver Francke" <Oliver.Francke at filoo.de>
To: "Josh Durgin" <josh.durgin at inktank.com>
Cc: ceph-users at lists.ceph.com, "Mike Dawson" <mike.dawson at cloudapt.com>, "Stefan Hajnoczi" <stefanha at redhat.com>, qemu-devel at nongnu.org
Sent: Friday, 9 August, 2013 10:22:00 AM
Subject: Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]

Hi Josh,

just opened

http://tracker.ceph.com/issues/5919

with all collected information incl. debug-log.

Hope it helps,

Oliver.
Post by Josh Durgin
Post by Oliver Francke
Hi Josh,
debug_ms=1:debug_rbd=20:debug_objectcacher=30
as you requested from Mike, even if I think, we do have another story
here, anyway.
Host-kernel is: 3.10.0-rc7, qemu-client 1.6.0-rc2, client-kernel is
3.2.0-51-amd...
Do you want me to open a ticket for that stuff? I have about 5MB
compressed logfile waiting for you ;)
Yes, that'd be great. If you could include the time when you saw the
guest hang that'd be ideal. I'm not sure if this is one or two bugs,
but it seems likely it's a bug in rbd and not qemu.
Thanks!
Josh
Post by Oliver Francke
Thnx in advance,
Oliver.
Post by Stefan Hajnoczi
Post by Mike Dawson
We can "un-wedge" the guest by opening a NoVNC session or running a
'virsh screenshot' command. After that, the guest resumes and runs
as expected. At that point we can examine the guest. Each time we'll
If virsh screenshot works then this confirms that QEMU itself is still
responding. Its main loop cannot be blocked since it was able to
process the screendump command.
This supports Josh's theory that a callback is not being invoked. The
virtio-blk I/O request would be left in a pending state.
On a Windows guest with 1 vCPU, you may see the symptom that the
guest no
longer responds to ping.
On a Linux guest with multiple vCPUs, you may see the hung task message
from the guest kernel because other vCPUs are still making progress.
Just the vCPU that issued the I/O request and whose task is in
UNINTERRUPTIBLE state would really be stuck.
Basically, the symptoms depend not just on how QEMU is behaving but
also
on the guest kernel and how many vCPUs you have configured.
I think this can explain how both problems you are observing, Oliver
and
Mike, are a result of the same bug. At least I hope they are :).
Stefan
--
Oliver Francke

filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh

Managing directors: J.Rehpöhler | C.Kunz

Follow us on Twitter: http://twitter.com/filoogmbh

Stefan Hajnoczi
2013-08-09 15:03:54 UTC
Permalink
Post by Andrei Mikhailovsky
I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins.
I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2
Josh,
In addition to the Ceph logs you can also use QEMU tracing with the
following events enabled:
virtio_blk_handle_write
virtio_blk_handle_read
virtio_blk_rw_complete

See docs/tracing.txt for details on usage.
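
A minimal sketch of one way to do that (assuming QEMU was built with the
simple trace backend; file paths are examples):

cat > /tmp/virtio-blk-events <<EOF
virtio_blk_handle_write
virtio_blk_handle_read
virtio_blk_rw_complete
EOF
qemu-system-x86_64 -trace events=/tmp/virtio-blk-events,file=/tmp/qemu.trace ...
# afterwards, pretty-print the binary trace:
scripts/simpletrace.py trace-events /tmp/qemu.trace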

Inspecting the trace output will let you observe the I/O request
submission/completion from the virtio-blk device perspective. You'll be
able to see whether requests are never being completed in some cases.

This bug seems like a corner case or race condition since most requests
seem to complete just fine. The problem is that eventually the
virtio-blk device becomes unusable when it runs out of descriptors (it
has 128). And before that limit is reached the guest may become
unusable due to the hung I/O requests.

Stefan
Josh Durgin
2013-08-10 07:30:23 UTC
Permalink
Post by Stefan Hajnoczi
Post by Andrei Mikhailovsky
I can confirm that I am having similar issues with ubuntu vm guests using fio with bs=4k direct=1 numjobs=4 iodepth=16. Occasionally i see hang tasks, occasionally guest vm stops responding without leaving anything in the logs and sometimes i see kernel panic on the console. I typically leave the runtime of the fio test for 60 minutes and it tends to stop responding after about 10-30 mins.
I am on ubuntu 12.04 with 3.5 kernel backport and using ceph 0.61.7 with qemu 1.5.0 and libvirt 1.0.2
Oliver's logs show one aio_flush() never getting completed, which
means it's an issue with aio_flush in librados when rbd caching isn't
used.

Mike's log is from a qemu without aio_flush(), and with caching turned
on, and shows all flushes completing quickly, so it's a separate bug.
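
A crude way to look for this in such a log (exact message text varies by
version, so treat this as a sketch):

grep -n 'aio_flush' client.log | less
# match each flush submission against its completion; the stuck one is the
# submission that never gets a matching completion line
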
Post by Stefan Hajnoczi
Josh,
In addition to the Ceph logs you can also use QEMU tracing with the
virtio_blk_handle_write
virtio_blk_handle_read
virtio_blk_rw_complete
See docs/tracing.txt for details on usage.
Inspecting the trace output will let you observe the I/O request
submission/completion from the virtio-blk device perspective. You'll be
able to see whether requests are never being completed in some cases.
Thanks for the info. That may be the best way to check what's happening
when caching is enabled. Mike, could you recompile qemu with tracing
enabled and get a trace of the hang you were seeing, in addition to
the ceph logs?
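
(For qemu of this vintage that is roughly ./configure --enable-trace-backend=simple
followed by make; the exact flag name is an assumption, so check
./configure --help for your tree.)
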
Post by Stefan Hajnoczi
This bug seems like a corner case or race condition since most requests
seem to complete just fine. The problem is that eventually the
virtio-blk device becomes unusable when it runs out of descriptors (it
has 128). And before that limit is reached the guest may become
unusable due to the hung I/O requests.
It seems only one request hung from an important kernel thread in
Oliver's case, but it's good to be aware of the descriptor limit.

Josh