Comment 4 for bug 1898200

Mohammed Naser (mnaser) wrote :

This is pretty critical for us: we are seeing VMs crash with the following errors:

Oct 24 15:55:19 kvm3 kernel: [1972520.039122] tp_librbd[65017]: segfault at 3f21 ip 00007f5ea8796266 sp 00007f5e82ffc500 error 4 in librbd.so.1.12.0[7f5ea85b7000+5a3000]
Oct 25 12:20:24 kvm3 kernel: [2046023.073585] fn-radosclient[59082]: segfault at 14ea9a ip 00007facfa108266 sp 00007facd17f9470 error 4 in librbd.so.1.12.0[7facf9f29000+5a3000]
Oct 31 11:25:29 kvm7 kernel: [3411178.435755] fn-radosclient[5460]: segfault at 3f21 ip 00007fbc686afe26 sp 00007fbc297f9470 error 4 in librbd.so.1.12.0[7fbc684d1000+5a0000]
Nov 2 11:57:34 kvm7 kernel: [3585900.195308] fn-radosclient[15921]: segfault at 3f21 ip 00007f76fa4a9e26 sp 00007f76af7fd470 error 4 in librbd.so.1.12.0[7f76fa2cb000+5a0000]
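
To make the four crashes easier to compare, the sketch below parses the kernel lines above and subtracts the librbd mapping base (the first number in the square brackets) from the instruction pointer. This is only a rough triage aid: the regular expression and the helper name parse_segfault are my own, not part of any Ceph or kernel tooling, and the result is an offset into the mapping that may or may not line up directly with a symbol address in the library file.

```python
import re

# The four kernel log lines from this comment, copied verbatim.
LOG_LINES = [
    "Oct 24 15:55:19 kvm3 kernel: [1972520.039122] tp_librbd[65017]: segfault at 3f21 ip 00007f5ea8796266 sp 00007f5e82ffc500 error 4 in librbd.so.1.12.0[7f5ea85b7000+5a3000]",
    "Oct 25 12:20:24 kvm3 kernel: [2046023.073585] fn-radosclient[59082]: segfault at 14ea9a ip 00007facfa108266 sp 00007facd17f9470 error 4 in librbd.so.1.12.0[7facf9f29000+5a3000]",
    "Oct 31 11:25:29 kvm7 kernel: [3411178.435755] fn-radosclient[5460]: segfault at 3f21 ip 00007fbc686afe26 sp 00007fbc297f9470 error 4 in librbd.so.1.12.0[7fbc684d1000+5a0000]",
    "Nov 2 11:57:34 kvm7 kernel: [3585900.195308] fn-radosclient[15921]: segfault at 3f21 ip 00007f76fa4a9e26 sp 00007f76af7fd470 error 4 in librbd.so.1.12.0[7f76fa2cb000+5a0000]",
]

# Hypothetical pattern written for the line format shown above.
SEGFAULT_RE = re.compile(
    r"(?P<host>\S+) kernel: \[\s*[\d.]+\] (?P<thread>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>\d+) in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return host, thread, library, faulting address and ip-minus-base offset."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return {
        "host": m["host"],
        "thread": m["thread"],
        "lib": m["lib"],
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction inside the librbd mapping.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
    }


for line in LOG_LINES:
    info = parse_segfault(line)
    print(info["host"], info["thread"], info["lib"],
          "offset=" + hex(info["offset"]),
          "fault_addr=" + hex(info["fault_addr"]))
```

Working this through by hand, both kvm3 crashes land at offset 0x1df266 and both kvm7 crashes at 0x1dee26, so each host faults at the same instruction every time. The two hosts also report different mapping sizes (5a3000 vs 5a0000), which would be consistent with slightly different librbd builds, although that is a guess on my part.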

Researching this led me to the following bug:

https://tracker.ceph.com/issues/47456

That bug was closed by the following PR:

https://github.com/ceph/ceph/pull/36331

That PR has been included in Octopus, as per the release notes:

https://docs.ceph.com/en/latest/releases/octopus/
"rbd: librbd: potential race conditions handling API IO completions (pr#36331, Jason Dillaman)"