Comment 12 for bug 1570195

Christian Ehrhardt (paelzer) wrote:

Setting breakpoints on the two check functions and on their caller to see where things break:

b virtnet_send_command
# virtqueue_get_buf gets hit constantly via __do_softirq -> napi_poll -> virtnet_poll -> virtnet_receive -> virtqueue_get_buf.
That breakpoint needs to stay disabled; instead, step INTO it from virtnet_send_command.
b virtqueue_get_buf
b virtqueue_is_broken
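
A minimal sketch of that sequence as a gdb script (breakpoint numbers are illustrative; this assumes gdb is attached to the guest kernel, e.g. via QEMU's gdbstub):

 b virtnet_send_command
 b virtqueue_get_buf
 b virtqueue_is_broken
 disable 2        # keep the noisy virtqueue_get_buf breakpoint off for now
 continue
 # ...once we stop in virtnet_send_command:
 enable 2
 step             # step into the calls made from there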

Here is what we then see in the two checkers:
virtqueue_get_buf (_vq=0xffff8801b6b7d000, len=0xffff8801b7f17b64) at /build/linux-XwpX40/linux-4.4.0/drivers/virtio/virtio_ring.c:478
p *(_vq)
$12 = {list = {next = 0xffff8801b69c8b00, prev = 0xffff8801b640d000}, callback = 0x0 <irq_stack_union>, name = 0xffffffff81d094d7 "control", vdev = 0xffff8801b69c8800,
  index = 8, num_free = 63, priv = 0x1c010}

 if (unlikely(!vq->data[i])) {
         BAD_RING(vq, "id %u is not a head!\n", i);
         return NULL;
 }
 ret = vq->data[i];
 [...]
 return ret;
So the returned buffer should definitely be valid, or we would have seen the BAD_RING message.
But then, after returning, it keeps looping on
 while (!virtqueue_get_buf(vi->cvq, &tmp) && !virtqueue_is_broken(vi->cvq))

So we "should be" (tm) safe to assume that we always get a good buffer back, but then lack one?

Too much is optimized out by default to take a much deeper look.
I need to understand better what is happening there, so I'm going to recompile the kernel with extra instrumentation: more debug info and less optimization.
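
One trick I might use for the "less optimization" part (hypothetical here, just to illustrate the mechanism, not necessarily how the actual rebuild will be done) is GCC's per-function optimize attribute, which keeps the locals of selected functions visible to gdb without deoptimizing the whole build. A user-space toy example:

 /* noopt_demo.c - keep one function at -O0 inside an -O2 build so gdb
  * still sees its locals. Build: gcc -O2 -g noopt_demo.c -o noopt_demo */
 #include <stdio.h>

 __attribute__((optimize("O0"), noinline))
 static int checked_sum(int a, int b)
 {
         int tmp = a + b;   /* 'tmp' is not optimized away in this function */
         return tmp;
 }

 int main(void)
 {
         printf("%d\n", checked_sum(2, 3));
         return 0;
 }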