------- Comment From <email address hidden> 2016-08-08 15:09 EDT-------
(In reply to comment #13)
> (In reply to comment #9)
> > I Have Tested this with , Test Kernel available at
> > http://people.canonical.com/~rtg/eeh-lp1602724/ . on Ubuntu 16.04.1
> >
> > Test Kernel :
> > root@everest-lp13-leaf:~# uname -a
> > Linux everest-lp13-leaf 4.4.0-32-generic #51 SMP Tue Jul 19 21:41:04 UTC
> > 2016 ppc64le ppc64le ppc64le GNU/Linux
> >
> >
> >
> > Nvme (Leaf) is getting recovered till 5 times on triggering the EEH, But
> > "hitting a kernel crash" after on 6th time trigger of EEH.
> >
>
> This is most likely fixed by
>
> http://lists.infradead.org/pipermail/linux-nvme/2016-August/005670.html
> ("[PATCH v2] nvme: Suspend all queues before deletion")
>
> Which is not upstream yet. Once it gets accepted, it should be pushed to
> Ubuntu via another bugzilla. When that happens, we'll need a new test
> kernel for this one.
Canonical,
For a little more context: we have identified an issue in the test kernel you provided. After a sequence of 6 EEHs, DD attempts to remove the adapter, which ends up hitting a BUG_ON.
We think the above patch fixes this, but that is not confirmed yet. Can you provide a kernel with that patch also applied, so we can verify? The patch is not upstream yet, but it has already been acked by the driver maintainer, Keith Busch.