~tyhicks/ubuntu/+source/linux/+git/xenial:spectrersb

Last commit made on 2018-08-28
Get this branch:
git clone -b spectrersb https://git.launchpad.net/~tyhicks/ubuntu/+source/linux/+git/xenial

Branch information

Name:
spectrersb
Repository:
lp:~tyhicks/ubuntu/+source/linux/+git/xenial

Recent commits

e8fe8cd... by Tyler Hicks

Bump version

a75f196... by Jiri Kosina <email address hidden>

x86/speculation: Protect against userspace-userspace spectreRSB

The article "Spectre Returns! Speculation Attacks using the Return Stack
Buffer" [1] describes two new (sub-)variants of spectrev2-like attacks,
making use solely of the RSB contents even on CPUs that don't fall back to
BTB on RSB underflow (Skylake+).

Mitigate userspace-userspace attacks by always unconditionally filling RSB on
context switch when the generic spectrev2 mitigation has been enabled.

[1] https://arxiv.org/pdf/1807.07940.pdf

Signed-off-by: Jiri Kosina <email address hidden>
Signed-off-by: Thomas Gleixner <email address hidden>
Reviewed-by: Josh Poimboeuf <email address hidden>
Acked-by: Tim Chen <email address hidden>
Cc: Konrad Rzeszutek Wilk <email address hidden>
Cc: Borislav Petkov <email address hidden>
Cc: David Woodhouse <email address hidden>
Cc: Peter Zijlstra <email address hidden>
Cc: Linus Torvalds <email address hidden>
Cc: <email address hidden>
Link: https://<email address hidden>

CVE-2017-5715 (SpectreRSB sub-variant)

(backported from commit fdf82a7856b32d905c39afc85e34364491e46346)
Signed-off-by: Tyler Hicks <email address hidden>
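
To illustrate why filling the RSB on context switch helps, here is a minimal toy model in C. It is only a sketch: the depth of 16 entries, the struct, and every `toy_` name are assumptions made for illustration and are not taken from the kernel patch. It models the RSB as a small circular stack and shows that filling it on a context switch leaves no entries planted by the previous task.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSB_DEPTH 16  /* assumed depth for illustration; real depth is CPU-specific */

/* Toy model of a return stack buffer: a small circular stack of return targets. */
struct toy_rsb {
    uintptr_t entry[RSB_DEPTH];
    size_t top;               /* index of the next slot to write */
};

/* Model a call: push a predicted return target, overwriting the oldest entry. */
static void toy_rsb_push(struct toy_rsb *rsb, uintptr_t target)
{
    assert(rsb);
    rsb->entry[rsb->top] = target;
    rsb->top = (rsb->top + 1) % RSB_DEPTH;
}

/* Model RSB filling: overwrite every slot with a harmless target. */
static void toy_rsb_fill(struct toy_rsb *rsb, uintptr_t benign)
{
    for (size_t i = 0; i < RSB_DEPTH; i++)
        toy_rsb_push(rsb, benign);
}

/* Count the entries that still point at a given target. */
static size_t toy_rsb_count(const struct toy_rsb *rsb, uintptr_t target)
{
    size_t n = 0;
    for (size_t i = 0; i < RSB_DEPTH; i++)
        if (rsb->entry[i] == target)
            n++;
    return n;
}

/* Context switch as described above: fill whenever the spectrev2 mitigation is on. */
static void toy_context_switch(struct toy_rsb *rsb, int spectre_v2_enabled,
                               uintptr_t benign)
{
    if (spectre_v2_enabled)
        toy_rsb_fill(rsb, benign);
}
```

In this model, entries pushed by one task survive a context switch only when the mitigation flag is off; with it on, every slot holds the benign target afterwards.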

86f421e... by Khaled El Mously

UBUNTU: Ubuntu-4.4.0-135.161

Signed-off-by: Khalid Elmously <email address hidden>

b2f8a8e... by Heiner Kallweit

net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT

BugLink: https://bugs.launchpad.net/bugs/1785739

This condition wasn't adjusted when PHY_IGNORE_INTERRUPT (-2) was added
long ago. In case of PHY_IGNORE_INTERRUPT the MAC interrupt also
indicates PHY state changes and we should do what the symbol says.

Fixes: 84a527a41f38 ("net: phylib: fix interrupts re-enablement in phy_start")
Signed-off-by: Heiner Kallweit <email address hidden>
Reviewed-by: Florian Fainelli <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(cherry picked from commit 08f5138512180a479ce6b9d23b825c9f4cd3be77)
Signed-off-by: dann frazier <email address hidden>
Acked-by: Kleber Souza <email address hidden>
Acked-by: Stefan Bader <email address hidden>
Signed-off-by: Khalid Elmously <email address hidden>
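
The condition this commit adjusts can be sketched in standalone C. This is a hedged illustration, not the driver code: `struct toy_phy` and both `toy_` helpers are hypothetical, and the exact form of the old check (comparing only against PHY_POLL) is an assumption based on the description above. PHY_IGNORE_INTERRUPT is -2 as stated in the commit message; PHY_POLL is -1.

```c
#include <assert.h>
#include <stdbool.h>

#define PHY_POLL             (-1)  /* PHY is polled, no interrupt line */
#define PHY_IGNORE_INTERRUPT (-2)  /* MAC interrupt also reports PHY state changes */

/* Hypothetical stand-in for the PHY device; only the irq field matters here. */
struct toy_phy {
    int irq;
};

/* Assumed condition before the fix: only polling mode is excluded. */
static bool toy_reenable_irq_old(const struct toy_phy *phy)
{
    assert(phy);
    return phy->irq != PHY_POLL;
}

/* Condition after the fix: both sentinel values are excluded. */
static bool toy_reenable_irq_new(const struct toy_phy *phy)
{
    assert(phy);
    return phy->irq != PHY_POLL && phy->irq != PHY_IGNORE_INTERRUPT;
}
```

The two helpers differ only for a PHY whose irq is PHY_IGNORE_INTERRUPT, which is the case the old check handled incorrectly.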

67285f8... by Shaohui Xie <email address hidden>

net: phylib: fix interrupts re-enablement in phy_start

BugLink: https://bugs.launchpad.net/bugs/1785739

If the phy was suspended and is starting, the current driver always
enables the phy's interrupts. If the phy works in polling mode, it can
then raise an unexpected interrupt that will not be handled, and that
interrupt will block the system from entering suspend again. So
interrupts should only be re-enabled if the phy works in interrupt mode.

Signed-off-by: Shaohui Xie <email address hidden>
Reviewed-by: Florian Fainelli <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(cherry picked from commit 84a527a41f38a80353f185d05e41b021e1ff672b)
Signed-off-by: dann frazier <email address hidden>
Acked-by: Kleber Souza <email address hidden>
Acked-by: Stefan Bader <email address hidden>
Signed-off-by: Khalid Elmously <email address hidden>

bfa2e6e... by Julian Wiedmann <email address hidden>

s390/qeth: don't clobber buffer on async TX completion

BugLink: https://bugs.launchpad.net/bugs/1786057

If qeth_qdio_output_handler() detects that a transmit requires async
completion, it replaces the pending buffer's metadata object
(qeth_qdio_out_buffer) so that this queue buffer can be re-used while
the data is pending completion.

Later when the CQ indicates async completion of such a metadata object,
qeth_qdio_cq_handler() tries to free any data associated with this
object (since HW has now completed the transfer). By calling
qeth_clear_output_buffer(), it erroneously operates on the queue buffer
that _previously_ belonged to this transfer ... but which has been
potentially re-used several times by now.
This results in double-frees of the buffer's data, and failing
transmits as the buffer descriptor is scrubbed in mid-air.

The correct way of handling this situation is to
1. scrub the queue buffer when it is prepared for re-use, and
2. later obtain the data addresses from the async-completion notifier
   (i.e. the AOB), instead of the queue buffer.

All this only affects qeth devices used for af_iucv HiperTransport.

Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(backported from commit ce28867fd20c23cd769e78b4d619c4755bf71a1c)
Signed-off-by: Joseph Salisbury <email address hidden>
Acked-by: Khalid Elmously <email address hidden>
Acked-by: Kleber Souza <email address hidden>
Signed-off-by: Khalid Elmously <email address hidden>
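
The two-step handling described in this commit can be sketched with a toy model. Everything here is hypothetical (`toy_queue_buf`, `toy_aob`, the element count of 4), and the model copies the data addresses into the notifier explicitly, which is a simplification: the commit message does not say how the real AOB obtains them. The point shown is only that completion frees through the notifier and never touches the re-used queue buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_ELEMENTS 4  /* hypothetical number of data elements per buffer */

/* Hypothetical stand-ins for the queue buffer and the async notifier (AOB). */
struct toy_queue_buf { void *addr[TOY_ELEMENTS]; };
struct toy_aob       { void *addr[TOY_ELEMENTS]; };

/*
 * Step 1: record the data addresses in the notifier (explicit copy is a
 * modelling assumption), then scrub the queue buffer so it can be re-used.
 */
static void toy_prepare_reuse(struct toy_queue_buf *buf, struct toy_aob *aob)
{
    assert(buf && aob);
    for (size_t i = 0; i < TOY_ELEMENTS; i++) {
        aob->addr[i] = buf->addr[i];
        buf->addr[i] = NULL;
    }
}

/*
 * Step 2: on async completion, free the data via the notifier only.
 * Returns the number of elements freed.
 */
static size_t toy_complete_async(struct toy_aob *aob)
{
    size_t freed = 0;

    assert(aob);
    for (size_t i = 0; i < TOY_ELEMENTS; i++) {
        if (aob->addr[i]) {
            free(aob->addr[i]);
            aob->addr[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```

Because completion reads the addresses from the notifier, a queue buffer that has been refilled in the meantime keeps its new data, and a second completion call finds nothing left to free.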

7edef28... by Marta Rybczynska <email address hidden>

nvme: avoid cqe corruption when update at the same time as read

BugLink: http://bugs.launchpad.net/bugs/1788035

Make sure the CQE phase (validity) is read before the rest of the
structure. The phase bit is at the highest address and the CQE
read will happen on most platforms from lower to upper addresses
and will be done by multiple non-atomic loads. If the structure
is updated by PCI during the reads from the processor, the
processor may get a corrupted copy.

The addition of the new nvme_cqe_valid function that verifies
the validity bit also allows refactoring of the other CQE read
sequences.

Signed-off-by: Marta Rybczynska <email address hidden>
Reviewed-by: Johannes Thumshirn <email address hidden>
Reviewed-by: Christoph Hellwig <email address hidden>
Reviewed-by: Keith Busch <email address hidden>
Signed-off-by: Jens Axboe <email address hidden>
(backported from commit d783e0bd02e700e7a893ef4fa71c69438ac1c276)
Signed-off-by: Kamal Mostafa <email address hidden>
Acked-by: Khalid Elmously <email address hidden>
Acked-by: Kleber Souza <email address hidden>
Signed-off-by: Khalid Elmously <email address hidden>
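
The read ordering described in this commit can be sketched in standalone C. This is an illustrative model, not the driver code: `struct toy_cqe` is a simplified host-endian layout, the `toy_` function names are hypothetical, and the acquire fence is an addition for this model that the commit message does not mention. The sketch checks the phase bit first and copies the remaining fields only once the entry is known to be valid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified 16-byte completion queue entry; status sits at the highest address. */
struct toy_cqe {
    uint32_t result;
    uint32_t reserved;
    uint16_t sq_head;
    uint16_t sq_id;
    uint16_t command_id;
    uint16_t status;      /* bit 0 is the phase (validity) bit */
};

/* Check the phase bit alone, before touching the rest of the entry. */
static bool toy_cqe_valid(const volatile struct toy_cqe *cqe, uint16_t phase)
{
    assert(phase <= 1);
    return (cqe->status & 1) == phase;
}

/* Copy the entry out only after the phase bit matched; false means "not yet valid". */
static bool toy_cqe_read(const volatile struct toy_cqe *cqe, uint16_t phase,
                         struct toy_cqe *out)
{
    if (!toy_cqe_valid(cqe, phase))
        return false;

    /* Conservative ordering hint for this model (an assumption, see lead-in). */
    atomic_thread_fence(memory_order_acquire);

    out->result = cqe->result;
    out->reserved = cqe->reserved;
    out->sq_head = cqe->sq_head;
    out->sq_id = cqe->sq_id;
    out->command_id = cqe->command_id;
    out->status = cqe->status;
    return true;
}
```

A caller would poll `toy_cqe_read()` with the phase value it currently expects and process `out` only when the function returns true.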

c6bdb60... by Kiran M

cachefiles: Wait rather than BUG'ing on "Unexpected object collision"

BugLink: https://bugs.launchpad.net/bugs/1776254

If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in
the active object tree, we have been emitting a BUG after logging
information about it and the new object.

Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared
on the old object (or return an error). The ACTIVE flag should be cleared
after it has been removed from the active object tree. A timeout of 60s is
used in the wait, so we shouldn't be able to get stuck there.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri <email address hidden>
Signed-off-by: David Howells <email address hidden>
(cherry picked from commit c2412ac45a8f8f1cd582723c1a139608694d410d)
Signed-off-by: Daniel Axtens <email address hidden>
Acked-by: Khalid Elmously <email address hidden>
Acked-by: Kamal Mostafa <email address hidden>
Signed-off-by: Khalid Elmously <email address hidden>

8c4aa7b... by Kiran M

cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag

BugLink: https://bugs.launchpad.net/bugs/1776254

In cachefiles_mark_object_active(), the new object is marked active and
then we try to add it to the active object tree. If a conflicting object
is already present, we want to wait for that to go away. After the wait,
we go round again and try to re-mark the object as being active - but it's
already marked active from the first time we went through and a BUG is
issued.

Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.

Analysis from Kiran Kumar Modukuri:

[Impact]
Oops during heavy NFS + FSCache + Cachefiles

CacheFiles: Error: Overlong wait for old active object to go away.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000002

CacheFiles: Error: Object already active

kernel BUG at fs/cachefiles/namei.c:163!

[Cause]
In a heavily loaded system with big files being read and truncated, an
fscache object for a cookie is being dropped and a new object is being
looked up. The new object being looked up has to wait for the old object
to go away before the new object is moved to active state.

[Fix]
Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
retrying the object lookup.

[Testcase]
Have run ~100 hours of NFS stress tests and have not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri <email address hidden>
Signed-off-by: David Howells <email address hidden>
(backported from commit 5ce83d4bb7d8e11e8c1c687d09f4b5ae67ef3ce3)
Signed-off-by: Daniel Axtens <email address hidden>
Acked-by: Khalid Elmously <email address hidden>
Acked-by: Kamal Mostafa <email address hidden>
Signed-off-by: Khalid Elmously <email address hidden>
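
The retry loop described in this commit can be modelled with a few lines of C. This is a hedged sketch with hypothetical names (`toy_object`, `toy_mark_active`, `toy_activate`); the `conflicts` counter stands in for the wait on the old object, and a `false` return stands in for the point where the kernel would issue its BUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical object; `active` stands in for CACHEFILES_OBJECT_ACTIVE. */
struct toy_object {
    bool active;
};

/* Mark active; returns false where the kernel code would BUG (already marked). */
static bool toy_mark_active(struct toy_object *obj)
{
    assert(obj);
    if (obj->active)
        return false;
    obj->active = true;
    return true;
}

/*
 * Try to activate obj while a conflicting object occupies the tree for the
 * first `conflicts` attempts. Returns false if re-marking fails.
 */
static bool toy_activate(struct toy_object *obj, int conflicts,
                         bool clear_before_retry)
{
    for (;;) {
        if (!toy_mark_active(obj))
            return false;            /* "Object already active" */
        if (conflicts == 0)
            return true;             /* no conflict: added to the tree */
        conflicts--;                 /* models waiting for the old object */
        if (clear_before_retry)
            obj->active = false;     /* the fix: clear the flag before retrying */
    }
}
```

With `clear_before_retry` set, one conflict followed by a retry succeeds; without it, the second pass finds the flag still set and fails, which mirrors the reported BUG.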

6a36dd9... by Kiran M

fscache: Fix reference overput in fscache_attach_object() error handling

BugLink: https://bugs.launchpad.net/bugs/1776277

When a cookie is allocated that causes fscache_object structs to be
allocated, those objects are initialised with the cookie pointer, but
aren't blessed with a ref on that cookie unless the attachment is
successfully completed in fscache_attach_object().

If attachment fails because the parent object was dying or there was a
collision, fscache_attach_object() returns without incrementing the cookie
counter - but upon failure of this function, the object is released which
then puts the cookie, whether or not a ref was taken on the cookie.

Fix this by taking a ref on the cookie when it is assigned in
fscache_object_init(), even when we're creating a root object.

Analysis from Kiran Kumar:

This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277

The fscache cookie ref count is updated incorrectly during fscache object
allocation, resulting in the following Oops.

kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

[Cause]
Two threads are trying to operate on a cookie and two objects.

(1) One thread tries to unmount the filesystem and in the process goes over a
    huge list of objects marking them dead and deleting the objects.
    cookie->usage is also decremented in the following path:

      nfs_fscache_release_super_cookie
       -> __fscache_relinquish_cookie
        ->__fscache_cookie_put
        ->BUG_ON(atomic_read(&cookie->usage) <= 0);

(2) A second thread tries to look up an object for reading data in the
    following path:

    fscache_alloc_object
    1) cachefiles_alloc_object
        -> fscache_object_init
           -> assign cookie, but usage not bumped.
    2) fscache_attach_object -> fails in cant_attach_object because the
         cookie's backing object or cookie's->parent object are going away
    3) fscache_put_object
        -> cachefiles_put_object
          ->fscache_object_destroy
            ->fscache_cookie_put
               ->BUG_ON(atomic_read(&cookie->usage) <= 0);

[NOTE from dhowells] It's unclear as to the circumstances in which (2) can
take place, given that thread (1) is in nfs_kill_super(), however a
conflicting NFS mount with slightly different parameters that creates a
different superblock would do it. A backtrace from Kiran seems to show
that this is a possibility:

    kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
    ...
    RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
    Call Trace:
     __fscache_relinquish_cookie+0x87/0x120 [fscache]
     nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
     nfs_kill_super+0x29/0x40 [nfs]
     deactivate_locked_super+0x48/0x80
     deactivate_super+0x5c/0x60
     cleanup_mnt+0x3f/0x90
     __cleanup_mnt+0x12/0x20
     task_work_run+0x86/0xb0
     exit_to_usermode_loop+0xc2/0xd0
     syscall_return_slowpath+0x4e/0x60
     int_ret_from_sys_call+0x25/0x9f

[Fix] Bump the cookie usage in fscache_object_init() when the object is
first assigned a cookie. Do this atomically, so that the cookie is assigned
and its refcount bumped only if the refcount is not zero. Remove the
assignment in fscache_attach_object().

[Testcase]
I have run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: ccc4fc3d11e9 ("FS-Cache: Implement the cookie management part of the netfs API")
Signed-off-by: Kiran Kumar Modukuri <email address hidden>
Signed-off-by: David Howells <email address hidden>
(backported from commit f29507ce66701084c39aeb1b0ae71690cbff3554)
Signed-off-by: Daniel Axtens <email address hidden>
Acked-by: Khalid Elmously <email address hidden>
Acked-by: Kamal Mostafa <email address hidden>
Signed-off-by: Khalid Elmously <email address hidden>
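
The reference imbalance described in this commit can be reproduced in a small userspace model. This is a sketch under stated assumptions: the `toy_` types and functions are hypothetical, C11 atomics stand in for the kernel's atomic_t, and the `take_ref` flag exists only to contrast the pre-fix and post-fix behaviour of object initialisation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the fscache cookie and object. */
struct toy_cookie { atomic_int usage; };
struct toy_object { struct toy_cookie *cookie; };

/* Take a reference only if the count is not already zero. */
static bool toy_cookie_get_not_zero(struct toy_cookie *c)
{
    int old = atomic_load(&c->usage);

    while (old != 0) {
        /* On failure, `old` is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&c->usage, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; the assert models BUG_ON(usage <= 0). */
static void toy_cookie_put(struct toy_cookie *c)
{
    int old = atomic_fetch_sub(&c->usage, 1);
    assert(old > 0);
}

/*
 * Object init. With take_ref (the fix) the cookie is assigned only together
 * with a reference; without it (pre-fix) the pointer is stored with no ref.
 */
static void toy_object_init(struct toy_object *obj, struct toy_cookie *c,
                            bool take_ref)
{
    obj->cookie = NULL;
    if (!take_ref || toy_cookie_get_not_zero(c))
        obj->cookie = c;
}

/* Release after a failed attach: always puts the cookie the object points at. */
static void toy_object_release(struct toy_object *obj)
{
    if (obj->cookie) {
        toy_cookie_put(obj->cookie);
        obj->cookie = NULL;
    }
}
```

Starting from a usage count of 1 held by the owner, an init and release pair with `take_ref` leaves the count at 1, while the same pair without it drops the count to 0 and so steals the owner's reference.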