~gpiccoli/+git/linux:nvme-md

Last commit made on 2018-08-01
Get this branch:
git clone -b nvme-md https://git.launchpad.net/~gpiccoli/+git/linux

Branch information

Name:
nvme-md
Repository:
lp:~gpiccoli/+git/linux

Recent commits

3d005f5... by Guilherme G. Piccoli

md/raid0: Introduce emergency stop for raid0 arrays

Currently the raid0 driver is not provided with any health checking
mechanism to verify its members are fine. So, if suddenly a member
is removed, for example, a STOP_ARRAY ioctl will be triggered from
userspace, i.e., all the logic for stopping an array relies on the
userspace tools, like mdadm/udev. Particularly, if a raid0 array is
mounted, this stop procedure will fail, since mdadm tries to open
the md block device with O_EXCL flag, which isn't allowed if md is
mounted.

That leads to the following situation: if a raid0 array member is
removed and the array is mounted, some user writing to this array
won't realize errors are happening unless they check kernel log.
In other words, no -EIO is returned and writes (except direct I/Os)
appear normal. This means the user might think the written data is stored
in the array, but instead garbage was written, since raid0 does striping
and requires all members to be working in order not to corrupt data.

This patch proposes a change in this behavior: to emergency stop a
raid0 array in case one of its members is gone. The check happens
when I/O is queued to raid0 driver, so the driver will confirm if
the block device it plans to read/write has its queue healthy; in
case it's not fine (like a dying or dead queue), raid0 driver will
invoke an emergency removal routine that will mark the md device as
broken and trigger a delayed stop procedure. Also, raid0 will start
refusing new BIOs from this point, returning -EIO.
The emergency stop routine will mark the md request queue as dying
too, as a "flag" to indicate failure in case of a nested raid0 array
configuration (a raid0 composed of raid0 devices).
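
The check described above can be modeled in plain C. This is a minimal
userspace sketch, not the patch itself: every type and function name
below (queue_state, raid0_array, raid0_submit, emergency_stop) is
hypothetical, the delayed stop is reduced to a flag, and failing the
triggering request with -EIO is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: none of these names come from the patch. */
enum queue_state { QUEUE_OK, QUEUE_DYING, QUEUE_DEAD };

struct member {
    enum queue_state queue;     /* health of the member's request queue */
};

struct raid0_array {
    struct member *members;
    size_t nr_members;
    bool broken;                /* set once by the emergency stop */
    bool stop_scheduled;        /* stands in for the delayed stop procedure */
    enum queue_state queue;     /* state of the md device's own queue */
};

/* Mark the array broken, flag its own queue as dying, schedule the stop. */
void emergency_stop(struct raid0_array *a)
{
    a->broken = true;
    a->queue = QUEUE_DYING;     /* visible to an outer (nested) raid0 */
    a->stop_scheduled = true;
}

/* Returns 0 if the request may proceed, a negative errno otherwise. */
int raid0_submit(struct raid0_array *a, size_t target_member)
{
    if (a->broken)
        return -EIO;            /* refuse all new requests after a failure */
    if (target_member >= a->nr_members)
        return -EINVAL;
    if (a->members[target_member].queue != QUEUE_OK) {
        emergency_stop(a);
        return -EIO;            /* assumption: the triggering request fails too */
    }
    return 0;
}
```

In this model, once any member reports a dying or dead queue, later
requests fail with -EIO even if they target a healthy member, and an
outer array that treats this one as a member would see its queue as
unhealthy.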

The delayed stop procedure then will perform the basic stop of the
md device, and will take care in case it holds mounted filesystems,
allowing the stop of a mounted raid0 array - which is common in
other regular block devices like NVMe and SCSI.

This emergency stop mechanism only affects raid0 arrays.

Signed-off-by: Guilherme G. Piccoli <email address hidden>

d25bebe... by Guilherme G. Piccoli

Revert "md/raid0: Allow stop mounted arrays (draft title)"

This reverts commit d5a75c032dd7162561891483ec3880892edb0425.

169d1dd... by Linus Torvalds <email address hidden>

squashfs: more metadata hardening

Anatoly reports another squashfs fuzzing issue, where the decompression
parameters themselves are in a compressed block.

This causes squashfs_read_data() to be called in order to read the
decompression options before the decompression stream has been set
up, making squashfs go sideways.
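
The kind of guard this calls for can be sketched in a few lines of C.
This is an illustrative model only: sb_info, decomp_stream and
read_data are hypothetical names, not the squashfs code, and the
choice of -EIO as the error is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the filesystem state. */
struct decomp_stream { int unused; };

struct sb_info {
    struct decomp_stream *stream;   /* NULL until the decompressor is set up */
};

/* Refuse to handle a compressed block before the stream exists. */
int read_data(const struct sb_info *sb, bool compressed)
{
    if (compressed && sb->stream == NULL)
        return -EIO;                /* assumption: fail the read */
    return 0;                       /* real code would read and decompress here */
}
```

Uncompressed blocks are still readable before setup in this model;
only a compressed block seen too early is rejected.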

Reported-by: Anatoly Trosinenko <email address hidden>
Acked-by: Phillip Lougher <email address hidden>
Cc: <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>

4a018b2... by Thomas Petazzoni <email address hidden>

sparc: use asm-generic version of msi.h

This is necessary to be able to include <linux/msi.h> when
CONFIG_GENERIC_MSI_IRQ_DOMAIN is enabled. Without this, a build with
CONFIG_GENERIC_MSI_IRQ_DOMAIN fails with:

   In file included from drivers//ata/ahci.c:45:0:
>> include/linux/msi.h:226:10: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
             msi_alloc_info_t *arg);
             ^~~~~~~~~~~~~~~~
             sg_alloc_fn
   include/linux/msi.h:230:9: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
            msi_alloc_info_t *arg);
            ^~~~~~~~~~~~~~~~
            sg_alloc_fn
   include/linux/msi.h:239:12: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
               msi_alloc_info_t *arg);
               ^~~~~~~~~~~~~~~~
               sg_alloc_fn
   include/linux/msi.h:240:22: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
     void (*msi_finish)(msi_alloc_info_t *arg, int retval);
                         ^~~~~~~~~~~~~~~~
                         sg_alloc_fn
   include/linux/msi.h:241:20: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
     void (*set_desc)(msi_alloc_info_t *arg,
                       ^~~~~~~~~~~~~~~~
                       sg_alloc_fn
   include/linux/msi.h:316:18: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
           int nvec, msi_alloc_info_t *args);
                     ^~~~~~~~~~~~~~~~
                     sg_alloc_fn
   include/linux/msi.h:318:29: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
            int virq, int nvec, msi_alloc_info_t *args);
                                ^~~~~~~~~~~~~~~~
                                sg_alloc_fn

Signed-off-by: Thomas Petazzoni <email address hidden>
Signed-off-by: David S. Miller <email address hidden>

44e3a0a... by Thomas Petazzoni <email address hidden>

sparc: move MSI related definitions to where they are used

The definitions in arch/sparc/include/asm/msi.h are only used in
arch/sparc/mm/srmmu.c, so it makes sense to have them in the C file
directly.

In addition, having a custom arch/sparc/include/asm/msi.h prevents
the use of the asm-generic version of this header, which is necessary
to be able to include <linux/msi.h> when CONFIG_GENERIC_MSI_IRQ_DOMAIN
is enabled.

Signed-off-by: Thomas Petazzoni <email address hidden>
Acked-by: Sam Ravnborg <email address hidden>
Signed-off-by: David S. Miller <email address hidden>

ab81cbf... by rostedt

sparc/time: Add missing __init to init_tick_ops()

Code that was added to force gcc not to inline any function that isn't
explicitly declared as inline uncovered that init_tick_ops() isn't
marked as "__init". It is only called by __init functions and more
importantly it too calls an __init function which would require it to be
__init as well.

Link: http://lkml.kernel.org/r/201806060444.hdHcKOBy%<email address hidden>

Reported-by: kbuild test robot <email address hidden>
Signed-off-by: Steven Rostedt (VMware) <email address hidden>
Signed-off-by: David S. Miller <email address hidden>

7c1640f... by Dmitry Safonov <email address hidden>

netlink: Don't shift with UB on nlk->ngroups

On i386 nlk->ngroups might be 32 or 0, which leads to UB and results in
a hang during boot.
Check for 0 ngroups and use (unsigned long long) as a type to shift.
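
The problem and the remedy can be shown in isolation. The sketch below
is not the netlink code: mask_groups is a hypothetical helper, and
uint32_t stands in for the 32-bit unsigned long of i386. Shifting a
32-bit value by 32 is undefined in C, so the shift is done in a 64-bit
type and the zero case is handled separately.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low `ngroups` bits of a 32-bit group mask. */
uint32_t mask_groups(uint32_t groups, unsigned int ngroups)
{
    if (ngroups == 0)
        return 0;               /* no groups exist, nothing can be set */

    /* Sketch limit: a shift by 64 would be undefined even in 64 bits. */
    assert(ngroups < 64);

    /* 1ULL << 32 is well defined, unlike a 32-bit 1 << 32. */
    return groups & (uint32_t)((1ULL << ngroups) - 1);
}
```

With ngroups equal to 32 the mask becomes 0xFFFFFFFF and every bit is
kept; with ngroups equal to 0 the result is always 0.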

Fixes: 7acf9d4237c4 ("netlink: Do not subscribe to non-existent groups").
Reported-by: kernel test robot <email address hidden>
Signed-off-by: Dmitry Safonov <email address hidden>
Signed-off-by: David S. Miller <email address hidden>

fb6e36a... by Sabrina Dubroca <email address hidden>

net/ipv6: fix metrics leak

Since commit d4ead6b34b67 ("net/ipv6: move metrics from dst to
rt6_info"), ipv6 metrics are shared and refcounted. rt6_set_from()
assigns the rt->from pointer and increases the refcount on from's
metrics. This reference is never released.

Introduce the fib6_metrics_release() helper and use it to release the
metrics.
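
The leak pattern is easy to reproduce in a small model. The following
is a userspace sketch under stated assumptions, not the kernel code:
the struct and function names are hypothetical, the counter is a plain
int rather than an atomic refcount, and metrics_frees exists only so
the behavior can be checked.

```c
#include <assert.h>
#include <stdlib.h>

int metrics_frees;              /* counts frees, for checking only */

struct metrics { int refcnt; };

struct route { struct metrics *metrics; };

struct metrics *metrics_alloc(void)
{
    struct metrics *m = malloc(sizeof(*m));
    if (m)
        m->refcnt = 1;          /* the creator holds the first reference */
    return m;
}

/* Drop one reference and free the object when the last one goes away. */
void metrics_release(struct metrics *m)
{
    if (--m->refcnt == 0) {
        free(m);
        metrics_frees++;
    }
}

/* The child shares the parent's metrics and takes its own reference. */
void route_set_from(struct route *child, const struct route *parent)
{
    child->metrics = parent->metrics;
    child->metrics->refcnt++;
}

/* Without the release call here, the reference taken above would leak. */
void route_destroy(struct route *r)
{
    metrics_release(r->metrics);
    r->metrics = NULL;
}
```

Destroying the child drops the count from 2 to 1, and the metrics are
freed only when the parent is destroyed as well.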

Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
Signed-off-by: Sabrina Dubroca <email address hidden>
Signed-off-by: David S. Miller <email address hidden>

9845136... by Xiao Liang <email address hidden>

xen-netfront: wait xenbus state change when load module manually

When the module is loaded manually, after xenbus_switch_state is called to
initialize the state of the netfront device, the driver state does not change
quickly enough, which may lead to no device being created on recent kernels.
This patch adds a wait to make sure xenbus knows the driver is not in the
closed/unknown state.

Current state:
[vm]# ethtool eth0
Settings for eth0:
 Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe xen_netfront
[vm]# ethtool eth0
Settings for eth0:
Cannot get device settings: No such device
Cannot get wake-on-lan settings: No such device
Cannot get message level: No such device
Cannot get link status: No such device
No data available

With the patch installed:
[vm]# ethtool eth0
Settings for eth0:
 Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe xen_netfront
[vm]# ethtool eth0
Settings for eth0:
 Link detected: yes

Signed-off-by: Xiao Liang <email address hidden>
Signed-off-by: David S. Miller <email address hidden>

2509a35... by Jiang Biao <email address hidden>

virtio_balloon: fix another race between migration and ballooning

The kernel panics under high memory pressure; the call trace looks like:

PID: 21439 TASK: ffff881be3afedd0 CPU: 16 COMMAND: "java"
 #0 [ffff881ec7ed7630] machine_kexec at ffffffff81059beb
 #1 [ffff881ec7ed7690] __crash_kexec at ffffffff81105942
 #2 [ffff881ec7ed7760] crash_kexec at ffffffff81105a30
 #3 [ffff881ec7ed7778] oops_end at ffffffff816902c8
 #4 [ffff881ec7ed77a0] no_context at ffffffff8167ff46
 #5 [ffff881ec7ed77f0] __bad_area_nosemaphore at ffffffff8167ffdc
 #6 [ffff881ec7ed7838] __node_set at ffffffff81680300
 #7 [ffff881ec7ed7860] __do_page_fault at ffffffff8169320f
 #8 [ffff881ec7ed78c0] do_page_fault at ffffffff816932b5
 #9 [ffff881ec7ed78f0] page_fault at ffffffff8168f4c8
    [exception RIP: _raw_spin_lock_irqsave+47]
    RIP: ffffffff8168edef RSP: ffff881ec7ed79a8 RFLAGS: 00010046
    RAX: 0000000000000246 RBX: ffffea0019740d00 RCX: ffff881ec7ed7fd8
    RDX: 0000000000020000 RSI: 0000000000000016 RDI: 0000000000000008
    RBP: ffff881ec7ed79a8 R8: 0000000000000246 R9: 000000000001a098
    R10: ffff88107ffda000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000008 R14: ffff881ec7ed7a80 R15: ffff881be3afedd0
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018

It happens in the page fault handler and results in a double page fault
while compacting pages when memory allocation fails.

Analysis of the vmcore shows that the page leading to the second page fault
is corrupted, with _mapcount=-256 but private=0.

It is caused by a race between migration and ballooning, due to a missing
lock in virtballoon_migratepage() of the virtio_balloon driver.
This patch fixes the bug.

Fixes: e22504296d4f64f ("virtio_balloon: introduce migration primitives to balloon pages")
Cc: <email address hidden>
Signed-off-by: Jiang Biao <email address hidden>
Signed-off-by: Huang Chong <email address hidden>
Signed-off-by: Michael S. Tsirkin <email address hidden>