~ilasc/ubuntu/+source/linux/+git/xenial:generic-build

Last commit made on 2016-08-19
Get this branch:
git clone -b generic-build https://git.launchpad.net/~ilasc/ubuntu/+source/linux/+git/xenial

Branch information

Name: generic-build
Repository: lp:~ilasc/ubuntu/+source/linux/+git/xenial

Recent commits

448ee53... by Tobias Brunner

xfrm: Ignore socket policies when rebuilding hash tables

BugLink: http://bugs.launchpad.net/bugs/1613787

Whenever thresholds are changed the hash tables are rebuilt. This is
done by enumerating all policies and hashing and inserting them into
the right table according to the thresholds and direction.

Because socket policies are also contained in net->xfrm.policy_all but
no hash tables are defined for their direction (dir + XFRM_POLICY_MAX)
this causes a NULL or invalid pointer dereference after returning from
policy_hash_bysel() if the rebuild is done while any socket policies
are installed.

Since the rebuild after changing thresholds is scheduled this crash
could even occur if the userland sets thresholds seemingly before
installing any socket policies.

Fixes: 53c2e285f970 ("xfrm: Do not hash socket policies")
Signed-off-by: Tobias Brunner <email address hidden>
Acked-by: Herbert Xu <email address hidden>
Signed-off-by: Steffen Klassert <email address hidden>
(cherry picked from linux-next commit 6916fb3b10b3cbe3b1f9f5b680675f53e4e299eb)
Signed-off-by: Joseph Salisbury <email address hidden>
Acked-by: Tim Gardner <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
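
The rebuild described in this commit can be pictured with a small user-space model. This is only a sketch, not the kernel code: the `Policy` class, the `rebuild_hash_tables` helper and the bucket count are hypothetical, and the value 3 for `XFRM_POLICY_MAX` is an assumption made for the illustration. It shows why a policy whose direction is `dir + XFRM_POLICY_MAX` has to be skipped: no table exists at that index.

```python
# User-space model only; names mirror the commit text but this is not kernel code.
XFRM_POLICY_MAX = 3  # assumed: only directions 0..2 have hash tables


class Policy:
    def __init__(self, selector, direction):
        self.selector = selector
        self.direction = direction  # socket policies use dir + XFRM_POLICY_MAX


def rebuild_hash_tables(policy_all, nbuckets):
    """Rehash every non-socket policy into a fresh per-direction table."""
    tables = [[[] for _ in range(nbuckets)] for _ in range(XFRM_POLICY_MAX)]
    for policy in policy_all:
        if policy.direction >= XFRM_POLICY_MAX:
            # Socket policy: no table exists for this direction, so skip it.
            continue
        bucket = hash(policy.selector) % nbuckets
        tables[policy.direction][bucket].append(policy)
    return tables
```

Without the `continue`, the socket policy would index past the end of `tables`, which is the model's counterpart of the invalid pointer dereference described above.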

14536ec... by Jan Kara <email address hidden>

writeback: Write dirty times for WB_SYNC_ALL writeback

BugLink: http://bugs.launchpad.net/bugs/1614565

Currently we take care to handle I_DIRTY_TIME in vfs_fsync() and
queue_io() so that inodes which have only dirty timestamps are properly
written on fsync(2) and sync(2). However there are other call sites -
most notably going through write_inode_now() - which expect inode to be
clean after WB_SYNC_ALL writeback. This is not currently true as we do
not clear I_DIRTY_TIME in __writeback_single_inode() even for
WB_SYNC_ALL writeback in all the cases. This then resulted in the
following oops because bdev_write_inode() did not clean the inode and
writeback code later stumbled over a dirty inode with detached wb.

  general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
  Modules linked in:
  CPU: 3 PID: 32 Comm: kworker/u10:1 Not tainted 4.6.0-rc3+ #349
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Workqueue: writeback wb_workfn (flush-11:0)
  task: ffff88006ccf1840 ti: ffff88006cda8000 task.ti: ffff88006cda8000
  RIP: 0010:[<ffffffff818884d2>] [<ffffffff818884d2>]
  locked_inode_to_wb_and_lock_list+0xa2/0x750
  RSP: 0018:ffff88006cdaf7d0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88006ccf2050
  RDX: 0000000000000000 RSI: 000000114c8a8484 RDI: 0000000000000286
  RBP: ffff88006cdaf820 R08: ffff88006ccf1840 R09: 0000000000000000
  R10: 000229915090805f R11: 0000000000000001 R12: ffff88006a72f5e0
  R13: dffffc0000000000 R14: ffffed000d4e5eed R15: ffffffff8830cf40
  FS: 0000000000000000(0000) GS:ffff88006d500000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000003301bf8 CR3: 000000006368f000 CR4: 00000000000006e0
  DR0: 0000000000001ec9 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
  Stack:
   ffff88006a72f680 ffff88006a72f768 ffff8800671230d8 03ff88006cdaf948
   ffff88006a72f668 ffff88006a72f5e0 ffff8800671230d8 ffff88006cdaf948
   ffff880065b90cc8 ffff880067123100 ffff88006cdaf970 ffffffff8188e12e
  Call Trace:
   [< inline >] inode_to_wb_and_lock_list fs/fs-writeback.c:309
   [<ffffffff8188e12e>] writeback_sb_inodes+0x4de/0x1250 fs/fs-writeback.c:1554
   [<ffffffff8188efa4>] __writeback_inodes_wb+0x104/0x1e0 fs/fs-writeback.c:1600
   [<ffffffff8188f9ae>] wb_writeback+0x7ce/0xc90 fs/fs-writeback.c:1709
   [< inline >] wb_do_writeback fs/fs-writeback.c:1844
   [<ffffffff81891079>] wb_workfn+0x2f9/0x1000 fs/fs-writeback.c:1884
   [<ffffffff813bcd1e>] process_one_work+0x78e/0x15c0 kernel/workqueue.c:2094
   [<ffffffff813bdc2b>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2228
   [<ffffffff813cdeef>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [<ffffffff867bc5d2>] ret_from_fork+0x22/0x50 arch/x86/entry/entry_64.S:392
  Code: 05 94 4a a8 06 85 c0 0f 85 03 03 00 00 e8 07 15 d0 ff 41 80 3e
  00 0f 85 64 06 00 00 49 8b 9c 24 88 01 00 00 48 89 d8 48 c1 e8 03 <42>
  80 3c 28 00 0f 85 17 06 00 00 48 8b 03 48 83 c0 50 48 39 c3
  RIP [< inline >] wb_get include/linux/backing-dev-defs.h:212
  RIP [<ffffffff818884d2>] locked_inode_to_wb_and_lock_list+0xa2/0x750
  fs/fs-writeback.c:281
   RSP <ffff88006cdaf7d0>
  ---[ end trace 986a4d314dcb2694 ]---

Fix the problem by making sure __writeback_single_inode() also writes out
inodes that have only dirty timestamps when in WB_SYNC_ALL mode.

Reported-by: Dmitry Vyukov <email address hidden>
Tested-by: Laurent Dufour <email address hidden>
Signed-off-by: Jan Kara <email address hidden>
Signed-off-by: Jens Axboe <email address hidden>
(cherry picked from commit dc5ff2b1d66f21c27a4c37236636dff6946437e4)
Signed-off-by: Tim Gardner <email address hidden>
Acked-by: Christopher Arges <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
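
The rule this commit establishes can be sketched as a toy state machine. The flag bit values, the `time_expired` parameter and the function below are hypothetical simplifications, not the real `__writeback_single_inode()`. The point is only that a WB_SYNC_ALL pass must leave no I_DIRTY_TIME bit behind.

```python
# Illustrative flag bits; the real values live in the kernel headers.
I_DIRTY_SYNC = 0x1
I_DIRTY_TIME = 0x2

WB_SYNC_NONE = 0
WB_SYNC_ALL = 1


def writeback_single_inode(i_state, sync_mode, time_expired=False):
    """Return (new_i_state, flags_written) for one writeback pass."""
    dirty = i_state & I_DIRTY_SYNC
    # Timestamps are flushed when other metadata is dirty, when they have
    # expired, or (the case this commit covers) for WB_SYNC_ALL writeback.
    if (i_state & I_DIRTY_TIME) and (
        dirty or time_expired or sync_mode == WB_SYNC_ALL
    ):
        dirty |= I_DIRTY_TIME
    return i_state & ~dirty, dirty
```

In this model an inode with only I_DIRTY_TIME set comes back clean from a WB_SYNC_ALL pass but stays dirty after a WB_SYNC_NONE pass, matching the expectation of callers such as write_inode_now().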

65d5275... by Tim Gardner

UBUNTU: [Config] CONFIG_IBMEBUS=y for powerpc

BugLink: http://bugs.launchpad.net/bugs/1612725

Signed-off-by: Tim Gardner <email address hidden>
Acked-by: Christopher Arges <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>

4704ba2... by Frederic Barrat <email address hidden>

cxl: Set psl_fir_cntl to production environment value

BugLink: http://bugs.launchpad.net/bugs/1612431

Switch the setting of psl_fir_cntl from debug to production
environment recommended value. It mostly affects the PSL behavior when
an error is raised in psl_fir1/2.

Tested with cxlflash.

Signed-off-by: Frederic Barrat <email address hidden>
Reviewed-by: Uma Krishnan <email address hidden>
Signed-off-by: Michael Ellerman <email address hidden>
(back ported from linux-next commit c6d2ee09c2fffd3efdd31be2b2811d081a45bb99)
Signed-off-by: Tim Gardner <email address hidden>

 Conflicts:
 drivers/misc/cxl/pci.c

Acked-by: Christopher Arges <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>

3442aaa... by Greg Kroah-Hartman <email address hidden>

Linux 4.4.18

BugLink: http://bugs.launchpad.net/bugs/1614560

Signed-off-by: Tim Gardner <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>

dd39e36... by Vladimir Davydov <email address hidden>

mm: memcontrol: fix memcg id ref counter on swap charge move

BugLink: http://bugs.launchpad.net/bugs/1614560

commit 615d66c37c755c49ce022c9e5ac0875d27d2603d upstream.

Since commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure
after many small jobs") swap entries do not pin memcg->css.refcnt
directly. Instead, they pin memcg->id.ref. So we should adjust the
reference counters accordingly when moving swap charges between cgroups.

Fixes: 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Link: http://lkml.kernel.org/r/<email address hidden>
Signed-off-by: Vladimir Davydov <email address hidden>
Acked-by: Michal Hocko <email address hidden>
Acked-by: Johannes Weiner <email address hidden>
Signed-off-by: Andrew Morton <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
Signed-off-by: Michal Hocko <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>

Signed-off-by: Tim Gardner <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
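
A minimal sketch of the accounting change, assuming each swap entry holds exactly one reference on the owning cgroup's ID. The `Memcg` class, its `id_ref` field and `move_swap_charges` are hypothetical stand-ins for memcg->id.ref and the charge-move path, not the kernel implementation.

```python
class Memcg:
    def __init__(self, name):
        self.name = name
        self.id_ref = 1  # base reference held while the cgroup is online


def move_swap_charges(src, dst, moved_swap):
    """Transfer the ID pins of `moved_swap` swap entries from src to dst."""
    if moved_swap < 0 or moved_swap > src.id_ref:
        raise ValueError("cannot move more pins than the source holds")
    src.id_ref -= moved_swap  # the entries no longer pin the old owner
    dst.id_ref += moved_swap  # they pin the new owner instead
```

The total number of pins is unchanged by a move; only the cgroup that holds them changes.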

50276bc... by Vladimir Davydov <email address hidden>

mm: memcontrol: fix swap counter leak on swapout from offline cgroup

BugLink: http://bugs.launchpad.net/bugs/1614560

commit 1f47b61fb4077936465dcde872a4e5cc4fe708da upstream.

An offline memory cgroup might have anonymous memory or shmem left
charged to it and no swap. Since only swap entries pin the id of an
offline cgroup, such a cgroup will have no id and so an attempt to
swapout its anon/shmem will not store memory cgroup info in the swap
cgroup map. As a result, memcg->swap or memcg->memsw will never get
uncharged from it and any of its ascendants.

Fix this by always charging swapout to the first ancestor cgroup that
hasn't released its id yet.

[<email address hidden>: add comment to mem_cgroup_swapout]
[<email address hidden>: use WARN_ON_ONCE() in mem_cgroup_id_get_online()]
  Link: http://lkml.kernel.org/r/20160803123445.GJ13263@esperanza
Fixes: 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
Link: http://lkml.kernel.org/r/<email address hidden>
Signed-off-by: Vladimir Davydov <email address hidden>
Acked-by: Johannes Weiner <email address hidden>
Acked-by: Michal Hocko <email address hidden>
Cc: <email address hidden> [3.19+]
Signed-off-by: Andrew Morton <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
Signed-off-by: Michal Hocko <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>

Signed-off-by: Tim Gardner <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
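
The "first ancestor that hasn't released its id yet" lookup can be sketched as a walk up the hierarchy. Everything below is a hypothetical user-space model: `Memcg`, `id_get_online` and `swapout` only imitate the idea behind mem_cgroup_id_get_online(), and raising an exception for an ID-less root is an assumption of this sketch.

```python
class Memcg:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.id_ref = 1  # drops to 0 once the cgroup is offline and unpinned
        self.swap = 0    # swap entries charged to this cgroup


def id_get_online(memcg):
    """Pin and return the nearest cgroup (self or ancestor) that still has an ID."""
    while memcg.id_ref == 0:
        if memcg.parent is None:
            raise RuntimeError("root cgroup has no ID")  # not expected to happen
        memcg = memcg.parent
    memcg.id_ref += 1
    return memcg


def swapout(memcg):
    """Record a swapped-out page against a cgroup whose ID is still valid."""
    owner = id_get_online(memcg)
    owner.swap += 1
    return owner
```

Because the swap entry is recorded against a cgroup that still has an ID, the later uncharge can find it again, which is what prevents the counter leak.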

45c2b48... by Johannes Weiner <email address hidden>

mm: memcontrol: fix cgroup creation failure after many small jobs

BugLink: http://bugs.launchpad.net/bugs/1614560

commit 73f576c04b9410ed19660f74f97521bee6e1c546 upstream.

The memory controller has quite a bit of state that usually outlives the
cgroup and pins its CSS until said state disappears. At the same time
it imposes a 16-bit limit on the CSS ID space to economically store IDs
in the wild. Consequently, when we use cgroups to contain frequent but
small and short-lived jobs that leave behind some page cache, we quickly
run into the 64k limitations of outstanding CSSs. Creating a new cgroup
fails with -ENOSPC while there are only a few, or even no user-visible
cgroups in existence.

Although pinning CSSs past cgroup removal is common, there are only two
instances that actually need an ID after a cgroup is deleted: cache
shadow entries and swapout records.

Cache shadow entries reference the ID weakly and can deal with the CSS
having disappeared when it's looked up later. They pose no hurdle.

Swap-out records do need to pin the css to hierarchically attribute
swapins after the cgroup has been deleted; though the only pages that
remain swapped out after offlining are tmpfs/shmem pages. And those
references are under the user's control, so they are manageable.

This patch introduces a private 16-bit memcg ID and switches swap and
cache shadow entries over to using that. This ID can then be recycled
after offlining when the CSS remains pinned only by objects that don't
specifically need it.

This script demonstrates the problem by faulting one cache page in a new
cgroup and deleting it again:

  set -e
  mkdir -p pages
  for x in `seq 128000`; do
    [ $((x % 1000)) -eq 0 ] && echo $x
    mkdir /cgroup/foo
    echo $$ >/cgroup/foo/cgroup.procs
    echo trex >pages/$x
    echo $$ >/cgroup/cgroup.procs
    rmdir /cgroup/foo
  done

When run on an unpatched kernel, we eventually run out of possible IDs
even though there are no visible cgroups:

  [root@ham ~]# ./cssidstress.sh
  [...]
  65000
  mkdir: cannot create directory '/cgroup/foo': No space left on device

After this patch, the IDs get released upon cgroup destruction and the
cache and css objects get released once memory reclaim kicks in.

[<email address hidden>: init the IDR]
  Link: http://<email address hidden>
Fixes: b2052564e66d ("mm: memcontrol: continue cache reclaim from offlined groups")
Link: http://<email address hidden>
Signed-off-by: Johannes Weiner <email address hidden>
Reported-by: John Garcia <email address hidden>
Reviewed-by: Vladimir Davydov <email address hidden>
Acked-by: Tejun Heo <email address hidden>
Cc: Nikolay Borisov <email address hidden>
Signed-off-by: Andrew Morton <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
Signed-off-by: Michal Hocko <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>

Signed-off-by: Tim Gardner <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
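
The effect of recycling IDs at offline time can be modelled in user space. This is a hedged sketch: `IdPool`, `run_jobs` and the `release_on_offline` switch are hypothetical, and the pool is a plain free list rather than the kernel's IDR. Only the 16-bit limit and the ENOSPC failure come from the commit text.

```python
import errno

MEM_CGROUP_ID_MAX = 65535  # 16-bit ID space, as described in the commit


class IdPool:
    """Hands out small integer IDs and recycles the ones that are released."""

    def __init__(self, max_id=MEM_CGROUP_ID_MAX):
        self.max_id = max_id
        self.next_unused = 1
        self.free = []

    def alloc(self):
        if self.free:
            return self.free.pop()
        if self.next_unused > self.max_id:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.next_unused += 1
        return self.next_unused - 1

    def release(self, ident):
        self.free.append(ident)


def run_jobs(pool, count, release_on_offline):
    """Create and delete `count` cgroups whose CSS stays pinned by page cache."""
    pinned = []  # stands in for CSS objects kept alive by leftover page cache
    for _ in range(count):
        ident = pool.alloc()
        if release_on_offline:
            pool.release(ident)  # patched behaviour: ID freed at offline time
        pinned.append(ident)     # the object itself lingers either way
    return len(pinned)
```

With `release_on_offline=False` the model runs out of IDs while no cgroup is visible, as in the stress script; with `True` the same ID is reused for every job even though the pinned objects accumulate until reclaim.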

a965074... by Vegard Nossum

ext4: fix reference counting bug on block allocation error

BugLink: http://bugs.launchpad.net/bugs/1614560

commit 554a5ccc4e4a20c5f3ec859de0842db4b4b9c77e upstream.

If we hit this error when mounted with errors=continue or
errors=remount-ro:

    EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata

then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
continue. However, ext4_mb_release_context() is the wrong thing to call
here since we are still actually using the allocation context.

Instead, just error out. We could retry the allocation, but there is a
possibility of getting stuck in an infinite loop instead, so this seems
safer.

[ Fixed up so we don't return EAGAIN to userspace. --tytso ]

Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation")
Signed-off-by: Vegard Nossum <email address hidden>
Signed-off-by: Theodore Ts'o <email address hidden>
Cc: Aneesh Kumar K.V <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>

Signed-off-by: Tim Gardner <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>

1e1bd33... by Vegard Nossum

ext4: short-cut orphan cleanup on error

BugLink: http://bugs.launchpad.net/bugs/1614560

commit c65d5c6c81a1f27dec5f627f67840726fcd146de upstream.

If we encounter a filesystem error during orphan cleanup, we should stop.
Otherwise, we may end up in an infinite loop where the same inode is
processed again and again.

    EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended
    EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters
    Aborting journal on device loop0-8.
    EXT4-fs (loop0): Remounting filesystem read-only
    EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted
    EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
    EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
    EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure
    EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted
    EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted
    EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
    EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed!
    [...]
    EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed!
    [...]
    EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed!
    [...]

See-also: c9eb13a9105 ("ext4: fix hang when processing corrupted orphaned inode list")
Cc: Jan Kara <email address hidden>
Signed-off-by: Vegard Nossum <email address hidden>
Signed-off-by: Theodore Ts'o <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>

Signed-off-by: Tim Gardner <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
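
The short-cut amounts to checking for a filesystem error before each orphan is processed. The sketch below is a hypothetical user-space model: `orphan_cleanup` and its `process` and `has_error` callbacks are invented for illustration and are not ext4 functions.

```python
def orphan_cleanup(orphans, process, has_error):
    """Process orphan inodes in order, stopping at the first filesystem error."""
    done = []
    for inode in orphans:
        if has_error():
            break  # short-cut: stop instead of re-processing on a broken fs
        process(inode)
        done.append(inode)
    return done
```

For example, if processing inode 16 flags an error, the loop ends before the next inode is touched rather than visiting the same inode again and again.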