On arm64 ACPI systems, we unconditionally reconfigure the entire PCI
hierarchy at boot. This is a departure from what is customary on ACPI
systems, and may break assumptions in some places (e.g., EFIFB) that
the kernel will leave the BARs of enabled PCI devices where they are.
Given that PCI already specifies a device specific ACPI method (_DSM)
for PCI root bridge nodes that tells us whether the firmware thinks
the configuration should be left alone, let's sidestep the entire
policy debate about whether the PCI configuration should be preserved
or not, and put it under the control of the firmware instead.
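In kernel terms, honoring that firmware preference means evaluating the PCI _DSM (function 0x05, "preserve boot configuration") on the host bridge: an integer return value of 0 asks the OS to leave the boot-time configuration alone. A hedged kernel-style sketch (not runnable standalone; the preserve_config flag name follows struct pci_host_bridge):

```c
/* Sketch: ask the firmware, via the PCI _DSM (function 0x05), whether
 * the boot-time PCI resource assignment must be preserved.  An integer
 * return value of 0 means "preserve". */
union acpi_object *obj;

obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid,
			1, 0x05 /* preserve boot config */, NULL);
if (obj && obj->type == ACPI_TYPE_INTEGER && obj->integer.value == 0)
	host_bridge->preserve_config = 1;
ACPI_FREE(obj);
```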
Add driver for Amazon's Annapurna Labs PCIe host controller. The
controller is based on DesignWare's IP.
The controller doesn't support accessing the Root Port's config space via
ECAM, so we obtain its base address via an AMZN0001 device.
Furthermore, the DesignWare PCIe controller doesn't filter out config
transactions sent to devices 1 and up on its bus, so they are filtered by
the driver.
All subordinate buses do support ECAM access.
Implementing specific PCI config access functions involves:
- Adding an init function to obtain the Root Port's base address from
an AMZN0001 device.
- Adding a new entry in the MCFG quirk array.
[bhelgaas: Note that there is no Kconfig option for this driver because it
is only intended for use with the generic ACPI host bridge driver. This
driver is only needed because the DesignWare IP doesn't completely support
ECAM access to the root bus.]
Link: https://<email address hidden>
Co-developed-by: Vladimir Aerov <email address hidden>
Signed-off-by: Jonathan Chocron <email address hidden>
Signed-off-by: Vladimir Aerov <email address hidden>
Signed-off-by: Bjorn Helgaas <email address hidden>
Reviewed-by: David Woodhouse <email address hidden>
Reviewed-by: Benjamin Herrenschmidt <email address hidden>
Acked-by: Lorenzo Pieralisi <email address hidden>
(backported from commit 4166bfe53093b687a0b1b22e5d943e143b8089b2)
Signed-off-by: Kamal Mostafa <email address hidden>
As the user-supplied nladdr->nl_groups is __u32, it's possible to subscribe
only to the first 32 groups.
The check for correctness of the userspace-supplied .bind() parameter
is done by applying a mask built from an ngroups shift. This broke Android,
which has 64 groups: the shift used to build the mask overflowed.
Fixes: 61f4b23769f0 ("netlink: Don't shift with UB on nlk->ngroups")
Cc: "David S. Miller" <email address hidden>
Cc: Herbert Xu <email address hidden>
Cc: Steffen Klassert <email address hidden>
Cc: <email address hidden>
Cc: <email address hidden>
Reported-and-Tested-by: Nathan Chancellor <email address hidden>
Signed-off-by: Dmitry Safonov <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(cherry picked from commit 91874ecf32e41b5d86a4cb9d60e0bee50d828058)
Acked-by: Brad Figg <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
Improve swapoff performance at the expense of overall system
performance by avoiding sleeping on lock_page() in try_to_unuse().
This allows triggering a read_swap_cache_async() on all the swapped-out
pages and strongly increases swapoff performance (at the risk of
completely killing interactive performance).
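One plausible shape of the change (pseudocode-level sketch, not the literal patch): replace the blocking page lock in try_to_unuse()'s loop with a trylock, so the loop keeps issuing read_swap_cache_async() for the remaining entries instead of sleeping:

```c
/* Sketch only: skip pages whose lock is contended instead of sleeping,
 * so swap-in readahead keeps being issued for the other entries. */
if (!trylock_page(page)) {
	put_page(page);
	continue;	/* revisit this swap entry on a later pass */
}
```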
Test case: swapoff called on a swap file containing about 32G of data in
a VM with 8 cpus, 64G RAM.
Result:
- stock kernel:
  # time swapoff /swap-hibinit
  real    40m13.072s
  user    0m0.000s
  sys     17m18.971s
- with this patch applied:
  # time swapoff /swap-hibinit
  real    1m59.496s
  user    0m0.000s
  sys     0m21.370s
Signed-off-by: Andrea Righi <email address hidden>
Acked-by: Brad Figg <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>
The igrab() in shmem_unuse() looks good, but we forgot that it gives no
protection against concurrent unmounting: a point made by Konstantin
Khlebnikov eight years ago, and then fixed in 2.6.39 by 778dd893ae78
("tmpfs: fix race between umount and swapoff"). The current 5.1-rc
swapoff is liable to hit "VFS: Busy inodes after unmount of tmpfs.
Self-destruct in 5 seconds. Have a nice day..." followed by GPF.
Once again, give up on using igrab(); but don't go back to making such
heavy-handed use of shmem_swaplist_mutex as last time: that would spoil
the new design, and I expect could deadlock inside shmem_swapin_page().
Instead, shmem_unuse() just raises a "stop_eviction" count in the shmem-
specific inode, and shmem_evict_inode() waits for that to go down to 0.
Call it "stop_eviction" rather than "swapoff_busy" because it can be put
to use for others later (huge tmpfs patches expect to use it).
That simplifies shmem_unuse(), protecting it from both unlink and
unmount; and in practice lets it locate all the swap in its first try.
But do not rely on that: there's still a theoretical case, when
shmem_writepage() might have been preempted after its get_swap_page(),
before making the swap entry visible to swapoff.
As a replacement for the wait_on_atomic_t() API provide the
wait_var_event() API.
The wait_var_event() API is based on the very same hashed-waitqueue
idea, but doesn't care about the type (atomic_t) or the specific
condition (atomic_read() == 0). IOW, it's much more widely
applicable/flexible.
It shares all the benefits/disadvantages of a hashed-waitqueue
approach with the existing wait_on_atomic_t/wait_on_bit() APIs.
The API is modeled after the existing wait_event() API, but instead of
taking a wait_queue_head, it takes an address. This address is
hashed to obtain a wait_queue_head from the bit_wait_table.
Similar to the wait_event() API, it takes a condition expression as
second argument and will wait until this expression becomes true.
wait_on_atomic_t(&x, fn, mode) is thus (mostly) identically replaced by
wait_var_event(&x, !atomic_read(&x)), and wake_up_atomic_t(&x) by
wake_up_var(&x).
The only difference is that wake_up_var() is an unconditional wakeup
and doesn't check the previously hard-coded (atomic_read() == 0)
condition here. This is of little consequence, since most callers are
already conditional on atomic_dec_and_test() and the ones that are
not, are trivial to make so.
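A usage sketch of a waiter/waker pair on a plain atomic_t (kernel-style, not from this patch; obj->refs is an illustrative field):

```c
/* Waiter: sleep until the reference count drops to zero. */
wait_var_event(&obj->refs, atomic_read(&obj->refs) == 0);

/* Waker: wake_up_var() is unconditional, so keep it behind the
 * atomic_dec_and_test() check, as most existing callers already do. */
if (atomic_dec_and_test(&obj->refs))
	wake_up_var(&obj->refs);
```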
Tested-by: Dan Williams <email address hidden>
Signed-off-by: Peter Zijlstra (Intel) <email address hidden>
Cc: David Howells <email address hidden>
Cc: Linus Torvalds <email address hidden>
Cc: Mike Galbraith <email address hidden>
Cc: Peter Zijlstra <email address hidden>
Cc: Thomas Gleixner <email address hidden>
Cc: <email address hidden>
Signed-off-by: Ingo Molnar <email address hidden>
(cherry picked from commit 6b2bb7265f0b62605e8caee3613449ed0db270b9)
Signed-off-by: Andrea Righi <email address hidden>
Acked-by: Brad Figg <email address hidden>
Signed-off-by: Kamal Mostafa <email address hidden>