~ubuntu-kernel-test/ubuntu/+source/linux/+git/mainline-crack:x86-uaccess-cleanup

Last commit made on 2023-05-03
Get this branch:
git clone -b x86-uaccess-cleanup https://git.launchpad.net/~ubuntu-kernel-test/ubuntu/+source/linux/+git/mainline-crack

Recent commits

798dec3... by Linus Torvalds <email address hidden>

x86-64: mm: clarify the 'positive addresses' user address rules

Dave Hansen found the "(long) addr >= 0" code in the x86-64 access_ok
checks somewhat confusing, and suggested using a helper to clarify what
the code is doing.

So this does exactly that: clarifying what the sign bit check is all
about, by adding a helper macro that makes it clear what it is testing.

This also adds some explicit comments talking about how even with LAM
enabled, any addresses with the sign bit will still GP-fault in the
non-canonical region just above the sign bit.

This is all what allows us to do the user address checks with just the
sign bit, and furthermore be a bit cavalier about accesses that might be
done with an additional offset even past that point.

(And yes, this talks about 'positive' even though zero is also a valid
user address and so technically we should call them 'non-negative'. But
I don't think using 'non-negative' ends up being more understandable).

Suggested-by: Dave Hansen <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
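
The sign bit check described in the commit above can be sketched in plain C. This is a minimal user-space illustration, not the kernel's code: the name sketch_valid_user_address() is hypothetical, and a fixed-width type stands in for the kernel's "long" so the example does not depend on the platform's word size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * On x86-64, user addresses live in the lower half of the address space,
 * so bit 63 is clear. Viewed as a signed 64-bit integer, such an address
 * is "positive" (strictly: non-negative, since zero also passes).
 *
 * Converting an out-of-range unsigned value to a signed type is
 * implementation-defined in older C standards; mainstream compilers wrap
 * in two's complement, which is what this sketch assumes.
 */
static inline bool sketch_valid_user_address(uint64_t addr)
{
	/* Same test as "(long) addr >= 0" on a 64-bit target. */
	return (int64_t)addr >= 0;
}
```

Note that the check accepts non-canonical addresses whose sign bit is clear. As the commit message explains, those are caught by the GP fault when they are actually accessed, not by this test.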

1dbc0a9... by Linus Torvalds <email address hidden>

x86: mm: remove 'sign' games from LAM untagged_addr*() macros

The intent of the sign games was to not modify kernel addresses when
untagging them. However, that had two issues:

 (a) it didn't actually work as intended, since the mask was calculated
     as 'addr >> 63' on an _unsigned_ address. So instead of getting a
     mask of all ones for kernel addresses, you just got '1'.

 (b) untagging a kernel address isn't actually a valid operation anyway.

Now, (a) had originally been true for both 'untagged_addr()' and the
remote version of it, but had accidentally been fixed for the regular
version of untagged_addr() by commit e0bddc19ba95 ("x86/mm: Reduce
untagged_addr() overhead for systems without LAM"). That one rewrote
the shift to be part of the alternative asm code, and in the process
changed the unsigned shift into a signed 'sar' instruction.

And while it is true that we don't want to turn what looks like a kernel
address into a user address by masking off the high bit, that doesn't
need these sign masking games - all it needs is that the mm context
'untag_mask' value has the high bit set.

Which it always does.

So simplify the code by just removing the superfluous (and in the case
of untagged_addr_remote(), still buggy) sign bit games in the address
masking.

Acked-by: Dave Hansen <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
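
Issue (a) and the simplification described in the commit above can be sketched in user-space C. The function names are hypothetical, and the mask value used in the usage note (tag bits 62:57 cleared, bit 63 set) is an assumed example rather than a value taken from this page.

```c
#include <stdint.h>

/*
 * Buggy form: shifting an _unsigned_ value right by 63 is a logical
 * shift, so the result is 0 or 1, never a mask of all ones.
 */
static inline uint64_t sketch_sign_mask_unsigned(uint64_t addr)
{
	return addr >> 63;
}

/*
 * Intended form: an arithmetic shift of the signed value yields 0 for
 * user addresses and all ones for kernel addresses. Right-shifting a
 * negative value is implementation-defined in C; gcc and clang perform
 * an arithmetic shift, which this sketch assumes.
 */
static inline uint64_t sketch_sign_mask_signed(uint64_t addr)
{
	return (uint64_t)((int64_t)addr >> 63);
}

/*
 * Simplified form: a plain AND with the untag mask. As long as the mask
 * has bit 63 set, an address with the high bit set keeps it set, so it
 * can never be turned into something that looks like a user address.
 */
static inline uint64_t sketch_untag(uint64_t addr, uint64_t untag_mask)
{
	return addr & untag_mask;
}
```

With an assumed mask of 0x81ffffffffffffff, sketch_untag() strips the tag bits from a tagged user pointer, while an address with bit 63 set still has bit 63 set afterwards.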

b9bd9f6... by Linus Torvalds <email address hidden>

x86: uaccess: move 32-bit and 64-bit parts into proper <asm/uaccess_N.h> header

The x86 <asm/uaccess.h> file has grown features that are specific to
x86-64 like LAM support and the related access_ok() changes. They
really should be in the <asm/uaccess_64.h> file and not pollute the
generic x86 header.

Signed-off-by: Linus Torvalds <email address hidden>

6ccdc91... by Linus Torvalds <email address hidden>

x86: mm: remove architecture-specific 'access_ok()' define

There's already a generic definition of 'access_ok()' in the
asm-generic/access_ok.h header file, and the only difference between that
and the x86-specific one is the added check for WARN_ON_IN_IRQ().

And it turns out that the reason for that check is long gone: it used to
use a "user_addr_max()" inline function that depended on the current
thread, and caused problems in non-thread contexts.

For details, see commits 7c4788950ba5 ("x86/uaccess, sched/preempt:
Verify access_ok() context") and in particular commit ae31fe51a3cc
("perf/x86: Restore TASK_SIZE check on frame pointer") about how and why
this came to be.

But that "current task" issue was removed in the big set_fs() removal by
Christoph Hellwig in commit 47058bb54b57 ("x86: remove address space
overrides using set_fs()").

So the reason for the test and the architecture-specific access_ok()
define no longer exists, and is actually harmful these days. For
example, it led to various 'copy_from_user_nmi()' games (eg using
__range_not_ok() instead, and then later converted to __access_ok() when
that became ok).

And that in turn meant that LAM was broken for frame pointer following
before this series, because __access_ok() used to not do the address
untagging.

Accessing user state still needs care in many contexts, but access_ok()
is not the place for this test.

Acked-by: Peter Zijlstra (Intel) <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
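
For context, a limit-based range check of the kind a generic access_ok() performs can be sketched as follows. This is a user-space illustration under stated assumptions: the function name is hypothetical, and SKETCH_TASK_SIZE_MAX is an assumed limit (top of a 47-bit user address space minus one guard page), not a value quoted from this page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit for illustration: 47-bit user space minus a guard page. */
#define SKETCH_TASK_SIZE_MAX ((UINT64_C(1) << 47) - 4096)

/*
 * Hypothetical sketch of a generic range check: the whole range
 * [addr, addr + size) must lie below the limit. The comparison is
 * arranged so that "addr + size" is never computed and therefore
 * cannot overflow.
 */
static inline bool sketch_generic_access_ok(uint64_t addr, uint64_t size)
{
	const uint64_t limit = SKETCH_TASK_SIZE_MAX;

	return size <= limit && addr <= limit - size;
}
```

Nothing in this check depends on the calling context, which is the point the commit makes about the context test no longer belonging in access_ok().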

6014bc2... by Linus Torvalds <email address hidden>

x86-64: make access_ok() independent of LAM

The linear address masking (LAM) code made access_ok() more complicated,
in that it now needs to untag the address in order to verify the access
range. See commit 74c228d20a51 ("x86/uaccess: Provide untagged_addr()
and remove tags before address check").

We were able to avoid that overhead in the get_user/put_user code paths
by simply using the sign bit for the address check, and depending on the
GP fault if the address was non-canonical, which made it all independent
of LAM.

And we can do the same thing for access_ok(): simply check that the user
pointer range has the high bit clear. No need to bother with any
address bit masking.

In fact, we can go a bit further, and just check the starting address
for known small access ranges: any accesses that overflow will still
be in the non-canonical area and will still GP fault.

To still make syzkaller catch any potentially unchecked user addresses,
we'll continue to warn about GP faults that are caused by accesses in
the non-canonical range. But we'll limit that to purely "high bit set
and past the one-page 'slop' area".

We could probably just do that "check only starting address" for any
arbitrary range size: realistically all kernel accesses to user space
will be done starting at the low address. But let's leave that kind of
optimization for later. As it is, this already allows us to generate
simpler code and not worry about any tag bits in the address.

The one thing to look out for is the GUP address check: instead of
actually copying data in the virtual address range (and thus bad
addresses being caught by the GP fault), GUP will look up the page
tables manually. As a result, the page table limits need to be checked,
and that was previously implicitly done by the access_ok().

With the relaxed access_ok() check, we need to just do an explicit check
for TASK_SIZE_MAX in the GUP code instead. The GUP code already needs
to do the tag bit unmasking anyway, so there this is all very
straightforward, and there are no LAM issues.

Cc: Kirill A. Shutemov <email address hidden>
Cc: Dave Hansen <email address hidden>
Cc: Peter Zijlstra (Intel) <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
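
The relaxed check and the one-page 'slop' rule described in the commit above can be sketched in user-space C. All names here are hypothetical and the 4096-byte page size is an assumption. The commit talks about sizes that are known to be small; this sketch approximates that with a runtime comparison, so it is not a faithful copy of the kernel's logic.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(4096)	/* assumed page size */

/* True when bit 63 is clear, i.e. the address is in the user half. */
static inline bool sketch_high_bit_clear(uint64_t addr)
{
	return (addr >> 63) == 0;
}

/* Hypothetical sketch of an access check that ignores tag bits. */
static inline bool sketch_access_ok(uint64_t addr, uint64_t size)
{
	if (size <= SKETCH_PAGE_SIZE) {
		/*
		 * Small access: check only the start. If the access runs
		 * past the sign bit it lands in the non-canonical area
		 * and faults there.
		 */
		return sketch_high_bit_clear(addr);
	}

	/* Larger access: check the end and reject wrap-around. */
	uint64_t end = addr + size;	/* unsigned wrap is well defined */

	return sketch_high_bit_clear(end) && end >= addr;
}

/*
 * A GP fault is treated as expected when the faulting address is in the
 * user half, or within one page above the sign bit. Anything else is
 * "high bit set and past the slop area" and would warrant a warning.
 */
static inline bool sketch_gp_fault_address_ok(uint64_t fault_addr)
{
	return sketch_high_bit_clear(fault_addr) ||
	       sketch_high_bit_clear(fault_addr - SKETCH_PAGE_SIZE);
}
```

Because none of these checks mask any address bits, they behave the same whether or not LAM tags are present in the pointer. The GUP path is different: it walks page tables rather than touching the address, so it needs its own explicit limit check, as the commit message notes.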

348551d... by Linus Torvalds <email address hidden>

Merge tag 'pinctrl-v6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "Mostly drivers! Nothing special: some new Qualcomm chips as usual, and
  the new NXP S32 and nVidia BlueField-3.

  Core changes:

   - Make a lot of pin controllers with GPIO and irqchips immutable,
     i.e. not living structs, but const structs. This is driven by a
     change initiated by the irqchip maintainers.

  New drivers:

   - New driver for the NXP S32 SoC pin controller

   - As part of a thorough cleanup and restructuring of the
     Ralink/Mediatek drivers, the Ralink MIPS pin control drivers were
     folded into the Mediatek directory and the family is renamed
     "mtmips". The Ralink chips live on as Mediatek MIPS family where
     new variants can be added. As part of this work also the device
     tree bindings were reworked.

   - New subdriver for the Qualcomm SM7150 SoC.

   - New subdriver for the Qualcomm IPQ9574 SoC.

   - New driver for the nVidia BlueField-3 SoC.

   - Support for the Qualcomm PMM8654AU mixed signal circuit GPIO.

   - Support for the Qualcomm PMI632 mixed signal circuit GPIO.

  Improvements:

   - Add some missing pins and generic cleanups on the Renesas r8a779g0
     and r8a779g0 pin controllers. Generic Renesas extension for power
     source selection on several SoCs.

   - Misc cleanups for the Atmel AT91 and AT91-PIO4 pin controllers

   - Make the GPIO mode work on the Qualcomm SM8550-lpass-lpi driver.

   - Several device tree binding cleanups as the binding YAML syntax is
     solidifying"

* tag 'pinctrl-v6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (153 commits)
  pinctrl-bcm2835.c: fix race condition when setting gpio dir
  dt-bindings: pinctrl: qcom,sm8150: Drop duplicate function value "atest_usb2"
  dt-bindings: pinctrl: qcom: Add few missing functions
  pinctrl: qcom: spmi-gpio: Add PMI632 support
  dt-bindings: pinctrl: qcom,pmic-gpio: add PMI632
  pinctrl: wpcm450: select MFD_SYSCON
  pinctrl: qcom ssbi-gpio: Convert to immutable irq_chip
  pinctrl: qcom ssbi-mpp: Convert to immutable irq_chip
  pinctrl: qcom spmi-mpp: Convert to immutable irq_chip
  pinctrl: plgpio: Convert to immutable irq_chip
  pinctrl: pistachio: Convert to immutable irq_chip
  pinctrl: pic32: Convert to immutable irq_chip
  pinctrl: sx150x: Convert to immutable irq_chip
  pinctrl: stmfx: Convert to immutable irq_chip
  pinctrl: st: Convert to immutable irq_chip
  pinctrl: mcp23s08: Convert to immutable irq_chip
  pinctrl: equilibrium: Convert to immutable irq_chip
  pinctrl: npcm7xx: Convert to immutable irq_chip
  pinctrl: armada-37xx: Convert to immutable irq_chip
  pinctrl: nsp: Convert to immutable irq_chip
  ...

7df047b... by Linus Torvalds <email address hidden>

Merge tag 'vfio-v6.4-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Expose and allow R/W access to the PCIe DVSEC capability through
   vfio-pci, as we already do with the legacy vendor capability
   (K V P Satyanarayana)

 - Fix kernel-doc issues with structure definitions (Simon Horman)

 - Clarify ordering of operations relative to the kvm-vfio device for
   driver dependencies against the kvm pointer (Yi Liu)

* tag 'vfio-v6.4-rc1' of https://github.com/awilliam/linux-vfio:
  docs: kvm: vfio: Suggest KVM_DEV_VFIO_GROUP_ADD vs VFIO_GROUP_GET_DEVICE_FD ordering
  vfio: correct kdoc for ops structures
  vfio/pci: Add DVSEC PCI Extended Config Capability to user visible list.

21d2be6... by Linus Torvalds <email address hidden>

Merge tag 'afs-fixes-20230502' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS updates from David Howells:
 "Three fixes to AFS directory handling:

   - Make sure that afs_read_dir() sees any increase in file size if the
     file unexpectedly changed on the server (e.g. due to another client
     making a change).

   - Make afs_getattr() always return the server's dir file size, not
     the locally edited one, so that pagecache eviction doesn't cause
     the dir file size to change unexpectedly.

   - Prevent afs_read_dir() from getting into an endless loop if the
     server indicates that the directory file size is larger than
     expected"

* tag 'afs-fixes-20230502' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Avoid endless loop if file is larger than expected
  afs: Fix getattr to report server i_size on dirs, not local size
  afs: Fix updating of i_size with dv jump from server

d7b3ffe... by Linus Torvalds <email address hidden>

Merge tag 'backlight-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight

Pull backlight updates from Lee Jones:
 "Fix-ups:
   - Add / improve Device Tree bindings
   - Convert (int) .remove functions to (void) .remove_new
   - Rid 'defined but not used' warnings
   - Remove ineffective casts and pointer stubs
   - Use specifically crafted API for testing DT property presence"

* tag 'backlight-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
  backlight: as3711: Use of_property_read_bool() for boolean properties
  backlight: hx8357: Use of_property_present() for testing DT property presence
  backlight: arcxcnn_bl: Drop of_match_ptr for ID table
  backlight: lp855x: Mark OF related data as maybe unused
  backlight: sky81452-backlight: Convert to platform remove callback returning void
  backlight: rt4831-backlight: Convert to platform remove callback returning void
  backlight: qcom-wled: Convert to platform remove callback returning void
  backlight: pwm_bl: Convert to platform remove callback returning void
  backlight: mt6370-backlight: Convert to platform remove callback returning void
  backlight: lp8788_bl: Convert to platform remove callback returning void
  backlight: lm3533_bl: Convert to platform remove callback returning void
  backlight: led_bl: Convert to platform remove callback returning void
  backlight: hp680_bl: Convert to platform remove callback returning void
  backlight: da9052_bl: Convert to platform remove callback returning void
  backlight: cr_bllcd: Convert to platform remove callback returning void
  backlight: adp5520_bl: Convert to platform remove callback returning void
  backlight: aat2870_bl: Convert to platform remove callback returning void
  backlight: qcom-wled: Add PMI8950 compatible

3af4906... by Linus Torvalds <email address hidden>

Merge tag 'mfd-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Drivers:
   - Add support for Renesas RZ/G2L MTU3

  New Device Support:
   - Add support for Lenovo Yoga Book X90F to Intel CHT WC
   - Add support for MAX5970 and MAX5978 to Simple MFD (I2C)
   - Add support for Meteor Lake PCH-S LPSS PCI to Intel LPSS PCI
   - Add support for AXP15060 PMIC to X-Powers PMIC collection

  Remove Device Support:
   - Remove support for Samsung S5M8751 and S5M8763 PMIC devices

  New Functionality:
   - Convert deprecated QCOM IRQ Chip to config registers
   - Add support for 32-bit address spaces to Renesas SMUs

  Fix-ups:
   - Make use of APIs / MACROs designed to simplify and demystify
   - Add / improve Device Tree bindings
   - Memory saving struct layout optimisations
   - Remove old / deprecated functionality
   - Factor out unassigned register addresses from ranges
   - Trivial: Spelling fixes, renames and coding style fixes
   - Rid 'defined but not used' warnings
   - Remove ineffective casts and pointer stubs

  Bug Fixes:
   - Fix incorrectly non-inverted mask/unmask IRQs on QCOM platforms
   - Remove MODULE_*() helpers from non-tristate drivers
   - Do not attempt to use out-of-range memory addresses associated with io_base
   - Provide missing export helpers
   - Fix remap bulk read optimisation fallout
   - Fix memory leak issues in error paths"

* tag 'mfd-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (88 commits)
  dt-bindings: mfd: ti,j721e-system-controller: Add SoC chip ID
  leds: bd2606mvv: Driver for the Rohm 6 Channel i2c LED driver
  dt-bindings: mfd: qcom,spmi-pmic: Document flash LED controller
  dt-bindings: mfd: x-powers,axp152: Document the AXP15060 variant
  mfd: axp20x: Add support for AXP15060 PMIC
  dt-bindings: mfd: x-powers,axp152: Document the AXP313a variant
  counter: rz-mtu3-cnt: Unlock on error in rz_mtu3_count_ceiling_write()
  dt-bindings: mfd: dlg,da9063: Document voltage monitoring
  dt-bindings: mfd: stm32: Remove unnecessary blank lines
  dt-bindings: mfd: qcom,spmi-pmic: Use generic ADC node name in examples
  dt-bindings: mfd: syscon: Add nuvoton,ma35d1-sys compatible
  MAINTAINERS: Add entries for Renesas RZ/G2L MTU3a counter driver
  counter: Add Renesas RZ/G2L MTU3a counter driver
  Documentation: ABI: sysfs-bus-counter: add cascade_counts_enable and external_input_phase_clock_select
  mfd: Add Renesas RZ/G2L MTU3a core driver
  dt-bindings: timer: Document RZ/G2L MTU3a bindings
  mfd: rsmu_i2c: Convert to i2c's .probe_new() again
  mfd: intel-lpss: Add Intel Meteor Lake PCH-S LPSS PCI IDs
  mfd: dln2: Fix memory leak in dln2_probe()
  mfd: axp20x: Fix axp288 writable-ranges
  ...