~sunforce/ubuntu/+source/linux/+git/mainline-crack:WIP-syscall

Last commit made on 2018-02-27
Get this branch:
git clone -b WIP-syscall https://git.launchpad.net/~sunforce/ubuntu/+source/linux/+git/mainline-crack

Recent commits

de25c71... by Linus Torvalds <email address hidden>

Broken, but working, ptregs system call conversion for x86-64

Not-yet-signed-off-by: Linus Torvalds <email address hidden>

3621644... by Linus Torvalds <email address hidden>

x86: don't pointlessly reload the system call number

We have it in a register in the low-level asm, just pass it in as an
argument rather than have do_syscall_64() load it back in from the
ptregs pointer.

Signed-off-by: Linus Torvalds <email address hidden>
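
The change described in 3621644 can be pictured with a small user-space sketch. Everything below is hypothetical and only illustrates the idea: `struct regs_sketch`, the two toy handlers, and the dispatch function names are invented stand-ins, not the kernel's `struct pt_regs` or its real `do_syscall_64()`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved register frame; only the two
 * fields this sketch needs. */
struct regs_sketch {
    unsigned long orig_ax;  /* system call number as saved by the entry code */
    unsigned long ax;       /* slot for the return value */
};

typedef long (*handler_t)(const struct regs_sketch *);

/* Toy handlers, invented for illustration. */
static long sys_zero(const struct regs_sketch *r) { (void)r; return 0; }
static long sys_fortytwo(const struct regs_sketch *r) { (void)r; return 42; }

static const handler_t table_sketch[] = { sys_zero, sys_fortytwo };
#define NR_SKETCH (sizeof(table_sketch) / sizeof(table_sketch[0]))

/* Shared tail: look up the handler and store its result. */
static void run_handler(unsigned long nr, struct regs_sketch *regs)
{
    if (nr < NR_SKETCH)
        regs->ax = (unsigned long)table_sketch[nr](regs);
    else
        regs->ax = (unsigned long)(long)-ENOSYS;
}

/* "Before" shape: the number is loaded back from the saved frame. */
static void dispatch_reload(struct regs_sketch *regs)
{
    unsigned long nr = regs->orig_ax;  /* extra memory load */
    run_handler(nr, regs);
}

/* "After" shape: the caller already holds the number in a register
 * and passes it in as an argument. */
static void dispatch_arg(unsigned long nr, struct regs_sketch *regs)
{
    run_handler(nr, regs);
}
```

Both functions produce the same result for the same number; the second simply avoids reading a value the caller already had in hand.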

fd46804... by Linus Torvalds <email address hidden>

x86: avoid per-cpu system call trampoline

The per-cpu system call trampoline was a clever trick, and allows us to
have percpu data even before swapgs is done by just doing %rip-relative
addressing. And that was important, because syscall doesn't have a
kernel stack, so we needed that percpu data very very early, just to get
a temporary register to switch the page tables around.

However, it turns out to be unnecessary. Because we actually have a
temporary register that we can use: %r11 is destroyed by the 'syscall'
instruction anyway.

Ok, technically it contains the user mode flags register, but we *have*
that information anyway: it's still in %rflags, we've just masked off a
few unimportant bits. We'll destroy the rest too when we do the "and"
of the CR3 value, but who cares? It's a system call.

Btw, there are a few bits in eflags that might matter to user space: DF
and AC. Right now this clears them, but that is fixable by just
changing the MSR_SYSCALL_MASK value to not include them, and clearing
them by hand the way we do for all other kernel entry points anyway.

So the only _real_ flags we'd destroy are IF and the arithmetic flags
that get trampled on by the arithmetic instructions that are part of the
%cr3 reload logic.

However, if we really end up caring, we can save off even those: we'd
take advantage of the fact that %rcx - which contains the returning IP
of the system call - also has 8 bits free.

Why 8? Even with 5-level paging, we only have 57 bits of virtual address
space, and the high address space is for the kernel (and vsyscall, but
we'd just disable native vsyscall). So the %rip value saved in %rcx can
have only 56 valid bits, which means that we have 8 bits free.

So *if* we care about IF and the arithmetic flags being saved over a
system call, we'd do:

        shlq $8,%rcx
        movb %r11b,%cl
        shrl $8,%r11d
        andl $8,%r11d
        orb %r11b,%cl

to save those bits off before we then use %r11 as a temporary register
(we'd obviously need to then undo that as we save the user space state
on the stack).

Signed-off-by: Linus Torvalds <email address hidden>
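
The five-instruction sequence in fd46804 can be modelled in plain C to see which bits end up where. This is a user-space sketch, not kernel code: the function names and the unpacking step are assumptions added for illustration (the commit message only says the packing would have to be undone later). Read literally, shifting right by 8 and masking with 8 keeps original bit 11 of the flags (OF) and parks it in bit 3 of the low byte, which is a reserved, always-zero position in RFLAGS; IF (bit 9) is not captured by the sequence as written.

```c
#include <stdint.h>

#define SLOT_BIT3 (1u << 3)  /* RFLAGS bit 3 is reserved and reads as 0 */

/* Mirror the five instructions: rcx holds the user return address,
 * r11 holds the user flags. Returns the packed %rcx value. */
static uint64_t pack_rcx(uint64_t rcx, uint64_t r11)
{
    uint32_t r11d;

    rcx <<= 8;                               /* shlq $8,%rcx   */
    rcx = (rcx & ~0xffull) | (r11 & 0xff);   /* movb %r11b,%cl */
    r11d = (uint32_t)r11 >> 8;               /* shrl $8,%r11d  */
    r11d &= 8;                               /* andl $8,%r11d  */
    rcx |= r11d;                             /* orb %r11b,%cl  */
    return rcx;
}

/* Hypothetical undo step: recover the return address. A logical shift
 * is enough because user addresses have their top bits clear. */
static uint64_t unpack_rip(uint64_t packed)
{
    return packed >> 8;
}

/* Hypothetical undo step: recover the low eight flag bits plus the
 * bit that was parked in slot 3 (original bit 11). */
static uint32_t unpack_flags(uint64_t packed)
{
    uint32_t low = (uint32_t)(packed & 0xff);
    uint32_t parked = (low & SLOT_BIT3) << 8;  /* back to bit 11 */

    return (low & ~SLOT_BIT3) | parked;
}
```

For example, packing the return address 0x00007fffdeadbeef with flags 0xac7 and then unpacking yields the same address and the flags 0x8c7: the low arithmetic flags and bit 11 survive, while bit 9 does not.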

6f70eb2... by Linus Torvalds <email address hidden>

Merge branch 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax

Pull idr fixes from Matthew Wilcox:
 "One test-suite build fix for you and one run-time regression fix.

  The regression fix includes new tests to make sure they don't pop back
  up."

* 'idr-2018-02-06' of git://git.infradead.org/users/willy/linux-dax:
  idr: Fix handling of IDs above INT_MAX
  radix tree test suite: Fix build

4b0ad07... by Matthew Wilcox <email address hidden>

idr: Fix handling of IDs above INT_MAX

Khalid reported that the kernel selftests are currently failing:

selftests: test_bpf.sh
========================================
test_bpf: [FAIL]
not ok 1..8 selftests: test_bpf.sh [FAIL]

He bisected it to 6ce711f2750031d12cec91384ac5cfa0a485b60a ("idr: Make
1-based IDRs more efficient").

The root cause is doing a signed comparison in idr_alloc_u32() instead
of an unsigned comparison. I went looking for any similar problems and
found a couple (which would each result in the failure to warn in two
situations that aren't supposed to happen).

I knocked up a few test-cases to prove that I was right and added them
to the test-suite.

Reported-by: Khalid Aziz <email address hidden>
Tested-by: Khalid Aziz <email address hidden>
Signed-off-by: Matthew Wilcox <email address hidden>
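
The class of bug described in 4b0ad07 is easy to reproduce outside the kernel. The sketch below is not the `idr_alloc_u32()` code; the two helper names are hypothetical and only contrast a signed with an unsigned range check on 32-bit IDs. It assumes the usual two's-complement behaviour when a value above INT_MAX is converted to `int32_t`, which is what common compilers do.

```c
#include <stdint.h>

/* Buggy shape: both operands are treated as signed, so any ID above
 * INT_MAX turns negative and looks "small". */
static int out_of_range_signed(uint32_t id, uint32_t max)
{
    return (int32_t)id > (int32_t)max;
}

/* Fixed shape: compare the values as the unsigned quantities they are. */
static int out_of_range_unsigned(uint32_t id, uint32_t max)
{
    return id > max;
}
```

With `id = 0x80000000` and `max = 0x7fffffff`, the unsigned check reports the ID as out of range, while the signed check does not, because the ID is seen as a large negative number.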

4c3579f... by Linus Torvalds <email address hidden>

Merge tag 'edac_fixes_for_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fix from Borislav Petkov:
 "sb_edac: Prevent memory corruption on KNL (from Anna Karbownik)"

* tag 'edac_fixes_for_4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL

85a2d93... by Linus Torvalds <email address hidden>

Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "Yet another pile of melted spectrum related changes:

   - sanitize the array_index_nospec protection mechanism: Remove the
     overengineered array_index_nospec_mask_check() magic and allow
     const-qualified types as index to avoid temporary storage in a
     non-const local variable.

   - make the microcode loader more robust by properly propagating error
     codes. Provide information about new feature bits after microcode
     was updated so administrators can act upon it.

   - optimizations of the entry ASM code which reduce code footprint and
     make the code simpler and faster.

   - fix the {pmd,pud}_{set,clear}_flags() implementations to work
     properly on paravirt kernels by removing the address translation
     operations.

   - revert the harmful vmexit_fill_RSB() optimization

   - use IBRS around firmware calls

   - teach objtool about retpolines and add annotations for indirect
     jumps and calls.

   - explicitly disable jumplabel patching in __init code and handle
     patching failures properly instead of silently ignoring them.

   - remove indirect paravirt calls for writing the speculation control
     MSR as these calls are obviously providing the same attack vector
     which the mitigation is meant to close.

   - a few small fixes which address build issues with recent compiler
     and assembler versions"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
  KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
  objtool, retpolines: Integrate objtool with retpoline support more closely
  x86/entry/64: Simplify ENCODE_FRAME_POINTER
  extable: Make init_kernel_text() global
  jump_label: Warn on failed jump_label patching attempt
  jump_label: Explicitly disable jump labels in __init code
  x86/entry/64: Open-code switch_to_thread_stack()
  x86/entry/64: Move ASM_CLAC to interrupt_entry()
  x86/entry/64: Remove 'interrupt' macro
  x86/entry/64: Move the switch_to_thread_stack() call to interrupt_entry()
  x86/entry/64: Move ENTER_IRQ_STACK from interrupt macro to interrupt_entry
  x86/entry/64: Move PUSH_AND_CLEAR_REGS from interrupt macro to helper function
  x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
  objtool: Add module specific retpoline rules
  objtool: Add retpoline validation
  objtool: Use existing global variables for options
  x86/mm/sme, objtool: Annotate indirect call in sme_encrypt_execute()
  x86/boot, objtool: Annotate indirect jump in secondary_startup_64()
  x86/paravirt, objtool: Annotate indirect calls
  ...

d4858aa... by Linus Torvalds <email address hidden>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "s390:
   - optimization for the exitless interrupt support that was merged in 4.16-rc1
   - improve the branch prediction blocking for nested KVM
   - replace some jump tables with switch statements to improve expoline performance
   - fixes for multiple epoch facility

  ARM:
   - fix the interaction of userspace irqchip VMs with in-kernel irqchip VMs
   - make sure we can build 32-bit KVM/ARM with gcc-8.

  x86:
   - fixes for AMD SEV
   - fixes for Intel nested VMX, emulated UMIP and a dump_stack() on VM startup
   - fixes for async page fault migration
   - small optimization to PV TLB flush (new in 4.16-rc1)
   - syzkaller fixes

  Generic:
   - compiler warning fixes
   - syzkaller fixes
   - more improvements to the kvm_stat tool

  Two more small Spectre fixes are going to reach you via Ingo"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (40 commits)
  KVM: SVM: Fix SEV LAUNCH_SECRET command
  KVM: SVM: install RSM intercept
  KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE command
  include: psp-sev: Capitalize invalid length enum
  crypto: ccp: Fix sparse, use plain integer as NULL pointer
  KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled
  x86/kvm: Make parse_no_xxx __init for kvm
  KVM: x86: fix backward migration with async_PF
  kvm: fix warning for non-x86 builds
  kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds
  tools/kvm_stat: print 'Total' line for multiple events only
  tools/kvm_stat: group child events indented after parent
  tools/kvm_stat: separate drilldown and fields filtering
  tools/kvm_stat: eliminate extra guest/pid selection dialog
  tools/kvm_stat: mark private methods as such
  tools/kvm_stat: fix debugfs handling
  tools/kvm_stat: print error on invalid regex
  tools/kvm_stat: fix crash when filtering out all non-child trace events
  tools/kvm_stat: avoid 'is' for equality checks
  tools/kvm_stat: use a more pythonic way to iterate over dictionaries
  ...

4a3928c... by Linus Torvalds <email address hidden>

Linux 4.16-rc3

e1171ac... by Linus Torvalds <email address hidden>

Merge tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa fixes from Max Filippov:
 "Two fixes for reserved memory/DMA buffers allocation in high memory on
  xtensa architecture

   - fix memory accounting when reserved memory is in high memory region

   - fix DMA allocation from high memory"

* tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: support DMA buffers in high memory
  xtensa: fix high memory/reserved memory collision