Code review comment for ~paelzer/ubuntu/+source/qemu:fix-1832622-ppc-spectre-eoan

Rafael David Tinoco (rafaeldtinoco) wrote :

######################

My initial notes

We are taking care of:

    The spapr_cap SPAPR_CAP_IBS is used to indicate the level of
    capability for mitigations for indirect branch speculation.
    Currently the available values are broken (default), fixed-ibs
    (fixed by serialising indirect branches) and fixed-ccd (fixed by
    diabling the count cache).

cap-ibs={broken,workaround,fixed-ibs,fixed-ccd}
cap-ibs=workaround,cap-ccf-assist={on,off}

 broken = default
 workaround (cap-ccf-assist = count cache flush assist - hw assist)
 fixed-ibs = fixed by serialising indirect branches
 fixed-ccd = fixed by disabling the count cache

----
cap-ibs=workaround,cap-ccf-assist=on:

$ dmesg | grep cache-flush

  [ 0.000000] count-cache-flush: hardware assisted flush sequence
  enabled

cap-ibs=workaround,cap-ccf-assist=off:

$ dmesg | grep cache-flush

  [ 0.000000] count-cache-flush: full software flush sequence
  enabled.

----

migrations path accepted:

broken -> fixed-ccd, broken -> workaround, workaround -> fixed-ccd

######################

My summary:

Where is the support for cap-[cfpc/sbbc/ibs] ?

For mitigating spectre/meltdown cpu vulnerability, qemu implements
the machine capabilities cfpc,sbbc,ibs, which are present in the
current qemu, but the default values of it would be broken(no
mitigation) even fixes in hw/fw/sw is available.

1) CFPC - Cache Flush on Privilege Change

2) SBBC - Speculation Barrier Bounds Checking

(Looks like both were solved by: LP: #1761372 with commits bellow)
(Kernel changes: LP: #1822870)

813f3cf655 ppc/spapr-caps: Define the pseries-2.12-sxxm machine type
c76c0d3090 ppc/spapr-caps: Convert cap-ibs to custom spapr-cap
aaf265ffde ppc/spapr-caps: Convert cap-sbbc to custom spapr-cap
f27aa81e72 ppc/spapr-caps: Convert cap-cfpc to custom spapr-cap
87175d1bc5 ppc/spapr-caps: Add support for custom spapr_capabilities
cb931c2108 target/ppc: Check mask when setting ap_ppc_safe_indirect_branch
From 1761371 merged into this bug also
4f5b039d2b ppc/spapr-caps: Disallow setting workaround for spapr-cap-ibs

pseries-2.12-sxxm set by default the capabilities:

+ smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_WORKAROUND;
+ smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_WORKAROUND;
+ smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_FIXED_CCD;

** SPAPR_CAP_FIXED_CCD needs fw-count-cache-{enabled,disabled} in
   fw-features device tree **
** SPAPR_CAP_IBS is going to be broken as default if device tree is
   not configured **

Should we document this ^ ? Together with the SRU ?

3) IBS - Indirect Branch Serialisation

For the "workaround" (HW or SW) to work:

-> workaround needs "cap-ccf-assist"
-> without "cap-ccf-assist" 2.3 will stay "broken" by default
-> manually setting "cap-ibs=workaround,cap-ccf-assist={on,off}" is
   needed.

########################

NOTE / QUESTION / OBSERVATION

(0)

Should we care about TCG fatal (or not) warnings for cap-XXXX added ?

commit 006e9d3618698eeef2f3e07628d22cb6f5c2a039
Author: Suraj Jitindar Singh <email address hidden>
Date: Fri Mar 1 01:46:08 2019

    target/ppc/tcg: make spapr_caps apply cap-[cfpc/sbbc/ibs]
    non-fatal for tcg

    The spapr_caps cap-cfpc, cap-sbbc and cap-ibs are used to control
    the availability of certain mitigations to the guest. These
    haven't been implemented under TCG, it is unlikely they ever will
    be, and it is unclear as to whether they even need to be.

    As such, make failure to apply these capabilities under TCG
    non-fatal. Instead we print a warning message to the user but
    still allow the guest to continue.

    Signed-off-by: Suraj Jitindar Singh <email address hidden>
    Message-Id: <email address hidden>
    [dwg: Small style fix]
    Signed-off-by: David Gibson <email address hidden>

If you specify the CPU in TCG you might get a fatal error. I'm afraid
of 2nd-level in test environments (with same cmdline as 1st level).
(Assuming that 2nd-level in PPC is similar to s390 and no PHY<->VIRT
conversion is accelerated)

(1)

With commit:

commit 2782ad4c4102d57f7f8e135dce0c1adb0149de77 (v3.1.0-2774-g2782ad4c41)
Author: Suraj Jitindar Singh <email address hidden>
Date: Fri Mar 1 01:46:09 2019

    target/ppc/spapr: Enable mitigations by default for pseries-4.0
    machine type

    There are currently 3 mitigations the availability of which is
    controlled by the spapr-caps mechanism, cap-cfpc, cap-sbbc, and
    cap-ibs. Enable these mitigations by default for the pseries-4.0
    machine type.

    By now machine firmware should have been upgraded to allow these
    settings.

    Signed-off-by: Suraj Jitindar Singh <email address hidden>
    Message-Id: <email address hidden>
    Signed-off-by: David Gibson <email address hidden>

The default starts to be:

  smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_WORKAROUND;
  smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_WORKAROUND;
  smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_WORKAROUND;

inside spapr_machine_class_init(). So it considers PSeries-4 all to be
workarounded either by SW or HW (and assumes firmware is updated).

Funny thing is that it marks:

COMPAT to 3.1 as:

  smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_BROKEN;
  smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_BROKEN;
  smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_BROKEN;

but 2.12, our case for Bionic, has:

  smc->default_caps.caps[SPAPR_CAP_CFPC] = SPAPR_CAP_WORKAROUND;
  smc->default_caps.caps[SPAPR_CAP_SBBC] = SPAPR_CAP_WORKAROUND;
   smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_FIXED_CCD;

??? I'm unsure why we would be workarounded in 2.12 but not in 3.1 ???

(2)

Enabling mitigations by default (1) would have to be fixed by:

commit ba3b40de424304c497ca4ff990dd24e37aff5f5b
Author: David Gibson <email address hidden>
Date: Tue Mar 12 02:07:14 2019

    Suppress test warnings about missing Spectre/Meltdown mitigations
    with TCG

    The new pseries-4.0 machine type defaults to enabling
    Spectre/Meltdown mitigations. Unfortunately those mitigations
    aren't implemented for TCG because we're not yet sure if they're
    necessary or how to implement them. We don't fail fatally, but we
    do warn in this case, because it is quite plausible that
    Spectre/Meltdown can be exploited through TCG (at least for the
    guest to get access to the qemu address space).

    This create noise in our testcases though. So, modify the
    affected tests to explicitly disable the mitigations to suppress
    these warnings.

    Signed-off-by: David Gibson <email address hidden>

for the TCG case.

« Back to merge proposal