curtin dname for bcache uses unstable devname instead of UUID

Bug #1728742 reported by Ryan Harper
26
This bug affects 3 people
Affects Status Importance Assigned to Milestone
bcache-tools
New
Undecided
Unassigned
curtin
Fix Released
Undecided
Unassigned
bcache-tools (Debian)
Confirmed
Unknown
bcache-tools (Ubuntu)
Trusty
Invalid
Medium
Unassigned
Xenial
Invalid
Medium
Unassigned
Artful
Invalid
Medium
Unassigned
Bionic
Invalid
Medium
Unassigned
curtin (Ubuntu)
Fix Released
Undecided
Unassigned
Trusty
Won't Fix
Undecided
Unassigned
Xenial
New
Undecided
Unassigned
Artful
New
Undecided
Unassigned
Bionic
Fix Released
Undecided
Unassigned

Bug Description

[Impact]

 * Current users of bcache devices may encounter unreliable device
   numbering as the Linux kernel does not guarantee that bcache
   minor numbers are assigned to the same devices at each boot.
   Users who may have used /dev/bcacheN in paths to a specific
   device could possible be pointing to a different dataset
   altogether. bcache udev rules do provide some mechanism to
   generate persistent symlinks in /dev/bcache/by-uuid or
   /dev/bcache/by-label which is based on superblock data on
   the underlying device. However, the Linux kernel does not
   always generate an kernel uevent to trigger the udev rules
   to create the symlink.

 * The fix adds a udev program which will read bcache superblock
   of slave devices and extract the UUID and LABEL, exporting them
   to udev for use in the bcache rule files.

 * This is affected in upstream bcache-tools, the owning package
   of the udev rules. This affects all releases of bcache-tools
   as the rules rely upon the kernel to trigger these events,
   though that is not a requirement to resolve the lack of
   persistent links.

[Test Case]

 * Launch and Ubuntu Cloud Image with 3 unused disks
   - apt install bcache-tools tree
   - make-bcache -C /dev/vdb
   - make-bcache -B /dev/vdc
   - make-bcache -B /dev/vdd
   - echo "vdc" > /sys/class/block/bcache0/bcache/label
   - echo "vdd" > /sys/class/block/bcache1/bcache/label
   - reboot

   - Run this test:

    #!/bin/bash
    FAIL=0
    [ ! -d /dev/bcache ] && {
        echo "FAIL: /dev/bcache is not a directory";
        exit 1
    }
    for label in /dev/bcache/by-label/*; do
        LABEL_TARGET="$(ls -1 /sys/class/block/`basename $label`/holders/)"
        DEVNAME=`readlink -f $label`;
        KNAME="${DEVNAME#*/dev/}"
        if [ "$LABEL_TARGET" != "$KNAME" ]; then
            echo "FAIL: label points to $LABEL_TARGET but symlink points to $DEVNAME";
            FAIL=1
        fi;
    done
    if [ "$FAIL" == "0" ]; then
        echo "PASS";
        exit 0
    fi
    exit 1

[Regression Potential]

 * As bcache minor numbers and these symlinks have been unreliable in
   the past there may be code that makes assumptions about
   /dev/bcache* expanded only to the block devices, versus
   /dev/bcache which is a directory.

[Original Description]

Bcache device names like /dev/bcache0 are unstable. Bcache does not use any predictable ordering when assembling bcache devices, so on systems with multiple bcache devices, a symlink to /dev/bcache0 may end up pointing do a different device.

the bcache dname symlink should point to the /dev/bcache/by-uuid/<UUID> which matches the backing device UUID that's set at creation time.

Related bugs:
 * bug 1729145: [kernel] /dev/bcache/by-uuid links not created after reboot

Related branches

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :
Download full text (4.2 KiB)

Copied from a private bug:

Currently we have no stability in /dev/bcache<n> device names:

  * minor numbers for bcache devices are not guaranteed to stay the same across reboots because there is no guaranteed enumeration;
  * uevent details for bcache devices do not propagate an underlying disk's serial number
  * serial numbers of disks are driver-specific device attributes - there is no guarantee that this is exposed

  ====

  /dev/disk/by-dname/<device-name> symlinks provided by curtin are not
  reliable as they merely depend on kernel-provided name which is
  unstable:

  cat /etc/udev/rules.d/bcache0.rules.rules
  SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVNAME}=="/dev/bcache0", SYMLINK+="disk/by-dname/bcache0"

  dname symlink rules for block devices depend on a partition uuid - if a device doesn't have any partition pre-created a symlink will not be created:

  cat /etc/udev/rules.d/sda.rules.rules
  SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVTYPE}=="disk", ENV{ID_PART_TABLE_UUID}=="5a492040", SYMLINK+="disk/by-dname/sda"

  There is no way in MAAS to pre-create a GUID Partition Table without a
  partition and a file system for a bcache device (no isolated API call
  for partition table creation - only for file systems).

  ====

  Why is this important for bcache usage?

  Raw block devices need to be used by ceph-disk in cases where it needs
  a device without a file system or partition table, namely, ceph
  journal (used without a file system normally), ceph bluestore (for
  both data and metadata journal. Bluestore is important especially
  because it was designated to work with a raw block device. Using
  bluestore on top of a pre-created file system is an improper usage
  scenario.

  ====

  Ways to mitigate:

  1. Introduce a new udev rule which sets up /dev/by-backing/<backing-
  device-name> symlinks to bcache devices:

  cat /etc/udev/rules.d/bcache-by-backing.rules.rules
  SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVNAME}=="/dev/bcache*", PROGRAM="/lib/udev/bcache-name-helper.sh $kernel", SYMLINK+="disk/by-backing/$result"

  cat /lib/udev/bcache-name-helper.sh
  #!/bin/sh -e
  logger Getting a backing device for a bcache device $1 by sysfs file creation timestamp
  ls -c -1t /sys/block/$1/slaves/ | tail -n1

  tree /dev/disk/by-backing/
  /dev/disk/by-backing/
  ├── sdc -> ../../bcache2
  ├── sdd -> ../../bcache1
  ├── sde -> ../../bcache0
  └── sdf -> ../../bcache3

  lsblk
  NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
  sdf 8:80 0 64G 0 disk
  └─bcache3 252:48 0 64G 0 disk
  sdd 8:48 0 64G 0 disk
  └─bcache1 252:16 0 64G 0 disk
  sdb 8:16 0 64G 0 disk
  ├─bcache0 252:0 0 64G 0 disk
  ├─bcache3 252:48 0 64G 0 disk
  ├─bcache1 252:16 0 64G 0 disk
  └─bcache2 252:32 0 64G 0 disk
  sde 8:64 0 64G 0 disk
  └─bcache0 252:0 0 64G 0 disk
  sdc 8:32 0 64G 0 disk
  └─bcache2 252:32 0 64G 0 disk
  sda 8:0 0 64G 0 disk
  └─sda1 8:1 0 64G 0 part /

  2. Modify the Linux kernel source code to include a way to identify ...

Read more...

Revision history for this message
Ryan Harper (raharper) wrote :

In working on getting a udev rule fixed for bcache dname, I've filed this bug as well:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1729145

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :
Download full text (4.4 KiB)

I think the kernel side of it needs consideration but inherently persistent /dev/bcache<i> cannot be provided due to the fact that there is no deterministic enumeration and /dev/bcache<i> names cannot be changed afterwards (only symlinks can be created).

man 7 udev is clear about that:
"udev supplies the system software with device events, manages permissions of device nodes and may create additional symlinks in the /dev directory, or renames network interfaces. The kernel usually just assigns unpredictable device names based on the order of discovery."

bcache code uses alloc_disk from gendisk to get a struct gendisk which means no custom uevent variables on coldplug - it just gets whatever any block device gets (with exception of partitions).

http://elixir.free-electrons.com/linux/v4.14/source/drivers/base/core.c#L865 (other settings set from bcache code and genhd.c added to uevent)
http://elixir.free-electrons.com/linux/v4.14/source/block/genhd.c#L1234 (DEVTYPE = disk)

http://elixir.free-electrons.com/linux/v4.14/source/block/genhd.c#L1357 (alloc_disk -> alloc_disk_node -> device_initialize)

http://elixir.free-electrons.com/linux/v4.14/source/drivers/md/bcache/super.c#L788
     !(d->disk = alloc_disk(BCACHE_MINORS))) { // <--- alloc_disk
 snprintf(d->disk->disk_name, DISK_NAME_LEN, "bcache%i",
   minor / BCACHE_MINORS);

As a result, we have this on coldplug:

cat /sys/class/block/bcache0/uevent
MAJOR=252
MINOR=0
DEVNAME=bcache0
DEVTYPE=disk

hotplug uevents with necessary UUIDs only happen when init script runs in initramfs and are not handled as there is no udev to listen for those (they are just ignored) so no symlinks are ever set up.

Fixing the kernel side is not trivial as this would imply changing the DEVTYPE (userspace breakage) and adding custom uevents to that file - essentially rewriting alloc_disk.

So, given that this will realistically not land for 4.15 as there is no patch and the idea of changing uevent file is not perfect I think we should handle that from userspace.

We can read a bcache superblock of a backing device during cold-plug as we know that a device name will contain the word "bcache" and we can set up those symlinks with a small overhead of running several processes.

We just need a mapping from superblock UUID -> dname.

/dev/by-dname/<link> symlinks are respected by our user-space (MAAS API returns by-dname names and Juju storage uses that instead of /dev/<name>).

bcache-tools provide a way to see the "version" of a superblock (cache device or backing device) and are present in our cloud images.

The simplest way to do it is by using timestamp:

sudo bcache-super-show /dev/`ls -c -1t /sys/block/bcache0/slaves/ | tail -n1` | grep dev.uuid | cut -f 3
fb7c070c-5f96-4add-8565-1398bd831b1f

A more robust way with superblock version parsing (we can use a python script fwiw).

(for i in `ls -1 /sys/block/bcache0/slaves` ; do sudo bcache-super-show /dev/$i ; done) | grep -Pzo '(?s)(?<=sb.version\t\t1).*?dev.uuid.*?\n' | grep -a uuid | cut -f 3
fb7c070c-5f96-4add-8565-1398bd831b1f

We could also use bcache label to hold a dname and use that instead (just need to set it on creation but UUID is way more reliab...

Read more...

Ante Karamatić (ivoks)
tags: added: cpe-onsite
Revision history for this message
Ryan Harper (raharper) wrote :

We've been poking at this for a bit and I'll summarize some things.

- The kernel itself has a bug w.r.t not emitting a CHANGE event during coldplug due to bcache driver only emitting that if the backing devices is not already registered.
- This first happens in the initramfs and /dev inside the initramfs is correctly populated with /deb/bcache/{by-uuid,by-label} symlinks
- systemd-udevd will examine current udev database of coldplug events and replay them; specifically an 'ADD' event on each /dev/bcacheX device; however due to the kernel bug in bcache driver, it does not generate a 'CHANGE' event which includes the CACHED_UUID value parsed from the backing device of the bcacheX device.
- When comparing old database links with the current device state (of bcacheX) it finds only the /dev/disk/by-uuid symlink associated with bcacheX and then removes the /dev/bcache/by-uuid/<dev.uuid> -> bcacheX symlink ; effectively remove a valid symlink. This may also be a udev bug; but that requires further investigation of systemd/udev/udev-node.c 's logic around purging links where the name doesn't match but the path does.

This leaves us with valid links that get purged until the kernel or systemd-udevd are fixed.

As suggested in this bug, we can workaround this issue by examing the backing device of a bcacheX during the 'ADD' event for /dev/bcacheX

I've added a script /lib/udev/bcache-export-cached which accepts the devname ($tempnode from udev) and uses sysfs to examine the list of slaves (there should be only one) and then uses bcache-super-show to extract the dev.uuid and dev.label values (CACHED_UUID, CACHED_LABEL respectively), and then emit them into the environment for 69-bcache.rules to import and then apply the correct symlinks.

W.r.t the suggestions w.r.t the sb.version; note that bcache devices are only slaves to backing devices; that is you can create a bcache device without a cache set device; bcache needs to continue to run even if the caching device has failed or is not available.

Revision history for this message
Ryan Harper (raharper) wrote :

A version of bcache-tools for xenial with the above workaround is available here:

https://launchpad.net/~raharper/+archive/ubuntu/bugfixes/+sourcepub/8536671/+listing-archive-extra

Revision history for this message
Ryan Harper (raharper) wrote :

For the curtin side, the plan is to do two things.

1) update the dname for bcache to use the backing device's dev.uuid value and emit a symlink pointing to a /dev/bcache/by-uuid path. This gives us

/dev/disk/by-dname/<name> -> \
   ../../bcache/by-uuid/<dev.uuid> -> \
        ../../bcacheX

2) curtin will now echo the dname value into /sys/class/block/bcacheX/bcache/label which will
then enable /dev/bcache/by-label/<dname> -> ../../bcacheX

This work needs to be tested on Trusty -> Bionic; I'll work on updating the vmtests for bcache devices to exercise this path.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Tested #6:

https://paste.ubuntu.com/26471097/

1) while dname symlinks point to the respective /dev/bcache<n> devices (n -> n is always maintained) the underlying /dev/sd<x> devices change for dnames (provided that they are stable in this test)

https://paste.ubuntu.com/26471097/

2) only one /dev/bcache/by-uuid symlink is ever created (should be two with per-superblock uuids). The UUID used is different from slave superblock UUIDs (it is a cache dev's superblock UUID):

https://paste.ubuntu.com/26471151/

cache device uuid -> dev.uuid 90707207-91e6-4def-9f39-b9fdbb3540ab
cset.uuid 4e21a32d-d60e-42a6-bc99-8ae7f5b21c99

sdc -> dev.uuid 4fb3f632-6e1a-43f4-9f68-e10db4d6d612
sdd -> dev.uuid 2c9b7bca-d30a-44c5-afcd-3b3aadfb0be0

ubuntu@maas-xenial2:~$ tree /dev/bcache
/dev/bcache
└── by-uuid
    └── 90707207-91e6-4def-9f39-b9fdbb3540ab -> ../../bcache0 # <---- 907...

instead, it should be:

/dev/disk/by-uuid/4fb3f632-6e1a-43f4-9f68-e10db4d6d612 -> /dev/bcache<whatever0>
/dev/disk/by-uuid/4fb3f632-6e1a-43f4-9f68-e10db4d6d612 -> /dev/bcache<whatever1>

IoW: bcache numbers for /dev/bcache<n> may flip but numbers at /dev/bcache/by-dname/bcache<n> should stay coherent with MAAS'/curtin's naming by using slave device superblock UUIDs.

Revision history for this message
Ryan Harper (raharper) wrote : Re: [Bug 1728742] Re: curtin dname for bcache uses unstable devname instead of UUID

.

On Sat, Jan 27, 2018 at 10:35 AM, Dmitrii Shcherbakov <
<email address hidden>> wrote:

> Tested #6:
>
> https://paste.ubuntu.com/26471097/
>
> 1) while dname symlinks point to the respective /dev/bcache<n> devices
> (n -> n is always maintained) the underlying /dev/sd<x> devices change
> for dnames (provided that they are stable in this test)
>
> https://paste.ubuntu.com/26471097/
>
> 2) only one /dev/bcache/by-uuid symlink is ever created (should be two
> with per-superblock uuids). The UUID used is different from slave
> superblock UUIDs (it is a cache dev's superblock UUID):
>
> https://paste.ubuntu.com/26471151/
>
> cache device uuid -> dev.uuid 90707207-91e6-4def-9f39-
> b9fdbb3540ab
> cset.uuid 4e21a32d-d60e-42a6-bc99-8ae7f5b21c99
>
> sdc -> dev.uuid 4fb3f632-6e1a-43f4-9f68-e10db4d6d612
> sdd -> dev.uuid 2c9b7bca-d30a-44c5-afcd-3b3aadfb0be0
>
> ubuntu@maas-xenial2:~$ tree /dev/bcache
> /dev/bcache
> └── by-uuid
> └── 90707207-91e6-4def-9f39-b9fdbb3540ab -> ../../bcache0 # <----
> 907...
>
> instead, it should be:
>
> /dev/disk/by-uuid/4fb3f632-6e1a-43f4-9f68-e10db4d6d612 ->
> /dev/bcache<whatever0>
> /dev/disk/by-uuid/4fb3f632-6e1a-43f4-9f68-e10db4d6d612 ->
> /dev/bcache<whatever1>
>
> IoW: bcache numbers for /dev/bcache<n> may flip but numbers at
> /dev/bcache/by-dname/bcache<n> should stay coherent with MAAS'/curtin's
> naming by using slave device superblock UUIDs.
>

This still needs curtin changes to emit dname that points to a backing UUID;
Currently the rule only matches dname to bcacheN; which is going to break
as you
can see; instead the dname should always be bound to the underlying backing
device.

Since that curtin change hasn't been done, this won't fully work.

Please attach your curtin configs for the above scenarios and we can work
on getting a
curtin with the dname bcache backing-uuid rule added to a ppa and then we
can validate
your needed scenarios.

> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1728742
>
> Title:
> curtin dname for bcache uses unstable devname instead of UUID
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/bcache-tools/+bug/1728742/+subscriptions
>

Revision history for this message
Ryan Harper (raharper) wrote :

Untested version of a patch to curtin to emit bcache dname's using the backing device uuid.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

curtin-install-cfg.yaml
https://paste.ubuntu.com/26484968/

Happy to test from a ppa.

Revision history for this message
Ryan Harper (raharper) wrote :

The bcache-export-cached udev script did not skip cache devices and would emit a cache device dev.uuid; I'm testing a change to ignore those and that appears to get things working on my single cache, multiple backing device test-case.

I've also some changes on the curtin side which will emit the udev dname rule based on the backing device uuid, and also write out the dname value into the backing device 'label' so there will be those symlinks as well.

Revision history for this message
Ryan Harper (raharper) wrote :

I've updated bcache-tools and curtin in ppa:raharper/bugfixes , please give those a test and see if that fixes the remaining issues.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Ran a few tests.

/etc/maas/preseeds/curtin_userdata
https://paste.ubuntu.com/26501712/
early_commands:
  10_add_ppa: ['sh', '-xc', 'DEBIAN_FRONTEND=noninteractive add-apt-repository --yes ppa:raharper/bugfixes']
  # update & upgrade what is there already
  97_update: ['apt-get', 'update']
  98_upgrade: ['sh', '-xc', 'DEBIAN_FRONTEND=noninteractive apt-get upgrade --yes']
...
system_upgrade:
  enabled: True

It seems like /dev/bcache/by-uuid links are set up correctly but labels are not written/persisted

https://paste.ubuntu.com/26502111/

It appears to be that there are also no dname rules written in rules.d hence dname symlinks are not correct when minor numbers change:

ubuntu@maas-xenial2:~$ grep -RiP dname /lib/udev/rules.d/ ; echo $?
1

Not sure if that's my curtin_userdata that is incorrect or something wrong happens during the rule rendering.

root@maas-xenial2:~# pastebinit curtin-install-cfg.yaml
http://paste.ubuntu.com/26502171/

root@maas-xenial2:~# pastebinit curtin-install.log
http://paste.ubuntu.com/26502172/

The package seems to be coming from a ppa correctly based on curtin-install.log.

Revision history for this message
Ryan Harper (raharper) wrote :

On Thu, Feb 1, 2018 at 1:40 PM, Dmitrii Shcherbakov <
<email address hidden>> wrote:

> Ran a few tests.
>
> /etc/maas/preseeds/curtin_userdata
> https://paste.ubuntu.com/26501712/
> early_commands:
> 10_add_ppa: ['sh', '-xc', 'DEBIAN_FRONTEND=noninteractive
> add-apt-repository --yes ppa:raharper/bugfixes']
> # update & upgrade what is there already
> 97_update: ['apt-get', 'update']
> 98_upgrade: ['sh', '-xc', 'DEBIAN_FRONTEND=noninteractive apt-get
> upgrade --yes']
> ...
> system_upgrade:
> enabled: True
>

Your early commands only affect the ephemeral environment, instead you
need to install this in the target like this:

 late_commands:
  01-bugfix-ppa: [curtin, in-target, --, add-apt-repository, -y,
"ppa:raharper/bugfixes"]
  02-update: [curtin, in-target, --, apt-get, update]
  03_install_package: [curtin, in-target, --, apt-get, install,
--assume-yes, bcache-tools]

>
> It seems like /dev/bcache/by-uuid links are set up correctly but labels
> are not written/persisted
>
> https://paste.ubuntu.com/26502111/
>
> It appears to be that there are also no dname rules written in rules.d
> hence dname symlinks are not correct when minor numbers change:
>
> ubuntu@maas-xenial2:~$ grep -RiP dname /lib/udev/rules.d/ ; echo $?
> 1
>
> Not sure if that's my curtin_userdata that is incorrect or something
> wrong happens during the rule rendering.
>
>
> root@maas-xenial2:~# pastebinit curtin-install-cfg.yaml
> http://paste.ubuntu.com/26502171/
>
> root@maas-xenial2:~# pastebinit curtin-install.log
> http://paste.ubuntu.com/26502172/
>
> The package seems to be coming from a ppa correctly based on curtin-
> install.log.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1728742
>
> Title:
> curtin dname for bcache uses unstable devname instead of UUID
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/bcache-tools/+bug/1728742/+subscriptions
>

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Ryan,

I did have a PPA added via MAAS though.

I looked at /lib/udev/rules.d which was incorrect - should have looked at /etc/udev/rules.d.

Now, with an entry in MAAS and in curtin_userdata as well:

https://paste.ubuntu.com/26502622/

root@maas-xenial2:~# apt policy bcache-tools
bcache-tools:
  Installed: 1.0.8-2ubuntu2~ppa3
  Candidate: 1.0.8-2ubuntu2~ppa3
  Version table:
 *** 1.0.8-2ubuntu2~ppa3 500
        500 http://ppa.launchpad.net/raharper/bugfixes/ubuntu xenial/main amd64 Packages

...
dev.label (empty)
...

root@maas-xenial2:~# cat /etc/udev/rules.d/bcache0.rules.rules
SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVNAME}=="/dev/bcache0", SYMLINK+="disk/by-dname/bcache0"

So, it seems like a dname symlink -> UUID symlink was not created. The /dev/bcache/by-uuid rules are there.

I did try a manual label writing test based on the kernel code - a written label seems to persist after reboot.
https://paste.ubuntu.com/26502649/

So something tells me that I simply did not get the new *curtin* code.

https://paste.ubuntu.com/26502671/ (curtin-install-cfg.yaml)

I will poke around more in the morning - any ideas are appreciated.

Also, thx for working on this.

Revision history for this message
Ryan Harper (raharper) wrote :

On Thu, Feb 1, 2018 at 4:09 PM, Dmitrii Shcherbakov <
<email address hidden>> wrote:

> Ryan,
>
> I did have a PPA added via MAAS though.
>
> I looked at /lib/udev/rules.d which was incorrect - should have looked
> at /etc/udev/rules.d.
>
> Now, with an entry in MAAS and in curtin_userdata as well:
>
> https://paste.ubuntu.com/26502622/
>
> root@maas-xenial2:~# apt policy bcache-tools
> bcache-tools:
> Installed: 1.0.8-2ubuntu2~ppa3
> Candidate: 1.0.8-2ubuntu2~ppa3
> Version table:
> *** 1.0.8-2ubuntu2~ppa3 500
> 500 http://ppa.launchpad.net/raharper/bugfixes/ubuntu xenial/main
> amd64 Packages
>
> ...
> dev.label (empty)
> ...
>
> root@maas-xenial2:~# cat /etc/udev/rules.d/bcache0.rules.rules
> SUBSYSTEM=="block", ACTION=="add|change", ENV{DEVNAME}=="/dev/bcache0",
> SYMLINK+="disk/by-dname/bcache0"
>
> So, it seems like a dname symlink -> UUID symlink was not created. The
> /dev/bcache/by-uuid rules are there.
>
> I did try a manual label writing test based on the kernel code - a written
> label seems to persist after reboot.
> https://paste.ubuntu.com/26502649/
>
> So something tells me that I simply did not get the new *curtin* code.
>

Yes, you need the updated curtin installed, I *think* to the region
controllers; I'm not sure.
The updated curtin includes the new rule format that uses the CACHED_UUID
value from the
backing device (as well as updating the backing device label).

>
> https://paste.ubuntu.com/26502671/ (curtin-install-cfg.yaml)
>
> I will poke around more in the morning - any ideas are appreciated.
>
> Also, thx for working on this.
>

Sure

>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1728742
>
> Title:
> curtin dname for bcache uses unstable devname instead of UUID
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/bcache-tools/+bug/1728742/+subscriptions
>

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Looks like there are old curtin packages there (from Jan last year):

http://ppa.launchpad.net/raharper/bugfixes/ubuntu/pool/main/c/curtin/
curtin_0.1.0~bzr443-0ubuntu1_all.deb 2017-01-27 17:17 2.4K
curtin_17.1-20-g824cfab2.orig.tar.gz 2018-01-30 21:03 365K
python-curtin_0.1.0~bzr443-0ubuntu1_all.deb 2017-01-27 17:17 83K
python3-curtin_0.1.0~bzr443-0ubuntu1_all.deb 2017-01-27 17:17 83K

The source tarball is new but not the debs.

Revision history for this message
Ryan Harper (raharper) wrote :

On Fri, Feb 2, 2018 at 6:46 AM, Dmitrii Shcherbakov <
<email address hidden>> wrote:

> Looks like there are old curtin packages there (from Jan last year):
>
> http://ppa.launchpad.net/raharper/bugfixes/ubuntu/pool/main/c/curtin/
> curtin_0.1.0~bzr443-0ubuntu1_all.deb 2017-01-27 17:17 2.4K
> curtin_17.1-20-g824cfab2.orig.tar.gz 2018-01-30 21:03 365K
> python-curtin_0.1.0~bzr443-0ubuntu1_all.deb 2017-01-27 17:17
> 83K
> python3-curtin_0.1.0~bzr443-0ubuntu1_all.deb 2017-01-27 17:17
> 83K
>
> The source tarball is new but not the debs.
>

Bah, yeah, build failure. I'll fix.

>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1728742
>
> Title:
> curtin dname for bcache uses unstable devname instead of UUID
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/bcache-tools/+bug/1728742/+subscriptions
>

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Ryan, haven't realized you've updated the package in the ppa already.

Performed some more tests with the new package - LGTM.

1. one cache device + multiple backing devices + reboot with bcache minor number changes -> symlinks are maintained correctly (that is persistent symlinks point to a different set of /dev/bcache devices);
2. removal of a single device (when machine is powered off) + hotplug (after a power-on without a device) + device number does not match a symlink/label -> symlink is created correctly;
3. two cache devices + partitions with mixed cache dev partition/backing device partitions and regular blog devices.

https://paste.ubuntu.com/26534615/

Thanks a lot!

I would not wait for the kernel side in pad.lv/1729145 now albeit it's fix-committed - from my perspective the userspace workaround can be replaced once all affected kernels contain the fix and removal of a workaround is validated with a particular kernel.

Revision history for this message
Ryan Harper (raharper) wrote :

On Wed, Feb 7, 2018 at 5:38 AM, Dmitrii Shcherbakov <
<email address hidden>> wrote:

> Ryan, haven't realized you've updated the package in the ppa already.
>
> Performed some more tests with the new package - LGTM.
>

\o/

>
> 1. one cache device + multiple backing devices + reboot with bcache minor
> number changes -> symlinks are maintained correctly (that is persistent
> symlinks point to a different set of /dev/bcache devices);
> 2. removal of a single device (when machine is powered off) + hotplug
> (after a power-on without a device) + device number does not match a
> symlink/label -> symlink is created correctly;
> 3. two cache devices + partitions with mixed cache dev partition/backing
> device partitions and regular blog devices.
>
> https://paste.ubuntu.com/26534615/
>
> Thanks a lot!
>
> I would not wait for the kernel side in pad.lv/1729145 now albeit it's
> fix-committed - from my perspective the userspace workaround can be
> replaced once all affected kernels contain the fix and removal of a
> workaround is validated with a particular kernel.
>

No plans to wait; however; we should test with a fixed kernel to see it
somehow breaks things again.

> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1728742
>
> Title:
> curtin dname for bcache uses unstable devname instead of UUID
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/bcache-tools/+bug/1728742/+subscriptions
>

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

HWE (no patch applied)
https://paste.ubuntu.com/26537602/

* Below is a test on GA with bcache patch applied and bcache-export-cached commented out (post-deployment kernel change + udev rule modification)
* Looks good based on the previous configuration - all symlinks are where they should be https://paste.ubuntu.com/26534615/

69-bcache-rules:
https://paste.ubuntu.com/26540975/

# Ubuntu-4.4.0-113.136 with 3454b682755f9a800ba368334afa40dd9d821a88 bcache uevent patch applied

root@maas-xenial6:~# uname -a
Linux maas-xenial6 4.4.98-ueventbcache #1 SMP Thu Feb 8 10:59:26 MSK 2018 x86_64 x86_64 x86_64 GNU/Linux

root@maas-xenial6:~# tree /dev/bcache
/dev/bcache
├── by-label
│   ├── bcache0 -> ../../bcache2
│   ├── bcache2 -> ../../bcache3
│   ├── bcache3 -> ../../bcache0
│   ├── bcache4 -> ../../bcache1
│   └── bcache5 -> ../../bcache4
└── by-uuid
    ├── 07d7270c-ed0e-4814-bbc6-affd8d4585a9 -> ../../bcache0
    ├── 2ad4aecf-cb7c-495f-946d-21b59377e0fe -> ../../bcache4
    ├── 5bd523eb-0256-4700-ae6e-2db5ce82eca0 -> ../../bcache1
    ├── dc1f2545-cb5d-4420-80eb-59eb8b83a8a7 -> ../../bcache2
    └── eaa6bc7f-8a9c-495a-9f8b-92132cc49c49 -> ../../bcache3

2 directories, 10 files
root@maas-xenial6:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 64G 0 disk
└─sda1 8:1 0 64G 0 part /
sdb 8:16 0 64G 0 disk
├─bcache1 252:1 0 32.3G 0 disk
├─bcache2 252:2 0 64G 0 disk
└─bcache3 252:3 0 32.3G 0 disk
sdc 8:32 0 64G 0 disk
└─bcache2 252:2 0 64G 0 disk
sdd 8:48 0 64G 0 disk
└─bcache4 252:4 0 64G 0 disk
sde 8:64 0 64G 0 disk
├─sde1 8:65 0 31.7G 0 part
│ ├─bcache0 252:0 0 31.7G 0 disk
│ └─bcache4 252:4 0 64G 0 disk
└─sde2 8:66 0 32.3G 0 part
  └─bcache3 252:3 0 32.3G 0 disk
sdf 8:80 0 64G 0 disk
├─sdf1 8:81 0 31.7G 0 part
│ └─bcache0 252:0 0 31.7G 0 disk
└─sdf2 8:82 0 32.3G 0 part
  └─bcache1 252:1 0 32.3G 0 disk
root@maas-xenial6:~# tree /dev/disk/by-dname/
/dev/disk/by-dname/
├── bcache0 -> ../../bcache2
├── bcache2 -> ../../bcache3
├── bcache3 -> ../../bcache0
├── bcache4 -> ../../bcache1
├── bcache5 -> ../../bcache4
├── sda -> ../../sda
├── sda-part1 -> ../../sda1
├── sde-part1 -> ../../sde1
├── sde-part2 -> ../../sde2
├── sdf-part1 -> ../../sdf1
└── sdf-part2 -> ../../sdf2

0 directories, 11 files
root@maas-xenial6:~# uname -a
Linux maas-xenial6 4.4.98-ueventbcache #1 SMP Thu Feb 8 10:59:26 MSK 2018 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
Ryan Harper (raharper) wrote :

Excellent; thanks for testing.

On Thu, Feb 8, 2018 at 7:05 AM, Dmitrii Shcherbakov <
<email address hidden>> wrote:

> HWE (no patch applied)
> https://paste.ubuntu.com/26537602/
>
> * Below is a test on GA with bcache patch applied and bcache-export-cached
> commented out (post-deployment kernel change + udev rule modification)
> * Looks good based on the previous configuration - all symlinks are where
> they should be https://paste.ubuntu.com/26534615/
>
> 69-bcache-rules:
> https://paste.ubuntu.com/26540975/
>
>
> # Ubuntu-4.4.0-113.136 with 3454b682755f9a800ba368334afa40dd9d821a88
> bcache uevent patch applied
>
> root@maas-xenial6:~# uname -a
> Linux maas-xenial6 4.4.98-ueventbcache #1 SMP Thu Feb 8 10:59:26 MSK 2018
> x86_64 x86_64 x86_64 GNU/Linux
>
> root@maas-xenial6:~# tree /dev/bcache
> /dev/bcache
> ├── by-label
> │ ├── bcache0 -> ../../bcache2
> │ ├── bcache2 -> ../../bcache3
> │ ├── bcache3 -> ../../bcache0
> │ ├── bcache4 -> ../../bcache1
> │ └── bcache5 -> ../../bcache4
> └── by-uuid
> ├── 07d7270c-ed0e-4814-bbc6-affd8d4585a9 -> ../../bcache0
> ├── 2ad4aecf-cb7c-495f-946d-21b59377e0fe -> ../../bcache4
> ├── 5bd523eb-0256-4700-ae6e-2db5ce82eca0 -> ../../bcache1
> ├── dc1f2545-cb5d-4420-80eb-59eb8b83a8a7 -> ../../bcache2
> └── eaa6bc7f-8a9c-495a-9f8b-92132cc49c49 -> ../../bcache3
>
> 2 directories, 10 files
> root@maas-xenial6:~# lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 64G 0 disk
> └─sda1 8:1 0 64G 0 part /
> sdb 8:16 0 64G 0 disk
> ├─bcache1 252:1 0 32.3G 0 disk
> ├─bcache2 252:2 0 64G 0 disk
> └─bcache3 252:3 0 32.3G 0 disk
> sdc 8:32 0 64G 0 disk
> └─bcache2 252:2 0 64G 0 disk
> sdd 8:48 0 64G 0 disk
> └─bcache4 252:4 0 64G 0 disk
> sde 8:64 0 64G 0 disk
> ├─sde1 8:65 0 31.7G 0 part
> │ ├─bcache0 252:0 0 31.7G 0 disk
> │ └─bcache4 252:4 0 64G 0 disk
> └─sde2 8:66 0 32.3G 0 part
> └─bcache3 252:3 0 32.3G 0 disk
> sdf 8:80 0 64G 0 disk
> ├─sdf1 8:81 0 31.7G 0 part
> │ └─bcache0 252:0 0 31.7G 0 disk
> └─sdf2 8:82 0 32.3G 0 part
> └─bcache1 252:1 0 32.3G 0 disk
> root@maas-xenial6:~# tree /dev/disk/by-dname/
> /dev/disk/by-dname/
> ├── bcache0 -> ../../bcache2
> ├── bcache2 -> ../../bcache3
> ├── bcache3 -> ../../bcache0
> ├── bcache4 -> ../../bcache1
> ├── bcache5 -> ../../bcache4
> ├── sda -> ../../sda
> ├── sda-part1 -> ../../sda1
> ├── sde-part1 -> ../../sde1
> ├── sde-part2 -> ../../sde2
> ├── sdf-part1 -> ../../sdf1
> └── sdf-part2 -> ../../sdf2
>
> 0 directories, 11 files
> root@maas-xenial6:~# uname -a
> Linux maas-xenial6 4.4.98-ueventbcache #1 SMP Thu Feb 8 10:59:26 MSK 2018
> x86_64 x86_64 x86_64 GNU/Linux
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1728742
>
> Title:
> curtin dname for bcache uses unstable devname instead of UUID
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/bcache-tools/+bug/1728742/+subscriptions
>

Revision history for this message
Ryan Harper (raharper) wrote :

Debdiff for bionic

Revision history for this message
Ryan Harper (raharper) wrote :

Debdiff for Xenial, this will need an SRU to Xenial and Trusty, probably Artful as well.

Scott Moser (smoser)
Changed in bcache-tools (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
status: Confirmed → In Progress
assignee: nobody → Ryan Harper (raharper)
Changed in bcache-tools (Ubuntu Trusty):
status: New → Confirmed
Changed in bcache-tools (Ubuntu Xenial):
status: New → Confirmed
Changed in bcache-tools (Ubuntu Artful):
status: New → Confirmed
Changed in bcache-tools (Ubuntu Trusty):
importance: Undecided → Medium
Changed in bcache-tools (Ubuntu Xenial):
importance: Undecided → Medium
Changed in bcache-tools (Ubuntu Artful):
importance: Undecided → Medium
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "fix_bcache_dname.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Changed in bcache-tools (Debian):
status: Unknown → New
Ryan Harper (raharper)
description: updated
Scott Moser (smoser)
description: updated
Changed in bcache-tools (Debian):
status: New → Confirmed
Revision history for this message
Scott Moser (smoser) wrote :

Ryan forwarded this to bcache-tools upstream on github.
 https://github.com/koverstreet/bcache-tools/pull/1

description: updated
Scott Moser (smoser)
Changed in curtin:
status: New → Fix Committed
David Britton (dpb)
no longer affects: bcache-tools (Ubuntu)
Changed in bcache-tools (Ubuntu Bionic):
status: In Progress → Invalid
Changed in bcache-tools (Ubuntu Artful):
status: Confirmed → Invalid
Changed in bcache-tools (Ubuntu Xenial):
status: Confirmed → Invalid
Changed in bcache-tools (Ubuntu Trusty):
status: Confirmed → Invalid
Changed in bcache-tools (Ubuntu Bionic):
assignee: Ryan Harper (raharper) → nobody
Revision history for this message
David Britton (dpb) wrote :

Now that https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1729145 is released, this work to SRU back to xenial can proceed.

Changed in curtin (Ubuntu Trusty):
status: New → Won't Fix
Changed in curtin (Ubuntu Bionic):
status: New → Fix Released
Changed in curtin (Ubuntu):
status: New → Fix Released
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :
Revision history for this message
Ryan Harper (raharper) wrote : Fixed in curtin version 18.1-17-gae48e86f-0ubuntu1.

This bug is believed to be fixed in curtin in version18.1-17-gae48e86f-0ubuntu1. If this is still a problem for you, please make a comment and set the state back to New

Thank you.

Changed in curtin:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.