Merge ~paul-meyer/cloud-init:fix-raw-ephemeral-disk into cloud-init:master

Proposed by Paul Meyer
Status: Merged
Merged at revision: 30d0adb71c9adadf437b2a1c69529ad9f44e75a8
Proposed branch: ~paul-meyer/cloud-init:fix-raw-ephemeral-disk
Merge into: cloud-init:master
Diff against target: 16 lines (+4/-1)
1 file modified
cloudinit/config/cc_disk_setup.py (+4/-1)
Reviewer Review Type Date Requested Status
cloud-init Commiters Pending
Review via email: mp+304033@code.launchpad.net

Description of the change

When the ephemeral drive is unpartitioned, sgdisk output has an extra line (see below), which fails get_gpt_hdd_size(). This patch add flexibility to handle this case.

root@sshvm2:~# sgdisk -p /dev/sdb
Creating new GPT entries.
Disk /dev/sdb: 266338304 sectors, 127.0 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 22D20184-BA9D-4C3B-8BF0-B6A45A8396C6
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 266338270
Partitions will be aligned on 2048-sector boundaries
Total free space is 266338237 sectors (127.0 GiB)

Number Start (sector) End (sector) Size Code Name
root@sshvm2:~#

To post a comment you must log in.
Revision history for this message
Scott Moser (smoser) wrote :

Hi,

could you give some cloud-config to demonstrate this? (i realize i have to have a blank disk), or does this just happen on Azure for some releases?

Also, is there a bug open that this fixes ?

Thanks!

Revision history for this message
Paul Meyer (paul-meyer) wrote :
Download full text (4.4 KiB)

I was testing (on the Azure 16.04 image) if we can leave the disk raw instead of partitioning and formatting it NTFS. It would shave some seconds off of the deployment time of VM's. I noticed that gfdisk has an extra line at the top when the disk is completely uninitialized, throwing off the get_gpt_hdd_size function.

There's no open bug, but I thought I'd send a patch anyway, because this is a generic potential issue.
I pasted the relevant log snippet below. As you can see, the hdd size is returned as 'GPT', which is the third word on the first line, instead of the number of sectors, which is supposed to be returned.

Aug 26 17:49:39 ubuntu [CLOUDINIT] stages.py[DEBUG]: Running module disk_setup (<module 'cloudinit.config.cc_disk_setup' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_disk_setup.py'>) with frequency once-per-instance
Aug 26 17:49:39 ubuntu [CLOUDINIT] handlers.py[DEBUG]: start: modules-config/config-disk_setup: running config-disk_setup with frequency once-per-instance
Aug 26 17:49:39 ubuntu [CLOUDINIT] util.py[DEBUG]: Writing to /var/lib/cloud/instances/2EE1991E-DADF-6D47-9683-E393AD393624/sem/config_disk_setup - wb: [420] 24 bytes
Aug 26 17:49:39 ubuntu [CLOUDINIT] helpers.py[DEBUG]: Running config-disk_setup using lock (<FileLock using file '/var/lib/cloud/instances/2EE1991E-DADF-6D47-9683-E393AD393624/sem/config_disk_setup'>)
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: updated disk_setup device entry 'ephemeral0' to '/dev/sdb'
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Partitioning disks: {'/dev/sdb': {'overwrite': True, 'layout': [100], 'table_type': 'gpt', '_origname': 'ephemeral0'}}
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Creating new partition table/disk
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Checking values for /dev/sdb definition
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Checking against default devices
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Checking if device /dev/sdb is a valid device
Aug 26 17:49:39 ubuntu [CLOUDINIT] util.py[DEBUG]: Running command ['/bin/lsblk', '--pairs', '--output', 'NAME,TYPE,FSTYPE,LABEL', '/dev/sdb', '--nodeps'] with allowed return codes [0] (shell=False, capture=True)
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Checking if device layout matches
Aug 26 17:49:39 ubuntu [CLOUDINIT] util.py[DEBUG]: Running command ['/sbin/sgdisk', '-p', '/dev/sdb'] with allowed return codes [0] (shell=False, capture=True)
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Checking if device is safe to partition
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Checking for device size
Aug 26 17:49:39 ubuntu [CLOUDINIT] util.py[DEBUG]: Running command ['/sbin/sgdisk', '-p', '/dev/sdb'] with allowed return codes [0] (shell=False, capture=True)
Aug 26 17:49:39 ubuntu [CLOUDINIT] cc_disk_setup.py[DEBUG]: Calculating partition layout
Aug 26 17:49:39 ubuntu [CLOUDINIT] util.py[DEBUG]: Creating partition on /dev/sdb took 0.083 seconds
Aug 26 17:49:39 ubuntu [CLOUDINIT] util.py[WARNING]: Failed partitioning operation#012could not convert string to float: 'GPT'
A...

Read more...

Revision history for this message
Scott Moser (smoser) wrote :

Hey, thanks for your patience.
I'm going to pull this.that module can definitely use some cleanup.
curtin has lots of stuff that does this sort of thing better.

For something like "find out how big a block device is" in sectors we could do something like:

$ sudo blockdev --getsize /dev/sdb
83886080

$ cat /sys/class/block/sdb/size
83886080

For that value in bytes:
 $ sudo blockdev --getsize64 /dev/sdb
42949672960

Revision history for this message
Paul Meyer (paul-meyer) wrote :

No problem and thanks, Scott.

BTW, we've also started noticing mkfs failures on 14.04, which we think is related to an async partition table reread triggered by 'sgdisk -p /dev/sdb' (similar to the issue outlined here: https://github.com/Azure/WALinuxAgent/pull/183).
It might be a good idea to insert 'blockdev --rereadpt' before attempting mkfs. I can make a merge proposal for that and switch some of these other methods to using blockdev as well?

Revision history for this message
Scott Moser (smoser) wrote :

Paul, I pulled this.
There is actually lots of errors when requesting partition re-read.
I'm not sure on 14.04, but in newer releases of udev, it actually is watching RW open on block devices, and whenever a close of such a file descriptor occurs, it requests a reread.

see /lib/udev/rules.d/60-block.rules

The end result is if you do something like:
  wipefs --all /dev/sdb
  blockdev --reread-pt /dev/sdb

I just verified this on 16.04 kvm vm:
sudo sh -c 'dev=/dev/vdb; echo 2048, > sfdisk_in; while :; do sfdisk $dev <sfdisk_in && blockdev --rereadpt $dev || break; done'

eventually you'll see:
blockdev: ioctl error on BLKRRPART: Device or resource busy

Fun, eh?

Revision history for this message
Scott Moser (smoser) wrote :

Actually, the example above was bogus, because i didn't udevadm settle after sfdisk would have called rereadpt. However, this one shows the problem:

sudo sh -c 'dev=/dev/vdb;
  while :; do
    dd if=/dev/null of=$dev bs=1 count=0 &&
    blockdev --rereadpt $dev || break;
    udevadm settle; done'

There, we open /dev/vdb for RW via 'dd', close it, and then call blockdev --rereadpt.
After calling blockdev --rereadpt we do 'udevadm settle' so that any events that blockdev's reread are fully processed. Then, we do it all again (and again).

Eventually you'll see blockdev complain:
 blockdev: ioctl error on BLKRRPART: Device or resource busy

Revision history for this message
Paul Meyer (paul-meyer) wrote :

Hmmm, yeah that works on U14 as well... so would a 'udevadm settle' before a mkfs help prevent '/dev/xxx1 is not a block device'?

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/cloudinit/config/cc_disk_setup.py b/cloudinit/config/cc_disk_setup.py
2index b642f1f..352cd98 100644
3--- a/cloudinit/config/cc_disk_setup.py
4+++ b/cloudinit/config/cc_disk_setup.py
5@@ -356,7 +356,10 @@ def get_mbr_hdd_size(device):
6
7 def get_gpt_hdd_size(device):
8 out, _ = util.subp([SGDISK_CMD, '-p', device])
9- return out.splitlines()[0].split()[2]
10+ for line in out.splitlines():
11+ if line.startswith("Disk"):
12+ return line.split()[2]
13+ raise Exception("Failed to get %s size from sgdisk" % (device))
14
15
16 def get_hdd_size(table_type, device):

Subscribers

People subscribed via source and target branches