volume size pod zfs

Bug #1858201 reported by Luis Rodriguez
Affects            Status        Importance  Assigned to     Milestone
MAAS               Fix Released  Medium      Newell Jensen   2.8.0b1
libvirt (Ubuntu)   Invalid      Undecided   Unassigned      -

Bug Description

I am trying to create a pod on a MAAS node that has virsh installed; I added a specific ZFS storage pool.

When trying to create the pod I see the following error in /var/log/syslog:

Jan 03 18:13:00 node34 libvirtd[3340]: 2020-01-03 12:43:00.310+0000: 3341: error : virCommandWait:2601 : internal error: Child process (/sbin/zfs create -s -V 8789063K zpool/52a7cb41-4241-4336-b624-1f930ff398ed) unexpected exit status 1: cannot create 'zpool/52a7cb41-4241-4336-b624-1f930ff398ed': volume size must be a multiple of volume block size

No volume size setting in GB matches the 8K volume block size that ZFS requires.
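
[Editor's note: a quick Python check of that claim.]

size_kib = 8789063   # the -V argument from the syslog line above, in KiB
print(size_kib % 8)  # -> 7: not a multiple of the 8K block size, so zfs refuses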

In MAAS, when clicking the "Create Pod" button, I get the following error:

Add another device
 Pod unable to compose machine: Unable to compose machine because: Failed talking to pod: Start tag expected, '<' not found, line 1, column 1 (<string>, line 1)

Related branches

Revision history for this message
Luis Rodriguez (laragones) wrote :

Creating a storage volume of size 16G worked.

Revision history for this message
Newell Jensen (newell-jensen) wrote :

Luis, can you please tell us which volume size you were trying when this failed, so we can try to reproduce?

Revision history for this message
Newell Jensen (newell-jensen) wrote :

Never mind, it is in the create command. I will try to reproduce this.

Changed in maas:
assignee: nobody → Newell Jensen (newell-jensen)
Revision history for this message
Luis Rodriguez (laragones) wrote :

No problem. I tried all the options (only the GB option is available) from 8 to 16 in increments of 1, and only 16 worked.

For example, you are using -V 8789063K for 8G, and it is not a multiple of the 8K block size.
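
[Editor's sketch: the arithmetic behind "only 16 worked", assuming MAAS requests N x 10^9 bytes and libvirt rounds up to whole KiB; both assumptions are confirmed later in this thread. Note that an 8G request would give 7812500K; the logged 8789063K actually corresponds to 9G, as worked out below.]

# For each GB option MAAS offers (8..16), compute the KiB value libvirt
# would pass to `zfs create -V` and check 8 KiB alignment.
for n in range(8, 17):
    kib = -(-n * 10**9 // 1024)  # ceiling division, like libvirt's VIR_DIV_UP
    print(n, kib, "ok" if kib % 8 == 0 else "fails")
# Only n = 16 prints "ok" (15625000 KiB), matching the observation above.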

Revision history for this message
Newell Jensen (newell-jensen) wrote :

Luis, what version of MAAS are you using? The latest MAAS has ZFS support on the root disk when deploying Ubuntu. See https://ubuntu.com/blog/deploying-ubuntu-root-on-zfs-with-maas

Changed in maas:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Luis Rodriguez (laragones) wrote :

I am on MAAS 2.6.2. I tried ZFS but it didn't work; only now, with your post, did I realize it requires GPT.

My HDDs are all smaller than 2TB, and the system where I wanted to install it doesn't support UEFI/EFI.

Revision history for this message
Newell Jensen (newell-jensen) wrote :

I was able to reproduce this. As MAAS only talks to libvirt via SSH, handing it the parameters to create a VM, and as this is not an issue with other storage backends, I believe this is a libvirt bug: libvirt is ultimately the one calling the command that creates the storage volume.

Revision history for this message
Bryce Harrington (bryce) wrote :

Newell, what are the args / steps you're using to repro the issue in libvirt?

Changed in libvirt (Ubuntu):
status: New → Incomplete
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Ack to Bryce's question for details; in addition, a list of the commands used to create the pool (if any) and a dumpxml of the XML that defines the ZFS storage pool / volumes would be great.

Revision history for this message
Luis Rodriguez (laragones) wrote :

Please note that from MAAS you can't specify the size in KB, only in GB, yet the command being executed has its argument in KB. That translation to 8789063K is not a multiple of the 8K block size. This is the command being executed by libvirt, I suppose (taken from the log):

/sbin/zfs create -s -V 8789063K zpool/52a7cb41-4241-4336-b624-1f930ff398ed

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Note we can derive that allocation was not set: we see "-s" (capacity != allocation), but we don't see -o, so allocation is "0" (likely unset).

    if (vol->target.capacity != vol->target.allocation) {
        virCommandAddArg(cmd, "-s");
        if (vol->target.allocation > 0) {
            virCommandAddArg(cmd, "-o");

Further we have:
    virCommandAddArg(cmd, "-V");
    virCommandAddArgFormat(cmd, "%lluK",
                           VIR_DIV_UP(vol->target.capacity, 1024));

Currently the code makes whatever you pass it 1k aligned (unconditionally).
That is in src/storage/storage_backend_zfs.c.
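
[Editor's sketch of that rounding in Python, not the libvirt C code. The -V value from the bug's log line falls out of a 9,000,000,000-byte request, as worked out further below.]

def zfs_v_arg(capacity_bytes: int) -> str:
    # VIR_DIV_UP(capacity, 1024) is a ceiling division, so the result is
    # only guaranteed to be 1 KiB aligned, never 8 KiB aligned.
    kib = -(-capacity_bytes // 1024)
    return f"-V {kib}K"

print(zfs_v_arg(9_000_000_000))  # -V 8789063K; 8789063 % 8 == 7, so zfs rejects it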

Code-wise it would be easy to make it 8k aligned, but I'd need to know whether that is a general ZFS requirement or just one due to the way you set up your ZFS.
Note: the code has been that way since ZFS support was added in 2014; not much has changed there.

But I am starting to guess too much...
Still waiting for the commands and XMLs used to reproduce this, as asked above.

Revision history for this message
Newell Jensen (newell-jensen) wrote :

Luis,

Do you still happen to have your ZFS setup, so we can get the requested information (XMLs etc.)? My setup was deleted and I have been busy with other higher-priority items, so it might take me a bit to get to this if you don't have it.

Cheers

Revision history for this message
Luis Rodriguez (laragones) wrote : Re: [Bug 1858201] Re: volume size pod zfs

Yes. What should I do?


Revision history for this message
Newell Jensen (newell-jensen) wrote :

$ virsh dumpxml <machine-name-that-maas-created-that-caused-issues>
$ virsh pool-dumpxml <name-of-zfs-pool>
$ /sbin/zfs create -s -V 8789063K zpool/52a7cb41-4241-4336-b624-1f930ff398ed

Christian, if there are any other specific commands that Luis should execute, since he still has his setup, please let him know.

Revision history for this message
Luis Rodriguez (laragones) wrote :

root@node35:~# virsh dumpxml testers

<domain type='kvm'>
  <name>testers</name>
  <uuid>ad6b59c9-76b5-4481-a0bb-dc896782f27e</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-bionic'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='piix3-uhci'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:41:d7:d3'/>
      <source bridge='br1'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <serial type='pty'>
      <log file='/var/log/libvirt/qemu/testers-serial0.log' append='off'/>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='pty'>
      <log file='/var/log/libvirt/qemu/testers-serial0.log' append='off'/>
      <target type='serial' port='0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes'>
      <listen type='address'/>
      <image compression='off'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </memballoon>
  </devices>
</domain>

root@node35:~# virsh pool-dumpxml zpool

<pool type='zfs'>
  <name>zpool</name>
  <uuid>93a40940-eed8-47a7-8eba-3559e83bb24c</uuid>
  <capacity unit='bytes'>1245540515840</capacity>
  <allocation unit='bytes'>782941757440</allocation>
  <available unit='bytes'>462598758400</available>
  <source>
    <name>zpool</name>
  </source>
  <target>
    <path>/dev/zvol/zpool</path>
  </target>
</pool>

I had the pool created as a mirror of two HDD partitions with no custom parameters:

zpool create zpool -m /data/zpool mirror /dev/sda4 /dev/sdb4

virsh pool-define-as --name zpool --source-name zpool ...

Revision history for this message
Luis Rodriguez (laragones) wrote :

One last thing: after I tried to create the machine in MAAS and it gave the error, the machine is not shown among MAAS's composed machines, but it is shown in virsh list --all, so I have to delete it manually.


Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Thanks for the steps you used to create your pool.
With that I recreated the issue:

$ sudo zpool create zpool -m /data/zpool mirror /dev/nvme0n1p3 /dev/nvme0n1p4
$ virsh pool-define-as --name zpool --source-name zpool --type zfs
$ virsh pool-start zpool
$ virsh pool-autostart zpool

Your pool is much bigger than mine, it seems; let's see if that plays a role. Mine:
  <capacity unit='bytes'>49928994816</capacity>
  <allocation unit='bytes'>139776</allocation>
  <available unit='bytes'>49928855040</available>

The data you sent is already good; the one thing I'm missing is the XML that MAAS sends to libvirt to actually create the volume from the pool.
The guest definition that was added also has no disks, so I have to make assumptions.
For now I'll make up my own, but adding that would help if I can't reproduce.

With the above I let libvirt create a volume of 8GB (the report says 8-15G fail and 16G works).

In libvirt you can pass a number of bytes (the default) or scaled integers: suffixes like G and GiB mean powers of 2, while GB means powers of 10. For the test I tried different ones here.

You can also tell it to allocate a specific amount up front; I have played with that as well.

I'll always show the XML that was used, so you can match it against the requests sent by MAAS.

Basic:
$ virsh vol-create-as zpool testvol1 8G
(worked)
matches:
<volume>
  <name>testvol1</name>
  <capacity>8589934592</capacity>
</volume>

with allocation:
$ virsh vol-create-as zpool testvol2 8G --allocation 8G --print-xml
matches:
<volume>
  <name>testvol2</name>
  <capacity>8589934592</capacity>
  <allocation>8589934592</allocation>
</volume>

Of course, if you pass it a size that doesn't match ZFS's 8k requirement, it will fail.
Here using the power-of-10 based GB suffix:
$ virsh vol-create-as zpool testvol3 8GB
error: Failed to create vol testvol3
error: internal error: Child process (/sbin/zfs create -V 7812500K zpool/testvol3) unexpected exit status 1: cannot create 'zpool/testvol3': volume size must be a multiple of volume block size
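
[Editor's sketch of the arithmetic behind the two suffixes:]

# Why "8G" (power of 2) passes and "8GB" (power of 10) fails against an
# 8 KiB volume block size.
for label, size_bytes in (("8G", 8 * 1024**3), ("8GB", 8 * 10**9)):
    kib = -(-size_bytes // 1024)  # libvirt's 1 KiB round-up
    print(label, kib, kib % 8)    # 8G: 8388608, 0 (ok); 8GB: 7812500, 4 (fails)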

Your bug report shows 8789063K, which in bytes is a suspicious 9000000512.
Are you sure there isn't some rounding going on in MAAS already?
We'd really need to see what you passed to libvirt to allocate the disk.
Newell, can you get that XML content?

Right now I'm assuming you either pass the size in bytes and do not round to 8k on your side, or you pass it with "GB" instead of "G" as the suffix.

P.S. I'll try to catch your people here at the sprint; maybe one of them can help me with that.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

FYI: I'm assuming this example data was from a 9G test that became 9000000512 bytes requested.
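
[Editor's sketch checking that guess against the numbers in the report:]

requested = 9_000_000_000    # 9 GB, power of 10
kib = -(-requested // 1024)  # libvirt's 1 KiB round-up (VIR_DIV_UP)
print(kib)                   # 8789063, the -V value from syslog
print(kib * 1024)            # 9000000512, the "suspicious" byte count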

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

libvirt does 1k alignment; that is what added the 512. MAAS seems to request just 9000000000, and that value isn't OK, as the alignment is missing.
The interface in libvirt is dumb in the sense that it doesn't know the shift values - whoever created the zpool would know (or it would take a lot of extra probing). Unfortunately, in your case MAAS didn't create it either.

Details on the upper bounds (see the sketch below):
- The default ashift in a zpool ends up representing 8k, but it could be smaller (down to 9),
  so libvirt might not want to align to a higher value.
- It also depends on the sector size of the disk (512/4k).
- I think the effective upper limit might be ashift 16 (the max per the zfs man page);
  13 is 8k, so 16 would be 64k.
  With 4k sectors that would be 512k.
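
[Editor's sketch of the block sizes those ashift values imply:]

# A zpool's ashift value implies a block size of 2**ashift bytes.
for ashift in (9, 12, 13, 16):
    print(ashift, 2**ashift)  # 512, 4096, 8192 (8k), 65536 (64k)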

After talking with Bjorn (thanks for the debug help!): MAAS might just align to e.g. 1M. Since you only allocate in GiB, that isn't a huge loss, and it would work with whatever setup a user could give the zpool.

I'll set the libvirt task to Invalid; MAAS can evaluate aligning the size before sending the request through vol-create-as.

Changed in libvirt (Ubuntu):
status: Incomplete → Invalid
Revision history for this message
Newell Jensen (newell-jensen) wrote :

Luis,

If you still have your zfs setup, could you please test the linked branch? You would only need to update /usr/lib/python3/dist-packages/provisioningserver/drivers/pod/virsh.py on your MAAS host with the changes and then restart your rack controller with:

$ sudo systemctl restart maas-rackd

This branch aligns the zfs vol-create-as call to MiB, which seems to be what libvirt wants.

Thanks.
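
[Editor's note: the linked branch is not shown on this page. Below is a minimal sketch of the MiB-alignment idea with a hypothetical helper name, not the actual MAAS patch.]

def size_in_mib(requested_bytes: int) -> int:
    # Round the requested size up to a whole number of MiB. Any whole MiB
    # is 1024 KiB, hence a multiple of an 8 KiB (or even 64 KiB) volblocksize.
    mib = 1024 * 1024
    return -(-requested_bytes // mib)

print(size_in_mib(9_000_000_000))  # 8584 MiB == 8790016 KiB, a multiple of 8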

Changed in maas:
status: Triaged → In Progress
Revision history for this message
Luis Rodriguez (laragones) wrote :

I'm away until Feb 3rd; I'll check when I get back.


Changed in maas:
milestone: none → next
status: In Progress → Fix Committed
Revision history for this message
Luis Rodriguez (laragones) wrote :

Sorry for the late response.

Is this included in 2.7? I wasn't able to test the files on my current installation.

Thanks,
Luis

Revision history for this message
Newell Jensen (newell-jensen) wrote :

Luis,

This did not make it into 2.7 but will be part of 2.8. If you want to work around the issue yourself, or test it, you can patch the Virsh pod driver with the diff from the linked branch.

Thanks

Alberto Donato (ack)
Changed in maas:
status: Fix Committed → Fix Released
milestone: next → 2.8.0b1