ubuntu/focal64 very slow to boot and reboots once

Bug #1874453 reported by Jake Cobb
206
This bug affects 74 people
Affects Status Importance Assigned to Milestone
cloud-images
Fix Released
Undecided
John Chittum
Focal
Fix Released
Undecided
John Chittum
Groovy
Fix Released
Undecided
John Chittum
livecd-rootfs (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Fix Released
Undecided
Unassigned

Bug Description

SRU

[IMPACT]

* Users are unable to boot a vagrant image, as the boot time without a serial console connection exceeds the vagrant timeout
* bug is focal as well as latest devel devel release.
* Upload fixes the bug by creating a serial console connection to a null file

[TEST CASE]

* tests require vagrant and virtualbox to be installed on the testing machine.
* livecd-rootfs build with ubuntu-cpc project
* download vagrant .box file and add (`vagrant box add <PATH TO BOX> --name focal-test-image`)
* create testing directory and cd
* create Vagrantfile to use focal-test-image as the image
```
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
    config.vm.box = "focal-test-image"
end
```
* run `vagrant up`
* observe that image successfully boots (should boot in < 60s)
* vagrant halt
* update vagrant file to have a new serial connection and send to a log (this shows that the default NULL connection can be overridden by a local user).
```
Vagrant.configure("2") do |config|
    NOW = Time.now.strftime("%d.%m.%Y.%H:%M:%S")
    FILENAME = "serial-debug-%s.log" % NOW
    config.vm.box = "focal-test-image"
    config.vm.provider "virtualbox" do |vb|
       vb.customize [ "modifyvm", :id, "--uart1", "0x3F8", "4" ]
       vb.customize [ "modifyvm", :id, "--uartmode1", "file", File.join(Dir.pwd, FILENAME) ]
    end
end
```
* run vagrant up and observe that a serial connection has been established, and a log is written. Log should be named `serial-debug-$NOW.log` where $NOW is the current date and time (of the vagrant up)

[Regression Potential]

* possibility of incorrect configuration, leading to either the serial console not engaging properly, or a file being written instead of sent to a NULL file.
* configuration stanza could be incorrect depending upon version(such as File::NULL not being available)

[ORIGINAL DESCRIPTION]

Using `ubuntu/focal64` version `20200423.0.0`, the boot process is extremely slow and the box always reboots once before completing a successful boot. With default settings, this causes a timeout in Vagrant waiting for SSH to be available. Setting a long enough timeout it does eventually boot successfully. For comparison, `ubuntu/bionic64` boots successfully in roughly 10 seconds on my machine.

This appears to be the big hang-up:

Apr 23 14:00:12 ubuntu-focal kernel: [ 0.219784] printk: console [tty1] enabled
Apr 23 14:00:12 ubuntu-focal kernel: [ 76.448371] printk: console [ttyS0] enabled

I have a hard time telling where exactly it reboots because my /var/log/syslog only ends up with the second boot. I can watch it in Virtualbox though and it suddenly reboots, then the second boot process runs to completion. The long delay shown above happens both times making the total time to a successful boot very long. Here is the head of my /var/log/syslog:

Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished Remount Root and Kernel File Systems.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished Uncomplicated firewall.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Mounting FUSE Control File System...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Mounting Kernel Configuration File System...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Condition check resulted in Rebuild Hardware Database being skipped.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Starting Flush Journal to Persistent Storage...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Condition check resulted in Platform Persistent Storage Archival being skipped.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Starting Load/Save Random Seed...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Starting Apply Kernel Variables...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Starting Create System Users...
Apr 23 14:00:12 ubuntu-focal systemd-sysctl[434]: Not setting net/ipv4/conf/all/promote_secondaries (explicit setting exists).
Apr 23 14:00:12 ubuntu-focal systemd-sysctl[434]: Not setting net/ipv4/conf/default/promote_secondaries (explicit setting exists).
Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished udev Coldplug all Devices.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Mounted FUSE Control File System.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Mounted Kernel Configuration File System.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished Flush Journal to Persistent Storage.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished Load/Save Random Seed.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished Apply Kernel Variables.
Apr 23 14:00:12 ubuntu-focal systemd-sysusers[435]: Creating group systemd-coredump with gid 999.
Apr 23 14:00:12 ubuntu-focal systemd-sysusers[435]: Creating user systemd-coredump (systemd Core Dumper) with uid 999 and gid 999.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Starting udev Wait for Complete Device Initialization...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished Create System Users.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Starting Create Static Device Nodes in /dev...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Finished Create Static Device Nodes in /dev.
Apr 23 14:00:12 ubuntu-focal systemd[1]: Starting udev Kernel Device Manager...
Apr 23 14:00:12 ubuntu-focal systemd[1]: Started udev Kernel Device Manager.
Apr 23 14:00:12 ubuntu-focal kernel: [ 0.000000] Linux version 5.4.0-26-generic (buildd@lcy01-amd64-029) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #30-Ubuntu SMP Mon Apr 20 16:58:30 UTC 2020 (
Ubuntu 5.4.0-26.30-generic 5.4.30)
Apr 23 14:00:12 ubuntu-focal kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-26-generic root=PARTUUID=fac6339f-01 ro console=tty1 console=ttyS0

Subsequent boots are also slow but don't seem to have the extra reboot.
---
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 May 7 16:02 seq
 crw-rw---- 1 root audio 116, 33 May 7 16:02 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu27
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: N/A
CasperMD5CheckResult: skip
DistroRelease: Ubuntu 20.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:

Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
MachineType: innotek GmbH VirtualBox
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 vboxvideodrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-29-generic root=PARTUUID=ff8ca6a0-01 ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 5.4.0-29.33-generic 5.4.30
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-29-generic N/A
 linux-backports-modules-5.4.0-29-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: focal uec-images
Uname: Linux 5.4.0-29-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 12/01/2006
dmi.bios.vendor: innotek GmbH
dmi.bios.version: VirtualBox
dmi.board.name: VirtualBox
dmi.board.vendor: Oracle Corporation
dmi.board.version: 1.2
dmi.chassis.type: 1
dmi.chassis.vendor: Oracle Corporation
dmi.modalias: dmi:bvninnotekGmbH:bvrVirtualBox:bd12/01/2006:svninnotekGmbH:pnVirtualBox:pvr1.2:rvnOracleCorporation:rnVirtualBox:rvr1.2:cvnOracleCorporation:ct1:cvr:
dmi.product.family: Virtual Machine
dmi.product.name: VirtualBox
dmi.product.version: 1.2
dmi.sys.vendor: innotek GmbH

Related branches

Revision history for this message
Jake Cobb (error404) wrote :
Revision history for this message
Jake Cobb (error404) wrote :

To give concrete time comparison on my machine, `ubuntu/bionic64` (20200416.0.0) boots in about 30 seconds, where `ubuntu/focal64` (20200423.0.0) takes over 4 minutes. That is on boots of a machine that has already booted once so does not include the initial reboot problem.

Revision history for this message
Robert C Jennings (rcj) wrote :

Thank you for reporting this bug, we will attempt a local recreate and debug.

tags: added: id-5ea301b83ffa4678e126f5af
Revision history for this message
Reinier Schoof (reinier-skoef) wrote :

The slowness seems to be caused by outputting the kernel logs to the serial console. If I remove the `console=ttyS0` part from the kernel command line and reboot, it's as fast as the bionic box. However, the bionic box also has the `console=ttyS0` parameter set, so it doesn't necessarily explain why it causes the slowdown.

Revision history for this message
Reinier Schoof (reinier-skoef) wrote :

For reference, the delay after the initial message a console is enabled is gone:

[ 0.173966] Console: colour VGA+ 80x25
[ 0.217870] printk: console [tty1] enabled
[ 0.218068] ACPI: Core revision 20190816
[ 0.218382] APIC: Switch to symmetric I/O mode setup

Revision history for this message
Reinier Schoof (reinier-skoef) wrote :

The error message when it reboots the first time is:

VFS: Cannot open root device "PARTUUID=e5a81102-01" or unknown-block(0,0): error -6
Please append a correct "root=" boot option: here are the available partitions:

and then no partitions are shown but instead the kernel panics

Revision history for this message
John Chittum (jchittum) wrote :

To aid in debugging more, could anyone affected provide the following information:

1. relevant Vagrantfile info (are you doing any major customizations?)
2. vagrant version
3. virtualbox version
4. host OS
5. relevant hw specs (as hardware can affect boot times)
6. run the vagrant up commands as `time vagrant up`. If you have to pull the box new try to time the download so it doesn't affect the timings too much
7. vagrant ssh in and run the following:

* systemd-analyze
* cloud-init analyze show

Those will show more info about boot timings and cloud-init timings.

8. if possible, run with a serial console debug printed to file. you can do that by adding the following snippet to your Vagrantfile

  NOW = Time.now.strftime("%d.%m.%Y.%H:%M:%S")
  FILENAME = "serial-debug-%s.log" % NOW
  config.vm.provider "virtualbox" do |vb|
     vb.customize [ "modifyvm", :id, "--uart1", "0x3F8", "4" ]
     vb.customize [ "modifyvm", :id, "--uartmode1", "file",
     File.join(Dir.pwd, FILENAME) ]
  end

I have reproduced the double booting issue. However in testing, I'm not seeing extreme slow boots on focal. My testing is showing ~30s slower boot time compared to Bionic, which is in line with seeing a kernel panic and reboot. All tests used the same basic Vagrantfile, only setting teh box and adding the debug serial log. I primarily tested with the latest versions of the Vagrant boxes, however I also tried the reported box in this case and found no difference from the other tests. My test setups

Test Setup1 (direct on hw):

system76 Gazellev2
Intel® Core™ i7-9750H CPU @ 2.60GHz × 12
32gb 3300Hz RAM
Ubuntu 18.04
Vagrant 2.2.7
Virtualbox 6.1.6

Tests was a simple Vagrantfile booting a basic box with a serial console debug printed to file. Command was

`time vagrant up`

bionic: 32s
eoan: 52s
focal: 49s

Test Setup2 virtualized:

To create a clean environment for testing, I used separate multipass VMs

1. multipass launch -m 6gb --name $SUITE-vagrant release:bionic
2. multipass mount ./ /vagrant # this was to ensure I had the Vagrantfile and could get the console log
3. multipass exec $SUITE-vagrant bash
4. sudo apt-get update && sudo apt-get dist-upgrade
5. exit and multipass restart $SUITE-vagrant
6. exec back in, sudo apt-get install virtualbox vagrant
7. time vagrant up

These times were slower due to the virtualization, but had similar deltas. Times have had the download of the boxes removed

bionic: 1:23
eoan: 1:46
focal: 1:50

Considering the double boot issue, the difference between bionic and focal makes sense. Please add more info to help identify further issues than the kernel panic and reboot

Changed in cloud-images:
status: New → Confirmed
Revision history for this message
Reinier Schoof (reinier-skoef) wrote :

As soon as I enabled the serial device, given the Vagrantfile snippet, booting the system was as quick as expected. Relevant output from the serial log:

[ 0.488522] Console: colour VGA+ 80x25
[ 0.540584] printk: console [tty1] enabled
[ 0.902395] printk: console [ttyS0] enabled
[ 0.904230] ACPI: Core revision 20190816
[ 0.906123] APIC: Switch to symmetric I/O mode setup

I'm running vagrant 2.2.7 with virtualbox 6.0.14 on macOS Catalina 10.15.3 on a MacBook Pro. No real changes to Vagrantfile after `vagrant init`.

It seem like output logs to a non-attached serial device is 'expensive' where it wasn't on the bionic system. This relative old issue was recently updated with similar problems about ubuntu/eoan64: https://www.virtualbox.org/ticket/12162

Revision history for this message
Jake Cobb (error404) wrote :

Two 2015 Macbook Pros, both:

2.5 GHz Quad-Core Intel Core i7
16 GB 1600 MHz DDR3

Catalina, Vagrant 2.2.5, VirtualBox 6.0.10
Sierra, Vagrant 2.2.6, VirtualBox 6.0.20

No difference in behavior between these two.

Vagrantfile from `vagrant init ubuntu/focal64` fresh, only modified to add:

  config.vm.boot_timeout = 1800

Otherwise it times out after 5 minutes. I ran once to get the box downloaded, then `vagrant destroy`, then timed the up:

real 5m8.388s
user 0m12.358s
sys 0m9.439s

systemd-analyze:
Startup finished in 44.743s (kernel) + 37.387s (userspace) = 1min 22.130s
graphical.target reached after 36.319s in userspace

cloud-init analyze show (full output as attachment):
Total Time: 1.64900 seconds

Revision history for this message
Jake Cobb (error404) wrote :

As for Reinier, adding the serial logging to file resulted in a reasonable boot time:

real 0m45.161s
user 0m5.788s
sys 0m3.188s

Revision history for this message
Jake Cobb (error404) wrote :

Just for completeness, with the serial logging to file, the analyze times:

systemd-analyze:
Startup finished in 4.279s (kernel) + 24.547s (userspace) = 28.826s
graphical.target reached after 23.436s in userspace

cloud-init analyze show:
Total Time: 1.86400 seconds

Revision history for this message
Jake Cobb (error404) wrote :

systemd-analyze blame for a slow boot.

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1874453

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Revision history for this message
John Chittum (jchittum) wrote :

Attached are some prints and logs. lshw shows that it's using a scsi storage control with the mptspi driver. serial console logs show that initramfsless boots using a a generic scsi driver which appears to fail to load the root device. initramfs pulls in the mptspi driver properly, and things boot.

system details

host OS: Focal
vagrant: 2.2.7
virtualbox: 6.1.6_Ubuntu r137129

Revision history for this message
Jake Cobb (error404) wrote :

Same behavior on eoan, both the one-off kernel panic and slow boot without overriding the console settings.

tags: added: eoan
Revision history for this message
Jake Cobb (error404) wrote : CurrentDmesg.txt

apport information

tags: added: apport-collected uec-images
description: updated
Revision history for this message
Jake Cobb (error404) wrote : Lspci.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : Lspci-vt.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : PciMultimedia.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : ProcCpuinfoMinimal.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : ProcModules.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : UdevDb.txt

apport information

Revision history for this message
Jake Cobb (error404) wrote : WifiSyslog.txt

apport information

Revision history for this message
John Chittum (jchittum) wrote :

Vagrant boxes with updated kernels have been pushed. My current tests are showing boot times in the 30s range.

Could folks commenting on slow boot speeds in Focal and Eoan please run a `vagrant box update` on their current setups?

https://app.vagrantup.com/ubuntu/boxes/eoan64/versions/20200522.0.0

and

https://app.vagrantup.com/ubuntu/boxes/focal64/versions/20200518.0.0

Have the latest kernels. Any produced after those dates should also load faster.

Revision history for this message
Reinier Schoof (reinier-skoef) wrote :

It's still very slow. When I don't attach a serial device like you gave an example for to the virtualbox, it still has that 90s pause between mentioned 'printk' lines during booting and booting is slow in general.

When I do attach the serial device, it still panics the first time, but the whole thing is so fast that vagrant doesn't notice or times out. So that setup is now workable. Still doesn't explain why it behaves so strange without a serial device.

Revision history for this message
John Chittum (jchittum) wrote :

is this post updating the boxes? could you please post which boxes are still kernel panicking, as that should have been fixed with https://bugs.launchpad.net/cloud-images/+bug/1873506

I'm having a bear of a time trying to reproduce the slow boots. I'm wondering if it's host specific to MacOS and Virtualbox on MacOS. I haven't been able to reproduce with Ubuntu as my host machine.

Revision history for this message
Jake Cobb (error404) wrote :

Both problems persist on 20200522.0.0, here is the kernel panic and reboot:

[ 0.883971] VFS: Cannot open root device "PARTUUID=43931bda-01" or unknown-block(0,0): error -6
[ 0.885744] Please append a correct "root=" boot option; here are the available partitions:
[ 0.887371] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.888956] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-31-generic #35-Ubuntu
[ 0.890427] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 0.891863] Call Trace:
[ 0.892322] dump_stack+0x6d/0x9a
[ 0.892865] panic+0x101/0x2e3
[ 0.893466] mount_block_root+0x23f/0x2e8
[ 0.894322] mount_root+0x38/0x3a
[ 0.894984] prepare_namespace+0x13f/0x194
[ 0.895776] kernel_init_freeable+0x231/0x255
[ 0.896743] ? rest_init+0xb0/0xb0
[ 0.897499] kernel_init+0xe/0x100
[ 0.898283] ret_from_fork+0x35/0x40
[ 0.899058] Kernel Offset: 0x32400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 0.900974] ACPI MEMORY or I/O RESET_REG.
[ 0.000000] Linux version 5.4.0-31-generic (buildd@lgw01-amd64-059) (gcc version 9.3.0 (Ubuntu 9.3.0-10ubuntu2)) #35-Ubuntu SMP Thu May 7 20:20:34 UTC 2020 (Ubuntu 5.4.0-31.35-generic 5.4.34)

Revision history for this message
John Chittum (jchittum) wrote :

I've set the kernel component to confirmed to move this ticket forward.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
John Chittum (jchittum) wrote :

Was able to recreate the slow boot. Put together a clean environment, made a simple focal Vagrantfile only pulling the box, saw the same things. Confirmed on a Focal host.

Revision history for this message
John Chittum (jchittum) wrote :
Revision history for this message
Moshe Katz (kohenkatz) wrote :

Re #28

I have both problems using the 20200522.0.0 image on a Windows host.

Host specs: Lenovo Thinkpad P50, Intel Core i7-6700HQ, Windows 10 v1909, tried with both VirtualBox 6.0.4 and 6.1.8.

Full details at https://askubuntu.com/questions/1243582/unable-to-boot-official-vagrant-boxes-for-20-04

Revision history for this message
Moshe Katz (kohenkatz) wrote :

Interestingly, using the VirtualBox GUI to boot a VM created from the 20200522.0.0 vagrant box causes the entire VirtualBox GUI to hang. The only way out of it is to force quit the VM processes.

Revision history for this message
Moshe Katz (kohenkatz) wrote :

Note: My comment (#34) above is in VirtualBox 6.1.8 and without the Serial port redirection mentioned in comment #7. With the serial port redirection everything seems to work properly for me.

Revision history for this message
Ashish Nigam (nigam214) wrote :

added the following lines in Vagrantfile to fix the issue temporary. It wasn't required in Ubuntu 18.04

vbit.vm.provider "virtualbox" do |vb|
  vb.customize [ "modifyvm", :id, "--uartmode1", "file", File::NULL ]
end

Revision history for this message
Brian Pontarelli (brian-pontarelli) wrote :

I just tried firing up 20200629.0.0 and it has the same issue. I'm also not able to terminate the VMs at all. I've waiting 10+ minutes for them to stop and nothing. I eventually just killed the PIDs for everything VirtualBox related.

Revision history for this message
Dave McNeill (davemac30) wrote :

Since I just hit this issue and found this thread amazingly useful, I thought it might help if I post a theory about what's going on here.

The official ubuntu/bionic64 box was configured, as a default, to log console output to a file in the Vagrantfile directory. The embedded Vagrantfile (on my system, currently this is `~/.vagrant.d/boxes/ubuntu-VAGRANTSLASH-bionic64/20200701.0.0/virtualbox/Vagrantfile`) contains the following:

  config.vm.provider "virtualbox" do |vb|
     vb.customize [ "modifyvm", :id, "--uart1", "0x3F8", "4" ]
     vb.customize [ "modifyvm", :id, "--uartmode1", "file", File.join(Dir.pwd, "ubuntu-bionic-18.04-cloudimg-console.log") ]
  end

When an instance of this box boots, the kernel command line contains `console=ttyS0` so console output is sent to ttyS0 which is virtualbox serial port 1, connected to the file specified. Everything works fine.

The official ubuntu/eoan64 and ubuntu/focal64 boxes, on the other hand, have this:

  config.vm.provider "virtualbox" do |vb|
     vb.customize [ "modifyvm", :id, "--uart1", "0x3F8", "4" ]
# Creating a console log file is not an expected behavior for vagrant boxes. LP #1777827
# vb.customize [ "modifyvm", :id, "--uartmode1", "file", File.join(Dir.pwd, "ubuntu-focal-20.04-cloudimg-console.log") ]
  end

So someone decided not to log console output at some point. Makes sense - vagrant boxes don't tend to do that by default.

Problem is, when the change was made to the Vagrantfile the kernel command prompt wasn't also changed to reflect that console -> ttyS0 was no longer required. In fact, since virtualbox was not even configuring a serial port on the machine this configuration appears to be massively detrimental to boot performance (I guess because every line the kernel tries to send to the kernel errors). This explains, I think, why the workaround given by @nigam214 works - it creates a dummy serial port so at least the kernel is happy.

I think these boxes should be configured at build-time not to include `console=ttyS0` in the kernel command line.

Revision history for this message
Stanislav German-Evtushenko (giner) wrote :
Revision history for this message
Stanislav German-Evtushenko (giner) wrote :
John Chittum (jchittum)
affects: linux (Ubuntu) → livecd-rootfs (Ubuntu)
John Chittum (jchittum)
Changed in cloud-images:
assignee: nobody → John Chittum (jchittum)
Revision history for this message
John Chittum (jchittum) wrote :

MP, taking giner's work and integrating all the needs for livecd-rootfs SRU process is going:

https://code.launchpad.net/~jchittum/livecd-rootfs/+git/livecd-rootfs/+merge/389918

Should be moving in Groovy soon, and then SRU'd back to Focal.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package livecd-rootfs - 2.687

---------------
livecd-rootfs (2.687) groovy; urgency=medium

  [ Stanislav German-Evtushenko <giner> / John Chittum ]
  * Send Vagrant serial connection to NULL. (LP: #1874453)

 -- Robert C Jennings <email address hidden> Tue, 15 Sep 2020 15:25:19 -0500

Changed in livecd-rootfs (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
John Chittum (jchittum) wrote :

fix in livecd-rootfs/master has been merged, built, and tested. Groovy boxes with any serial with a date 20200917 or newer should have the fix (we're working on getting this published now). I am also beginning the SRU procedure to backport the change into Focal.

Thank you giner for the providing the code to pass a serial to a null file handler in our base Vagrantfile.

Revision history for this message
John Chittum (jchittum) wrote :

Updated Groovy Vagrant box has been pushed to Atlas, and tested. Steps below, Vagrantfile at bottom:

1. create a minimal Vagrantfile for testing
2. `vagrant box update`
3. check box is at a serial 20200917 or newer (if done now, 20200917 should download)
4. `vagrant up`
5. observe it now boots faster (on my system, < 40s)

Vagrantfile

# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/groovy64"
end

Will begin SRU cycle for Focal with successful test of the public Vagrant box.

Revision history for this message
John Chittum (jchittum) wrote :

Also, for anyone that has run a Focal or Groovy box, and had it hang completely with it not showing up in Virtualbox GUI for stopping, the following can get the image unlocked for future work (without a reboot)

vboxmanage startvm <UUID or name> --type emergencystop
vboxmanage unregistervm <UUID or name>

If you're unsure of a UUID or name, try running a Vagrant command (such as vagrant halt). If it fails, Vagrant will print the command used which includes the UUID.

John Chittum (jchittum)
description: updated
John Chittum (jchittum)
description: updated
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Jake, or anyone else affected,

Accepted livecd-rootfs into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/livecd-rootfs/2.664.7 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in livecd-rootfs (Ubuntu Focal):
status: New → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
John Chittum (jchittum) wrote :

SRU for Focal testing.

Created 2 Vagrant boxes to cover the 2 main build cases of CPC (produced of the Vagrant image and user of the hook changed)

box A: Created via ubuntu-bartender(https://github.com/chrisglass/ubuntu-old-fashioned/tree/master/scripts/ubuntu-bartender) utilizing the source download of the proposed package (found here in the web UI). ran bartender using multipass as a provider with the following command

./ubuntu-bartender --livecd-rootfs-dir /home/jchittum/dev01/troubleshooting/vagrant-testing/1874453/livecd-rootfs --build-provider multipass -- --series focal --project ubuntu-cpc --image-target vagrant

box B: Created via a Launchpad build utilizing a recipe pointed to ~livecd-rootfs ubuntu/focal. This created a livecd-rootfs package in a personal PPA which was then used to run the build.

boxes were then added to the vagrant inventory

vagrant box add <downloaded box> --name focal-sru-test-from-(bartender || lp)

2 directories were then created from-bartender & from-lp

Ran the following 2 test cases:

[Test Case 1]

-- vagrant up with minimal configuration --

1. made minimal Vagrantfiles in each directory
```
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
    config.vm.box = "focal-sru-test-from-bartender" # Or from-lp
end
```
2. ran `time vagrant up`
3. observed that boxes successfully booted
4. observed that boxes booted in a reasonable amount of time (on my machine ~30s)
5. ensured the ability to SSH into the box
6. checked the guest system has mounted the default guest directory (./ on host mounted to /vagrant on guest)
7. exited host and destroyed boxes (vagrant halt && vagrant destroy)

Both boxes worked.

[Test Case 2]
1. Update Vagrantfiles to override default serial connection and write file

```
Vagrant.configure("2") do |config|
    NOW = Time.now.strftime("%d.%m.%Y.%H:%M:%S")
    FILENAME = "serial-debug-%s.log" % NOW
    config.vm.box = "focal-sru-test-from-bartender" # Or from-lp
    config.vm.provider "virtualbox" do |vb|
       vb.customize [ "modifyvm", :id, "--uart1", "0x3F8", "4" ]
       vb.customize [ "modifyvm", :id, "--uartmode1", "file", File.join(Dir.pwd, FILENAME) ]
    end
end
```
2. ran `time vagrant up`
3. Observed that boxes successfully booted
4. observed the log file was created
5. observed that boxes booted in a reasonable amount of time (on my machine ~30s)
6. ensured the ability to SSH into the box
7. checked the guest system has mounted the default guest directory (./ on host mounted to /vagrant on guest)

John Chittum (jchittum)
tags: added: verification-done-focal
removed: verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package livecd-rootfs - 2.664.7

---------------
livecd-rootfs (2.664.7) focal; urgency=medium

  [ Stanislav German-Evtushenko <giner> / John Chittum ]
  * Send Vagrant serial connection to NULL. (LP: #1874453)

 -- Robert C Jennings <email address hidden> Wed, 23 Sep 2020 13:32:32 -0500

Changed in livecd-rootfs (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for livecd-rootfs has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
John Chittum (jchittum) wrote :

I've confirmed that the latest Vagrant box published has the fix. I've also confirmed that users can overwrite the default passing of the serial console to a File::NULL and write to a different file.

For everyone affected, run a `vagrant box update` on Focal boxes. This will grab the latest box. If you cannot update your base, using the workaround of attaching a NULL handler will work. This is documented in the code attached to this MP, as well as below

Vagrant cloud link to latest:

https://app.vagrantup.com/ubuntu/boxes/focal64/versions/20201012.0.0

configuring a NULL file handler

```
  config.vm.provider "virtualbox" do |vb|
    vb.customize [ "modifyvm", :id, "--uart1", "0x3F8", "4" ]
    # Create a NULL serial port to skip console logging by default
    vb.customize [ "modifyvm", :id, "--uartmode1", "file", File::NULL ]
  end
```

Revision history for this message
Wal T (vooose) wrote :

@jchittum, confirmed also your fix using fresh `vagrant init ubuntu/focal64` and `vagrant up`
thx

Revision history for this message
Jake Cobb (error404) wrote :

Confirmed, thanks for your work on this.

Revision history for this message
Haggai Eran (haggai-eran) wrote :

@jchittum, thanks for the fix. I validated it and it seemed to work.

I came into a related issue however - I've created an image that is based on ubuntu/focal64 on a Linux system. When trying to use it on a Windows system, I got the following error:

```
Stderr: VBoxManage.exe: error: RawFile#0 failed to create the raw output file /dev/null (VERR_PATH_NOT_FOUND)

```

I believe that Ruby's File::NULL constant is translated into /dev/null when utilizing the focal64 image on Linux, and then it gets written into the resulting .ovf file.

Revision history for this message
Stanislav German-Evtushenko (giner) wrote :

@haggai-eran, what if you try to disable the port completely?
```
  config.vm.provider "virtualbox" do |vb|
    vb.customize [ "modifyvm", :id, "--uart1", "off" ]
  end
```

Revision history for this message
John Chittum (jchittum) wrote :

@haggan-era, could you open a new bug on this issue? Please provide the following information:

1. version of Windows, vagrant, virtualbox, and Ruby in use on the Windows computer
2. the Vagrantfile used to create the new .box
3. any steps to reproduce the issue.

On first smell it seems like a Ruby on Windows issue handling File::NULL. that was added Ruby 1.9.3 I think. I'd surmise that if you took the Vagrantfile, and ran the operation there, it'd work fine. Give that a try. If it is translating the object to a specific path reference, then that means File::NULL is platform dependent on creation of the box.

Revision history for this message
Haggai Eran (haggai-eran) wrote :

@giner, I tried setting the uart1 to "disconnected" in the past, but then the VM on Windows slows down to a halt.

@jchittum
After investigating it, I saw that indeed the OVF file contains "/dev/null". I used packer in the process, which basically does if I understand correctly creates a VM based on the Ubuntu vagrant box, modifies the VM, and then uses vagrant package to generate a new box. Therefore, I'm not sure this is an issue with Ubuntu.

If I just try to run the Ubuntu VM on Windows, I see that `File::NULL` is translated into `NUL`, and the VM seems to be working fine.

I think I'll try to include the line `vb.customize [ "modifyvm", :id, "--uartmode1", "file", File::NULL ]` in the packer-generated vagrantfile somehow.

Revision history for this message
Stanislav German-Evtushenko (giner) wrote :

@haggan-era, uartmode disconnected means that the port is available for the vm. If it uart set to of them then the port will not be available at all. In past the os wouldn't boot in this case (this is why the dummy port was introduced) at all however the latest images seem to work.

Revision history for this message
Stanislav German-Evtushenko (giner) wrote :

@haggan-era, uartmode disconnected means that the port is available for the vm. If uart is set to off then the port will not be available at all. In past the os wouldn't boot in such case (this is why the dummy port was introduced) however the latest images seem to work.

Revision history for this message
Haggai Eran (haggai-eran) wrote :

Thanks @giner, that's good to know.

If this is the case, why doesn't the official focal image use this solution (rather than setting the serial port to File::NULL)?

Revision history for this message
Stanislav German-Evtushenko (giner) wrote :

@haggan-era, I don't think it was possible before but now turning the port off completely could make sense. The following links seem to be related:
- https://bugs.launchpad.net/cloud-images/+bug/1573095/comments/58
- https://bugs.launchpad.net/cloud-images/+bug/1573095/comments/59

Revision history for this message
John Chittum (jchittum) wrote :

I've actually done some checks on latest versions, and it looks like we can. I'll make a public bug to match the internal ticket I made to remove the connection.

We've got a release tomorrow (20201022), so the change will go into the next devel release first, then get SRU'd back.

Revision history for this message
John Chittum (jchittum) wrote :

I've opened https://bugs.launchpad.net/cloud-images/+bug/1900852 related to removing uart1 connections in the base Vagrantfile. We can move discussion there related to the issue found for derivative images via Packer, and generally removing it.

I, too, had been under the impression that due to the console=ttyS0 line in the kernel setup that we wouldn't be able to just not have the connection. but I now understand that it was our initiating of the connection and then not sending it anywhere that was causing the hangups.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.