snappy update: fsck errors

Bug #1435774 reported by Yung Shen
This bug affects 1 person
Affects: Snappy
Status: Invalid
Importance: High
Assigned to: James Hunt

Bug Description

Environment:

base: system-image.ubuntu.com
channel: ubuntu-core/devel-proposed
device: generic_armhf
build_number: 185

Steps to reproduce:

# from ubuntu-core 184 to 185
sudo snappy-go update (using snappy-go because of this: bug 1433485)
sudo reboot

# system bring up correctly

# from ubuntu-core 185 to 186

ubuntu@localhost:/etc/system-image$ sudo snappy-go update
Installing ubuntu-core (186)
Starting download of ubuntu-core
138.44 KB / 138.44 KB [================================================================================================================================================================================] 100.00 % 3.95 KB/s
Done
error from system-image-cli: ('%s exited with %s:\n%s', ['ubuntu-core-upgrade'], 1, "['/sbin/fsck', '-M', '-av', b'/dev/mmcblk0p2'] returned 4 (failed): fsck from util-linux 2.25.2\nsystem-a contains a file system with errors, check forced.\nsystem-a: Inode 9096 (...) has invalid mode (014).\n, \n\nsystem-a: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.\n\t(i.e., without -a or -p options)\n\nI: preparing root filesy")
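
A note on the "returned 4" in the error above: fsck reports its result as a bit mask, and per fsck(8) the value 4 means errors were left uncorrected. The helper below is only an illustrative sketch for decoding such a status; the function name is made up for this example and is not part of snappy or system-image.

```shell
#!/bin/sh
# Decode an fsck exit status (a bit mask, per fsck(8)) into readable text.
# describe_fsck_status is a hypothetical helper name used only for this sketch.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    for pair in \
        "1:errors corrected" \
        "2:system should be rebooted" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:canceled by user request" \
        "128:shared-library error"
    do
        bit=${pair%%:*}     # numeric bit before the colon
        text=${pair#*:}     # description after the colon
        if [ $(( status & bit )) -ne 0 ]; then
            out="${out:+$out, }$text"
        fi
    done
    echo "$out"
}

describe_fsck_status 4   # prints: errors left uncorrected
```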

Yung Shen (kaxing)
Changed in snappy-ubuntu:
status: New → Confirmed
Yung Shen (kaxing)
description: updated
Alexander Sack (asac) wrote :

OK, so there are a few things I believe should be looked at:

1. reduce risk of corruption of other partition during update -> mount it with "sync" option as well

2. try harder to recover from corruption
  -> if normal fsck fails, maybe try harder
  -> if that also fails or if try harder is considered not clean -> mkfs and recreate from scratch
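
The escalation in point 2 could look roughly like the sketch below. This is not the updater's actual code: `recover_rootfs`, the `FSCK`/`MKFS` override variables, and the status words it prints are all assumptions made for illustration. It treats any fsck status below 4 (no errors, or errors corrected) as success.

```shell
#!/bin/sh
# Hypothetical recovery ladder for an unmounted, inactive rootfs partition.
# FSCK and MKFS can be overridden so the logic can be tried without a real device.
: "${FSCK:=fsck}"
: "${MKFS:=mkfs.ext4}"

recover_rootfs() {
    dev=$1
    label=$2

    # Step 1: automatic, non-interactive repair (the mode that failed in this report).
    rc=0
    "$FSCK" -M -a "$dev" >&2 || rc=$?
    if [ "$rc" -lt 4 ]; then echo "ok"; return 0; fi

    # Step 2: "try harder" by answering yes to every repair prompt.
    rc=0
    "$FSCK" -M -y "$dev" >&2 || rc=$?
    if [ "$rc" -lt 4 ]; then echo "repaired"; return 0; fi

    # Step 3: give up on the old contents and recreate the filesystem.
    if "$MKFS" -q -L "$label" "$dev" >&2; then echo "recreated"; return 0; fi

    # Nothing worked; the caller has to decide what to do next.
    echo "failed"
    return 1
}

# Example (run as root, partition unmounted; device and label as in this report):
# recover_rootfs /dev/mmcblk0p2 system-a
```

A partition recreated in step 3 is empty, so the update would still have to write the new rootfs to it afterwards.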

Changed in snappy-ubuntu:
importance: Undecided → High
assignee: nobody → James Hunt (jamesodhunt)
Alexander Sack (asac) wrote :

Also, we might want to think about what we do in the worst case, where even mkfs can no longer do its job. In that case I would assume we need a special system mode like "requires maintenance" that the appliance manufacturer can then surface to the user somehow.

Alexander Sack (asac) wrote :

Along these lines, we should think about how to reproduce these cases (e.g. how to forcibly corrupt the FS so we can test that this works).

Michael Vogt (mvo) wrote :

Lool mentioned on IRC that mounting with "sync" has a cost:

<lool> (it will slow down the updates greatly and will wear the media faster)

so maybe a sync at the end of the update is the right compromise.
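
As a rough sketch of that compromise: run the update with normal buffered writes, then flush everything once at the end. `run_then_sync` is a made-up wrapper name, not something snappy provides.

```shell
#!/bin/sh
# Run a command, then flush all buffered writes to storage once,
# instead of paying the cost of a "sync" mount for every write.
run_then_sync() {
    rc=0
    "$@" || rc=$?
    sync            # flush regardless of whether the command succeeded
    return "$rc"    # preserve the command's own exit status
}

# Example:
# run_then_sync sudo snappy-go update
```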

Alexander Sack (asac) wrote :

for testing we might want to play with:

11:05 < asac> like dd if=/dev/random of=/dev/partition :)
or /dev/null

Alexander Sack (asac) wrote :

err, /dev/zero not /dev/null for getting zeros
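
A safer way to experiment with the dd idea is to corrupt a filesystem image file rather than a real partition. The sketch below assumes that approach; `corrupt_image` and the file name `test.img` are hypothetical, and it uses /dev/urandom so the write does not block waiting for entropy.

```shell
#!/bin/sh
# Overwrite part of a filesystem *image file* with random bytes.
# Usage: corrupt_image IMAGE SEEK_BLOCKS COUNT_BLOCKS   (4096-byte blocks)
corrupt_image() {
    img=$1
    seek=$2
    count=$3
    # Refuse anything that is not a regular file, so a real block
    # device is never overwritten by accident.
    if [ ! -f "$img" ]; then
        echo "refusing: $img is not a regular file" >&2
        return 1
    fi
    # conv=notrunc keeps the rest of the image (and its size) intact.
    dd if=/dev/urandom of="$img" bs=4096 seek="$seek" count="$count" \
        conv=notrunc 2>/dev/null
}

# Possible test flow (requires e2fsprogs):
# dd if=/dev/zero of=test.img bs=1M count=16
# mkfs.ext4 -q -F test.img
# corrupt_image test.img 1 64
# fsck.ext4 -fn test.img; echo "fsck status: $?"
```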

Alexander Sack (asac) wrote :

subscribed pitti so he can help us on our testing story here as part of our system update testing thread.

James Hunt (jamesodhunt) wrote :

> # system bring up correctly
Do you recall seeing any errors on boot when you first booted the image written using ubuntu-device-flash?

Is it possible you had previously attempted to update from 185 -> 186 and the power failed or the device was unplugged?

If not, either your media has "gone bad", or was always bad but you were lucky enough to boot from the "system-a" partition initially.

To resolve the situation, you have a few options. The normal caveats apply: if I were you, I'd back up any writable data you have on the device before proceeding.

(1) Run fsck manually:

# check that the partition really is one of the read-only rootfs's first
$ sudo blkid|grep "/dev/mmcblk0p2.*LABEL=\"system-a\"" && echo OK || echo FAIL

$ sudo umount /dev/mmcblk0p2
$ sudo fsck -M -yv /dev/mmcblk0p2
(answer any prompts - the 'y' option should do everything for you though)
$ sudo reboot
$ sudo snappy-go update

(2) Reformat the "system-a" rootfs:

# check that the partition really is one of the read-only rootfs's first
$ sudo blkid|grep "/dev/mmcblk0p2.*LABEL=\"system-a\"" && echo OK || echo FAIL

$ sudo umount /dev/mmcblk0p2
$ sudo mkfs.ext4 -v -L system-a /dev/mmcblk0p2
$ sudo reboot
$ sudo snappy-go update

(3) Reflash the entire disk using ubuntu-device-flash

(4) Flash an image to a different disk using ubuntu-device-flash.
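
The label check that precedes options (1) and (2) can also be wrapped in a small guard, so that a destructive command only runs when the partition really carries the expected label. This is just a sketch: `has_label` is a hypothetical helper, and it assumes the script runs as root so blkid can probe the device.

```shell
#!/bin/sh
# Succeed only if DEVICE carries the expected filesystem label.
has_label() {
    dev=$1
    want=$2
    # -s LABEL -o value prints just the label text for the device.
    got=$(blkid -o value -s LABEL "$dev" 2>/dev/null)
    [ "$got" = "$want" ]
}

# Example guard before reformatting (device and label as in this report):
# has_label /dev/mmcblk0p2 system-a && mkfs.ext4 -v -L system-a /dev/mmcblk0p2
```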

John Lenton (chipaca)
Changed in snappy-ubuntu:
status: Confirmed → Incomplete
Michael Terry (mterry)
affects: snappy-ubuntu → snappy
Leo Arias (elopio) wrote :

We no longer have separate partitions. We have been testing ubuntu-core updates many times a day without seeing any errors like this one. Marking as invalid.

Changed in snappy:
status: Incomplete → Invalid