Code review comment for ~crass/casper:fix-unsafe-mount-iso-loopback

Revision history for this message
Glenn Washburn (crass) wrote :

> Ok looking at that bug report, I see that it says:
>
> > When a hibernated windows partition exists the ntfs3g mounter will refuse to
> mount the partition in
> > read/write mode and returns an error. This will cause the search script to
> die and the boot process to
> > stop, bringing the system to a debugging shell in the initrd.
>
> But it's straightforward to just mount the filesystem read-only instead, and

Yes, more straight forward for NTFS, but not general. Are you suggesting have special logic for NTFS?

You appear to understand that replaying a journal of on a hibernated filesystem will corrupt the resume. I'm not sure if ntfs3g replays the journal when mounting in readonly, but the kernel does for ext3/4. I believe I've seen BTRFS fail to mount on readonly block devices, so I think there's a similar problem there. And I'm unsure of XFS, but let's error on the safe side. This appears to be why the allowed filesystems for the persistent storage file are limited to FAT[1] (though it could and should be opened to more filesystems).

As you'll note, the merge description does not mention the scenario of the linked bug report, which is not a data loss scenario. That was intentional. The bug was linked only because it is also fixed by these changes. The primary intent of these changes is to fix the data loss issue.

> you're using a much bigger hammer here to disallow even ATTEMPTING to mount
> the filesystem by default.

Yes, that is intentional for safety reasons. Its more important to err on the side of not allowing ISOs on some filesystems by default, than to risk data-loss from a corrupted hibernation image.

Now, there are techniques that will allow the mounting of any filesystem read/write without writing any data to the physical block device. But that requires more complexity (and additional kernel modules) and because of that I figured it would be harder to be accepted. Another option that is simpler but I'm more wary of is setting the block device itself to readonly (blockdev --setro). Care would need to be taken that block devices can not be used before being set to readonly. And code would need to be added to remove the readonly status for the persistent storage block device (maybe others). This solution is simpler and I think strikes a good trade-off of by default more limited functionality for avoidance of data loss, with the option to go back to the old behavior if really needed.

> And the net result is not a better user experience
> for the user vs what is described in that bug report, because you say the
> ntfs3g filesystem fails to mount and thus blocks the boot - which is the same
> result as when casper refuses to try to mount it at all.

Your statement is only true _if_ the ISO is on the hibernated NTFS filesystem. Using ISOs on hibernated filesystems are not supported with this patch. As noted above, the more complex solution that allows file access on hibernated filesystems without writing to them would allow this to be supported. But this is not that important to me personally.

A more common case is that the ISO will be on a non-journalled filesystem (eg. FAT). Currently, this would still be a problem when there also exists a hibernated journalled filesystem (eg. EXT4 mounted while hibernated). The EXT4 filesystem could be mounted by the iso scan, which may corrupt the hibernation image. With the changes, by default in that situation, the EXT4 filesystem will be skipped instead of mounted, and the ISO will be found on the FAT filesystem. In that case, the improved user experience is no data loss from a corrupted hibernation image.

This also fixes the linked bug's scenario. The typical scenario is the user of windows wants to check out Ubuntu, but is non-commital. So they create a supergrub disk, download the Ubuntu ISO to that usb drive, hibernate windows, and boot into the ISO from supergrub disk. As the linked bug explains, currently the boot of the ISO can go haywire, but with these changes things work fine.

> I see that you point out that casper actually scans filesystems to find the
> image, so it's not a matter of having wrongly told casper to look at a
> SPECIFIC device that has a dirty journal. Nevertheless, the bug report does
> not support the changes proposed here.

I take you to mean that the linked bug report is for a much narrower issue than what these changes address. This is true. I added the linked bug report to merely show an issue that these changes resolve, not to carry the full weight of supporting these changes. I had no way of knowing that you'd add the requirement of a bug report. So, as asked in response to your initial comment, are you wanting me to create a bug report for the data loss scenario? Since you have a better idea of what you're looking for in the bug report, perhaps it would be easier if you create and link it.

This merge is for a general issue with data-loss from hibernation image corruption. The linked bug report happens to not cause the hibernation image corruption because the ntfs3g mounter does not allow a hibernated NTFS filsystem to be mounted read-write (not sure about readonly), but the linux kernel has no such issues (for other filesystems, I'm unsure about the NTFS kernel driver). So the user lucks out there. If the hibernation image was on EXT4, the user is not so lucky. So, yes, these changes fix a broader class of issues than just that specific scenario in the linked bug report, though the linked bug is fixed as well.

I can see how the linked bug report fails to make clear the data loss potential. However, this merge proposal does, and I'm a little disappointed in and uneasy about the cavalier attitude to data loss, especially when I'm providing a free fix. I hope this is not representative of how Ubuntu treats data loss bugs in "uncommon" situations.

I wonder how many users trying out the Ubuntu livecd over the years have had data-loss and corruption, but they just didn't connect it with the livecd. Or if they did, couldn't be bothered to create a launchpad account and submit a bug, a non-trivial amount of time for someone who might be new to Ubuntu.

[1] https://git.launchpad.net/casper/tree/scripts/casper-helpers#n221

« Back to merge proposal