Merge lp:~dave-martin-arm/linaro-image-tools/proper-cleanup into lp:linaro-image-tools/11.11

Proposed by Dave Martin
Status: Merged
Merged at revision: 161
Proposed branch: lp:~dave-martin-arm/linaro-image-tools/proper-cleanup
Merge into: lp:linaro-image-tools/11.11
Diff against target: 167 lines (+103/-3)
1 file modified
linaro-media-create (+103/-3)
To merge this branch: bzr merge lp:~dave-martin-arm/linaro-image-tools/proper-cleanup
Reviewer Review Type Date Requested Status
Steve Langasek (community) Needs Fixing
Review via email: mp+39030@code.launchpad.net

Commit message

I've tried to include appropriate sanity-checks, but since this change will attempt to unmount and delete stuff, careful review is recommended.

To post a comment you must log in.
Revision history for this message
Peter Maydell (pmaydell) wrote :

Can we have some more indentation in the cleanup_handler() function, please?

I don't think you need to go to the effort of
 status=$?
 [..]
 exit $status

in cleanup_handler() -- since it's a trap if you just fall off the end of it then the shell will exit with the exit status from before invocation of the trap handler.
(POSIX also says you can get this behaviour if you end the trap handler with a plain "exit", but this is broken in dash: https://bugs.launchpad.net/dash/+bug/647450)

157. By Dave Martin

Tidy cleanup code formatting to be consistent with the rest of the script.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

> Can we have some more indentation in the cleanup_handler() function, please?

Pushed as -r157

I thought I was being consistent, but I was looking at prepare_partitions() which has the same weird (non) indentation (unlike every other function...)

Other minor tidyups also to make the style more consistent.

>
> I don't think you need to go to the effort of
> status=$?
> [..]
> exit $status
>
> in cleanup_handler() -- since it's a trap if you just fall off the end of it
> then the shell will exit with the exit status from before invocation of the
> trap handler.

I didn't know that, could be useful.

In this case however, the same handler is used for INT/TERM as well as EXIT.
In the INT/TERM case I believe it's essential to exit the shell explicitly, otherwise you just fall back out into the interrupted code...

If you feel strongly, we could move the exit out into signal_handler(). My own opinion is that the explicit exit is a good thing (or at least neutral), because it avoids any doubt about what may happen when the handler finishes (but I'm happy to be contradicted).

> (POSIX also says you can get this behaviour if you end the trap handler with a
> plain "exit", but this is broken in dash:
> https://bugs.launchpad.net/dash/+bug/647450)

#!/bin/bash (already)

exit $status would work around this too if we needed to(?)

158. By Dave Martin

Actually trap interrupts to signal_handler, as originally intended.

Revision history for this message
Loïc Minier (lool) wrote :

If you don't do anything special on SIGINT, then don't trap it; you will still be called when the script exits if you setup an EXIT handler.

e.g.:
cleanup() {
    echo in cleanup
}

trap cleanup EXIT
cat
echo after sleep

^C will output only "in cleanup" and wont resume execution after cat

if however you trap SIGINT:
cleanup() {
    echo in cleanup
}

trap cleanup EXIT INT
cat
echo after sleep

then ^C will run the cleanup handler, then resume execution -- unless you're in set -e mode, in which case it does exit.

So to sum up: it should work because we're in set -e, but if you don't need to handle SIGINT specially, then don't handle it at all.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

> If you don't do anything special on SIGINT, then don't trap it; you will still
> be called when the script exits if you setup an EXIT handler.

Hmmm, I can't find anything about this in bash (1), and POSIX also doesn't specify precisely when the EXIT trap will execute.

http://www.opengroup.org/onlinepubs/009695399/utilities/trap.html
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html
http://www.opengroup.org/onlinepubs/009695399/utilities/set.html

Perhaps unsurprisingly therefore, bash and dash don't have consistent behaviour on this point:

$ bash ./loïc\'s\ script
^Cin cleanup

$ dash ./loïc\'s\ script
^C

... so I prefer not to rely on this behaviour.

Also, we want to ensure that a non-zero exit status is returned if the script is interrupted - I wasn't sure whether we could rely on this unless we return the appropriate status explicitly...

Did you find some more specific documentation you can point me at?

Revision history for this message
Loïc Minier (lool) wrote :

Hmm right, dash and bash differ in the handling of SIGINT; crap

However, if you trap SIGINT in set -e mode, they seem to behave identically

I don't have a better reference, sorry

Revision history for this message
Matt Waddel (mwaddel) wrote :

One minor thing:

+unset IMAGE_FILE

isn't needed as it's already done as part of the global "unset" earlier in the script.

159. By Dave Martin

Remove unnecessary extra unset IMAGE_FILE

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

> +unset IMAGE_FILE
>
> isn't needed as it's already done as part of the global "unset" earlier in the
> script.

Duh - I thought I'd checked to see if it was already done, but it looks like I did the check with my eyes shut :P

Pushed as -r159

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

> Hmm right, dash and bash differ in the handling of SIGINT; crap
>
> However, if you trap SIGINT in set -e mode, they seem to behave identically
>
> I don't have a better reference, sorry

Hmmm... I still get the same inconsistency even with -e:

$ cat <<EOF >script
#!/bin/bash

set -e

cleanup() {
    echo in cleanup
}

trap cleanup EXIT
cat
echo after sleep
EOF

$ ./script
^Cin cleanup

$ dash ./script
^C

More than this, in order for the shell exit actually to happen,
we are relying on the exit status of the command which was pending (if any)
when the SIGINT arrives. In the above case, WIFSIGNALED will be true, since
cat does not install a handler for SIGINT. The shell (bash) treats this as
an error and follows the set -e path.

But if the command exits normally with status 0, bash does not exit,
and continues running the script. There's nothing incorrect about such a
command, though I don't know whether any of the things called by
linaro-media-create will behave this way.

Trapping to INT explicitly does the expeceted thing with regard to this:
the handler runs once at SIGINT, then again when the shell exits at the
and of the script (not shown)

Example:

$ cat <<EOF | gcc -o noint -xc -
#include <signal.h>
#include <unistd.h>

static void signal_handler(int n)
{
}

int main(void)
{
 signal(SIGINT, signal_handler);
 pause();
 return 0;
}
EOF
$ cat <<EOF >script
#!/bin/bash

cleanup() {
    echo in cleanup
}

trap cleanup EXIT
./noint
echo after sleep
EOF

$ ./script
^Cafter sleep
in cleanup

$ dash ./script
^C

In both cases, it looks rather like dash does not internally install any
SIGINT handler at all unless you do a trap, so its default behaviour is for the
shell process to be terminated (as per signal (7)). But that's guesswork on
my part -- I haven't looked at the sources.

Adding set -e doesn't appear to change anything for me here: dash still just
dies, while bash totally ignores the SIGINT and proceeds as if nothing had
happened.

$ cat <<EOF >script
#!/bin/bash

set -e

cleanup() {
    echo in cleanup
}

trap cleanup EXIT
./noint
echo after sleep
EOF
$ ./script
^Cafter sleep
in cleanup
$ dash ./script
^C

This isn't the desired behaviour -- a pending command may well have done
its work successfully and exited normally in spite of SIGINT. But the user
still wants the script to terminate -- this is why I think the correct
policy is to trap SIGINT in the script and take explicit action.

In a C program the equivalent would be to ignore SIGINT and only exit
prematurely if some called function returns an error. This is a racy strategy
and you could have a hard time breaking out of such a program.

Since POSIX is vague, the bash docs seem to make no statement about what
will happen and we have examples of two shells with different behaviour, again I don't
like to rely on it. It's the kind of behaviour which might easily "evolve" as
bash undergoes maintenance.

So for shell scripting in general, I prefer to have a script which may be
verbose but where the behaviour is cleanly and explicitly defined, rather
than one where we rely on side-effects not covered in the relevant standards.

Am I being too paranoid?

Revision history for this message
Loïc Minier (lool) wrote :

On Fri, Oct 22, 2010, Dave Martin wrote:
> > However, if you trap SIGINT in set -e mode, they seem to behave identically
> Hmmm... I still get the same inconsistency even with -e:
>
> $ cat <<EOF >script
> #!/bin/bash
>
> set -e
>
> cleanup() {
> echo in cleanup
> }
>
> trap cleanup EXIT

 "if you trap SIGINT in set -e mode", then they seem to behave
 identically, but you didn't trap SIGINT above

> So for shell scripting in general, I prefer to have a script which may be
> verbose but where the behaviour is cleanly and explicitly defined, rather
> than one where we rely on side-effects not covered in the relevant standards.

 I fully agree with your principles

--
Loïc Minier

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Loïc

[...]

> "if you trap SIGINT in set -e mode", then they seem to behave
> identically, but you didn't trap SIGINT above

Sorry, I missed that one for some reason -- you're quite right:

with cat:

# ./script
^Cin cleanup
in cleanup

# dash ./script
^Cin cleanup
in cleanup

with noint:

# ./script
^Cin cleanup
after sleep
in cleanup

# dash ./script
^Cin cleanup
after sleep
in cleanup

So it's consistent in this case (which is sort of consistent with my assumption that SIGINT nukes dash unless you install a handler with trap). And POSIX looks adequate to guarantee this behaviour _if_ the interrupted command returns failure status.

I'd still suggest that relying on the interrupted command to return WIFSIGNALED() or WEXITSTATUS()!=0 to caue the shell to terminate when the user really wants to interrupt the script is probably incorrect, though it may well work for us in practice (especially if the bash maintainers don't get too creative).

It sounds like you were agreed with this principle?

Cheers
---Dave

Revision history for this message
Loïc Minier (lool) wrote :

I'm fine with both approaches: either using the dash and bash datapoints with our own POSIX interpretation or going for the super-safe route.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

I suggest to keep it as-is for now, in that case - unless anyone else objects.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Fri, Oct 22, 2010 at 10:38:48AM -0000, Dave Martin wrote:
> So for shell scripting in general, I prefer to have a script which may be
> verbose but where the behaviour is cleanly and explicitly defined, rather
> than one where we rely on side-effects not covered in the relevant standards.

I agree, and think that what you've proposed is consistent with best
practices in signal handling in shell scripts.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Revision history for this message
Peter Maydell (pmaydell) wrote :

Isn't that implementation of cleanup_tempfiles() going to break if the user has specified TMPDIR? mktemp -d honours TMPDIR (defaulting to /tmp/ if not set), so if eg TMPDIR=/var/tmp then TMP_DIR will be /var/tmp/tmp.gk4nlQfOyG/ or something, which is perfectly fine but will cause the script to print the BUG message rather than cleaning up properly.

I thought that mktemp -d would always create a fresh directory for you so it should always be safe to just rm -rf it -- but perhaps I'm wrong?

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

Hmmm, you're right.

The intent was to avoid accidentally executing rm -r / (or something equally damaging) if TMP_DIR accidentally becomes garbage due to a bug in the script - I've learned the hard way to be careful about rm -r in scripts... Ensuring some sane absolutely prefix on the path is usually a good way of protecting against such disasters.

But as you observe, the /tmp/tmp. prefix isn't always guaranteed by mktemp shouldn't be hard-coded.

We could use something like

"${TMPDIR-/tmp}"/*)
        # ok
*)
        # BUG

... or remove the check entirely if it's not considered useful.

Any thoughts?

Revision history for this message
Loïc Minier (lool) wrote :

rm -rf "" is safe, so I'd rather just rm -rf "$TMP_DIR"

160. By Dave Martin

Remove incorrect, overzealous sanity-check when cleaning up $TMP_DIR

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

> rm -rf "" is safe, so I'd rather just rm -rf "$TMP_DIR"

Fixed as r160

Revision history for this message
Loïc Minier (lool) wrote :

I wouldn't touch save $? + exit in the cleanup_handler; just do it after calling cleanup_handler in the signal_handler?

Revision history for this message
Peter Maydell (pmaydell) wrote :

Just noticed a typo in an error message:

112 + echo >&2 "$0: FALIED"

should be "FAILED".

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Loïc

> I wouldn't touch save $? + exit in the cleanup_handler; just do it after
> calling cleanup_handler in the signal_handler?

See r162 which simplifies the exit a bit

161. By Dave Martin

Fixed harmless error message typo

162. By Dave Martin

Simplify shell exit from the cleanup / signal handlers

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

@Peter

> 112 + echo >&2 "$0: FALIED"
>
> should be "FAILED".

Fixed in r161

Revision history for this message
Peter Maydell (pmaydell) wrote :

It looks to me like cleanup_loopbacks() will try to run losetup -d on $BOOTFS and $ROOTFS even if we were writing to an SDcard rather than to an image file (in which case those variables will be the device nodes of real partitions on the SDcard).

163. By Dave Martin

Only run losetup -d on loop devices, not random partitions

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

> It looks to me like cleanup_loopbacks() will try to run losetup -d on $BOOTFS
> and $ROOTFS even if we were writing to an SDcard rather than to an image file
> (in which case those variables will be the device nodes of real partitions on
> the SDcard).

Oops, good spot

See r163, which switches to a slightly cleaner way of tracking established loop devices.

Revision history for this message
Steve Langasek (vorlon) wrote :

I see one minor buglet left: cleanup_sd is called at the bottom of linaro-media-create, right before calling $DEPLOY_STEPS; at the end of the function, cleanup_sd redefines itself as a no-op; this means that cleanup_sd is *always* a no-op, because it voids itself out before the filesystems it's meant to clean up are ever mounted.

So either this explicit call to cleanup_sd needs to be dropped, or the function shouldn't redefine itself out of existence.

If you can make this last change today, then I'm happy for this to be merged immediately and included in a final upload of l-i-t for 10.11; we need to get a package uploaded that includes the current round of fixes from bzr, and I would very much like to see this merge in there as well.

review: Needs Fixing
164. By Dave Martin

Remove suprious call to cleanup_sd before $DEPLOY_STEPS.

cleanup_sd will still get called at the appropriate time in cleanup_handler,
as part of the EXIT trap.

Since cleanup_sd is now never called twice, the code to avoid double-
cleanup can be removed there too.

Revision history for this message
Dave Martin (dave-martin-arm) wrote :

Good point - that call to cleanup_sd was left over from the trunk and should logically have been removed in the diff.

Since cleanup_sd is called as EXIT/INT/TERM anyway via the cleanup trap, I've just got rid of that call.

See r164.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'linaro-media-create'
2--- linaro-media-create 2010-11-04 13:53:45 +0000
3+++ linaro-media-create 2010-11-09 08:59:41 +0000
4@@ -37,9 +37,27 @@
5
6 DIR=$PWD
7 TMP_DIR=$(mktemp -d)
8+cleanup_tempfiles() {
9+ rm -r "$TMP_DIR"
10+
11+ cleanup_tempfiles() { :; }
12+}
13+
14 BOOT_DISK="${DIR}/boot-disc"
15 ROOT_DISK="${DIR}/root-disc"
16
17+cleanup_mountpoints() {
18+
19+ local dir
20+
21+ for dir in "$BOOT_DISK" "$ROOT_DISK"; do
22+ if [ -d "$dir" ]; then
23+ rmdir -v "$dir"
24+ fi
25+ done
26+
27+}
28+
29 PATH="$(dirname "$(readlink -f "$0")"):$PATH"
30
31 ensure_command() {
32@@ -313,7 +331,6 @@
33 chroot=${DIR}/binary
34 # Make sure we unmount /proc in the chroot or else it can't be moved to the
35 # rootfs.
36- trap "sudo umount ${chroot}/proc || true" EXIT
37
38 LINARO_HWPACK_INSTALL=$(which linaro-hwpack-install)
39
40@@ -339,11 +356,17 @@
41
42 # Actually install the hwpack.
43 sudo mount proc ${chroot}/proc -t proc
44+ cleanup_chroot() {
45+ sudo umount -v "$chroot/proc"
46+ }
47+
48 sudo LC_ALL=C chroot "$chroot" linaro-hwpack-install /"$(basename "$HWPACK_FILE")"
49
50 # Revert some changes we did to the rootfs as we don't want them in the
51 # image.
52 sudo umount ${chroot}/proc
53+ cleanup_chroot() { :; }
54+
55 sudo mv -f ${TMP_DIR}/resolv.conf.orig ${chroot}/etc/resolv.conf
56 sudo mv -f ${TMP_DIR}/hosts.orig ${chroot}/etc/hosts
57 if [ "arch_is_arm" = no ]; then
58@@ -415,6 +438,30 @@
59 fi
60 }
61
62+loop_devices=
63+
64+cleanup_loopbacks() {
65+
66+ local d
67+
68+ if [ -n "$loop_devices" ]; then
69+
70+ echo
71+ echo "Releasing loop devices."
72+
73+ for d in $loop_devices; do
74+ if [ -n "$d" ]; then
75+ sudo losetup -d "$d"
76+ fi
77+ done
78+
79+ fi
80+}
81+
82+register_loopback() {
83+ loop_devices="$loop_devices $1"
84+}
85+
86 create_partitions() {
87 if [ "${DEVICE}" ]; then
88 sudo parted -s ${DEVICE} mklabel msdos
89@@ -461,7 +508,9 @@
90 ROOTSIZE1=$(($(LC_ALL=C fdisk -l -u "$IMAGE_FILE" | grep Linux | awk '{print $2}')))
91 ROOTSIZE=$((((ROOTSIZE2+1-ROOTSIZE1)/2)*1024))
92 BOOTFS=$(sudo losetup -f --show "$IMAGE_FILE" --offset $VFATOFFSET --sizelimit $VFATSIZE)
93+ register_loopback "$BOOTFS"
94 ROOTFS=$(sudo losetup -f --show "$IMAGE_FILE" --offset $ROOTOFFSET --sizelimit $ROOTSIZE)
95+ register_loopback "$ROOTFS"
96 fi
97 }
98
99@@ -695,6 +744,59 @@
100 fi
101 }
102
103+remove_image_file() {
104+ if [ -f "$IMAGE_FILE" ]; then
105+ rm -v $IMAGE_FILE
106+ fi
107+}
108+
109+signal_handler() {
110+ cleanup_handler
111+ exit 2
112+}
113+
114+# Initially, no action is needed to clean up mounts in the chroot.
115+# install_hwpack will temporarily define this function when needed.
116+cleanup_chroot() { :; }
117+
118+cleanup_handler() {
119+ status=$?
120+
121+ trap - EXIT INT TERM
122+
123+ set +e
124+
125+ if [ $status != 0 ]; then
126+ echo >&2 "$0: FAILED"
127+ fi
128+
129+ echo
130+ echo "Performing cleanup..."
131+ echo
132+
133+ cleanup_chroot
134+ cleanup_sd
135+ cleanup_mountpoints
136+ cleanup_tempfiles
137+ cleanup_loopbacks
138+ remove_binary_dir
139+
140+ if [ "$status" != 0 ]; then
141+ remove_image_file
142+ fi
143+
144+ # According to POSIX, for the EXIT trap the shell will now exit with the
145+ # exit status of the failing command which triggered the EXIT, or 0 if the
146+ # script terminated successfully.
147+
148+ # For the INT/TERM trap, return to signal_handler.
149+}
150+
151+# Call cleanup_handler as appropriate.
152+# The EXIT trap gets called both on normal exit or exit-on-error (set -e)
153+# so we get called either way.
154+trap cleanup_handler EXIT
155+trap signal_handler INT TERM
156
157 setup_sizes
158 if [ "${IMAGE_FILE}" ]; then
159@@ -806,9 +908,7 @@
160 echo "Warning, --hwpack <filename> was not specified, the result is unlikely to be functional without manual effort to setup uImage, MLO, u-boot.bin and so as as appropriate for your hardware."
161 fi
162 get_device_by_id
163-cleanup_sd
164 for func in $DEPLOY_STEPS; do
165 $func
166 done
167-remove_binary_dir
168

Subscribers

People subscribed via source and target branches