Merge into trunk : proper-cleanup : Code : Linaro Image Tools

Status:	Merged
Merged at revision:	161
Proposed branch:	lp:~dave-martin-arm/linaro-image-tools/proper-cleanup
Merge into:	lp:linaro-image-tools/11.11
Diff against target:	167 lines (+103/-3) 1 file modified linaro-media-create (+103/-3)
To merge this branch:	bzr merge lp:~dave-martin-arm/linaro-image-tools/proper-cleanup
Related bugs:	Link a bug report

Reviewer	Review Type	Date Requested	Status
Steve Langasek (community)		2010-10-21	Needs Fixing on 2010-11-09
Review via email: mp+39030@code.launchpad.net

Revision history for this message

Peter Maydell (pmaydell) wrote on 2010-10-21:

#

Can we have some more indentation in the cleanup_handler() function, please?

I don't think you need to go to the effort of
status=$?
[..]
exit $status

in cleanup_handler() -- since it's a trap if you just fall off the end of it then the shell will exit with the exit status from before invocation of the trap handler.
(POSIX also says you can get this behaviour if you end the trap handler with a plain "exit", but this is broken in dash: https://bugs.launchpad.net/dash/+bug/647450)

lp:~dave-martin-arm/linaro-image-tools/proper-cleanup updated on 2010-10-21

157. By Dave Martin on 2010-10-21: Tidy cleanup code formatting to be consistent with the rest of the script.

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-10-21:

#

> Can we have some more indentation in the cleanup_handler() function, please?

Pushed as -r157

I thought I was being consistent, but I was looking at prepare_partitions() which has the same weird (non) indentation (unlike every other function...)

Other minor tidyups also to make the style more consistent.

>
> I don't think you need to go to the effort of
> status=$?
> [..]
> exit $status
>
> in cleanup_handler() -- since it's a trap if you just fall off the end of it
> then the shell will exit with the exit status from before invocation of the
> trap handler.

I didn't know that, could be useful.

In this case however, the same handler is used for INT/TERM as well as EXIT.
In the INT/TERM case I believe it's essential to exit the shell explicitly, otherwise you just fall back out into the interrupted code...

If you feel strongly, we could move the exit out into signal_handler(). My own opinion is that the explicit exit is a good thing (or at least neutral), because it avoids any doubt about what may happen when the handler finishes (but I'm happy to be contradicted).

> (POSIX also says you can get this behaviour if you end the trap handler with a
> plain "exit", but this is broken in dash:
> https://bugs.launchpad.net/dash/+bug/647450)

#!/bin/bash (already)

exit $status would work around this too if we needed to(?)

lp:~dave-martin-arm/linaro-image-tools/proper-cleanup updated on 2010-10-21

158. By Dave Martin on 2010-10-21: Actually trap interrupts to signal_handler, as originally intended.

Revision history for this message

Loïc Minier (lool) wrote on 2010-10-21:

#

If you don't do anything special on SIGINT, then don't trap it; you will still be called when the script exits if you setup an EXIT handler.

e.g.:
cleanup() {
echo in cleanup
}

trap cleanup EXIT
cat
echo after sleep

^C will output only "in cleanup" and wont resume execution after cat

if however you trap SIGINT:
cleanup() {
echo in cleanup
}

trap cleanup EXIT INT
cat
echo after sleep

then ^C will run the cleanup handler, then resume execution -- unless you're in set -e mode, in which case it does exit.

So to sum up: it should work because we're in set -e, but if you don't need to handle SIGINT specially, then don't handle it at all.

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-10-21:

#

> If you don't do anything special on SIGINT, then don't trap it; you will still
> be called when the script exits if you setup an EXIT handler.

Hmmm, I can't find anything about this in bash (1), and POSIX also doesn't specify precisely when the EXIT trap will execute.

http://www.opengroup.org/onlinepubs/009695399/utilities/trap.html
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html
http://www.opengroup.org/onlinepubs/009695399/utilities/set.html

Perhaps unsurprisingly therefore, bash and dash don't have consistent behaviour on this point:

$ bash ./loïc\'s\ script
^Cin cleanup

$ dash ./loïc\'s\ script
^C

... so I prefer not to rely on this behaviour.

Also, we want to ensure that a non-zero exit status is returned if the script is interrupted - I wasn't sure whether we could rely on this unless we return the appropriate status explicitly...

Did you find some more specific documentation you can point me at?

Revision history for this message

Loïc Minier (lool) wrote on 2010-10-21:

#

Hmm right, dash and bash differ in the handling of SIGINT; crap

However, if you trap SIGINT in set -e mode, they seem to behave identically

I don't have a better reference, sorry

Revision history for this message

Matt Waddel (mwaddel) wrote on 2010-10-22:

#

One minor thing:

+unset IMAGE_FILE

isn't needed as it's already done as part of the global "unset" earlier in the script.

lp:~dave-martin-arm/linaro-image-tools/proper-cleanup updated on 2010-10-22

159. By Dave Martin on 2010-10-22: Remove unnecessary extra unset IMAGE_FILE

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-10-22:

#

> +unset IMAGE_FILE
>
> isn't needed as it's already done as part of the global "unset" earlier in the
> script.

Duh - I thought I'd checked to see if it was already done, but it looks like I did the check with my eyes shut :P

Pushed as -r159

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-10-22:

#

> Hmm right, dash and bash differ in the handling of SIGINT; crap
>
> However, if you trap SIGINT in set -e mode, they seem to behave identically
>
> I don't have a better reference, sorry

Hmmm... I still get the same inconsistency even with -e:

$ cat <<EOF >script
#!/bin/bash

set -e

cleanup() {
echo in cleanup
}

trap cleanup EXIT
cat
echo after sleep
EOF

$ ./script
^Cin cleanup

$ dash ./script
^C

More than this, in order for the shell exit actually to happen,
we are relying on the exit status of the command which was pending (if any)
when the SIGINT arrives. In the above case, WIFSIGNALED will be true, since
cat does not install a handler for SIGINT. The shell (bash) treats this as
an error and follows the set -e path.

But if the command exits normally with status 0, bash does not exit,
and continues running the script. There's nothing incorrect about such a
command, though I don't know whether any of the things called by
linaro-media-create will behave this way.

Trapping to INT explicitly does the expeceted thing with regard to this:
the handler runs once at SIGINT, then again when the shell exits at the
and of the script (not shown)

Example:

$ cat <<EOF | gcc -o noint -xc -
#include <signal.h>
#include <unistd.h>

static void signal_handler(int n)
{
}

int main(void)
{
signal(SIGINT, signal_handler);
pause();
return 0;
}
EOF
$ cat <<EOF >script
#!/bin/bash

cleanup() {
echo in cleanup
}

trap cleanup EXIT
./noint
echo after sleep
EOF

$ ./script
^Cafter sleep
in cleanup

$ dash ./script
^C

In both cases, it looks rather like dash does not internally install any
SIGINT handler at all unless you do a trap, so its default behaviour is for the
shell process to be terminated (as per signal (7)). But that's guesswork on
my part -- I haven't looked at the sources.

Adding set -e doesn't appear to change anything for me here: dash still just
dies, while bash totally ignores the SIGINT and proceeds as if nothing had
happened.

$ cat <<EOF >script
#!/bin/bash

set -e

cleanup() {
echo in cleanup
}

trap cleanup EXIT
./noint
echo after sleep
EOF
$ ./script
^Cafter sleep
in cleanup
$ dash ./script
^C

This isn't the desired behaviour -- a pending command may well have done
its work successfully and exited normally in spite of SIGINT. But the user
still wants the script to terminate -- this is why I think the correct
policy is to trap SIGINT in the script and take explicit action.

In a C program the equivalent would be to ignore SIGINT and only exit
prematurely if some called function returns an error. This is a racy strategy
and you could have a hard time breaking out of such a program.

Since POSIX is vague, the bash docs seem to make no statement about what
will happen and we have examples of two shells with different behaviour, again I don't
like to rely on it. It's the kind of behaviour which might easily "evolve" as
bash undergoes maintenance.

So for shell scripting in general, I prefer to have a script which may be
verbose but where the behaviour is cleanly and explicitly defined, rather
than one where we rely on side-effects not covered in the relevant standards.

Am I being too paranoid?

> Hmm right, dash and bash differ in the handling of SIGINT; crap
> 
> However, if you trap SIGINT in set -e mode, they seem to behave identically
> 
> I don't have a better reference, sorry

Hmmm... I still get the same inconsistency even with -e:

$ cat <<EOF >script
#!/bin/bash

set -e

cleanup() {
    echo in cleanup
}

trap cleanup EXIT
cat
echo after sleep
EOF

$ ./script 
^Cin cleanup

$ dash ./script 
^C

More than this, in order for the shell exit actually to happen,
we are relying on the exit status of the command which was pending (if any)
when the SIGINT arrives.  In the above case, WIFSIGNALED will be true, since
cat does not install a handler for SIGINT.  The shell (bash) treats this as
an error and follows the set -e path.

But if the command exits normally with status 0, bash does not exit,
and continues running the script.  There's nothing incorrect about such a
command, though I don't know whether any of the things called by
linaro-media-create will behave this way.

Trapping to INT explicitly does the expeceted thing with regard to this:
the handler runs once at SIGINT, then again when the shell exits at the
and of the script (not shown)

Example:

$ cat <<EOF | gcc -o noint -xc -
#include <signal.h>
#include <unistd.h>

static void signal_handler(int n)
{
}

int main(void)
{
	signal(SIGINT, signal_handler);
	pause();
	return 0;
}
EOF
$ cat <<EOF >script
#!/bin/bash

cleanup() {
    echo in cleanup
}

trap cleanup EXIT
./noint
echo after sleep
EOF

$ ./script 
^Cafter sleep
in cleanup

$ dash ./script 
^C

In both cases, it looks rather like dash does not internally install any
SIGINT handler at all unless you do a trap, so its default behaviour is for the
shell process to be terminated (as per signal (7)).  But that's guesswork on
my part -- I haven't looked at the sources.

Adding set -e doesn't appear to change anything for me here: dash still just
dies, while bash totally ignores the SIGINT and proceeds as if nothing had
happened.

$ cat <<EOF >script
#!/bin/bash

set -e

cleanup() {
    echo in cleanup
}

trap cleanup EXIT
./noint
echo after sleep
EOF
$ ./script 
^Cafter sleep
in cleanup
$ dash ./script 
^C

This isn't the desired behaviour -- a pending command may well have done
its work successfully and exited normally in spite of SIGINT.  But the user
still wants the script to terminate -- this is why I think the correct
policy is to trap SIGINT in the script and take explicit action.

In a C program the equivalent would be to ignore SIGINT and only exit
prematurely if some called function returns an error.  This is a racy strategy
and you could have a hard time breaking out of such a program.

Since POSIX is vague, the bash docs seem to make no statement about what
will happen and we have examples of two shells with different behaviour, again I don't
like to rely on it.  It's the kind of behaviour which might easily "evolve" as
bash undergoes maintenance.

So for shell scripting in general, I prefer to have a script which may be
verbose but where the behaviour is cleanly and explicitly defined, rather
than one where we rely on side-effects not covered in the relevant standards.

Am I being too paranoid?

Revision history for this message

Loïc Minier (lool) wrote on 2010-10-22:

#

On Fri, Oct 22, 2010, Dave Martin wrote:
> > However, if you trap SIGINT in set -e mode, they seem to behave identically
> Hmmm... I still get the same inconsistency even with -e:
>
> $ cat <<EOF >script
> #!/bin/bash
>
> set -e
>
> cleanup() {
> echo in cleanup
> }
>
> trap cleanup EXIT

"if you trap SIGINT in set -e mode", then they seem to behave
identically, but you didn't trap SIGINT above

> So for shell scripting in general, I prefer to have a script which may be
> verbose but where the behaviour is cleanly and explicitly defined, rather
> than one where we rely on side-effects not covered in the relevant standards.

I fully agree with your principles

--
Loïc Minier

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-10-22:

#

@Loïc

[...]

> "if you trap SIGINT in set -e mode", then they seem to behave
> identically, but you didn't trap SIGINT above

Sorry, I missed that one for some reason -- you're quite right:

with cat:

# ./script
^Cin cleanup
in cleanup

# dash ./script
^Cin cleanup
in cleanup

with noint:

# ./script
^Cin cleanup
after sleep
in cleanup

# dash ./script
^Cin cleanup
after sleep
in cleanup

So it's consistent in this case (which is sort of consistent with my assumption that SIGINT nukes dash unless you install a handler with trap). And POSIX looks adequate to guarantee this behaviour _if_ the interrupted command returns failure status.

I'd still suggest that relying on the interrupted command to return WIFSIGNALED() or WEXITSTATUS()!=0 to caue the shell to terminate when the user really wants to interrupt the script is probably incorrect, though it may well work for us in practice (especially if the bash maintainers don't get too creative).

It sounds like you were agreed with this principle?

Cheers
---Dave

Revision history for this message

Loïc Minier (lool) wrote on 2010-10-22:

#

I'm fine with both approaches: either using the dash and bash datapoints with our own POSIX interpretation or going for the super-safe route.

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-10-22:

#

I suggest to keep it as-is for now, in that case - unless anyone else objects.

Revision history for this message

Steve Langasek (vorlon) wrote on 2010-10-31:

#

On Fri, Oct 22, 2010 at 10:38:48AM -0000, Dave Martin wrote:
> So for shell scripting in general, I prefer to have a script which may be
> verbose but where the behaviour is cleanly and explicitly defined, rather
> than one where we rely on side-effects not covered in the relevant standards.

I agree, and think that what you've proposed is consistent with best
practices in signal handling in shell scripts.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Revision history for this message

Peter Maydell (pmaydell) wrote on 2010-11-01:

#

Isn't that implementation of cleanup_tempfiles() going to break if the user has specified TMPDIR? mktemp -d honours TMPDIR (defaulting to /tmp/ if not set), so if eg TMPDIR=/var/tmp then TMP_DIR will be /var/tmp/tmp.gk4nlQfOyG/ or something, which is perfectly fine but will cause the script to print the BUG message rather than cleaning up properly.

I thought that mktemp -d would always create a fresh directory for you so it should always be safe to just rm -rf it -- but perhaps I'm wrong?

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-11-01:

#

Hmmm, you're right.

The intent was to avoid accidentally executing rm -r / (or something equally damaging) if TMP_DIR accidentally becomes garbage due to a bug in the script - I've learned the hard way to be careful about rm -r in scripts... Ensuring some sane absolutely prefix on the path is usually a good way of protecting against such disasters.

But as you observe, the /tmp/tmp. prefix isn't always guaranteed by mktemp shouldn't be hard-coded.

We could use something like

"${TMPDIR-/tmp}"/*)
# ok
*)
# BUG

... or remove the check entirely if it's not considered useful.

Any thoughts?

Revision history for this message

Loïc Minier (lool) wrote on 2010-11-01:

#

rm -rf "" is safe, so I'd rather just rm -rf "$TMP_DIR"

lp:~dave-martin-arm/linaro-image-tools/proper-cleanup updated on 2010-11-01

160. By Dave Martin on 2010-11-01: Remove incorrect, overzealous sanity-check when cleaning up $TMP_DIR

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-11-01:

#

> rm -rf "" is safe, so I'd rather just rm -rf "$TMP_DIR"

Fixed as r160

Revision history for this message

Loïc Minier (lool) wrote on 2010-11-01:

#

I wouldn't touch save $? + exit in the cleanup_handler; just do it after calling cleanup_handler in the signal_handler?

Revision history for this message

Peter Maydell (pmaydell) wrote on 2010-11-01:

#

Just noticed a typo in an error message:

112 + echo >&2 "$0: FALIED"

should be "FAILED".

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-11-02:

#

@Loïc

> I wouldn't touch save $? + exit in the cleanup_handler; just do it after
> calling cleanup_handler in the signal_handler?

See r162 which simplifies the exit a bit

lp:~dave-martin-arm/linaro-image-tools/proper-cleanup updated on 2010-11-02

161. By Dave Martin on 2010-11-02: Fixed harmless error message typo
162. By Dave Martin on 2010-11-02: Simplify shell exit from the cleanup / signal handlers

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-11-02:

#

@Peter

> 112 + echo >&2 "$0: FALIED"
>
> should be "FAILED".

Fixed in r161

Revision history for this message

Peter Maydell (pmaydell) wrote on 2010-11-02:

#

It looks to me like cleanup_loopbacks() will try to run losetup -d on $BOOTFS and $ROOTFS even if we were writing to an SDcard rather than to an image file (in which case those variables will be the device nodes of real partitions on the SDcard).

lp:~dave-martin-arm/linaro-image-tools/proper-cleanup updated on 2010-11-02

163. By Dave Martin on 2010-11-02: Only run losetup -d on loop devices, not random partitions

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-11-02:

#

> It looks to me like cleanup_loopbacks() will try to run losetup -d on $BOOTFS
> and $ROOTFS even if we were writing to an SDcard rather than to an image file
> (in which case those variables will be the device nodes of real partitions on
> the SDcard).

Oops, good spot

See r163, which switches to a slightly cleaner way of tracking established loop devices.

Revision history for this message

Steve Langasek (vorlon) wrote on 2010-11-09:

#

I see one minor buglet left: cleanup_sd is called at the bottom of linaro-media-create, right before calling $DEPLOY_STEPS; at the end of the function, cleanup_sd redefines itself as a no-op; this means that cleanup_sd is *always* a no-op, because it voids itself out before the filesystems it's meant to clean up are ever mounted.

So either this explicit call to cleanup_sd needs to be dropped, or the function shouldn't redefine itself out of existence.

If you can make this last change today, then I'm happy for this to be merged immediately and included in a final upload of l-i-t for 10.11; we need to get a package uploaded that includes the current round of fixes from bzr, and I would very much like to see this merge in there as well.

review: Needs Fixing

lp:~dave-martin-arm/linaro-image-tools/proper-cleanup updated on 2010-11-09

164. By Dave Martin on 2010-11-09

Remove suprious call to cleanup_sd before $DEPLOY_STEPS.

cleanup_sd will still get called at the appropriate time in cleanup_handler,
as part of the EXIT trap.

Since cleanup_sd is now never called twice, the code to avoid double-
cleanup can be removed there too.

Revision history for this message

Dave Martin (dave-martin-arm) wrote on 2010-11-09:

#

Good point - that call to cleanup_sd was left over from the trunk and should logically have been removed in the diff.

Since cleanup_sd is called as EXIT/INT/TERM anyway via the cleanup trap, I've just got rid of that call.

See r164.

 === modified file 'linaro-media-create'
 --- linaro-media-create	2010-11-04 13:53:45 +0000
 +++ linaro-media-create	2010-11-09 08:59:41 +0000
@@ -37,9 +37,27 @@
  DIR=$PWD
  TMP_DIR=$(mktemp -d)
++cleanup_tempfiles() {
++  rm -r "$TMP_DIR"
++
++  cleanup_tempfiles() { :; }
++}
++
  BOOT_DISK="${DIR}/boot-disc"
  ROOT_DISK="${DIR}/root-disc"
++cleanup_mountpoints() {
++
++  local dir
++
++  for dir in "$BOOT_DISK" "$ROOT_DISK"; do
++    if [ -d "$dir" ]; then
++      rmdir -v "$dir"
++    fi
++  done
++
++}
++
  PATH="$(dirname "$(readlink -f "$0")"):$PATH"
  ensure_command() {
@@ -313,7 +331,6 @@
    chroot=${DIR}/binary
    # Make sure we unmount /proc in the chroot or else it can't be moved to the
    # rootfs.
--  trap "sudo umount ${chroot}/proc || true" EXIT
    LINARO_HWPACK_INSTALL=$(which linaro-hwpack-install)
@@ -339,11 +356,17 @@
    # Actually install the hwpack.
    sudo mount proc ${chroot}/proc -t proc
++  cleanup_chroot() {
++    sudo umount -v "$chroot/proc"
++  }
++
    sudo LC_ALL=C chroot "$chroot" linaro-hwpack-install /"$(basename "$HWPACK_FILE")"
    # Revert some changes we did to the rootfs as we don't want them in the
    # image.
    sudo umount ${chroot}/proc
++  cleanup_chroot() { :; }
++
    sudo mv -f ${TMP_DIR}/resolv.conf.orig ${chroot}/etc/resolv.conf
    sudo mv -f ${TMP_DIR}/hosts.orig ${chroot}/etc/hosts
    if [ "arch_is_arm" = no ]; then
@@ -415,6 +438,30 @@
    fi
+ }
++loop_devices=
++
++cleanup_loopbacks() {
++
++  local d
++
++  if [ -n "$loop_devices" ]; then
++
++    echo
++    echo "Releasing loop devices."
++
++    for d in $loop_devices; do
++      if [ -n "$d" ]; then
++	sudo losetup -d "$d"
++      fi
++    done
++
++  fi
++}
++
++register_loopback() {
++  loop_devices="$loop_devices $1"
++}
++
  create_partitions() {
    if [ "${DEVICE}" ]; then
      sudo parted -s ${DEVICE} mklabel msdos
@@ -461,7 +508,9 @@
    ROOTSIZE1=$(($(LC_ALL=C fdisk -l -u "$IMAGE_FILE" | grep Linux | awk '{print $2}')))
    ROOTSIZE=$((((ROOTSIZE2+1-ROOTSIZE1)/2)*1024))
    BOOTFS=$(sudo losetup -f --show "$IMAGE_FILE" --offset $VFATOFFSET --sizelimit $VFATSIZE)
++  register_loopback "$BOOTFS"
    ROOTFS=$(sudo losetup -f --show "$IMAGE_FILE" --offset $ROOTOFFSET --sizelimit $ROOTSIZE)
++  register_loopback "$ROOTFS"
    fi
+ }
@@ -695,6 +744,59 @@
    fi
+ }
++remove_image_file() {
++  if [ -f "$IMAGE_FILE" ]; then
++    rm -v $IMAGE_FILE
++  fi
++}
++
++signal_handler() {
++  cleanup_handler
++  exit 2
++}
++
++# Initially, no action is needed to clean up mounts in the chroot.
++# install_hwpack will temporarily define this function when needed.
++cleanup_chroot() { :; }
++
++cleanup_handler() {
++  status=$?
++
++  trap - EXIT INT TERM
++
++  set +e
++
++  if [ $status != 0 ]; then
++    echo >&2 "$0: FAILED"
++  fi
++
++  echo
++  echo "Performing cleanup..."
++  echo
++
++  cleanup_chroot
++  cleanup_sd
++  cleanup_mountpoints
++  cleanup_tempfiles
++  cleanup_loopbacks
++  remove_binary_dir
++
++  if [ "$status" != 0 ]; then
++      remove_image_file
++  fi
++
++  # According to POSIX, for the EXIT trap the shell will now exit with the
++  # exit status of the failing command which triggered the EXIT, or 0 if the
++  # script terminated successfully.
++
++  # For the INT/TERM trap, return to signal_handler.
++}
++
++# Call cleanup_handler as appropriate.
++# The EXIT trap gets called both on normal exit or exit-on-error (set -e)
++# so we get called either way.
++trap cleanup_handler EXIT
++trap signal_handler INT TERM
  setup_sizes
  if [ "${IMAGE_FILE}" ]; then
@@ -806,9 +908,7 @@
    echo "Warning, --hwpack <filename> was not specified, the result is unlikely to be functional without manual effort to setup uImage, MLO, u-boot.bin and so as as appropriate for your hardware."
  fi
  get_device_by_id
--cleanup_sd
  for func in $DEPLOY_STEPS; do
    $func
  done
--remove_binary_dir

Linaro Image Tools

Merge lp:~dave-martin-arm/linaro-image-tools/proper-cleanup into lp:linaro-image-tools/11.11

Commit message

Description of the change

Preview Diff

Subscribers