Merge lp:~nick-schutt/lava-dispatcher/nicks-highbank-support into lp:lava-dispatcher

Proposed by Nicholas Schutt
Status: Merged
Merged at revision: 584
Proposed branch: lp:~nick-schutt/lava-dispatcher/nicks-highbank-support
Merge into: lp:lava-dispatcher
Diff against target: 438 lines (+382/-2)
6 files modified
lava_dispatcher/client/base.py (+1/-1)
lava_dispatcher/client/lmc_utils.py (+6/-1)
lava_dispatcher/config.py (+2/-0)
lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf (+2/-0)
lava_dispatcher/device/highbank.py (+286/-0)
lava_dispatcher/ipmi.py (+85/-0)
To merge this branch: bzr merge lp:~nick-schutt/lava-dispatcher/nicks-highbank-support
Reviewer Review Type Date Requested Status
Antonio Terceiro Approve
Review via email: mp+159678@code.launchpad.net

This proposal supersedes a proposal from 2013-04-16.

Description of the change

added partition resize support

changed tmpfs size from 50% to 100%

To post a comment you must log in.
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote : Posted in a previous version of this proposal

This looks interesting, but as I'm sure you're aware it's still pretty
ugly. Has there been any progress in the mean time on cleaning things
up?

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

waiting for a hw pack + snapshot that work to be able to complete

On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden>wrote:

> This looks interesting, but as I'm sure you're aware it's still pretty
> ugly. Has there been any progress in the mean time on cleaning things
> up?
>
> --
>
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are the owner of
> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

I kind of understand that it's "ugly," but I'm wondering if you might have
some suggestions? I know it's not done yet. I have made it work a couple of
times now, once with a tarball of an ubuntu system (calxeda02-10), and
another time now with a snapshot that requires a bit of hacking right now.

I think the goal now is to use lmc to produce a working image and dd that
to the disk; that should remove the "hack" part. (Copying Antonio since we
have discussed this)

On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden>wrote:

> This looks interesting, but as I'm sure you're aware it's still pretty
> ugly. Has there been any progress in the mean time on cleaning things
> up?
>
> --
>
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are the owner of
> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote : Posted in a previous version of this proposal

Nicholas Schutt <email address hidden> writes:

> I kind of understand that it's "ugly," but I'm wondering if you might have
> some suggestions?

Well, I guess mainly it's the duplication from master.py (which is my
fault, I know). It would be nice if we could have a common base class
of MasterImageTarget and HighbankTarget. The other ugliness is mostly
the usual dispatcher cruft :(

> I know it's not done yet. I have made it work a couple of times now,
> once with a tarball of an ubuntu system (calxeda02-10), and another
> time now with a snapshot that requires a bit of hacking right now.

Cool!

> I think the goal now is to use lmc to produce a working image and dd that
> to the disk; that should remove the "hack" part. (Copying Antonio since we
> have discussed this)

When I talked to Fathi about this (admittedly about 6 weeks ago), he
didn't think that lmc would support highbank. I agree just dd-ing an
image over /dev/sda would be a lot cleaner.

Cheers,
mwh

> On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden>wrote:
>
>> This looks interesting, but as I'm sure you're aware it's still pretty
>> ugly. Has there been any progress in the mean time on cleaning things
>> up?
>>
>> --
>>
>> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
>> You are the owner of
>> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>>
>
> --
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are subscribed to branch lp:lava-dispatcher.

Revision history for this message
Fathi Boudra (fboudra) wrote : Posted in a previous version of this proposal

On 2 April 2013 23:10, Michael Hudson-Doyle
<email address hidden> wrote:
>> I think the goal now is to use lmc to produce a working image and dd that
>> to the disk; that should remove the "hack" part. (Copying Antonio since we
>> have discussed this)
>
> When I talked to Fathi about this (admittedly about 6 weeks ago), he
> didn't think that lmc would support highbank. I agree just dd-ing an
> image over /dev/sda would be a lot cleaner.

https://code.launchpad.net/~fboudra/linaro-image-tools/highbank-support

It's now supported in a similar way to a development board.
I'm not convinced it's the best approach for servers but it works.

Revision history for this message
Antonio Terceiro (terceiro) wrote : Posted in a previous version of this proposal
Download full text (22.6 KiB)

Hey Nick,

It's freaking cool that we already had this working! :-)

Follows my comments. I know you have a separate branch where you are
trying to abstract the parts in common with master.py, but I am making
some comments about that here anyway.

 review needs-fixing

> === modified file 'lava_dispatcher/client/base.py'
> --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> @@ -154,7 +154,7 @@
> lava_server_ip = self._client.context.config.lava_server_ip
> self.run(
> "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> - ["1 received", "0 received", "Network is unreachable"],
> + ["1 received|1 packets* received", "0 received|0 packets received", "Network is unreachable"],
> timeout=5, failok=True)
> if self.match_id == 0:
> return True
>

Do you really need this? Did ping had a different output on the rootfs
you tested with?

> === modified file 'lava_dispatcher/client/lmc_utils.py'
> --- lava_dispatcher/client/lmc_utils.py 2013-02-18 03:19:14 +0000
> +++ lava_dispatcher/client/lmc_utils.py 2013-04-08 13:59:23 +0000
> @@ -15,7 +15,8 @@
> )
>
>
> -def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None):
> +def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None,
> + extra_boot_args=None, image_size=None):
> """Generate image from a hwpack and rootfs url
>
> :param hwpack_url: url of the Linaro hwpack to download
> @@ -32,7 +33,7 @@
> rootfs_path = download_image(rootfs_url, client.context, outdir, decompress=False)
>
> logging.info("linaro-media-create version information")
> - cmd = "sudo linaro-media-create -v"
> + cmd = "linaro-media-create -v"
> rc, output = getstatusoutput(cmd)
> metadata = client.context.test_data.get_metadata()
> metadata['target.linaro-media-create-version'] = output

I'm not sure we want to drop the sudo there. Even though the dispatcher
currently requires being run as root, I think in the long run we should
be able to drop that requirement. Also, if we are root already, the sudo
does no harm.

Additionally, it's a good idea to keep the merge proposal focused and
avoid lateral changes.

> @@ -42,11 +43,15 @@
>
> logging.info("client.device_type = %s" %client.config.device_type)
>
> - cmd = ("sudo flock /var/lock/lava-lmc.lck linaro-media-create --hwpack-force-yes --dev %s "
> + cmd = ("flock /var/lock/lava-lmc.lck linaro-media-create --hwpack-force-yes --dev %s "
> "--image-file %s --binary %s --hwpack %s --image-size 3G --bootloader %s" %
> (client.config.lmc_dev_arg, image_file, rootfs_path, hwpack_path, bootloader))

Ditto.

> if rootfstype is not None:
> cmd += ' --rootfs ' + rootfstype
> + if image_size is not None:
> + cmd += ' --image-size ' + image_size
> + if extra_boot_args is not None:
> + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> logging.info("Executing the linaro-media-create command")
> logging.info(cmd)
>...

review: Needs Fixing
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote : Posted in a previous version of this proposal

Just one quibble.

Antonio Terceiro <email address hidden> writes:

>> + def start_http_server(self, runner, port=80):
>> + # busybox produces no output to parse for, so let it run as a daemon
>> + runner.run('busybox httpd -v -p %s' % port)
>> + url_base = "http://%s:%s" % (self.master_ip, port)
>> + return url_base
>
> I don't think we should use the standard http port here, because most
> probably use cases will include installing a proper web server, and in
> that case trying to start busybox on the default port will break things.

The device is running in the initramfs here, surely?

I have discovered a new thing to be annoyed about: things that don't let
you bind to port 0 and then report the port they bound (nc does this
too).

Cheers,
mwh

Revision history for this message
Antonio Terceiro (terceiro) wrote : Posted in a previous version of this proposal

On Mon, Apr 08, 2013 at 09:46:30PM -0000, Michael Hudson-Doyle wrote:
> Just one quibble.
>
> Antonio Terceiro <email address hidden> writes:
>
> >> + def start_http_server(self, runner, port=80):
> >> + # busybox produces no output to parse for, so let it run as a daemon
> >> + runner.run('busybox httpd -v -p %s' % port)
> >> + url_base = "http://%s:%s" % (self.master_ip, port)
> >> + return url_base
> >
> > I don't think we should use the standard http port here, because most
> > probably use cases will include installing a proper web server, and in
> > that case trying to start busybox on the default port will break things.
>
> The device is running in the initramfs here, surely?

Yes, you are right! Nick, in this case you don't need to care about the
port number there.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

> I'm not sure we want to drop the sudo there. Even though the dispatcher
> currently requires being run as root, I think in the long run we should
> be able to drop that requirement. Also, if we are root already, the sudo
> does no harm.

Antonio,

I will undo the sudo changes since they're not important now that everything works with the scheduler. But, I saw an issue when running the dispatcher as root in my own local virtual environment. For each sudo command the virtual environment was lost; I'm not sure why.

Nick

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

> > === modified file 'lava_dispatcher/client/base.py'
> > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > @@ -154,7 +154,7 @@
> > lava_server_ip = self._client.context.config.lava_server_ip
> > self.run(
> > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > - ["1 received", "0 received", "Network is unreachable"],
> > + ["1 received|1 packets* received", "0 received|0 packets
> received", "Network is unreachable"],
> > timeout=5, failok=True)
> > if self.match_id == 0:
> > return True
> >
>
> Do you really need this? Did ping had a different output on the rootfs
> you tested with?

>

The output from ping in the initrd seems to match what I see on my ubuntu machine,
which contains the word "packets":

nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52 time=33.8 ms

--- validation.linaro.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

> > if rootfstype is not None:
> > cmd += ' --rootfs ' + rootfstype
> > + if image_size is not None:
> > + cmd += ' --image-size ' + image_size
> > + if extra_boot_args is not None:
> > + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> > logging.info("Executing the linaro-media-create command")
> > logging.info(cmd)
> >
> > @@ -85,7 +90,7 @@
> > mntdir = mkdtemp()
> > image = image_file
> > offset = get_partition_offset(image, partno)
> > - mount_cmd = "sudo mount -o loop,offset=%s %s %s" % (offset, image,
> mntdir)
> > + mount_cmd = "mount -o loop,offset=%s %s %s" % (offset, image, mntdir)
> > rc = logging_system(mount_cmd)
> > if rc != 0:
> > os.rmdir(mntdir)
>

We need lmc changes to get the snapshot image to work. Should this be done on another branch? The existing snapshots won't work without these changes.

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

> > + if runner.match_id != 0:
> > + msg = "Unable to determine dns address"
> > + logging.error(msg)
> > + raise CriticalError(msg)
> > + dns = runner.match.group(1)
> > + logging.info("DNS Address is %s" % dns)
> > + runner.run("echo nameserver %s > /etc/resolv.conf" % dns)
>
> Do we actually still need this DNS setup, by the way? Isn't everything
> that's needed to be acessed acessed by IP now?
>

Agreed. I have now removed it.

Revision history for this message
Antonio Terceiro (terceiro) wrote : Posted in a previous version of this proposal

On Thu, Apr 11, 2013 at 01:13:31PM -0000, Nicholas Schutt wrote:
>
> > > === modified file 'lava_dispatcher/client/base.py'
> > > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > > @@ -154,7 +154,7 @@
> > > lava_server_ip = self._client.context.config.lava_server_ip
> > > self.run(
> > > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > > - ["1 received", "0 received", "Network is unreachable"],
> > > + ["1 received|1 packets* received", "0 received|0 packets
> > received", "Network is unreachable"],
> > > timeout=5, failok=True)
> > > if self.match_id == 0:
> > > return True
> > >
> >
> > Do you really need this? Did ping had a different output on the rootfs
> > you tested with?
>
> >
>
> The output from ping in the initrd seems to match what I see on my ubuntu machine,
> which contains the word "packets":
>
>
> nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
> PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
> 64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52 time=33.8 ms
>
> --- validation.linaro.org ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms

yes, but note that "packets" are mentioned in the "transmitted" part,
but we really only care about the "received" part, which in this case
will say exactly either "1 received" or "0 received".

my point is to keep the diff minimal, without changes that are unrelated
with the purpose of this MP.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Revision history for this message
Antonio Terceiro (terceiro) wrote : Posted in a previous version of this proposal

On Thu, Apr 11, 2013 at 01:16:24PM -0000, Nicholas Schutt wrote:
> > > if rootfstype is not None:
> > > cmd += ' --rootfs ' + rootfstype
> > > + if image_size is not None:
> > > + cmd += ' --image-size ' + image_size
> > > + if extra_boot_args is not None:
> > > + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> > > logging.info("Executing the linaro-media-create command")
> > > logging.info(cmd)
> > >
> > > @@ -85,7 +90,7 @@
> > > mntdir = mkdtemp()
> > > image = image_file
> > > offset = get_partition_offset(image, partno)
> > > - mount_cmd = "sudo mount -o loop,offset=%s %s %s" % (offset, image,
> > mntdir)
> > > + mount_cmd = "mount -o loop,offset=%s %s %s" % (offset, image, mntdir)
> > > rc = logging_system(mount_cmd)
> > > if rc != 0:
> > > os.rmdir(mntdir)
> >
>
> We need lmc changes to get the snapshot image to work. Should this be
> done on another branch? The existing snapshots won't work without
> these changes.

if this is needed in order to make this branch work with the intended
input data, so it's fine to keep them here.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

Implemented device versioning in the master image as follows:

/usr/share/initramfs-tools/hooks/master-extras

# Create version info for this image
echo '#!/bin/sh' > /tmp/lava-master-image-info
echo "echo $(date +%Y.%m.%d-%H.%M.%S)" > /tmp/lava-master-image-info
chmod +x /tmp/lava-master-image-info
copy_exec /tmp/lava-master-image-info /sbin

> > + def get_device_version(self):
> > + # To be re-implemented when master image is generated by linaro-
> image-tools
> > + device_version = "unknown"
> > + return device_version
>
> Please make sure that some build number is included in the initrd so
> that we can read it here as device version. You can include something as
> simples as the following snippet in the initramfs-tools hooks file:
>
> echo '#!/bin/sh' > /tmp/lava-master-version
> echo "echo $(date +%Y.%m.%d)" > /tmp/lava-master-version
> chmod +x /tmp/lava-master-version
> copy_exec /tmp/lava-master-version /sbin
>
> Then you can call lava-master-version on the target from this method and
> return the version number printed.
>

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

Here is the actual output from the initrd version of ping. It includes "packets" in the received field

root@master [rc=0]# ^[[38;19RLC_ALL=C ping -W4 -c1 192.168.1.71:8100
PING 192.168.1.71:8100 (192.168.1.71): 56 data bytes
64 bytes from 192.168.1.71: seq=0 ttl=64 time=0.802 ms

--- 192.168.1.71:8100 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.802/0.802/0.802 ms
root@master [rc=0]#
root@master [rc=0]# ^[[38;21Rifconfig eth0 | grep 'inet addr' | awk -F: '{print $2}' |awk
 '{print "<" $1 ">"}'

> On Thu, Apr 11, 2013 at 01:13:31PM -0000, Nicholas Schutt wrote:
> >
> > > > === modified file 'lava_dispatcher/client/base.py'
> > > > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > > > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > > > @@ -154,7 +154,7 @@
> > > > lava_server_ip = self._client.context.config.lava_server_ip
> > > > self.run(
> > > > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > > > - ["1 received", "0 received", "Network is unreachable"],
> > > > + ["1 received|1 packets* received", "0 received|0 packets
> > > received", "Network is unreachable"],
> > > > timeout=5, failok=True)
> > > > if self.match_id == 0:
> > > > return True
> > > >
> > >
> > > Do you really need this? Did ping had a different output on the rootfs
> > > you tested with?
> >
> > >
> >
> > The output from ping in the initrd seems to match what I see on my ubuntu
> machine,
> > which contains the word "packets":
> >
> >
> > nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-
> support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
> > PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
> > 64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52
> time=33.8 ms
> >
> > --- validation.linaro.org ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms
>
> yes, but note that "packets" are mentioned in the "transmitted" part,
> but we really only care about the "received" part, which in this case
> will say exactly either "1 received" or "0 received".
>
> my point is to keep the diff minimal, without changes that are unrelated
> with the purpose of this MP.
>
> --
> Antonio Terceiro
> Software Engineer - Linaro
> http://www.linaro.org

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

send_control('c') does not appear to work when running in the initrd master. Maybe that needs to be looked at and fixed, but to make things work I had to run in the background and use kill instead.

I modified to do this differently now:

class HBMasterCommandRunner(MasterCommandRunner):

    http_pid = None

    def start_http_server(self):
        master_ip = self.get_master_ip()
        if self.http_pid != None:
            raise OperationFailed("busybox httpd already running with pid %" % self.http_pid)
        # busybox produces no output to parse for, so run it in the bg and get its pid
        self.run('busybox httpd -f &')
        self.run('echo pid:$!:pid',response="pid:(\d+):pid",timeout=10)
        if self.match_id != 0:
            raise OperationFailed("busybox httpd did not start")
        else:
            self.http_pid = self.match.group(1)
        url_base = "http://%s" % (master_ip)
        return url_base

    def stop_http_server(self):
        if self.http_pid == None:
            raise OperationFailed("busybox httpd not running, but stop_http_server called.")
        self.run('kill %s' % self.http_pid)
        self.http_pid = None

> > +
> > + def start_http_server(self, runner, port=80):
> > + # busybox produces no output to parse for, so let it run as a
> daemon
> > + runner.run('busybox httpd -v -p %s' % port)
> > + url_base = "http://%s:%s" % (self.master_ip, port)
> > + return url_base
>
> I don't think we should use the standard http port here, because most
> probably use cases will include installing a proper web server, and in
> that case trying to start busybox on the default port will break things.
>
> Try some high port that is unlikely to be used by another service ...
> like ... 50888 for now. :)
>
> > +
> > + def stop_http_server(self, runner):
> > + runner.run('killall busybox')
>
> Not safe. I think it's better to just start busybox httpd in the
> foreground and then send a control-C here.
>

Revision history for this message
Antonio Terceiro (terceiro) wrote : Posted in a previous version of this proposal

On Fri, Apr 12, 2013 at 04:10:35PM -0000, Nicholas Schutt wrote:
>
> Here is the actual output from the initrd version of ping. It includes "packets" in the received field
>
>
> root@master [rc=0]# ^[[38;19RLC_ALL=C ping -W4 -c1 192.168.1.71:8100
> PING 192.168.1.71:8100 (192.168.1.71): 56 data bytes
> 64 bytes from 192.168.1.71: seq=0 ttl=64 time=0.802 ms
>
> --- 192.168.1.71:8100 ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max = 0.802/0.802/0.802 ms
> root@master [rc=0]#
> root@master [rc=0]# ^[[38;21Rifconfig eth0 | grep 'inet addr' | awk -F: '{print $2}' |awk
> '{print "<" $1 ">"}'

ok. in that case you want to drop the '*' so the pattern is
"1 received|1 packets received"

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Revision history for this message
Antonio Terceiro (terceiro) wrote : Posted in a previous version of this proposal

On Fri, Apr 12, 2013 at 04:13:34PM -0000, Nicholas Schutt wrote:
>
> send_control('c') does not appear to work when running in the initrd master. Maybe that needs to be looked at and fixed, but to make things work I had to run in the background and use kill instead.
>
> I modified to do this differently now:
>
>
> class HBMasterCommandRunner(MasterCommandRunner):
>
> http_pid = None
>
> def start_http_server(self):
> master_ip = self.get_master_ip()
> if self.http_pid != None:
> raise OperationFailed("busybox httpd already running with pid %" % self.http_pid)
> # busybox produces no output to parse for, so run it in the bg and get its pid
> self.run('busybox httpd -f &')
> self.run('echo pid:$!:pid',response="pid:(\d+):pid",timeout=10)
> if self.match_id != 0:
> raise OperationFailed("busybox httpd did not start")
> else:
> self.http_pid = self.match.group(1)
> url_base = "http://%s" % (master_ip)
> return url_base
>
> def stop_http_server(self):
> if self.http_pid == None:
> raise OperationFailed("busybox httpd not running, but stop_http_server called.")
> self.run('kill %s' % self.http_pid)
> self.http_pid = None

Looks good.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Revision history for this message
Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal
Download full text (7.8 KiB)

The wget, bunzip, then dd to disk approach seems to be necessary to avoid memory allocation errors from the kernel. I'm not sure of the root cause.

***> The actual error is included at the end of this comment <***

Re-implemented to keep the image compressed in the ramdisk and to set the max ramdisk size to 50% of memory:

            image_file_base = '/'.join(image_file.split('/')[-1:])

            decompression_cmd = None
            if image_file_base.endswith('.gz'):
                decompression_cmd = '/bin/gzip -dc'
            elif image_file_base.endswith('.bz2'):
                decompression_cmd = '/bin/bzip2 -dc'

            runner.run('mkdir /builddir')
            runner.run('mount -t tmpfs -o size=50% tmpfs /builddir')
            runner.run('wget -O /builddir/%s %s' % (image_file_base, image_url), timeout=1800)

            if decompression_cmd != None:
                cmd = 'dd bs=4M if=%s of=%s' % (image_file_base, device)
            else:
                cmd = '%s %s | dd bs=4M of=%s' % (decompression_cmd, image_file_base, device)

            runner.run(cmd, timeout=1800)
            runner.run('umount /builddir')

--------

>
> > + elif image_url.endswith('.bz2'):
> > + decompression_cmd = '| /bin/bzip2 -dc'
> > +
> > + runner.run('mkdir /builddir')
> > + runner.run('mount -t tmpfs -o size=4G tmpfs builddir')
> > + image = '/builddir/lava.img'
> > + runner.run('wget -O - %s %s > %s' % (image_url,
> decompression_cmd, image), timeout=1800)
> > + runner.run('dd bs=4M if=%s of=%s' % (image, device),
> timeout=1800)
> > + runner.run('umount /builddir')
>
> Is it always possible to create a 4G tmpfs?
>
> Instead of saving the image locally, I think you could pipe the wget
> output directl into dd. This way you don't need to to create a tmpfs,
> nor to mount/umount it. This would be something like:
>
> runner.run("wget -O - %s %s | dd bs=4M of=%s" % (image_url,
> decompression_cmd, device))
>
> Also, there are some wget options you can use to get some progress
> indication, which is useful. Look at how master.py does it.
>
> (if you convince me the tmpfs is needed, make sure to add a missing
> slash in the last token there so it reads `/builddir` instead of
> `builddir`)
>

----------

root@master [rc=0]# wget -O - http://192.168.1.71/highbank2-tmp/images/tmprBHgDV
/lava.img.bz2 | /bin/bzip2 -dc | dd bs=4M of=/dev/sda
Connecting to 192.168.1.71 (192.168.1.71:80)
- 1% | | 2558k 0:01:16 ETA- 2% | | 5266k 0:01:12 ETA- 4% |* | 8378k 0:01:07 ETA- 4% |* | 8930k 0:01:24 ETA- 4% |* | 9282k 0:01:41 ETA- 5% |* | 10198k 0:01:49 ETA- 6% |* |

...

143M 0:00:26 ETA- 75% |*********************** | 145M 0:00:25 ETA- 76% |******...

Read more...

Revision history for this message
Antonio Terceiro (terceiro) : Posted in a previous version of this proposal
review: Approve
Revision history for this message
Antonio Terceiro (terceiro) wrote :

Hello Nick,

I have a few comments about the partition resizing, but don't let them
stop you from going ahead with the merge.

 review approve

> + def resize_rootfs_partition(self, runner):
> + partno = '2'
> + start = None
> +
> + runner.run('parted -s /dev/sda print',
> + response='\s+%s\s+([0-9.]+.B)\s+\S+\s+\S+\s+primary\s+(\S+)' % partno,
> + wait_prompt=False)

This regex looks a little scary to me, but I think we already have
scarier ones. In general, I am worried that small changes in the output
format might break us in the future, but maybe I am overthinking this.

parted has a -m option (standing for machine-readable output), which
produces output in a parsing-friendly way:

$ sudo parted -m -s /dev/sda print
BYT;
/dev/sda:750GB:scsi:512:4096:msdos:ATA Hitachi HTS54757;
1:1049kB:256MB:255MB:ext2::boot;
2:257MB:750GB:750GB:::;
5:257MB:750GB:750GB:::;

I noted that this output does not have the partition type information,
which is needed for the resize command below.

Is the partition to be resized always a primary one? Does it have to be?
Is it safe to just assume it is?

> + if runner.match_id != 0:
> + msg = "Unable to determine rootfs partition"
> + logging.warning(msg)
> + else:
> + start = runner.match.group(1)
> + parttype = runner.match.group(2)
> +
> + if parttype == 'ext2' or parttype == 'ext3' or parttype == 'ext4':
> + runner.run('parted -s /dev/sda rm %s' % partno)
> + runner.run('parted -s /dev/sda mkpart primary %s 100%%' % start)
> + runner.run('resize2fs -f /dev/sda%s' % partno)
> + elif parttpe == 'brtfs':
> + logging.warning("resize of btrfs partition not supported")
> + else:
> + logging.warning("unknown partition type for resize: %s" % parttype)

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lava_dispatcher/client/base.py'
2--- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
3+++ lava_dispatcher/client/base.py 2013-04-18 16:59:28 +0000
4@@ -154,7 +154,7 @@
5 lava_server_ip = self._client.context.config.lava_server_ip
6 self.run(
7 "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
8- ["1 received", "0 received", "Network is unreachable"],
9+ ["1 received|1 packets received", "0 received|0 packets received", "Network is unreachable"],
10 timeout=5, failok=True)
11 if self.match_id == 0:
12 return True
13
14=== modified file 'lava_dispatcher/client/lmc_utils.py'
15--- lava_dispatcher/client/lmc_utils.py 2013-02-18 03:19:14 +0000
16+++ lava_dispatcher/client/lmc_utils.py 2013-04-18 16:59:28 +0000
17@@ -15,7 +15,8 @@
18 )
19
20
21-def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None):
22+def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None,
23+ extra_boot_args=None, image_size=None):
24 """Generate image from a hwpack and rootfs url
25
26 :param hwpack_url: url of the Linaro hwpack to download
27@@ -47,6 +48,10 @@
28 (client.config.lmc_dev_arg, image_file, rootfs_path, hwpack_path, bootloader))
29 if rootfstype is not None:
30 cmd += ' --rootfs ' + rootfstype
31+ if image_size is not None:
32+ cmd += ' --image-size ' + image_size
33+ if extra_boot_args is not None:
34+ cmd += ' --extra-boot-args "%s"' % extra_boot_args
35 logging.info("Executing the linaro-media-create command")
36 logging.info(cmd)
37
38
39=== modified file 'lava_dispatcher/config.py'
40--- lava_dispatcher/config.py 2013-04-05 17:19:57 +0000
41+++ lava_dispatcher/config.py 2013-04-18 16:59:28 +0000
42@@ -104,6 +104,8 @@
43 default='Press Enter to stop auto boot...')
44 vexpress_usb_mass_storage_device = schema.StringOption(default=None)
45
46+ ecmeip = schema.StringOption()
47+
48 class OptionDescriptor(object):
49 def __init__(self, name):
50 self.name = name
51
52=== added file 'lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf'
53--- lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf 1970-01-01 00:00:00 +0000
54+++ lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf 2013-04-18 16:59:28 +0000
55@@ -0,0 +1,2 @@
56+client_type = highbank
57+connection_command = ipmitool -I lanplus -U admin -P admin -H %(ecmeip)s sol activate
58
59=== added file 'lava_dispatcher/device/highbank.py'
60--- lava_dispatcher/device/highbank.py 1970-01-01 00:00:00 +0000
61+++ lava_dispatcher/device/highbank.py 2013-04-18 16:59:28 +0000
62@@ -0,0 +1,286 @@
63+# Copyright (C) 2012 Linaro Limited
64+#
65+# Author: Michael Hudson-Doyle <michael.hudson@linaro.org>
66+# Author: Nicholas Schutt <nick.schutt@linaro.org>
67+#
68+# This file is part of LAVA Dispatcher.
69+#
70+# LAVA Dispatcher is free software; you can redistribute it and/or modify
71+# it under the terms of the GNU General Public License as published by
72+# the Free Software Foundation; either version 2 of the License, or
73+# (at your option) any later version.
74+#
75+# LAVA Dispatcher is distributed in the hope that it will be useful,
76+# but WITHOUT ANY WARRANTY; without even the implied warranty of
77+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
78+# GNU General Public License for more details.
79+#
80+# You should have received a copy of the GNU General Public License
81+# along
82+# with this program; if not, see <http://www.gnu.org/licenses>.
83+
84+import contextlib
85+import logging
86+import os
87+import pexpect
88+import time
89+
90+from lava_dispatcher import tarballcache
91+
92+from lava_dispatcher.device.master import (
93+ MasterCommandRunner,
94+)
95+from lava_dispatcher.device.target import (
96+ Target
97+)
98+from lava_dispatcher.errors import (
99+ NetworkError,
100+ CriticalError,
101+ OperationFailed,
102+)
103+from lava_dispatcher.downloader import (
104+ download_image,
105+ download_with_retry,
106+ )
107+from lava_dispatcher.utils import (
108+ mk_targz,
109+ rmtree,
110+)
111+from lava_dispatcher.client.lmc_utils import (
112+ generate_image,
113+)
114+from lava_dispatcher.ipmi import IpmiPxeBoot
115+
116+
117+class HighbankTarget(Target):
118+
119+ MASTER_PS1 = 'root@master [rc=$(echo \$?)]# '
120+ MASTER_PS1_PATTERN = 'root@master \[rc=(\d+)\]# '
121+
122+ def __init__(self, context, config):
123+ super(HighbankTarget, self).__init__(context, config)
124+ self.proc = self.context.spawn(self.config.connection_command, timeout=1200)
125+ self.device_version = None
126+ if self.config.ecmeip == None:
127+ msg = "The ecmeip address is not set for this target"
128+ logging.error(msg)
129+ raise CriticalError(msg)
130+ self.bootcontrol = IpmiPxeBoot(context, self.config.ecmeip)
131+
132+ def get_device_version(self):
133+ return self.device_version
134+
135+ def power_on(self):
136+ self.bootcontrol.power_on_boot_image()
137+ return self.proc
138+
139+ def power_off(self, proc):
140+ self.bootcontrol.power_off()
141+
142+ def deploy_linaro(self, hwpack, rfs, bootloader):
143+ image_file = generate_image(self, hwpack, rfs, self.scratch_dir, bootloader,
144+ extra_boot_args='1', image_size='1G')
145+ self._customize_linux(image_file)
146+ self._deploy_image(image_file, '/dev/sda')
147+
148+ def deploy_linaro_prebuilt(self, image):
149+ image_file = download_image(image, self.context, self.scratch_dir)
150+ self._customize_linux(image_file)
151+ self._deploy_image(image_file, '/dev/sda')
152+
153+ def _deploy_image(self, image_file, device):
154+ with self._as_master() as runner:
155+
156+ # compress the image to reduce the transfer size
157+ if not image_file.endswith('.bz2') and not image_file.endswith('gz'):
158+ os.system('bzip2 -9v ' + image_file)
159+ image_file += '.bz2'
160+
161+ tmpdir = self.context.config.lava_image_tmpdir
162+ url = self.context.config.lava_image_url
163+ image_file = image_file.replace(tmpdir, '')
164+ image_url = '/'.join(u.strip('/') for u in [url, image_file])
165+
166+ build_dir = '/builddir'
167+ image_file_base = build_dir + '/' + '/'.join(image_file.split('/')[-1:])
168+
169+ decompression_cmd = None
170+ if image_file_base.endswith('.gz'):
171+ decompression_cmd = '/bin/gzip -dc'
172+ elif image_file_base.endswith('.bz2'):
173+ decompression_cmd = '/bin/bzip2 -dc'
174+
175+ runner.run('mkdir %s' % build_dir)
176+ runner.run('mount -t tmpfs -o size=100%% tmpfs %s' % build_dir)
177+ runner.run('wget -O %s %s' % (image_file_base, image_url), timeout=1800)
178+
179+ if decompression_cmd != None:
180+ cmd = '%s %s | dd bs=4M of=%s' % (decompression_cmd, image_file_base, device)
181+ else:
182+ cmd = 'dd bs=4M if=%s of=%s' % (image_file_base, device)
183+
184+ runner.run(cmd, timeout=1800)
185+ runner.run('umount %s' % build_dir)
186+
187+ self.resize_rootfs_partition(runner)
188+
189+ def get_partition(self, runner, partition):
190+ if partition == self.config.boot_part:
191+ partition = '/dev/disk/by-label/boot'
192+ elif partition == self.config.root_part:
193+ partition = '/dev/disk/by-label/rootfs'
194+ else:
195+ raise RuntimeError(
196+ 'unknown master image partition(%d)' % partition)
197+ return partition
198+
199+ def resize_rootfs_partition(self, runner):
200+ partno = '2'
201+ start = None
202+
203+ runner.run('parted -s /dev/sda print',
204+ response='\s+%s\s+([0-9.]+.B)\s+\S+\s+\S+\s+primary\s+(\S+)' % partno,
205+ wait_prompt=False)
206+ if runner.match_id != 0:
207+ msg = "Unable to determine rootfs partition"
208+ logging.warning(msg)
209+ else:
210+ start = runner.match.group(1)
211+ parttype = runner.match.group(2)
212+
213+ if parttype == 'ext2' or parttype == 'ext3' or parttype == 'ext4':
214+ runner.run('parted -s /dev/sda rm %s' % partno)
215+ runner.run('parted -s /dev/sda mkpart primary %s 100%%' % start)
216+ runner.run('resize2fs -f /dev/sda%s' % partno)
217+ elif parttpe == 'brtfs':
218+ logging.warning("resize of btrfs partition not supported")
219+ else:
220+ logging.warning("unknown partition type for resize: %s" % parttype)
221+
222+
223+ @contextlib.contextmanager
224+ def file_system(self, partition, directory):
225+ logging.info('attempting to access master filesystem %r:%s' %
226+ (partition, directory))
227+
228+ assert directory != '/', "cannot mount entire partition"
229+
230+ with self._as_master() as runner:
231+ runner.run('mkdir -p /mnt')
232+ partition = self.get_partition(runner, partition)
233+ runner.run('mount %s /mnt' % partition)
234+ try:
235+ targetdir = '/mnt/%s' % directory
236+ runner.run('mkdir -p %s' % targetdir)
237+
238+ parent_dir, target_name = os.path.split(targetdir)
239+
240+ runner.run('/bin/tar -cmzf /tmp/fs.tgz -C %s %s' % (parent_dir, target_name))
241+ runner.run('cd /tmp') # need to be in same dir as fs.tgz
242+
243+ url_base = runner.start_http_server()
244+
245+ url = url_base + '/fs.tgz'
246+ logging.info("Fetching url: %s" % url)
247+ tf = download_with_retry(self.context, self.scratch_dir, url, False)
248+
249+ tfdir = os.path.join(self.scratch_dir, str(time.time()))
250+
251+ try:
252+ os.mkdir(tfdir)
253+ self.context.run_command('/bin/tar -C %s -xzf %s' % (tfdir, tf))
254+ yield os.path.join(tfdir, target_name)
255+
256+ finally:
257+ tf = os.path.join(self.scratch_dir, 'fs.tgz')
258+ mk_targz(tf, tfdir)
259+ rmtree(tfdir)
260+
261+ # get the last 2 parts of tf, ie "scratchdir/tf.tgz"
262+ tf = '/'.join(tf.split('/')[-2:])
263+ runner.run('rm -rf %s' % targetdir)
264+ self._target_extract(runner, tf, parent_dir)
265+
266+ finally:
267+ runner.stop_http_server()
268+ runner.run('umount /mnt')
269+
270+ def _target_extract(self, runner, tar_file, dest, timeout=-1):
271+ tmpdir = self.context.config.lava_image_tmpdir
272+ url = self.context.config.lava_image_url
273+ tar_file = tar_file.replace(tmpdir, '')
274+ tar_url = '/'.join(u.strip('/') for u in [url, tar_file])
275+ self._target_extract_url(runner,tar_url,dest,timeout=timeout)
276+
277+ def _target_extract_url(self, runner, tar_url, dest, timeout=-1):
278+ decompression_cmd = ''
279+ if tar_url.endswith('.gz') or tar_url.endswith('.tgz'):
280+ decompression_cmd = '| /bin/gzip -dc'
281+ elif tar_url.endswith('.bz2'):
282+ decompression_cmd = '| /bin/bzip2 -dc'
283+ elif tar_url.endswith('.tar'):
284+ decompression_cmd = ''
285+ else:
286+ raise RuntimeError('bad file extension: %s' % tar_url)
287+
288+ runner.run('wget -O - %s %s | /bin/tar -C %s -xmf -'
289+ % (tar_url, decompression_cmd, dest),
290+ timeout=timeout)
291+
292+ @contextlib.contextmanager
293+ def _as_master(self):
294+ self.bootcontrol.power_on_boot_master()
295+
296+ # Two reboots seem to be necessary to ensure that pxe boot is used.
297+ # Need to identify the cause and fix it
298+ self.proc.expect("Hit any key to stop autoboot:")
299+ self.proc.sendline('')
300+ self.bootcontrol.power_reset_boot_master()
301+
302+ self.proc.expect("\(initramfs\)")
303+ self.proc.sendline('export PS1="%s"' % self.MASTER_PS1)
304+ self.proc.expect(self.MASTER_PS1_PATTERN, timeout=180, lava_no_logging=1)
305+ runner = HBMasterCommandRunner(self)
306+ runner.run(". /scripts/functions")
307+ device = "eth0"
308+ runner.run("DEVICE=%s configure_networking" % device)
309+
310+ self.device_version = runner.get_device_version()
311+
312+ try:
313+ yield runner
314+ finally:
315+ logging.debug("deploy done")
316+
317+
318+target_class = HighbankTarget
319+
320+
321+class HBMasterCommandRunner(MasterCommandRunner):
322+ """A CommandRunner to use when the target is booted into the master image.
323+ """
324+ http_pid = None
325+
326+ def __init__(self, target):
327+ super(HBMasterCommandRunner, self).__init__(target)
328+
329+ def start_http_server(self):
330+ master_ip = self.get_master_ip()
331+ if self.http_pid != None:
332+ raise OperationFailed("busybox httpd already running with pid %" % self.http_pid)
333+ # busybox produces no output to parse for, so run it in the bg and get its pid
334+ self.run('busybox httpd -f &')
335+ self.run('echo pid:$!:pid',response="pid:(\d+):pid",timeout=10)
336+ if self.match_id != 0:
337+ raise OperationFailed("busybox httpd did not start")
338+ else:
339+ self.http_pid = self.match.group(1)
340+ url_base = "http://%s" % (master_ip)
341+ return url_base
342+
343+ def stop_http_server(self):
344+ if self.http_pid == None:
345+ raise OperationFailed("busybox httpd not running, but stop_http_server called.")
346+ self.run('kill %s' % self.http_pid)
347+ self.http_pid = None
348+
349
350=== added file 'lava_dispatcher/ipmi.py'
351--- lava_dispatcher/ipmi.py 1970-01-01 00:00:00 +0000
352+++ lava_dispatcher/ipmi.py 2013-04-18 16:59:28 +0000
353@@ -0,0 +1,85 @@
354+# Copyright (C) 2013 Linaro Limited
355+#
356+# Authors:
357+# Antonio Terceiro <antonio.terceiro@linaro.org>
358+# Michael Hudson-Doyle <michael.hudson@linaro.org>
359+#
360+# This file is part of LAVA Dispatcher.
361+#
362+# LAVA Dispatcher is free software; you can redistribute it and/or modify
363+# it under the terms of the GNU General Public License as published by
364+# the Free Software Foundation; either version 2 of the License, or
365+# (at your option) any later version.
366+#
367+# LAVA Dispatcher is distributed in the hope that it will be useful,
368+# but WITHOUT ANY WARRANTY; without even the implied warranty of
369+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
370+# GNU General Public License for more details.
371+#
372+# You should have received a copy of the GNU General Public License
373+# along
374+# with this program; if not, see <http://www.gnu.org/licenses>.
375+
376+
377+class IPMITool(object):
378+ """
379+ This class wraps the ipmitool CLI to provide a convenient object-oriented
380+ API that can be composed into the implementation of devices that can be
381+ managed with IPMI.
382+ """
383+
384+ def __init__(self, context, host, ipmitool="ipmitool"):
385+ self.host = host
386+ self.context = context
387+ self.ipmitool = ipmitool
388+
389+ def __ipmi(self, command):
390+ self.context.run_command(
391+ "%s -H %s -U admin -P admin %s" % (
392+ self.ipmitool, self.host, command
393+ )
394+ )
395+
396+ def set_to_boot_from_disk(self):
397+ self.__ipmi("chassis bootdev disk")
398+
399+ def set_to_boot_from_pxe(self):
400+ self.__ipmi("chassis bootdev pxe")
401+
402+ def power_off(self):
403+ self.__ipmi("chassis power off")
404+
405+ def power_on(self):
406+ self.__ipmi("chassis power on")
407+
408+ def reset(self):
409+ self.__ipmi("chassis power reset")
410+
411+
412+class IpmiPxeBoot(object):
413+ """
414+ This class provides a convenient object-oriented API that can be
415+ used to initiate power on/off and boot device selection for pxe
416+ and disk boot devices using ipmi commands.
417+ """
418+
419+ def __init__(self, context, host):
420+ self.ipmitool = IPMITool(context, host)
421+
422+ def power_on_boot_master(self):
423+ self.ipmitool.set_to_boot_from_pxe()
424+ self.ipmitool.power_on()
425+ self.ipmitool.reset()
426+
427+ def power_reset_boot_master(self):
428+ self.ipmitool.set_to_boot_from_pxe()
429+ self.ipmitool.reset()
430+
431+ def power_on_boot_image(self):
432+ self.ipmitool.set_to_boot_from_disk()
433+ self.ipmitool.power_on()
434+ self.ipmitool.reset()
435+
436+ def power_off(self):
437+ self.ipmitool.power_off()
438+

Subscribers

People subscribed via source and target branches