Merge lp:~nick-schutt/lava-dispatcher/nicks-highbank-support into lp:lava-dispatcher

Proposed by Nicholas Schutt
Status: Superseded
Proposed branch: lp:~nick-schutt/lava-dispatcher/nicks-highbank-support
Merge into: lp:lava-dispatcher
Diff against target: 413 lines (+357/-2)
6 files modified
lava_dispatcher/client/base.py (+1/-1)
lava_dispatcher/client/lmc_utils.py (+6/-1)
lava_dispatcher/config.py (+2/-0)
lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf (+2/-0)
lava_dispatcher/device/highbank.py (+261/-0)
lava_dispatcher/ipmi.py (+85/-0)
To merge this branch: bzr merge lp:~nick-schutt/lava-dispatcher/nicks-highbank-support
Reviewer            Review Type    Date Requested    Status
Antonio Terceiro                                     Needs Fixing
Review via email: mp+157658@code.launchpad.net

This proposal supersedes a proposal from 2013-03-22.

This proposal has been superseded by a proposal from 2013-04-16.

Description of the change

Beta version of Calxeda Highbank support. Tested with lava_test_shell and deployed on 8 machines for testing.

deploy_linaro and deploy_linaro_prebuilt are implemented.

Tested with hwpack + rootfs from snapshots.linaro.org.

1. Boots a kernel + initrd master image via PXE

2. dd's the lmc-produced image directly to the hard disk

3. Uses busybox httpd on the master initrd to transfer files from the target

4. Boots from the disk once the disk is prepared

5. Boots the initrd master image again to retrieve the results via busybox httpd

Michael Hudson-Doyle (mwhudson) wrote : Posted in a previous version of this proposal

This looks interesting, but as I'm sure you're aware it's still pretty
ugly. Has there been any progress in the mean time on cleaning things
up?

Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

Waiting for a working hwpack + snapshot to be able to complete this.

On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden> wrote:

> This looks interesting, but as I'm sure you're aware it's still pretty
> ugly. Has there been any progress in the mean time on cleaning things
> up?
>
> --
>
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are the owner of
> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>

Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

I kind of understand that it's "ugly," but I'm wondering if you might have
some suggestions? I know it's not done yet. I have made it work a couple of
times now, once with a tarball of an ubuntu system (calxeda02-10), and
another time now with a snapshot that requires a bit of hacking right now.

I think the goal now is to use lmc to produce a working image and dd that
to the disk; that should remove the "hack" part. (Copying Antonio since we
have discussed this)

On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden> wrote:

> This looks interesting, but as I'm sure you're aware it's still pretty
> ugly. Has there been any progress in the mean time on cleaning things
> up?
>
> --
>
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are the owner of
> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>

Michael Hudson-Doyle (mwhudson) wrote : Posted in a previous version of this proposal

Nicholas Schutt <email address hidden> writes:

> I kind of understand that it's "ugly," but I'm wondering if you might have
> some suggestions?

Well, I guess mainly it's the duplication from master.py (which is my
fault, I know). It would be nice if we could have a common base class
of MasterImageTarget and HighbankTarget. The other ugliness is mostly
the usual dispatcher cruft :(

> I know it's not done yet. I have made it work a couple of times now,
> once with a tarball of an ubuntu system (calxeda02-10), and another
> time now with a snapshot that requires a bit of hacking right now.

Cool!

> I think the goal now is to use lmc to produce a working image and dd that
> to the disk; that should remove the "hack" part. (Copying Antonio since we
> have discussed this)

When I talked to Fathi about this (admittedly about 6 weeks ago), he
didn't think that lmc would support highbank. I agree just dd-ing an
image over /dev/sda would be a lot cleaner.

Cheers,
mwh

> On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden> wrote:
>
>> This looks interesting, but as I'm sure you're aware it's still pretty
>> ugly. Has there been any progress in the mean time on cleaning things
>> up?
>>
>> --
>>
>> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
>> You are the owner of
>> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>>
>
> --
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are subscribed to branch lp:lava-dispatcher.

Fathi Boudra (fboudra) wrote : Posted in a previous version of this proposal

On 2 April 2013 23:10, Michael Hudson-Doyle
<email address hidden> wrote:
>> I think the goal now is to use lmc to produce a working image and dd that
>> to the disk; that should remove the "hack" part. (Copying Antonio since we
>> have discussed this)
>
> When I talked to Fathi about this (admittedly about 6 weeks ago), he
> didn't think that lmc would support highbank. I agree just dd-ing an
> image over /dev/sda would be a lot cleaner.

https://code.launchpad.net/~fboudra/linaro-image-tools/highbank-support

It's now supported in a similar way to a development board.
I'm not convinced it's the best approach for servers but it works.

Antonio Terceiro (terceiro) wrote :

Hey Nick,

It's freaking cool that we already had this working! :-)

My comments follow. I know you have a separate branch where you are
trying to abstract the parts in common with master.py, but I am making
some comments about that here anyway.

 review needs-fixing

> === modified file 'lava_dispatcher/client/base.py'
> --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> @@ -154,7 +154,7 @@
> lava_server_ip = self._client.context.config.lava_server_ip
> self.run(
> "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> - ["1 received", "0 received", "Network is unreachable"],
> + ["1 received|1 packets* received", "0 received|0 packets received", "Network is unreachable"],
> timeout=5, failok=True)
> if self.match_id == 0:
> return True
>

Do you really need this? Did ping have a different output on the rootfs
you tested with?

> === modified file 'lava_dispatcher/client/lmc_utils.py'
> --- lava_dispatcher/client/lmc_utils.py 2013-02-18 03:19:14 +0000
> +++ lava_dispatcher/client/lmc_utils.py 2013-04-08 13:59:23 +0000
> @@ -15,7 +15,8 @@
> )
>
>
> -def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None):
> +def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None,
> + extra_boot_args=None, image_size=None):
> """Generate image from a hwpack and rootfs url
>
> :param hwpack_url: url of the Linaro hwpack to download
> @@ -32,7 +33,7 @@
> rootfs_path = download_image(rootfs_url, client.context, outdir, decompress=False)
>
> logging.info("linaro-media-create version information")
> - cmd = "sudo linaro-media-create -v"
> + cmd = "linaro-media-create -v"
> rc, output = getstatusoutput(cmd)
> metadata = client.context.test_data.get_metadata()
> metadata['target.linaro-media-create-version'] = output

I'm not sure we want to drop the sudo there. Even though the dispatcher
currently requires being run as root, I think in the long run we should
be able to drop that requirement. Also, if we are root already, the sudo
does no harm.

Additionally, it's a good idea to keep the merge proposal focused and
avoid lateral changes.

> @@ -42,11 +43,15 @@
>
> logging.info("client.device_type = %s" %client.config.device_type)
>
> - cmd = ("sudo flock /var/lock/lava-lmc.lck linaro-media-create --hwpack-force-yes --dev %s "
> + cmd = ("flock /var/lock/lava-lmc.lck linaro-media-create --hwpack-force-yes --dev %s "
> "--image-file %s --binary %s --hwpack %s --image-size 3G --bootloader %s" %
> (client.config.lmc_dev_arg, image_file, rootfs_path, hwpack_path, bootloader))

Ditto.

> if rootfstype is not None:
> cmd += ' --rootfs ' + rootfstype
> + if image_size is not None:
> + cmd += ' --image-size ' + image_size
> + if extra_boot_args is not None:
> + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> logging.info("Executing the linaro-media-create command")
> logging.info(cmd)
>...

review: Needs Fixing
Michael Hudson-Doyle (mwhudson) wrote :

Just one quibble.

Antonio Terceiro <email address hidden> writes:

>> + def start_http_server(self, runner, port=80):
>> + # busybox produces no output to parse for, so let it run as a daemon
>> + runner.run('busybox httpd -v -p %s' % port)
>> + url_base = "http://%s:%s" % (self.master_ip, port)
>> + return url_base
>
> I don't think we should use the standard http port here, because the most
> probable use cases will include installing a proper web server, and in
> that case trying to start busybox on the default port will break things.

The device is running in the initramfs here, surely?

I have discovered a new thing to be annoyed about: things that don't let
you bind to port 0 and then report the port they bound (nc does this
too).

Cheers,
mwh

Antonio Terceiro (terceiro) wrote :

On Mon, Apr 08, 2013 at 09:46:30PM -0000, Michael Hudson-Doyle wrote:
> Just one quibble.
>
> Antonio Terceiro <email address hidden> writes:
>
> >> + def start_http_server(self, runner, port=80):
> >> + # busybox produces no output to parse for, so let it run as a daemon
> >> + runner.run('busybox httpd -v -p %s' % port)
> >> + url_base = "http://%s:%s" % (self.master_ip, port)
> >> + return url_base
> >
> > I don't think we should use the standard http port here, because the most
> > probable use cases will include installing a proper web server, and in
> > that case trying to start busybox on the default port will break things.
>
> The device is running in the initramfs here, surely?

Yes, you are right! Nick, in this case you don't need to care about the
port number there.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Nicholas Schutt (nick-schutt) wrote :

> I'm not sure we want to drop the sudo there. Even though the dispatcher
> currently requires being run as root, I think in the long run we should
> be able to drop that requirement. Also, if we are root already, the sudo
> does no harm.

Antonio,

I will undo the sudo changes since they're not important now that everything works with the scheduler. But, I saw an issue when running the dispatcher as root in my own local virtual environment. For each sudo command the virtual environment was lost; I'm not sure why.

Nick

597. By Nicholas Schutt

put back sudo

Nicholas Schutt (nick-schutt) wrote :

> > === modified file 'lava_dispatcher/client/base.py'
> > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > @@ -154,7 +154,7 @@
> > lava_server_ip = self._client.context.config.lava_server_ip
> > self.run(
> > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > - ["1 received", "0 received", "Network is unreachable"],
> > + ["1 received|1 packets* received", "0 received|0 packets
> received", "Network is unreachable"],
> > timeout=5, failok=True)
> > if self.match_id == 0:
> > return True
> >
>
> Do you really need this? Did ping have a different output on the rootfs
> you tested with?

>

The output from ping in the initrd seems to match what I see on my ubuntu machine,
which contains the word "packets":

nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52 time=33.8 ms

--- validation.linaro.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms

Nicholas Schutt (nick-schutt) wrote :

> > if rootfstype is not None:
> > cmd += ' --rootfs ' + rootfstype
> > + if image_size is not None:
> > + cmd += ' --image-size ' + image_size
> > + if extra_boot_args is not None:
> > + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> > logging.info("Executing the linaro-media-create command")
> > logging.info(cmd)
> >
> > @@ -85,7 +90,7 @@
> > mntdir = mkdtemp()
> > image = image_file
> > offset = get_partition_offset(image, partno)
> > - mount_cmd = "sudo mount -o loop,offset=%s %s %s" % (offset, image,
> mntdir)
> > + mount_cmd = "mount -o loop,offset=%s %s %s" % (offset, image, mntdir)
> > rc = logging_system(mount_cmd)
> > if rc != 0:
> > os.rmdir(mntdir)
>

We need lmc changes to get the snapshot image to work. Should this be done on another branch? The existing snapshots won't work without these changes.
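
The optional linaro-media-create arguments discussed above could be assembled as in the following minimal sketch. It is not the dispatcher's code: `build_lmc_command` is a hypothetical helper, the flags are only the ones that appear in the quoted diff, and it assumes that passing `--image-size` once (instead of appending a second copy after the hard-coded `3G`) and shell-quoting every argument are acceptable.

```python
import shlex


def build_lmc_command(dev, image_file, rootfs_path, hwpack_path,
                      bootloader="u_boot", rootfstype=None,
                      extra_boot_args=None, image_size="3G"):
    """Assemble a linaro-media-create command line (hypothetical helper)."""
    args = [
        "sudo", "flock", "/var/lock/lava-lmc.lck",
        "linaro-media-create", "--hwpack-force-yes",
        "--dev", dev,
        "--image-file", image_file,
        "--binary", rootfs_path,
        "--hwpack", hwpack_path,
        # Emitted exactly once; the caller overrides the 3G default.
        "--image-size", image_size,
        "--bootloader", bootloader,
    ]
    if rootfstype is not None:
        args += ["--rootfs", rootfstype]
    if extra_boot_args is not None:
        args += ["--extra-boot-args", extra_boot_args]
    # Quote each argument so values containing spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)
```

Building the command from a list keeps each flag next to its value, so an argument such as the extra boot args cannot be split by the shell. The device name and paths a caller passes in are up to the device configuration; none are implied here.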

Nicholas Schutt (nick-schutt) wrote :

> > + if runner.match_id != 0:
> > + msg = "Unable to determine dns address"
> > + logging.error(msg)
> > + raise CriticalError(msg)
> > + dns = runner.match.group(1)
> > + logging.info("DNS Address is %s" % dns)
> > + runner.run("echo nameserver %s > /etc/resolv.conf" % dns)
>
> Do we actually still need this DNS setup, by the way? Isn't everything
> that needs to be accessed now accessed by IP?
>

Agreed. I have now removed it.

Antonio Terceiro (terceiro) wrote :

On Thu, Apr 11, 2013 at 01:13:31PM -0000, Nicholas Schutt wrote:
>
> > > === modified file 'lava_dispatcher/client/base.py'
> > > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > > @@ -154,7 +154,7 @@
> > > lava_server_ip = self._client.context.config.lava_server_ip
> > > self.run(
> > > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > > - ["1 received", "0 received", "Network is unreachable"],
> > > + ["1 received|1 packets* received", "0 received|0 packets
> > received", "Network is unreachable"],
> > > timeout=5, failok=True)
> > > if self.match_id == 0:
> > > return True
> > >
> >
> > Do you really need this? Did ping have a different output on the rootfs
> > you tested with?
>
> >
>
> The output from ping in the initrd seems to match what I see on my ubuntu machine,
> which contains the word "packets":
>
>
> nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
> PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
> 64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52 time=33.8 ms
>
> --- validation.linaro.org ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms

Yes, but note that "packets" is mentioned in the "transmitted" part,
but we really only care about the "received" part, which in this case
will say exactly either "1 received" or "0 received".

My point is to keep the diff minimal, without changes that are unrelated
to the purpose of this MP.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Antonio Terceiro (terceiro) wrote :

On Thu, Apr 11, 2013 at 01:16:24PM -0000, Nicholas Schutt wrote:
> > > if rootfstype is not None:
> > > cmd += ' --rootfs ' + rootfstype
> > > + if image_size is not None:
> > > + cmd += ' --image-size ' + image_size
> > > + if extra_boot_args is not None:
> > > + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> > > logging.info("Executing the linaro-media-create command")
> > > logging.info(cmd)
> > >
> > > @@ -85,7 +90,7 @@
> > > mntdir = mkdtemp()
> > > image = image_file
> > > offset = get_partition_offset(image, partno)
> > > - mount_cmd = "sudo mount -o loop,offset=%s %s %s" % (offset, image,
> > mntdir)
> > > + mount_cmd = "mount -o loop,offset=%s %s %s" % (offset, image, mntdir)
> > > rc = logging_system(mount_cmd)
> > > if rc != 0:
> > > os.rmdir(mntdir)
> >
>
> We need lmc changes to get the snapshot image to work. Should this be
> done on another branch? The existing snapshots won't work without
> these changes.

If these changes are needed in order to make this branch work with the
intended input data, then it's fine to keep them here.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

598. By Nicholas Schutt

code review updates from antonio

599. By Nicholas Schutt

fixes for code review updates from antonio

600. By Nicholas Schutt

change busybox httpd implementation - put in runner, use 12743 to get pid

601. By Nicholas Schutt

ctrl-c implementation does not work

602. By Nicholas Schutt

switch back to bg/kill for httpd

Nicholas Schutt (nick-schutt) wrote :

Implemented device versioning in the master image as follows:

/usr/share/initramfs-tools/hooks/master-extras

# Create version info for this image
echo '#!/bin/sh' > /tmp/lava-master-image-info
echo "echo $(date +%Y.%m.%d-%H.%M.%S)" >> /tmp/lava-master-image-info
chmod +x /tmp/lava-master-image-info
copy_exec /tmp/lava-master-image-info /sbin

> > + def get_device_version(self):
> > + # To be re-implemented when master image is generated by linaro-
> image-tools
> > + device_version = "unknown"
> > + return device_version
>
> Please make sure that some build number is included in the initrd so
> that we can read it here as device version. You can include something as
> simple as the following snippet in the initramfs-tools hooks file:
>
> echo '#!/bin/sh' > /tmp/lava-master-version
> echo "echo $(date +%Y.%m.%d)" >> /tmp/lava-master-version
> chmod +x /tmp/lava-master-version
> copy_exec /tmp/lava-master-version /sbin
>
> Then you can call lava-master-version on the target from this method and
> return the version number printed.
>
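
On the dispatcher side, reading that version back could look like the following minimal sketch. It assumes the target prints a timestamp in the `%Y.%m.%d-%H.%M.%S` format used by the hook above; `parse_device_version` and the sample console text are hypothetical and not part of the branch.

```python
import re

# Matches the timestamp format printed by the hook, e.g. 2013.04.12-16.10.35.
VERSION_RE = re.compile(r"\d{4}\.\d{2}\.\d{2}-\d{2}\.\d{2}\.\d{2}")


def parse_device_version(output):
    """Return the first version stamp found in console output.

    Falls back to "unknown", the placeholder the original stub returned.
    """
    match = VERSION_RE.search(output)
    return match.group(0) if match else "unknown"
```

Searching the whole captured output, rather than expecting the stamp on a fixed line, tolerates the echoed command and the prompt that surround it on a serial console.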

Nicholas Schutt (nick-schutt) wrote :

Here is the actual output from the initrd version of ping. It includes "packets" in the received field.

root@master [rc=0]# LC_ALL=C ping -W4 -c1 192.168.1.71:8100
PING 192.168.1.71:8100 (192.168.1.71): 56 data bytes
64 bytes from 192.168.1.71: seq=0 ttl=64 time=0.802 ms

--- 192.168.1.71:8100 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.802/0.802/0.802 ms
root@master [rc=0]#
root@master [rc=0]# ifconfig eth0 | grep 'inet addr' | awk -F: '{print $2}' | awk '{print "<" $1 ">"}'

> On Thu, Apr 11, 2013 at 01:13:31PM -0000, Nicholas Schutt wrote:
> >
> > > > === modified file 'lava_dispatcher/client/base.py'
> > > > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > > > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > > > @@ -154,7 +154,7 @@
> > > > lava_server_ip = self._client.context.config.lava_server_ip
> > > > self.run(
> > > > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > > > - ["1 received", "0 received", "Network is unreachable"],
> > > > + ["1 received|1 packets* received", "0 received|0 packets
> > > received", "Network is unreachable"],
> > > > timeout=5, failok=True)
> > > > if self.match_id == 0:
> > > > return True
> > > >
> > >
> > > Do you really need this? Did ping have a different output on the rootfs
> > > you tested with?
> >
> > >
> >
> > The output from ping in the initrd seems to match what I see on my ubuntu
> machine,
> > which contains the word "packets":
> >
> >
> > nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-
> support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
> > PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
> > 64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52
> time=33.8 ms
> >
> > --- validation.linaro.org ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms
>
> yes, but note that "packets" are mentioned in the "transmitted" part,
> but we really only care about the "received" part, which in this case
> will say exactly either "1 received" or "0 received".
>
> my point is to keep the diff minimal, without changes that are unrelated
> with the purpose of this MP.
>
> --
> Antonio Terceiro
> Software Engineer - Linaro
> http://www.linaro.org

Nicholas Schutt (nick-schutt) wrote :

send_control('c') does not appear to work when running in the initrd master. Maybe that needs to be looked at and fixed, but to make things work I had to run httpd in the background and use kill instead.

I have changed the implementation to do this:

class HBMasterCommandRunner(MasterCommandRunner):

    http_pid = None

    def start_http_server(self):
        master_ip = self.get_master_ip()
        if self.http_pid is not None:
            raise OperationFailed("busybox httpd already running with pid %s" % self.http_pid)
        # busybox produces no output to parse for, so run it in the bg and get its pid
        self.run('busybox httpd -f &')
        self.run('echo pid:$!:pid', response=r"pid:(\d+):pid", timeout=10)
        if self.match_id != 0:
            raise OperationFailed("busybox httpd did not start")
        else:
            self.http_pid = self.match.group(1)
        url_base = "http://%s" % (master_ip)
        return url_base

    def stop_http_server(self):
        if self.http_pid is None:
            raise OperationFailed("busybox httpd not running, but stop_http_server called.")
        self.run('kill %s' % self.http_pid)
        self.http_pid = None

> > +
> > + def start_http_server(self, runner, port=80):
> > + # busybox produces no output to parse for, so let it run as a
> daemon
> > + runner.run('busybox httpd -v -p %s' % port)
> > + url_base = "http://%s:%s" % (self.master_ip, port)
> > + return url_base
>
> I don't think we should use the standard http port here, because the most
> probable use cases will include installing a proper web server, and in
> that case trying to start busybox on the default port will break things.
>
> Try some high port that is unlikely to be used by another service ...
> like ... 50888 for now. :)
>
> > +
> > + def stop_http_server(self, runner):
> > + runner.run('killall busybox')
>
> Not safe. I think it's better to just start busybox httpd in the
> foreground and then send a control-C here.
>
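
The background-and-kill approach described above can be exercised off-target with a small stand-in. This is a minimal sketch rather than the branch's code: `HttpdController` and `FakeTarget` are hypothetical names, the `run` callable is assumed to return the command's output as a string, and only the `pid:$!:pid` echo trick and the kill-by-pid step come from the thread.

```python
import re

PID_RE = re.compile(r"pid:(\d+):pid")


class HttpdController(object):
    """Track one background busybox httpd through an injected run() callable."""

    def __init__(self, run):
        self._run = run  # callable: shell command string -> output string
        self.pid = None

    def start(self):
        if self.pid is not None:
            raise RuntimeError("busybox httpd already running with pid %s" % self.pid)
        self._run("busybox httpd -f &")
        # $! expands to the pid of the most recent background job.
        match = PID_RE.search(self._run("echo pid:$!:pid"))
        if match is None:
            raise RuntimeError("busybox httpd did not start")
        self.pid = match.group(1)

    def stop(self):
        if self.pid is None:
            raise RuntimeError("busybox httpd not running")
        # Kill only the recorded pid instead of every busybox process.
        self._run("kill %s" % self.pid)
        self.pid = None


class FakeTarget(object):
    """Stand-in for the serial console: records commands, fakes the pid echo."""

    def __init__(self):
        self.commands = []

    def run(self, command):
        self.commands.append(command)
        return "pid:1234:pid" if command.startswith("echo pid:") else ""
```

Wrapping the pid in `pid:...:pid` markers keeps the match unambiguous even when the console echoes the command or prints other digits, and killing by pid avoids the `killall busybox` problem raised in the review.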

603. By Nicholas Schutt

remove device version code (reuse from master class), fix /builddir

604. By Nicholas Schutt

Remove ramdisk for wget + dd to disk

605. By Nicholas Schutt

Remove class HBMasterCommandRunner(MasterCommandRunner):

Antonio Terceiro (terceiro) wrote :

On Fri, Apr 12, 2013 at 04:10:35PM -0000, Nicholas Schutt wrote:
>
> Here is the actual output from the initrd version of ping. It includes "packets" in the received field
>
>
> root@master [rc=0]# LC_ALL=C ping -W4 -c1 192.168.1.71:8100
> PING 192.168.1.71:8100 (192.168.1.71): 56 data bytes
> 64 bytes from 192.168.1.71: seq=0 ttl=64 time=0.802 ms
>
> --- 192.168.1.71:8100 ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max = 0.802/0.802/0.802 ms
> root@master [rc=0]#
> root@master [rc=0]# ifconfig eth0 | grep 'inet addr' | awk -F: '{print $2}' | awk '{print "<" $1 ">"}'

OK. In that case you want to drop the '*' so the pattern is
"1 received|1 packets received"

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org
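
The effect of the pattern change can be checked against the two summary lines quoted in this thread. The following is a minimal sketch: `first_match_id` is a hypothetical helper that only approximates the runner's `match_id` by trying each regular expression in order, and it is not how pexpect itself scans a stream.

```python
import re

# Summary lines copied from the ping outputs quoted in this thread.
UBUNTU_SUMMARY = "1 packets transmitted, 1 received, 0% packet loss, time 0ms"
INITRD_SUMMARY = "1 packets transmitted, 1 packets received, 0% packet loss"

OLD_PATTERNS = ["1 received", "0 received", "Network is unreachable"]
NEW_PATTERNS = [
    "1 received|1 packets received",
    "0 received|0 packets received",
    "Network is unreachable",
]


def first_match_id(patterns, text):
    """Return the index of the first pattern that matches, or None."""
    for index, pattern in enumerate(patterns):
        if re.search(pattern, text):
            return index
    return None


if __name__ == "__main__":
    for name, line in [("ubuntu", UBUNTU_SUMMARY), ("initrd", INITRD_SUMMARY)]:
        print(name, first_match_id(OLD_PATTERNS, line), first_match_id(NEW_PATTERNS, line))
```

With the old patterns the initrd line matches nothing, so the check would run into its timeout; the alternation matches both formats without needing the `*`.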

Antonio Terceiro (terceiro) wrote :

On Fri, Apr 12, 2013 at 04:13:34PM -0000, Nicholas Schutt wrote:
>
> send_control('c') does not appear to work when running in the initrd master. Maybe that needs to be looked at and fixed, but to make things work I had to run in the background and use kill instead.
>
> I modified to do this differently now:
>
>
> class HBMasterCommandRunner(MasterCommandRunner):
>
> http_pid = None
>
> def start_http_server(self):
> master_ip = self.get_master_ip()
> if self.http_pid != None:
> raise OperationFailed("busybox httpd already running with pid %s" % self.http_pid)
> # busybox produces no output to parse for, so run it in the bg and get its pid
> self.run('busybox httpd -f &')
> self.run('echo pid:$!:pid',response="pid:(\d+):pid",timeout=10)
> if self.match_id != 0:
> raise OperationFailed("busybox httpd did not start")
> else:
> self.http_pid = self.match.group(1)
> url_base = "http://%s" % (master_ip)
> return url_base
>
> def stop_http_server(self):
> if self.http_pid == None:
> raise OperationFailed("busybox httpd not running, but stop_http_server called.")
> self.run('kill %s' % self.http_pid)
> self.http_pid = None

Looks good.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Nicholas Schutt (nick-schutt) wrote :

The wget, bunzip, then dd to disk approach seems to be necessary to avoid memory allocation errors from the kernel. I'm not sure of the root cause.

***> The actual error is included at the end of this comment <***

Re-implemented to keep the image compressed in the ramdisk and to set the max ramdisk size to 50% of memory:

            image_file_base = '/'.join(image_file.split('/')[-1:])

            decompression_cmd = None
            if image_file_base.endswith('.gz'):
                decompression_cmd = '/bin/gzip -dc'
            elif image_file_base.endswith('.bz2'):
                decompression_cmd = '/bin/bzip2 -dc'

            runner.run('mkdir /builddir')
            runner.run('mount -t tmpfs -o size=50% tmpfs /builddir')
            runner.run('wget -O /builddir/%s %s' % (image_file_base, image_url), timeout=1800)

            image_path = '/builddir/%s' % image_file_base
            if decompression_cmd is None:
                # uncompressed image: write it to the disk as-is
                cmd = 'dd bs=4M if=%s of=%s' % (image_path, device)
            else:
                cmd = '%s %s | dd bs=4M of=%s' % (decompression_cmd, image_path, device)

            runner.run(cmd, timeout=1800)
            runner.run('umount /builddir')
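
For reference, the same sequence can be expressed as a pure function that only builds the shell commands, which makes the branch on compression easy to check without a target. This is a minimal sketch under assumptions: `build_deploy_commands` is a hypothetical helper, the tmpfs mount options are the ones from the snippet above, and the image is assumed to be written from an absolute path under `/builddir`.

```python
import posixpath

# Decompressors named in the snippet above, keyed by file extension.
DECOMPRESSORS = {".gz": "/bin/gzip -dc", ".bz2": "/bin/bzip2 -dc"}


def build_deploy_commands(image_url, device, workdir="/builddir"):
    """Return the shell commands that fetch an image and write it to device."""
    image_name = posixpath.basename(image_url)
    image_path = posixpath.join(workdir, image_name)
    decompress = DECOMPRESSORS.get(posixpath.splitext(image_name)[1])

    if decompress is None:
        # Uncompressed image: dd reads the downloaded file directly.
        write = "dd bs=4M if=%s of=%s" % (image_path, device)
    else:
        # Compressed image: stream the decompressed data into dd.
        write = "%s %s | dd bs=4M of=%s" % (decompress, image_path, device)

    return [
        "mkdir %s" % workdir,
        # The doubled %% yields a literal percent sign after formatting.
        "mount -t tmpfs -o size=50%% tmpfs %s" % workdir,
        "wget -O %s %s" % (image_path, image_url),
        write,
        "umount %s" % workdir,
    ]
```

Keeping the image compressed in the tmpfs and decompressing only while writing means the ramdisk never has to hold the full-size image.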

--------

>
> > + elif image_url.endswith('.bz2'):
> > + decompression_cmd = '| /bin/bzip2 -dc'
> > +
> > + runner.run('mkdir /builddir')
> > + runner.run('mount -t tmpfs -o size=4G tmpfs builddir')
> > + image = '/builddir/lava.img'
> > + runner.run('wget -O - %s %s > %s' % (image_url,
> decompression_cmd, image), timeout=1800)
> > + runner.run('dd bs=4M if=%s of=%s' % (image, device),
> timeout=1800)
> > + runner.run('umount /builddir')
>
> Is it always possible to create a 4G tmpfs?
>
> Instead of saving the image locally, I think you could pipe the wget
> output directl into dd. This way you don't need to to create a tmpfs,
> nor to mount/umount it. This would be something like:
>
> runner.run("wget -O - %s %s | dd bs=4M of=%s" % (image_url,
> decompression_cmd, device))
>
> Also, there are some wget options you can use to get some progress
> indication, which is useful. Look at how master.py does it.
>
> (if you convince me the tmpfs is needed, make sure to add a missing
> slash in the last token there so it reads `/builddir` instead of
> `builddir`)
>

----------

root@master [rc=0]# wget -O - http://192.168.1.71/highbank2-tmp/images/tmprBHgDV
/lava.img.bz2 | /bin/bzip2 -dc | dd bs=4M of=/dev/sda
Connecting to 192.168.1.71 (192.168.1.71:80)
- 1% | | 2558k 0:01:16 ETA- 2% | | 5266k 0:01:12 ETA- 4% |* | 8378k 0:01:07 ETA- 4% |* | 8930k 0:01:24 ETA- 4% |* | 9282k 0:01:41 ETA- 5% |* | 10198k 0:01:49 ETA- 6% |* |

...

143M 0:00:26 ETA- 75% |*********************** | 145M 0:00:25 ETA- 76% |******...


606. By Nicholas Schutt

separate wget, bunzip2, and dd for deploy_image

607. By Nicholas Schutt

separate wget, bunzip2, and dd for deploy_image

608. By Nicholas Schutt

move start_http/stop_http to highbank class

609. By Nicholas Schutt

fix percent symbol in mount command

610. By Nicholas Schutt

minor change to start_http_server

611. By Nicholas Schutt

simplify start_http - keep pid on the target

612. By Nicholas Schutt

put http code back in HBMasterCommandRunner

613. By Nicholas Schutt

change tmpfs size from 50% to 100%

614. By Nicholas Schutt

add partition resize support

615. By Nicholas Schutt

allow resize of ext2, ext3, ext4 and add stub for btrfs

616. By Nicholas Schutt

make partno a string

617. By Nicholas Schutt

issue a warning if part resize fails, fix tabs

618. By Nicholas Schutt

learning python

619. By Nicholas Schutt

learning python

620. By Nicholas Schutt

nedit is not vi

621. By Nicholas Schutt

add call to dhclient

622. By Nicholas Schutt

add call to dhclient

623. By Nicholas Schutt

add call to dhclient

624. By Nicholas Schutt

add dd to erase the first part of the disk

Unmerged revisions

Preview Diff

=== modified file 'lava_dispatcher/client/base.py'
--- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
+++ lava_dispatcher/client/base.py 2013-04-16 02:42:25 +0000
@@ -154,7 +154,7 @@
 lava_server_ip = self._client.context.config.lava_server_ip
 self.run(
 "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
- ["1 received", "0 received", "Network is unreachable"],
+ ["1 received|1 packets received", "0 received|0 packets received", "Network is unreachable"],
 timeout=5, failok=True)
 if self.match_id == 0:
 return True

=== modified file 'lava_dispatcher/client/lmc_utils.py'
--- lava_dispatcher/client/lmc_utils.py 2013-02-18 03:19:14 +0000
+++ lava_dispatcher/client/lmc_utils.py 2013-04-16 02:42:25 +0000
@@ -15,7 +15,8 @@
 )


-def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None):
+def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None,
+ extra_boot_args=None, image_size=None):
 """Generate image from a hwpack and rootfs url

 :param hwpack_url: url of the Linaro hwpack to download
@@ -47,6 +48,10 @@
 (client.config.lmc_dev_arg, image_file, rootfs_path, hwpack_path, bootloader))
 if rootfstype is not None:
 cmd += ' --rootfs ' + rootfstype
+ if image_size is not None:
+ cmd += ' --image-size ' + image_size
+ if extra_boot_args is not None:
+ cmd += ' --extra-boot-args "%s"' % extra_boot_args
 logging.info("Executing the linaro-media-create command")
 logging.info(cmd)


=== modified file 'lava_dispatcher/config.py'
--- lava_dispatcher/config.py 2013-04-05 17:19:57 +0000
+++ lava_dispatcher/config.py 2013-04-16 02:42:25 +0000
@@ -104,6 +104,8 @@
 default='Press Enter to stop auto boot...')
 vexpress_usb_mass_storage_device = schema.StringOption(default=None)

+ ecmeip = schema.StringOption()
+
 class OptionDescriptor(object):
 def __init__(self, name):
 self.name = name

=== added file 'lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf'
--- lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf 1970-01-01 00:00:00 +0000
+++ lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf 2013-04-16 02:42:25 +0000
@@ -0,0 +1,2 @@
+client_type = highbank
+connection_command = ipmitool -I lanplus -U admin -P admin -H %(ecmeip)s sol activate

59=== added file 'lava_dispatcher/device/highbank.py'
60--- lava_dispatcher/device/highbank.py 1970-01-01 00:00:00 +0000
61+++ lava_dispatcher/device/highbank.py 2013-04-16 02:42:25 +0000
62@@ -0,0 +1,261 @@
63+# Copyright (C) 2012 Linaro Limited
64+#
65+# Author: Michael Hudson-Doyle <michael.hudson@linaro.org>
66+# Author: Nicholas Schutt <nick.schutt@linaro.org>
67+#
68+# This file is part of LAVA Dispatcher.
69+#
70+# LAVA Dispatcher is free software; you can redistribute it and/or modify
71+# it under the terms of the GNU General Public License as published by
72+# the Free Software Foundation; either version 2 of the License, or
73+# (at your option) any later version.
74+#
75+# LAVA Dispatcher is distributed in the hope that it will be useful,
76+# but WITHOUT ANY WARRANTY; without even the implied warranty of
77+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
78+# GNU General Public License for more details.
79+#
80+# You should have received a copy of the GNU General Public License
81+# along
82+# with this program; if not, see <http://www.gnu.org/licenses>.
83+
84+import contextlib
85+import logging
86+import os
87+import pexpect
88+import time
89+
90+from lava_dispatcher import tarballcache
91+
92+from lava_dispatcher.device.master import (
93+ MasterCommandRunner,
94+)
95+from lava_dispatcher.device.target import (
96+ Target
97+)
98+from lava_dispatcher.errors import (
99+ NetworkError,
100+ CriticalError,
101+ OperationFailed,
102+)
103+from lava_dispatcher.downloader import (
104+ download_image,
105+ download_with_retry,
106+ )
107+from lava_dispatcher.utils import (
108+ mk_targz,
109+ rmtree,
110+)
111+from lava_dispatcher.client.lmc_utils import (
112+ generate_image,
113+)
114+from lava_dispatcher.ipmi import IpmiPxeBoot
115+
116+
117+class HighbankTarget(Target):
118+
119+ MASTER_PS1 = 'root@master [rc=$(echo \$?)]# '
120+ MASTER_PS1_PATTERN = 'root@master \[rc=(\d+)\]# '
121+
122+ def __init__(self, context, config):
123+ super(HighbankTarget, self).__init__(context, config)
124+ self.proc = self.context.spawn(self.config.connection_command, timeout=1200)
125+ self.device_version = None
126+ if self.config.ecmeip == None:
127+ msg = "The ecmeip address is not set for this target"
128+ logging.error(msg)
129+ raise CriticalError(msg)
130+ self.bootcontrol = IpmiPxeBoot(context, self.config.ecmeip)
131+
132+ def get_device_version(self):
133+ return self.device_version
134+
135+ def power_on(self):
136+ self.bootcontrol.power_on_boot_image()
137+ return self.proc
138+
139+ def power_off(self, proc):
140+ self.bootcontrol.power_off()
141+
142+ def deploy_linaro(self, hwpack, rfs, bootloader):
143+ image_file = generate_image(self, hwpack, rfs, self.scratch_dir, bootloader,
144+ extra_boot_args='1', image_size='1G')
145+ self._customize_linux(image_file)
146+ self._deploy_image(image_file, '/dev/sda')
147+
148+ def deploy_linaro_prebuilt(self, image):
149+ image_file = download_image(image, self.context, self.scratch_dir)
150+ self._customize_linux(image_file)
151+ self._deploy_image(image_file, '/dev/sda')
152+
153+ def _deploy_image(self, image_file, device):
154+ with self._as_master() as runner:
155+
156+ # compress the image to reduce the transfer size
157+ if not image_file.endswith('.bz2') and not image_file.endswith('gz'):
158+ os.system('bzip2 -9v ' + image_file)
159+ image_file += '.bz2'
160+
161+ tmpdir = self.context.config.lava_image_tmpdir
162+ url = self.context.config.lava_image_url
163+ image_file = image_file.replace(tmpdir, '')
164+ image_url = '/'.join(u.strip('/') for u in [url, image_file])
165+
166+ build_dir = '/builddir'
167+ image_file_base = build_dir + '/' + '/'.join(image_file.split('/')[-1:])
168+
169+ decompression_cmd = None
170+ if image_file_base.endswith('.gz'):
171+ decompression_cmd = '/bin/gzip -dc'
172+ elif image_file_base.endswith('.bz2'):
173+ decompression_cmd = '/bin/bzip2 -dc'
174+
175+ runner.run('mkdir %s' % build_dir)
176+ runner.run('mount -t tmpfs -o size=50%% tmpfs %s' % build_dir)
177+ runner.run('wget -O %s %s' % (image_file_base, image_url), timeout=1800)
178+
179+ if decompression_cmd != None:
180+ cmd = '%s %s | dd bs=4M of=%s' % (decompression_cmd, image_file_base, device)
181+ else:
182+ cmd = 'dd bs=4M if=%s of=%s' % (image_file_base, device)
183+
184+ runner.run(cmd, timeout=1800)
185+ runner.run('umount %s' % build_dir)
186+
187+ def get_partition(self, runner, partition):
188+ if partition == self.config.boot_part:
189+ partition = '/dev/disk/by-label/boot'
190+ elif partition == self.config.root_part:
191+ partition = '/dev/disk/by-label/rootfs'
192+ else:
193+ raise RuntimeError(
194+ 'unknown master image partition(%d)' % partition)
195+ return partition
196+
197+
198+ @contextlib.contextmanager
199+ def file_system(self, partition, directory):
200+ logging.info('attempting to access master filesystem %r:%s' %
201+ (partition, directory))
202+
203+ assert directory != '/', "cannot mount entire partition"
204+
205+ with self._as_master() as runner:
206+ runner.run('mkdir -p /mnt')
207+ partition = self.get_partition(runner, partition)
208+ runner.run('mount %s /mnt' % partition)
209+ try:
210+ targetdir = '/mnt/%s' % directory
211+ runner.run('mkdir -p %s' % targetdir)
212+
213+ parent_dir, target_name = os.path.split(targetdir)
214+
215+ runner.run('/bin/tar -cmzf /tmp/fs.tgz -C %s %s' % (parent_dir, target_name))
216+ runner.run('cd /tmp') # need to be in same dir as fs.tgz
217+
218+ url_base = runner.start_http_server()
219+
220+ url = url_base + '/fs.tgz'
221+ logging.info("Fetching url: %s" % url)
222+ tf = download_with_retry(self.context, self.scratch_dir, url, False)
223+
224+ tfdir = os.path.join(self.scratch_dir, str(time.time()))
225+
226+ try:
227+ os.mkdir(tfdir)
228+ self.context.run_command('/bin/tar -C %s -xzf %s' % (tfdir, tf))
229+ yield os.path.join(tfdir, target_name)
230+
231+ finally:
232+ tf = os.path.join(self.scratch_dir, 'fs.tgz')
233+ mk_targz(tf, tfdir)
234+ rmtree(tfdir)
235+
236+ # get the last 2 parts of tf, ie "scratchdir/tf.tgz"
237+ tf = '/'.join(tf.split('/')[-2:])
238+ runner.run('rm -rf %s' % targetdir)
239+ self._target_extract(runner, tf, parent_dir)
240+
241+ finally:
242+ runner.stop_http_server()
243+ runner.run('umount /mnt')
244+
245+ def _target_extract(self, runner, tar_file, dest, timeout=-1):
246+ tmpdir = self.context.config.lava_image_tmpdir
247+ url = self.context.config.lava_image_url
248+ tar_file = tar_file.replace(tmpdir, '')
249+ tar_url = '/'.join(u.strip('/') for u in [url, tar_file])
250+ self._target_extract_url(runner,tar_url,dest,timeout=timeout)
251+
252+ def _target_extract_url(self, runner, tar_url, dest, timeout=-1):
253+ decompression_cmd = ''
254+ if tar_url.endswith('.gz') or tar_url.endswith('.tgz'):
255+ decompression_cmd = '| /bin/gzip -dc'
256+ elif tar_url.endswith('.bz2'):
257+ decompression_cmd = '| /bin/bzip2 -dc'
258+ elif tar_url.endswith('.tar'):
259+ decompression_cmd = ''
260+ else:
261+ raise RuntimeError('bad file extension: %s' % tar_url)
262+
263+ runner.run('wget -O - %s %s | /bin/tar -C %s -xmf -'
264+ % (tar_url, decompression_cmd, dest),
265+ timeout=timeout)
266+
267+ @contextlib.contextmanager
268+ def _as_master(self):
269+ self.bootcontrol.power_on_boot_master()
270+
271+ # Two reboots seem to be necessary to ensure that pxe boot is used.
272+ # Need to identify the cause and fix it
273+ self.proc.expect("Hit any key to stop autoboot:")
274+ self.proc.sendline('')
275+ self.bootcontrol.power_reset_boot_master()
276+
277+ self.proc.expect("\(initramfs\)")
278+ self.proc.sendline('export PS1="%s"' % self.MASTER_PS1)
279+ self.proc.expect(self.MASTER_PS1_PATTERN, timeout=180, lava_no_logging=1)
280+ runner = HBMasterCommandRunner(self)
281+ runner.run(". /scripts/functions")
282+ device = "eth0"
283+ runner.run("DEVICE=%s configure_networking" % device)
284+
285+ self.device_version = runner.get_device_version()
286+
287+ try:
288+ yield runner
289+ finally:
290+ logging.debug("deploy done")
291+
292+
293+target_class = HighbankTarget
294+
295+
296+class HBMasterCommandRunner(MasterCommandRunner):
297+ """A CommandRunner to use when the target is booted into the master image.
298+ """
299+ http_pid = None
300+
301+ def __init__(self, target):
302+ super(HBMasterCommandRunner, self).__init__(target)
303+
304+ def start_http_server(self):
305+ master_ip = self.get_master_ip()
306+ if self.http_pid != None:
307+ raise OperationFailed("busybox httpd already running with pid %s" % self.http_pid)
308+ # busybox produces no output to parse for, so run it in the bg and get its pid
309+ self.run('busybox httpd -f &')
310+ self.run('echo pid:$!:pid',response="pid:(\d+):pid",timeout=10)
311+ if self.match_id != 0:
312+ raise OperationFailed("busybox httpd did not start")
313+ else:
314+ self.http_pid = self.match.group(1)
315+ url_base = "http://%s" % (master_ip)
316+ return url_base
317+
318+ def stop_http_server(self):
319+ if self.http_pid == None:
320+ raise OperationFailed("busybox httpd not running, but stop_http_server called.")
321+ self.run('kill %s' % self.http_pid)
322+ self.http_pid = None
323+
324
325=== added file 'lava_dispatcher/ipmi.py'
326--- lava_dispatcher/ipmi.py 1970-01-01 00:00:00 +0000
327+++ lava_dispatcher/ipmi.py 2013-04-16 02:42:25 +0000
328@@ -0,0 +1,85 @@
329+# Copyright (C) 2013 Linaro Limited
330+#
331+# Authors:
332+# Antonio Terceiro <antonio.terceiro@linaro.org>
333+# Michael Hudson-Doyle <michael.hudson@linaro.org>
334+#
335+# This file is part of LAVA Dispatcher.
336+#
337+# LAVA Dispatcher is free software; you can redistribute it and/or modify
338+# it under the terms of the GNU General Public License as published by
339+# the Free Software Foundation; either version 2 of the License, or
340+# (at your option) any later version.
341+#
342+# LAVA Dispatcher is distributed in the hope that it will be useful,
343+# but WITHOUT ANY WARRANTY; without even the implied warranty of
344+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
345+# GNU General Public License for more details.
346+#
347+# You should have received a copy of the GNU General Public License
348+# along
349+# with this program; if not, see <http://www.gnu.org/licenses>.
350+
351+
352+class IPMITool(object):
353+ """
354+ This class wraps the ipmitool CLI to provide a convenient object-oriented
355+ API that can be composed into the implementation of devices that can be
356+ managed with IPMI.
357+ """
358+
359+ def __init__(self, context, host, ipmitool="ipmitool"):
360+ self.host = host
361+ self.context = context
362+ self.ipmitool = ipmitool
363+
364+ def __ipmi(self, command):
365+ self.context.run_command(
366+ "%s -H %s -U admin -P admin %s" % (
367+ self.ipmitool, self.host, command
368+ )
369+ )
370+
371+ def set_to_boot_from_disk(self):
372+ self.__ipmi("chassis bootdev disk")
373+
374+ def set_to_boot_from_pxe(self):
375+ self.__ipmi("chassis bootdev pxe")
376+
377+ def power_off(self):
378+ self.__ipmi("chassis power off")
379+
380+ def power_on(self):
381+ self.__ipmi("chassis power on")
382+
383+ def reset(self):
384+ self.__ipmi("chassis power reset")
385+
386+
387+class IpmiPxeBoot(object):
388+ """
389+ This class provides a convenient object-oriented API that can be
390+ used to initiate power on/off and boot device selection for pxe
391+ and disk boot devices using ipmi commands.
392+ """
393+
394+ def __init__(self, context, host):
395+ self.ipmitool = IPMITool(context, host)
396+
397+ def power_on_boot_master(self):
398+ self.ipmitool.set_to_boot_from_pxe()
399+ self.ipmitool.power_on()
400+ self.ipmitool.reset()
401+
402+ def power_reset_boot_master(self):
403+ self.ipmitool.set_to_boot_from_pxe()
404+ self.ipmitool.reset()
405+
406+ def power_on_boot_image(self):
407+ self.ipmitool.set_to_boot_from_disk()
408+ self.ipmitool.power_on()
409+ self.ipmitool.reset()
410+
411+ def power_off(self):
412+ self.ipmitool.power_off()
413+
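
The image-transfer logic in `_deploy_image` above (map the local image path onto the dispatcher's HTTP URL, pick a decompressor from the file extension, then pipe the result into `dd`) can be sketched in isolation as below. This is only an illustrative sketch, not code from the branch: the helper names `image_url`, `decompression_command` and `write_command` are hypothetical, and the paths and URLs in the usage note are made-up values.

```python
def image_url(base_url, tmpdir, image_file):
    """Map a local image path under tmpdir onto the HTTP base URL.

    Hypothetical helper mirroring the tmpdir/url handling in _deploy_image.
    """
    # Drop the local tmpdir prefix, then join the pieces with single slashes.
    relative = image_file.replace(tmpdir, '')
    return '/'.join(part.strip('/') for part in [base_url, relative])


def decompression_command(filename):
    """Return the decompressor for filename, or None if it is uncompressed."""
    if filename.endswith('.gz'):
        return '/bin/gzip -dc'
    if filename.endswith('.bz2'):
        return '/bin/bzip2 -dc'
    return None


def write_command(image_path, device):
    """Build the shell command that writes image_path onto device."""
    decompress = decompression_command(image_path)
    if decompress is not None:
        # Stream the decompressed image straight into dd.
        return '%s %s | dd bs=4M of=%s' % (decompress, image_path, device)
    return 'dd bs=4M if=%s of=%s' % (image_path, device)
```

For example, with an assumed base URL of `http://server/images/` and an assumed tmpdir of `/srv/images`, `image_url` turns `/srv/images/tmp123/sd.img.bz2` into `http://server/images/tmp123/sd.img.bz2`, and `write_command('/builddir/sd.img.bz2', '/dev/sda')` yields the bzip2-to-dd pipeline. Keeping these as pure string-building functions would make the extension handling testable without a booted master image.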
