Merge lp:~nick-schutt/lava-dispatcher/nicks-highbank-support into lp:lava-dispatcher

Proposed by Nicholas Schutt
Status: Superseded
Proposed branch: lp:~nick-schutt/lava-dispatcher/nicks-highbank-support
Merge into: lp:lava-dispatcher
Diff against target: 413 lines (+357/-2)
6 files modified
lava_dispatcher/client/base.py (+1/-1)
lava_dispatcher/client/lmc_utils.py (+6/-1)
lava_dispatcher/config.py (+2/-0)
lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf (+2/-0)
lava_dispatcher/device/highbank.py (+261/-0)
lava_dispatcher/ipmi.py (+85/-0)
To merge this branch: bzr merge lp:~nick-schutt/lava-dispatcher/nicks-highbank-support
Reviewer          Review Type   Date Requested   Status
Antonio Terceiro                                 Needs Fixing
Review via email: mp+157658@code.launchpad.net

This proposal supersedes a proposal from 2013-03-22.

This proposal has been superseded by a proposal from 2013-04-16.

Description of the change

Beta version of Calxeda Highbank support. Tested with lava_test_shell and deployed on 8 machines for testing.

deploy_linaro and deploy_linaro_prebuilt are implemented.

Tested with hwpack + rootfs from snapshots.linaro.org.

1. Boots a kernel + initrd master image via PXE

2. dd's the lmc-produced image directly to the hard disk

3. Uses busybox httpd on the master initrd to transfer files from the target

4. Boots from the disk once the disk is prepared

5. Boots the initrd master image again to retrieve the results via busybox httpd

Michael Hudson-Doyle (mwhudson) wrote : Posted in a previous version of this proposal

This looks interesting, but as I'm sure you're aware it's still pretty
ugly. Has there been any progress in the mean time on cleaning things
up?

Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

I'm waiting for a hwpack + snapshot that work before I can complete this.

On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden>wrote:

> This looks interesting, but as I'm sure you're aware it's still pretty
> ugly. Has there been any progress in the mean time on cleaning things
> up?
>
> --
>
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are the owner of
> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>

Nicholas Schutt (nick-schutt) wrote : Posted in a previous version of this proposal

I kind of understand that it's "ugly," but I'm wondering if you might have
some suggestions? I know it's not done yet. I have made it work a couple of
times now, once with a tarball of an ubuntu system (calxeda02-10), and
another time now with a snapshot that requires a bit of hacking right now.

I think the goal now is to use lmc to produce a working image and dd that
to the disk; that should remove the "hack" part. (Copying Antonio since we
have discussed this)

On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden>wrote:

> This looks interesting, but as I'm sure you're aware it's still pretty
> ugly. Has there been any progress in the mean time on cleaning things
> up?
>
> --
>
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are the owner of
> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>

Michael Hudson-Doyle (mwhudson) wrote : Posted in a previous version of this proposal

Nicholas Schutt <email address hidden> writes:

> I kind of understand that it's "ugly," but I'm wondering if you might have
> some suggestions?

Well, I guess mainly it's the duplication from master.py (which is my
fault, I know). It would be nice if we could have a common base class
of MasterImageTarget and HighbankTarget. The other ugliness is mostly
the usual dispatcher cruft :(

> I know it's not done yet. I have made it work a couple of times now,
> once with a tarball of an ubuntu system (calxeda02-10), and another
> time now with a snapshot that requires a bit of hacking right now.

Cool!

> I think the goal now is to use lmc to produce a working image and dd that
> to the disk; that should remove the "hack" part. (Copying Antonio since we
> have discussed this)

When I talked to Fathi about this (admittedly about 6 weeks ago), he
didn't think that lmc would support highbank. I agree just dd-ing an
image over /dev/sda would be a lot cleaner.

Cheers,
mwh

> On 2 April 2013 04:52, Michael Hudson-Doyle <email address hidden>wrote:
>
>> This looks interesting, but as I'm sure you're aware it's still pretty
>> ugly. Has there been any progress in the mean time on cleaning things
>> up?
>>
>> --
>>
>> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
>> You are the owner of
>> lp:~nick-schutt/lava-dispatcher/nicks-highbank-support.
>>
>
> --
> https://code.launchpad.net/~nick-schutt/lava-dispatcher/nicks-highbank-support/+merge/154974
> You are subscribed to branch lp:lava-dispatcher.
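
The common base class suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `BaseMasterTarget`, `boot_master_image` and `deploy_image` are hypothetical names rather than the dispatcher's actual API, the step descriptions for Highbank are taken from the proposal description, and the real MasterImageTarget carries far more state than shown here.

```python
# Hypothetical sketch: share the deployment flow between two target classes.
# None of the method names below are taken from lava-dispatcher itself.


class BaseMasterTarget(object):
    """Holds the steps that both targets run in the same order."""

    def deploy(self, image_url):
        # The shared flow lives here; subclasses only fill in the steps.
        return [self.boot_master_image(), self.deploy_image(image_url)]

    def boot_master_image(self):
        raise NotImplementedError

    def deploy_image(self, image_url):
        raise NotImplementedError


class MasterImageTarget(BaseMasterTarget):
    def boot_master_image(self):
        return "boot master image"

    def deploy_image(self, image_url):
        return "deploy %s (board-specific)" % image_url


class HighbankTarget(BaseMasterTarget):
    def boot_master_image(self):
        return "boot kernel + initrd master image via PXE"

    def deploy_image(self, image_url):
        return "dd %s to the hard disk" % image_url
```

With this shape, only the two step methods would differ per device, which is the duplication the review comment is pointing at.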

Fathi Boudra (fboudra) wrote : Posted in a previous version of this proposal

On 2 April 2013 23:10, Michael Hudson-Doyle
<email address hidden> wrote:
>> I think the goal now is to use lmc to produce a working image and dd that
>> to the disk; that should remove the "hack" part. (Copying Antonio since we
>> have discussed this)
>
> When I talked to Fathi about this (admittedly about 6 weeks ago), he
> didn't think that lmc would support highbank. I agree just dd-ing an
> image over /dev/sda would be a lot cleaner.

https://code.launchpad.net/~fboudra/linaro-image-tools/highbank-support

It's now supported in a similar way to a development board.
I'm not convinced it's the best approach for servers but it works.

Antonio Terceiro (terceiro) wrote :

Hey Nick,

It's freaking cool that we already had this working! :-)

My comments follow. I know you have a separate branch where you are
trying to abstract the parts in common with master.py, but I am making
some comments about that here anyway.

 review needs-fixing

> === modified file 'lava_dispatcher/client/base.py'
> --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> @@ -154,7 +154,7 @@
> lava_server_ip = self._client.context.config.lava_server_ip
> self.run(
> "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> - ["1 received", "0 received", "Network is unreachable"],
> + ["1 received|1 packets* received", "0 received|0 packets received", "Network is unreachable"],
> timeout=5, failok=True)
> if self.match_id == 0:
> return True
>

Do you really need this? Did ping have a different output on the rootfs
you tested with?

> === modified file 'lava_dispatcher/client/lmc_utils.py'
> --- lava_dispatcher/client/lmc_utils.py 2013-02-18 03:19:14 +0000
> +++ lava_dispatcher/client/lmc_utils.py 2013-04-08 13:59:23 +0000
> @@ -15,7 +15,8 @@
> )
>
>
> -def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None):
> +def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None,
> + extra_boot_args=None, image_size=None):
> """Generate image from a hwpack and rootfs url
>
> :param hwpack_url: url of the Linaro hwpack to download
> @@ -32,7 +33,7 @@
> rootfs_path = download_image(rootfs_url, client.context, outdir, decompress=False)
>
> logging.info("linaro-media-create version information")
> - cmd = "sudo linaro-media-create -v"
> + cmd = "linaro-media-create -v"
> rc, output = getstatusoutput(cmd)
> metadata = client.context.test_data.get_metadata()
> metadata['target.linaro-media-create-version'] = output

I'm not sure we want to drop the sudo there. Even though the dispatcher
currently requires being run as root, I think in the long run we should
be able to drop that requirement. Also, if we are root already, the sudo
does no harm.

Additionally, it's a good idea to keep the merge proposal focused and
avoid lateral changes.

> @@ -42,11 +43,15 @@
>
> logging.info("client.device_type = %s" %client.config.device_type)
>
> - cmd = ("sudo flock /var/lock/lava-lmc.lck linaro-media-create --hwpack-force-yes --dev %s "
> + cmd = ("flock /var/lock/lava-lmc.lck linaro-media-create --hwpack-force-yes --dev %s "
> "--image-file %s --binary %s --hwpack %s --image-size 3G --bootloader %s" %
> (client.config.lmc_dev_arg, image_file, rootfs_path, hwpack_path, bootloader))

Ditto.

> if rootfstype is not None:
> cmd += ' --rootfs ' + rootfstype
> + if image_size is not None:
> + cmd += ' --image-size ' + image_size
> + if extra_boot_args is not None:
> + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> logging.info("Executing the linaro-media-create command")
> logging.info(cmd)
>...

review: Needs Fixing
Michael Hudson-Doyle (mwhudson) wrote :

Just one quibble.

Antonio Terceiro <email address hidden> writes:

>> + def start_http_server(self, runner, port=80):
>> + # busybox produces no output to parse for, so let it run as a daemon
>> + runner.run('busybox httpd -v -p %s' % port)
>> + url_base = "http://%s:%s" % (self.master_ip, port)
>> + return url_base
>
> I don't think we should use the standard http port here, because most
> probably use cases will include installing a proper web server, and in
> that case trying to start busybox on the default port will break things.

The device is running in the initramfs here, surely?

I have discovered a new thing to be annoyed about: things that don't let
you bind to port 0 and then report the port they bound (nc does this
too).

Cheers,
mwh
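
For reference, the "bind to port 0 and report the port" behaviour mentioned above looks like this with plain Python sockets. It is a minimal sketch, not something busybox httpd or nc offer; `bind_ephemeral` is a name made up for this example.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind to port 0 so the kernel picks a free port, then report it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(1)
    # getsockname() returns (address, port) for an IPv4 socket.
    return sock, sock.getsockname()[1]


if __name__ == "__main__":
    listener, port = bind_ephemeral()
    print("listening on port %d" % port)
    listener.close()
```

A server that did this could print the chosen port on startup, and the caller would never have to guess a free one.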

Antonio Terceiro (terceiro) wrote :

On Mon, Apr 08, 2013 at 09:46:30PM -0000, Michael Hudson-Doyle wrote:
> Just one quibble.
>
> Antonio Terceiro <email address hidden> writes:
>
> >> + def start_http_server(self, runner, port=80):
> >> + # busybox produces no output to parse for, so let it run as a daemon
> >> + runner.run('busybox httpd -v -p %s' % port)
> >> + url_base = "http://%s:%s" % (self.master_ip, port)
> >> + return url_base
> >
> > I don't think we should use the standard http port here, because most
> > probably use cases will include installing a proper web server, and in
> > that case trying to start busybox on the default port will break things.
>
> The device is running in the initramfs here, surely?

Yes, you are right! Nick, in this case you don't need to care about the
port number there.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Nicholas Schutt (nick-schutt) wrote :

> I'm not sure we want to drop the sudo there. Even though the dispatcher
> currently requires being run as root, I think in the long run we should
> be able to drop that requirement. Also, if we are root already, the sudo
> does no harm.

Antonio,

I will undo the sudo changes since they're not important now that everything works with the scheduler. But, I saw an issue when running the dispatcher as root in my own local virtual environment. For each sudo command the virtual environment was lost; I'm not sure why.

Nick

597. By Nicholas Schutt

put back sudo

Nicholas Schutt (nick-schutt) wrote :

> > === modified file 'lava_dispatcher/client/base.py'
> > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > @@ -154,7 +154,7 @@
> > lava_server_ip = self._client.context.config.lava_server_ip
> > self.run(
> > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > - ["1 received", "0 received", "Network is unreachable"],
> > + ["1 received|1 packets* received", "0 received|0 packets
> received", "Network is unreachable"],
> > timeout=5, failok=True)
> > if self.match_id == 0:
> > return True
> >
>
> Do you really need this? Did ping had a different output on the rootfs
> you tested with?

>

The output from ping in the initrd seems to match what I see on my ubuntu machine,
which contains the word "packets":

nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52 time=33.8 ms

--- validation.linaro.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms

Nicholas Schutt (nick-schutt) wrote :

> > if rootfstype is not None:
> > cmd += ' --rootfs ' + rootfstype
> > + if image_size is not None:
> > + cmd += ' --image-size ' + image_size
> > + if extra_boot_args is not None:
> > + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> > logging.info("Executing the linaro-media-create command")
> > logging.info(cmd)
> >
> > @@ -85,7 +90,7 @@
> > mntdir = mkdtemp()
> > image = image_file
> > offset = get_partition_offset(image, partno)
> > - mount_cmd = "sudo mount -o loop,offset=%s %s %s" % (offset, image,
> mntdir)
> > + mount_cmd = "mount -o loop,offset=%s %s %s" % (offset, image, mntdir)
> > rc = logging_system(mount_cmd)
> > if rc != 0:
> > os.rmdir(mntdir)
>

We need lmc changes to get the snapshot image to work. Should this be done on another branch? The existing snapshots won't work without these changes.

Nicholas Schutt (nick-schutt) wrote :

> > + if runner.match_id != 0:
> > + msg = "Unable to determine dns address"
> > + logging.error(msg)
> > + raise CriticalError(msg)
> > + dns = runner.match.group(1)
> > + logging.info("DNS Address is %s" % dns)
> > + runner.run("echo nameserver %s > /etc/resolv.conf" % dns)
>
> Do we actually still need this DNS setup, by the way? Isn't everything
> that needs to be accessed now accessed by IP?
>

Agreed. I have now removed it.

Antonio Terceiro (terceiro) wrote :

On Thu, Apr 11, 2013 at 01:13:31PM -0000, Nicholas Schutt wrote:
>
> > > === modified file 'lava_dispatcher/client/base.py'
> > > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > > @@ -154,7 +154,7 @@
> > > lava_server_ip = self._client.context.config.lava_server_ip
> > > self.run(
> > > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > > - ["1 received", "0 received", "Network is unreachable"],
> > > + ["1 received|1 packets* received", "0 received|0 packets
> > received", "Network is unreachable"],
> > > timeout=5, failok=True)
> > > if self.match_id == 0:
> > > return True
> > >
> >
> > Do you really need this? Did ping had a different output on the rootfs
> > you tested with?
>
> >
>
> The output from ping in the initrd seems to match what I see on my ubuntu machine,
> which contains the word "packets":
>
>
> nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
> PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
> 64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52 time=33.8 ms
>
> --- validation.linaro.org ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms

yes, but note that "packets" are mentioned in the "transmitted" part,
but we really only care about the "received" part, which in this case
will say exactly either "1 received" or "0 received".

my point is to keep the diff minimal, without changes that are unrelated
to the purpose of this MP.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Antonio Terceiro (terceiro) wrote :

On Thu, Apr 11, 2013 at 01:16:24PM -0000, Nicholas Schutt wrote:
> > > if rootfstype is not None:
> > > cmd += ' --rootfs ' + rootfstype
> > > + if image_size is not None:
> > > + cmd += ' --image-size ' + image_size
> > > + if extra_boot_args is not None:
> > > + cmd += ' --extra-boot-args "%s"' % extra_boot_args
> > > logging.info("Executing the linaro-media-create command")
> > > logging.info(cmd)
> > >
> > > @@ -85,7 +90,7 @@
> > > mntdir = mkdtemp()
> > > image = image_file
> > > offset = get_partition_offset(image, partno)
> > > - mount_cmd = "sudo mount -o loop,offset=%s %s %s" % (offset, image,
> > mntdir)
> > > + mount_cmd = "mount -o loop,offset=%s %s %s" % (offset, image, mntdir)
> > > rc = logging_system(mount_cmd)
> > > if rc != 0:
> > > os.rmdir(mntdir)
> >
>
> We need lmc changes to get the snapshot image to work. Should this be
> done on another branch? The existing snapshots won't work without
> these changes.

if this is needed in order to make this branch work with the intended
input data, then it's fine to keep them here.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org
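
The optional-flag handling discussed above could be sketched as below. This is an illustration only: `build_lmc_command` is a hypothetical helper, it targets Python 3 (`shlex.quote`) whereas the branch is Python 2, and it assumes the caller wants a single `--image-size` flag defaulting to the existing hard-coded 3G. Only flags that already appear in the diff are used.

```python
import shlex


def build_lmc_command(dev, image_file, rootfs_path, hwpack_path,
                      bootloader="u_boot", rootfstype=None,
                      extra_boot_args=None, image_size="3G"):
    """Assemble the linaro-media-create command line as one quoted string."""
    args = [
        "sudo", "flock", "/var/lock/lava-lmc.lck",
        "linaro-media-create", "--hwpack-force-yes",
        "--dev", dev,
        "--image-file", image_file,
        "--binary", rootfs_path,
        "--hwpack", hwpack_path,
        "--image-size", image_size,
        "--bootloader", bootloader,
    ]
    if rootfstype is not None:
        args += ["--rootfs", rootfstype]
    if extra_boot_args is not None:
        # Boot arguments contain spaces, so they must stay one shell word.
        args += ["--extra-boot-args", extra_boot_args]
    return " ".join(shlex.quote(arg) for arg in args)
```

Quoting each argument avoids hand-written double quotes around the boot arguments, and passing the size as a parameter keeps the flag from appearing twice on the command line.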

598. By Nicholas Schutt

code review updates from antonio

599. By Nicholas Schutt

fixes for code review updates from antonio

600. By Nicholas Schutt

change busybox httpd implementation - put in runner, use 12743 to get pid

601. By Nicholas Schutt

ctrl-c implementation does not work

602. By Nicholas Schutt

switch back to bg/kill for httpd

Nicholas Schutt (nick-schutt) wrote :

Implemented device versioning in the master image as follows:

/usr/share/initramfs-tools/hooks/master-extras

# Create version info for this image
echo '#!/bin/sh' > /tmp/lava-master-image-info
echo "echo $(date +%Y.%m.%d-%H.%M.%S)" >> /tmp/lava-master-image-info
chmod +x /tmp/lava-master-image-info
copy_exec /tmp/lava-master-image-info /sbin

> > + def get_device_version(self):
> > + # To be re-implemented when master image is generated by linaro-
> image-tools
> > + device_version = "unknown"
> > + return device_version
>
> Please make sure that some build number is included in the initrd so
> that we can read it here as device version. You can include something as
> simple as the following snippet in the initramfs-tools hooks file:
>
> echo '#!/bin/sh' > /tmp/lava-master-version
> echo "echo $(date +%Y.%m.%d)" >> /tmp/lava-master-version
> chmod +x /tmp/lava-master-version
> copy_exec /tmp/lava-master-version /sbin
>
> Then you can call lava-master-version on the target from this method and
> return the version number printed.
>
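
On the dispatcher side, reading that version back could look roughly like the sketch below. It assumes the script prints a single line in the `date` formats shown in this thread; `parse_device_version` is a hypothetical helper, and the fallback value "unknown" mirrors the stub in get_device_version.

```python
import re

# Matches 2013.04.12 or 2013.04.12-16.10.35 alone on a line.
VERSION_RE = re.compile(
    r"^(\d{4}\.\d{2}\.\d{2}(?:-\d{2}\.\d{2}\.\d{2})?)\s*$", re.MULTILINE)


def parse_device_version(console_output):
    """Return the version stamp found in the console output, or 'unknown'."""
    match = VERSION_RE.search(console_output)
    return match.group(1) if match else "unknown"
```

Anchoring the pattern to a whole line keeps the echoed command name and the shell prompt from being mistaken for the version.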

Nicholas Schutt (nick-schutt) wrote :

Here is the actual output from the initrd version of ping. It includes "packets" in the received field

root@master [rc=0]# LC_ALL=C ping -W4 -c1 192.168.1.71:8100
PING 192.168.1.71:8100 (192.168.1.71): 56 data bytes
64 bytes from 192.168.1.71: seq=0 ttl=64 time=0.802 ms

--- 192.168.1.71:8100 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.802/0.802/0.802 ms
root@master [rc=0]#
root@master [rc=0]# ifconfig eth0 | grep 'inet addr' | awk -F: '{print $2}' | awk '{print "<" $1 ">"}'

> On Thu, Apr 11, 2013 at 01:13:31PM -0000, Nicholas Schutt wrote:
> >
> > > > === modified file 'lava_dispatcher/client/base.py'
> > > > --- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
> > > > +++ lava_dispatcher/client/base.py 2013-04-08 13:59:23 +0000
> > > > @@ -154,7 +154,7 @@
> > > > lava_server_ip = self._client.context.config.lava_server_ip
> > > > self.run(
> > > > "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
> > > > - ["1 received", "0 received", "Network is unreachable"],
> > > > + ["1 received|1 packets* received", "0 received|0 packets
> > > received", "Network is unreachable"],
> > > > timeout=5, failok=True)
> > > > if self.match_id == 0:
> > > > return True
> > > >
> > >
> > > Do you really need this? Did ping had a different output on the rootfs
> > > you tested with?
> >
> > >
> >
> > The output from ping in the initrd seems to match what I see on my ubuntu
> machine,
> > which contains the word "packets":
> >
> >
> > nick@neptune:/data/linaro/calxeda/code/lava-dispatcher/nicks-highbank-
> support/lava_dispatcher$ ping -W4 -c1 validation.linaro.org
> > PING validation.linaro.org (88.98.47.97) 56(84) bytes of data.
> > 64 bytes from validation.linaro.org (88.98.47.97): icmp_req=1 ttl=52
> time=33.8 ms
> >
> > --- validation.linaro.org ping statistics ---
> > 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> > rtt min/avg/max/mdev = 33.813/33.813/33.813/0.000 ms
>
> yes, but note that "packets" are mentioned in the "transmitted" part,
> but we really only care about the "received" part, which in this case
> will say exactly either "1 received" or "0 received".
>
> my point is to keep the diff minimal, without changes that are unrelated
> with the purpose of this MP.
>
> --
> Antonio Terceiro
> Software Engineer - Linaro
> http://www.linaro.org

Nicholas Schutt (nick-schutt) wrote :

send_control('c') does not appear to work when running in the initrd master. Maybe that needs to be looked at and fixed, but to make things work I had to run in the background and use kill instead.

I have now changed it to work like this:

class HBMasterCommandRunner(MasterCommandRunner):

    http_pid = None

    def start_http_server(self):
        master_ip = self.get_master_ip()
        if self.http_pid is not None:
            raise OperationFailed("busybox httpd already running with pid %s" % self.http_pid)
        # busybox produces no output to parse for, so run it in the bg and get its pid
        self.run('busybox httpd -f &')
        self.run('echo pid:$!:pid', response=r"pid:(\d+):pid", timeout=10)
        if self.match_id != 0:
            raise OperationFailed("busybox httpd did not start")
        else:
            self.http_pid = self.match.group(1)
        url_base = "http://%s" % (master_ip)
        return url_base

    def stop_http_server(self):
        if self.http_pid is None:
            raise OperationFailed("busybox httpd not running, but stop_http_server called.")
        self.run('kill %s' % self.http_pid)
        self.http_pid = None
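
The `pid:...:pid` wrapper used above can be checked in isolation. The sketch below is an illustration with a made-up console transcript; `extract_pid` is a hypothetical helper and not part of the runner. It shows why the markers help: the serial console echoes the command itself, and the echoed `pid:$!:pid` contains no digits, so only the real output line matches.

```python
import re

PID_RE = re.compile(r"pid:(\d+):pid")


def extract_pid(console_output):
    """Return the pid reported by `echo pid:$!:pid`, or None if absent."""
    match = PID_RE.search(console_output)
    return match.group(1) if match else None
```

The pid is returned as a string, which is all that is needed to build the later `kill` command.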

> > +
> > + def start_http_server(self, runner, port=80):
> > + # busybox produces no output to parse for, so let it run as a
> daemon
> > + runner.run('busybox httpd -v -p %s' % port)
> > + url_base = "http://%s:%s" % (self.master_ip, port)
> > + return url_base
>
> I don't think we should use the standard http port here, because most
> probably use cases will include installing a proper web server, and in
> that case trying to start busybox on the default port will break things.
>
> Try some high port that is unlikely to be used by another service ...
> like ... 50888 for now. :)
>
> > +
> > + def stop_http_server(self, runner):
> > + runner.run('killall busybox')
>
> Not safe. I think it's better to just start busybox httpd in the
> foreground and then send a control-C here.
>

603. By Nicholas Schutt

remove device version code (reuse from master class), fix /builddir

604. By Nicholas Schutt

Remove ramdisk for wget + dd to disk

605. By Nicholas Schutt

Remove class HBMasterCommandRunner(MasterCommandRunner):

Antonio Terceiro (terceiro) wrote :

On Fri, Apr 12, 2013 at 04:10:35PM -0000, Nicholas Schutt wrote:
>
> Here is the actual output from the initrd version of ping. It includes "packets" in the received field
>
>
> root@master [rc=0]# LC_ALL=C ping -W4 -c1 192.168.1.71:8100
> PING 192.168.1.71:8100 (192.168.1.71): 56 data bytes
> 64 bytes from 192.168.1.71: seq=0 ttl=64 time=0.802 ms
>
> --- 192.168.1.71:8100 ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max = 0.802/0.802/0.802 ms
> root@master [rc=0]#
> root@master [rc=0]# ifconfig eth0 | grep 'inet addr' | awk -F: '{print $2}' | awk '{print "<" $1 ">"}'

ok. in that case you want to drop the '*' so the pattern is
"1 received|1 packets received"

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org
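
The two ping summary formats quoted in this thread can be checked against the agreed patterns with a small sketch. `classify_ping` is a hypothetical stand-in for the runner's pattern matching: it tries the patterns in list order, which is a simplification of how an expect-style matcher works, but it is enough to show which index each summary line selects.

```python
import re

# Pattern list as agreed in the review; index 0 means "server reachable".
PING_PATTERNS = [
    r"1 received|1 packets received",
    r"0 received|0 packets received",
    r"Network is unreachable",
]


def classify_ping(output, patterns=PING_PATTERNS):
    """Return the index of the first pattern found in output, else None."""
    for index, pattern in enumerate(patterns):
        if re.search(pattern, output):
            return index
    return None


IPUTILS_OK = "1 packets transmitted, 1 received, 0% packet loss, time 0ms"
BUSYBOX_OK = "1 packets transmitted, 1 packets received, 0% packet loss"
```

The original pattern "1 received" does not occur anywhere in the busybox line, which is why the alternation is needed for the initrd.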

Antonio Terceiro (terceiro) wrote :

On Fri, Apr 12, 2013 at 04:13:34PM -0000, Nicholas Schutt wrote:
>
> send_control('c') does not appear to work when running in the initrd master. Maybe that needs to be looked at and fixed, but to make things work I had to run in the background and use kill instead.
>
> I modified to do this differently now:
>
>
> class HBMasterCommandRunner(MasterCommandRunner):
>
> http_pid = None
>
> def start_http_server(self):
> master_ip = self.get_master_ip()
> if self.http_pid != None:
> raise OperationFailed("busybox httpd already running with pid %" % self.http_pid)
> # busybox produces no output to parse for, so run it in the bg and get its pid
> self.run('busybox httpd -f &')
> self.run('echo pid:$!:pid',response="pid:(\d+):pid",timeout=10)
> if self.match_id != 0:
> raise OperationFailed("busybox httpd did not start")
> else:
> self.http_pid = self.match.group(1)
> url_base = "http://%s" % (master_ip)
> return url_base
>
> def stop_http_server(self):
> if self.http_pid == None:
> raise OperationFailed("busybox httpd not running, but stop_http_server called.")
> self.run('kill %s' % self.http_pid)
> self.http_pid = None

Looks good.

--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org

Nicholas Schutt (nick-schutt) wrote :

The wget, bunzip, then dd to disk approach seems to be necessary to avoid memory allocation errors from the kernel. I'm not sure of the root cause.

***> The actual error is included at the end of this comment <***

Re-implemented to keep the image compressed in the ramdisk and to set the max ramdisk size to 50% of memory:

            image_file_base = os.path.basename(image_file)
            image_path = '/builddir/%s' % image_file_base

            decompression_cmd = None
            if image_file_base.endswith('.gz'):
                decompression_cmd = '/bin/gzip -dc'
            elif image_file_base.endswith('.bz2'):
                decompression_cmd = '/bin/bzip2 -dc'

            runner.run('mkdir /builddir')
            runner.run('mount -t tmpfs -o size=50% tmpfs /builddir')
            runner.run('wget -O %s %s' % (image_path, image_url), timeout=1800)

            if decompression_cmd is None:
                # Uncompressed image: write it straight to the disk.
                cmd = 'dd bs=4M if=%s of=%s' % (image_path, device)
            else:
                # Compressed image: decompress to stdout and pipe into dd.
                cmd = '%s %s | dd bs=4M of=%s' % (decompression_cmd, image_path, device)

            runner.run(cmd, timeout=1800)
            runner.run('umount /builddir')
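
The choice between a plain dd and a decompress-then-dd pipeline can be isolated into a small function, which makes it easy to check both branches. This is a sketch only: `build_write_command` and the `DECOMPRESSORS` table are hypothetical names, and the tool paths are the ones used in the snippet above.

```python
import posixpath

# Decompression command per file extension; anything else is written as-is.
DECOMPRESSORS = {
    ".gz": "/bin/gzip -dc",
    ".bz2": "/bin/bzip2 -dc",
}


def build_write_command(image_path, device):
    """Return the shell command that writes image_path onto device."""
    extension = posixpath.splitext(image_path)[1]
    decompress = DECOMPRESSORS.get(extension)
    if decompress is None:
        return "dd bs=4M if=%s of=%s" % (image_path, device)
    return "%s %s | dd bs=4M of=%s" % (decompress, image_path, device)
```

For example, `build_write_command("/builddir/lava.img.bz2", "/dev/sda")` yields a bzip2 pipeline, while an uncompressed `/builddir/lava.img` is handed to dd directly.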

--------

>
> > + elif image_url.endswith('.bz2'):
> > + decompression_cmd = '| /bin/bzip2 -dc'
> > +
> > + runner.run('mkdir /builddir')
> > + runner.run('mount -t tmpfs -o size=4G tmpfs builddir')
> > + image = '/builddir/lava.img'
> > + runner.run('wget -O - %s %s > %s' % (image_url,
> decompression_cmd, image), timeout=1800)
> > + runner.run('dd bs=4M if=%s of=%s' % (image, device),
> timeout=1800)
> > + runner.run('umount /builddir')
>
> Is it always possible to create a 4G tmpfs?
>
> Instead of saving the image locally, I think you could pipe the wget
> output directly into dd. This way you don't need to create a tmpfs,
> nor to mount/umount it. This would be something like:
>
> runner.run("wget -O - %s %s | dd bs=4M of=%s" % (image_url,
> decompression_cmd, device))
>
> Also, there are some wget options you can use to get some progress
> indication, which is useful. Look at how master.py does it.
>
> (if you convince me the tmpfs is needed, make sure to add a missing
> slash in the last token there so it reads `/builddir` instead of
> `builddir`)
>

----------

root@master [rc=0]# wget -O - http://192.168.1.71/highbank2-tmp/images/tmprBHgDV
/lava.img.bz2 | /bin/bzip2 -dc | dd bs=4M of=/dev/sda
Connecting to 192.168.1.71 (192.168.1.71:80)
- 1% | | 2558k 0:01:16 ETA- 2% | | 5266k 0:01:12 ETA- 4% |* | 8378k 0:01:07 ETA- 4% |* | 8930k 0:01:24 ETA- 4% |* | 9282k 0:01:41 ETA- 5% |* | 10198k 0:01:49 ETA- 6% |* |

...

143M 0:00:26 ETA- 75% |*********************** | 145M 0:00:25 ETA- 76% |******...


606. By Nicholas Schutt

separate wget, bunzip2, and dd for deploy_image

607. By Nicholas Schutt

separate wget, bunzip2, and dd for deploy_image

608. By Nicholas Schutt

move start_http/stop_http to highbank class

609. By Nicholas Schutt

fix percent symbol in mount command

610. By Nicholas Schutt

minor change to start_http_server

611. By Nicholas Schutt

simplify start_http - keep pid on the target

612. By Nicholas Schutt

put http code back in HBMasterCommandRunner

613. By Nicholas Schutt

change tmpfs size from 50% to 100%

614. By Nicholas Schutt

add partition resize support

615. By Nicholas Schutt

allow resize of ext2, ext3, ext4 and add stub for btrfs

616. By Nicholas Schutt

make partno a string

617. By Nicholas Schutt

issue a warning if part resize fails, fix tabs

618. By Nicholas Schutt

learning python

619. By Nicholas Schutt

learning python

620. By Nicholas Schutt

nedit is not vi

621. By Nicholas Schutt

add call to dhclient

622. By Nicholas Schutt

add call to dhclient

623. By Nicholas Schutt

add call to dhclient

624. By Nicholas Schutt

add dd to erase the first part of the disk

Unmerged revisions

Preview Diff

=== modified file 'lava_dispatcher/client/base.py'
--- lava_dispatcher/client/base.py 2013-03-27 11:22:07 +0000
+++ lava_dispatcher/client/base.py 2013-04-16 02:42:25 +0000
@@ -154,7 +154,7 @@
154 lava_server_ip = self._client.context.config.lava_server_ip154 lava_server_ip = self._client.context.config.lava_server_ip
155 self.run(155 self.run(
156 "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,156 "LC_ALL=C ping -W4 -c1 %s" % lava_server_ip,
157 ["1 received", "0 received", "Network is unreachable"],157 ["1 received|1 packets received", "0 received|0 packets received", "Network is unreachable"],
158 timeout=5, failok=True)158 timeout=5, failok=True)
159 if self.match_id == 0:159 if self.match_id == 0:
160 return True160 return True
161161
=== modified file 'lava_dispatcher/client/lmc_utils.py'
--- lava_dispatcher/client/lmc_utils.py 2013-02-18 03:19:14 +0000
+++ lava_dispatcher/client/lmc_utils.py 2013-04-16 02:42:25 +0000
@@ -15,7 +15,8 @@
15 )15 )
1616
1717
18def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None):18def generate_image(client, hwpack_url, rootfs_url, outdir, bootloader='u_boot', rootfstype=None,
19 extra_boot_args=None, image_size=None):
19 """Generate image from a hwpack and rootfs url20 """Generate image from a hwpack and rootfs url
2021
21 :param hwpack_url: url of the Linaro hwpack to download22 :param hwpack_url: url of the Linaro hwpack to download
@@ -47,6 +48,10 @@
47 (client.config.lmc_dev_arg, image_file, rootfs_path, hwpack_path, bootloader))48 (client.config.lmc_dev_arg, image_file, rootfs_path, hwpack_path, bootloader))
48 if rootfstype is not None:49 if rootfstype is not None:
49 cmd += ' --rootfs ' + rootfstype50 cmd += ' --rootfs ' + rootfstype
51 if image_size is not None:
52 cmd += ' --image-size ' + image_size
53 if extra_boot_args is not None:
54 cmd += ' --extra-boot-args "%s"' % extra_boot_args
50 logging.info("Executing the linaro-media-create command")55 logging.info("Executing the linaro-media-create command")
51 logging.info(cmd)56 logging.info(cmd)
5257
5358
=== modified file 'lava_dispatcher/config.py'
--- lava_dispatcher/config.py 2013-04-05 17:19:57 +0000
+++ lava_dispatcher/config.py 2013-04-16 02:42:25 +0000
@@ -104,6 +104,8 @@
         default='Press Enter to stop auto boot...')
     vexpress_usb_mass_storage_device = schema.StringOption(default=None)
 
+    ecmeip = schema.StringOption()
+
 class OptionDescriptor(object):
     def __init__(self, name):
         self.name = name

=== added file 'lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf'
--- lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf 1970-01-01 00:00:00 +0000
+++ lava_dispatcher/default-config/lava-dispatcher/device-types/highbank.conf 2013-04-16 02:42:25 +0000
@@ -0,0 +1,2 @@
+client_type = highbank
+connection_command = ipmitool -I lanplus -U admin -P admin -H %(ecmeip)s sol activate

=== added file 'lava_dispatcher/device/highbank.py'
--- lava_dispatcher/device/highbank.py 1970-01-01 00:00:00 +0000
+++ lava_dispatcher/device/highbank.py 2013-04-16 02:42:25 +0000
@@ -0,0 +1,261 @@
+# Copyright (C) 2012 Linaro Limited
+#
+# Author: Michael Hudson-Doyle <michael.hudson@linaro.org>
+# Author: Nicholas Schutt <nick.schutt@linaro.org>
+#
+# This file is part of LAVA Dispatcher.
+#
+# LAVA Dispatcher is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# LAVA Dispatcher is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along
+# with this program; if not, see <http://www.gnu.org/licenses>.
+
+import contextlib
+import logging
+import os
+import pexpect
+import time
+
+from lava_dispatcher import tarballcache
+
+from lava_dispatcher.device.master import (
+    MasterCommandRunner,
+)
+from lava_dispatcher.device.target import (
+    Target
+)
+from lava_dispatcher.errors import (
+    NetworkError,
+    CriticalError,
+    OperationFailed,
+)
+from lava_dispatcher.downloader import (
+    download_image,
+    download_with_retry,
+    )
+from lava_dispatcher.utils import (
+    mk_targz,
+    rmtree,
+)
+from lava_dispatcher.client.lmc_utils import (
+    generate_image,
+)
+from lava_dispatcher.ipmi import IpmiPxeBoot
+
+
+class HighbankTarget(Target):
+
+    MASTER_PS1 = 'root@master [rc=$(echo \$?)]# '
+    MASTER_PS1_PATTERN = 'root@master \[rc=(\d+)\]# '
+
+    def __init__(self, context, config):
+        super(HighbankTarget, self).__init__(context, config)
+        self.proc = self.context.spawn(self.config.connection_command, timeout=1200)
+        self.device_version = None
+        if self.config.ecmeip == None:
+            msg = "The ecmeip address is not set for this target"
+            logging.error(msg)
+            raise CriticalError(msg)
+        self.bootcontrol = IpmiPxeBoot(context, self.config.ecmeip)
+
+    def get_device_version(self):
+        return self.device_version
+
+    def power_on(self):
+        self.bootcontrol.power_on_boot_image()
+        return self.proc
+
+    def power_off(self, proc):
+        self.bootcontrol.power_off()
+
+    def deploy_linaro(self, hwpack, rfs, bootloader):
+        image_file = generate_image(self, hwpack, rfs, self.scratch_dir, bootloader,
+                                    extra_boot_args='1', image_size='1G')
+        self._customize_linux(image_file)
+        self._deploy_image(image_file, '/dev/sda')
+
+    def deploy_linaro_prebuilt(self, image):
+        image_file = download_image(image, self.context, self.scratch_dir)
+        self._customize_linux(image_file)
+        self._deploy_image(image_file, '/dev/sda')
+
+    def _deploy_image(self, image_file, device):
+        with self._as_master() as runner:
+
+            # compress the image to reduce the transfer size
+            if not image_file.endswith('.bz2') and not image_file.endswith('gz'):
+                os.system('bzip2 -9v ' + image_file)
+                image_file += '.bz2'
+
+            tmpdir = self.context.config.lava_image_tmpdir
+            url = self.context.config.lava_image_url
+            image_file = image_file.replace(tmpdir, '')
+            image_url = '/'.join(u.strip('/') for u in [url, image_file])
+
+            build_dir = '/builddir'
+            image_file_base = build_dir + '/' + '/'.join(image_file.split('/')[-1:])
+
+            decompression_cmd = None
+            if image_file_base.endswith('.gz'):
+                decompression_cmd = '/bin/gzip -dc'
+            elif image_file_base.endswith('.bz2'):
+                decompression_cmd = '/bin/bzip2 -dc'
+
+            runner.run('mkdir %s' % build_dir)
+            runner.run('mount -t tmpfs -o size=50%% tmpfs %s' % build_dir)
+            runner.run('wget -O %s %s' % (image_file_base, image_url), timeout=1800)
+
+            if decompression_cmd != None:
+                cmd = '%s %s | dd bs=4M of=%s' % (decompression_cmd, image_file_base, device)
+            else:
+                cmd = 'dd bs=4M if=%s of=%s' % (image_file_base, device)
+
+            runner.run(cmd, timeout=1800)
+            runner.run('umount %s' % build_dir)
+
+    def get_partition(self, runner, partition):
+        if partition == self.config.boot_part:
+            partition = '/dev/disk/by-label/boot'
+        elif partition == self.config.root_part:
+            partition = '/dev/disk/by-label/rootfs'
+        else:
+            raise RuntimeError(
+                'unknown master image partition(%d)' % partition)
+        return partition
+
+
+    @contextlib.contextmanager
+    def file_system(self, partition, directory):
+        logging.info('attempting to access master filesystem %r:%s' %
+            (partition, directory))
+
+        assert directory != '/', "cannot mount entire partition"
+
+        with self._as_master() as runner:
+            runner.run('mkdir -p /mnt')
+            partition = self.get_partition(runner, partition)
+            runner.run('mount %s /mnt' % partition)
+            try:
+                targetdir = '/mnt/%s' % directory
+                runner.run('mkdir -p %s' % targetdir)
+
+                parent_dir, target_name = os.path.split(targetdir)
+
+                runner.run('/bin/tar -cmzf /tmp/fs.tgz -C %s %s' % (parent_dir, target_name))
+                runner.run('cd /tmp')  # need to be in same dir as fs.tgz
+
+                url_base = runner.start_http_server()
+
+                url = url_base + '/fs.tgz'
+                logging.info("Fetching url: %s" % url)
+                tf = download_with_retry(self.context, self.scratch_dir, url, False)
+
+                tfdir = os.path.join(self.scratch_dir, str(time.time()))
+
+                try:
+                    os.mkdir(tfdir)
+                    self.context.run_command('/bin/tar -C %s -xzf %s' % (tfdir, tf))
+                    yield os.path.join(tfdir, target_name)
+
+                finally:
+                    tf = os.path.join(self.scratch_dir, 'fs.tgz')
+                    mk_targz(tf, tfdir)
+                    rmtree(tfdir)
+
+                    # get the last 2 parts of tf, ie "scratchdir/tf.tgz"
+                    tf = '/'.join(tf.split('/')[-2:])
+                    runner.run('rm -rf %s' % targetdir)
+                    self._target_extract(runner, tf, parent_dir)
+
+            finally:
+                runner.stop_http_server()
+                runner.run('umount /mnt')
+
+    def _target_extract(self, runner, tar_file, dest, timeout=-1):
+        tmpdir = self.context.config.lava_image_tmpdir
+        url = self.context.config.lava_image_url
+        tar_file = tar_file.replace(tmpdir, '')
+        tar_url = '/'.join(u.strip('/') for u in [url, tar_file])
+        self._target_extract_url(runner,tar_url,dest,timeout=timeout)
+
+    def _target_extract_url(self, runner, tar_url, dest, timeout=-1):
+        decompression_cmd = ''
+        if tar_url.endswith('.gz') or tar_url.endswith('.tgz'):
+            decompression_cmd = '| /bin/gzip -dc'
+        elif tar_url.endswith('.bz2'):
+            decompression_cmd = '| /bin/bzip2 -dc'
+        elif tar_url.endswith('.tar'):
+            decompression_cmd = ''
+        else:
+            raise RuntimeError('bad file extension: %s' % tar_url)
+
+        runner.run('wget -O - %s %s | /bin/tar -C %s -xmf -'
+                   % (tar_url, decompression_cmd, dest),
+                   timeout=timeout)
+
+    @contextlib.contextmanager
+    def _as_master(self):
+        self.bootcontrol.power_on_boot_master()
+
+        # Two reboots seem to be necessary to ensure that pxe boot is used.
+        # Need to identify the cause and fix it
+        self.proc.expect("Hit any key to stop autoboot:")
+        self.proc.sendline('')
+        self.bootcontrol.power_reset_boot_master()
+
+        self.proc.expect("\(initramfs\)")
+        self.proc.sendline('export PS1="%s"' % self.MASTER_PS1)
+        self.proc.expect(self.MASTER_PS1_PATTERN, timeout=180, lava_no_logging=1)
+        runner = HBMasterCommandRunner(self)
+        runner.run(". /scripts/functions")
+        device = "eth0"
+        runner.run("DEVICE=%s configure_networking" % device)
+
+        self.device_version = runner.get_device_version()
+
+        try:
+            yield runner
+        finally:
+            logging.debug("deploy done")
+
+
+target_class = HighbankTarget
+
+
+class HBMasterCommandRunner(MasterCommandRunner):
+    """A CommandRunner to use when the target is booted into the master image.
+    """
+    http_pid = None
+
+    def __init__(self, target):
+        super(HBMasterCommandRunner, self).__init__(target)
+
+    def start_http_server(self):
+        master_ip = self.get_master_ip()
+        if self.http_pid != None:
+            raise OperationFailed("busybox httpd already running with pid %" % self.http_pid)
+        # busybox produces no output to parse for, so run it in the bg and get its pid
+        self.run('busybox httpd -f &')
+        self.run('echo pid:$!:pid',response="pid:(\d+):pid",timeout=10)
+        if self.match_id != 0:
+            raise OperationFailed("busybox httpd did not start")
+        else:
+            self.http_pid = self.match.group(1)
+        url_base = "http://%s" % (master_ip)
+        return url_base
+
+    def stop_http_server(self):
+        if self.http_pid == None:
+            raise OperationFailed("busybox httpd not running, but stop_http_server called.")
+        self.run('kill %s' % self.http_pid)
+        self.http_pid = None
+

=== added file 'lava_dispatcher/ipmi.py'
--- lava_dispatcher/ipmi.py 1970-01-01 00:00:00 +0000
+++ lava_dispatcher/ipmi.py 2013-04-16 02:42:25 +0000
@@ -0,0 +1,85 @@
+# Copyright (C) 2013 Linaro Limited
+#
+# Authors:
+#   Antonio Terceiro <antonio.terceiro@linaro.org>
+#   Michael Hudson-Doyle <michael.hudson@linaro.org>
+#
+# This file is part of LAVA Dispatcher.
+#
+# LAVA Dispatcher is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# LAVA Dispatcher is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along
+# with this program; if not, see <http://www.gnu.org/licenses>.
+
+
+class IPMITool(object):
+    """
+    This class wraps the ipmitool CLI to provide a convenient object-oriented
+    API that can be composed into the implementation of devices that can be
+    managed with IPMI.
+    """
+
+    def __init__(self, context, host, ipmitool="ipmitool"):
+        self.host = host
+        self.context = context
+        self.ipmitool = ipmitool
+
+    def __ipmi(self, command):
+        self.context.run_command(
+            "%s -H %s -U admin -P admin %s" % (
+                self.ipmitool, self.host, command
+            )
+        )
+
+    def set_to_boot_from_disk(self):
+        self.__ipmi("chassis bootdev disk")
+
+    def set_to_boot_from_pxe(self):
+        self.__ipmi("chassis bootdev pxe")
+
+    def power_off(self):
+        self.__ipmi("chassis power off")
+
+    def power_on(self):
+        self.__ipmi("chassis power on")
+
+    def reset(self):
+        self.__ipmi("chassis power reset")
+
+
+class IpmiPxeBoot(object):
+    """
+    This class provides a convenient object-oriented API that can be
+    used to initiate power on/off and boot device selection for pxe
+    and disk boot devices using ipmi commands.
+    """
+
+    def __init__(self, context, host):
+        self.ipmitool = IPMITool(context, host)
+
+    def power_on_boot_master(self):
+        self.ipmitool.set_to_boot_from_pxe()
+        self.ipmitool.power_on()
+        self.ipmitool.reset()
+
+    def power_reset_boot_master(self):
+        self.ipmitool.set_to_boot_from_pxe()
+        self.ipmitool.reset()
+
+    def power_on_boot_image(self):
+        self.ipmitool.set_to_boot_from_disk()
+        self.ipmitool.power_on()
+        self.ipmitool.reset()
+
+    def power_off(self):
+        self.ipmitool.power_off()
+
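
For readers following step 2 of the description (dd'ing the image to the hard disk), the command selection in _deploy_image can be looked at in isolation. The following is a minimal sketch, not code from the branch: build_dd_command is a hypothetical helper name, the block_size parameter is an assumption added for illustration, and the paths in the usage note are made up. It only builds the shell command string and does not run anything on a target.

```python
def build_dd_command(image_path, device, block_size="4M"):
    """Return the shell command that writes image_path to device.

    Hypothetical helper (not part of the branch); it mirrors the
    suffix checks used when picking a decompression command.
    """
    if image_path.endswith(".gz"):
        decompress = "/bin/gzip -dc"
    elif image_path.endswith(".bz2"):
        decompress = "/bin/bzip2 -dc"
    else:
        decompress = None

    if decompress is not None:
        # Stream the decompressed image straight into dd.
        return "%s %s | dd bs=%s of=%s" % (
            decompress, image_path, block_size, device)
    # Uncompressed image: let dd read the file itself.
    return "dd bs=%s if=%s of=%s" % (block_size, image_path, device)
```

Called as build_dd_command('/builddir/image.img.bz2', '/dev/sda'), the sketch produces a bzip2-to-dd pipeline; with an uncompressed path it falls back to a plain dd with if=. Keeping the string construction separate from runner.run() would make this step checkable without booting the master image.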
