Offspring

Code review comment for lp:~dooferlad/offspring/linaro_offspring_add_bzr_ssh

linaro_offspring_add_bzr_ssh
Merge into linaro

Revision history for this message

James Tunnicliffe (dooferlad) wrote on 2011-10-03:

Hi,

Thanks for the feedback.

On 30 September 2011 20:07, Guilherme Salgado
<email address hidden> wrote:
>> The SSH user is inserted into the bzr branch URL. lp: URLs are supported, as are bzr+ssh.
>
> Did you consider calling "bzr lp-login <username>" right after running
> ssh-add instead of changing the branch URL when starting the build?
> That would be simpler to implement and more flexible as it'd allow any
> URL schemes, but it'd also allow the build script to fetch other
> branches from LP using the same user.

That is a very good point. I didn't consider that.

I guess using bzr lp-login may be a bit irritating when testing
because it would keep logging us out.

I am not familiar with the mechanics of how lp-login works, but it
seems to use ~/.bazaar/authentication.conf to store a block of
information:

[Launchpad]
host = .launchpad.net
scheme = ssh
user = dooferlad

I can't see any way of overriding this using an environment variable.
If several slaves were being run as the same user, sharing a home
directory, they would all use the user set by the last process to
start so that would cause us some problems.

http://doc.bazaar.canonical.com/latest/en/user-reference/configuration-help.html
http://doc.bazaar.canonical.com/developers/authentication-ring.html
http://doc.bazaar.canonical.com/latest/en/user-reference/authentication-help.html

A quick skim of these says that all URL formats accept the form:
bzr branch <scheme>://<user>@host/path
So we are fine embedding the user in the URL in general and I change
lp:~ to the long form as a separate step, so the current code is safe.
I agree that a more simple solution would be nice though. Will let you
know if I think of anything better.

>> As yet I have not modified the admin interface. You can add a project through the non-admin UI and from there add a user + SSH key. On the admin interface you can't see the key (which has to be not pass phrase protected) or user name. I am planning on adding a sensible way of updating the user and key without leaking the key back out of the interface tomorrow, at which point I think this feature will be complete.
>> differences between files attachment (review-diff.txt)
>> === modified file 'lib/offspring/build/bin/offspring-build'
>> --- lib/offspring/build/bin/offspring-build 2011-09-21 21:06:01 +0000
>> +++ lib/offspring/build/bin/offspring-build 2011-09-29 17:38:30 +0000
>> @@ -168,10 +168,39 @@
>> Stop_logging
>> Catch_faults disable
>>
>> + if [ $RUN_SSH_AGENT = 1 ]; then
>> + echo kill
>
> Was this just for debugging?

Oops, yes. Will delete it!

>> + kill $SSH_AGENT_PID
>> + fi
>
> Given the security implications of failing to kill the ssh agent, I
> think it's really important that we have a test for this. Do you have
> any idea how hard it'd be to write one?

I think it is possible, but I am not sure how simple it will be. I
think it will involve:

1. Start a new, clean shell (assert that no ssh-agent is running in it)
2. Run a job that checks that ssh-agent is running and somehow returns
information about this to the test runner.
3. Check that no ssh-agent is running

If the new shell is created by a python subprocess call, I can monitor
its stdout, which will be fine for checking that things go according
to plan.

It is an interesting point about leaving ssh-agent processes running.
It isn't as long as you trust the users on the machine. We should
treat our users as hostile though, so the assumption would be that one
of them could put together a checkout that just tried to connect to
running ssh-agent processes and log into other users private branches.
To stop this we would have to run each build in some form of sandbox
or as separate users. Perhaps the following would work:

start build wrapper on build machine
build wrapper starts ssh-agent
build wrapper does "ssh -A untrusted@localhost <build script>

So, as always, we need to trust root, but that is OK. All commands
from untrusted users are executed as separate logins to the user
"untrusted" and all the ssh authentication is performed by forwarding
the authentication agent back to a trusted user in a way that can't
easily be hijacked.

BTW, "borrowing" an existing ssh-agent is *really* easy, so we should do this!

>> +
>> Print
>> Print "OK: %s build %s completed at %s." "${PROJECT}" "${BUILDNAME}" "`date`"
>> }
>>
>> +_AddKeys() {
>> + echo AddKeys
>> + INDEX=0
>> + RUN_SSH_AGENT=0
>> + SSH_KEYS[0]=0
>> + INSERT_POINT=0
>> + for ARG in "$@"
>> + do
>> + if [ $INDEX -gt 2 ]; then # key files...
>> + let RUN_SSH_AGENT=1
>> + SSH_KEYS[$INSERT_POINT]="$ARG"
>> + INSERT_POINT=$((INSERT_POINT+1))
>
> Would it be possible to simplify this a bit by checking the number of
> arguments?
>
> if [ $# -gt 2 ]; then
> # We were passed an SSH key to use when fetching the config from LP
> SSH_KEY=$3
> fi
>
> And later you could even use SSH_KEY to decide whether or not run run
> ssh-agent?
>
>> + fi
>> + done
>> +
>> + if [ $RUN_SSH_AGENT = 1 ]; then
>> + eval `ssh-agent`
>> + for KEY in "$SSH_KEYS"
>
> The for loop is not needed here as there can only be one SSH key
> associated with a project, right?

Yep, I can simplify all of the above by assuming that if there is a
third argument, it is a location of an SSH key file.

<snip>
>> class ProjectBuildProcess(MonitoredProcess):
>>
>> - def __init__(self, supervisor, projectName, configBranchURL):
>> - MonitoredProcess.__init__(
>> - self, supervisor, [projectName, configBranchURL],
>> + def __init__(self, supervisor, projectName, configBranchURL, lpSSHKey=None):
>> + self.args = [projectName, configBranchURL]
>> + self.ssh_key_file = None
>> +
>> + if lpSSHKey:
>> + # Write SSH key to disk. tempfile will delete it after the build
>> + # when this class goes out of scope.
>> + self.ssh_key_file = tempfile.NamedTemporaryFile()
>> + self.args.append(self.ssh_key_file.name)
>> + self.ssh_key_file.write(lpSSHKey)
>> + self.ssh_key_file.flush()
>
> I wonder if it wouldn't make more sense to run this in preExecution()
> (or maybe setup(), even).

Probably. Good point.

>> === added file 'lib/offspring/slave/test_private_ssh_keys.py'
>> --- lib/offspring/slave/test_private_ssh_keys.py 1970-01-01 00:00:00 +0000
>> +++ lib/offspring/slave/test_private_ssh_keys.py 2011-09-29 17:38:30 +0000
>
> This should be under a tests/ directory; do you mind creating it and
> moving it there?

No problem. There is another test file in the slave directory, so I
just went with the existing structure, but as soon as there is a
collection a directory definitely makes sense.

>> @@ -0,0 +1,54 @@
>> +# Copyright 2010 Canonical Ltd. This software is licensed under the
>> +# GNU Affero General Public License version 3 (see the file LICENSE).
>> +import os
>> +
>> +import offspring.slave
>> +import unittest
>> +import xmlrpclib
>> +import re
>> +import os.path
>> +
>> +class TestLexbuilderSlavePrivateSSHKeys(unittest.TestCase):
>> +
>> + def setUp(self):
>> + self.slave = offspring.slave.LexbuilderSlave(('localhost', 8760))
>> + self.slave.start()
>> + self.x = xmlrpclib.ServerProxy("http://admin:password@localhost:8760")
>> +
>> + def tearDown(self):
>> + self.slave.stop()
>> +
>> + def test_startBuildWithCustomSSHKey(self):
>> + args = ["project_name", "lp:~dooferlad/offspring/config", "user", "key"]
>> + expected_args = [args[0],
>> + re.sub("lp:", "bzr+ssh://" + args[2] + "@bazaar.launchpad.net/", args[1]),
>> + args[3]]
>> +
>> + self.slave.startBuild(args[0], args[1], args[2], args[3])
>> + passed_args = self.slave.currentBuild.args
>> +
>> + self.assertEqual(expected_args, passed_args)
>> +
>> + def test_startBuildWithCustomSSHKey(self):
>
> This test is overwriting the previous one.
>
>> + args = ["project_name", "lp:~dooferlad/offspring/config", "user", "key"]
>
> Will this cause startBuild() to actually fetch your config branch from
> LP? We must not do that because then we need to be able to connect to
> LP if we want to run the tests.

No, the config branch in the test doesn't exist, but it could do with
a comment to say so! I should also test it without an internet
connection to check to see if I have missed something, like it waiting
on a time-out or something.

>> + expected_args = [args[0],
>> + re.sub("lp:", "bzr+ssh://" + args[2] + "@bazaar.launchpad.net/", args[1]),
>> + args[3]]
>> +
>> + self.slave.startBuild(args[0], args[1], args[2], args[3])
>> +
>> + file_path = self.slave.currentBuild.ssh_key_file.name
>> + read_key = open(file_path).read()
>> +
>> + # We should have a key written to a file for the slave to read...
>> + self.assertEqual(read_key, args[3])
>> +
>> + self.slave.stop()
>
> We don't really need to stop the slave to have the key file removed,
> right?

Actually no. That should be a call to stopBuild().

>> + # Having stopped the slave, the key should have been erased
>> + self.assertFalse(os.path.exists(file_path))
>> +
>> +def test_suite():
>> + loader = unittest.TestLoader()
>> + return loader.loadTestsFromName(__name__)
>
> I'm pretty sure you don't need this as we're using nose as the test
> runner.

Ah, OK. I just copied slave/tests.py. Will fix both up.

--
James Tunnicliffe

Hi,

Thanks for the feedback.

On 30 September 2011 20:07, Guilherme Salgado
<guilherme.salgado@linaro.org> wrote:
>> The SSH user is inserted into the bzr branch URL. lp: URLs are supported, as are bzr+ssh.
>
> Did you consider calling "bzr lp-login <username>" right after running
> ssh-add instead of changing the branch URL when starting the build?
> That would be simpler to implement and more flexible as it'd allow any
> URL schemes, but it'd also allow the build script to fetch other
> branches from LP using the same user.

That is a very good point. I didn't consider that.

I guess using bzr lp-login may be a bit irritating when testing
because it would keep logging us out.

I am not familiar with the mechanics of how lp-login works, but it
seems to use ~/.bazaar/authentication.conf to store a block of
information:

[Launchpad]
host = .launchpad.net
scheme = ssh
user = dooferlad

A quick skim of these says that all URL formats accept the form:
  bzr branch <scheme>://<user>@host/path
So we are fine embedding the user in the URL in general and I change
lp:~ to the long form as a separate step, so the current code is safe.
I agree that a more simple solution would be nice though. Will let you
know if I think of anything better.

>> As yet I have not modified the admin interface. You can add a project through the non-admin UI and from there add a user + SSH key. On the admin interface you can't see the key (which has to be not pass phrase protected) or user name. I am planning on adding a sensible way of updating the user and key without leaking the key back out of the interface tomorrow, at which point I think this feature will be complete.
>> differences between files attachment (review-diff.txt)
>> === modified file 'lib/offspring/build/bin/offspring-build'
>> --- lib/offspring/build/bin/offspring-build   2011-09-21 21:06:01 +0000
>> +++ lib/offspring/build/bin/offspring-build   2011-09-29 17:38:30 +0000
>> @@ -168,10 +168,39 @@
>>      Stop_logging
>>      Catch_faults disable
>>
>> +    if [ $RUN_SSH_AGENT = 1 ]; then
>> +        echo kill
>
> Was this just for debugging?

Oops, yes. Will delete it!

>> +        kill $SSH_AGENT_PID
>> +    fi
>
> Given the security implications of failing to kill the ssh agent, I
> think it's really important that we have a test for this. Do you have
> any idea how hard it'd be to write one?

I think it is possible, but I am not sure how simple it will be. I
think it will involve:

1. Start a new, clean shell (assert that no ssh-agent is running in it)
2. Run a job that checks that ssh-agent is running and somehow returns
   information about this to the test runner.
3. Check that no ssh-agent is running

If the new shell is created by a python subprocess call, I can monitor
its stdout, which will be fine for checking that things go according
to plan.

start build wrapper on build machine
build wrapper starts ssh-agent
build wrapper does "ssh -A untrusted@localhost <build script>

BTW, "borrowing" an existing ssh-agent is *really* easy, so we should do this!

>> +
>>      Print
>>      Print "OK: %s build %s completed at %s." "${PROJECT}" "${BUILDNAME}" "`date`"
>>  }
>>
>> +_AddKeys() {
>> +    echo AddKeys
>> +    INDEX=0
>> +    RUN_SSH_AGENT=0
>> +    SSH_KEYS[0]=0
>> +    INSERT_POINT=0
>> +    for ARG in "$@"
>> +    do
>> +        if [ $INDEX -gt 2 ]; then # key files...
>> +            let RUN_SSH_AGENT=1
>> +            SSH_KEYS[$INSERT_POINT]="$ARG"
>> +            INSERT_POINT=$((INSERT_POINT+1))
>
> Would it be possible to simplify this a bit by checking the number of
> arguments?
>
>  if [ $# -gt 2 ]; then
>    # We were passed an SSH key to use when fetching the config from LP
>    SSH_KEY=$3
>  fi
>
> And later you could even use SSH_KEY to decide whether or not run run
> ssh-agent?
>
>> +        fi
>> +    done
>> +
>> +    if [ $RUN_SSH_AGENT = 1 ]; then
>> +        eval `ssh-agent`
>> +        for KEY in "$SSH_KEYS"
>
> The for loop is not needed here as there can only be one SSH key
> associated with a project, right?

Yep, I can simplify all of the above by assuming that if there is a
third argument, it is a location of an SSH key file.

<snip>
>>  class ProjectBuildProcess(MonitoredProcess):
>>
>> -    def __init__(self, supervisor, projectName, configBranchURL):
>> -        MonitoredProcess.__init__(
>> -            self, supervisor, [projectName, configBranchURL],
>> +    def __init__(self, supervisor, projectName, configBranchURL, lpSSHKey=None):
>> +        self.args = [projectName, configBranchURL]
>> +        self.ssh_key_file = None
>> +
>> +        if lpSSHKey:
>> +            # Write SSH key to disk. tempfile will delete it after the build
>> +            # when this class goes out of scope.
>> +            self.ssh_key_file = tempfile.NamedTemporaryFile()
>> +            self.args.append(self.ssh_key_file.name)
>> +            self.ssh_key_file.write(lpSSHKey)
>> +            self.ssh_key_file.flush()
>
> I wonder if it wouldn't make more sense to run this in preExecution()
> (or maybe setup(), even).

Probably. Good point.

>> === added file 'lib/offspring/slave/test_private_ssh_keys.py'
>> --- lib/offspring/slave/test_private_ssh_keys.py      1970-01-01 00:00:00 +0000
>> +++ lib/offspring/slave/test_private_ssh_keys.py      2011-09-29 17:38:30 +0000
>
> This should be under a tests/ directory; do you mind creating it and
> moving it there?

No problem. There is another test file in the slave directory, so I
just went with the existing structure, but as soon as there is a
collection a directory definitely makes sense.

>> @@ -0,0 +1,54 @@
>> +# Copyright 2010 Canonical Ltd.  This software is licensed under the
>> +# GNU Affero General Public License version 3 (see the file LICENSE).
>> +import os
>> +
>> +import offspring.slave
>> +import unittest
>> +import xmlrpclib
>> +import re
>> +import os.path
>> +
>> +class TestLexbuilderSlavePrivateSSHKeys(unittest.TestCase):
>> +
>> +    def setUp(self):
>> +        self.slave = offspring.slave.LexbuilderSlave(('localhost', 8760))
>> +        self.slave.start()
>> +        self.x = xmlrpclib.ServerProxy("http://admin:password@localhost:8760")
>> +
>> +    def tearDown(self):
>> +        self.slave.stop()
>> +
>> +    def test_startBuildWithCustomSSHKey(self):
>> +        args = ["project_name", "lp:~dooferlad/offspring/config", "user", "key"]
>> +        expected_args = [args[0],
>> +                         re.sub("lp:", "bzr+ssh://" + args[2] + "@bazaar.launchpad.net/", args[1]),
>> +                         args[3]]
>> +
>> +        self.slave.startBuild(args[0], args[1], args[2], args[3])
>> +        passed_args = self.slave.currentBuild.args
>> +
>> +        self.assertEqual(expected_args, passed_args)
>> +
>> +    def test_startBuildWithCustomSSHKey(self):
>
> This test is overwriting the previous one.
>
>> +        args = ["project_name", "lp:~dooferlad/offspring/config", "user", "key"]
>
> Will this cause startBuild() to actually fetch your config branch from
> LP?  We must not do that because then we need to be able to connect to
> LP if we want to run the tests.

>> +        expected_args = [args[0],
>> +                         re.sub("lp:", "bzr+ssh://" + args[2] + "@bazaar.launchpad.net/", args[1]),
>> +                         args[3]]
>> +
>> +        self.slave.startBuild(args[0], args[1], args[2], args[3])
>> +
>> +        file_path = self.slave.currentBuild.ssh_key_file.name
>> +        read_key = open(file_path).read()
>> +
>> +        # We should have a key written to a file for the slave to read...
>> +        self.assertEqual(read_key, args[3])
>> +
>> +        self.slave.stop()
>
> We don't really need to stop the slave to have the key file removed,
> right?

Actually no. That should be a call to stopBuild().

>> +        # Having stopped the slave, the key should have been erased
>> +        self.assertFalse(os.path.exists(file_path))
>> +
>> +def test_suite():
>> +    loader = unittest.TestLoader()
>> +    return loader.loadTestsFromName(__name__)
>
> I'm pretty sure you don't need this as we're using nose as the test
> runner.

Ah, OK. I just copied slave/tests.py. Will fix both up.

-- 
James Tunnicliffe

« Back to merge proposal