Merge lp:~danilo/linaro-android-build-tools/no-seed into lp:linaro-android-build-tools

Proposed by Данило Шеган
Status: Merged
Approved by: James Tunnicliffe
Approved revision: 546
Merged at revision: 537
Proposed branch: lp:~danilo/linaro-android-build-tools/no-seed
Merge into: lp:linaro-android-build-tools
Diff against target: 117 lines (+66/-8)
3 files modified
build-scripts/helpers (+9/-7)
build-scripts/rewrite-manifest.py (+56/-0)
control/setup-control-node (+1/-1)
To merge this branch: bzr merge lp:~danilo/linaro-android-build-tools/no-seed
Reviewer Review Type Date Requested Status
James Tunnicliffe (community) Approve
Review via email: mp+135625@code.launchpad.net

Description of the change

Use scalable git-over-http access
=================================

This branch introduces manifest rewriting to change any references to
git://git.linaro.org/ and git://android.git.linaro.org/ to their
simple, scalable http URLs (http://(android.)git.linaro.org/git-ro).

At the same time, we need to append '.git' to repository paths.

Along the way, I also kill the non-working 'mirror' functionality for
non-seeded builds: we should be able to get similar performance with seeded
builds if we get a http proxy going on amazon ec2 cloud.

To post a comment you must log in.
Revision history for this message
Данило Шеган (danilo) wrote :
545. By Данило Шеган

Minor string reformatting.

Revision history for this message
James Tunnicliffe (dooferlad) wrote :

Looks OK. It feels like this bit should be tidier though:

71 + # Replace git://git.linaro.org/ and http://git.linaro.org/git/
72 + # base URLs with the scalable http://git.linaro.org/git-ro/.
73 + if remote.get('fetch') in (
74 + 'git://git.linaro.org', 'git://git.linaro.org/',
75 + 'http://git.linaro.org/git', 'http://git.linaro.org/git/'):
76 + remote.set('fetch', 'http://git.linaro.org/git-ro/')
77 + remotes_to_handle.add(remote.get('name'))
78 + # Replace git://android.git.linaro.org/ and
79 + # http://android.git.linaro.org/git/ base URLs with the scalable
80 + # http://git.linaro.org/git-ro/.
81 + elif remote.get('fetch') in (
82 + 'git://android.git.linaro.org', 'git://android.git.linaro.org/',
83 + 'http://android.git.linaro.org/git',
84 + 'http://android.git.linaro.org/git/'):
85 + remote.set('fetch', 'http://android.git.linaro.org/git-ro/')
86 + remotes_to_handle.add(remote.get('name'))

Isn't this basically:

url_match = re.search("git://([\w.]*git.linaro.org)(.*?)$", remote.get('fetch'))
if url_match:
    remote.set("fetch", urllib.urljoin("http://" + url_match.group(1)), "git-ro"))

If you wanted to be more careful you could add a check a check that url_match.group(2) is "git/*" or "".

review: Approve
Revision history for this message
Данило Шеган (danilo) wrote :

Thanks for the review James: I want to keep static paths so it's more obvious what's going on (regular expressions are harder to update and always require you to stop and think to ensure they are doing the right thing). I even thought of defining a dict along the lines of

 rewrite_urls = {
   'git://git.linaro.org/': 'http://git.linaro.org/git-ro/',
   'git://git.linaro.org': 'http://git.linaro.org/git-ro',
   ...
 }

and using that instead. What do you think?

Revision history for this message
James Tunnicliffe (dooferlad) wrote :

The dictionary approach is definitely better, but the trailing slash
causing duplications irritates me. You could rstrip('/') if we don't
need it (I assume we don't).

Agree about regexps - often the cause of write only code!

Revision history for this message
Данило Шеган (danilo) wrote :

That would still lead to code that has to check for several conditions: eg. it'd become something like

  remote_url = remote.get('fetch')
  # Try to see if there's a remote URL
  target_url = rewrite_urls.get(remote_url,
                                rewrite_urls.get(remote_url.rstrip('/'), None))
  if target_url is not None:
     remote.set('fetch', target_url)

(Or, I could repeat the if a few times, but I'd hate that even more).

I don't think the above would be clear enough, and would be easy to break (one would have to ensure URLs in rewrite_urls are without the '/': either with more code, or by trying not to forget that :), and all to save a few measly almost-duplicated lines which would lead to:

  remote_url = remote.get('fetch')
  if remote_url in rewrite_urls:
     remote.set('fetch', rewrite_urls[remote_url])

On top of all this, I don't want to worry about whether repo cares or doesn't care about '/', and I want to copy whatever is being done in the manifest (if there's a trailing '/', I want rewritten URL to have it as well). Sure, we can add more code for that, but it'd make it even less clear what's going on, and it would be longer.

At this time, I want to live with this duplication, but if you really insist, I'll add the code to deal with the slash.

546. By Данило Шеган

Split out URLs being rewritten into a dict as per review.

Revision history for this message
James Tunnicliffe (dooferlad) wrote :

OK, you convinced me.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'build-scripts/helpers'
2--- build-scripts/helpers 2012-11-19 11:35:11 +0000
3+++ build-scripts/helpers 2012-11-22 14:42:20 +0000
4@@ -28,19 +28,15 @@
5 mkdir -p out
6 cp .repo/manifest.xml out/source-manifest.xml
7
8- # Allow either specify hostname or pass $HUDSON_URL from Jenkins
9- mirror_host=`python -c 'import urlparse, sys; print urlparse.urlparse(sys.argv[1]).netloc if "/jenkins" in sys.argv[1] else sys.argv[1]' "$1"`
10+ echo Replace Linaro git URLs with lightweight http git URLs.
11+ ${BUILD_SCRIPT_ROOT}/rewrite-manifest.py .repo/manifest.xml > processed-manifest.xml
12+ cp processed-manifest.xml .repo/manifest.xml
13
14- time python -c 'import xmlrpclib, sys; print xmlrpclib.ServerProxy(sys.argv[1]).mirror(open(".repo/manifest.xml").read())' "http://$mirror_host:8080" > .repo/mirror-manifest.xml
15- echo "Received modified manifest:"
16- cat .repo/mirror-manifest.xml
17 echo ----------------------------
18- cp .repo/mirror-manifest.xml .repo/manifest.xml
19 time repo sync $REPO_QUIET -j$SYNC_JOBS
20 # Restore source manifest temporarily to create pinned manifest
21 cp out/source-manifest.xml .repo/manifest.xml
22 repo manifest -r -o out/pinned-manifest.xml
23- cp .repo/mirror-manifest.xml .repo/manifest.xml
24 }
25
26 repo-sync-from-seed () {
27@@ -65,10 +61,16 @@
28 mkdir -p out
29 cp .repo/manifest.xml out/source-manifest.xml
30
31+ echo Replace Linaro git URLs with lightweight http git URLs.
32+ ${BUILD_SCRIPT_ROOT}/rewrite-manifest.py .repo/manifest.xml > processed-manifest.xml
33+ cp processed-manifest.xml .repo/manifest.xml
34+
35 export TIMEFORMAT="TIME: Repo sync (using seed as reference): %lR"
36 time repo sync $REPO_QUIET -j$SYNC_JOBS
37 unset TIMEFORMAT
38
39+ # Restore source manifest temporarily to create pinned manifest
40+ cp out/source-manifest.xml .repo/manifest.xml
41 repo manifest -r -o out/pinned-manifest.xml
42 }
43
44
45=== added file 'build-scripts/rewrite-manifest.py'
46--- build-scripts/rewrite-manifest.py 1970-01-01 00:00:00 +0000
47+++ build-scripts/rewrite-manifest.py 2012-11-22 14:42:20 +0000
48@@ -0,0 +1,56 @@
49+#!/usr/bin/env python
50+
51+from xml.etree.cElementTree import (
52+ ElementTree,
53+ tostring,
54+)
55+import sys
56+
57+rewrite_urls = {
58+ 'git://git.linaro.org': 'http://git.linaro.org/git-ro',
59+ 'git://git.linaro.org/': 'http://git.linaro.org/git-ro/',
60+ 'git://android.git.linaro.org': 'http://android.git.linaro.org/git-ro',
61+ 'git://android.git.linaro.org/': 'http://android.git.linaro.org/git-ro/',
62+}
63+
64+def rewrite_git_urls(root):
65+ """Rewrites all android.git.linaro.org and git.linaro.org URLs.
66+
67+ Makes them use /git-ro and appends '.git' to repo names if they
68+ do not already have it.
69+ """
70+ remotes_to_handle = set([])
71+ default_remote = root.find('default').get('remote', None)
72+
73+ for remote in root.findall("remote"):
74+ if default_remote is None:
75+ default_remote = remote.get('name')
76+
77+ remote_url = remote.get('fetch')
78+ if remote_url in rewrite_urls:
79+ remote.set('fetch', rewrite_urls[remote_url])
80+ remotes_to_handle.add(remote.get('name'))
81+
82+ for project in root.findall("project"):
83+ if (project.get('remote') in remotes_to_handle or
84+ project.get('remote') is None and
85+ default_remote in remotes_to_handle):
86+ if not project.get('name').endswith('.git'):
87+ project.set('name', project.get('name') + '.git')
88+
89+if __name__ == '__main__':
90+ import argparse
91+ parser = argparse.ArgumentParser(
92+ description=(
93+ 'Switch Linaro git URLs to scalable git URLs in repo manifest '
94+ 'files and print the result manifest on stdout.'))
95+ parser.add_argument('manifest', type=str,
96+ help='manifest file to process')
97+ args = parser.parse_args()
98+
99+ # Parse the manifest file.
100+ et = ElementTree()
101+ doc = open(args.manifest)
102+ root = et.parse(doc)
103+ rewrite_git_urls(root)
104+ print tostring(root, 'UTF-8').decode('utf-8')
105
106=== modified file 'control/setup-control-node'
107--- control/setup-control-node 2012-08-02 10:26:24 +0000
108+++ control/setup-control-node 2012-11-22 14:42:20 +0000
109@@ -13,7 +13,7 @@
110 MYPATH=$PWD/`dirname $0`
111 HOSTNAME=$1
112 [ -z "$HOSTNAME" ] && HOSTNAME=`hostname`
113-REPO_MIRROR="--repo-url=git://android.git.linaro.org/tools/repo.git"
114+REPO_MIRROR="--repo-url=http://android.git.linaro.org/git-ro/tools/repo.git"
115
116 #
117 # +++ BASE +++

Subscribers

People subscribed via source and target branches