Merge lp:~jameinel/bzr-builddeb/import-scripts-updates into lp:~james-w/bzr-builddeb/import-scripts

Proposed by John A Meinel
Status: Merged
Merge reported by: James Westby
Merged at revision: not available
Proposed branch: lp:~jameinel/bzr-builddeb/import-scripts-updates
Merge into: lp:~james-w/bzr-builddeb/import-scripts
Diff against target: 320 lines (+113/-60)
2 files modified
icommon.py (+11/-9)
import_package.py (+102/-51)
To merge this branch: bzr merge lp:~jameinel/bzr-builddeb/import-scripts-updates
Reviewer: James Westby (status: Needs Fixing)
Review via email: mp+19513@code.launchpad.net
Revision history for this message
John A Meinel (jameinel) wrote :

This updates icommon.py to add:
1) Create the 'logs/diffs' and 'logs/merges' directories, since they'll be used later.
2) Add 'lp_cache' directory and set the Launchpad instance to use it.

The main win with (2) is that the WADL description of the REST API is something like 1MB, which means that using a cache avoids having to download that whole file every time I try to run a test import.

I imagine it matters less on production servers (since local bandwidth is cheaper), and LP automatically caches in a temp dir for the duration of a run. But if you are running 3k imports, that is roughly 3GB of content you don't have to download, especially when there is nothing to do...
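
For concreteness, here is a minimal sketch of the caching change being described, following the get_lp() hunk in the preview diff below. The lp_creds_file, SERVICE_ROOT, and 'lp_cache' values here are placeholders for what icommon.py already defines:

    # Sketch only: pass a persistent cache directory to launchpadlib so the
    # ~1MB WADL description survives between runs instead of living in a
    # temporary directory.
    import os
    from launchpadlib.credentials import Credentials
    from launchpadlib.launchpad import Launchpad

    lp_creds_file = "lp_creds.txt"                         # placeholder; icommon.py has the real path
    SERVICE_ROOT = "https://api.edge.launchpad.net/beta/"  # placeholder service root
    lp_cache_dir = os.path.join(os.getcwd(), "lp_cache")   # assumed cache location

    def get_lp(cache=None):
        f = open(lp_creds_file, "rb")
        try:
            creds = Credentials("package-import")
            creds.load(f)
        finally:
            f.close()
        # cache=None keeps the old behaviour (launchpadlib uses a temp dir).
        return Launchpad(creds, service_root=SERVICE_ROOT, cache=cache)

    lp = get_lp(cache=lp_cache_dir)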

Revision history for this message
James Westby (james-w) wrote :

> This updates icommon.py to add:
> 1) Create the 'logs/diffs' and 'logs/merges' directories, since they'll be
> used later.

Great.

> 2) Add 'lp_cache' directory and set the Launchpad instance to use it.

Not so great, see bug 459418.

I don't want to turn this on by default with that bug. In fact I used to have
this, but I backed it out once I diagnosed the odd errors (bug 404204).

I'd be happy with something that did this in a way that I could bypass on
production.

Thanks,

James

review: Needs Fixing
312. By John A Meinel

Several small updates.
1) Handle it more gracefully when madison-lite is not installed.
2) enable_default_logging() before loading plugins, in case something fails to load.
3) Allow passing -Dxxx flags for bzrlib to use for debugging (like -Dhttp, -Derror, and -Dhpss). (Points 2 and 3 are sketched below.)
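
A rough sketch of how points 2 and 3 fit together, mirroring the import_package.py hunks in the preview diff (only the relevant option is shown here; the real script defines several more flags):

    # Sketch: enable logging before load_plugins() so plugin failures are
    # recorded, then feed any -D flags into bzrlib's global debug_flags set.
    import optparse
    import sys

    from bzrlib import debug
    from bzrlib.trace import enable_default_logging, mutter
    from bzrlib.plugin import load_plugins

    enable_default_logging()
    mutter('import_package started w/ args: %r' % (sys.argv,))
    load_plugins()

    parser = optparse.OptionParser()
    parser.add_option("-D", dest="debug_flags", action="append", default=[],
                      help="Set debug flags, see 'bzr help debug-flags'")
    options, args = parser.parse_args()

    # e.g. running with "-Dhttp -Dhpss" turns on HTTP and smart-server tracing.
    debug.debug_flags.update(options.debug_flags)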

Revision history for this message
John A Meinel (jameinel) wrote :

James Westby wrote:
> Review: Needs Fixing
>> This updates icommon.py to add:
>> 1) Create the 'logs/diffs' and 'logs/merges' directories, since they'll be
>> used later.
>
> Great.
>
>> 2) Add 'lp_cache' directory and set the Launchpad instance to use it.
>
> Not so great, see bug 459418.
>
> I don't want to turn this on by default with that bug. In fact I used to
> have this, but I backed it out once I diagnosed the odd errors (bug 404204).
>
> I'd be happy with something that did this in a way that I could bypass it on
> production.
>
> Thanks,
>
> James
>

I'd be happy to provide either an --enable-lp-cache or a --disable-lp-cache,
depending on what you want the default to be. For my interactive testing, I
doubt I'd run more than one process, and it significantly improves the
response time.

What would you prefer?

John
=:->
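
For reference, the preview diff below ends up taking the opt-in route: a --lp-cache option that names a cache directory and defaults to off, so production runs are unaffected unless the flag is passed. Roughly:

    # Sketch of the opt-in flag as it appears in the preview diff; with no
    # --lp-cache argument, options.lp_cache stays None and get_lp() behaves
    # exactly as before.
    import optparse
    import icommon

    parser = optparse.OptionParser()
    parser.add_option("--lp-cache", dest="lp_cache", default=None,
                      help="Set the location of a Launchpadlib cache")
    options, args = parser.parse_args()

    lp = icommon.get_lp(options.lp_cache)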

313. By John A Meinel

Move the option parsing earlier; this way we can find out about the lp_cache directory before we try to open Launchpad.

314. By John A Meinel

Add some progress bars for the oh-so-slow getPublishedSources.
Add a bit of caching when opening branches, so that we don't have to
re-open the same object hundreds of times.
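
The progress-bar half of this looks roughly like the following, assuming `publications` is the collection returned by getPublishedSources; the total is unknown up front, so the current index is passed for both counts, as in the preview diff:

    # Sketch of the progress reporting added around the publication loops.
    from bzrlib import ui

    pb = ui.ui_factory.nested_progress_bar()
    try:
        for idx, publication in enumerate(
                icommon.iterate_collection(publications)):
            pb.update('getting debian publications', idx, idx)
            # ... turn the publication into a PackageToImport and record it ...
    finally:
        pb.finished()

The branch-opening cache is just a one-entry memo of the last (path, branch) pair, so consecutive versions that target the same suite reuse the already-opened branch.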

315. By John A Meinel

Use possible_transports for file fetching.

This should allow keep-alive, though it seems there is a bug with respect to redirections.

316. By John A Meinel

Work around the redirect bug and implement the redirection handling ourselves.
possible_transports provides all the caching that we would hope to get from
the redirection code, and then some. Testing shows that it does indeed save
some extra connections, though not all of them, mostly because of the 15s timeout.
(We probe launchpad.net, find we need to download a big file from launchpadlibrarian.net,
then spend >15s downloading; the next file reconnects to launchpad.net but not launchpadlibrarian.net.)
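
Concretely, the workaround in the preview diff opens the redirect target with get_transport(..., possible_transports=...) instead of calling transport._redirected_to(), so already-open connections can be reused. Trimmed down (the real grab_file also writes the fetched file to disk):

    # Sketch of the redirect handling in grab_file(); bzrlib retries get_file()
    # against whatever transport redirected() returns.
    from bzrlib import transport, urlutils

    def grab_file(location, target_dir, possible_transports=None):
        location_base = urlutils.dirname(location)
        location_file = urlutils.basename(location)

        def get_file(t):
            return t.get(location_file)

        def redirected(t, e, redirection_notice):
            # e.target is the redirect URL; open (or reuse) a transport for its
            # containing directory rather than going through _redirected_to().
            base, _name = urlutils.split(e.target)
            return transport.get_transport(base,
                possible_transports=possible_transports)

        t = transport.get_transport(location_base,
            possible_transports=possible_transports)
        return transport.do_catching_redirections(get_file, t, redirected)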

Preview Diff

=== modified file 'icommon.py'
--- icommon.py 2010-02-11 13:19:55 +0000
+++ icommon.py 2010-02-18 06:21:15 +0000
@@ -58,10 +58,12 @@
 no_lock_returncode = 139


-for dir in (packages_dir, logs_dir, retry_dir, priority_dir,
- lists_dir, revids_dir, locks_dir, updates_dir):
- if not os.path.exists(dir):
- os.mkdir(dir)
+for directory in (packages_dir, logs_dir, retry_dir, priority_dir, lists_dir,
+ revids_dir, locks_dir, updates_dir,
+ debian_diffs_dir, ubuntu_merge_dir,
+ ):
+ if not os.path.exists(directory):
+ os.mkdir(directory)

 urllib_launchpad_base_url = "https+urllib://launchpad.net/"
 urllib_debian_base_url = "http+urllib://ftp.uk.debian.org/debian/"
@@ -84,14 +86,14 @@
 default_ubuntu_merge_source = "squeeze"


-def get_lp():
+def get_lp(cache=None):
 f = open(lp_creds_file, "rb")
 try:
 creds = Credentials("package-import")
 creds.load(f)
 finally:
 f.close()
- lp = Launchpad(creds, service_root=SERVICE_ROOT)
+ lp = Launchpad(creds, service_root=SERVICE_ROOT, cache=cache)
 return lp


@@ -1448,11 +1450,11 @@
 try:
 revid = old_ubuntu_b.last_revision()
 to_transport.mkdir('.')
- dir = old_ubuntu_b.bzrdir.sprout(to_transport.base,
+ abzrdir = old_ubuntu_b.bzrdir.sprout(to_transport.base,
 revid, possible_transports=possible_transports)
- ubuntu_b = dir.open_branch()
+ ubuntu_b = abzrdir.open_branch()
 tag._merge_tags_if_possible(old_ubuntu_b, ubuntu_b)
- ubuntu_tree = dir.open_workingtree()
+ ubuntu_tree = abzrdir.open_workingtree()
 finally:
 old_ubuntu_b.unlock()
 if ubuntu_b is None:

=== modified file 'import_package.py'
--- import_package.py 2010-02-12 21:51:28 +0000
+++ import_package.py 2010-02-18 06:21:15 +0000
@@ -1,6 +1,7 @@
 #!/usr/bin/python

 import datetime
+import errno
 import operator
 import optparse
 import os
@@ -17,10 +18,15 @@

 from launchpadlib.errors import HTTPError

-from bzrlib import branch, bzrdir, errors, merge, tag, trace, transport, ui, urlutils
+from bzrlib import branch, bzrdir, debug, errors, merge, tag, trace, transport, ui, urlutils
 from bzrlib.plugin import load_plugins
-from bzrlib.trace import mutter, log_exception_quietly
+from bzrlib.trace import mutter, log_exception_quietly, enable_default_logging

+# We may not want to do this, but if there is a failure while loading plugins,
+# it is the only way to find out. One option would be to call
+# 'trace.pop_log_file()' immediately after loading plugins.
+enable_default_logging()
+mutter('import_package started w/ args: %r' % (sys.argv,))
 load_plugins()

 from bzrlib.plugins.builddeb import import_dsc
@@ -30,6 +36,20 @@

 push_lock = threading.Lock()

+parser = optparse.OptionParser()
+parser.add_option("--push-only", dest="push", action="store_true")
+parser.add_option("--no-push", dest="no_push", action="store_true")
+parser.add_option("--check", dest="check", action="store_true")
+parser.add_option("--no-existing", dest="no_existing", action="store_true")
+parser.add_option("--extra-debian", dest="extra_debian")
+parser.add_option("--verbose", dest="verbose", action="store_true")
+parser.add_option("-D", dest="debug_flags", action="append", default=[],
+ help="Set debug flags, see 'bzr help debug-flags'")
+parser.add_option("--lp-cache", dest="lp_cache", default=None,
+ help="Set the location of a Launchpadlib cache")
+
+options, args = parser.parse_args()
+

 class Killed(Exception):

@@ -122,7 +142,7 @@
 ubuntu_base_url = launchpad_base_url + "ubuntu/"


-lp = icommon.get_lp()
+lp = icommon.get_lp(options.lp_cache)
 ubuntu = lp.distributions['ubuntu']
 u_archive = ubuntu.main_archive
 debian = lp.distributions['debian']
@@ -157,17 +177,32 @@
 publications = icommon.lp_call(icommon.call_with_limited_size,
 d_archive.getPublishedSources,
 source_name=package, exact_match=True)
- for publication in icommon.iterate_collection(publications):
- assert publication.source_package_name == package
- release = publication.distro_series.name.lower()
- pocket = publication.pocket.lower()
- version = changelog.Version(publication.source_package_version)
- newp = icommon.PackageToImport(package, version, "debian", release,
- pocket)
- vlist.add_if_needed(newp)
+ pb = ui.ui_factory.nested_progress_bar()
+ try:
+ for idx, publication in enumerate(
+ icommon.iterate_collection(publications)):
+ pb.update('getting debian publications', idx, idx)
+ assert publication.source_package_name == package
+ release = publication.distro_series.name.lower()
+ pocket = publication.pocket.lower()
+ version = changelog.Version(publication.source_package_version)
+ newp = icommon.PackageToImport(package, version, "debian", release,
+ pocket)
+ vlist.add_if_needed(newp)
+ # if idx >= 20:
+ # break
+ finally:
+ pb.finished()
 update_lists()
- proc = subprocess.Popen(["/usr/bin/madison-lite", "--mirror", icommon.lists_dir,
- package], stdout=subprocess.PIPE, preexec_fn=subprocess_setup)
+ try:
+ proc = subprocess.Popen(["/usr/bin/madison-lite", "--mirror", icommon.lists_dir,
+ package], stdout=subprocess.PIPE, preexec_fn=subprocess_setup)
+ except OSError, e:
+ if e.errno in (errno.ENOENT,):
+ sys.stderr.write('You must have "madison-lite" installed.\n'
+ 'try: sudo apt-get install madison-lite.\n')
+ sys.exit(1)
+ raise
 output, errout = proc.communicate()
 assert proc.returncode == 0, "Error running madison-lite"
 for line in output.splitlines():
@@ -210,14 +245,22 @@
 publications = icommon.lp_call(icommon.call_with_limited_size,
 u_archive.getPublishedSources,
 source_name=package, exact_match=True)
- for publication in icommon.iterate_collection(publications):
- assert publication.source_package_name == package
- release = publication.distro_series.name.lower()
- pocket = publication.pocket.lower()
- version = changelog.Version(publication.source_package_version)
- newp = icommon.PackageToImport(package, version, "ubuntu", release,
- pocket)
- vlist.add_if_needed(newp)
+ pb = ui.ui_factory.nested_progress_bar()
+ try:
+ for idx, publication in enumerate(
+ icommon.iterate_collection(publications)):
+ pb.update('getting ubuntu publications', idx, idx)
+ assert publication.source_package_name == package
+ release = publication.distro_series.name.lower()
+ pocket = publication.pocket.lower()
+ version = changelog.Version(publication.source_package_version)
+ newp = icommon.PackageToImport(package, version, "ubuntu", release,
+ pocket)
+ vlist.add_if_needed(newp)
+ # if idx >= 20:
+ # break
+ finally:
+ pb.finished()
 vlist.sort()
 return vlist

@@ -270,19 +313,24 @@
 return "debian"


-def grab_file(location, target_dir):
+def grab_file(location, target_dir, possible_transports=None):
 mutter("fetching %s" % location)
 location_base = urlutils.dirname(location)
 location_file = urlutils.basename(location)
 local_path = os.path.join(target_dir, location_file)
 def get_file(transport):
 return transport.get(location_file)
- def redirected(transport, e, redirection_notice):
- target = urlutils.unescape(e.target).encode('utf-8')
- return transport._redirected_to(e.source, target)
- location_f = transport.do_catching_redirections(get_file,
- transport.get_transport(location_base),
- redirected)
+ def redirected(t, e, redirection_notice):
+ # _redirected_to has a bug that it doesn't support possible_transports,
+ # so we just call get_transport directly, we know we are just getting a
+ # file, so we don't have to try too hard.
+ base, fname = urlutils.split(e.target)
+ t2 = transport.get_transport(base,
+ possible_transports=possible_transports)
+ return t2
+ t = transport.get_transport(location_base,
+ possible_transports=possible_transports)
+ location_f = transport.do_catching_redirections(get_file, t, redirected)
 try:
 local_f = open(local_path, "wb")
 try:
@@ -293,8 +341,8 @@
 location_f.close()


-def dget(dsc_location, target_dir):
- grab_file(dsc_location, target_dir)
+def dget(dsc_location, target_dir, possible_transports=None):
+ grab_file(dsc_location, target_dir, possible_transports=possible_transports)
 local_dsc_path = os.path.join(target_dir,
 urlutils.basename(dsc_location))
 dsc_f = open(local_dsc_path)
@@ -305,7 +353,7 @@
 name = file_info['name']
 # TODO: md5 check
 grab_file(urlutils.join(urlutils.dirname(dsc_location), name),
- target_dir)
+ target_dir, possible_transports=possible_transports)
 finally:
 dsc_f.close()
 return local_dsc_path
@@ -510,7 +558,8 @@
 extract_upstream_branch(update_db, upstream_dir)
 dl_dir = tempfile.mkdtemp()
 try:
- local_dsc_path = dget(importp.get_url(), temp_dir)
+ local_dsc_path = dget(importp.get_url(), temp_dir,
+ possible_transports=possible_transports)
 update_db.import_package(local_dsc_path,
 use_time_from_changelog=True)
 finally:
@@ -531,19 +580,26 @@
 unimported_versions = icommon.ImportList()
 imported_suites = set()
 have_imported_old_debian = False
+ last_target_info = None
 for importp in reversed(all_versions.plist):
 suite = icommon.make_suite(importp.release, importp.pocket)
 if (importp.distro, suite) in imported_suites:
 continue
 target_branch_path = os.path.join(temp_dir, suite)
- try:
- target_branch = branch.Branch.open(target_branch_path,
- possible_transports=possible_transports)
- has_version = True
- except errors.NotBranchError:
- target_branch = bstore.get_branch(importp,
- possible_transports=possible_transports)
- has_version = target_branch is not None
+ if last_target_info is not None:
+ if target_branch_path == last_target_info[0]:
+ target_branch = last_target_info[1]
+ has_version = target_branch is not None
+ else:
+ try:
+ target_branch = branch.Branch.open(target_branch_path,
+ possible_transports=possible_transports)
+ has_version = True
+ except errors.NotBranchError:
+ target_branch = bstore.get_branch(importp,
+ possible_transports=possible_transports)
+ has_version = target_branch is not None
+ last_target_info = (target_branch_path, target_branch)
 if has_version:
 db = import_dsc.DistributionBranch(target_branch, target_branch)
 has_version = db.has_version(importp.version)
@@ -764,10 +820,11 @@
 newdir = format.initialize_on_transport(to_transport)
 repo = newdir.create_repository(shared=True)
 repo.set_make_working_trees(True)
+ mutter("finding all versions of %s" % (package,))
 versions = get_versions(package, extra_debian=extra_debian)
 assert len(versions.plist) > 0
 versions.reverse()
- mutter(str(versions))
+ mutter("found %d versions: %s" % (len(versions.plist), versions))
 possible_transports = []
 versions = find_unimported_versions(temp_dir, versions, package,
 revid_db, bstore, possible_transports=possible_transports)
@@ -818,19 +875,13 @@
 f.close()


-parser = optparse.OptionParser()
-parser.add_option("--push-only", dest="push", action="store_true")
-parser.add_option("--no-push", dest="no_push", action="store_true")
-parser.add_option("--check", dest="check", action="store_true")
-parser.add_option("--no-existing", dest="no_existing", action="store_true")
-parser.add_option("--extra-debian", dest="extra_debian")
-parser.add_option("--verbose", dest="verbose", action="store_true")
-
-options, args = parser.parse_args()
-
 if options.verbose:
 trace.set_verbosity_level(1)
 trace.push_log_file(sys.stderr)
+if sys.stderr.isatty():
+ # Don't create a progress bar if we aren't actually in a terminal
+ ui.ui_factory = ui.make_ui_for_terminal(sys.stdin, sys.stdout, sys.stderr)
+debug.debug_flags.update(options.debug_flags)


 if len(args) < 1:
