Merge lp:~xaav/loggerhead/export-tarball into lp:loggerhead

Proposed by xaav
Status: Superseded
Proposed branch: lp:~xaav/loggerhead/export-tarball
Merge into: lp:loggerhead
Diff against target: 242 lines (+81/-6)
9 files modified
.bzrignore (+1/-0)
loggerhead/apps/branch.py (+9/-3)
loggerhead/config.py (+3/-0)
loggerhead/controllers/__init__.py (+1/-1)
loggerhead/controllers/download_ui.py (+18/-0)
loggerhead/controllers/revision_ui.py (+2/-1)
loggerhead/exporter.py (+40/-0)
loggerhead/history.py (+3/-0)
loggerhead/templates/revision.pt (+4/-1)
To merge this branch: bzr merge lp:~xaav/loggerhead/export-tarball
Reviewer Review Type Date Requested Status
Robert Collins Needs Fixing
Martin Albisetti Pending
Review via email: mp+62941@code.launchpad.net

This proposal has been superseded by a proposal from 2011-06-08.

Description of the change

This branch **may** accomplish exporting the tarball using chunked transfer encoding. The code all looks to be correct, but I have not tested it, so I would like your opinion.

Thanks!

To post a comment you must log in.
Revision history for this message
Martin Pool (mbp) wrote :

Thanks very much, that'd be a really useful feature to have. Thanks
also for making it optional, because probably some installations would
not want it on.

This looks broadly reasonable -- I'm not deeply familiar with
loggerhead -- but I am very curious why you apparently reimplemented
the export-to-tarball feature. I'd rather reuse the bzr code and if
necessary change it to let it be reused here.

When you say "not tested" do you mean you haven't even run it, or only
that you didn't add automatic tests?

Revision history for this message
Robert Collins (lifeless) wrote :

This looks very interesting.

Three isues:

1)
+class TarExporterFileObject(object):
+
+ def __init__(self):
+ self._buffer = ''
+
+ def write(self, str):
+ self._buffer += str
+
+ def get_buffer(self):
+ buffer = self._buffer
+ self._buffer = ''
+ return buffer

This is going to be somewhat inefficient. Try this:

+class TarExporterFileObject(object):
+
+ def __init__(self):
+ self._buffer = []
+
+ def write(self, str):
+ self._buffer.append(str)
+
+ def get_buffer(self):
+ try:
+ return ''.join(self._buffer)
+ finally:
+ self._buffer = []

2) There are no tests for this. the test suite is still pretty new, but its a good idea to test things - in particular in cases like this we need to be fairly confident it will be incremental and not block on the export - I can imagine the wsgi layer buffering everything, for instance. [in fact, I'll lay odds it will unless we fix a few things].

3) The export function is a copy-paste-tweak of the core from bzrlib. This will lead to bugs as that code base evolves - we should instead get a supported function in bzrlib lib that we can import and use.

I'm putting this back to WIP - but its a great start. Please keep at it and just shout if you need pointers.

review: Needs Fixing
Revision history for this message
xaav (xaav) wrote :

> 3) The export function is a copy-paste-tweak of the core from bzrlib. This will lead to bugs as
> that code base evolves - we should instead get a supported function in bzrlib lib that we can
> import and use.

Well, there is only one problem with that. According to the WSGI spec, you must return an iterable that will export the blocks. If I were to call the provided function, it would be impossible to break the response into pieces because the provided function would export it all at once. I know that it is a copy and paste tweak, but there is really no way I can inject the 'yield' keyword into the provided function. If you have another suggestion, I would be glad to hear it.

Issue number one I will be glad to fix.

Regarding issue number two, I have not written tests before but I will try my best.

Revision history for this message
Robert Collins (lifeless) wrote :

On Wed, Jun 1, 2011 at 9:46 AM, Geoff <email address hidden> wrote:
>> 3) The export function is a copy-paste-tweak of the core from bzrlib. This will lead to bugs as
>> that code base evolves - we should instead get a supported function in bzrlib lib that we can
>> import and use.
>
> Well, there is only one problem with that. According to the WSGI spec, you must return an iterable that will export the blocks. If I were to call the provided function, it would be impossible to break the response into pieces because the provided function would export it all at once. I know that it is a copy and paste tweak, but there is really no way I can inject the 'yield' keyword into the provided function. If you have another suggestion, I would be glad to hear it.

Extract the function in bzrlib into two parts - a generator (what you
have here) and a consumer than consumes it all triggering the writes.

Then we can reuse the generator.

Revision history for this message
xaav (xaav) wrote :

> On Wed, Jun 1, 2011 at 9:46 AM, Geoff <email address hidden> wrote:
> >> 3) The export function is a copy-paste-tweak of the core from bzrlib. This
> will lead to bugs as
> >> that code base evolves - we should instead get a supported function in
> bzrlib lib that we can
> >> import and use.
> >
> > Well, there is only one problem with that. According to the WSGI spec, you
> must return an iterable that will export the blocks. If I were to call the
> provided function, it would be impossible to break the response into pieces
> because the provided function would export it all at once. I know that it is a
> copy and paste tweak, but there is really no way I can inject the 'yield'
> keyword into the provided function. If you have another suggestion, I would be
> glad to hear it.
>
> Extract the function in bzrlib into two parts - a generator (what you
> have here) and a consumer than consumes it all triggering the writes.
>
> Then we can reuse the generator.

Okay, I see what you mean.

Revision history for this message
xaav (xaav) wrote :

See Bug #791005 for further information on this.

lp:~xaav/loggerhead/export-tarball updated
441. By xaav on 2011-06-01

Cleaned up TarExporterFileObject

442. By xaav on 2011-06-03

Updated exporter to use bzr patch.

443. By xaav on 2011-06-08

Cleaned up files.

Revision history for this message
xaav (xaav) wrote :

Okay, I think this should work, but I haven't tested it yet.

lp:~xaav/loggerhead/export-tarball updated
444. By xaav on 2011-06-08

Finished up.

445. By xaav on 2011-06-13

Added tarfile test.

446. By xaav on 2011-06-27

Fixed import.

447. By xaav on 2011-06-27

Fixed code issues.

448. By xaav <email address hidden> on 2011-06-28

modified bzrignore

449. By xaav <email address hidden> on 2011-06-28

Merged branch

450. By xaav on 2011-06-28

Fixed gzip bug.

451. By xaav <email address hidden> on 2011-07-07

Merged lp:loggerhead

452. By xaav <email address hidden> on 2011-07-07

Fixed extension issue

453. By xaav on 2011-07-10

Fixed buggy merging.

454. By xaav <email address hidden> on 2011-07-10

Fixed buggy merging and removed IDE files

455. By xaav on 2011-07-11

Fixed serve tarballs issue.

456. By xaav on 2011-07-11

Fixed tests.

457. By xaav on 2011-07-11

Added to UI.

458. By xaav on 2011-07-12

UI fixes.

Unmerged revisions

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file '.bzrignore'
2--- .bzrignore 2011-03-10 14:22:12 +0000
3+++ .bzrignore 2011-06-08 21:30:00 +0000
4@@ -8,3 +8,4 @@
5 loggerhead-memprofile
6 ./docs/_build/
7 tags
8+loggerhead/static/downloads/*
9
10=== modified file 'loggerhead/apps/branch.py'
11--- loggerhead/apps/branch.py 2011-03-16 11:40:05 +0000
12+++ loggerhead/apps/branch.py 2011-06-08 21:30:00 +0000
13@@ -33,7 +33,7 @@
14 from loggerhead.controllers.atom_ui import AtomUI
15 from loggerhead.controllers.changelog_ui import ChangeLogUI
16 from loggerhead.controllers.diff_ui import DiffUI
17-from loggerhead.controllers.download_ui import DownloadUI
18+from loggerhead.controllers.download_ui import DownloadUI, DownloadTarballUI
19 from loggerhead.controllers.filediff_ui import FileDiffUI
20 from loggerhead.controllers.inventory_ui import InventoryUI
21 from loggerhead.controllers.revision_ui import RevisionUI
22@@ -49,7 +49,7 @@
23
24 def __init__(self, branch, friendly_name=None, config={},
25 graph_cache=None, branch_link=None, is_root=False,
26- served_url=_DEFAULT, use_cdn=False):
27+ served_url=_DEFAULT, use_cdn=False, export_tarballs=True):
28 self.branch = branch
29 self._config = config
30 self.friendly_name = friendly_name
31@@ -61,6 +61,7 @@
32 self.is_root = is_root
33 self.served_url = served_url
34 self.use_cdn = use_cdn
35+ self.export_tarballs = export_tarballs
36
37 def get_history(self):
38 file_cache = None
39@@ -126,6 +127,7 @@
40 'revision': RevisionUI,
41 'search': SearchUI,
42 'view': ViewUI,
43+ 'tarball': DownloadTarballUI,
44 }
45
46 def last_updated(self):
47@@ -171,10 +173,14 @@
48 try:
49 try:
50 c = cls(self, self.get_history)
51- return c(environ, start_response)
52+ to_ret = c(environ, start_response)
53 except:
54 environ['exc_info'] = sys.exc_info()
55 environ['branch'] = self
56 raise
57+ if type(to_ret) == type(httpexceptions.HTTPSeeOther('/')):
58+ raise to_ret
59+ else:
60+ return to_ret
61 finally:
62 self.branch.unlock()
63
64=== modified file 'loggerhead/config.py'
65--- loggerhead/config.py 2010-05-12 14:38:05 +0000
66+++ loggerhead/config.py 2011-06-08 21:30:00 +0000
67@@ -36,6 +36,7 @@
68 use_cdn=False,
69 sql_dir=None,
70 allow_writes=False,
71+ export_tarballs=True,
72 )
73 parser.add_option("--user-dirs", action="store_true",
74 help="Serve user directories as ~user.")
75@@ -75,6 +76,8 @@
76 help="The directory to place the SQL cache in")
77 parser.add_option("--allow-writes", action="store_true",
78 help="Allow writing to the Bazaar server.")
79+ parser.add_option("--export-tarballs", action="store_true",
80+ help="Allow exporting revisions to tarballs.")
81 return parser
82
83
84
85=== modified file 'loggerhead/controllers/__init__.py'
86--- loggerhead/controllers/__init__.py 2011-03-16 11:40:05 +0000
87+++ loggerhead/controllers/__init__.py 2011-06-08 21:30:00 +0000
88@@ -20,7 +20,7 @@
89 import bzrlib.errors
90 import time
91
92-from paste.httpexceptions import HTTPNotFound
93+from paste.httpexceptions import HTTPNotFound, HTTPSeeOther
94 from paste.request import path_info_pop, parse_querystring
95
96 from loggerhead import util
97
98=== modified file 'loggerhead/controllers/download_ui.py'
99--- loggerhead/controllers/download_ui.py 2010-05-05 19:03:40 +0000
100+++ loggerhead/controllers/download_ui.py 2011-06-08 21:30:00 +0000
101@@ -19,12 +19,14 @@
102
103 import logging
104 import mimetypes
105+import os
106 import urllib
107
108 from paste import httpexceptions
109 from paste.request import path_info_pop
110
111 from loggerhead.controllers import TemplatedBranchView
112+from loggerhead.exporter import export_tarball
113
114 log = logging.getLogger("loggerhead.controllers")
115
116@@ -67,3 +69,19 @@
117 ]
118 start_response('200 OK', headers)
119 return [content]
120+
121+class DownloadTarballUI(TemplatedBranchView):
122+
123+ def get_download(self, ):
124+ """Stream a tarball from a bazaar branch."""
125+
126+ history = self._history
127+ if len(self.args):
128+ revid = history.fix_revid(self.args[0])
129+ else:
130+ revid = self.get_revid()
131+
132+ if self._branch.export_tarballs:
133+ return export_tarball(history, revid)
134+ else:
135+ return httpexceptions.HTTPSeeOther('/')
136
137=== modified file 'loggerhead/controllers/revision_ui.py'
138--- loggerhead/controllers/revision_ui.py 2011-03-02 14:07:21 +0000
139+++ loggerhead/controllers/revision_ui.py 2011-06-08 21:30:00 +0000
140@@ -110,7 +110,7 @@
141 self._branch.friendly_name,
142 self._branch.is_root,
143 'changes'))
144-
145+ can_export = self._branch.export_tarballs
146 return {
147 'branch': self._branch,
148 'revid': revid,
149@@ -132,4 +132,5 @@
150 'compare_revid': compare_revid,
151 'url': self._branch.context_url,
152 'directory_breadcrumbs': directory_breadcrumbs,
153+ 'can_export': can_export,
154 }
155
156=== added file 'loggerhead/exporter.py'
157--- loggerhead/exporter.py 1970-01-01 00:00:00 +0000
158+++ loggerhead/exporter.py 2011-06-08 21:30:00 +0000
159@@ -0,0 +1,40 @@
160+"""Exports an archive from a bazaar branch"""
161+
162+from bzrlib.export import get_export_generator
163+
164+class ExporterFileObject(object):
165+
166+ def __init__(self):
167+ self._buffer = []
168+
169+ def write(self, str):
170+ self._buffer.append(str)
171+
172+ def get_buffer(self):
173+ try:
174+ return ''.join(self._buffer)
175+ finally:
176+ self._buffer = []
177+
178+def export_archive(history, revid, format=".tar.gz"):
179+ """Export tree contents to an archive
180+
181+ :param history: Instance of history to export
182+ :param revid: Revision to export
183+ :param format: Format of the archive
184+ """
185+
186+ fileobj = ExporterFileObject()
187+
188+ tree = history._branch.repository.revision_tree(revid)
189+
190+ for _ in get_export_generator(tree=tree, fileobj=fileobj, format=format):
191+
192+ yield fileobj.get_buffer()
193+
194+ # Might have additonal contents written
195+
196+ yield fileobj.get_buffer()
197+
198+
199+
200\ No newline at end of file
201
202=== modified file 'loggerhead/history.py'
203--- loggerhead/history.py 2011-03-25 13:09:10 +0000
204+++ loggerhead/history.py 2011-06-08 21:30:00 +0000
205@@ -33,6 +33,7 @@
206 import re
207 import textwrap
208 import threading
209+import tarfile
210
211 from bzrlib import tag
212 import bzrlib.branch
213@@ -44,6 +45,7 @@
214 from loggerhead import search
215 from loggerhead import util
216 from loggerhead.wholehistory import compute_whole_history_data
217+from bzrlib.export.tar_exporter import export_tarball
218
219
220 def is_branch(folder):
221@@ -817,3 +819,4 @@
222 removed=sorted(reporter.removed, key=lambda x: x.filename),
223 modified=sorted(reporter.modified, key=lambda x: x.filename),
224 text_changes=sorted(reporter.text_changes, key=lambda x: x.filename))
225+
226
227=== added directory 'loggerhead/static/downloads'
228=== modified file 'loggerhead/templates/revision.pt'
229--- loggerhead/templates/revision.pt 2011-02-23 03:01:15 +0000
230+++ loggerhead/templates/revision.pt 2011-06-08 21:30:00 +0000
231@@ -84,7 +84,10 @@
232 tal:attributes="href python:url(['/diff', change.revno], clear=1)">download diff</a>
233 <a tal:condition="python:compare_revid is not None"
234 tal:attributes="href python:url(['/diff', change.revno, history.get_revno(compare_revid)], clear=1)">download diff</a>
235- </li>
236+ </li>
237+ <li tal:condition="python:can_export">
238+ <a tal:attributes="href python:url(['/tarball', change.revno])">download tarball</a>
239+ </li>
240 <li id="last"><a tal:attributes="href python:url(['/changes', change.revno]);
241 title string:view history from revision ${change/revno}"
242 tal:content="string:view history from revision ${change/revno}"></a></li>

Subscribers

People subscribed via source and target branches