Merge lp:~gholt/swift/lobjects4 into lp:~hudson-openstack/swift/trunk

Proposed by gholt
Status: Merged
Approved by: Jay Payne
Approved revision: 150
Merged at revision: 147
Proposed branch: lp:~gholt/swift/lobjects4
Merge into: lp:~hudson-openstack/swift/trunk
Diff against target: 2409 lines (+1473/-91)
11 files modified
bin/st (+211/-49)
doc/source/index.rst (+1/-0)
doc/source/overview_large_objects.rst (+177/-0)
swift/common/client.py (+17/-10)
swift/common/constraints.py (+11/-0)
swift/obj/server.py (+7/-2)
swift/proxy/server.py (+303/-6)
test/functionalnosetests/test_object.py (+279/-0)
test/unit/common/test_constraints.py (+27/-0)
test/unit/obj/test_server.py (+82/-22)
test/unit/proxy/test_server.py (+358/-2)
To merge this branch: bzr merge lp:~gholt/swift/lobjects4
Reviewer Review Type Date Requested Status
Jay Payne (community) Approve
John Dickinson Pending
clayg Pending
Greg Lange Pending
Review via email: mp+43596@code.launchpad.net

This proposal supersedes a proposal from 2010-11-16.

Description of the change

Added large object support by allowing the client to upload the object in segments and download them all as a single object. Also, made updates client.py and st to support and provide an example of how to use the feature. Finally, there is an overview document that needs reviewing.

To post a comment you must log in.
Revision history for this message
clayg (clay-gerrard) wrote : Posted in a previous version of this proposal

great work gholt.

In SegmentedIterable, _load_next_segment, __iter__ and app_iter_range all log the same message ("ERROR: While processing manifest") - if the exception happens in _load_next_segment it gets logged twice.

Eitherway, the posthooklogger will log the status of the response as successful, even though the client likely blew up cause it didn't get all the expected data.

Just for the purpose of logging, since SegmentedIterable has a backref to the Response - I would suggest we update the status_int to a 503. Even though we can't do anything about the 200 we already sent to the client, we know that whatever bytes were transferred so far won't be billable.

Just a thought - personal preference in control structure - I don't care for the except Exception: if not isisntance(StopIteration): log, raise - I would prefer except StopIteration: pass except Exception: log, raise.

Also in SegmentedIterable, instead of initializing self.response to None, how about an empty Response(). It *should* be overridden (assuming the caller read the docstring) - but initializing to "something" removes the need for an extra test in all the "if self.response: update attribute on response" code. Alternatively you could make response a required named parameter to __init__ and then just require the calling method update it's response's app_iter instead of its new SegmentedIterable's backref.

resp = Response()
resp.app_iter = SegIter(resp)

^ does that work?

The history section mentions the "implied user manifest" that we have now with the consistency issue where the proxy can't ever know if it has all of the users intended objects, but not the rejected "explicit user manifest" when the body of the manifest file includes all of the objects in the order the should be glued together - like amazon uses. Is it worth mentioning why our solution is better?

Thanks for working on this essential feature, you deserve much karama!

Revision history for this message
gholt (gholt) wrote : Posted in a previous version of this proposal

Cool, will apply your stuff tomorrow. The except Exception: if not isinstance was pure lame on my part. The other stuff you already know I like. :) I dunno on the history doc thing; wanna type up what you'd like to see? Amazon's thing is really quite different since with them you still can't exceed the maximum 5G limit they have, glued or not.

Revision history for this message
gholt (gholt) wrote : Posted in a previous version of this proposal

Eh, I was bored so I just pushed the changes described.

Revision history for this message
clayg (clay-gerrard) wrote : Posted in a previous version of this proposal

I think this is ready!

I put up an ether on the large objects history section if anyone wants to suggest any changes: http://etherpad.openstack.org/SwiftLargeObjectSupport

review: Approve
Revision history for this message
gholt (gholt) wrote : Posted in a previous version of this proposal

I'm not sure how I ended up a reviewer on my own branch; but I approve!

review: Approve
Revision history for this message
Greg Lange (greglange) wrote : Posted in a previous version of this proposal

I like the change. My only gripe would be that an explicit handling of the 10,000 segment limit would be better, either allow unlimited segments or error when more than 10,000 segments are uploaded as appropriate. This is being addressed in a bug report.

review: Approve
Revision history for this message
John Dickinson (notmyname) wrote : Posted in a previous version of this proposal

looks great

review: Approve
Revision history for this message
gholt (gholt) wrote :

I felt compelled to resubmit this proposal as the code change since previous reviews. It now supports INFINITE objects (note: that part has not specifically tested.)

lp:~gholt/swift/lobjects4 updated
143. By gholt

Fixed a bug where a HEAD on a really, really large object would give a content-length of 0 instead of transfer-encoding: chunked

144. By gholt

x-copy-from now understands manifest sources and copies details rather than contents

145. By gholt

Limit manifest gets to one segment per second; prevents amplification attacks of tons of tiny segments

146. By gholt

Changed to only limit manifest gets after first 10 segments. Makes tests run faster but does allow amplification 1:10. At least it's not 1:infinity like before.

147. By gholt

Even though isn't 100% related, made st emit a warning if there's a / in a container name

148. By gholt

st: Works with chunked transfer encoded downloads now

149. By gholt

Made stat display of objects suppress content-length, last-modified, and etag if they aren't in the headers

150. By gholt

lobjects: The Last-Modified header is now determined for reasonably segmented objects.

Revision history for this message
Jay Payne (letterj) :
review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bin/st'
2--- bin/st 2010-12-09 17:10:37 +0000
3+++ bin/st 2010-12-16 18:53:43 +0000
4@@ -581,7 +581,8 @@
5 :param container: container name that the object is in
6 :param name: object name to put
7 :param contents: a string or a file like object to read object data from
8- :param content_length: value to send as content-length header
9+ :param content_length: value to send as content-length header; also limits
10+ the amount read from contents
11 :param etag: etag of contents
12 :param chunk_size: chunk size of data to write
13 :param content_type: value to send as content-type header
14@@ -611,18 +612,24 @@
15 conn.putrequest('PUT', path)
16 for header, value in headers.iteritems():
17 conn.putheader(header, value)
18- if not content_length:
19+ if content_length is None:
20 conn.putheader('Transfer-Encoding', 'chunked')
21- conn.endheaders()
22- chunk = contents.read(chunk_size)
23- while chunk:
24- if not content_length:
25+ conn.endheaders()
26+ chunk = contents.read(chunk_size)
27+ while chunk:
28 conn.send('%x\r\n%s\r\n' % (len(chunk), chunk))
29- else:
30+ chunk = contents.read(chunk_size)
31+ conn.send('0\r\n\r\n')
32+ else:
33+ conn.endheaders()
34+ left = content_length
35+ while left > 0:
36+ size = chunk_size
37+ if size > left:
38+ size = left
39+ chunk = contents.read(size)
40 conn.send(chunk)
41- chunk = contents.read(chunk_size)
42- if not content_length:
43- conn.send('0\r\n\r\n')
44+ left -= len(chunk)
45 else:
46 conn.request('PUT', path, contents, headers)
47 resp = conn.getresponse()
48@@ -860,15 +867,20 @@
49
50
51 st_delete_help = '''
52-delete --all OR delete container [object] [object] ...
53+delete --all OR delete container [--leave-segments] [object] [object] ...
54 Deletes everything in the account (with --all), or everything in a
55- container, or a list of objects depending on the args given.'''.strip('\n')
56+ container, or a list of objects depending on the args given. Segments of
57+ manifest objects will be deleted as well, unless you specify the
58+ --leave-segments option.'''.strip('\n')
59
60
61 def st_delete(parser, args, print_queue, error_queue):
62 parser.add_option('-a', '--all', action='store_true', dest='yes_all',
63 default=False, help='Indicates that you really want to delete '
64 'everything in the account')
65+ parser.add_option('', '--leave-segments', action='store_true',
66+ dest='leave_segments', default=False, help='Indicates that you want '
67+ 'the segments of manifest objects left alone')
68 (options, args) = parse_args(parser, args)
69 args = args[1:]
70 if (not args and not options.yes_all) or (args and options.yes_all):
71@@ -876,11 +888,42 @@
72 (basename(argv[0]), st_delete_help))
73 return
74
75+ def _delete_segment((container, obj), conn):
76+ conn.delete_object(container, obj)
77+ if options.verbose:
78+ print_queue.put('%s/%s' % (container, obj))
79+
80 object_queue = Queue(10000)
81
82 def _delete_object((container, obj), conn):
83 try:
84+ old_manifest = None
85+ if not options.leave_segments:
86+ try:
87+ old_manifest = conn.head_object(container, obj).get(
88+ 'x-object-manifest')
89+ except ClientException, err:
90+ if err.http_status != 404:
91+ raise
92 conn.delete_object(container, obj)
93+ if old_manifest:
94+ segment_queue = Queue(10000)
95+ scontainer, sprefix = old_manifest.split('/', 1)
96+ for delobj in conn.get_container(scontainer,
97+ prefix=sprefix)[1]:
98+ segment_queue.put((scontainer, delobj['name']))
99+ if not segment_queue.empty():
100+ segment_threads = [QueueFunctionThread(segment_queue,
101+ _delete_segment, create_connection()) for _ in
102+ xrange(10)]
103+ for thread in segment_threads:
104+ thread.start()
105+ while not segment_queue.empty():
106+ sleep(0.01)
107+ for thread in segment_threads:
108+ thread.abort = True
109+ while thread.isAlive():
110+ thread.join(0.01)
111 if options.verbose:
112 path = options.yes_all and join(container, obj) or obj
113 if path[:1] in ('/', '\\'):
114@@ -891,6 +934,7 @@
115 raise
116 error_queue.put('Object %s not found' %
117 repr('%s/%s' % (container, obj)))
118+
119 container_queue = Queue(10000)
120
121 def _delete_container(container, conn):
122@@ -956,6 +1000,10 @@
123 raise
124 error_queue.put('Account not found')
125 elif len(args) == 1:
126+ if '/' in args[0]:
127+ print >> stderr, 'WARNING: / in container name; you might have ' \
128+ 'meant %r instead of %r.' % \
129+ (args[0].replace('/', ' ', 1), args[0])
130 conn = create_connection()
131 _delete_container(args[0], conn)
132 else:
133@@ -976,7 +1024,7 @@
134
135
136 st_download_help = '''
137-download --all OR download container [object] [object] ...
138+download --all OR download container [options] [object] [object] ...
139 Downloads everything in the account (with --all), or everything in a
140 container, or a list of objects depending on the args given. For a single
141 object download, you may use the -o [--output] <filename> option to
142@@ -1015,20 +1063,26 @@
143 headers, body = \
144 conn.get_object(container, obj, resp_chunk_size=65536)
145 content_type = headers.get('content-type')
146- content_length = int(headers.get('content-length'))
147+ if 'content-length' in headers:
148+ content_length = int(headers.get('content-length'))
149+ else:
150+ content_length = None
151 etag = headers.get('etag')
152 path = options.yes_all and join(container, obj) or obj
153 if path[:1] in ('/', '\\'):
154 path = path[1:]
155+ md5sum = None
156 make_dir = out_file != "-"
157 if content_type.split(';', 1)[0] == 'text/directory':
158 if make_dir and not isdir(path):
159 mkdirs(path)
160 read_length = 0
161- md5sum = md5()
162+ if 'x-object-manifest' not in headers:
163+ md5sum = md5()
164 for chunk in body:
165 read_length += len(chunk)
166- md5sum.update(chunk)
167+ if md5sum:
168+ md5sum.update(chunk)
169 else:
170 dirpath = dirname(path)
171 if make_dir and dirpath and not isdir(dirpath):
172@@ -1040,16 +1094,18 @@
173 else:
174 fp = open(path, 'wb')
175 read_length = 0
176- md5sum = md5()
177+ if 'x-object-manifest' not in headers:
178+ md5sum = md5()
179 for chunk in body:
180 fp.write(chunk)
181 read_length += len(chunk)
182- md5sum.update(chunk)
183+ if md5sum:
184+ md5sum.update(chunk)
185 fp.close()
186- if md5sum.hexdigest() != etag:
187+ if md5sum and md5sum.hexdigest() != etag:
188 error_queue.put('%s: md5sum != etag, %s != %s' %
189 (path, md5sum.hexdigest(), etag))
190- if read_length != content_length:
191+ if content_length is not None and read_length != content_length:
192 error_queue.put('%s: read_length != content_length, %d != %d' %
193 (path, read_length, content_length))
194 if 'x-object-meta-mtime' in headers and not options.out_file:
195@@ -1110,6 +1166,10 @@
196 raise
197 error_queue.put('Account not found')
198 elif len(args) == 1:
199+ if '/' in args[0]:
200+ print >> stderr, 'WARNING: / in container name; you might have ' \
201+ 'meant %r instead of %r.' % \
202+ (args[0].replace('/', ' ', 1), args[0])
203 _download_container(args[0], create_connection())
204 else:
205 if len(args) == 2:
206@@ -1223,6 +1283,10 @@
207 raise
208 error_queue.put('Account not found')
209 elif len(args) == 1:
210+ if '/' in args[0]:
211+ print >> stderr, 'WARNING: / in container name; you might have ' \
212+ 'meant %r instead of %r.' % \
213+ (args[0].replace('/', ' ', 1), args[0])
214 try:
215 headers = conn.head_container(args[0])
216 object_count = int(headers.get('x-container-object-count', 0))
217@@ -1259,14 +1323,19 @@
218 Account: %s
219 Container: %s
220 Object: %s
221- Content Type: %s
222-Content Length: %s
223- Last Modified: %s
224- ETag: %s'''.strip('\n') % (conn.url.rsplit('/', 1)[-1], args[0],
225- args[1], headers.get('content-type'),
226- headers.get('content-length'),
227- headers.get('last-modified'),
228- headers.get('etag')))
229+ Content Type: %s'''.strip('\n') % (conn.url.rsplit('/', 1)[-1], args[0],
230+ args[1], headers.get('content-type')))
231+ if 'content-length' in headers:
232+ print_queue.put('Content Length: %s' %
233+ headers['content-length'])
234+ if 'last-modified' in headers:
235+ print_queue.put(' Last Modified: %s' %
236+ headers['last-modified'])
237+ if 'etag' in headers:
238+ print_queue.put(' ETag: %s' % headers['etag'])
239+ if 'x-object-manifest' in headers:
240+ print_queue.put(' Manifest: %s' %
241+ headers['x-object-manifest'])
242 for key, value in headers.items():
243 if key.startswith('x-object-meta-'):
244 print_queue.put('%14s: %s' % ('Meta %s' %
245@@ -1274,7 +1343,7 @@
246 for key, value in headers.items():
247 if not key.startswith('x-object-meta-') and key not in (
248 'content-type', 'content-length', 'last-modified',
249- 'etag', 'date'):
250+ 'etag', 'date', 'x-object-manifest'):
251 print_queue.put(
252 '%14s: %s' % (key.title(), value))
253 except ClientException, err:
254@@ -1326,6 +1395,10 @@
255 raise
256 error_queue.put('Account not found')
257 elif len(args) == 1:
258+ if '/' in args[0]:
259+ print >> stderr, 'WARNING: / in container name; you might have ' \
260+ 'meant %r instead of %r.' % \
261+ (args[0].replace('/', ' ', 1), args[0])
262 headers = {}
263 for item in options.meta:
264 split_item = item.split(':')
265@@ -1363,23 +1436,48 @@
266 upload [options] container file_or_directory [file_or_directory] [...]
267 Uploads to the given container the files and directories specified by the
268 remaining args. -c or --changed is an option that will only upload files
269- that have changed since the last upload.'''.strip('\n')
270+ that have changed since the last upload. -S <size> or --segment-size <size>
271+ and --leave-segments are options as well (see --help for more).
272+'''.strip('\n')
273
274
275 def st_upload(options, args, print_queue, error_queue):
276 parser.add_option('-c', '--changed', action='store_true', dest='changed',
277 default=False, help='Will only upload files that have changed since '
278 'the last upload')
279+ parser.add_option('-S', '--segment-size', dest='segment_size', help='Will '
280+ 'upload files in segments no larger than <size> and then create a '
281+ '"manifest" file that will download all the segments as if it were '
282+ 'the original file. The segments will be uploaded to a '
283+ '<container>_segments container so as to not pollute the main '
284+ '<container> listings.')
285+ parser.add_option('', '--leave-segments', action='store_true',
286+ dest='leave_segments', default=False, help='Indicates that you want '
287+ 'the older segments of manifest objects left alone (in the case of '
288+ 'overwrites)')
289 (options, args) = parse_args(parser, args)
290 args = args[1:]
291 if len(args) < 2:
292 error_queue.put('Usage: %s [options] %s' %
293 (basename(argv[0]), st_upload_help))
294 return
295-
296- file_queue = Queue(10000)
297-
298- def _upload_file((path, dir_marker), conn):
299+ object_queue = Queue(10000)
300+
301+ def _segment_job(job, conn):
302+ if job.get('delete', False):
303+ conn.delete_object(job['container'], job['obj'])
304+ else:
305+ fp = open(job['path'], 'rb')
306+ fp.seek(job['segment_start'])
307+ conn.put_object(job.get('container', args[0] + '_segments'),
308+ job['obj'], fp, content_length=job['segment_size'])
309+ if options.verbose and 'log_line' in job:
310+ print_queue.put(job['log_line'])
311+
312+ def _object_job(job, conn):
313+ path = job['path']
314+ container = job.get('container', args[0])
315+ dir_marker = job.get('dir_marker', False)
316 try:
317 obj = path
318 if obj.startswith('./') or obj.startswith('.\\'):
319@@ -1388,7 +1486,7 @@
320 if dir_marker:
321 if options.changed:
322 try:
323- headers = conn.head_object(args[0], obj)
324+ headers = conn.head_object(container, obj)
325 ct = headers.get('content-type')
326 cl = int(headers.get('content-length'))
327 et = headers.get('etag')
328@@ -1401,24 +1499,86 @@
329 except ClientException, err:
330 if err.http_status != 404:
331 raise
332- conn.put_object(args[0], obj, '', content_length=0,
333+ conn.put_object(container, obj, '', content_length=0,
334 content_type='text/directory',
335 headers=put_headers)
336 else:
337- if options.changed:
338+ # We need to HEAD all objects now in case we're overwriting a
339+ # manifest object and need to delete the old segments
340+ # ourselves.
341+ old_manifest = None
342+ if options.changed or not options.leave_segments:
343 try:
344- headers = conn.head_object(args[0], obj)
345+ headers = conn.head_object(container, obj)
346 cl = int(headers.get('content-length'))
347 mt = headers.get('x-object-meta-mtime')
348- if cl == getsize(path) and \
349+ if options.changed and cl == getsize(path) and \
350 mt == put_headers['x-object-meta-mtime']:
351 return
352+ if not options.leave_segments:
353+ old_manifest = headers.get('x-object-manifest')
354 except ClientException, err:
355 if err.http_status != 404:
356 raise
357- conn.put_object(args[0], obj, open(path, 'rb'),
358- content_length=getsize(path),
359- headers=put_headers)
360+ if options.segment_size and \
361+ getsize(path) < options.segment_size:
362+ full_size = getsize(path)
363+ segment_queue = Queue(10000)
364+ segment_threads = [QueueFunctionThread(segment_queue,
365+ _segment_job, create_connection()) for _ in xrange(10)]
366+ for thread in segment_threads:
367+ thread.start()
368+ segment = 0
369+ segment_start = 0
370+ while segment_start < full_size:
371+ segment_size = int(options.segment_size)
372+ if segment_start + segment_size > full_size:
373+ segment_size = full_size - segment_start
374+ segment_queue.put({'path': path,
375+ 'obj': '%s/%s/%s/%08d' % (obj,
376+ put_headers['x-object-meta-mtime'], full_size,
377+ segment),
378+ 'segment_start': segment_start,
379+ 'segment_size': segment_size,
380+ 'log_line': '%s segment %s' % (obj, segment)})
381+ segment += 1
382+ segment_start += segment_size
383+ while not segment_queue.empty():
384+ sleep(0.01)
385+ for thread in segment_threads:
386+ thread.abort = True
387+ while thread.isAlive():
388+ thread.join(0.01)
389+ new_object_manifest = '%s_segments/%s/%s/%s/' % (
390+ container, obj, put_headers['x-object-meta-mtime'],
391+ full_size)
392+ if old_manifest == new_object_manifest:
393+ old_manifest = None
394+ put_headers['x-object-manifest'] = new_object_manifest
395+ conn.put_object(container, obj, '', content_length=0,
396+ headers=put_headers)
397+ else:
398+ conn.put_object(container, obj, open(path, 'rb'),
399+ content_length=getsize(path), headers=put_headers)
400+ if old_manifest:
401+ segment_queue = Queue(10000)
402+ scontainer, sprefix = old_manifest.split('/', 1)
403+ for delobj in conn.get_container(scontainer,
404+ prefix=sprefix)[1]:
405+ segment_queue.put({'delete': True,
406+ 'container': scontainer, 'obj': delobj['name']})
407+ if not segment_queue.empty():
408+ segment_threads = [QueueFunctionThread(segment_queue,
409+ _segment_job, create_connection()) for _ in
410+ xrange(10)]
411+ for thread in segment_threads:
412+ thread.start()
413+ while not segment_queue.empty():
414+ sleep(0.01)
415+ for thread in segment_threads:
416+ thread.abort = True
417+ while thread.isAlive():
418+ thread.join(0.01)
419 if options.verbose:
420 print_queue.put(obj)
421 except OSError, err:
422@@ -1429,22 +1589,22 @@
423 def _upload_dir(path):
424 names = listdir(path)
425 if not names:
426- file_queue.put((path, True)) # dir_marker = True
427+ object_queue.put({'path': path, 'dir_marker': True})
428 else:
429 for name in listdir(path):
430 subpath = join(path, name)
431 if isdir(subpath):
432 _upload_dir(subpath)
433 else:
434- file_queue.put((subpath, False)) # dir_marker = False
435+ object_queue.put({'path': subpath})
436
437 url, token = get_auth(options.auth, options.user, options.key,
438 snet=options.snet)
439 create_connection = lambda: Connection(options.auth, options.user,
440 options.key, preauthurl=url, preauthtoken=token, snet=options.snet)
441- file_threads = [QueueFunctionThread(file_queue, _upload_file,
442+ object_threads = [QueueFunctionThread(object_queue, _object_job,
443 create_connection()) for _ in xrange(10)]
444- for thread in file_threads:
445+ for thread in object_threads:
446 thread.start()
447 conn = create_connection()
448 # Try to create the container, just in case it doesn't exist. If this
449@@ -1453,6 +1613,8 @@
450 # it'll surface on the first object PUT.
451 try:
452 conn.put_container(args[0])
453+ if options.segment_size is not None:
454+ conn.put_container(args[0] + '_segments')
455 except:
456 pass
457 try:
458@@ -1460,10 +1622,10 @@
459 if isdir(arg):
460 _upload_dir(arg)
461 else:
462- file_queue.put((arg, False)) # dir_marker = False
463- while not file_queue.empty():
464+ object_queue.put({'path': arg})
465+ while not object_queue.empty():
466 sleep(0.01)
467- for thread in file_threads:
468+ for thread in object_threads:
469 thread.abort = True
470 while thread.isAlive():
471 thread.join(0.01)
472
473=== modified file 'doc/source/index.rst'
474--- doc/source/index.rst 2010-11-12 18:54:07 +0000
475+++ doc/source/index.rst 2010-12-16 18:53:43 +0000
476@@ -44,6 +44,7 @@
477 overview_replication
478 overview_stats
479 ratelimit
480+ overview_large_objects
481
482 Developer Documentation
483 =======================
484
485=== added file 'doc/source/overview_large_objects.rst'
486--- doc/source/overview_large_objects.rst 1970-01-01 00:00:00 +0000
487+++ doc/source/overview_large_objects.rst 2010-12-16 18:53:43 +0000
488@@ -0,0 +1,177 @@
489+====================
490+Large Object Support
491+====================
492+
493+--------
494+Overview
495+--------
496+
497+Swift has a limit on the size of a single uploaded object; by default this is
498+5GB. However, the download size of a single object is virtually unlimited with
499+the concept of segmentation. Segments of the larger object are uploaded and a
500+special manifest file is created that, when downloaded, sends all the segments
501+concatenated as a single object. This also offers much greater upload speed
502+with the possibility of parallel uploads of the segments.
503+
504+----------------------------------
505+Using ``st`` for Segmented Objects
506+----------------------------------
507+
508+The quickest way to try out this feature is use the included ``st`` Swift Tool.
509+You can use the ``-S`` option to specify the segment size to use when splitting
510+a large file. For example::
511+
512+ st upload test_container -S 1073741824 large_file
513+
514+This would split the large_file into 1G segments and begin uploading those
515+segments in parallel. Once all the segments have been uploaded, ``st`` will
516+then create the manifest file so the segments can be downloaded as one.
517+
518+So now, the following ``st`` command would download the entire large object::
519+
520+ st download test_container large_file
521+
522+``st`` uses a strict convention for its segmented object support. In the above
523+example it will upload all the segments into a second container named
524+test_container_segments. These segments will have names like
525+large_file/1290206778.25/21474836480/00000000,
526+large_file/1290206778.25/21474836480/00000001, etc.
527+
528+The main benefit for using a separate container is that the main container
529+listings will not be polluted with all the segment names. The reason for using
530+the segment name format of <name>/<timestamp>/<size>/<segment> is so that an
531+upload of a new file with the same name won't overwrite the contents of the
532+first until the last moment when the manifest file is updated.
533+
534+``st`` will manage these segment files for you, deleting old segments on
535+deletes and overwrites, etc. You can override this behavior with the
536+``--leave-segments`` option if desired; this is useful if you want to have
537+multiple versions of the same large object available.
538+
539+----------
540+Direct API
541+----------
542+
543+You can also work with the segments and manifests directly with HTTP requests
544+instead of having ``st`` do that for you. You can just upload the segments like
545+you would any other object and the manifest is just a zero-byte file with an
546+extra ``X-Object-Manifest`` header.
547+
548+All the object segments need to be in the same container, have a common object
549+name prefix, and their names sort in the order they should be concatenated.
550+They don't have to be in the same container as the manifest file will be, which
551+is useful to keep container listings clean as explained above with ``st``.
552+
553+The manifest file is simply a zero-byte file with the extra
554+``X-Object-Manifest: <container>/<prefix>`` header, where ``<container>`` is
555+the container the object segments are in and ``<prefix>`` is the common prefix
556+for all the segments.
557+
558+It is best to upload all the segments first and then create or update the
559+manifest. In this way, the full object won't be available for downloading until
560+the upload is complete. Also, you can upload a new set of segments to a second
561+location and then update the manifest to point to this new location. During the
562+upload of the new segments, the original manifest will still be available to
563+download the first set of segments.
564+
565+Here's an example using ``curl`` with tiny 1-byte segments::
566+
567+ # First, upload the segments
568+ curl -X PUT -H 'X-Auth-Token: <token>' \
569+ http://<storage_url>/container/myobject/1 --data-binary '1'
570+ curl -X PUT -H 'X-Auth-Token: <token>' \
571+ http://<storage_url>/container/myobject/2 --data-binary '2'
572+ curl -X PUT -H 'X-Auth-Token: <token>' \
573+ http://<storage_url>/container/myobject/3 --data-binary '3'
574+
575+ # Next, create the manifest file
576+ curl -X PUT -H 'X-Auth-Token: <token>' \
577+ -H 'X-Object-Manifest: container/myobject/' \
578+ http://<storage_url>/container/myobject --data-binary ''
579+
580+ # And now we can download the segments as a single object
581+ curl -H 'X-Auth-Token: <token>' \
582+ http://<storage_url>/container/myobject
583+
584+----------------
585+Additional Notes
586+----------------
587+
588+* With a ``GET`` or ``HEAD`` of a manifest file, the ``X-Object-Manifest:
589+ <container>/<prefix>`` header will be returned with the concatenated object
590+ so you can tell where it's getting its segments from.
591+
592+* The response's ``Content-Length`` for a ``GET`` or ``HEAD`` on the manifest
593+ file will be the sum of all the segments in the ``<container>/<prefix>``
594+ listing, dynamically. So, uploading additional segments after the manifest is
595+ created will cause the concatenated object to be that much larger; there's no
596+ need to recreate the manifest file.
597+
598+* The response's ``Content-Type`` for a ``GET`` or ``HEAD`` on the manifest
599+ will be the same as the ``Content-Type`` set during the ``PUT`` request that
600+ created the manifest. You can easily change the ``Content-Type`` by reissuing
601+ the ``PUT``.
602+
603+* The response's ``ETag`` for a ``GET`` or ``HEAD`` on the manifest file will
604+ be the MD5 sum of the concatenated string of ETags for each of the segments
605+ in the ``<container>/<prefix>`` listing, dynamically. Usually in Swift the
606+ ETag is the MD5 sum of the contents of the object, and that holds true for
607+ each segment independently. But, it's not feasible to generate such an ETag
608+ for the manifest itself, so this method was chosen to at least offer change
609+ detection.
610+
611+-------
612+History
613+-------
614+
615+Large object support has gone through various iterations before settling on
616+this implementation.
617+
618+The primary factor driving the limitation of object size in swift is
619+maintaining balance among the partitions of the ring. To maintain an even
620+dispersion of disk usage throughout the cluster the obvious storage pattern
621+was to simply split larger objects into smaller segments, which could then be
622+glued together during a read.
623+
624+Before the introduction of large object support some applications were already
625+splitting their uploads into segments and re-assembling them on the client
626+side after retrieving the individual pieces. This design allowed the client
627+to support backup and archiving of large data sets, but was also frequently
628+employed to improve performance or reduce errors due to network interruption.
629+The major disadvantage of this method is that knowledge of the original
630+partitioning scheme is required to properly reassemble the object, which is
631+not practical for some use cases, such as CDN origination.
632+
633+In order to eliminate any barrier to entry for clients wanting to store
634+objects larger than 5GB, initially we also prototyped fully transparent
635+support for large object uploads. A fully transparent implementation would
636+support a larger max size by automatically splitting objects into segments
637+during upload within the proxy without any changes to the client API. All
638+segments were completely hidden from the client API.
639+
640+This solution introduced a number of challenging failure conditions into the
641+cluster, wouldn't provide the client with any option to do parallel uploads,
642+and had no basis for a resume feature. The transparent implementation was
643+deemed just too complex for the benefit.
644+
645+The current "user manifest" design was chosen in order to provide a
646+transparent download of large objects to the client and still provide the
647+uploading client a clean API to support segmented uploads.
648+
649+Alternative "explicit" user manifest options were discussed which would have
650+required a pre-defined format for listing the segments to "finalize" the
651+segmented upload. While this may offer some potential advantages, it was
652+decided that pushing an added burden onto the client which could potentially
653+limit adoption should be avoided in favor of a simpler "API" (essentially just
654+the format of the 'X-Object-Manifest' header).
655+
656+During development it was noted that this "implicit" user manifest approach
657+which is based on the path prefix can be potentially affected by the eventual
658+consistency window of the container listings, which could theoretically cause
659+a GET on the manifest object to return an invalid whole object for that short
660+term. In reality you're unlikely to encounter this scenario unless you're
661+running very high concurrency uploads against a small testing environment
662+which isn't running the object-updaters or container-replicators.
663+
664+Like all of swift, Large Object Support is living feature which will continue
665+to improve and may change over time.
666
667=== modified file 'swift/common/client.py'
668--- swift/common/client.py 2010-11-18 21:40:40 +0000
669+++ swift/common/client.py 2010-12-16 18:53:43 +0000
670@@ -569,7 +569,8 @@
671 :param container: container name that the object is in
672 :param name: object name to put
673 :param contents: a string or a file like object to read object data from
674- :param content_length: value to send as content-length header
675+ :param content_length: value to send as content-length header; also limits
676+ the amount read from contents
677 :param etag: etag of contents
678 :param chunk_size: chunk size of data to write
679 :param content_type: value to send as content-type header
680@@ -599,18 +600,24 @@
681 conn.putrequest('PUT', path)
682 for header, value in headers.iteritems():
683 conn.putheader(header, value)
684- if not content_length:
685+ if content_length is None:
686 conn.putheader('Transfer-Encoding', 'chunked')
687- conn.endheaders()
688- chunk = contents.read(chunk_size)
689- while chunk:
690- if not content_length:
691+ conn.endheaders()
692+ chunk = contents.read(chunk_size)
693+ while chunk:
694 conn.send('%x\r\n%s\r\n' % (len(chunk), chunk))
695- else:
696+ chunk = contents.read(chunk_size)
697+ conn.send('0\r\n\r\n')
698+ else:
699+ conn.endheaders()
700+ left = content_length
701+ while left > 0:
702+ size = chunk_size
703+ if size > left:
704+ size = left
705+ chunk = contents.read(size)
706 conn.send(chunk)
707- chunk = contents.read(chunk_size)
708- if not content_length:
709- conn.send('0\r\n\r\n')
710+ left -= len(chunk)
711 else:
712 conn.request('PUT', path, contents, headers)
713 resp = conn.getresponse()
714
715=== modified file 'swift/common/constraints.py'
716--- swift/common/constraints.py 2010-10-26 15:13:14 +0000
717+++ swift/common/constraints.py 2010-12-16 18:53:43 +0000
718@@ -113,6 +113,17 @@
719 if not check_utf8(req.headers['Content-Type']):
720 return HTTPBadRequest(request=req, body='Invalid Content-Type',
721 content_type='text/plain')
722+ if 'x-object-manifest' in req.headers:
723+ value = req.headers['x-object-manifest']
724+ container = prefix = None
725+ try:
726+ container, prefix = value.split('/', 1)
727+ except ValueError:
728+ pass
729+ if not container or not prefix or '?' in value or '&' in value or \
730+ prefix[0] == '/':
731+ return HTTPBadRequest(request=req,
732+ body='X-Object-Manifest must in the format container/prefix')
733 return check_metadata(req, 'object')
734
735
736
737=== modified file 'swift/obj/server.py'
738--- swift/obj/server.py 2010-11-01 21:47:48 +0000
739+++ swift/obj/server.py 2010-12-16 18:53:43 +0000
740@@ -391,6 +391,9 @@
741 'ETag': etag,
742 'Content-Length': str(os.fstat(fd).st_size),
743 }
744+ if 'x-object-manifest' in request.headers:
745+ metadata['X-Object-Manifest'] = \
746+ request.headers['x-object-manifest']
747 metadata.update(val for val in request.headers.iteritems()
748 if val[0].lower().startswith('x-object-meta-') and
749 len(val[0]) > 14)
750@@ -460,7 +463,8 @@
751 'application/octet-stream'), app_iter=file,
752 request=request, conditional_response=True)
753 for key, value in file.metadata.iteritems():
754- if key.lower().startswith('x-object-meta-'):
755+ if key == 'X-Object-Manifest' or \
756+ key.lower().startswith('x-object-meta-'):
757 response.headers[key] = value
758 response.etag = file.metadata['ETag']
759 response.last_modified = float(file.metadata['X-Timestamp'])
760@@ -488,7 +492,8 @@
761 response = Response(content_type=file.metadata['Content-Type'],
762 request=request, conditional_response=True)
763 for key, value in file.metadata.iteritems():
764- if key.lower().startswith('x-object-meta-'):
765+ if key == 'X-Object-Manifest' or \
766+ key.lower().startswith('x-object-meta-'):
767 response.headers[key] = value
768 response.etag = file.metadata['ETag']
769 response.last_modified = float(file.metadata['X-Timestamp'])
770
771=== modified file 'swift/proxy/server.py'
772--- swift/proxy/server.py 2010-12-13 18:25:09 +0000
773+++ swift/proxy/server.py 2010-12-16 18:53:43 +0000
774@@ -14,15 +14,23 @@
775 # limitations under the License.
776
777 from __future__ import with_statement
778+try:
779+ import simplejson as json
780+except ImportError:
781+ import json
782 import mimetypes
783 import os
784+import re
785 import time
786 import traceback
787 from ConfigParser import ConfigParser
788+from datetime import datetime
789 from urllib import unquote, quote
790 import uuid
791 import functools
792+from hashlib import md5
793
794+from eventlet import sleep
795 from eventlet.timeout import Timeout
796 from webob.exc import HTTPBadRequest, HTTPMethodNotAllowed, \
797 HTTPNotFound, HTTPPreconditionFailed, \
798@@ -36,8 +44,8 @@
799 cache_from_env
800 from swift.common.bufferedhttp import http_connect
801 from swift.common.constraints import check_metadata, check_object_creation, \
802- check_utf8, MAX_ACCOUNT_NAME_LENGTH, MAX_CONTAINER_NAME_LENGTH, \
803- MAX_FILE_SIZE
804+ check_utf8, CONTAINER_LISTING_LIMIT, MAX_ACCOUNT_NAME_LENGTH, \
805+ MAX_CONTAINER_NAME_LENGTH, MAX_FILE_SIZE
806 from swift.common.exceptions import ChunkReadTimeout, \
807 ChunkWriteTimeout, ConnectionTimeout
808
809@@ -95,6 +103,161 @@
810 return 'container/%s/%s' % (account, container)
811
812
813+class SegmentedIterable(object):
814+ """
815+ Iterable that returns the object contents for a segmented object in Swift.
816+
817+ If set, the response's `bytes_transferred` value will be updated (used to
818+ log the size of the request). Also, if there's a failure that cuts the
819+ transfer short, the response's `status_int` will be updated (again, just
820+ for logging since the original status would have already been sent to the
821+ client).
822+
823+ :param controller: The ObjectController instance to work with.
824+ :param container: The container the object segments are within.
825+ :param listing: The listing of object segments to iterate over; this may
826+ be an iterator or list that returns dicts with 'name' and
827+ 'bytes' keys.
828+ :param response: The webob.Response this iterable is associated with, if
829+ any (default: None)
830+ """
831+
832+ def __init__(self, controller, container, listing, response=None):
833+ self.controller = controller
834+ self.container = container
835+ self.listing = iter(listing)
836+ self.segment = -1
837+ self.segment_dict = None
838+ self.segment_peek = None
839+ self.seek = 0
840+ self.segment_iter = None
841+ self.position = 0
842+ self.response = response
843+ if not self.response:
844+ self.response = Response()
845+ self.next_get_time = 0
846+
847+ def _load_next_segment(self):
848+ """
849+ Loads the self.segment_iter with the next object segment's contents.
850+
851+ :raises: StopIteration when there are no more object segments.
852+ """
853+ try:
854+ self.segment += 1
855+ self.segment_dict = self.segment_peek or self.listing.next()
856+ self.segment_peek = None
857+ partition, nodes = self.controller.app.object_ring.get_nodes(
858+ self.controller.account_name, self.container,
859+ self.segment_dict['name'])
860+ path = '/%s/%s/%s' % (self.controller.account_name, self.container,
861+ self.segment_dict['name'])
862+ req = Request.blank(path)
863+ if self.seek:
864+ req.range = 'bytes=%s-' % self.seek
865+ self.seek = 0
866+ if self.segment > 10:
867+ sleep(max(self.next_get_time - time.time(), 0))
868+ self.next_get_time = time.time() + 1
869+ resp = self.controller.GETorHEAD_base(req, 'Object', partition,
870+ self.controller.iter_nodes(partition, nodes,
871+ self.controller.app.object_ring), path,
872+ self.controller.app.object_ring.replica_count)
873+ if resp.status_int // 100 != 2:
874+ raise Exception('Could not load object segment %s: %s' % (path,
875+ resp.status_int))
876+ self.segment_iter = resp.app_iter
877+ except StopIteration:
878+ raise
879+ except Exception, err:
880+ if not getattr(err, 'swift_logged', False):
881+ self.controller.app.logger.exception('ERROR: While processing '
882+ 'manifest /%s/%s/%s %s' % (self.controller.account_name,
883+ self.controller.container_name,
884+ self.controller.object_name, self.controller.trans_id))
885+ err.swift_logged = True
886+ self.response.status_int = 503
887+ raise
888+
889+ def next(self):
890+ return iter(self).next()
891+
892+ def __iter__(self):
893+ """ Standard iterator function that returns the object's contents. """
894+ try:
895+ while True:
896+ if not self.segment_iter:
897+ self._load_next_segment()
898+ while True:
899+ with ChunkReadTimeout(self.controller.app.node_timeout):
900+ try:
901+ chunk = self.segment_iter.next()
902+ break
903+ except StopIteration:
904+ self._load_next_segment()
905+ self.position += len(chunk)
906+ self.response.bytes_transferred = getattr(self.response,
907+ 'bytes_transferred', 0) + len(chunk)
908+ yield chunk
909+ except StopIteration:
910+ raise
911+ except Exception, err:
912+ if not getattr(err, 'swift_logged', False):
913+ self.controller.app.logger.exception('ERROR: While processing '
914+ 'manifest /%s/%s/%s %s' % (self.controller.account_name,
915+ self.controller.container_name,
916+ self.controller.object_name, self.controller.trans_id))
917+ err.swift_logged = True
918+ self.response.status_int = 503
919+ raise
920+
921+ def app_iter_range(self, start, stop):
922+ """
923+ Non-standard iterator function for use with Webob in serving Range
924+ requests more quickly. This will skip over segments and do a range
925+ request on the first segment to return data from, if needed.
926+
927+ :param start: The first byte (zero-based) to return. None for 0.
928+ :param stop: The last byte (zero-based) to return. None for end.
929+ """
930+ try:
931+ if start:
932+ self.segment_peek = self.listing.next()
933+ while start >= self.position + self.segment_peek['bytes']:
934+ self.segment += 1
935+ self.position += self.segment_peek['bytes']
936+ self.segment_peek = self.listing.next()
937+ self.seek = start - self.position
938+ else:
939+ start = 0
940+ if stop is not None:
941+ length = stop - start
942+ else:
943+ length = None
944+ for chunk in self:
945+ if length is not None:
946+ length -= len(chunk)
947+ if length < 0:
948+ # Chop off the extra:
949+ self.response.bytes_transferred = \
950+ getattr(self.response, 'bytes_transferred', 0) \
951+ + length
952+ yield chunk[:length]
953+ break
954+ yield chunk
955+ except StopIteration:
956+ raise
957+ except Exception, err:
958+ if not getattr(err, 'swift_logged', False):
959+ self.controller.app.logger.exception('ERROR: While processing '
960+ 'manifest /%s/%s/%s %s' % (self.controller.account_name,
961+ self.controller.container_name,
962+ self.controller.object_name, self.controller.trans_id))
963+ err.swift_logged = True
964+ self.response.status_int = 503
965+ raise
966+
967+
968 class Controller(object):
969 """Base WSGI controller class for the proxy"""
970
971@@ -538,9 +701,137 @@
972 return aresp
973 partition, nodes = self.app.object_ring.get_nodes(
974 self.account_name, self.container_name, self.object_name)
975- return self.GETorHEAD_base(req, 'Object', partition,
976+ resp = self.GETorHEAD_base(req, 'Object', partition,
977 self.iter_nodes(partition, nodes, self.app.object_ring),
978 req.path_info, self.app.object_ring.replica_count)
979+ # If we get a 416 Requested Range Not Satisfiable we have to check if
980+ # we were actually requesting a manifest object and then redo the range
981+ # request on the whole object.
982+ if resp.status_int == 416:
983+ req_range = req.range
984+ req.range = None
985+ resp2 = self.GETorHEAD_base(req, 'Object', partition,
986+ self.iter_nodes(partition, nodes, self.app.object_ring),
987+ req.path_info, self.app.object_ring.replica_count)
988+ if 'x-object-manifest' not in resp2.headers:
989+ return resp
990+ resp = resp2
991+ req.range = req_range
992+
993+ if 'x-object-manifest' in resp.headers:
994+ lcontainer, lprefix = \
995+ resp.headers['x-object-manifest'].split('/', 1)
996+ lpartition, lnodes = self.app.container_ring.get_nodes(
997+ self.account_name, lcontainer)
998+ marker = ''
999+ listing = []
1000+ while True:
1001+ lreq = Request.blank('/%s/%s?prefix=%s&format=json&marker=%s' %
1002+ (quote(self.account_name), quote(lcontainer),
1003+ quote(lprefix), quote(marker)))
1004+ lresp = self.GETorHEAD_base(lreq, 'Container', lpartition,
1005+ lnodes, lreq.path_info,
1006+ self.app.container_ring.replica_count)
1007+ if lresp.status_int // 100 != 2:
1008+ lresp = HTTPNotFound(request=req)
1009+ lresp.headers['X-Object-Manifest'] = \
1010+ resp.headers['x-object-manifest']
1011+ return lresp
1012+ if 'swift.authorize' in req.environ:
1013+ req.acl = lresp.headers.get('x-container-read')
1014+ aresp = req.environ['swift.authorize'](req)
1015+ if aresp:
1016+ return aresp
1017+ sublisting = json.loads(lresp.body)
1018+ if not sublisting:
1019+ break
1020+ listing.extend(sublisting)
1021+ if len(listing) > CONTAINER_LISTING_LIMIT:
1022+ break
1023+ marker = sublisting[-1]['name']
1024+
1025+ if len(listing) > CONTAINER_LISTING_LIMIT:
1026+ # We will serve large objects with a ton of segments with
1027+ # chunked transfer encoding.
1028+
1029+ def listing_iter():
1030+ marker = ''
1031+ while True:
1032+ lreq = Request.blank(
1033+ '/%s/%s?prefix=%s&format=json&marker=%s' %
1034+ (quote(self.account_name), quote(lcontainer),
1035+ quote(lprefix), quote(marker)))
1036+ lresp = self.GETorHEAD_base(lreq, 'Container',
1037+ lpartition, lnodes, lreq.path_info,
1038+ self.app.container_ring.replica_count)
1039+ if lresp.status_int // 100 != 2:
1040+ raise Exception('Object manifest GET could not '
1041+ 'continue listing: %s %s' %
1042+ (req.path, lreq.path))
1043+ if 'swift.authorize' in req.environ:
1044+ req.acl = lresp.headers.get('x-container-read')
1045+ aresp = req.environ['swift.authorize'](req)
1046+ if aresp:
1047+ raise Exception('Object manifest GET could '
1048+ 'not continue listing: %s %s' %
1049+ (req.path, aresp))
1050+ sublisting = json.loads(lresp.body)
1051+ if not sublisting:
1052+ break
1053+ for obj in sublisting:
1054+ yield obj
1055+ marker = sublisting[-1]['name']
1056+
1057+ headers = {
1058+ 'X-Object-Manifest': resp.headers['x-object-manifest'],
1059+ 'Content-Type': resp.content_type}
1060+ for key, value in resp.headers.iteritems():
1061+ if key.lower().startswith('x-object-meta-'):
1062+ headers[key] = value
1063+ resp = Response(headers=headers, request=req,
1064+ conditional_response=True)
1065+ if req.method == 'HEAD':
1066+ # These shenanigans are because webob translates the HEAD
1067+ # request into a webob EmptyResponse for the body, which
1068+ # has a len, which eventlet translates as needing a
1069+ # content-length header added. So we call the original
1070+ # webob resp for the headers but return an empty iterator
1071+ # for the body.
1072+
1073+ def head_response(environ, start_response):
1074+ resp(environ, start_response)
1075+ return iter([])
1076+
1077+ head_response.status_int = resp.status_int
1078+ return head_response
1079+ else:
1080+ resp.app_iter = SegmentedIterable(self, lcontainer,
1081+ listing_iter(), resp)
1082+
1083+ else:
1084+ # For objects with a reasonable number of segments, we'll serve
1085+ # them with a set content-length and computed etag.
1086+ content_length = sum(o['bytes'] for o in listing)
1087+ last_modified = max(o['last_modified'] for o in listing)
1088+ last_modified = \
1089+ datetime(*map(int, re.split('[^\d]', last_modified)[:-1]))
1090+ etag = md5('"'.join(o['hash'] for o in listing)).hexdigest()
1091+ headers = {
1092+ 'X-Object-Manifest': resp.headers['x-object-manifest'],
1093+ 'Content-Type': resp.content_type,
1094+ 'Content-Length': content_length,
1095+ 'ETag': etag}
1096+ for key, value in resp.headers.iteritems():
1097+ if key.lower().startswith('x-object-meta-'):
1098+ headers[key] = value
1099+ resp = Response(headers=headers, request=req,
1100+ conditional_response=True)
1101+ resp.app_iter = SegmentedIterable(self, lcontainer, listing,
1102+ resp)
1103+ resp.content_length = content_length
1104+ resp.last_modified = last_modified
1105+
1106+ return resp
1107
1108 @public
1109 @delay_denial
1110@@ -654,11 +945,17 @@
1111 return source_resp
1112 self.object_name = orig_obj_name
1113 self.container_name = orig_container_name
1114- data_source = source_resp.app_iter
1115 new_req = Request.blank(req.path_info,
1116 environ=req.environ, headers=req.headers)
1117- new_req.content_length = source_resp.content_length
1118- new_req.etag = source_resp.etag
1119+ if 'x-object-manifest' in source_resp.headers:
1120+ data_source = iter([''])
1121+ new_req.content_length = 0
1122+ new_req.headers['X-Object-Manifest'] = \
1123+ source_resp.headers['x-object-manifest']
1124+ else:
1125+ data_source = source_resp.app_iter
1126+ new_req.content_length = source_resp.content_length
1127+ new_req.etag = source_resp.etag
1128 # we no longer need the X-Copy-From header
1129 del new_req.headers['X-Copy-From']
1130 for k, v in source_resp.headers.items():
1131
1132=== modified file 'test/functionalnosetests/test_object.py'
1133--- test/functionalnosetests/test_object.py 2010-11-04 19:39:29 +0000
1134+++ test/functionalnosetests/test_object.py 2010-12-16 18:53:43 +0000
1135@@ -16,6 +16,7 @@
1136 if skip:
1137 raise SkipTest
1138 self.container = uuid4().hex
1139+
1140 def put(url, token, parsed, conn):
1141 conn.request('PUT', parsed.path + '/' + self.container, '',
1142 {'X-Auth-Token': token})
1143@@ -24,6 +25,7 @@
1144 resp.read()
1145 self.assertEquals(resp.status, 201)
1146 self.obj = uuid4().hex
1147+
1148 def put(url, token, parsed, conn):
1149 conn.request('PUT', '%s/%s/%s' % (parsed.path, self.container,
1150 self.obj), 'test', {'X-Auth-Token': token})
1151@@ -35,6 +37,7 @@
1152 def tearDown(self):
1153 if skip:
1154 raise SkipTest
1155+
1156 def delete(url, token, parsed, conn):
1157 conn.request('DELETE', '%s/%s/%s' % (parsed.path, self.container,
1158 self.obj), '', {'X-Auth-Token': token})
1159@@ -42,6 +45,7 @@
1160 resp = retry(delete)
1161 resp.read()
1162 self.assertEquals(resp.status, 204)
1163+
1164 def delete(url, token, parsed, conn):
1165 conn.request('DELETE', parsed.path + '/' + self.container, '',
1166 {'X-Auth-Token': token})
1167@@ -53,6 +57,7 @@
1168 def test_public_object(self):
1169 if skip:
1170 raise SkipTest
1171+
1172 def get(url, token, parsed, conn):
1173 conn.request('GET',
1174 '%s/%s/%s' % (parsed.path, self.container, self.obj))
1175@@ -62,6 +67,7 @@
1176 raise Exception('Should not have been able to GET')
1177 except Exception, err:
1178 self.assert_(str(err).startswith('No result after '))
1179+
1180 def post(url, token, parsed, conn):
1181 conn.request('POST', parsed.path + '/' + self.container, '',
1182 {'X-Auth-Token': token,
1183@@ -73,6 +79,7 @@
1184 resp = retry(get)
1185 resp.read()
1186 self.assertEquals(resp.status, 200)
1187+
1188 def post(url, token, parsed, conn):
1189 conn.request('POST', parsed.path + '/' + self.container, '',
1190 {'X-Auth-Token': token, 'X-Container-Read': ''})
1191@@ -89,6 +96,7 @@
1192 def test_private_object(self):
1193 if skip or skip3:
1194 raise SkipTest
1195+
1196 # Ensure we can't access the object with the third account
1197 def get(url, token, parsed, conn):
1198 conn.request('GET', '%s/%s/%s' % (parsed.path, self.container,
1199@@ -98,8 +106,10 @@
1200 resp = retry(get, use_account=3)
1201 resp.read()
1202 self.assertEquals(resp.status, 403)
1203+
1204 # create a shared container writable by account3
1205 shared_container = uuid4().hex
1206+
1207 def put(url, token, parsed, conn):
1208 conn.request('PUT', '%s/%s' % (parsed.path,
1209 shared_container), '',
1210@@ -110,6 +120,7 @@
1211 resp = retry(put)
1212 resp.read()
1213 self.assertEquals(resp.status, 201)
1214+
1215 # verify third account can not copy from private container
1216 def copy(url, token, parsed, conn):
1217 conn.request('PUT', '%s/%s/%s' % (parsed.path,
1218@@ -123,6 +134,7 @@
1219 resp = retry(copy, use_account=3)
1220 resp.read()
1221 self.assertEquals(resp.status, 403)
1222+
1223 # verify third account can write "obj1" to shared container
1224 def put(url, token, parsed, conn):
1225 conn.request('PUT', '%s/%s/%s' % (parsed.path, shared_container,
1226@@ -131,6 +143,7 @@
1227 resp = retry(put, use_account=3)
1228 resp.read()
1229 self.assertEquals(resp.status, 201)
1230+
1231 # verify third account can copy "obj1" to shared container
1232 def copy2(url, token, parsed, conn):
1233 conn.request('COPY', '%s/%s/%s' % (parsed.path,
1234@@ -143,6 +156,7 @@
1235 resp = retry(copy2, use_account=3)
1236 resp.read()
1237 self.assertEquals(resp.status, 201)
1238+
1239 # verify third account STILL can not copy from private container
1240 def copy3(url, token, parsed, conn):
1241 conn.request('COPY', '%s/%s/%s' % (parsed.path,
1242@@ -155,6 +169,7 @@
1243 resp = retry(copy3, use_account=3)
1244 resp.read()
1245 self.assertEquals(resp.status, 403)
1246+
1247 # clean up "obj1"
1248 def delete(url, token, parsed, conn):
1249 conn.request('DELETE', '%s/%s/%s' % (parsed.path, shared_container,
1250@@ -163,6 +178,7 @@
1251 resp = retry(delete)
1252 resp.read()
1253 self.assertEquals(resp.status, 204)
1254+
1255 # clean up shared_container
1256 def delete(url, token, parsed, conn):
1257 conn.request('DELETE',
1258@@ -173,6 +189,269 @@
1259 resp.read()
1260 self.assertEquals(resp.status, 204)
1261
1262+ def test_manifest(self):
1263+ if skip:
1264+ raise SkipTest
1265+ # Data for the object segments
1266+ segments1 = ['one', 'two', 'three', 'four', 'five']
1267+ segments2 = ['six', 'seven', 'eight']
1268+ segments3 = ['nine', 'ten', 'eleven']
1269+
1270+ # Upload the first set of segments
1271+ def put(url, token, parsed, conn, objnum):
1272+ conn.request('PUT', '%s/%s/segments1/%s' % (parsed.path,
1273+ self.container, str(objnum)), segments1[objnum],
1274+ {'X-Auth-Token': token})
1275+ return check_response(conn)
1276+ for objnum in xrange(len(segments1)):
1277+ resp = retry(put, objnum)
1278+ resp.read()
1279+ self.assertEquals(resp.status, 201)
1280+
1281+ # Upload the manifest
1282+ def put(url, token, parsed, conn):
1283+ conn.request('PUT', '%s/%s/manifest' % (parsed.path,
1284+ self.container), '', {'X-Auth-Token': token,
1285+ 'X-Object-Manifest': '%s/segments1/' % self.container,
1286+ 'Content-Type': 'text/jibberish', 'Content-Length': '0'})
1287+ return check_response(conn)
1288+ resp = retry(put)
1289+ resp.read()
1290+ self.assertEquals(resp.status, 201)
1291+
1292+ # Get the manifest (should get all the segments as the body)
1293+ def get(url, token, parsed, conn):
1294+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1295+ self.container), '', {'X-Auth-Token': token})
1296+ return check_response(conn)
1297+ resp = retry(get)
1298+ self.assertEquals(resp.read(), ''.join(segments1))
1299+ self.assertEquals(resp.status, 200)
1300+ self.assertEquals(resp.getheader('content-type'), 'text/jibberish')
1301+
1302+ # Get with a range at the start of the second segment
1303+ def get(url, token, parsed, conn):
1304+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1305+ self.container), '', {'X-Auth-Token': token, 'Range':
1306+ 'bytes=3-'})
1307+ return check_response(conn)
1308+ resp = retry(get)
1309+ self.assertEquals(resp.read(), ''.join(segments1[1:]))
1310+ self.assertEquals(resp.status, 206)
1311+
1312+ # Get with a range in the middle of the second segment
1313+ def get(url, token, parsed, conn):
1314+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1315+ self.container), '', {'X-Auth-Token': token, 'Range':
1316+ 'bytes=5-'})
1317+ return check_response(conn)
1318+ resp = retry(get)
1319+ self.assertEquals(resp.read(), ''.join(segments1)[5:])
1320+ self.assertEquals(resp.status, 206)
1321+
1322+ # Get with a full start and stop range
1323+ def get(url, token, parsed, conn):
1324+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1325+ self.container), '', {'X-Auth-Token': token, 'Range':
1326+ 'bytes=5-10'})
1327+ return check_response(conn)
1328+ resp = retry(get)
1329+ self.assertEquals(resp.read(), ''.join(segments1)[5:11])
1330+ self.assertEquals(resp.status, 206)
1331+
1332+ # Upload the second set of segments
1333+ def put(url, token, parsed, conn, objnum):
1334+ conn.request('PUT', '%s/%s/segments2/%s' % (parsed.path,
1335+ self.container, str(objnum)), segments2[objnum],
1336+ {'X-Auth-Token': token})
1337+ return check_response(conn)
1338+ for objnum in xrange(len(segments2)):
1339+ resp = retry(put, objnum)
1340+ resp.read()
1341+ self.assertEquals(resp.status, 201)
1342+
1343+ # Get the manifest (should still be the first segments of course)
1344+ def get(url, token, parsed, conn):
1345+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1346+ self.container), '', {'X-Auth-Token': token})
1347+ return check_response(conn)
1348+ resp = retry(get)
1349+ self.assertEquals(resp.read(), ''.join(segments1))
1350+ self.assertEquals(resp.status, 200)
1351+
1352+ # Update the manifest
1353+ def put(url, token, parsed, conn):
1354+ conn.request('PUT', '%s/%s/manifest' % (parsed.path,
1355+ self.container), '', {'X-Auth-Token': token,
1356+ 'X-Object-Manifest': '%s/segments2/' % self.container,
1357+ 'Content-Length': '0'})
1358+ return check_response(conn)
1359+ resp = retry(put)
1360+ resp.read()
1361+ self.assertEquals(resp.status, 201)
1362+
1363+ # Get the manifest (should be the second set of segments now)
1364+ def get(url, token, parsed, conn):
1365+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1366+ self.container), '', {'X-Auth-Token': token})
1367+ return check_response(conn)
1368+ resp = retry(get)
1369+ self.assertEquals(resp.read(), ''.join(segments2))
1370+ self.assertEquals(resp.status, 200)
1371+
1372+ if not skip3:
1373+
1374+ # Ensure we can't access the manifest with the third account
1375+ def get(url, token, parsed, conn):
1376+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1377+ self.container), '', {'X-Auth-Token': token})
1378+ return check_response(conn)
1379+ resp = retry(get, use_account=3)
1380+ resp.read()
1381+ self.assertEquals(resp.status, 403)
1382+
1383+ # Grant access to the third account
1384+ def post(url, token, parsed, conn):
1385+ conn.request('POST', '%s/%s' % (parsed.path, self.container),
1386+ '', {'X-Auth-Token': token, 'X-Container-Read':
1387+ swift_test_user[2]})
1388+ return check_response(conn)
1389+ resp = retry(post)
1390+ resp.read()
1391+ self.assertEquals(resp.status, 204)
1392+
1393+ # The third account should be able to get the manifest now
1394+ def get(url, token, parsed, conn):
1395+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1396+ self.container), '', {'X-Auth-Token': token})
1397+ return check_response(conn)
1398+ resp = retry(get, use_account=3)
1399+ self.assertEquals(resp.read(), ''.join(segments2))
1400+ self.assertEquals(resp.status, 200)
1401+
1402+ # Create another container for the third set of segments
1403+ acontainer = uuid4().hex
1404+
1405+ def put(url, token, parsed, conn):
1406+ conn.request('PUT', parsed.path + '/' + acontainer, '',
1407+ {'X-Auth-Token': token})
1408+ return check_response(conn)
1409+ resp = retry(put)
1410+ resp.read()
1411+ self.assertEquals(resp.status, 201)
1412+
1413+ # Upload the third set of segments in the other container
1414+ def put(url, token, parsed, conn, objnum):
1415+ conn.request('PUT', '%s/%s/segments3/%s' % (parsed.path,
1416+ acontainer, str(objnum)), segments3[objnum],
1417+ {'X-Auth-Token': token})
1418+ return check_response(conn)
1419+ for objnum in xrange(len(segments3)):
1420+ resp = retry(put, objnum)
1421+ resp.read()
1422+ self.assertEquals(resp.status, 201)
1423+
1424+ # Update the manifest
1425+ def put(url, token, parsed, conn):
1426+ conn.request('PUT', '%s/%s/manifest' % (parsed.path,
1427+ self.container), '', {'X-Auth-Token': token,
1428+ 'X-Object-Manifest': '%s/segments3/' % acontainer,
1429+ 'Content-Length': '0'})
1430+ return check_response(conn)
1431+ resp = retry(put)
1432+ resp.read()
1433+ self.assertEquals(resp.status, 201)
1434+
1435+ # Get the manifest to ensure it's the third set of segments
1436+ def get(url, token, parsed, conn):
1437+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1438+ self.container), '', {'X-Auth-Token': token})
1439+ return check_response(conn)
1440+ resp = retry(get)
1441+ self.assertEquals(resp.read(), ''.join(segments3))
1442+ self.assertEquals(resp.status, 200)
1443+
1444+ if not skip3:
1445+
1446+ # Ensure we can't access the manifest with the third account
1447+ # (because the segments are in a protected container even if the
1448+ # manifest itself is not).
1449+
1450+ def get(url, token, parsed, conn):
1451+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1452+ self.container), '', {'X-Auth-Token': token})
1453+ return check_response(conn)
1454+ resp = retry(get, use_account=3)
1455+ resp.read()
1456+ self.assertEquals(resp.status, 403)
1457+
1458+ # Grant access to the third account
1459+ def post(url, token, parsed, conn):
1460+ conn.request('POST', '%s/%s' % (parsed.path, acontainer),
1461+ '', {'X-Auth-Token': token, 'X-Container-Read':
1462+ swift_test_user[2]})
1463+ return check_response(conn)
1464+ resp = retry(post)
1465+ resp.read()
1466+ self.assertEquals(resp.status, 204)
1467+
1468+ # The third account should be able to get the manifest now
1469+ def get(url, token, parsed, conn):
1470+ conn.request('GET', '%s/%s/manifest' % (parsed.path,
1471+ self.container), '', {'X-Auth-Token': token})
1472+ return check_response(conn)
1473+ resp = retry(get, use_account=3)
1474+ self.assertEquals(resp.read(), ''.join(segments3))
1475+ self.assertEquals(resp.status, 200)
1476+
1477+ # Delete the manifest
1478+ def delete(url, token, parsed, conn, objnum):
1479+ conn.request('DELETE', '%s/%s/manifest' % (parsed.path,
1480+ self.container), '', {'X-Auth-Token': token})
1481+ return check_response(conn)
1482+ resp = retry(delete, objnum)
1483+ resp.read()
1484+ self.assertEquals(resp.status, 204)
1485+
1486+ # Delete the third set of segments
1487+ def delete(url, token, parsed, conn, objnum):
1488+ conn.request('DELETE', '%s/%s/segments3/%s' % (parsed.path,
1489+ acontainer, str(objnum)), '', {'X-Auth-Token': token})
1490+ return check_response(conn)
1491+ for objnum in xrange(len(segments3)):
1492+ resp = retry(delete, objnum)
1493+ resp.read()
1494+ self.assertEquals(resp.status, 204)
1495+
1496+ # Delete the second set of segments
1497+ def delete(url, token, parsed, conn, objnum):
1498+ conn.request('DELETE', '%s/%s/segments2/%s' % (parsed.path,
1499+ self.container, str(objnum)), '', {'X-Auth-Token': token})
1500+ return check_response(conn)
1501+ for objnum in xrange(len(segments2)):
1502+ resp = retry(delete, objnum)
1503+ resp.read()
1504+ self.assertEquals(resp.status, 204)
1505+
1506+ # Delete the first set of segments
1507+ def delete(url, token, parsed, conn, objnum):
1508+ conn.request('DELETE', '%s/%s/segments1/%s' % (parsed.path,
1509+ self.container, str(objnum)), '', {'X-Auth-Token': token})
1510+ return check_response(conn)
1511+ for objnum in xrange(len(segments1)):
1512+ resp = retry(delete, objnum)
1513+ resp.read()
1514+ self.assertEquals(resp.status, 204)
1515+
1516+ # Delete the extra container
1517+ def delete(url, token, parsed, conn):
1518+ conn.request('DELETE', '%s/%s' % (parsed.path, acontainer), '',
1519+ {'X-Auth-Token': token})
1520+ return check_response(conn)
1521+ resp = retry(delete)
1522+ resp.read()
1523+ self.assertEquals(resp.status, 204)
1524+
1525
1526 if __name__ == '__main__':
1527 unittest.main()
1528
1529=== modified file 'test/unit/common/test_constraints.py'
1530--- test/unit/common/test_constraints.py 2010-08-16 22:30:27 +0000
1531+++ test/unit/common/test_constraints.py 2010-12-16 18:53:43 +0000
1532@@ -22,6 +22,7 @@
1533
1534 from swift.common import constraints
1535
1536+
1537 class TestConstraints(unittest.TestCase):
1538
1539 def test_check_metadata_empty(self):
1540@@ -137,6 +138,32 @@
1541 self.assert_(isinstance(resp, HTTPBadRequest))
1542 self.assert_('Content-Type' in resp.body)
1543
1544+ def test_check_object_manifest_header(self):
1545+ resp = constraints.check_object_creation(Request.blank('/',
1546+ headers={'X-Object-Manifest': 'container/prefix', 'Content-Length':
1547+ '0', 'Content-Type': 'text/plain'}), 'manifest')
1548+ self.assert_(not resp)
1549+ resp = constraints.check_object_creation(Request.blank('/',
1550+ headers={'X-Object-Manifest': 'container', 'Content-Length': '0',
1551+ 'Content-Type': 'text/plain'}), 'manifest')
1552+ self.assert_(isinstance(resp, HTTPBadRequest))
1553+ resp = constraints.check_object_creation(Request.blank('/',
1554+ headers={'X-Object-Manifest': '/container/prefix',
1555+ 'Content-Length': '0', 'Content-Type': 'text/plain'}), 'manifest')
1556+ self.assert_(isinstance(resp, HTTPBadRequest))
1557+ resp = constraints.check_object_creation(Request.blank('/',
1558+ headers={'X-Object-Manifest': 'container/prefix?query=param',
1559+ 'Content-Length': '0', 'Content-Type': 'text/plain'}), 'manifest')
1560+ self.assert_(isinstance(resp, HTTPBadRequest))
1561+ resp = constraints.check_object_creation(Request.blank('/',
1562+ headers={'X-Object-Manifest': 'container/prefix&query=param',
1563+ 'Content-Length': '0', 'Content-Type': 'text/plain'}), 'manifest')
1564+ self.assert_(isinstance(resp, HTTPBadRequest))
1565+ resp = constraints.check_object_creation(Request.blank('/',
1566+ headers={'X-Object-Manifest': 'http://host/container/prefix',
1567+ 'Content-Length': '0', 'Content-Type': 'text/plain'}), 'manifest')
1568+ self.assert_(isinstance(resp, HTTPBadRequest))
1569+
1570 def test_check_mount(self):
1571 self.assertFalse(constraints.check_mount('', ''))
1572 constraints.os = MockTrue() # mock os module
1573
1574=== modified file 'test/unit/obj/test_server.py'
1575--- test/unit/obj/test_server.py 2010-10-13 21:26:43 +0000
1576+++ test/unit/obj/test_server.py 2010-12-16 18:53:43 +0000
1577@@ -42,7 +42,7 @@
1578 self.path_to_test_xfs = os.environ.get('PATH_TO_TEST_XFS')
1579 if not self.path_to_test_xfs or \
1580 not os.path.exists(self.path_to_test_xfs):
1581- print >>sys.stderr, 'WARNING: PATH_TO_TEST_XFS not set or not ' \
1582+ print >> sys.stderr, 'WARNING: PATH_TO_TEST_XFS not set or not ' \
1583 'pointing to a valid directory.\n' \
1584 'Please set PATH_TO_TEST_XFS to a directory on an XFS file ' \
1585 'system for testing.'
1586@@ -77,7 +77,8 @@
1587 self.assertEquals(resp.status_int, 201)
1588
1589 timestamp = normalize_timestamp(time())
1590- req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'POST'},
1591+ req = Request.blank('/sda1/p/a/c/o',
1592+ environ={'REQUEST_METHOD': 'POST'},
1593 headers={'X-Timestamp': timestamp,
1594 'X-Object-Meta-3': 'Three',
1595 'X-Object-Meta-4': 'Four',
1596@@ -95,7 +96,8 @@
1597 if not self.path_to_test_xfs:
1598 raise SkipTest
1599 timestamp = normalize_timestamp(time())
1600- req = Request.blank('/sda1/p/a/c/fail', environ={'REQUEST_METHOD': 'POST'},
1601+ req = Request.blank('/sda1/p/a/c/fail',
1602+ environ={'REQUEST_METHOD': 'POST'},
1603 headers={'X-Timestamp': timestamp,
1604 'X-Object-Meta-1': 'One',
1605 'X-Object-Meta-2': 'Two',
1606@@ -116,29 +118,37 @@
1607 def test_POST_container_connection(self):
1608 if not self.path_to_test_xfs:
1609 raise SkipTest
1610+
1611 def mock_http_connect(response, with_exc=False):
1612+
1613 class FakeConn(object):
1614+
1615 def __init__(self, status, with_exc):
1616 self.status = status
1617 self.reason = 'Fake'
1618 self.host = '1.2.3.4'
1619 self.port = '1234'
1620 self.with_exc = with_exc
1621+
1622 def getresponse(self):
1623 if self.with_exc:
1624 raise Exception('test')
1625 return self
1626+
1627 def read(self, amt=None):
1628 return ''
1629+
1630 return lambda *args, **kwargs: FakeConn(response, with_exc)
1631+
1632 old_http_connect = object_server.http_connect
1633 try:
1634 timestamp = normalize_timestamp(time())
1635- req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'POST'},
1636- headers={'X-Timestamp': timestamp, 'Content-Type': 'text/plain',
1637- 'Content-Length': '0'})
1638+ req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD':
1639+ 'POST'}, headers={'X-Timestamp': timestamp, 'Content-Type':
1640+ 'text/plain', 'Content-Length': '0'})
1641 resp = self.object_controller.PUT(req)
1642- req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'POST'},
1643+ req = Request.blank('/sda1/p/a/c/o',
1644+ environ={'REQUEST_METHOD': 'POST'},
1645 headers={'X-Timestamp': timestamp,
1646 'X-Container-Host': '1.2.3.4:0',
1647 'X-Container-Partition': '3',
1648@@ -148,7 +158,8 @@
1649 object_server.http_connect = mock_http_connect(202)
1650 resp = self.object_controller.POST(req)
1651 self.assertEquals(resp.status_int, 202)
1652- req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'POST'},
1653+ req = Request.blank('/sda1/p/a/c/o',
1654+ environ={'REQUEST_METHOD': 'POST'},
1655 headers={'X-Timestamp': timestamp,
1656 'X-Container-Host': '1.2.3.4:0',
1657 'X-Container-Partition': '3',
1658@@ -158,7 +169,8 @@
1659 object_server.http_connect = mock_http_connect(202, with_exc=True)
1660 resp = self.object_controller.POST(req)
1661 self.assertEquals(resp.status_int, 202)
1662- req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'POST'},
1663+ req = Request.blank('/sda1/p/a/c/o',
1664+ environ={'REQUEST_METHOD': 'POST'},
1665 headers={'X-Timestamp': timestamp,
1666 'X-Container-Host': '1.2.3.4:0',
1667 'X-Container-Partition': '3',
1668@@ -226,7 +238,8 @@
1669 timestamp + '.data')
1670 self.assert_(os.path.isfile(objfile))
1671 self.assertEquals(open(objfile).read(), 'VERIFY')
1672- self.assertEquals(pickle.loads(getxattr(objfile, object_server.METADATA_KEY)),
1673+ self.assertEquals(pickle.loads(getxattr(objfile,
1674+ object_server.METADATA_KEY)),
1675 {'X-Timestamp': timestamp,
1676 'Content-Length': '6',
1677 'ETag': '0b4c12d7e0a73840c1c4f148fda3b037',
1678@@ -258,7 +271,8 @@
1679 timestamp + '.data')
1680 self.assert_(os.path.isfile(objfile))
1681 self.assertEquals(open(objfile).read(), 'VERIFY TWO')
1682- self.assertEquals(pickle.loads(getxattr(objfile, object_server.METADATA_KEY)),
1683+ self.assertEquals(pickle.loads(getxattr(objfile,
1684+ object_server.METADATA_KEY)),
1685 {'X-Timestamp': timestamp,
1686 'Content-Length': '10',
1687 'ETag': 'b381a4c5dab1eaa1eb9711fa647cd039',
1688@@ -270,17 +284,17 @@
1689 if not self.path_to_test_xfs:
1690 raise SkipTest
1691 req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'PUT'},
1692- headers={'X-Timestamp': normalize_timestamp(time()),
1693- 'Content-Type': 'text/plain'})
1694+ headers={'X-Timestamp': normalize_timestamp(time()),
1695+ 'Content-Type': 'text/plain'})
1696 req.body = 'test'
1697 resp = self.object_controller.PUT(req)
1698 self.assertEquals(resp.status_int, 201)
1699
1700 def test_PUT_invalid_etag(self):
1701 req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'PUT'},
1702- headers={'X-Timestamp': normalize_timestamp(time()),
1703- 'Content-Type': 'text/plain',
1704- 'ETag': 'invalid'})
1705+ headers={'X-Timestamp': normalize_timestamp(time()),
1706+ 'Content-Type': 'text/plain',
1707+ 'ETag': 'invalid'})
1708 req.body = 'test'
1709 resp = self.object_controller.PUT(req)
1710 self.assertEquals(resp.status_int, 422)
1711@@ -304,7 +318,8 @@
1712 timestamp + '.data')
1713 self.assert_(os.path.isfile(objfile))
1714 self.assertEquals(open(objfile).read(), 'VERIFY THREE')
1715- self.assertEquals(pickle.loads(getxattr(objfile, object_server.METADATA_KEY)),
1716+ self.assertEquals(pickle.loads(getxattr(objfile,
1717+ object_server.METADATA_KEY)),
1718 {'X-Timestamp': timestamp,
1719 'Content-Length': '12',
1720 'ETag': 'b114ab7b90d9ccac4bd5d99cc7ebb568',
1721@@ -316,25 +331,33 @@
1722 def test_PUT_container_connection(self):
1723 if not self.path_to_test_xfs:
1724 raise SkipTest
1725+
1726 def mock_http_connect(response, with_exc=False):
1727+
1728 class FakeConn(object):
1729+
1730 def __init__(self, status, with_exc):
1731 self.status = status
1732 self.reason = 'Fake'
1733 self.host = '1.2.3.4'
1734 self.port = '1234'
1735 self.with_exc = with_exc
1736+
1737 def getresponse(self):
1738 if self.with_exc:
1739 raise Exception('test')
1740 return self
1741+
1742 def read(self, amt=None):
1743 return ''
1744+
1745 return lambda *args, **kwargs: FakeConn(response, with_exc)
1746+
1747 old_http_connect = object_server.http_connect
1748 try:
1749 timestamp = normalize_timestamp(time())
1750- req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'POST'},
1751+ req = Request.blank('/sda1/p/a/c/o',
1752+ environ={'REQUEST_METHOD': 'POST'},
1753 headers={'X-Timestamp': timestamp,
1754 'X-Container-Host': '1.2.3.4:0',
1755 'X-Container-Partition': '3',
1756@@ -555,7 +578,8 @@
1757 self.assertEquals(resp.status_int, 200)
1758 self.assertEquals(resp.etag, etag)
1759
1760- req = Request.blank('/sda1/p/a/c/o2', environ={'REQUEST_METHOD': 'GET'},
1761+ req = Request.blank('/sda1/p/a/c/o2',
1762+ environ={'REQUEST_METHOD': 'GET'},
1763 headers={'If-Match': '*'})
1764 resp = self.object_controller.GET(req)
1765 self.assertEquals(resp.status_int, 412)
1766@@ -715,7 +739,8 @@
1767 """ Test swift.object_server.ObjectController.DELETE """
1768 if not self.path_to_test_xfs:
1769 raise SkipTest
1770- req = Request.blank('/sda1/p/a/c', environ={'REQUEST_METHOD': 'DELETE'})
1771+ req = Request.blank('/sda1/p/a/c',
1772+ environ={'REQUEST_METHOD': 'DELETE'})
1773 resp = self.object_controller.DELETE(req)
1774 self.assertEquals(resp.status_int, 400)
1775
1776@@ -916,21 +941,26 @@
1777 def test_disk_file_mkstemp_creates_dir(self):
1778 tmpdir = os.path.join(self.testdir, 'sda1', 'tmp')
1779 os.rmdir(tmpdir)
1780- with object_server.DiskFile(self.testdir, 'sda1', '0', 'a', 'c', 'o').mkstemp():
1781+ with object_server.DiskFile(self.testdir, 'sda1', '0', 'a', 'c',
1782+ 'o').mkstemp():
1783 self.assert_(os.path.exists(tmpdir))
1784
1785 def test_max_upload_time(self):
1786 if not self.path_to_test_xfs:
1787 raise SkipTest
1788+
1789 class SlowBody():
1790+
1791 def __init__(self):
1792 self.sent = 0
1793+
1794 def read(self, size=-1):
1795 if self.sent < 4:
1796 sleep(0.1)
1797 self.sent += 1
1798 return ' '
1799 return ''
1800+
1801 req = Request.blank('/sda1/p/a/c/o',
1802 environ={'REQUEST_METHOD': 'PUT', 'wsgi.input': SlowBody()},
1803 headers={'X-Timestamp': normalize_timestamp(time()),
1804@@ -946,14 +976,18 @@
1805 self.assertEquals(resp.status_int, 408)
1806
1807 def test_short_body(self):
1808+
1809 class ShortBody():
1810+
1811 def __init__(self):
1812 self.sent = False
1813+
1814 def read(self, size=-1):
1815 if not self.sent:
1816 self.sent = True
1817 return ' '
1818 return ''
1819+
1820 req = Request.blank('/sda1/p/a/c/o',
1821 environ={'REQUEST_METHOD': 'PUT', 'wsgi.input': ShortBody()},
1822 headers={'X-Timestamp': normalize_timestamp(time()),
1823@@ -1001,11 +1035,37 @@
1824 resp = self.object_controller.GET(req)
1825 self.assertEquals(resp.status_int, 200)
1826 self.assertEquals(resp.headers['content-encoding'], 'gzip')
1827- req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'HEAD'})
1828+ req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD':
1829+ 'HEAD'})
1830 resp = self.object_controller.HEAD(req)
1831 self.assertEquals(resp.status_int, 200)
1832 self.assertEquals(resp.headers['content-encoding'], 'gzip')
1833
1834+ def test_manifest_header(self):
1835+ if not self.path_to_test_xfs:
1836+ raise SkipTest
1837+ timestamp = normalize_timestamp(time())
1838+ req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'PUT'},
1839+ headers={'X-Timestamp': timestamp,
1840+ 'Content-Type': 'text/plain',
1841+ 'Content-Length': '0',
1842+ 'X-Object-Manifest': 'c/o/'})
1843+ resp = self.object_controller.PUT(req)
1844+ self.assertEquals(resp.status_int, 201)
1845+ objfile = os.path.join(self.testdir, 'sda1',
1846+ storage_directory(object_server.DATADIR, 'p', hash_path('a', 'c',
1847+ 'o')), timestamp + '.data')
1848+ self.assert_(os.path.isfile(objfile))
1849+ self.assertEquals(pickle.loads(getxattr(objfile,
1850+ object_server.METADATA_KEY)), {'X-Timestamp': timestamp,
1851+ 'Content-Length': '0', 'Content-Type': 'text/plain', 'name':
1852+ '/a/c/o', 'X-Object-Manifest': 'c/o/', 'ETag':
1853+ 'd41d8cd98f00b204e9800998ecf8427e'})
1854+ req = Request.blank('/sda1/p/a/c/o', environ={'REQUEST_METHOD': 'GET'})
1855+ resp = self.object_controller.GET(req)
1856+ self.assertEquals(resp.status_int, 200)
1857+ self.assertEquals(resp.headers.get('x-object-manifest'), 'c/o/')
1858+
1859
1860 if __name__ == '__main__':
1861 unittest.main()
1862
1863=== modified file 'test/unit/proxy/test_server.py'
1864--- test/unit/proxy/test_server.py 2010-12-06 20:02:43 +0000
1865+++ test/unit/proxy/test_server.py 2010-12-16 18:53:43 +0000
1866@@ -35,8 +35,8 @@
1867 from eventlet import sleep, spawn, TimeoutError, util, wsgi, listen
1868 from eventlet.timeout import Timeout
1869 import simplejson
1870-from webob import Request
1871-from webob.exc import HTTPUnauthorized
1872+from webob import Request, Response
1873+from webob.exc import HTTPNotFound, HTTPUnauthorized
1874
1875 from test.unit import connect_tcp, readuntil2crlfs
1876 from swift.proxy import server as proxy_server
1877@@ -53,7 +53,9 @@
1878
1879
1880 def fake_http_connect(*code_iter, **kwargs):
1881+
1882 class FakeConn(object):
1883+
1884 def __init__(self, status, etag=None, body=''):
1885 self.status = status
1886 self.reason = 'Fake'
1887@@ -160,6 +162,7 @@
1888
1889
1890 class FakeMemcache(object):
1891+
1892 def __init__(self):
1893 self.store = {}
1894
1895@@ -372,9 +375,12 @@
1896 class TestProxyServer(unittest.TestCase):
1897
1898 def test_unhandled_exception(self):
1899+
1900 class MyApp(proxy_server.Application):
1901+
1902 def get_controller(self, path):
1903 raise Exception('this shouldnt be caught')
1904+
1905 app = MyApp(None, FakeMemcache(), account_ring=FakeRing(),
1906 container_ring=FakeRing(), object_ring=FakeRing())
1907 req = Request.blank('/account', environ={'REQUEST_METHOD': 'HEAD'})
1908@@ -497,8 +503,11 @@
1909 test_status_map((200, 200, 204, 500, 404), 503)
1910
1911 def test_PUT_connect_exceptions(self):
1912+
1913 def mock_http_connect(*code_iter, **kwargs):
1914+
1915 class FakeConn(object):
1916+
1917 def __init__(self, status):
1918 self.status = status
1919 self.reason = 'Fake'
1920@@ -518,6 +527,7 @@
1921 if self.status == -3:
1922 return FakeConn(507)
1923 return FakeConn(100)
1924+
1925 code_iter = iter(code_iter)
1926
1927 def connect(*args, **ckwargs):
1928@@ -525,7 +535,9 @@
1929 if status == -1:
1930 raise HTTPException()
1931 return FakeConn(status)
1932+
1933 return connect
1934+
1935 with save_globals():
1936 controller = proxy_server.ObjectController(self.app, 'account',
1937 'container', 'object')
1938@@ -546,8 +558,11 @@
1939 test_status_map((200, 200, 503, 503, -1), 503)
1940
1941 def test_PUT_send_exceptions(self):
1942+
1943 def mock_http_connect(*code_iter, **kwargs):
1944+
1945 class FakeConn(object):
1946+
1947 def __init__(self, status):
1948 self.status = status
1949 self.reason = 'Fake'
1950@@ -611,8 +626,11 @@
1951 self.assertEquals(res.status_int, 413)
1952
1953 def test_PUT_getresponse_exceptions(self):
1954+
1955 def mock_http_connect(*code_iter, **kwargs):
1956+
1957 class FakeConn(object):
1958+
1959 def __init__(self, status):
1960 self.status = status
1961 self.reason = 'Fake'
1962@@ -807,6 +825,7 @@
1963 dev['port'] = 1
1964
1965 class SlowBody():
1966+
1967 def __init__(self):
1968 self.sent = 0
1969
1970@@ -816,6 +835,7 @@
1971 self.sent += 1
1972 return ' '
1973 return ''
1974+
1975 req = Request.blank('/a/c/o',
1976 environ={'REQUEST_METHOD': 'PUT', 'wsgi.input': SlowBody()},
1977 headers={'Content-Length': '4', 'Content-Type': 'text/plain'})
1978@@ -854,11 +874,13 @@
1979 dev['port'] = 1
1980
1981 class SlowBody():
1982+
1983 def __init__(self):
1984 self.sent = 0
1985
1986 def read(self, size=-1):
1987 raise Exception('Disconnected')
1988+
1989 req = Request.blank('/a/c/o',
1990 environ={'REQUEST_METHOD': 'PUT', 'wsgi.input': SlowBody()},
1991 headers={'Content-Length': '4', 'Content-Type': 'text/plain'})
1992@@ -1508,7 +1530,9 @@
1993
1994 def test_chunked_put(self):
1995 # quick test of chunked put w/o PATH_TO_TEST_XFS
1996+
1997 class ChunkedFile():
1998+
1999 def __init__(self, bytes):
2000 self.bytes = bytes
2001 self.read_bytes = 0
2002@@ -1576,6 +1600,7 @@
2003 mkdirs(os.path.join(testdir, 'sdb1'))
2004 mkdirs(os.path.join(testdir, 'sdb1', 'tmp'))
2005 try:
2006+ orig_container_listing_limit = proxy_server.CONTAINER_LISTING_LIMIT
2007 conf = {'devices': testdir, 'swift_dir': testdir,
2008 'mount_check': 'false'}
2009 prolis = listen(('localhost', 0))
2010@@ -1669,8 +1694,10 @@
2011 self.assertEquals(headers[:len(exp)], exp)
2012 # Check unhandled exception
2013 orig_update_request = prosrv.update_request
2014+
2015 def broken_update_request(env, req):
2016 raise Exception('fake')
2017+
2018 prosrv.update_request = broken_update_request
2019 sock = connect_tcp(('localhost', prolis.getsockname()[1]))
2020 fd = sock.makefile()
2021@@ -1719,8 +1746,10 @@
2022 # in a test for logging x-forwarded-for (first entry only).
2023
2024 class Logger(object):
2025+
2026 def info(self, msg):
2027 self.msg = msg
2028+
2029 orig_logger = prosrv.logger
2030 prosrv.logger = Logger()
2031 sock = connect_tcp(('localhost', prolis.getsockname()[1]))
2032@@ -1742,8 +1771,10 @@
2033 # Turn on header logging.
2034
2035 class Logger(object):
2036+
2037 def info(self, msg):
2038 self.msg = msg
2039+
2040 orig_logger = prosrv.logger
2041 prosrv.logger = Logger()
2042 prosrv.log_headers = True
2043@@ -1900,6 +1931,70 @@
2044 self.assertEquals(headers[:len(exp)], exp)
2045 body = fd.read()
2046 self.assertEquals(body, 'oh hai123456789abcdef')
2047+ # Create a container for our segmented/manifest object testing
2048+ sock = connect_tcp(('localhost', prolis.getsockname()[1]))
2049+ fd = sock.makefile()
2050+ fd.write('PUT /v1/a/segmented HTTP/1.1\r\nHost: localhost\r\n'
2051+ 'Connection: close\r\nX-Storage-Token: t\r\n'
2052+ 'Content-Length: 0\r\n\r\n')
2053+ fd.flush()
2054+ headers = readuntil2crlfs(fd)
2055+ exp = 'HTTP/1.1 201'
2056+ self.assertEquals(headers[:len(exp)], exp)
2057+ # Create the object segments
2058+ for segment in xrange(5):
2059+ sock = connect_tcp(('localhost', prolis.getsockname()[1]))
2060+ fd = sock.makefile()
2061+ fd.write('PUT /v1/a/segmented/name/%s HTTP/1.1\r\nHost: '
2062+ 'localhost\r\nConnection: close\r\nX-Storage-Token: '
2063+ 't\r\nContent-Length: 5\r\n\r\n1234 ' % str(segment))
2064+ fd.flush()
2065+ headers = readuntil2crlfs(fd)
2066+ exp = 'HTTP/1.1 201'
2067+ self.assertEquals(headers[:len(exp)], exp)
2068+ # Create the object manifest file
2069+ sock = connect_tcp(('localhost', prolis.getsockname()[1]))
2070+ fd = sock.makefile()
2071+ fd.write('PUT /v1/a/segmented/name HTTP/1.1\r\nHost: '
2072+ 'localhost\r\nConnection: close\r\nX-Storage-Token: '
2073+ 't\r\nContent-Length: 0\r\nX-Object-Manifest: '
2074+ 'segmented/name/\r\nContent-Type: text/jibberish\r\n\r\n')
2075+ fd.flush()
2076+ headers = readuntil2crlfs(fd)
2077+ exp = 'HTTP/1.1 201'
2078+ self.assertEquals(headers[:len(exp)], exp)
2079+ # Ensure retrieving the manifest file gets the whole object
2080+ sock = connect_tcp(('localhost', prolis.getsockname()[1]))
2081+ fd = sock.makefile()
2082+ fd.write('GET /v1/a/segmented/name HTTP/1.1\r\nHost: '
2083+ 'localhost\r\nConnection: close\r\nX-Auth-Token: '
2084+ 't\r\n\r\n')
2085+ fd.flush()
2086+ headers = readuntil2crlfs(fd)
2087+ exp = 'HTTP/1.1 200'
2088+ self.assertEquals(headers[:len(exp)], exp)
2089+ self.assert_('X-Object-Manifest: segmented/name/' in headers)
2090+ self.assert_('Content-Type: text/jibberish' in headers)
2091+ body = fd.read()
2092+ self.assertEquals(body, '1234 1234 1234 1234 1234 ')
2093+ # Do it again but exceeding the container listing limit
2094+ proxy_server.CONTAINER_LISTING_LIMIT = 2
2095+ sock = connect_tcp(('localhost', prolis.getsockname()[1]))
2096+ fd = sock.makefile()
2097+ fd.write('GET /v1/a/segmented/name HTTP/1.1\r\nHost: '
2098+ 'localhost\r\nConnection: close\r\nX-Auth-Token: '
2099+ 't\r\n\r\n')
2100+ fd.flush()
2101+ headers = readuntil2crlfs(fd)
2102+ exp = 'HTTP/1.1 200'
2103+ self.assertEquals(headers[:len(exp)], exp)
2104+ self.assert_('X-Object-Manifest: segmented/name/' in headers)
2105+ self.assert_('Content-Type: text/jibberish' in headers)
2106+ body = fd.read()
2107+ # A bit fragile of a test; as it makes the assumption that all
2108+ # will be sent in a single chunk.
2109+ self.assertEquals(body,
2110+ '19\r\n1234 1234 1234 1234 1234 \r\n0\r\n\r\n')
2111 finally:
2112 prospa.kill()
2113 acc1spa.kill()
2114@@ -1909,6 +2004,7 @@
2115 obj1spa.kill()
2116 obj2spa.kill()
2117 finally:
2118+ proxy_server.CONTAINER_LISTING_LIMIT = orig_container_listing_limit
2119 rmtree(testdir)
2120
2121 def test_mismatched_etags(self):
2122@@ -2111,6 +2207,7 @@
2123 res = controller.COPY(req)
2124 self.assert_(called[0])
2125
2126+
2127 class TestContainerController(unittest.TestCase):
2128 "Test swift.proxy_server.ContainerController"
2129
2130@@ -2254,7 +2351,9 @@
2131 self.assertEquals(resp.status_int, 404)
2132
2133 def test_put_locking(self):
2134+
2135 class MockMemcache(FakeMemcache):
2136+
2137 def __init__(self, allow_lock=None):
2138 self.allow_lock = allow_lock
2139 super(MockMemcache, self).__init__()
2140@@ -2265,6 +2364,7 @@
2141 yield True
2142 else:
2143 raise MemcacheLockError()
2144+
2145 with save_globals():
2146 controller = proxy_server.ContainerController(self.app, 'account',
2147 'container')
2148@@ -2870,5 +2970,261 @@
2149 test_status_map((204, 500, 404), 503)
2150
2151
2152+class FakeObjectController(object):
2153+
2154+ def __init__(self):
2155+ self.app = self
2156+ self.logger = self
2157+ self.account_name = 'a'
2158+ self.container_name = 'c'
2159+ self.object_name = 'o'
2160+ self.trans_id = 'tx1'
2161+ self.object_ring = FakeRing()
2162+ self.node_timeout = 1
2163+
2164+ def exception(self, *args):
2165+ self.exception_args = args
2166+ self.exception_info = sys.exc_info()
2167+
2168+ def GETorHEAD_base(self, *args):
2169+ self.GETorHEAD_base_args = args
2170+ req = args[0]
2171+ path = args[4]
2172+ body = data = path[-1] * int(path[-1])
2173+ if req.range and req.range.ranges:
2174+ body = ''
2175+ for start, stop in req.range.ranges:
2176+ body += data[start:stop]
2177+ resp = Response(app_iter=iter(body))
2178+ return resp
2179+
2180+ def iter_nodes(self, partition, nodes, ring):
2181+ for node in nodes:
2182+ yield node
2183+ for node in ring.get_more_nodes(partition):
2184+ yield node
2185+
2186+
2187+class Stub(object):
2188+ pass
2189+
2190+
2191+class TestSegmentedIterable(unittest.TestCase):
2192+
2193+ def setUp(self):
2194+ self.controller = FakeObjectController()
2195+
2196+ def test_load_next_segment_unexpected_error(self):
2197+ # Iterator value isn't a dict
2198+ self.assertRaises(Exception,
2199+ proxy_server.SegmentedIterable(self.controller, None,
2200+ [None])._load_next_segment)
2201+ self.assertEquals(self.controller.exception_args[0],
2202+ 'ERROR: While processing manifest /a/c/o tx1')
2203+
2204+ def test_load_next_segment_with_no_segments(self):
2205+ self.assertRaises(StopIteration,
2206+ proxy_server.SegmentedIterable(self.controller, 'lc',
2207+ [])._load_next_segment)
2208+
2209+ def test_load_next_segment_with_one_segment(self):
2210+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2211+ 'o1'}])
2212+ segit._load_next_segment()
2213+ self.assertEquals(self.controller.GETorHEAD_base_args[4], '/a/lc/o1')
2214+ data = ''.join(segit.segment_iter)
2215+ self.assertEquals(data, '1')
2216+
2217+ def test_load_next_segment_with_two_segments(self):
2218+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2219+ 'o1'}, {'name': 'o2'}])
2220+ segit._load_next_segment()
2221+ self.assertEquals(self.controller.GETorHEAD_base_args[4], '/a/lc/o1')
2222+ data = ''.join(segit.segment_iter)
2223+ self.assertEquals(data, '1')
2224+ segit._load_next_segment()
2225+ self.assertEquals(self.controller.GETorHEAD_base_args[4], '/a/lc/o2')
2226+ data = ''.join(segit.segment_iter)
2227+ self.assertEquals(data, '22')
2228+
2229+ def test_load_next_segment_with_two_segments_skip_first(self):
2230+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2231+ 'o1'}, {'name': 'o2'}])
2232+ segit.segment = 0
2233+ segit.listing.next()
2234+ segit._load_next_segment()
2235+ self.assertEquals(self.controller.GETorHEAD_base_args[4], '/a/lc/o2')
2236+ data = ''.join(segit.segment_iter)
2237+ self.assertEquals(data, '22')
2238+
2239+ def test_load_next_segment_with_seek(self):
2240+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2241+ 'o1'}, {'name': 'o2'}])
2242+ segit.segment = 0
2243+ segit.listing.next()
2244+ segit.seek = 1
2245+ segit._load_next_segment()
2246+ self.assertEquals(self.controller.GETorHEAD_base_args[4], '/a/lc/o2')
2247+ self.assertEquals(str(self.controller.GETorHEAD_base_args[0].range),
2248+ 'bytes=1-')
2249+ data = ''.join(segit.segment_iter)
2250+ self.assertEquals(data, '2')
2251+
2252+ def test_load_next_segment_with_get_error(self):
2253+
2254+ def local_GETorHEAD_base(*args):
2255+ return HTTPNotFound()
2256+
2257+ self.controller.GETorHEAD_base = local_GETorHEAD_base
2258+ self.assertRaises(Exception,
2259+ proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2260+ 'o1'}])._load_next_segment)
2261+ self.assertEquals(self.controller.exception_args[0],
2262+ 'ERROR: While processing manifest /a/c/o tx1')
2263+ self.assertEquals(str(self.controller.exception_info[1]),
2264+ 'Could not load object segment /a/lc/o1: 404')
2265+
2266+ def test_iter_unexpected_error(self):
2267+ # Iterator value isn't a dict
2268+ self.assertRaises(Exception, ''.join,
2269+ proxy_server.SegmentedIterable(self.controller, None, [None]))
2270+ self.assertEquals(self.controller.exception_args[0],
2271+ 'ERROR: While processing manifest /a/c/o tx1')
2272+
2273+ def test_iter_with_no_segments(self):
2274+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', [])
2275+ self.assertEquals(''.join(segit), '')
2276+
2277+ def test_iter_with_one_segment(self):
2278+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2279+ 'o1'}])
2280+ segit.response = Stub()
2281+ self.assertEquals(''.join(segit), '1')
2282+ self.assertEquals(segit.response.bytes_transferred, 1)
2283+
2284+ def test_iter_with_two_segments(self):
2285+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2286+ 'o1'}, {'name': 'o2'}])
2287+ segit.response = Stub()
2288+ self.assertEquals(''.join(segit), '122')
2289+ self.assertEquals(segit.response.bytes_transferred, 3)
2290+
2291+ def test_iter_with_get_error(self):
2292+
2293+ def local_GETorHEAD_base(*args):
2294+ return HTTPNotFound()
2295+
2296+ self.controller.GETorHEAD_base = local_GETorHEAD_base
2297+ self.assertRaises(Exception, ''.join,
2298+ proxy_server.SegmentedIterable(self.controller, 'lc', [{'name':
2299+ 'o1'}]))
2300+ self.assertEquals(self.controller.exception_args[0],
2301+ 'ERROR: While processing manifest /a/c/o tx1')
2302+ self.assertEquals(str(self.controller.exception_info[1]),
2303+ 'Could not load object segment /a/lc/o1: 404')
2304+
2305+ def test_app_iter_range_unexpected_error(self):
2306+ # Iterator value isn't a dict
2307+ self.assertRaises(Exception,
2308+ proxy_server.SegmentedIterable(self.controller, None,
2309+ [None]).app_iter_range(None, None).next)
2310+ self.assertEquals(self.controller.exception_args[0],
2311+ 'ERROR: While processing manifest /a/c/o tx1')
2312+
2313+ def test_app_iter_range_with_no_segments(self):
2314+ self.assertEquals(''.join(proxy_server.SegmentedIterable(
2315+ self.controller, 'lc', []).app_iter_range(None, None)), '')
2316+ self.assertEquals(''.join(proxy_server.SegmentedIterable(
2317+ self.controller, 'lc', []).app_iter_range(3, None)), '')
2318+ self.assertEquals(''.join(proxy_server.SegmentedIterable(
2319+ self.controller, 'lc', []).app_iter_range(3, 5)), '')
2320+ self.assertEquals(''.join(proxy_server.SegmentedIterable(
2321+ self.controller, 'lc', []).app_iter_range(None, 5)), '')
2322+
2323+ def test_app_iter_range_with_one_segment(self):
2324+ listing = [{'name': 'o1', 'bytes': 1}]
2325+
2326+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2327+ segit.response = Stub()
2328+ self.assertEquals(''.join(segit.app_iter_range(None, None)), '1')
2329+ self.assertEquals(segit.response.bytes_transferred, 1)
2330+
2331+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2332+ self.assertEquals(''.join(segit.app_iter_range(3, None)), '')
2333+
2334+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2335+ self.assertEquals(''.join(segit.app_iter_range(3, 5)), '')
2336+
2337+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2338+ segit.response = Stub()
2339+ self.assertEquals(''.join(segit.app_iter_range(None, 5)), '1')
2340+ self.assertEquals(segit.response.bytes_transferred, 1)
2341+
2342+ def test_app_iter_range_with_two_segments(self):
2343+ listing = [{'name': 'o1', 'bytes': 1}, {'name': 'o2', 'bytes': 2}]
2344+
2345+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2346+ segit.response = Stub()
2347+ self.assertEquals(''.join(segit.app_iter_range(None, None)), '122')
2348+ self.assertEquals(segit.response.bytes_transferred, 3)
2349+
2350+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2351+ segit.response = Stub()
2352+ self.assertEquals(''.join(segit.app_iter_range(1, None)), '22')
2353+ self.assertEquals(segit.response.bytes_transferred, 2)
2354+
2355+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2356+ segit.response = Stub()
2357+ self.assertEquals(''.join(segit.app_iter_range(1, 5)), '22')
2358+ self.assertEquals(segit.response.bytes_transferred, 2)
2359+
2360+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2361+ segit.response = Stub()
2362+ self.assertEquals(''.join(segit.app_iter_range(None, 2)), '12')
2363+ self.assertEquals(segit.response.bytes_transferred, 2)
2364+
2365+ def test_app_iter_range_with_many_segments(self):
2366+ listing = [{'name': 'o1', 'bytes': 1}, {'name': 'o2', 'bytes': 2},
2367+ {'name': 'o3', 'bytes': 3}, {'name': 'o4', 'bytes': 4}, {'name':
2368+ 'o5', 'bytes': 5}]
2369+
2370+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2371+ segit.response = Stub()
2372+ self.assertEquals(''.join(segit.app_iter_range(None, None)),
2373+ '122333444455555')
2374+ self.assertEquals(segit.response.bytes_transferred, 15)
2375+
2376+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2377+ segit.response = Stub()
2378+ self.assertEquals(''.join(segit.app_iter_range(3, None)),
2379+ '333444455555')
2380+ self.assertEquals(segit.response.bytes_transferred, 12)
2381+
2382+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2383+ segit.response = Stub()
2384+ self.assertEquals(''.join(segit.app_iter_range(5, None)), '3444455555')
2385+ self.assertEquals(segit.response.bytes_transferred, 10)
2386+
2387+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2388+ segit.response = Stub()
2389+ self.assertEquals(''.join(segit.app_iter_range(None, 6)), '122333')
2390+ self.assertEquals(segit.response.bytes_transferred, 6)
2391+
2392+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2393+ segit.response = Stub()
2394+ self.assertEquals(''.join(segit.app_iter_range(None, 7)), '1223334')
2395+ self.assertEquals(segit.response.bytes_transferred, 7)
2396+
2397+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2398+ segit.response = Stub()
2399+ self.assertEquals(''.join(segit.app_iter_range(3, 7)), '3334')
2400+ self.assertEquals(segit.response.bytes_transferred, 4)
2401+
2402+ segit = proxy_server.SegmentedIterable(self.controller, 'lc', listing)
2403+ segit.response = Stub()
2404+ self.assertEquals(''.join(segit.app_iter_range(5, 7)), '34')
2405+ self.assertEquals(segit.response.bytes_transferred, 2)
2406+
2407+
2408 if __name__ == '__main__':
2409 unittest.main()