Merge lp:~gaurav-gangalwar/swift/copy_manifest into lp:~hudson-openstack/swift/trunk

Proposed by Gaurav Gangalwar
Status: Rejected
Rejected by: John Dickinson
Proposed branch: lp:~gaurav-gangalwar/swift/copy_manifest
Merge into: lp:~hudson-openstack/swift/trunk
Diff against target: 75 lines (+21/-6)
3 files modified
swift/common/utils.py (+7/-0)
swift/proxy/server.py (+13/-5)
test/unit/proxy/test_server.py (+1/-1)
To merge this branch: bzr merge lp:~gaurav-gangalwar/swift/copy_manifest
Reviewer Review Type Date Requested Status
gholt (community) Disapprove
Review via email: mp+70436@code.launchpad.net

Description of the change

We are consolidating the segments of manifest object in COPY operation.
This will not work if the object is large object (>5GB).
It will also affect manifest objects on POST operations in "post_as_copy".

I have made the changes to honour segmentation in COPY operation also.

To post a comment you must log in.
Revision history for this message
gholt (gholt) wrote :

Actually, that was sort of the point of the way the code currently works.

A COPY of a segmented object will make a new consolidated version of the object, if the total length is less than the single object cluster max (5G).

If you just want another segmented object copy, it's pretty quick and easy to PUT a new manifest file with X-Object-Manifest pointing to the same segments list.

review: Disapprove
Revision history for this message
Gaurav Gangalwar (gaurav-gangalwar) wrote :

Ok..but is there any specific reason for this kind of behaviour?
and what will be the behaviour in case of large objects?

On Thu, Aug 4, 2011 at 7:04 PM, gholt <email address hidden> wrote:

> Review: Disapprove
> Actually, that was sort of the point of the way the code currently works.
>
> A COPY of a segmented object will make a new consolidated version of the
> object, if the total length is less than the single object cluster max (5G).
>
> If you just want another segmented object copy, it's pretty quick and easy
> to PUT a new manifest file with X-Object-Manifest pointing to the same
> segments list.
> --
>
> https://code.launchpad.net/~gaurav-gangalwar/swift/copy_manifest/+merge/70436<https://code.launchpad.net/%7Egaurav-gangalwar/swift/copy_manifest/+merge/70436>
> You are the owner of lp:~gaurav-gangalwar/swift/copy_manifest.
>

Revision history for this message
gholt (gholt) wrote :

Yes, the specific reason is that there are a few things you may want to do with an existing segmented object:

1) Make a "shallow" copy where there's just a new manifest file pointing to the same segments.

2) Make a "deep" copy where the segments are all copied to new locations as well as a new manifest.

3) Make a "consolidated" copy where a bunch of segments are concatenated into one new object. The original object may have been segmented to make uploading easier (smaller chunks) or faster (parallel segment uploading) but the final object might be better to work with consolidated.

For 1) it is as simple as a 0-byte PUT with X-Object-Manifest set to the same segments path.

For 2) you have to copy each segment independently and then do the new 0-byte PUT with the X-Object-Manifest set to the new segments path.

For 3) you can do a COPY from the segmented object and the Destination will end up with a single segment object.

If you make COPY do essentially what 1) is, there is no way to do 3) server-side.

Revision history for this message
Gaurav Gangalwar (gaurav-gangalwar) wrote :

Ok, but in case(3) if object is copied to itself, we are keeping segments as
it is, even after consolidating the object data, this means user has to
explicitly delete those segments, right?
So user has to delete segments after POST and COPY operations.
But after POST and COPY manifest info is lost, so how user will be able to
delete those segments?

Thanks,
Gaurav

On Thu, Aug 4, 2011 at 7:41 PM, gholt <email address hidden> wrote:

> Yes, the specific reason is that there are a few things you may want to do
> with an existing segmented object:
>
> 1) Make a "shallow" copy where there's just a new manifest file pointing to
> the same segments.
>
> 2) Make a "deep" copy where the segments are all copied to new locations as
> well as a new manifest.
>
> 3) Make a "consolidated" copy where a bunch of segments are concatenated
> into one new object. The original object may have been segmented to make
> uploading easier (smaller chunks) or faster (parallel segment uploading) but
> the final object might be better to work with consolidated.
>
> For 1) it is as simple as a 0-byte PUT with X-Object-Manifest set to the
> same segments path.
>
> For 2) you have to copy each segment independently and then do the new
> 0-byte PUT with the X-Object-Manifest set to the new segments path.
>
> For 3) you can do a COPY from the segmented object and the Destination will
> end up with a single segment object.
>
> If you make COPY do essentially what 1) is, there is no way to do 3)
> server-side.
>
> --
>
> https://code.launchpad.net/~gaurav-gangalwar/swift/copy_manifest/+merge/70436<https://code.launchpad.net/%7Egaurav-gangalwar/swift/copy_manifest/+merge/70436>
> You are the owner of lp:~gaurav-gangalwar/swift/copy_manifest.
>

Revision history for this message
gholt (gholt) wrote :

Segments are really just regular objects, so they can be deleted however, they will show up in listings where they were uploaded whether or not a manifest points to them.

Revision history for this message
Gaurav Gangalwar (gaurav-gangalwar) wrote :

Ok, just thought of deleting these segments after consolidation, it will
make users life easy and also save that extra disk space.

Thanks,
Gaurav

On Thu, Aug 4, 2011 at 9:46 PM, gholt <email address hidden> wrote:

> Segments are really just regular objects, so they can be deleted however,
> they will show up in listings where they were uploaded whether or not a
> manifest points to them.
> --
>
> https://code.launchpad.net/~gaurav-gangalwar/swift/copy_manifest/+merge/70436<https://code.launchpad.net/%7Egaurav-gangalwar/swift/copy_manifest/+merge/70436>
> You are the owner of lp:~gaurav-gangalwar/swift/copy_manifest.
>

Unmerged revisions

338. By <email address hidden>

Don't consolidate segments for manifest object copy.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'swift/common/utils.py'
2--- swift/common/utils.py 2011-07-22 19:58:22 +0000
3+++ swift/common/utils.py 2011-08-04 13:24:39 +0000
4@@ -1011,3 +1011,10 @@
5 if index == -1:
6 return '%d' % value
7 return '%d%si' % (round(value), suffixes[index])
8+
9+
10+
11+def dummy_app_iter(data):
12+ for index in range(len(data)):
13+ yield data[index]
14+
15
16=== modified file 'swift/proxy/server.py'
17--- swift/proxy/server.py 2011-07-14 18:43:08 +0000
18+++ swift/proxy/server.py 2011-08-04 13:24:39 +0000
19@@ -42,7 +42,8 @@
20
21 from swift.common.ring import Ring
22 from swift.common.utils import cache_from_env, ContextPool, get_logger, \
23- get_remote_client, normalize_timestamp, split_path, TRUE_VALUES
24+ get_remote_client, normalize_timestamp, split_path, TRUE_VALUES, \
25+ dummy_app_iter
26 from swift.common.bufferedhttp import http_connect
27 from swift.common.constraints import check_metadata, check_object_creation, \
28 check_utf8, CONTAINER_LISTING_LIMIT, MAX_ACCOUNT_NAME_LENGTH, \
29@@ -874,7 +875,6 @@
30 resp.content_length = content_length
31 resp.last_modified = last_modified
32 resp.headers['accept-ranges'] = 'bytes'
33-
34 return resp
35
36 @public
37@@ -1044,14 +1044,22 @@
38 self.container_name = orig_container_name
39 new_req = Request.blank(req.path_info,
40 environ=req.environ, headers=req.headers)
41- data_source = source_resp.app_iter
42- new_req.content_length = source_resp.content_length
43- if new_req.content_length is None:
44+
45+ if source_resp.content_length is None:
46 # This indicates a transfer-encoding: chunked source object,
47 # which currently only happens because there are more than
48 # CONTAINER_LISTING_LIMIT segments in a segmented object. In
49 # this case, we're going to refuse to do the server-side copy.
50 return HTTPRequestEntityTooLarge(request=req)
51+
52+ if 'x-object-manifest' not in source_resp.headers:
53+ data_source = source_resp.app_iter
54+ new_req.content_length = source_resp.content_length
55+ else:
56+ new_req.headers['X-Object-Manifest'] = source_resp.headers['x-object-manifest']
57+ data_source = source_resp.app_iter = dummy_app_iter('')
58+ new_req.content_length = 0
59+
60 new_req.etag = source_resp.etag
61 # we no longer need the X-Copy-From header
62 del new_req.headers['X-Copy-From']
63
64=== modified file 'test/unit/proxy/test_server.py'
65--- test/unit/proxy/test_server.py 2011-07-22 17:54:54 +0000
66+++ test/unit/proxy/test_server.py 2011-08-04 13:24:39 +0000
67@@ -2512,7 +2512,7 @@
68 headers = readuntil2crlfs(fd)
69 exp = 'HTTP/1.1 200'
70 self.assertEquals(headers[:len(exp)], exp)
71- self.assert_('x-object-manifest:' not in headers.lower())
72+ self.assert_('x-object-manifest:' in headers.lower())
73 self.assert_('Content-Length: 25\r' in headers)
74 body = fd.read()
75 self.assertEquals(body, '1234 1234 1234 1234 1234 ')