Merge lp:~statik/txaws/here-have-some-s4 into lp:~txawsteam/txaws/trunk

Proposed by Elliot Murphy
Status: Work in progress
Proposed branch: lp:~statik/txaws/here-have-some-s4
Merge into: lp:~txawsteam/txaws/trunk
Diff against target: none
To merge this branch: bzr merge lp:~statik/txaws/here-have-some-s4
Reviewer Review Type Date Requested Status
Original txAWS Team Pending
Review via email: mp+10388@code.launchpad.net
Elliot Murphy (statik) wrote:

The Ubuntu One team would like to contribute the S4 (Simple Storage Service Simulator) code that we use as a stub to test against when developing software that relies on S3.
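
(For context - a hedged sketch of how the stub gets used, lifted from the
branch's own s4.py __main__ block; the port and bucket name are arbitrary:)

    # Run an in-memory S4 and point the S3 client under test at it.
    from s4 import s4
    from twisted.web import server
    from twisted.internet import reactor

    root = s4.Root()          # in-memory storage, default aws_key/aws_secret
    root._add_bucket("test")  # pre-create the bucket the client expects
    reactor.listenTCP(8808, server.Site(root))
    reactor.run()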

Robert Collins (lifeless) wrote:

Wow, code explosion. Uhm, this would perhaps be easier to review in
parts...

Firstly, the module should be txaws.storage.simulator, I think. Or
storage.server.simulator.

Secondly, copyright headers. txaws uses a much leaner one:

# Copyright (C) 2009 $PUT_YOUR_NAMES_HERE
# Licenced under the txaws licence available at /LICENSE in the txaws source.

Please use that - it prevents skew between different modules.

See txaws/ec2/client.py, for instance.

Thirdly, I'm not sure what Python versions we're aiming for. I note that
the simulator code depends on 'with', which carries an implicit version
requirement. Please document the version that the simulator supports,
both in /README and perhaps in the docstring for the simulator.
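
(Concretely - 'with' is native only from Python 2.6; on 2.5 the module needs
the __future__ import as its first statement, and on 2.4 and earlier that
line is itself a SyntaxError, so it effectively pins the minimum version:)

    # First statement of the simulator module (sketch):
    # no-op on Python 2.6+, enables 'with' on 2.5, SyntaxError on <= 2.4.
    from __future__ import with_statement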

On Wed, 2009-08-19 at 14:45 +0000, Elliot Murphy wrote:
> === modified file 'README'
> --- README 2009-04-26 08:32:36 +0000
> +++ README 2009-08-19 14:36:56 +0000
> @@ -7,6 +7,10 @@
>
> * The epsilon python package (python-epsilon on Debian or similar
> systems)
>
> +* The S4 test server has a dependency on boto (python-boto on Debian or
> +  similar systems). This dependency should go away in favor of using
> +  txaws infrastructure (s4 was originally developed separately from
> +  txaws).

This is a problem :). I'd much rather see code land without having a
boto dependency at any point: boto is rather ugly, and the code will
likely be a lot nicer right from the get-go if we don't have it.

> === added file 'txaws/s4/README'

see above - the location doesn't fit with the txaws source layout. This
is a storage server module (contrast with txaws.storage.client).

Having a README in a Python module is odd. The content would be better
put in the __init__.py's docstring, so that pydoctor etc. can show it.
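
(A hedged sketch of that move, reusing the text already in the branch's
README and __init__.py:)

    # txaws/s4/__init__.py
    """S4 - an S3 storage system stub.

    Based on twisted; storage is in memory; only REST PUT and GET are
    implemented - just enough to test client code against.
    """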

> --- txaws/s4/README 1970-01-01 00:00:00 +0000
> +++ txaws/s4/README 2009-08-19 14:36:56 +0000
> @@ -0,0 +1,30 @@
> +S4 - an S3 storage system stub
> +=============================
> +
> +The server comes with some sample scripts so you can see how to use it.
> +
> +Using twistd
> +------------
> +
> +To start: ./start-s4.sh
> +To stop: ./stop-s4.sh
> +
> +The sample S4.tac defaults to port 8080. If you want to change that you
> +can create your own S4.tac.

Given that this is a twisted process, it would be nice for the docs to say
'to start: twistd -FOO -BAR -BAZ' rather than referring to a shell script,
which by its nature won't work on Windows.
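
(For instance - assuming the S4.tac shipped in this branch - the standard
twistd options would give a portable one-liner:

    twistd --nodaemon --no_save --python txaws/s4/S4.tac

where --nodaemon stays in the foreground, --no_save skips saving state on
shutdown, and --python runs a .tac/.py config file.)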

>
> === added directory 'txaws/s4/contrib'
> === added file 'txaws/s4/contrib/S3.py'
> --- txaws/s4/contrib/S3.py 1970-01-01 00:00:00 +0000
> +++ txaws/s4/contrib/S3.py 2009-08-19 14:36:56 +0000

....

What is this file for? How is it used? It looks like a lot of
duplication with existing code in txaws.
The different (C) terms mean it will need to be mentioned in some way
at the top-level portion.
I suspect we need to add an AUTHORS file too.

> === added file 'txaws/s4/s4.py'
> ...+if __name__ == "__main__":
> + root = Root()
> + site = server.Site(root)
> + reactor.listenTCP(8808, site)
> + reactor.run()

I've skipped most of this file pending the boto dependency being
removed. But I thought I'd mention that this fragment above is highl...


Elliot Murphy (statik) wrote:

Thanks a lot for the quick review. The code is very much in the state
it was being used internally, and I think your comments all make sense
and will improve the code. I differ on the license header thing - I
explicitly chose not to copy the existing indirect way of specifying
the license. You'd need to go back to the copyright holder to change
the license anyway, so specifying the license that way is not a good
idea IMO.

Just to set expectations, I don't expect to have time to remove the
boto dependency or work on the more involved changes requested before
Karmic ships. Duncan asked about the code and I got it published in
the spirit of jfdi; now that it is free to the world in this branch
I'm back to more pressing Karmic-related hacking. I think it's
entirely reasonable for you to want the boto dependency dropped before
it's merged, I just want to be up front and explain that this branch
will probably sit for a couple of months before Lucio or Christian or
I will be able to give it that level of attention. I'd actually like
to kill the whole contrib directory too.
--
Elliot Murphy

On Aug 19, 2009, at 9:45 PM, Robert Collins
<email address hidden> wrote:

> Wow, code explosion. Uhm, this would perhaps be easier to review in
> parts...
>
> Firstly, the module should be txaws.storage.simulator, I think. Or
> storage.server.simulator.
>
> Secondly, copyright headers. txaws uses a much leaner one:
>
> # Copyright (C) 2009 $PUT_YOUR_NAMES_HERE
> # Licenced under the txaws licence available at /LICENSE in the txaws
> source.
>
> Please use that - it prevents skew between different modules.
>
> See txaws/ec2/client.py, for instance.
>
>
> Thirdly, I'm not sure what Python versions we're aiming for. I note
> that the simulator code depends on 'with', which carries an implicit
> version requirement. Please document the version that the simulator
> supports, both in /README and perhaps in the docstring for the
> simulator.
>
>
>
> On Wed, 2009-08-19 at 14:45 +0000, Elliot Murphy wrote:
>> === modified file 'README'
>> --- README 2009-04-26 08:32:36 +0000
>> +++ README 2009-08-19 14:36:56 +0000
>> @@ -7,6 +7,10 @@
>>
>> * The epsilon python package (python-epsilon on Debian or similar
>> systems)
>>
>> +* The S4 test server has a dependency on boto (python-boto on Debian
>> +  or similar systems). This dependency should go away in favor of
>> +  using txaws infrastructure (s4 was originally developed separately
>> +  from txaws).
>
> This is a problem :). I'd much rather see code land without having a
> boto dependency at any point: boto is rather ugly, and the code will
> likely be a lot nicer right from the get-go if we don't have it.
>
>
>> === added file 'txaws/s4/README'
>
> see above - the location doesn't fit with the txaws source layout.
> This is a storage server module (contrast with txaws.storage.client).
>
> Having a README in a Python module is odd. The content would be better
> put in the __init__.py's docstring, so that pydoctor etc. can show it.
>
>
>> --- txaws/s4/README 1970-01-01 00:00:00 +0000
>> +++ txaws/s4/README 2009-08-19 14:36:56 +0000
>> @@ -0,0 +1,30 @@
>> +S4 - an S3...


Robert Collins (lifeless) wrote:

On Thu, 2009-08-20 at 03:36 +0000, Elliot Murphy wrote:
> Thanks a lot for the quick review. The code is very much in the state
> it was being used internally, and I think your comments all make sense
> and will improve the code. I differ on the license header thing - I
> explicitly chose not to copy the existing indirect way of specifying
> the license. You'd need to go back to the copyright holder to change
> the license anyway, so specifying the license that way is not a good
> idea IMO.

Licensing is important - after all, we build free/open source software on
the hack of using copyright to ensure the right to copy :). I think you'll
find all the existing code in txaws uses the pithy approach; the work as
a whole is what's licensed: the centralisation isn't to make /changing/
the license easy - it's to make auditing, checking and reading code
easier. Debian, for instance, wants to be sure that all relevant licences
are listed; having the licence duplicated in many source files makes
that harder. There is also a DRY aspect to it.

If you believe there is a risk that the code won't be properly protected
(from what? - the project licence is MIT, essentially public domain),
then we should certainly investigate that further. Otherwise, I don't
see what is gained by having the same text duplicated in each file, and
I really think the shorter reference is much more pleasant. (When I first
joined the project, I'm not even sure there _was_ a license :).)

> Just to set expectations, ...o be up front and explain that this branch
> will probably sit for a couple of months before lucio or I or
> Christian will be able to give it that level of attention. I'd
> actually like to kill the whole contrib directory too.
> --

That's fine with me too - now it's out there, it's possible for someone to
stand up and clean it up too.

-Rob

Jamu Kakar (jkakar) wrote:

There's been no movement on this for quite some time, so I'm going to
mark it as 'Work in Progress'.

Unmerged revisions

9. By Elliot Murphy

Dropping start-s4 and stop-s4, those aren't really a good fit for
including directly into txaws.

8. By Elliot Murphy

Contributing the initial version of S4 (previously unpublished, used
internally by Ubuntu One for testing since 2008).

Preview Diff

=== modified file 'README'
--- README 2009-04-26 08:32:36 +0000
+++ README 2009-08-19 14:36:56 +0000
@@ -7,6 +7,10 @@
 
 * The epsilon python package (python-epsilon on Debian or similar systems)
 
+* The S4 test server has a dependency on boto (python-boto on Debian or
+  similar systems). This dependency should go away in favor of using txaws
+  infrastructure (s4 was originally developed separately from txaws).
+
 Things present here
 -------------------
 
 
=== added directory 'txaws/s4'
=== added file 'txaws/s4/README'
--- txaws/s4/README 1970-01-01 00:00:00 +0000
+++ txaws/s4/README 2009-08-19 14:36:56 +0000
@@ -0,0 +1,30 @@
S4 - an S3 storage system stub
==============================

The server comes with some sample scripts so you can see how to use it.

Using twistd
------------

To start: ./start-s4.sh
To stop: ./stop-s4.sh

The sample S4.tac defaults to port 8080. If you want to change that you can
create your own S4.tac.

For tests or inside another script
----------------------------------

See s4.tests.test_S4.S4TestBase.

All tests run on a random unused port.


Notes:
======
Based on twisted.
Storage is in memory.
It's not optimal by any means; it's just for testing other code.
For now, it just implements REST PUT and GET.
It comes with a default /test/ bucket already created, and a /size/ bucket
with virtual objects the size of their name (i.e., /size/100 == "0"*100).
=== added file 'txaws/s4/S4.tac'
--- txaws/s4/S4.tac 1970-01-01 00:00:00 +0000
+++ txaws/s4/S4.tac 2009-08-19 14:36:56 +0000
@@ -0,0 +1,74 @@
# -*- python -*-
# Copyright 2008-2009 Canonical Ltd.
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

from __future__ import with_statement

import os
import sys  # needed for the StreamHandler fallback below
import logging
from optparse import OptionParser

import twisted.web.server
from twisted.internet import reactor
from twisted.application import internet, service

from utils import get_arbitrary_port
from ubuntuone.config import config

logger = logging.getLogger("UbuntuOne.S4")
logger.setLevel(config.general.log_level)
log_folder = config.general.log_folder
log_filename = config.s_four.log_filename
if log_folder is not None and log_filename is not None:
    if not os.access(log_folder, os.F_OK):
        os.mkdir(log_folder)
    s = logging.FileHandler(os.path.join(log_folder, log_filename))
else:
    s = logging.StreamHandler(sys.stderr)
s.setFormatter(logging.Formatter(config.general.log_format))
logger.addHandler(s)

from s4 import s4

if config.s_four.storagepath:
    storedir = os.path.join(config.root, config.s_four.storagepath)
else:
    storedir = os.path.join(config.root, "tmp", "s4storage")
if not os.path.exists(storedir):
    logger.debug("creating S4 storage directory %s" % storedir)
    os.mkdir(storedir)
application = service.Application('s4')
root = s4.Root(storagedir=storedir)
# make sure "the bucket" is created
root._add_bucket(config.api_server.s3_bucket)
site = twisted.web.server.Site(root)

port = os.getenv('S4PORT', config.aws_s3.port)
if port:
    port = int(port)
# we test again in case the initial value was "0" as a string
if not port:
    port = get_arbitrary_port()

with open(os.path.join(config.root, "tmp", "s4.port"), "w") as s4pf:
    s4pf.write("%d\n" % port)

internet.TCPServer(port, site).setServiceParent(
    service.IServiceCollection(application))
=== added file 'txaws/s4/__init__.py'
--- txaws/s4/__init__.py 1970-01-01 00:00:00 +0000
+++ txaws/s4/__init__.py 2009-08-19 14:36:56 +0000
@@ -0,0 +1,1 @@
1""" S4 - a S3 storage system stub """
02
=== added directory 'txaws/s4/contrib'
=== added file 'txaws/s4/contrib/S3.py'
--- txaws/s4/contrib/S3.py 1970-01-01 00:00:00 +0000
+++ txaws/s4/contrib/S3.py 2009-08-19 14:36:56 +0000
@@ -0,0 +1,627 @@
#!/usr/bin/env python

# This software code is made available "AS IS" without warranties of any
# kind. You may copy, display, modify and redistribute the software
# code either by itself or as incorporated into your code; provided that
# you do not remove any proprietary notices. Your use of this software
# code is at your own risk and you waive any claim against Amazon
# Digital Services, Inc. or its affiliates with respect to your use of
# this software code. (c) 2006-2007 Amazon Digital Services, Inc. or its
# affiliates.

import base64
import hmac
import httplib
import re
import sha
import sys
import time
import urllib
import urlparse
import xml.sax

DEFAULT_HOST = 's3.amazonaws.com'
PORTS_BY_SECURITY = { True: 443, False: 80 }
METADATA_PREFIX = 'x-amz-meta-'
AMAZON_HEADER_PREFIX = 'x-amz-'

# generates the aws canonical string for the given parameters
def canonical_string(method, bucket="", key="", query_args={}, headers={}, expires=None):
    interesting_headers = {}
    for header_key in headers:
        lk = header_key.lower()
        if lk in ['content-md5', 'content-type', 'date'] or lk.startswith(AMAZON_HEADER_PREFIX):
            interesting_headers[lk] = headers[header_key].strip()

    # these keys get empty strings if they don't exist
    if not interesting_headers.has_key('content-type'):
        interesting_headers['content-type'] = ''
    if not interesting_headers.has_key('content-md5'):
        interesting_headers['content-md5'] = ''

    # just in case someone used this. it's not necessary in this lib.
    if interesting_headers.has_key('x-amz-date'):
        interesting_headers['date'] = ''

    # if you're using expires for query string auth, then it trumps date
    # (and x-amz-date)
    if expires:
        interesting_headers['date'] = str(expires)

    sorted_header_keys = interesting_headers.keys()
    sorted_header_keys.sort()

    buf = "%s\n" % method
    for header_key in sorted_header_keys:
        if header_key.startswith(AMAZON_HEADER_PREFIX):
            buf += "%s:%s\n" % (header_key, interesting_headers[header_key])
        else:
            buf += "%s\n" % interesting_headers[header_key]

    # append the bucket if it exists
    if bucket != "":
        buf += "/%s" % bucket

    # add the key. even if it doesn't exist, add the slash
    buf += "/%s" % urllib.quote_plus(key)

    # handle special query string arguments

    if query_args.has_key("acl"):
        buf += "?acl"
    elif query_args.has_key("torrent"):
        buf += "?torrent"
    elif query_args.has_key("logging"):
        buf += "?logging"
    elif query_args.has_key("location"):
        buf += "?location"

    return buf

# computes the base64'ed hmac-sha hash of the canonical string and the secret
# access key, optionally urlencoding the result
def encode(aws_secret_access_key, str, urlencode=False):
    b64_hmac = base64.encodestring(hmac.new(aws_secret_access_key, str, sha).digest()).strip()
    if urlencode:
        return urllib.quote_plus(b64_hmac)
    else:
        return b64_hmac

def merge_meta(headers, metadata):
    final_headers = headers.copy()
    for k in metadata.keys():
        final_headers[METADATA_PREFIX + k] = metadata[k]

    return final_headers

# builds the query arg string
def query_args_hash_to_string(query_args):
    query_string = ""
    pairs = []
    for k, v in query_args.items():
        piece = k
        if v != None:
            piece += "=%s" % urllib.quote_plus(str(v))
        pairs.append(piece)

    return '&'.join(pairs)


class CallingFormat:
    PATH = 1
    SUBDOMAIN = 2
    VANITY = 3

    def build_url_base(protocol, server, port, bucket, calling_format):
        url_base = '%s://' % protocol

        if bucket == '':
            url_base += server
        elif calling_format == CallingFormat.SUBDOMAIN:
            url_base += "%s.%s" % (bucket, server)
        elif calling_format == CallingFormat.VANITY:
            url_base += bucket
        else:
            url_base += server

        url_base += ":%s" % port

        if (bucket != '') and (calling_format == CallingFormat.PATH):
            url_base += "/%s" % bucket

        return url_base

    build_url_base = staticmethod(build_url_base)



class Location:
    DEFAULT = None
    EU = 'EU'



class AWSAuthConnection:
    def __init__(self, aws_access_key_id, aws_secret_access_key, is_secure=True,
            server=DEFAULT_HOST, port=None, calling_format=CallingFormat.SUBDOMAIN):

        if not port:
            port = PORTS_BY_SECURITY[is_secure]

        self.aws_access_key_id = aws_access_key_id
        self.aws_secret_access_key = aws_secret_access_key
        self.is_secure = is_secure
        self.server = server
        self.port = port
        self.calling_format = calling_format

    def create_bucket(self, bucket, headers={}):
        return Response(self._make_request('PUT', bucket, '', {}, headers))

    def create_located_bucket(self, bucket, location=Location.DEFAULT, headers={}):
        if location == Location.DEFAULT:
            body = ""
        else:
            body = "<CreateBucketConstraint><LocationConstraint>" + \
                   location + \
                   "</LocationConstraint></CreateBucketConstraint>"
        return Response(self._make_request('PUT', bucket, '', {}, headers, body))

    def check_bucket_exists(self, bucket):
        return self._make_request('HEAD', bucket, '', {}, {})

    def list_bucket(self, bucket, options={}, headers={}):
        return ListBucketResponse(self._make_request('GET', bucket, '', options, headers))

    def delete_bucket(self, bucket, headers={}):
        return Response(self._make_request('DELETE', bucket, '', {}, headers))

    def put(self, bucket, key, object, headers={}):
        if not isinstance(object, S3Object):
            object = S3Object(object)

        return Response(
            self._make_request(
                'PUT',
                bucket,
                key,
                {},
                headers,
                object.data,
                object.metadata))

    def get(self, bucket, key, headers={}):
        return GetResponse(
            self._make_request('GET', bucket, key, {}, headers))

    def delete(self, bucket, key, headers={}):
        return Response(
            self._make_request('DELETE', bucket, key, {}, headers))

    def get_bucket_logging(self, bucket, headers={}):
        return GetResponse(self._make_request('GET', bucket, '', { 'logging': None }, headers))

    def put_bucket_logging(self, bucket, logging_xml_doc, headers={}):
        return Response(self._make_request('PUT', bucket, '', { 'logging': None }, headers, logging_xml_doc))

    def get_bucket_acl(self, bucket, headers={}):
        return self.get_acl(bucket, '', headers)

    def get_acl(self, bucket, key, headers={}):
        return GetResponse(
            self._make_request('GET', bucket, key, { 'acl': None }, headers))

    def put_bucket_acl(self, bucket, acl_xml_document, headers={}):
        return self.put_acl(bucket, '', acl_xml_document, headers)

    def put_acl(self, bucket, key, acl_xml_document, headers={}):
        return Response(
            self._make_request(
                'PUT',
                bucket,
                key,
                { 'acl': None },
                headers,
                acl_xml_document))

    def list_all_my_buckets(self, headers={}):
        return ListAllMyBucketsResponse(self._make_request('GET', '', '', {}, headers))

    def get_bucket_location(self, bucket):
        return LocationResponse(self._make_request('GET', bucket, '', {'location' : None}))

    # end public methods

    def _make_request(self, method, bucket='', key='', query_args={}, headers={}, data='', metadata={}):

        server = ''
        if bucket == '':
            server = self.server
        elif self.calling_format == CallingFormat.SUBDOMAIN:
            server = "%s.%s" % (bucket, self.server)
        elif self.calling_format == CallingFormat.VANITY:
            server = bucket
        else:
            server = self.server

        path = ''

        if (bucket != '') and (self.calling_format == CallingFormat.PATH):
            path += "/%s" % bucket

        # add the slash after the bucket regardless
        # the key will be appended if it is non-empty
        path += "/%s" % urllib.quote_plus(key)


        # build the path_argument string
        # add the ? in all cases since
        # signature and credentials follow path args
        if len(query_args):
            path += "?" + query_args_hash_to_string(query_args)

        is_secure = self.is_secure
        host = "%s:%d" % (server, self.port)
        while True:
            if (is_secure):
                connection = httplib.HTTPSConnection(host)
            else:
                connection = httplib.HTTPConnection(host)

            final_headers = merge_meta(headers, metadata)
            # add auth header
            self._add_aws_auth_header(final_headers, method, bucket, key, query_args)

            connection.request(method, path, data, final_headers)
            resp = connection.getresponse()
            if resp.status < 300 or resp.status >= 400:
                return resp
            # handle redirect
            location = resp.getheader('location')
            if not location:
                return resp
            # (close connection)
            resp.read()
            scheme, host, path, params, query, fragment \
                = urlparse.urlparse(location)
            # http redirects are plain connections, https ones are secure
            if scheme == "http": is_secure = False
            elif scheme == "https": is_secure = True
            else: raise ValueError("Not http/https: " + location)
            if query: path += "?" + query
            # retry with redirect

    def _add_aws_auth_header(self, headers, method, bucket, key, query_args):
        if not headers.has_key('Date'):
            headers['Date'] = time.strftime("%a, %d %b %Y %X GMT", time.gmtime())

        c_string = canonical_string(method, bucket, key, query_args, headers)
        headers['Authorization'] = \
            "AWS %s:%s" % (self.aws_access_key_id, encode(self.aws_secret_access_key, c_string))


class QueryStringAuthGenerator:
    # by default, expire in 1 minute
    DEFAULT_EXPIRES_IN = 60

    def __init__(self, aws_access_key_id, aws_secret_access_key, is_secure=True,
                 server=DEFAULT_HOST, port=None, calling_format=CallingFormat.SUBDOMAIN):

        if not port:
            port = PORTS_BY_SECURITY[is_secure]

        self.aws_access_key_id = aws_access_key_id
        self.aws_secret_access_key = aws_secret_access_key
        if (is_secure):
            self.protocol = 'https'
        else:
            self.protocol = 'http'

        self.is_secure = is_secure
        self.server = server
        self.port = port
        self.calling_format = calling_format
        self.__expires_in = QueryStringAuthGenerator.DEFAULT_EXPIRES_IN
        self.__expires = None

        # for backwards compatibility with older versions
        self.server_name = "%s:%s" % (self.server, self.port)

    def set_expires_in(self, expires_in):
        self.__expires_in = expires_in
        self.__expires = None

    def set_expires(self, expires):
        self.__expires = expires
        self.__expires_in = None

    def create_bucket(self, bucket, headers={}):
        return self.generate_url('PUT', bucket, '', {}, headers)

    def list_bucket(self, bucket, options={}, headers={}):
        return self.generate_url('GET', bucket, '', options, headers)

    def delete_bucket(self, bucket, headers={}):
        return self.generate_url('DELETE', bucket, '', {}, headers)

    def put(self, bucket, key, object, headers={}):
        if not isinstance(object, S3Object):
            object = S3Object(object)

        return self.generate_url(
            'PUT',
            bucket,
            key,
            {},
            merge_meta(headers, object.metadata))

    def get(self, bucket, key, headers={}):
        return self.generate_url('GET', bucket, key, {}, headers)

    def delete(self, bucket, key, headers={}):
        return self.generate_url('DELETE', bucket, key, {}, headers)

    def get_bucket_logging(self, bucket, headers={}):
        return self.generate_url('GET', bucket, '', { 'logging': None }, headers)

    def put_bucket_logging(self, bucket, logging_xml_doc, headers={}):
        return self.generate_url('PUT', bucket, '', { 'logging': None }, headers)

    def get_bucket_acl(self, bucket, headers={}):
        return self.get_acl(bucket, '', headers)

    def get_acl(self, bucket, key='', headers={}):
        return self.generate_url('GET', bucket, key, { 'acl': None }, headers)

    def put_bucket_acl(self, bucket, acl_xml_document, headers={}):
        return self.put_acl(bucket, '', acl_xml_document, headers)

    # don't really care what the doc is here.
    def put_acl(self, bucket, key, acl_xml_document, headers={}):
        return self.generate_url('PUT', bucket, key, { 'acl': None }, headers)

    def list_all_my_buckets(self, headers={}):
        return self.generate_url('GET', '', '', {}, headers)

    def make_bare_url(self, bucket, key=''):
        full_url = self.generate_url('GET', bucket, key)
        return full_url[:full_url.index('?')]

    def generate_url(self, method, bucket='', key='', query_args={}, headers={}):
        expires = 0
        if self.__expires_in != None:
            expires = int(time.time() + self.__expires_in)
        elif self.__expires != None:
            expires = int(self.__expires)
        else:
            raise ValueError("Invalid expires state")

        canonical_str = canonical_string(method, bucket, key, query_args, headers, expires)
        encoded_canonical = encode(self.aws_secret_access_key, canonical_str)

        url = CallingFormat.build_url_base(self.protocol, self.server, self.port, bucket, self.calling_format)

        url += "/%s" % urllib.quote_plus(key)

        query_args['Signature'] = encoded_canonical
        query_args['Expires'] = expires
        query_args['AWSAccessKeyId'] = self.aws_access_key_id

        url += "?%s" % query_args_hash_to_string(query_args)

        return url


class S3Object:
    def __init__(self, data, metadata={}):
        self.data = data
        self.metadata = metadata

class Owner:
    def __init__(self, id='', display_name=''):
        self.id = id
        self.display_name = display_name

class ListEntry:
    def __init__(self, key='', last_modified=None, etag='', size=0, storage_class='', owner=None):
        self.key = key
        self.last_modified = last_modified
        self.etag = etag
        self.size = size
        self.storage_class = storage_class
        self.owner = owner

class CommonPrefixEntry:
    def __init__(self, prefix=''):
        self.prefix = prefix

class Bucket:
    def __init__(self, name='', creation_date=''):
        self.name = name
        self.creation_date = creation_date

class Response:
    def __init__(self, http_response):
        self.http_response = http_response
        # you have to do this read, even if you don't expect a body.
        # otherwise, the next request fails.
        self.body = http_response.read()
        if http_response.status >= 300 and self.body:
            self.message = self.body
        else:
            self.message = "%03d %s" % (http_response.status, http_response.reason)



class ListBucketResponse(Response):
    def __init__(self, http_response):
        Response.__init__(self, http_response)
        if http_response.status < 300:
            handler = ListBucketHandler()
            xml.sax.parseString(self.body, handler)
            self.entries = handler.entries
            self.common_prefixes = handler.common_prefixes
            self.name = handler.name
            self.marker = handler.marker
            self.prefix = handler.prefix
            self.is_truncated = handler.is_truncated
            self.delimiter = handler.delimiter
            self.max_keys = handler.max_keys
            self.next_marker = handler.next_marker
        else:
            self.entries = []

class ListAllMyBucketsResponse(Response):
    def __init__(self, http_response):
        Response.__init__(self, http_response)
        if http_response.status < 300:
            handler = ListAllMyBucketsHandler()
            xml.sax.parseString(self.body, handler)
            self.entries = handler.entries
        else:
            self.entries = []

class GetResponse(Response):
    def __init__(self, http_response):
        Response.__init__(self, http_response)
        response_headers = http_response.msg  # older pythons don't have getheaders
        metadata = self.get_aws_metadata(response_headers)
        self.object = S3Object(self.body, metadata)

    def get_aws_metadata(self, headers):
        metadata = {}
        for hkey in headers.keys():
            if hkey.lower().startswith(METADATA_PREFIX):
                metadata[hkey[len(METADATA_PREFIX):]] = headers[hkey]
                del headers[hkey]

        return metadata

class LocationResponse(Response):
    def __init__(self, http_response):
        Response.__init__(self, http_response)
        if http_response.status < 300:
            handler = LocationHandler()
            xml.sax.parseString(self.body, handler)
            self.location = handler.location

class ListBucketHandler(xml.sax.ContentHandler):
    def __init__(self):
        self.entries = []
        self.curr_entry = None
        self.curr_text = ''
        self.common_prefixes = []
        self.curr_common_prefix = None
        self.name = ''
        self.marker = ''
        self.prefix = ''
        self.is_truncated = False
        self.delimiter = ''
        self.max_keys = 0
        self.next_marker = ''
        self.is_echoed_prefix_set = False

    def startElement(self, name, attrs):
        if name == 'Contents':
            self.curr_entry = ListEntry()
        elif name == 'Owner':
            self.curr_entry.owner = Owner()
        elif name == 'CommonPrefixes':
            self.curr_common_prefix = CommonPrefixEntry()


    def endElement(self, name):
        if name == 'Contents':
            self.entries.append(self.curr_entry)
        elif name == 'CommonPrefixes':
            self.common_prefixes.append(self.curr_common_prefix)
        elif name == 'Key':
            self.curr_entry.key = self.curr_text
        elif name == 'LastModified':
            self.curr_entry.last_modified = self.curr_text
        elif name == 'ETag':
            self.curr_entry.etag = self.curr_text
        elif name == 'Size':
            self.curr_entry.size = int(self.curr_text)
        elif name == 'ID':
            self.curr_entry.owner.id = self.curr_text
        elif name == 'DisplayName':
            self.curr_entry.owner.display_name = self.curr_text
        elif name == 'StorageClass':
            self.curr_entry.storage_class = self.curr_text
        elif name == 'Name':
            self.name = self.curr_text
        elif name == 'Prefix' and self.is_echoed_prefix_set:
            self.curr_common_prefix.prefix = self.curr_text
        elif name == 'Prefix':
            self.prefix = self.curr_text
            self.is_echoed_prefix_set = True
        elif name == 'Marker':
            self.marker = self.curr_text
        elif name == 'IsTruncated':
            self.is_truncated = self.curr_text == 'true'
        elif name == 'Delimiter':
            self.delimiter = self.curr_text
        elif name == 'MaxKeys':
            self.max_keys = int(self.curr_text)
        elif name == 'NextMarker':
            self.next_marker = self.curr_text

        self.curr_text = ''

    def characters(self, content):
        self.curr_text += content


class ListAllMyBucketsHandler(xml.sax.ContentHandler):
    def __init__(self):
        self.entries = []
        self.curr_entry = None
        self.curr_text = ''

    def startElement(self, name, attrs):
        if name == 'Bucket':
            self.curr_entry = Bucket()

    def endElement(self, name):
        if name == 'Name':
            self.curr_entry.name = self.curr_text
        elif name == 'CreationDate':
            self.curr_entry.creation_date = self.curr_text
        elif name == 'Bucket':
            self.entries.append(self.curr_entry)

    def characters(self, content):
        self.curr_text = content


class LocationHandler(xml.sax.ContentHandler):
    def __init__(self):
        self.location = None
        self.state = 'init'

    def startElement(self, name, attrs):
        if self.state == 'init':
            if name == 'LocationConstraint':
                self.state = 'tag_location'
                self.location = ''
            else: self.state = 'bad'
        else: self.state = 'bad'

    def endElement(self, name):
        if self.state == 'tag_location' and name == 'LocationConstraint':
            self.state = 'done'
        else: self.state = 'bad'

    def characters(self, content):
        if self.state == 'tag_location':
            self.location += content

if __name__ == "__main__":
    keys = raw_input("Enter access and secret key (separated by a space): ")
    access_key, secret_key = keys.split(" ")
    s3 = AWSAuthConnection(access_key, secret_key)
    bucket = "test_s3_lib"
    m = s3.put(bucket, "sample", "hola mundo", {"Content-Type": "text/lame"})
    print m.http_response.status, m.http_response.reason
    print m.http_response.getheaders()
    print m.body
=== added file 'txaws/s4/contrib/__init__.py'
=== added file 'txaws/s4/s4.py'
--- txaws/s4/s4.py 1970-01-01 00:00:00 +0000
+++ txaws/s4/s4.py 2009-08-19 14:36:56 +0000
@@ -0,0 +1,742 @@
1# Copyright 2009 Canonical Ltd.
2#
3# Permission is hereby granted, free of charge, to any person obtaining
4# a copy of this software and associated documentation files (the
5# "Software"), to deal in the Software without restriction, including
6# without limitation the rights to use, copy, modify, merge, publish,
7# distribute, sublicense, and/or sell copies of the Software, and to
8# permit persons to whom the Software is furnished to do so, subject to
9# the following conditions:
10#
11# The above copyright notice and this permission notice shall be
12# included in all copies or substantial portions of the Software.
13#
14# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21
22""" S4 - a S3 storage system stub
23
24This module implementes a stub for Amazons S3 storage system.
25
26Not all functionality is provided, just enough to test the client.
27
28"""
29from __future__ import with_statement
30
31import os
32import hmac
33import time
34import base64
35import logging
36import hashlib
37from urllib import urlencode
38
39# cPickle would be faster here, but working around its relative
40# imports issues for this module requires extra hacking
41import pickle
42
43from boto.utils import canonical_string as canonical_path_string
44# pylint: disable-msg=W0611
45from boto.s3.connection import OrdinaryCallingFormat as CallingFormat
46
47from twisted.web import server, resource, error, http
48from twisted.internet import reactor, interfaces
49# pylint and zope dont work
50# pylint: disable-msg=E0611
51# pylint: disable-msg=F0401
52from zope.interface import implements
53
54# xml namespace response header required
55XMLNS = "http://s3.amazonaws.com/doc/2006-03-01"
56
57AWS_DEFAULT_ACCESS_KEY_ID = 'aws_key'
58AWS_DEFAULT_SECRET_ACCESS_KEY = 'aws_secret'
59AMAZON_HEADER_PREFIX = 'x-amz-'
60AMAZON_META_PREFIX = "x-amz-meta-"
61
62BLOCK_SIZE = 2**16
63
64S4_STATE_FILE = ".s4_state"
65
66logger = logging.getLogger('UbuntuOne.S4')
67
68# pylint: disable-msg=W0403
69import s4_xml
70
71class S4StorageException(Exception):
72 """ exception raised when S4 backend store runs into trouble """
73
74class FakeContent(object):
75 """A content that can be accesed by slicing but will never exists in memory
76 """
77 def __init__(self, char, size):
78 """Create the content as char*size."""
79 self.char = char
80 self.size = size
81
82 def __getitem__(self, slice):
83 """Get a piece of the content."""
84 size = min(slice.stop, self.size) - slice.start
85 return self.char*size
86
87 def hexdigest(self):
88 """Send a fake hexdigest. For big contents this takes too much time
89 to calculate, so we just fake it."""
90 block_size = BLOCK_SIZE
91 start = 0
92 data = self[start:start+block_size]
93 md5calc = hashlib.md5()
94 md5calc.update(data)
95 return md5calc.hexdigest()
96
97 def __len__(self):
98 """The size."""
99 return self.size
100
101class ContentProducer(object):
102 """A content producer used to stream big data."""
103 implements(interfaces.IPullProducer)
104
105 def __init__(self, request, content, buffer_size=BLOCK_SIZE):
106 """Create a producer for request, that produces content."""
107 self.request = request
108 self.content = content
109 self.buffer_size = buffer_size
110 self.position = 0
111 self.paused = False
112
113 def startProducing(self):
114 """IPullProducer api."""
115 self.request.registerProducer(self, streaming=False)
116
117 def finished(self):
118 """Called to finish the request after producing."""
119 self.request.unregisterProducer()
120 self.request.finish()
121
122 def resumeProducing(self):
123 """IPullProducer api."""
124 if self.position > len(self.content):
125 self.finished()
126 return
127
128 data = self.content[self.position:self.position+self.buffer_size]
129
130 self.position += self.buffer_size
131 self.request.write(data)
132
133 def stopProducing(self):
134 """IPullProducer api."""
135 pass
136
137
138def canonical_string(method, bucket="", key="", query_args=None, headers=None):
139 """ compatibility S3 canonical string calculator for cases where passing in
140 a bucket name, key anme and a hash of query args is easier than using an
141 S3 path """
142 path = []
143 if bucket:
144 path.append("/%s" % bucket)
145 path.append("/%s" % key)
146 if query_args:
147 path.append("?%s" % urlencode(query_args))
148 path = "".join(path)
149 if headers is None:
150 headers = {}
151 return canonical_path_string(method=method, path=path, headers=headers)
152
153def encode(secret_key, data):
154 """base64encoded digest of data using secret_key"""
155 encoded = hmac.new(secret_key, data, hashlib.sha1).digest()
156 return base64.encodestring(encoded).strip()
157
158def parse_range_header(range):
159 """modeled after twisted.web.static.File._parseRangeHeader()"""
160 if '=' in range:
161 type, value = range.split('=', 1)
162 else:
163 raise ValueError("Invalid range header, no '='")
164 if type != 'bytes':
165 raise ValueError("Invalid range header, must be a 'bytes' range")
166 raw_ranges = [bytes.strip() for bytes in value.split(',')]
167 ranges = []
168 for current_range in raw_ranges:
169 if '-' not in current_range:
170 raise ValueError("Illegal byte range: %r" % current_range)
171 begin, end = current_range.split('-')
172 if begin:
173 begin = int(begin)
174 else:
175 begin = None
176 if end:
177 end = int(end)
178 else:
179 end = None
180 ranges.append((begin, end))
181 return ranges
182
183class _ListResult(resource.Resource):
184 """ base class for bulding lists of amazon results """
185 isLeaf = True
186 def __init__(self):
187 resource.Resource.__init__(self)
188 def add_headers(self, request, content):
189 """ add standard headers to an amazon list result page reply """
190 request.setHeader("x-amz-id-2", str(request))
191 request.setHeader("x-amz-request-id", str(request))
192 request.setHeader("Content-Type", "text/xml")
193 request.setHeader("Content-Length", str(len(content)))
194
195
196class ListAllMyBucketsResult(_ListResult):
197 """ builds the result for list all buckets call """
198 def __init__(self, buckets, owner=None):
199 _ListResult.__init__(self)
200 self.buckets = buckets
201 if owner:
202 self.owner = owner
203 else:
204 self.owner = dict(id = 0, name = "fakeuser")
205
206 def render_GET(self, request):
207 """ render request for a GET listing """
208 lambr = s4_xml.ListAllMyBucketsResult(self.owner, self.buckets)
209 content = s4_xml.to_XML(lambr)
210 self.add_headers(request, content)
211 return content
212
213class ListBucketResult(_ListResult):
214 """ encapsulates a list of items in a bucket """
215 def __init__(self, bucket):
216 _ListResult.__init__(self)
217 self.bucket = bucket
218
219 def render_GET(self, request):
220 """ Render response for a GET listing """
221 # pylint: disable-msg=W0631
222 children = self.bucket.bucket_children.copy()
223 prefix = request.args.get("prefix", "")
224 if prefix:
225 children = dict([x for x in children.iteritems()
226 if x[0].startswith(prefix[0])])
227 maxkeys = request.args.get("max-keys", 0)
228 if maxkeys:
229 maxkeys = int(maxkeys[0])
230 ck = children.keys()[:maxkeys]
231 children = dict([x for x in children.iteritems() if x[0] in ck])
232 lbr = s4_xml.ListBucketResult(self.bucket, children)
233 s4_xml.add_props(lbr, Prefix=prefix, MaxKeys=maxkeys)
234 content = s4_xml.to_XML(lbr)
235 self.add_headers(request, content)
236 return content
237
238class BasicS3Object(object):
239 """ Basic S3 object class that takes care of contents and properties """
240 owner_id = 0
241 owner = "fakeuser"
242
243 def __init__(self, name, contents, content_type="binary/octect-stream",
244 content_md5=None):
245 self.name = name
246 self.content_type = content_type
247 self.contents = contents
248 if content_md5:
249 if isinstance(content_md5, str):
250 self._etag = content_md5
251 else:
252 self._etag = content_md5.hexdigest()
253 else:
254 self._etag = hashlib.md5(contents).hexdigest()
255 self._date = time.asctime()
256 self._meta = {}
257
258 def __getstate__(self):
259 d = self.__dict__.copy()
260 del d["children"]
261 return d
262
263 def get_etag(self):
264 " build an ETag value. Extra quites are mandated by standards "
265 return '"%s"' % self._etag
266 def set_date(self, datestr):
267 """ set the object's time """
268 self._date = datestr
269 def get_date(self):
270 """ get the object's time """
271 return self._date
272 def get_size(self):
273 """ returns size of object's contents """
274 return len(self.contents)
275 def get_owner(self):
276 """ query object's owner """
277 return self.owner
278 def get_owner_id(self):
279 """ query object's owner id """
280 return self.owner_id
281 def set_meta(self, name, val):
282 """ set metadata value for object """
283 m = self._meta.setdefault(name, [])
284 m.append(val)
285 def iter_meta(self):
286 """ iterate over object's metadata """
287 for k, vals in self._meta.iteritems():
288 for v in vals:
289 yield k, v
290 def delete(self):
291 """ clear storage used by object """
292 self.contents = None
293
294class S3Object(BasicS3Object, resource.Resource):
295 """ Storage Object
296 This objects store the data and metadata
297 """
298 isLeaf = True
299
300 def __init__(self, *args, **kw):
301 BasicS3Object.__init__(self, *args, **kw)
302 resource.Resource.__init__(self)
303
304 def _render(self, request):
305 """render the response for a GET or HEAD request on this object"""
306 request.setHeader("x-amz-id-2", str(request))
307 request.setHeader("x-amz-request-id", str(request))
308 request.setHeader("Content-Type", self.content_type)
309 request.setHeader("ETag", self._etag)
310 for k, v in self.iter_meta():
311 request.setHeader("%s%s" % (AMAZON_META_PREFIX, k), v)
312 range = request.getHeader("Range")
313 size = len(self.contents)
314 if request.method == 'HEAD':
315 request.setHeader("Content-Length", size)
316 return ""
317 if range:
318 ranges = parse_range_header(range)
319 length = 0
320 if len(ranges)==1:
321 begin, end = ranges[0]
322 if begin is None:
323 request.setResponseCode(
324 http.REQUESTED_RANGE_NOT_SATISFIABLE)
325 return ''
326 if not end:
327 end = size
328 elif end < size:
329 end += 1
330 if begin >= size:
331 request.setResponseCode(
332 http.REQUESTED_RANGE_NOT_SATISFIABLE)
333 request.setHeader(
334 'content-range', 'bytes */%d' % size)
335 return ''
336 else:
337 request.setHeader(
338 'content-range',
339 'bytes %d-%d/%d' % (begin, end-1, size))
340 length = (end - begin)
341 request.setHeader("Content-Length", length)
342 request.setResponseCode(http.PARTIAL_CONTENT)
343 contents = self.contents[begin:end]
344 else:
345 # multiple ranges should be returned in a multipart response
346 request.setResponseCode(
347 http.REQUESTED_RANGE_NOT_SATISFIABLE)
348 return ''
349
350 else:
351 request.setHeader("Content-Length", str(size))
352 contents = self.contents
353
354 producer = ContentProducer(request, contents)
355 producer.startProducing()
356 return server.NOT_DONE_YET
357 render_GET = _render
358 render_HEAD = _render
359
360class UploadS3Object(resource.Resource):
361 """ Class for handling uploads
362
363 It handles the render_PUT method to update the bucket with the data
364 """
365 isLeaf = True
366 def __init__(self, bucket, name):
367 resource.Resource.__init__(self)
368 self.bucket = bucket
369 self.name = name
370
371 def render_PUT(self, request):
372 """accept the incoming data for a PUT request"""
373 data = request.content.read()
374 content_type = request.getHeader("Content-Type")
375 content_md5 = request.getHeader("Content-MD5")
376 if content_md5: # check if the data is good
377 header_md5 = base64.decodestring(content_md5)
378 data_md5 = hashlib.md5(data)
379 assert (data_md5.digest() == header_md5), "md5 check failed!"
380 content_md5 = data_md5
381 child = S3Object(self.name, data, content_type, content_md5)
382 date = request.getHeader("Date")
383 if not date:
384 date = time.ctime()
385 child.set_date(date)
386 for k, v in request.getAllHeaders().items():
387 if k.startswith(AMAZON_META_PREFIX):
388 child.set_meta(k[len(AMAZON_META_PREFIX):], v)
389 self.bucket.bucket_children[ self.name ] = child
390 request.setHeader("ETag", child.get_etag())
391 logger.debug("created object bucket=%s name=%s size=%d" % (
392 self.bucket, self.name, len(data)))
393 return ""
394
395
396class EmptyPage(resource.Resource):
397 """ return Ok/empty document """
398 isLeaf = True
399 def __init__(self, retcode=http.OK, headers=None, body=""):
400 resource.Resource.__init__(self)
401 self._retcode = retcode
402 self._headers = headers
403 self._body = body
404
405 def render(self, request):
406 """ override the render method to return an empty document """
407 request.setHeader("x-amz-id-2", str(request))
408 request.setHeader("x-amz-request-id", str(request))
409 request.setHeader("Content-Type", "text/html")
410 request.setHeader("Connection", "close")
411 if self._headers:
412 for h, v in self._headers.items():
413 request.setHeader(h, v)
414 request.setResponseCode(self._retcode)
415 return self._body
416
417def ErrorPage(http_code, code, message, path, with_body=True):
418 """ helper function that renders an Amazon error response xml page """
419 err = s4_xml.AmazonError(code, message, path)
420 body = s4_xml.to_XML(err)
421 body_size = str(len(body))
422 if not with_body:
423 body = ""
424 logger.info("returning error page %s [%s]%s for %s" % (
425 http_code, code, message, path))
426 return EmptyPage(http_code, headers={
427 "Content-Type": "text/xml",
428 "Content-Length": body_size,
429 }, body=body)
430
431# pylint: disable-msg=C0321
432class Bucket(resource.Resource):
433 """ Storage Bucket
434
435 Buckets hold objects with data and receive uploads in case of PUT
436 """
437 def __init__(self, name):
438 resource.Resource.__init__(self)
439 # cant use children, resource already has that name
440 # and it would work as a cache
441 self.bucket_children = {}
442 self._name = name
443 self._date = time.time()
444
445 def get_name(self):
446 """ returns this bucket's name """
447 return self._name
448 def __len__(self):
449 """ returns how many objects are in this bucket """
450 return len(self.bucket_children)
451 def iter_children(self):
452 """ iterator that returns each children objects """
453 for (key, val) in self.bucket_children.iteritems():
454 yield key, val
455 def delete(self):
456 """ clean up internal state to prepare bucket for deletion """
457 pass
458 def _get_state_file(self, rootdir, check=True):
459 """ builds the pathname of the state file """
460 state_file = os.path.join(rootdir, "%s%s" % (self._name, S4_STATE_FILE))
461 if check and not os.path.exists(state_file):
462 return None
463 return state_file
464 def _save(self, rootdir):
465 """ saves the state of a bucket """
466 state_file = self._get_state_file(rootdir, check=False)
467 data = dict(
468 name = self._name,
469 date = self._date,
470 objects = dict([ x for x in self.bucket_children.iteritems() ])
471 )
472 with open(state_file, "wb") as state_fd:
473 pickle.dump(data, state_fd)
474 logger.debug("saved bucket '%s' in file '%s'" % (
475 self._name, state_file))
476 return
477 def _load(self, rootdir):
478 """ loads a saved bucket state """
479 state_file = self._get_state_file(rootdir)
480 if not state_file:
481 return
482 with open(state_file, "rb") as state_fd:
483 data = pickle.load(state_fd)
484 assert (self._name == data["name"]), \
485 "can not load bucket with different name"
486 self._date = data["date"]
487 self.bucket_children = data["objects"]
488 return
489
490 def getChild(self, name, request):
491 """get the next object down the chain"""
492 # avoid recursion into the key names
493 # (which can contain / as a valid char!)
494 if name and request.postpath:
495 name = os.path.join(*((name,)+tuple(request.postpath)))
496 assert (name), "Wrong call stack for name='%s'" % (name,)
497 if request.method == "PUT":
498 child = UploadS3Object(self, name)
499 elif request.method in ("GET", "HEAD") :
500 child = self.bucket_children.get(name, None)
501 elif request.method == "DELETE":
502 child = self.bucket_children.get(name, None)
503 if child is None: # delete unknown object
504 return EmptyPage(http.NO_CONTENT)
505 child.delete()
506 del self.bucket_children[name]
507 return EmptyPage(http.NO_CONTENT)
508 else:
509 logger.error("UNHANDLED request method %s" % request.method)
510 return ErrorPage(http.BAD_REQUEST, "BadRequest",
511 "Your '%s' request is invalid" % request.method,
512 request.path)
513 if child is None:
514 return ErrorPage(http.NOT_FOUND, "NoSuchKey",
515 "The specified key does not exist.",
516 request.path, with_body=(request.method!="HEAD"))
517 return child
518
519class DiscardBucket(Bucket):
520 """A bucket that will just discard all data as it arrives."""
521
522 def getChild(self, name, request):
523 """accept uploads and discard them."""
524 if request.method == "PUT":
525 return self
526 else:
527 return ErrorPage(http.NOT_FOUND, "NoSuchKey",
528 "The specified key does not exist.",
529 request.path)
530
531 def render_PUT(self, request):
532 """accept the incoming data for a PUT request"""
533 # we need to compute a correct md5/etag to send back to the client
534 etag = hashlib.md5()
535 # this loop should be deadlocking with the client code that writes the
536 # data. But render put doesnt get called until the streamer has
537 # put all the that. The python mem usage is constant. And it works.
538 while True:
539 data = request.content.read(BLOCK_SIZE)
540 if not data:
541 break
542 etag.update(data)
543 request.setHeader("ETag", '"%s"' % etag.hexdigest())
544 return ""
545
546class SizeBucket(Bucket):
547 """ SizeBucket
548
549 Fakes contents and always returns an object with size = int(objname)
550 """
551
552 def getChild(self, name, request):
553 """get the next object down the chain"""
554 try:
555 fake = FakeContent("0", int(name))
556 o = S3Object(name, fake, "text/plain", fake.hexdigest())
557 return o
558 except ValueError:
559 return "this buckets requires integer named objects"
560
561
562class Root(resource.Resource):
563 """ Site Root
564
565 handles all the requests.
566 on initialization it configures some default buckets
567 """
568 owner_id = 0
569 owner = "fakeuser"
570
571 def __init__(self, storagedir=None, allow_default_access=True):
572 resource.Resource.__init__(self)
573
574 self.auth = {}
575 if allow_default_access:
576 self.auth[ AWS_DEFAULT_ACCESS_KEY_ID ] = \
577 AWS_DEFAULT_SECRET_ACCESS_KEY
578 self.fail_next = {}
579 self.buckets = dict(
580 size = SizeBucket("size"),
581 discard = DiscardBucket("discard"))
582
583 self._rootdir = storagedir
584 if self._rootdir:
585 self._load()
586
587 def _add_bucket(self, name):
588 """ create a new bucket """
589 if self.buckets.has_key(name):
590 return self.buckets[name]
591 bucket = Bucket(name)
592 self.buckets[name] = bucket
593 if self._rootdir:
594 bucket._save(self._rootdir)
595 self._save()
596 return bucket
597
598 def _get_state_file(self, check=True):
599 """ locate the saved state file on disk """
600 assert self._rootdir, "S4 storage has not been initialized"
601 state_file = os.path.join(self._rootdir, S4_STATE_FILE)
602 if check and not os.path.exists(state_file):
603 return None
604 return state_file
605 def _load(self):
606 "load a saved bucket list state from disk "
607 state_file = self._get_state_file()
608 if not state_file:
609 return
610 data = dict(buckets=[])
611 with open(state_file, "rb") as state_fd:
612 data = pickle.load(state_fd)
613 self.owner_id = data["owner_id"]
614 self.owner = data["owner"]
615 for bucket_name in data["buckets"]:
616 bucket = Bucket(bucket_name)
617 bucket._load(self._rootdir)
618 self.buckets[bucket_name] = bucket
619 self._save(with_buckets=False)
620 return
621 def _save(self, with_buckets=True):
622 """ save current state to disk """
623 state_file = self._get_state_file(check=False)
624 data = dict(
625 owner = self.owner,
626 owner_id = self.owner_id,
627 buckets = [ x for x in self.buckets.keys()
628 if x not in ("size", "discard")],
629 )
630 with open(state_file, "wb") as state_fd:
631 pickle.dump(data, state_fd)
632 logger.debug("saved state file %s" % state_file)
633 if not with_buckets:
634 return
635 for bucket_name in data["buckets"]:
636 bucket = self.buckets[bucket_name]
637 bucket._save(self._rootdir)
638 return
639 def fail_next_put(self, error=http.INTERNAL_SERVER_ERROR,
640 message="Internal Server Error"):
641 """
642 Force next PUT request to return an error
643 """
644 logger.debug("will fail next put with %d (%s)", error, message)
645 self.fail_next['PUT'] = error, message
646
647 def fail_next_get(self, error=http.INTERNAL_SERVER_ERROR,
648 message="Internal Server Error"):
649 """
650 Force next GET request to return an error
651 """
652 logger.debug("will fail next get with %d (%s)", error, message)
653 self.fail_next['GET'] = error, message
654
655 def getChild(self, name, request):
656 """get the next object down the resource path"""
657 if not self.check_auth( request ):
658 return ErrorPage(http.FORBIDDEN, "InvalidSecurity",
659 "The provided security credentials are not valid.",
660 request.path)
661 if request.method in self.fail_next:
662 err, message = self.fail_next.pop(request.method)
663 return error.ErrorPage(err, message, message)
664 if request.path == "/" and request.method == "GET":
665 # this is a getallbuckets call
666 return ListAllMyBucketsResult(self.buckets.values())
667
668 # need to record when things change and save bucket state
669 if self._rootdir and name and request.method in ("PUT", "DELETE"):
670 def save_state(result, self, name, method):
671 """ callback for when rendering is finished """
672 bucket = self.buckets.get(name)  # gone if this request deleted it
673 return bucket and bucket._save(self._rootdir)
674 _defer = request.notifyFinish()
675 _defer.addCallback(save_state, self, name, request.method)
676
677 bucket = self.buckets.get(name, None)
678 # if we operate on a key, pass control
679 if request.postpath and request.postpath[0]:
680 if bucket is None:
681 # bucket does not exist, yet we attempt operation on
682 # an object from that bucket
683 return ErrorPage(http.NOT_FOUND, "InvalidBucketName",
684 "The specified bucket is not valid",
685 request.path)
686 return bucket
687
688 # these are operations that are happening on a bucket and
689 # which are better handled from the root handler
690
691 # we're asked to list a bucket
692 if request.method in ("GET", "HEAD"):
693 if bucket is None:
694 return ErrorPage(http.NOT_FOUND, "NoSuchBucket",
695 "The specified bucket does not exist.",
696 request.path)
697 return ListBucketResult(bucket)
698 # bucket creation. if bucket already exists, noop
699 elif request.method == "PUT":
700 if bucket is None:
701 bucket = self._add_bucket(name)
702 return EmptyPage()
703 # we're asked to delete a bucket
704 elif request.method == "DELETE":
705 if len(bucket): # non-empty buckets cannot be deleted
706 return ErrorPage(http.CONFLICT, "BucketNotEmpty",
707 "The bucket you tried to delete is not empty.",
708 request.path)
709 bucket.delete()
710 del self.buckets[name]
711 if self._rootdir:
712 self._save(with_buckets=False)
713 return EmptyPage(http.NO_CONTENT,
714 headers=dict(Location=request.path))
715 else:
716 return ErrorPage(http.BAD_REQUEST, "BadRequest",
717 "Your '%s' request is invalid" % request.method,
718 request.path)
719 # unreachable: every branch above returns
720
721 def check_auth(self, request):
722 """ Validates key/secret """
723 auth_str = request.getHeader('Authorization')
724 if not auth_str or not auth_str.startswith("AWS "):
725 return False
726 access_key, signature = auth_str[4:].split(":", 1)
727 if access_key not in self.auth:
728 return False
729 secret_key = self.auth[access_key]
730 headers = request.getAllHeaders()
731 c_string = canonical_path_string(
732 request.method, request.path, headers)
733 if encode(secret_key, c_string) != signature:
734 return False
735 return True
736
737
738if __name__ == "__main__":
739 root = Root()
740 site = server.Site(root)
741 reactor.listenTCP(8808, site)
742 reactor.run()
0743
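
The check_auth() method above implements the 2006-03-01 S3 signing scheme: base64 of an HMAC-SHA1 over a canonical request string. As a minimal client-side sketch (not part of the diff, and assuming canonical_path_string() reduces to the documented "METHOD\n\n\nDATE\nPATH" form for a bare GET; the exact canonicalization lives in the s4 module's canonical_path_string()/encode() helpers, which are not shown in this hunk):

    import base64
    import hmac
    from hashlib import sha1

    from txaws.s4 import s4

    def sign_request(method, path, date, secret_key):
        # StringToSign for a bare GET: empty Content-MD5 and Content-Type
        c_string = "%s\n\n\n%s\n%s" % (method, date, path)
        digest = hmac.new(secret_key, c_string, sha1).digest()
        return base64.b64encode(digest)

    date = "Wed, 19 Aug 2009 14:36:56 GMT"
    signature = sign_request("GET", "/", date,
                             s4.AWS_DEFAULT_SECRET_ACCESS_KEY)
    headers = {"Date": date,
               "Authorization": "AWS %s:%s" % (
                   s4.AWS_DEFAULT_ACCESS_KEY_ID, signature)}
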
=== added file 'txaws/s4/s4_xml.py'
--- txaws/s4/s4_xml.py 1970-01-01 00:00:00 +0000
+++ txaws/s4/s4_xml.py 2009-08-19 14:36:56 +0000
@@ -0,0 +1,155 @@
1# Copyright 2008-2009 Canonical Ltd.
2#
3# Permission is hereby granted, free of charge, to any person obtaining
4# a copy of this software and associated documentation files (the
5# "Software"), to deal in the Software without restriction, including
6# without limitation the rights to use, copy, modify, merge, publish,
7# distribute, sublicense, and/or sell copies of the Software, and to
8# permit persons to whom the Software is furnished to do so, subject to
9# the following conditions:
10#
11# The above copyright notice and this permission notice shall be
12# included in all copies or substantial portions of the Software.
13#
14# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21
22""" build XML responses that mimic the behavior of the real S3 """
23
24from StringIO import StringIO
25from xml.etree.ElementTree import Element, ElementTree
26
27XMLNS = "http://s3.amazonaws.com/doc/2006-03-01"
28
29# <?xml version="1.0" encoding="UTF-8"?>
30def to_XML(elem):
31 """ renders an xml element to a text/xml page """
32 s = StringIO()
33 s.write("""<?xml version="1.0" encoding="UTF-8"?>\n""")
34 tree = ElementTree(elem)
35 tree.write(s)
36 return s.getvalue()
37
38def add_props(elem, **kw):
39 """ add subnodes to a xml node based on a dictionary """
40 for (key, val) in kw.iteritems():
41 prop = Element(key)
42 prop.tail = "\n"
43 if val is None:
44 val = ""
45 elif isinstance(val, bool):
46 val = str(val).lower()
47 elif not isinstance(val, str):
48 val = str(val)
49 prop.text = val
50 elem.append(prop)
51
52# <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01">
53# <Name>bucket</Name>
54# <Prefix>prefix</Prefix>
55# <Marker>marker</Marker>
56# <MaxKeys>max-keys</MaxKeys>
57# <IsTruncated>false</IsTruncated>
58# <Contents>
59# <Key>object</Key>
60# <LastModified>date</LastModified>
61# <ETag>etag</ETag>
62# <Size>size</Size>
63# <StorageClass>STANDARD</StorageClass>
64# <Owner>
65# <ID>owner_id</ID>
66# <DisplayName>owner_name</DisplayName>
67# </Owner>
68# </Contents>
69# ...
70# </ListBucketResult>
71def ListBucketResult(bucket, children):
72 """ builds the xml tree corresponding to a bucket listing """
73 root = Element("ListBucketResult", dict(xmlns=XMLNS))
74 root.tail = root.text = "\n"
75 add_props(root, **dict(
76 Name = bucket.get_name(),
77 IsTruncated = False,
78 Marker = 0,
79 ))
80 for (obname, ob) in children.iteritems():
81 contents = Element("Contents")
82 add_props(contents, **dict(
83 Key = obname,
84 LastModified = ob.get_date(),
85 ETag = ob.get_etag(),
86 Size = ob.get_size(),
87 StorageClass = "STANDARD",
88 ))
89 owner = Element("Owner")
90 add_props(owner, **dict(
91 ID = ob.get_owner_id(),
92 DisplayName = ob.get_owner(), ))
93 contents.append(owner)
94 root.append(contents)
95 return root
96
97# <Error>
98# <Code>NoSuchKey</Code>
99# <Message>The resource you requested does not exist</Message>
100# <Resource>/mybucket/myfoto.jpg</Resource>
101# <RequestId>4442587FB7D0A2F9</RequestId>
102# </Error>
103def AmazonError(code, message, resource, req_id=""):
104 """ builds xml tree corresponding to an Amazon error xml page """
105 root = Element("Error")
106 root.tail = root.text = "\n"
107 add_props(root, **dict(
108 Code = code,
109 Message = message,
110 Resource = resource,
111 RequestId = req_id))
112 return root
113
114# <ListAllMyBucketsResult xmlns="http://doc.s3.amazonaws.com/2006-03-01">
115# <Owner>
116# <ID>user_id</ID>
117# <DisplayName>display_name</DisplayName>
118# </Owner>
119# <Buckets>
120# <Bucket>
121# <Name>bucket_name</Name>
122# <CreationDate>date</CreationDate>
123# </Bucket>
124# ...
125# </Buckets>
126# </ListAllMyBucketsResult>
127def ListAllMyBucketsResult(owner, buckets):
128 """ builds xml tree corresponding to an Amazon list all buckets """
129 root = Element("ListAllMyBucketsResult", dict(xmlns=XMLNS))
130 root.tail = root.text = "\n"
131 xml_owner = Element("Owner")
132 add_props(xml_owner, **dict(
133 ID = owner["id"],
134 DisplayName = owner["name"] ))
135 root.append(xml_owner)
136 xml_buckets = Element("Buckets")
137 for bucket in buckets:
138 b = Element("Bucket")
139 add_props(b, **dict(
140 Name = bucket._name,
141 CreationDate = bucket._date))
142 xml_buckets.append(b)
143 root.append(xml_buckets)
144 return root
145
146if __name__ == '__main__':
147 # pylint: disable-msg=W0403
148 # pylint: disable-msg=E0611
149 from s4 import Bucket
150 bucket = Bucket("test-bucket")
151 lbr = ListBucketResult(bucket, {})
152 print to_XML(lbr)
153 print
154
155
0156
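
To see what these builders emit, feed one through to_XML(); the sample below was written out by hand from the element structure above (child order may vary between runs, since add_props() iterates a dict):

    >>> err = AmazonError("NoSuchKey",
    ...                   "The resource you requested does not exist",
    ...                   "/mybucket/myfoto.jpg", req_id="4442587FB7D0A2F9")
    >>> print to_XML(err)
    <?xml version="1.0" encoding="UTF-8"?>
    <Error>
    <Code>NoSuchKey</Code>
    <Message>The resource you requested does not exist</Message>
    <Resource>/mybucket/myfoto.jpg</Resource>
    <RequestId>4442587FB7D0A2F9</RequestId>
    </Error>
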
=== added directory 'txaws/s4/testing'
=== added file 'txaws/s4/testing/__init__.py'
=== added file 'txaws/s4/testing/testcase.py'
--- txaws/s4/testing/testcase.py 1970-01-01 00:00:00 +0000
+++ txaws/s4/testing/testcase.py 2009-08-19 14:36:56 +0000
@@ -0,0 +1,131 @@
1# Copyright 2008-2009 Canonical Ltd.
2#
3# Permission is hereby granted, free of charge, to any person obtaining
4# a copy of this software and associated documentation files (the
5# "Software"), to deal in the Software without restriction, including
6# without limitation the rights to use, copy, modify, merge, publish,
7# distribute, sublicense, and/or sell copies of the Software, and to
8# permit persons to whom the Software is furnished to do so, subject to
9# the following conditions:
10#
11# The above copyright notice and this permission notice shall be
12# included in all copies or substantial portions of the Software.
13#
14# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21
22"""Test case for S4 test server"""
23
24import os
25import tempfile
26import shutil
27
28from twisted.web import server
29from twisted.internet import reactor
30from twisted.trial.unittest import TestCase as TwistedTestCase
31
32from txaws.s4 import s4
33from boto.s3 import connection
34
35# pylint: disable-msg=W0201
36class S4TestCase(TwistedTestCase):
37 """ Base class for testing S4
38
39 This class takes care of starting a server instance for all S4 tests
40
41 As S4 is based on twisted, we inherit from TwistedTestCase.
42 As our tests are blocking, we decorate them with 'defer_to_thread'
43 so each one runs in a worker thread and returns a Deferred.
44 """
45 s3 = None
46 logfile = None
47 storagedir = None
48 active = False
49 def setUp(self):
50 """Setup method."""
51 if not self.active:
52 self.start_server()
53
54 def tearDown(self):
55 """ tear down end testcase method """
56 # dirty hack to force closing all the cruft boto might be
57 # leaving around
58 if self.s3:
59 # iterate over a copy: s3._cache.__iter__ is broken in boto
60 for key in [x for x in self.s3._cache]:
61 self.s3._cache[key].close()
62 self.s3._cache[key] = None
63 self.s3 = None
64 self.stop_server()
65
66 def connect_ok(self, access=s4.AWS_DEFAULT_ACCESS_KEY_ID,
67 secret=s4.AWS_DEFAULT_SECRET_ACCESS_KEY):
68 """ Get a valid connection to S3 (actually, to S4) """
69 if self.s3:
70 return self.s3
71 s3 = connection.S3Connection(access, secret, is_secure=False,
72 host="localhost", port=self.port,
73 calling_format=s4.CallingFormat())
74 # don't let boto do its braindead retrying for us
75 s3.num_retries = 0
76 # Need to keep track of this connection
77 self.s3 = s3
78 return s3
79
80 @property
81 def port(self):
82 """The port."""
83 return self.conn.getHost().port
84
85 def start_server(self, persistent=False):
86 """ start the S4 listening server """
87 if self.active:
88 return
89 if persistent:
90 if not self.storagedir:
91 self.storagedir = tempfile.mkdtemp(
92 prefix="test-s4-boto-", suffix="-cache")
93 root = s4.Root(storagedir=self.storagedir)
94 else:
95 root = s4.Root()
96 self.site = server.Site(root)
97 self.active = True
98 self.conn = reactor.listenTCP(0, self.site)
99
100 def stop_server(self):
101 """ stop the S4 listening server """
102 self.active = False
103 self.conn.stopListening()
104 if self.storagedir and os.path.exists(self.storagedir):
105 shutil.rmtree(self.storagedir, ignore_errors=True)
106 self.storagedir = None
107
108 def restart_server(self, persistent=False):
109 """ restarts the S4 listening server """
110 self.stop_server()
111 self.start_server(persistent=persistent)
112
113
114from twisted.internet import threads
115from twisted.python.util import mergeFunctionMetadata
116
117def defer_to_thread(function):
118 """Run in a thread and return a Deferred that fires when done."""
119 def decorated(*args, **kwargs):
120 """Run in a thread and return a Deferred that fires when done."""
121 return threads.deferToThread(function, *args, **kwargs)
122 return mergeFunctionMetadata(function, decorated)
123
124def skip_test(reason):
125 """ tag a testcase to be skipped by the test runner """
126 def deco(f):
127 """ testcase decorator """
128 f.skip = reason
129 return f
130 return deco
131
0132
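
Because boto's API blocks, test methods built on S4TestCase run in a worker thread via defer_to_thread and hand trial a Deferred. A hypothetical test (names invented for illustration) looks like:

    from txaws.s4.testing.testcase import S4TestCase, defer_to_thread

    class ExampleTest(S4TestCase):

        @defer_to_thread
        def test_roundtrip(self):
            """blocking boto calls are safe here: we are off the reactor"""
            s3 = self.connect_ok()
            bucket = s3.create_bucket("example")
            key = bucket.new_key("greeting")
            key.set_contents_from_string("hello")
            self.assertEquals(key.get_contents_as_string(), "hello")
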
=== added directory 'txaws/s4/tests'
=== added file 'txaws/s4/tests/__init__.py'
=== added file 'txaws/s4/tests/test_S4.py'
--- txaws/s4/tests/test_S4.py 1970-01-01 00:00:00 +0000
+++ txaws/s4/tests/test_S4.py 2009-08-19 14:36:56 +0000
@@ -0,0 +1,194 @@
1# Copyright 2008 Canonical Ltd.
2#
3# Permission is hereby granted, free of charge, to any person obtaining
4# a copy of this software and associated documentation files (the
5# "Software"), to deal in the Software without restriction, including
6# without limitation the rights to use, copy, modify, merge, publish,
7# distribute, sublicense, and/or sell copies of the Software, and to
8# permit persons to whom the Software is furnished to do so, subject to
9# the following conditions:
10#
11# The above copyright notice and this permission notice shall be
12# included in all copies or substantial portions of the Software.
13#
14# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21
22"""Unit tests for S4 test server"""
23
24import time
25import unittest
26
27from txaws.s4.testing.testcase import S4TestCase, defer_to_thread
28from boto.exception import S3ResponseError, BotoServerError
29
30class TestBasicObjectManipulation(S4TestCase):
31 """Tests for basic object manipulation."""
32
33 def _get_sample_key(self, s3, content, content_type=None):
34 """ cerate a new bucket and return a sample key from content """
35 bname = "test-%.2f" % time.time()
36 bucket = s3.create_bucket(bname)
37 key = bucket.new_key("sample")
38 if content_type:
39 key.content_type = content_type
40 key.set_contents_from_string(content)
41 return key
42
43 @defer_to_thread
44 def test_get(self):
45 """ Get one object """
46
47 s3 = self.connect_ok()
48 size = 30
49 b = s3.get_bucket("size")
50 m = b.get_key(str(size))
51
52 body = m.get_contents_as_string()
53 self.assertEquals(body, "0"*size)
54 self.assertEquals(m.size, size)
55 self.assertEquals(m.content_type, "text/plain")
56
57 @defer_to_thread
58 def test_get_range(self):
59 """Get part of one object"""
60
61 s3 = self.connect_ok()
62 content = '0123456789'
63 key = self._get_sample_key(s3, content)
64 size = len(content)
65
66 def _get_range(range_start, range_size=None):
67 """test range get for various ranges"""
68 if range_size:
69 range_header = {"Range" : "bytes=%s-%s" % (
70 range_start, range_start + range_size - 1 )}
71 else:
72 range_header = {"Range" : "bytes=%s-" % (range_start,)}
73 range_size = size - range_start
74 key.open_read(headers=range_header)
75 self.assertEquals(key.size, range_size)
76 self.assertEquals(key.resp.status, 206)
77 ret = key.read()
78 body = content[range_start:range_start+range_size]
79 self.assertEquals(ret, body)
80 key.close()
81 # get a test range
82 range_size = 5
83 range_start = 2
84 _get_range(range_start)
85 _get_range(range_start, range_size)
86
87 @defer_to_thread
88 def test_get_multiple_range(self):
89 """Get part of one object"""
90
91 s3 = self.connect_ok()
92 content = '0123456789'
93 size = len(content)
94 key = self._get_sample_key(s3, content)
95 range_header = {"Range" : "bytes=0-1,5-6,9-" }
96 exc = self.assertRaises(S3ResponseError, key.open_read,
97 headers=range_header)
98 self.assertEquals(exc.status, 416)
99 key.close()
100
101 @defer_to_thread
102 def test_get_illegal_range(self):
103 """make sure first-byte-pos is present"""
104
105 s3 = self.connect_ok()
106 content = '0123456789'
107 size = len(content)
108 key = self._get_sample_key(s3, content)
109 range_header = {"Range" : "bytes=-1" }
110 exc = self.assertRaises(S3ResponseError, key.open_read,
111 headers=range_header)
112 self.assertEquals(exc.status, 416)
113 key.close()
114
115 @defer_to_thread
116 def test_get_404(self):
117 """ Try to get an object thats not there, expect 404 """
118
119 s3 = self.connect_ok()
120 bname = "test-%.2f" % time.time()
121 bucket = s3.create_bucket(bname)
122 # this does not create a key on the server side yet
123 key = bucket.new_key(bname)
124 # ... which is why we should get errors when attempting to read it
125 exc = self.assertRaises(S3ResponseError, key.open_read)
126 self.assertEquals(key.resp.status, 404)
127 self.assertEquals(exc.status, 404)
128
129 @defer_to_thread
130 def test_get_403(self):
131 """ Try to get an object with invalid credentials """
132 s3 = self.connect_ok(secret="bad secret")
133 exc = self.assertRaises(S3ResponseError, s3.get_bucket, "size")
134 self.assertEquals(exc.status, 403)
135
136
137 @defer_to_thread
138 def test_discarded(self):
139 """ put an object, get a 404 """
140 s3 = self.connect_ok()
141 bucket = s3.get_bucket("discard")
142 key = bucket.new_key("sample")
143 message = "Hello World!"
144 key.content_type = "text/lame"
145 key.set_contents_from_string(message)
146 exc = self.assertRaises(S3ResponseError, key.read)
147 self.assertEquals(exc.status, 404)
148
149 @defer_to_thread
150 def test_put(self):
151 """ put an object, get it back """
152 s3 = self.connect_ok()
153
154 message = "Hello World!"
155 key = self._get_sample_key(s3, message, "text/lame")
156 for x in range(1, 10):
157 body = key.get_contents_as_string()
158 self.assertEquals(body, message*x)
159 key.set_contents_from_string(message*(x+1))
160 self.assertEquals(key.content_type, "text/lame")
161
162 @defer_to_thread
163 def test_fail_next(self):
164 """ Test whether fail_next_put works """
165 s3 = self.connect_ok()
166 message = "Hello World!"
167 key = self._get_sample_key(s3, message, "text/lamest")
168
169 # dirty poking at our own internals, but it works...
170 self.site.resource.fail_next_put()
171
172 exc = self.assertRaises(BotoServerError, key.set_contents_from_string,
173 message)
174 self.assertEquals(exc.status, 500)
175 # next one should work
176 key.set_contents_from_string(message*2)
177 body = key.get_contents_as_string()
178 self.assertEquals(body, message*2)
179
180 # now test the get fail
181 self.site.resource.fail_next_get()
182 key.set_contents_from_string(message*3)
183 exc = self.assertRaises(BotoServerError, key.read)
184 self.assertEquals(exc.status, 500)
185 # next get should work
186 body = key.get_contents_as_string()
187 self.assertEquals(body, message*3)
188
189def test_suite():
190 """Used by the rest runner to find the tests in this module"""
191 return unittest.TestLoader().loadTestsFromName(__name__)
192
193if __name__ == "__main__":
194 unittest.main()
0195
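
Note for test authors: besides per-test buckets, the two built-in buckets the tests above lean on give suites cheap fixtures. "size" serves any key str(n) as a body of n "0" bytes, and "discard" accepts writes without storing anything, so later reads 404. A hypothetical snippet inside an S4TestCase-based test:

    s3 = self.connect_ok()
    # "size": key "5" reads back as five zero bytes
    body = s3.get_bucket("size").get_key("5").get_contents_as_string()
    assert body == "0" * 5
    # "discard": the PUT succeeds, but nothing is stored
    key = s3.get_bucket("discard").new_key("scratch")
    key.set_contents_from_string("ignored")
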
=== added file 'txaws/s4/tests/test_boto.py'
--- txaws/s4/tests/test_boto.py 1970-01-01 00:00:00 +0000
+++ txaws/s4/tests/test_boto.py 2009-08-19 14:36:56 +0000
@@ -0,0 +1,275 @@
1#!/usr/bin/python
2#
3# Permission is hereby granted, free of charge, to any person obtaining
4# a copy of this software and associated documentation files (the
5# "Software"), to deal in the Software without restriction, including
6# without limitation the rights to use, copy, modify, merge, publish,
7# distribute, sublicense, and/or sell copies of the Software, and to
8# permit persons to whom the Software is furnished to do so, subject to
9# the following conditions:
10#
11# The above copyright notice and this permission notice shall be
12# included in all copies or substantial portions of the Software.
13#
14# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21
22#
23# test s4 implementation using the python-boto client
24
25"""
26imported (from boto) unit tests for the S3Connection
27"""
28import unittest
29
30import os
31import time
32import tempfile
33
34from StringIO import StringIO
35
36from txaws.s4.testing.testcase import S4TestCase, defer_to_thread, skip_test
37from boto.exception import S3PermissionsError
38
39# pylint: disable-msg=C0111
40class S3ConnectionTest(S4TestCase):
41 def _get_bucket(self, s3conn):
42 # create a new, empty bucket
43 bucket_name = 'test-%.3f' % time.time()
44 bucket = s3conn.create_bucket(bucket_name)
45 # now try a get_bucket call and see if it's really there
46 bucket = s3conn.get_bucket(bucket_name)
47 return bucket
48
49 @defer_to_thread
50 def test_basic(self):
51 T1 = 'This is a test of file upload and download'
52 s3conn = self.connect_ok()
53
54 all_buckets = s3conn.get_all_buckets()
55 bucket = self._get_bucket(s3conn)
56 all_buckets = s3conn.get_all_buckets()
57 self.failUnless(bucket.name in [x.name for x in all_buckets])
58 # bucket should be empty now
59 self.failUnlessEqual(bucket.get_key("missing"), None)
60 all_keys = bucket.get_all_keys()
61 self.failUnlessEqual(len(all_keys), 0)
62 # create a new key and store its content from a string
63 k = bucket.new_key()
64 k.name = 'foobar'
65 k.set_contents_from_string(T1)
66 fp = StringIO()
67 # now get the contents from s3 to a local file
68 k.get_contents_to_file(fp)
69 # check to make sure content read from s3 is identical to original
70 self.failUnlessEqual(T1, fp.getvalue())
71 bucket.delete_key(k)
72 self.failUnlessEqual(bucket.get_key(k.name), None)
73
74 @defer_to_thread
75 def test_lookup(self):
76 T1 = 'This is a test of file upload and download'
77 T2 = 'This is a second string to test file upload and download'
78 s3conn = self.connect_ok()
79 bucket = self._get_bucket(s3conn)
80 # create a new key and store its content from a string
81 k = bucket.new_key()
82 # test a few variations on get_all_keys - first load some data
83 # for the first one, let's override the content type
84 (fd, fname) = tempfile.mkstemp()
85 os.write(fd, T1)
86 os.close(fd)
87 phony_mimetype = 'application/x-boto-test'
88 headers = {'Content-Type': phony_mimetype}
89 k.name = 'foo/bar'
90 k.set_contents_from_string(T1, headers)
91 k.name = 'foo/bas'
92 k.set_contents_from_filename(fname)
93 k.name = 'foo/bat'
94 k.set_contents_from_string(T1)
95 k.name = 'fie/bar'
96 k.set_contents_from_string(T1)
97 k.name = 'fie/bas'
98 k.set_contents_from_string(T1)
99 k.name = 'fie/bat'
100 k.set_contents_from_string(T1)
101 # try resetting the contents to another value
102 md5 = k.md5
103 k.set_contents_from_string(T2)
104 self.failIfEqual(k.md5, md5)
105 os.unlink(fname)
106 all_keys = bucket.get_all_keys()
107 self.failUnlessEqual(len(all_keys), 6)
108 rs = bucket.get_all_keys(prefix='foo')
109 self.failUnlessEqual(len(rs), 3)
110 rs = bucket.get_all_keys(maxkeys=5)
111 self.failUnlessEqual(len(rs), 5)
112 # test the lookup method
113 k = bucket.lookup('foo/bar')
114 self.failUnless(isinstance(k, bucket.key_class))
115 self.failUnlessEqual(k.content_type, phony_mimetype)
116 k = bucket.lookup('notthere')
117 self.failUnlessEqual(k, None)
118
119 @defer_to_thread
120 def test_metadata(self):
121 T1 = 'This is a test of file upload and download'
122 s3conn = self.connect_ok()
123 bucket = self._get_bucket(s3conn)
124 # try some metadata stuff
125 k = bucket.new_key()
126 k.name = 'has_metadata'
127 mdkey1 = 'meta1'
128 mdval1 = 'This is the first metadata value'
129 k.set_metadata(mdkey1, mdval1)
130 mdkey2 = 'meta2'
131 mdval2 = 'This is the second metadata value'
132 k.set_metadata(mdkey2, mdval2)
133 k.set_contents_from_string(T1)
134 k = bucket.lookup('has_metadata')
135 self.failUnlessEqual(k.get_metadata(mdkey1), mdval1)
136 self.failUnlessEqual(k.get_metadata(mdkey2), mdval2)
137 k = bucket.new_key()
138 k.name = 'has_metadata'
139 k.get_contents_as_string()
140 self.failUnlessEqual(k.get_metadata(mdkey1), mdval1)
141 self.failUnlessEqual(k.get_metadata(mdkey2), mdval2)
142 bucket.delete_key(k)
143 # try a key with a funny character
144 rs = bucket.get_all_keys()
145 num_keys = len(rs)
146 k = bucket.new_key()
147 k.name = 'testnewline\n'
148 k.set_contents_from_string('This is a test')
149 rs = bucket.get_all_keys()
150 self.failUnlessEqual(len(rs), num_keys + 1)
151 bucket.delete_key(k)
152 rs = bucket.get_all_keys()
153 self.failUnlessEqual(len(rs), num_keys)
154
155 # tests removing objects from the store
156 @defer_to_thread
157 def test_cleanup(self):
158 s3conn = self.connect_ok()
159 bucket = self._get_bucket(s3conn)
160 for x in range(10):
161 k = bucket.new_key()
162 k.name = "foo%d" % x
163 k.set_contents_from_string("test %d" % x)
164 all_keys = bucket.get_all_keys()
165 # now delete all keys in bucket
166 for k in all_keys:
167 bucket.delete_key(k)
168 # now delete bucket
169 s3conn.delete_bucket(bucket)
170
171 @defer_to_thread
172 def test_connection(self):
173 s3conn = self.connect_ok()
174 bucket = self._get_bucket(s3conn)
175 all_buckets = s3conn.get_all_buckets()
176 size_bucket = s3conn.get_bucket("size")
177 discard_bucket = s3conn.get_bucket("discard")
178
179 @defer_to_thread
180 def test_persistence(self):
181 # pylint: disable-msg=W0631
182 # first, stop the server and restart it in persistent mode
183 self.restart_server(persistent=True)
184 s3conn = self.connect_ok()
185 for bcount in range(1, 5):
186 bucket = self._get_bucket(s3conn)
187 for kcount in range(1, 5):
188 k = bucket.new_key()
189 k.name = "bucket-%d-key-%d" % (bcount, kcount)
190 k.set_contents_from_string(
191 "This is key %d from bucket %d (%s)" %(
192 kcount, bcount, bucket.name))
193 k.set_metadata("bcount", bcount)
194 k.set_metadata("kcount", kcount)
195 # now get a list of all the buckets and objects in the store
196 all_buckets = s3conn.get_all_buckets()
197 all_objects = {}
198 for x in all_buckets:
199 if x.name in ["size", "discard"]:
200 continue
201 objset = all_objects.setdefault(x.name, set())
202 bucket = s3conn.get_bucket(x.name)
203 for obj in bucket.get_all_keys():
204 objset.add(obj)
205 # XXX: test metadata
206 # now stop the S4Server and restart it
207 self.restart_server(persistent=True)
208 new_buckets = s3conn.get_all_buckets()
209 self.failUnlessEqual(
210 set([x.name for x in all_buckets]),
211 set([x.name for x in new_buckets]) )
212 new_objects = {}
213 for x in new_buckets:
214 if x.name in ["size", "discard"]:
215 continue
216 objset = new_objects.setdefault(x.name, set())
217 bucket = s3conn.get_bucket(x.name)
218 for obj in bucket.get_all_keys():
219 objset.add(obj)
220 # XXX: test metadata
221 # test the new objects
222 self.failUnlessEqual(
223 set(all_objects.keys()),
224 set(new_objects.keys()) )
225 for key in all_objects.keys():
226 self.failUnlessEqual(
227 set([x.name for x in all_objects[key]]),
228 set([x.name for x in new_objects[key]]) )
229
230 @defer_to_thread
231 def test_size_bucket(self):
232 s3conn = self.connect_ok()
233 bucket = s3conn.get_bucket("size")
234 all_keys = bucket.get_all_keys()
235 self.failUnlessEqual(all_keys, [])
236 for size in range(1, 10**7, 10000):
237 k = bucket.get_key(str(size))
238 self.failUnlessEqual(size, k.size)
239 # try to read in the last key (should be the biggest)
240 size = 0
241 k.open("r")
242 for chunk in k:
243 size += len(chunk)
244 self.failUnlessEqual(size, k.size)
245
246 @skip_test("S4 does not have this functionality yet")
247 @defer_to_thread
248 def test_acl(self):
249 s3conn = self.connect_ok()
250 bucket = self._get_bucket(s3conn)
251 # try some acl stuff
252 bucket.set_acl('public-read')
253 policy = bucket.get_acl()
254 assert len(policy.acl.grants) == 2
255 bucket.set_acl('private')
256 policy = bucket.get_acl()
257 assert len(policy.acl.grants) == 1
258 k = bucket.lookup('foo/bar')
259 k.set_acl('public-read')
260 policy = k.get_acl()
261 assert len(policy.acl.grants) == 2
262 k.set_acl('private')
263 policy = k.get_acl()
264 assert len(policy.acl.grants) == 1
265 # try the convenience methods for grants
266 bucket.add_user_grant(
267 'FULL_CONTROL',
268 'c1e724fbfa0979a4448393c59a8c055011f739b6d102fb37a65f26414653cd67')
269 self.failUnlessRaises(S3PermissionsError, bucket.add_email_grant,
270 'foobar', 'foo@bar.com')
271
272if __name__ == '__main__':
273 suite = unittest.TestSuite()
274 suite.addTest(unittest.makeSuite(S3ConnectionTest))
275 unittest.TextTestRunner(verbosity=2).run(suite)
