Merge lp:~dooferlad/linaro-license-protection/add_api into lp:~linaro-automation/linaro-license-protection/trunk

Proposed by James Tunnicliffe
Status: Merged
Approved by: Milo Casagrande
Approved revision: 177
Merged at revision: 177
Proposed branch: lp:~dooferlad/linaro-license-protection/add_api
Merge into: lp:~linaro-automation/linaro-license-protection/trunk
Diff against target: 395 lines (+294/-6)
5 files modified
HACKING (+28/-0)
license_protected_downloads/tests/test_views.py (+104/-0)
license_protected_downloads/views.py (+67/-6)
scripts/download.py (+89/-0)
urls.py (+6/-0)
To merge this branch: bzr merge lp:~dooferlad/linaro-license-protection/add_api
Reviewer                       Review Type    Date Requested    Status
Milo Casagrande (community)                                     Approve
Review via email: mp+152717@code.launchpad.net

Description of the change

Adds an API for clients to use to download files. For sample usage see scripts/download.py.
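
As a rough illustration of the client side, the sketch below derives the two API URLs from an ordinary file URL and builds the header used to accept licenses. It is written for Python 3 (the branch itself targets Python 2), and the helper names api_url and accept_header are hypothetical; only the /api/ls and /api/license prefixes and the LICENSE_ACCEPTED header come from this branch.

```python
from urllib.parse import urlsplit, urlunsplit


def api_url(file_url, endpoint):
    """Return file_url with /api/<endpoint> prepended to its path."""
    parts = urlsplit(file_url)
    # SplitResult is a namedtuple, so _replace returns a modified copy.
    return urlunsplit(parts._replace(path="/api/" + endpoint + parts.path))


def accept_header(digests):
    """Build the request header that accepts the given license digests."""
    # Multiple digests are sent space separated in a single header.
    return {"LICENSE_ACCEPTED": " ".join(digests)}


base = "http://localhost:8001/build-info"
print(api_url(base, "ls"))
print(api_url(base + "/snowball-blob.txt", "license"))
print(accept_header(["digest-one", "digest-two"]))
```

A client would fetch the first URL to list a directory, fetch the second to read the licenses for a file, and then request the file itself with the header attached.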

Milo Casagrande (milo) wrote :

Hi James, thanks for working on this.

It looks good to go for me.
A couple of things that can be fixed during merge:

216 +# Generate the URL that will return the license information. This is the URL
217 +# if the file with /api/license prepended to the path.

I guess there is a small typo there: s/if the file/of the file/

Another thing that might be good is to add something in the HACKING file about the new changes/API.

review: Approve
178. By James Tunnicliffe

Added information about the API to the HACKING document.
Added test for the listing API.
Updated download.py to use all parts of the API. It demonstrates how to download all files in a directory, ask the user interactively whether they accept each license, and store which licenses the user has accepted so they don't have to re-accept a license they have already read.
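
The "remember accepted licenses" step can be sketched on its own as below. This is a minimal Python 3 sketch, assuming the same space-separated file format that download.py uses; the function names and the temporary demo path are hypothetical.

```python
import os
import tempfile


def load_accepted(path):
    """Return the digests stored in path, or an empty list if it is missing."""
    if not os.path.isfile(path):
        return []
    with open(path) as accepted_file:
        return accepted_file.read().split()


def save_accepted(path, digests):
    """Write the digests to path, space separated."""
    with open(path, "w") as accepted_file:
        accepted_file.write(" ".join(digests))


def digests_to_prompt(license_digests, accepted):
    """Return the digests the user has not accepted yet, in order."""
    return [d for d in license_digests if d not in accepted]


# Demo: store two accepted digests, then check which licenses still need
# to be shown to the user for a file protected by "aaa" and "ccc".
demo_path = os.path.join(tempfile.mkdtemp(), "accepted_licenses")
save_accepted(demo_path, ["aaa", "bbb"])
print(digests_to_prompt(["aaa", "ccc"], load_accepted(demo_path)))
```

Because the digest is opaque to the client, the stored file stays valid even if the server later changes how digests are generated; the user would simply be asked again for any license whose digest changed.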

179. By James Tunnicliffe

Added tests to check failure mode of API.

Preview Diff

=== modified file 'HACKING'
--- HACKING 2012-08-17 15:05:44 +0000
+++ HACKING 2013-03-12 15:05:47 +0000
@@ -94,3 +94,31 @@
 
 staging.snapshots.linaro.org
 staging.releases.linaro.org
+
+API
+---
+An API is provided to allow scripts to easily interact with the service without
+having to scrape the web interface. It is designed as a RESTful service and is
+demonstrated in scripts/download.py, which:
+
+ * Gets a directory listing using /api/ls
+   - <server>/api/ls/<path to file> returns a JSON document containing all the
+     data shown by a file listing on the web interface.
+ * Downloads each file, using the type key from /api/ls to avoid directories
+ * Displays the license that protects the file to the user by fetching it
+   using /api/license.
+   - <server>/api/license/<path to file> returns a JSON document containing
+     the licence information for the file pointed to. Both the license text
+     and the digest used to accept the license are returned. Including the
+     digest means that the choice of digest used internally can change without
+     re-writing clients - to the client this is just a magic string and they
+     don't need to care how it is generated.
+ * Stores the digest for each license that the user accepts to avoid asking
+   them to accept the same license twice.
+ * Downloads each file in the directory by providing the appropriate license
+   accept header.
+   - The existing web service stores which licenses have been accepted in a
+     cookie. This is incompatible with a stateless API and is also
+     inconvenient to manage in scripts. The license accept check function now
+     checks for a header containing accepted license digests, which scripts
+     should use.
=== modified file 'license_protected_downloads/tests/test_views.py'
--- license_protected_downloads/tests/test_views.py 2013-02-28 11:20:23 +0000
+++ license_protected_downloads/tests/test_views.py 2013-03-12 15:05:47 +0000
@@ -1,5 +1,6 @@
 __author__ = 'dooferlad'
 
+from datetime import datetime
 from django.conf import settings
 from django.test import Client, TestCase
 import hashlib
@@ -8,6 +9,7 @@
 import unittest
 import urllib2
 import urlparse
+import json
 
 from license_protected_downloads import bzr_version
 from license_protected_downloads.buildinfo import BuildInfo
@@ -191,6 +193,108 @@
         file_path = os.path.join(TESTSERVER_ROOT, target_file)
         self.assertEqual(response['X-Sendfile'], file_path)
 
+    def test_api_get_license_list(self):
+        target_file = "build-info/snowball-blob.txt"
+        digest = self.set_up_license(target_file)
+
+        license_url = "/api/license/" + target_file
+
+        # Download JSON containing license information
+        response = self.client.get(license_url)
+        data = json.loads(response.content)["licenses"]
+
+        # Extract digests
+        digests = [d["digest"] for d in data]
+
+        # Make sure digests match what is in the database
+        self.assertIn(digest, digests)
+        self.assertEqual(len(digests), 1)
+
+    def test_api_get_license_list_multi_license(self):
+        target_file = "build-info/multi-license.txt"
+        digest_1 = self.set_up_license(target_file)
+        digest_2 = self.set_up_license(target_file, 1)
+
+        license_url = "/api/license/" + target_file
+
+        # Download JSON containing license information
+        response = self.client.get(license_url)
+        data = json.loads(response.content)["licenses"]
+
+        # Extract digests
+        digests = [d["digest"] for d in data]
+
+        # Make sure digests match what is in the database
+        self.assertIn(digest_1, digests)
+        self.assertIn(digest_2, digests)
+        self.assertEqual(len(digests), 2)
+
+    def test_api_get_license_list_404(self):
+        target_file = "build-info/snowball-b"
+        license_url = "/api/license/" + target_file
+
+        # Download JSON containing license information
+        response = self.client.get(license_url)
+        self.assertEqual(response.status_code, 404)
+
+    def test_api_download_file(self):
+        target_file = "build-info/snowball-blob.txt"
+        digest = self.set_up_license(target_file)
+
+        url = urlparse.urljoin("http://testserver/", target_file)
+        response = self.client.get(url, follow=True,
+                                   HTTP_LICENSE_ACCEPTED=digest)
+        self.assertEqual(response.status_code, 200)
+        file_path = os.path.join(TESTSERVER_ROOT, target_file)
+        self.assertEqual(response['X-Sendfile'], file_path)
+
+    def test_api_download_file_multi_license(self):
+        target_file = "build-info/multi-license.txt"
+        digest_1 = self.set_up_license(target_file)
+        digest_2 = self.set_up_license(target_file, 1)
+
+        url = urlparse.urljoin("http://testserver/", target_file)
+        response = self.client.get(url, follow=True,
+            HTTP_LICENSE_ACCEPTED=" ".join([digest_1, digest_2]))
+        self.assertEqual(response.status_code, 200)
+        file_path = os.path.join(TESTSERVER_ROOT, target_file)
+        self.assertEqual(response['X-Sendfile'], file_path)
+
+    def test_api_download_file_404(self):
+        target_file = "build-info/snowball-blob.txt"
+        digest = self.set_up_license(target_file)
+
+        url = urlparse.urljoin("http://testserver/", target_file[:-2])
+        response = self.client.get(url, follow=True,
+                                   HTTP_LICENSE_ACCEPTED=digest)
+        self.assertEqual(response.status_code, 404)
+
+    def test_api_get_listing(self):
+        url = "/api/ls/build-info"
+        response = self.client.get(url)
+        self.assertEqual(response.status_code, 200)
+
+        data = json.loads(response.content)["files"]
+
+        # For each file listed, check some key attributes
+        for file_info in data:
+            file_path = os.path.join(TESTSERVER_ROOT,
+                                     file_info["url"].lstrip("/"))
+            if file_info["type"] == "folder":
+                self.assertTrue(os.path.isdir(file_path))
+            else:
+                self.assertTrue(os.path.isfile(file_path))
+
+            mtime = datetime.fromtimestamp(
+                os.path.getmtime(file_path)).strftime('%d-%b-%Y %H:%M')
+
+            self.assertEqual(mtime, file_info["mtime"])
+
+    def test_api_get_listing_404(self):
+        url = "/api/ls/buld-info"
+        response = self.client.get(url)
+        self.assertEqual(response.status_code, 404)
+
     def test_OPEN_EULA_txt(self):
         target_file = '~linaro-android/staging-vexpress-a9/test.txt'
         url = urlparse.urljoin("http://testserver/", target_file)
=== modified file 'license_protected_downloads/views.py'
--- license_protected_downloads/views.py 2013-02-28 10:05:14 +0000
+++ license_protected_downloads/views.py 2013-03-12 15:05:47 +0000
@@ -83,14 +83,14 @@
             # example), it doesn't have a mtime.
             mtime = 0
 
-        type = "other"
+        target_type = "other"
         if os.path.isdir(file):
-            type = "folder"
+            target_type = "folder"
         else:
             type_tuple = guess_type(name)
             if type_tuple and type_tuple[0]:
                 if type_tuple[0].split('/')[0] == "text":
-                    type = "text"
+                    target_type = "text"
 
         if os.path.exists(file):
             size = os.path.getsize(file)
@@ -111,7 +111,7 @@
         license_list = License.objects.all_with_hashes(license_digest_list)
         listing.append({'name': name,
                         'size': _sizeof_fmt(size),
-                        'type': type,
+                        'type': target_type,
                         'mtime': mtime,
                         'license_digest_list': license_digest_list,
                         'license_list': license_list,
@@ -298,6 +298,11 @@
 
 
 def license_accepted(request, digest):
+    license_header = "HTTP_LICENSE_ACCEPTED"
+    if license_header in request.META:
+        if digest in request.META[license_header].split():
+            return True
+
     return 'license_accepted_' + digest in request.COOKIES
 
 
@@ -387,7 +392,7 @@
     if not result:
         raise Http404
 
-    type = result[0]
+    target_type = result[0]
     path = result[1]
 
     if get_client_ip(request) in config.INTERNAL_HOSTS:
@@ -411,7 +416,7 @@
         if openid_response:
             return openid_response
 
-    if type == "dir":
+    if target_type == "dir":
         # Generate a link to the parent directory (if one exists)
         if url != '/' and url != '':
             up_dir = "/" + os.path.split(url)[0]
@@ -496,3 +501,59 @@
         raise
 
     return HttpResponse(data)
+
+
+def list_files_api(request, path):
+    url = path
+    result = test_path(path)
+    if not result:
+        raise Http404
+
+    target_type = result[0]
+    path = result[1]
+
+    if target_type == "dir":
+        listing = dir_list(url, path)
+
+        clean_listing = []
+        for entry in listing:
+            if len(entry["license_list"]) == 0:
+                entry["license_list"] = ["Open"]
+
+            clean_listing.append({
+                "name": entry["name"],
+                "size": entry["size"],
+                "type": entry["type"],
+                "mtime": entry["mtime"],
+                "url": entry["url"],
+            })
+
+        data = json.dumps({"files": clean_listing})
+    else:
+        data = json.dumps({"files": ["File not found."]})
+
+    return HttpResponse(data, mimetype='application/json')
+
+
+def get_license_api(request, path):
+    result = test_path(path)
+    if not result:
+        raise Http404
+
+    target_type = result[0]
+    path = result[1]
+
+    if target_type == "dir":
+        data = json.dumps({"licenses":
+                           ["File not found."]})
+    else:
+        license_digest_list = is_protected(path)
+        license_list = License.objects.all_with_hashes(license_digest_list)
+        if len(license_list) == 0:
+            license_list = ["Open"]
+        else:
+            license_list = [{"text": l.text, "digest": l.digest}
+                            for l in license_list]
+        data = json.dumps({"licenses": license_list})
+
+    return HttpResponse(data, mimetype='application/json')
=== added file 'scripts/download.py'
--- scripts/download.py 1970-01-01 00:00:00 +0000
+++ scripts/download.py 2013-03-12 15:05:47 +0000
@@ -0,0 +1,89 @@
+#!/usr/bin/python
+
+import json
+import urlparse
+import shutil
+import urllib2
+import os
+from html2text import html2text
+
+# Example of how to use the API to download all files in a directory. This is
+# written as one procedural script without functions
+directory_url = "http://localhost:8001/build-info"
+
+# Generate the URL that will return the license information. This is the URL
+# of the file with /api/license prepended to the path.
+
+# Unfortunately urlsplit returns an immutable object. Convert it to an array
+# so we can modify the path section (index 2)
+parsed_url = [c for c in urlparse.urlsplit(directory_url)]
+url_path_section = parsed_url[2]
+
+parsed_url[2] = "/api/ls" + url_path_section
+listing_url = urlparse.urlunsplit(parsed_url)
+
+u = urllib2.urlopen(listing_url)
+data = json.loads(u.read())["files"]
+
+for file_info in data:
+    if file_info["type"] == "folder":
+        # Skip folders...
+        continue
+
+    parsed_url[2] = "/api/license" + file_info["url"]
+    license_url = urlparse.urlunsplit(parsed_url)
+
+    parsed_url[2] = file_info["url"]
+    file_url = urlparse.urlunsplit(parsed_url)
+
+    # Get the licenses. They are returned as a JSON document in the form:
+    # {"licenses":
+    #  [{"text": "<license text>", "digest": "<digest of license>"},
+    #   {"text": "<license text>", "digest": "<digest of license>"},
+    #   ...
+    #  ]}
+    # Each license has a digest associated with it.
+    u = urllib2.urlopen(license_url)
+    data = json.loads(u.read())["licenses"]
+
+    if data[0] == "Open":
+        headers = {}
+    else:
+        # If this were a command line client designed to ask the user to accept
+        # each license, you could use this code to ask the user to accept each
+        # license in turn. In this example we store which licenses are accepted
+        # so the user only has to accept them once.
+        if os.path.isfile("accepted_licenses"):
+            with open("accepted_licenses") as accepted_licenses_file:
+                accepted_licenses = accepted_licenses_file.read().split()
+        else:
+            accepted_licenses = []
+
+        # Present each license to the user...
+        for d in data:
+            if d["digest"] not in accepted_licenses:
+                # Licenses are stored as HTML. Convert them to markdown (text)
+                # and print it to the terminal.
+                print html2text(d["text"])
+
+                # Ask the user if they accept the license. If they don't we
+                # terminate the script.
+                user_response = raw_input("Do you accept this license? (y/N)")
+                if user_response != "y":
+                    exit(1)
+
+                accepted_licenses.append(d["digest"])
+
+        # Store the licenses that the user accepted
+        with open("accepted_licenses", "w") as accepted_licenses_file:
+            accepted_licenses_file.write(" ".join(accepted_licenses))
+
+        # To accept a license, place the digest in the LICENSE_ACCEPTED header.
+        # For multiple licenses, they are stored space separated.
+        digests = [d["digest"] for d in data]
+        headers = {"LICENSE_ACCEPTED": " ".join(digests)}
+
+    # Once the header has been generated, just download the file.
+    req = urllib2.urlopen(urllib2.Request(file_url, headers=headers))
+    with open(os.path.basename(parsed_url[2]), 'wb') as fp:
+        shutil.copyfileobj(req, fp)
=== modified file 'urls.py'
--- urls.py 2013-02-26 19:38:30 +0000
+++ urls.py 2013-03-12 15:05:47 +0000
@@ -46,6 +46,12 @@
         'license_protected_downloads.views.get_textile_files',
         name='get_textile_files'),
 
+    url(r'^api/ls/(?P<path>.*)$',
+        'license_protected_downloads.views.list_files_api'),
+
+    url(r'^api/license/(?P<path>.*)$',
+        'license_protected_downloads.views.get_license_api'),
+
     # Catch-all. We always return a file (or try to) if it exists.
     # This handler does that.
     url(r'(?P<path>.*)', 'license_protected_downloads.views.file_server'),
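
For readers who want to try the header check from views.py outside Django, here is a minimal standalone sketch of the same logic. It assumes plain dictionaries in place of request.META and request.COOKIES, and the three-argument signature is hypothetical; the real function takes the request object.

```python
def license_accepted(meta, cookies, digest):
    """Return True if digest was accepted via the header or a cookie."""
    # Django exposes a LICENSE_ACCEPTED request header under this META key.
    accepted = meta.get("HTTP_LICENSE_ACCEPTED", "").split()
    if digest in accepted:
        return True
    # Fall back to the cookie set by the web interface.
    return "license_accepted_" + digest in cookies


print(license_accepted({"HTTP_LICENSE_ACCEPTED": "abc def"}, {}, "def"))
print(license_accepted({}, {"license_accepted_abc": "1"}, "abc"))
```

Splitting the header before the membership test means a digest only matches as a whole token, so a digest that happens to be a substring of another accepted digest is not treated as accepted.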
