Merge lp:~dooferlad/linaro-license-protection/add_api into lp:~linaro-automation/linaro-license-protection/trunk

Proposed by James Tunnicliffe
Status: Merged
Approved by: Milo Casagrande
Approved revision: 177
Merged at revision: 177
Proposed branch: lp:~dooferlad/linaro-license-protection/add_api
Merge into: lp:~linaro-automation/linaro-license-protection/trunk
Diff against target: 395 lines (+294/-6)
5 files modified
HACKING (+28/-0)
license_protected_downloads/tests/test_views.py (+104/-0)
license_protected_downloads/views.py (+67/-6)
scripts/download.py (+89/-0)
urls.py (+6/-0)
To merge this branch: bzr merge lp:~dooferlad/linaro-license-protection/add_api
Reviewer: Milo Casagrande (community)
Review status: Approve
Review via email: mp+152717@code.launchpad.net

Description of the change

Adds an API for clients to use to download files. For sample usage see scripts/download.py.
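
As a rough illustration of the URL scheme this branch introduces, the listing and license endpoints are the target's path with /api/ls or /api/license prepended. The sketch below is Python 3 (scripts/download.py itself is Python 2 and does this inline); the helper name `api_url` is hypothetical, and the host and paths are taken from the sample script and tests.

```python
from urllib.parse import urlsplit, urlunsplit


def api_url(url, endpoint):
    """Return `url` with /api/<endpoint> prepended to its path.

    Hypothetical helper for illustration; not part of the branch.
    """
    parts = urlsplit(url)
    # SplitResult is a namedtuple, so _replace returns a modified copy.
    new_path = "/api/" + endpoint + parts.path
    return urlunsplit(parts._replace(path=new_path))


listing_url = api_url("http://localhost:8001/build-info", "ls")
license_url = api_url(
    "http://localhost:8001/build-info/snowball-blob.txt", "license")
```

Using `_replace` avoids converting the split result to a list just to change the path component.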

Milo Casagrande (milo) wrote :

Hi James, thanks for working on this.

It looks good to go for me.
A couple of things that can be fixed during merge:

216 +# Generate the URL that will return the license information. This is the URL
217 +# if the file with /api/license prepended to the path.

I guess there is a small typo there: s/if the file/of the file

Another thing that might be good is to add something in the HACKING file about the new changes/API.

review: Approve
178. By James Tunnicliffe

Added information about the API to the HACKING document.
Added test for the listing API.
Updated download.py to use all parts of the API, demonstrating how to download all files in a directory, ask the user interactively whether they accept each license, and store which licenses the user accepts so they don't have to re-accept a license they have already read.
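
The accept-once behaviour described in this revision can be sketched roughly as follows. This is a Python 3 approximation, not the code from the branch: the function names and the `ask` callback are hypothetical, while the space-separated file format and the LICENSE_ACCEPTED header follow what scripts/download.py does.

```python
import os


def load_accepted(path):
    """Read space-separated digests from `path`; empty list if absent."""
    if not os.path.isfile(path):
        return []
    with open(path) as f:
        return f.read().split()


def accept_licenses(licenses, path, ask):
    """Return a LICENSE_ACCEPTED header dict, or None if the user declines.

    `licenses` is the "licenses" list returned by /api/license for a
    protected file. `ask` is a hypothetical callback that shows a license
    text to the user and returns True or False.
    """
    accepted = load_accepted(path)
    for lic in licenses:
        if lic["digest"] in accepted:
            continue  # already accepted in an earlier run
        if not ask(lic["text"]):
            return None
        accepted.append(lic["digest"])
    # Persist the digests so the user is not asked again next time.
    with open(path, "w") as f:
        f.write(" ".join(accepted))
    # Multiple digests are sent space separated in a single header.
    return {"LICENSE_ACCEPTED": " ".join(l["digest"] for l in licenses)}
```

A caller would pass the returned dict as the request headers when fetching the file itself, and stop if it gets None back.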

179. By James Tunnicliffe

Added tests to check failure mode of API.

Preview Diff

1=== modified file 'HACKING'
2--- HACKING 2012-08-17 15:05:44 +0000
3+++ HACKING 2013-03-12 15:05:47 +0000
4@@ -94,3 +94,31 @@
5
6 staging.snapshots.linaro.org
7 staging.releases.linaro.org
8+
9+API
10+---
11+An API is provided to allow scripts to easily interact with the service without
12+having to scrape the web interface. It is designed as a RESTful service and is
13+demonstrated in scripts/download.py, which:
14+
15+ * Gets a directory listing using /api/ls
16+ - <server>/api/ls/<path to file> returns a JSON document containing all the
17+ data shown by a file listing on the web interface.
18+ * Downloads each file, using the type key from /api/ls to avoid directories
19+ * Displays the license that protects the file to the user by fetching it
20+ using /api/license.
21+ - <server>/api/license/<path to file> returns a JSON document containing
22+ the licence information for the file pointed to. Both the license text
23+ and the digest used to accept the license are returned. Including the
24+ digest means that the choice of digest used internally can change without
25+ re-writing clients - to the client this is just a magic string and they
26+ don't need to care how it is generated.
27+ * Stores the digest for each license that the user accepts to avoid asking
28+ them to accept the same license twice.
29+ * Downloads each file in the directory by providing the appropriate license
30+ accept header.
31+ - The existing web service stores which licenses have been accepted in a
32+ cookie. This is incompatible with a stateless API and is also
33+ inconvenient to manage in scripts. The license accept check function now
34+ checks for a header containing accepted license digests, which scripts
35+ should use.
36
37=== modified file 'license_protected_downloads/tests/test_views.py'
38--- license_protected_downloads/tests/test_views.py 2013-02-28 11:20:23 +0000
39+++ license_protected_downloads/tests/test_views.py 2013-03-12 15:05:47 +0000
40@@ -1,5 +1,6 @@
41 __author__ = 'dooferlad'
42
43+from datetime import datetime
44 from django.conf import settings
45 from django.test import Client, TestCase
46 import hashlib
47@@ -8,6 +9,7 @@
48 import unittest
49 import urllib2
50 import urlparse
51+import json
52
53 from license_protected_downloads import bzr_version
54 from license_protected_downloads.buildinfo import BuildInfo
55@@ -191,6 +193,108 @@
56 file_path = os.path.join(TESTSERVER_ROOT, target_file)
57 self.assertEqual(response['X-Sendfile'], file_path)
58
59+ def test_api_get_license_list(self):
60+ target_file = "build-info/snowball-blob.txt"
61+ digest = self.set_up_license(target_file)
62+
63+ license_url = "/api/license/" + target_file
64+
65+ # Download JSON containing license information
66+ response = self.client.get(license_url)
67+ data = json.loads(response.content)["licenses"]
68+
69+ # Extract digests
70+ digests = [d["digest"] for d in data]
71+
72+ # Make sure digests match what is in the database
73+ self.assertIn(digest, digests)
74+ self.assertEqual(len(digests), 1)
75+
76+ def test_api_get_license_list_multi_license(self):
77+ target_file = "build-info/multi-license.txt"
78+ digest_1 = self.set_up_license(target_file)
79+ digest_2 = self.set_up_license(target_file, 1)
80+
81+ license_url = "/api/license/" + target_file
82+
83+ # Download JSON containing license information
84+ response = self.client.get(license_url)
85+ data = json.loads(response.content)["licenses"]
86+
87+ # Extract digests
88+ digests = [d["digest"] for d in data]
89+
90+ # Make sure digests match what is in the database
91+ self.assertIn(digest_1, digests)
92+ self.assertIn(digest_2, digests)
93+ self.assertEqual(len(digests), 2)
94+
95+ def test_api_get_license_list_404(self):
96+ target_file = "build-info/snowball-b"
97+ license_url = "/api/license/" + target_file
98+
99+ # Download JSON containing license information
100+ response = self.client.get(license_url)
101+ self.assertEqual(response.status_code, 404)
102+
103+ def test_api_download_file(self):
104+ target_file = "build-info/snowball-blob.txt"
105+ digest = self.set_up_license(target_file)
106+
107+ url = urlparse.urljoin("http://testserver/", target_file)
108+ response = self.client.get(url, follow=True,
109+ HTTP_LICENSE_ACCEPTED=digest)
110+ self.assertEqual(response.status_code, 200)
111+ file_path = os.path.join(TESTSERVER_ROOT, target_file)
112+ self.assertEqual(response['X-Sendfile'], file_path)
113+
114+ def test_api_download_file_multi_license(self):
115+ target_file = "build-info/multi-license.txt"
116+ digest_1 = self.set_up_license(target_file)
117+ digest_2 = self.set_up_license(target_file, 1)
118+
119+ url = urlparse.urljoin("http://testserver/", target_file)
120+ response = self.client.get(url, follow=True,
121+ HTTP_LICENSE_ACCEPTED=" ".join([digest_1, digest_2]))
122+ self.assertEqual(response.status_code, 200)
123+ file_path = os.path.join(TESTSERVER_ROOT, target_file)
124+ self.assertEqual(response['X-Sendfile'], file_path)
125+
126+ def test_api_download_file_404(self):
127+ target_file = "build-info/snowball-blob.txt"
128+ digest = self.set_up_license(target_file)
129+
130+ url = urlparse.urljoin("http://testserver/", target_file[:-2])
131+ response = self.client.get(url, follow=True,
132+ HTTP_LICENSE_ACCEPTED=digest)
133+ self.assertEqual(response.status_code, 404)
134+
135+ def test_api_get_listing(self):
136+ url = "/api/ls/build-info"
137+ response = self.client.get(url)
138+ self.assertEqual(response.status_code, 200)
139+
140+ data = json.loads(response.content)["files"]
141+
142+ # For each file listed, check some key attributes
143+ for file_info in data:
144+ file_path = os.path.join(TESTSERVER_ROOT,
145+ file_info["url"].lstrip("/"))
146+ if file_info["type"] == "folder":
147+ self.assertTrue(os.path.isdir(file_path))
148+ else:
149+ self.assertTrue(os.path.isfile(file_path))
150+
151+ mtime = datetime.fromtimestamp(
152+ os.path.getmtime(file_path)).strftime('%d-%b-%Y %H:%M')
153+
154+ self.assertEqual(mtime, file_info["mtime"])
155+
156+ def test_api_get_listing_404(self):
157+ url = "/api/ls/buld-info"
158+ response = self.client.get(url)
159+ self.assertEqual(response.status_code, 404)
160+
161 def test_OPEN_EULA_txt(self):
162 target_file = '~linaro-android/staging-vexpress-a9/test.txt'
163 url = urlparse.urljoin("http://testserver/", target_file)
164
165=== modified file 'license_protected_downloads/views.py'
166--- license_protected_downloads/views.py 2013-02-28 10:05:14 +0000
167+++ license_protected_downloads/views.py 2013-03-12 15:05:47 +0000
168@@ -83,14 +83,14 @@
169 # example), it doesn't have a mtime.
170 mtime = 0
171
172- type = "other"
173+ target_type = "other"
174 if os.path.isdir(file):
175- type = "folder"
176+ target_type = "folder"
177 else:
178 type_tuple = guess_type(name)
179 if type_tuple and type_tuple[0]:
180 if type_tuple[0].split('/')[0] == "text":
181- type = "text"
182+ target_type = "text"
183
184 if os.path.exists(file):
185 size = os.path.getsize(file)
186@@ -111,7 +111,7 @@
187 license_list = License.objects.all_with_hashes(license_digest_list)
188 listing.append({'name': name,
189 'size': _sizeof_fmt(size),
190- 'type': type,
191+ 'type': target_type,
192 'mtime': mtime,
193 'license_digest_list': license_digest_list,
194 'license_list': license_list,
195@@ -298,6 +298,11 @@
196
197
198 def license_accepted(request, digest):
199+ license_header = "HTTP_LICENSE_ACCEPTED"
200+ if license_header in request.META:
201+ if digest in request.META[license_header].split():
202+ return True
203+
204 return 'license_accepted_' + digest in request.COOKIES
205
206
207@@ -387,7 +392,7 @@
208 if not result:
209 raise Http404
210
211- type = result[0]
212+ target_type = result[0]
213 path = result[1]
214
215 if get_client_ip(request) in config.INTERNAL_HOSTS:
216@@ -411,7 +416,7 @@
217 if openid_response:
218 return openid_response
219
220- if type == "dir":
221+ if target_type == "dir":
222 # Generate a link to the parent directory (if one exists)
223 if url != '/' and url != '':
224 up_dir = "/" + os.path.split(url)[0]
225@@ -496,3 +501,59 @@
226 raise
227
228 return HttpResponse(data)
229+
230+
231+def list_files_api(request, path):
232+ url = path
233+ result = test_path(path)
234+ if not result:
235+ raise Http404
236+
237+ target_type = result[0]
238+ path = result[1]
239+
240+ if target_type == "dir":
241+ listing = dir_list(url, path)
242+
243+ clean_listing = []
244+ for entry in listing:
245+ if len(entry["license_list"]) == 0:
246+ entry["license_list"] = ["Open"]
247+
248+ clean_listing.append({
249+ "name": entry["name"],
250+ "size": entry["size"],
251+ "type": entry["type"],
252+ "mtime": entry["mtime"],
253+ "url": entry["url"],
254+ })
255+
256+ data = json.dumps({"files": clean_listing})
257+ else:
258+ data = json.dumps({"files": ["File not found."]})
259+
260+ return HttpResponse(data, mimetype='application/json')
261+
262+
263+def get_license_api(request, path):
264+ result = test_path(path)
265+ if not result:
266+ raise Http404
267+
268+ target_type = result[0]
269+ path = result[1]
270+
271+ if target_type == "dir":
272+ data = json.dumps({"licenses":
273+ ["File not found."]})
274+ else:
275+ license_digest_list = is_protected(path)
276+ license_list = License.objects.all_with_hashes(license_digest_list)
277+ if len(license_list) == 0:
278+ license_list = ["Open"]
279+ else:
280+ license_list = [{"text": l.text, "digest": l.digest}
281+ for l in license_list]
282+ data = json.dumps({"licenses": license_list})
283+
284+ return HttpResponse(data, mimetype='application/json')
285
286=== added file 'scripts/download.py'
287--- scripts/download.py 1970-01-01 00:00:00 +0000
288+++ scripts/download.py 2013-03-12 15:05:47 +0000
289@@ -0,0 +1,89 @@
290+#!/usr/bin/python
291+
292+import json
293+import urlparse
294+import shutil
295+import urllib2
296+import os
297+from html2text import html2text
298+
299+# Example of how to use the API to download all files in a directory. This is
300+# written as one procedural script without functions
301+directory_url = "http://localhost:8001/build-info"
302+
303+# Generate the URL that will return the license information. This is the URL
304+# of the file with /api/license prepended to the path.
305+
306+# Unfortunately urlsplit returns an immutable object. Convert it to an array
307+# so we can modify the path section (index 2)
308+parsed_url = [c for c in urlparse.urlsplit(directory_url)]
309+url_path_section = parsed_url[2]
310+
311+parsed_url[2] = "/api/ls" + url_path_section
312+listing_url = urlparse.urlunsplit(parsed_url)
313+
314+u = urllib2.urlopen(listing_url)
315+data = json.loads(u.read())["files"]
316+
317+for file_info in data:
318+ if file_info["type"] == "folder":
319+ # Skip folders...
320+ continue
321+
322+ parsed_url[2] = "/api/license" + file_info["url"]
323+ license_url = urlparse.urlunsplit(parsed_url)
324+
325+ parsed_url[2] = file_info["url"]
326+ file_url = urlparse.urlunsplit(parsed_url)
327+
328+ # Get the licenses. They are returned as a JSON document in the form:
329+ # {"licenses":
330+ # [{"text": "<license text>", "digest": "<digest of license>"},
331+ # {"text": "<license text>", "digest": "<digest of license>"},
332+ # ...
333+ # ]}
334+ # Each license has a digest associated with it.
335+ u = urllib2.urlopen(license_url)
336+ data = json.loads(u.read())["licenses"]
337+
338+ if data[0] == "Open":
339+ headers = {}
340+ else:
341+ # If this were a command line client designed to ask the user to accept
342+ # each license, you could use this code to ask the user to accept each
343+ # license in turn. In this example we store which licenses are accepted
344+ # so the user only has to accept them once.
345+ if os.path.isfile("accepted_licenses"):
346+ with open("accepted_licenses") as accepted_licenses_file:
347+ accepted_licenses = accepted_licenses_file.read().split()
348+ else:
349+ accepted_licenses = []
350+
351+ # Present each license to the user...
352+ for d in data:
353+ if d["digest"] not in accepted_licenses:
354+ # Licenses are stored as HTML. Convert them to markdown (text)
355+ # and print it to the terminal.
356+ print html2text(d["text"])
357+
358+ # Ask the user if they accept the license. If they don't we
359+ # terminate the script.
360+ user_response = raw_input("Do you accept this license? (y/N)")
361+ if user_response != "y":
362+ exit(1)
363+
364+ accepted_licenses.append(d["digest"])
365+
366+ # Store the licenses that the user accepted
367+ with open("accepted_licenses", "w") as accepted_licenses_file:
368+ accepted_licenses_file.write(" ".join(accepted_licenses))
369+
370+ # To accept a license, place the digest in the LICENSE_ACCEPTED header.
371+ # For multiple licenses, they are stored space separated.
372+ digests = [d["digest"] for d in data]
373+ headers = {"LICENSE_ACCEPTED": " ".join(digests)}
374+
375+ # Once the header has been generated, just download the file.
376+ req = urllib2.urlopen(urllib2.Request(file_url, headers=headers))
377+ with open(os.path.basename(parsed_url[2]), 'wb') as fp:
378+ shutil.copyfileobj(req, fp)
379
380=== modified file 'urls.py'
381--- urls.py 2013-02-26 19:38:30 +0000
382+++ urls.py 2013-03-12 15:05:47 +0000
383@@ -46,6 +46,12 @@
384 'license_protected_downloads.views.get_textile_files',
385 name='get_textile_files'),
386
387+ url(r'^api/ls/(?P<path>.*)$',
388+ 'license_protected_downloads.views.list_files_api'),
389+
390+ url(r'^api/license/(?P<path>.*)$',
391+ 'license_protected_downloads.views.get_license_api'),
392+
393 # Catch-all. We always return a file (or try to) if it exists.
394 # This handler does that.
395 url(r'(?P<path>.*)', 'license_protected_downloads.views.file_server'),
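
To make the server-side change in views.py easier to follow in isolation, the header check added to license_accepted can be approximated without Django as below. The function takes plain dicts in place of request.META and request.COOKIES, which is an assumption made to keep the example self-contained; the real view reads both from the Django request object.

```python
LICENSE_HEADER = "HTTP_LICENSE_ACCEPTED"


def license_accepted(meta, cookies, digest):
    """Stand-alone approximation of the check in views.license_accepted.

    `meta` and `cookies` are plain dicts standing in for request.META and
    request.COOKIES (an assumption for illustration).
    """
    # API clients send accepted digests, space separated, in one header.
    if digest in meta.get(LICENSE_HEADER, "").split():
        return True
    # Browsers keep using the per-license cookie set by the web interface.
    return "license_accepted_" + digest in cookies
```

Django exposes request headers in META with an HTTP_ prefix, which is why the sample script sends LICENSE_ACCEPTED while the view looks up HTTP_LICENSE_ACCEPTED. Splitting on whitespace before the membership test means a digest only matches as a whole token, not as a substring of another digest.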
