Merge lp:~nigelbabu/summit/stop-screen-scrape into lp:summit

Proposed by Nigel Babu
Status: Merged
Approved by: Nigel Babu
Approved revision: 118
Merged at revision: 121
Proposed branch: lp:~nigelbabu/summit/stop-screen-scrape
Merge into: lp:summit
Diff against target: 45 lines (+7/-14)
1 file modified
summit/schedule/models/meetingmodel.py (+7/-14)
To merge this branch: bzr merge lp:~nigelbabu/summit/stop-screen-scrape
Reviewer                   Review Type   Date Requested   Status
James Westby (community)                                  Approve
Summit Hackers                                            Pending
Review via email: mp+63490@code.launchpad.net

Commit message

Stop the screen scrape and use the JSON API instead.

Description of the change

Stop the screen scrape and use the JSON API instead.

James Westby (james-w) wrote :

Hi,

The assumptions this makes about the API aren't great, but it's
only really as fragile as what was there before, so there's
no reason it shouldn't go in.

Thanks,

James

review: Approve

Preview Diff

=== modified file 'summit/schedule/models/meetingmodel.py'
--- summit/schedule/models/meetingmodel.py 2011-05-22 19:14:22 +0000
+++ summit/schedule/models/meetingmodel.py 2011-06-05 14:25:54 +0000
@@ -18,14 +18,13 @@
 from urlparse import urlparse
 from datetime import datetime
 
-from BeautifulSoup import BeautifulSoup
-
 from django.db import models
 from django.core.exceptions import ObjectDoesNotExist
 from django.conf import settings
 from django.core.urlresolvers import reverse
 #from summit.brainstorm.models import Idea
 
+import simplejson as json
 from summit.schedule.fields import NameField
 
 from summit.schedule.models.summitmodel import Summit
@@ -204,19 +203,13 @@
 self.spec_url = spec_url
 
 # fetch to update the title
- # if only Launchpad had some kind of API which meant we didn't
- # have to screenscrape... :-/
+ # Now, LP has an API, we're stopping the sceen scrape :)
 try:
- lpdata = urllib2.urlopen(spec_url)
- soup = BeautifulSoup(lpdata)
- title = soup('h1')[0]('span')[0].string.replace(
- "&lt;", "<").replace(
- "&gt;", ">").replace(
- "&apos;", "'").replace(
- "&quot;", '"').replace(
- "&amp;", "&").rstrip()
- if title:
- self.title = title[:100]
+ apiurl = spec_url.replace('blueprints.launchpad.net', 'api.launchpad.net/devel') + '?ws.accept=application/json'
+ lpdata = urllib2.urlopen(apiurl)
+ data = json.loads(lpdata)
+ if 'title' in data:
+ self.title = data['title'][:100]
 except (urllib2.HTTPError, KeyError):
 pass
 else:
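
The URL mapping and title extraction in the added lines can be sketched in isolation, which may help when judging how fragile the API assumptions are. This is only a minimal sketch, not the merged code: it is written for Python 3 with the standard `json` module rather than `urllib2` and `simplejson`, the helper names `api_url_for` and `title_from_body` are hypothetical, and it assumes (as the diff does) that the API path is the blueprint path prefixed with `/devel` and that the response is a JSON object with a `title` key.

```python
import json
from urllib.parse import urlsplit, urlunsplit

# Host names and query string are taken from the diff; the helper
# names below are hypothetical and exist only for illustration.
BLUEPRINTS_HOST = "blueprints.launchpad.net"
API_HOST = "api.launchpad.net"


def api_url_for(spec_url):
    """Map a blueprint page URL to the JSON API URL built in the diff."""
    parts = urlsplit(spec_url)
    if parts.netloc != BLUEPRINTS_HOST:
        # Fail loudly instead of requesting an unrelated URL.
        raise ValueError("not a %s URL: %r" % (BLUEPRINTS_HOST, spec_url))
    return urlunsplit((
        parts.scheme,
        API_HOST,
        "/devel" + parts.path,
        "ws.accept=application/json",
        "",
    ))


def title_from_body(body, max_length=100):
    """Return the title from a JSON response body, truncated, or None."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    try:
        data = json.loads(body)
    except ValueError:
        # Covers malformed JSON (json.JSONDecodeError is a ValueError).
        return None
    if not isinstance(data, dict):
        return None
    title = data.get("title")
    if not isinstance(title, str):
        return None
    return title[:max_length]
```

In the model, the body passed to `title_from_body` would come from the response's `read()` method, since `json.loads` expects a string rather than a file-like object. Parsing the URL instead of using a plain string replace keeps a non-blueprint `spec_url` from being fetched unchanged.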
