Merge lp:~abentley/charmworld/bulk-insert into lp:~juju-jitsu/charmworld/trunk

Proposed by Aaron Bentley
Status: Merged
Approved by: Aaron Bentley
Approved revision: 231
Merged at revision: 232
Proposed branch: lp:~abentley/charmworld/bulk-insert
Merge into: lp:~juju-jitsu/charmworld/trunk
Diff against target: 52 lines (+22/-2)
2 files modified
charmworld/search.py (+7/-2)
charmworld/tests/test_search.py (+15/-0)
To merge this branch: bzr merge lp:~abentley/charmworld/bulk-insert
Reviewer                   Review Type   Date Requested   Status
Abel Deuring (community)   code                           Approve
Review via email: mp+165110@code.launchpad.net

Commit message

Use bulk-insert to optimize reindexing.

Description of the change

Reindexing charms takes multiple seconds to complete. Ideally, this would be a faster operation, since it may be needed whenever the code changes. By switching from one insert per charm to a single bulk insert, the operation completes in under a second for 1000 charms.
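
As a rough illustration of why the bulk path is cheaper, the sketch below assembles a payload in the newline-delimited JSON shape that Elasticsearch's _bulk endpoint accepts: one action line plus one source line per document, all sent in a single request. The helper name build_bulk_body and the sample charms are hypothetical; the branch itself leaves this work to the client library's bulk_index call.

```python
import json


def build_bulk_body(index_name, doc_type, docs, id_field='_id'):
    """Build a newline-delimited JSON payload for a bulk index request.

    Hypothetical helper for illustration only; not part of charmworld.
    """
    lines = []
    for doc in docs:
        # Action line: says which index, type and id the next line goes to.
        action = {'index': {'_index': index_name, '_type': doc_type,
                            '_id': doc[id_field]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return '\n'.join(lines) + '\n'


if __name__ == '__main__':
    charms = [{'_id': 'a', 'name': 'foo'}, {'_id': 'b', 'name': 'bar'}]
    print(build_bulk_body('charms', 'charm', charms))
```

With N charms this is one HTTP round trip carrying 2N lines, rather than N separate requests.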

For some reason, Elasticsearch treats a zero-length array as an error instead of a no-op, so ElasticSearchClient.index_charms special-cases it.
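
The guard can be sketched in isolation as follows. FakeBulkClient is a hypothetical stand-in that is assumed to reject an empty batch the way the real server does; the ValueError is a placeholder, not the actual exception type. The sketch also materializes its input with list(), which the branch does not do, so that a generator can be checked for emptiness.

```python
class FakeBulkClient:
    """Hypothetical stand-in for the search client used in this sketch."""

    def __init__(self):
        self.requests = []

    def bulk_index(self, index, doc_type, docs, id_field='id', refresh=False):
        # Assumption: mimic the server treating an empty batch as an error.
        if not docs:
            raise ValueError('empty bulk request')
        self.requests.append((index, doc_type, list(docs)))


def index_charms(client, index_name, charms):
    """Index all charms in one bulk request, skipping empty input."""
    # Materialize so that generators can be tested for emptiness too.
    charms = list(charms)
    if not charms:
        return  # nothing to send, so avoid the empty-request error
    client.bulk_index(index_name, 'charm', charms, '_id', refresh=True)


if __name__ == '__main__':
    client = FakeBulkClient()
    index_charms(client, 'charms', [])  # no request is made
    index_charms(client, 'charms', [{'_id': 'a'}, {'_id': 'b'}])
    print(len(client.requests))
```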

Abel Deuring (adeuring) wrote :

looks good

review: Approve (code)

Preview Diff

=== modified file 'charmworld/search.py'
--- charmworld/search.py 2013-05-17 14:02:17 +0000
+++ charmworld/search.py 2013-05-22 13:15:34 +0000
@@ -150,6 +150,12 @@
         self._client.index(self.index_name, 'charm', charm, charm_id,
                            refresh=True)
 
+    def index_charms(self, charms):
+        if len(charms) == 0:
+            return
+        self._client.bulk_index(self.index_name, 'charm', charms, '_id',
+                                refresh=True)
+
     @staticmethod
     def _get_text_query(text):
         boosted_fields = [field + ('' if boost is None else '^%d' % boost)
@@ -359,8 +365,7 @@
         copy.create_index()
         try:
             copy.put_mapping()
-            for charm in self.api_search(valid_charms_only=False):
-                copy.index_charm(charm)
+            self.index_charms(self.api_search(valid_charms_only=False))
             return copy
         except:
             copy.delete_index()

=== modified file 'charmworld/tests/test_search.py'
--- charmworld/tests/test_search.py 2013-05-17 17:05:49 +0000
+++ charmworld/tests/test_search.py 2013-05-22 13:15:34 +0000
@@ -61,6 +61,21 @@
         self.index_client.index_charm(foo_charm)
         self.assertTrue(self.exists_in_index(foo_charm['_id']))
 
+    def test_index_charms(self):
+        foo_charm = factory.get_charm_json()
+        bar_charm = factory.get_charm_json()
+        self.assertFalse(self.exists_in_index(foo_charm['_id']))
+        self.assertFalse(self.exists_in_index(bar_charm['_id']))
+        self.index_client.index_charms([foo_charm, bar_charm])
+        self.assertTrue(self.exists_in_index(foo_charm['_id']))
+        self.assertTrue(self.exists_in_index(bar_charm['_id']))
+
+    def test_index_charms_no_charms(self):
+        # Assert that no exception is raised.
+        with self.assertRaises(AssertionError):
+            with self.assertRaises(BaseException):
+                self.index_client.index_charms([])
+
     def makeCharm(self, *args, **kwargs):
         charm = factory.get_charm_json(*args, **kwargs)
         self.index_client.index_charm(charm)
