Merge lp:~mvo/daisy/problem-type-column-family into lp:daisy

Proposed by Michael Vogt
Status: Needs review
Proposed branch: lp:~mvo/daisy/problem-type-column-family
Merge into: lp:daisy
Diff against target: 89 lines (+56/-0)
3 files modified
daisy/schema.py (+4/-0)
daisy/submit.py (+6/-0)
tools/back_populate_problem_type.py (+46/-0)
To merge this branch: bzr merge lp:~mvo/daisy/problem-type-column-family
Reviewer        Review Type  Date Requested  Status
Daisy Pluckers                               Pending
Review via email: mp+317883@code.launchpad.net

Description of the change

This is my first jump into the world of Cassandra, so please be gentle with me :)

This branch adds a new ColumnFamily named "ProblemType" so that it's quicker to run a query that picks out all problem reports of a given type, such as "Snap".

Mostly pushing this to get things started and to see if I am heading vaguely in the right direction.
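
The layout proposed here can be sketched without a Cassandra cluster. The following is a minimal in-memory model, not pycassa code: `FakeColumnFamily` is a hypothetical stand-in that mimics only `insert(key, columns)` and `get(key)`, and the OOPS IDs are made up. It shows why the lookup becomes a single-row read: the row key is the problem type, and each column name is an OOPS ID with an empty value.

```python
from collections import defaultdict


class FakeColumnFamily:
    """Hypothetical in-memory stand-in for a wide-row column family."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def insert(self, key, columns):
        # Merge the given columns into the row, as an insert would.
        self.rows[key].update(columns)

    def get(self, key):
        # Return the row's columns sorted by column name.
        if key not in self.rows:
            raise KeyError(key)
        return dict(sorted(self.rows[key].items()))


def index_problem_type(problem_type_cf, oops_id, report):
    """Mirror the submit.py hunk: index the OOPS ID under its ProblemType."""
    problem_type = report.get('ProblemType')
    if problem_type:
        problem_type_cf.insert(problem_type, {oops_id: ''})


problem_type_cf = FakeColumnFamily()
index_problem_type(problem_type_cf, 'oops-1', {'ProblemType': 'Snap'})
index_problem_type(problem_type_cf, 'oops-2', {'ProblemType': 'Crash'})
index_problem_type(problem_type_cf, 'oops-3', {'ProblemType': 'Snap'})
index_problem_type(problem_type_cf, 'oops-4', {})  # no ProblemType: skipped

# All Snap reports come from one row; no scan over every OOPS is needed.
snap_ids = list(problem_type_cf.get('Snap'))
print(snap_ids)
```

One consequence of this shape is that every report of a given type lands in the same row, so a very common problem type produces a very wide row.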

Brian Murray (brian-murray) wrote :

On reflection, I think Evan's idea of having a Snaps ColumnFamily similar to the OOPS ColumnFamily makes sense, but we should also add a DaySnaps CF similar to the DayOOPS CF.
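
One possible reading of this suggestion, sketched with plain Python dictionaries rather than real column families. The row and column shapes are assumptions: `snaps` is keyed by OOPS ID like the OOPS CF, and `day_snaps` is keyed by a date string with OOPS IDs as column values, which is how the back-population script in this branch reads DayOOPS. The `record_snap` helper and the sample data are hypothetical.

```python
from collections import defaultdict

snaps = {}                     # assumed shape: OOPS ID -> report fields
day_snaps = defaultdict(dict)  # assumed shape: 'YYYYMMDD' -> {column: OOPS ID}


def record_snap(day, oops_id, report):
    """Store a report in both structures if its ProblemType is 'Snap'."""
    if report.get('ProblemType') != 'Snap':
        return False
    snaps[oops_id] = report
    # A per-day counter stands in for the column name here; a real
    # schema would need a unique, ordered column name instead.
    day_snaps[day][len(day_snaps[day])] = oops_id
    return True


record_snap('20170221', 'oops-1', {'ProblemType': 'Snap'})
record_snap('20170221', 'oops-2', {'ProblemType': 'Crash'})
record_snap('20170222', 'oops-3', {'ProblemType': 'Snap'})

# Per-day bucketing keeps each row bounded by one day's worth of reports.
print(list(day_snaps['20170221'].values()))
```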

Unmerged revisions

740. By Michael Vogt

fix ordering

739. By Michael Vogt

add ColumnFamily for ProblemType

738. By Michael Vogt

add ProblemType ColumnFamily

Preview Diff

=== modified file 'daisy/schema.py'
--- daisy/schema.py 2014-08-05 17:52:36 +0000
+++ daisy/schema.py 2017-02-21 17:37:20 +0000
@@ -93,6 +93,10 @@
             workaround_1779(mgr.create_column_family, keyspace, 'SystemImages',
                 key_validation_class=UTF8_TYPE,
                 comparator_type=UTF8_TYPE)
+        if 'ProblemType' not in cfs:
+            workaround_1779(mgr.create_column_family, keyspace, 'ProblemType',
+                key_validation_class=UTF8_TYPE,
+                comparator_type=UTF8_TYPE)
     finally:
         mgr.close()
 
 
=== modified file 'daisy/submit.py'
--- daisy/submit.py 2016-11-03 16:39:25 +0000
+++ daisy/submit.py 2017-02-21 17:37:20 +0000
@@ -300,6 +300,7 @@
     oops_cf = pycassa.ColumnFamily(_pool, 'OOPS')
     stacktrace_cf = pycassa.ColumnFamily(_pool, 'Stacktrace')
     images_cf = pycassa.ColumnFamily(_pool, 'SystemImages')
+    problem_type_cf = pycassa.ColumnFamily(_pool, 'ProblemType')
     report = create_report_from_bson(data)
 
     # gather and insert image information in the SystemImages CF
@@ -333,6 +334,11 @@
         except NotFoundException:
             images_cf.insert('device_image', {device_image : ''})
 
+    # make problem-type easily accessible
+    problem_type = report.get('ProblemType')
+    if problem_type:
+        problem_type_cf.insert(problem_type, {oops_id: ''})
+
     # Recoverable Problem, Package Install Failure, Suspend Resume
     crash_signature = report.get('DuplicateSignature')
     if crash_signature:

=== added file 'tools/back_populate_problem_type.py'
--- tools/back_populate_problem_type.py 1970-01-01 00:00:00 +0000
+++ tools/back_populate_problem_type.py 2017-02-21 17:37:20 +0000
@@ -0,0 +1,46 @@
+#!/usr/bin/python
+
+import sys
+import pycassa
+from pycassa.cassandra.ttypes import NotFoundException
+from collections import defaultdict
+from daisy import config
+
+creds = {'username': config.cassandra_username,
+         'password': config.cassandra_password}
+pool = pycassa.ConnectionPool(config.cassandra_keyspace,
+                              config.cassandra_hosts, timeout=600,
+                              credentials=creds)
+
+dayoops_cf = pycassa.ColumnFamily(pool, 'DayOOPS')
+oops_cf = pycassa.ColumnFamily(pool, 'OOPS')
+problem_type_cf = pycassa.ColumnFamily(pool, 'ProblemType')
+
+# Main
+
+if __name__ == '__main__':
+    if len(sys.argv) != 2:
+        print >>sys.stderr, "Usage: [date]"
+        sys.exit(1)
+    oopses = set()
+    start = ''
+    date = sys.argv[1]
+    while True:
+        try:
+            buf = dayoops_cf.get(date, column_start=start, column_count=1000)
+        except NotFoundException:
+            break
+        start = buf.keys()[-1]
+        buf = buf.values()
+        oopses.update(buf)
+        if len(buf) < 1000:
+            break
+    for oops_id in oopses:
+        try:
+            data = oops_cf.get(str(oops_id), columns=['ProblemType'])
+            problem_type = data['ProblemType']
+            problem_type_cf.insert(problem_type, {oops_id: ''})
+        except (NotFoundException, KeyError):
+            # Sometimes we didn't insert the full OOPS. I have no idea why.
+            #print 'could not find', uuid
+            continue
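
The paging loop in the script restarts each slice at the last column it has already seen, so that column is read twice; collecting the values into a set hides the repeat. The sketch below models that behaviour in memory, assuming the slice start is inclusive. `get_slice` and `iter_columns` are hypothetical helpers, not pycassa calls, and the sample row is made up.

```python
def get_slice(row, column_start='', column_count=1000):
    """Hypothetical stand-in for a slice read: sorted columns with
    name >= column_start (inclusive), at most column_count of them."""
    names = sorted(name for name in row if name >= column_start)
    return [(name, row[name]) for name in names[:column_count]]


def iter_columns(row, page_size=1000):
    """Yield every (name, value) exactly once, paging with an inclusive start."""
    if page_size < 2:
        # With one column per page the inclusive start would never advance.
        raise ValueError('page_size must be at least 2')
    start = ''
    first_page = True
    while True:
        page = get_slice(row, column_start=start, column_count=page_size)
        # After the first page, the first column repeats the previous
        # page's last column, so skip it.
        fresh = page if first_page else page[1:]
        for name, value in fresh:
            yield name, value
        if len(page) < page_size:
            break
        start = page[-1][0]
        first_page = False


row = {'col%02d' % i: 'oops-%d' % i for i in range(7)}
values = [value for _, value in iter_columns(row, page_size=3)]
print(values)
```

Skipping the repeated column matters only when the caller needs each column once, for example when counting; the script's set-based collection already tolerates the overlap.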
