Merge lp:~brian-murray/daisy/src-version-buckets into lp:daisy

Proposed by Brian Murray
Status: Merged
Merged at revision: 283
Proposed branch: lp:~brian-murray/daisy/src-version-buckets
Merge into: lp:daisy
Diff against target: 68 lines (+46/-0)
2 files modified
daisy/utils.py (+4/-0)
tools/build_src_version_buckets.py (+42/-0)
To merge this branch: bzr merge lp:~brian-murray/daisy/src-version-buckets
Reviewer          Review Type    Date Requested    Status
Daisy Pluckers                                     Pending
Review via email: mp+155631@code.launchpad.net

Description of the change

This branch will start recording to SourceVersionBuckets during the bucketing of Oopses. Additionally, there is a tool to populate SourceVersionBuckets with data from the OOPS column family.

Evan (ev) wrote :

Mostly looks good. Feel free to merge once you've addressed the points
below.

On Tue, Mar 26, 2013 at 9:57 PM, Brian Murray <email address hidden> wrote:
>
> + oopses.update_source_version_buckets(oops_config, src_package,
> + version, crash_signature)
>

Can you wrap this in an if hasattr(oopses, 'update_source_version_buckets')
check, so that we don't crash while processing new crashes during the window
when oops-repository and daisy are being updated on production?
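
The guard requested here can be sketched roughly as follows. This is a minimal illustration, not the daisy code: record_source_version, old_oopses and new_oopses are hypothetical names, and the two namespace objects merely stand in for an older and a newer oops-repository module.

```python
import types


def record_source_version(oopses, oops_config, src_package, version,
                          crash_signature):
    """Call update_source_version_buckets only if this oopses build has it.

    Returns True when the call was made, False when it was skipped.
    """
    if hasattr(oopses, 'update_source_version_buckets'):
        oopses.update_source_version_buckets(oops_config, src_package,
                                             version, crash_signature)
        return True
    return False


# Hypothetical stand-ins: an old module without the function, and a new
# one that records the arguments it was called with.
calls = []
old_oopses = types.SimpleNamespace()
new_oopses = types.SimpleNamespace(
    update_source_version_buckets=lambda *args: calls.append(args))
```

With the guard in place, daisy can be deployed before or after oops-repository without a window in which bucketing raises AttributeError.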

>      if version:
>          oopses.update_bucket_versions(oops_config, crash_signature,
>                                        version)
>
>
> +for key, oops in oops_cf.get_range(columns=cols):
> +    count += 1
> +    if count % 10000 == 0:
> +        break
>

I suspect you had this in for debugging?

> + release = oops['DistroRelease'].encode('utf8')
>

Use oops.get here. You're not guaranteed to get rows with all the column
names you specified.
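
A small sketch of why .get matters for partially populated rows. The rows below are made-up plain dicts, not real OOPS data, and ubuntu_release is a hypothetical helper name.

```python
def ubuntu_release(oops):
    """Return the DistroRelease if it names an Ubuntu release, else ''."""
    # .get with a default avoids the KeyError that oops['DistroRelease']
    # would raise when the row lacks that column.
    release = oops.get('DistroRelease', '')
    return release if release.startswith('Ubuntu ') else ''


# Illustrative rows: one complete, one missing the DistroRelease column.
full_row = {'DistroRelease': 'Ubuntu 13.04', 'Package': 'bash 4.2-5ubuntu3'}
partial_row = {'Package': 'bash 4.2-5ubuntu3'}
```

An empty default also makes the startswith check sufficient on its own, since '' never starts with 'Ubuntu '.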

> +
> +    if not release.startswith('Ubuntu '):
> +        continue
> +    package_data = oops['Package'].split(' ')
> +    if len(package_data) < 2:
> +        continue
> +    version = package_data[1]
>

Please use daisy.utils.split_package_and_version instead, as it catches
some corner cases (hopefully).
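
To illustrate the kind of corner cases a naive split(' ') misses, here is a hypothetical splitter. It is not the implementation of daisy.utils.split_package_and_version, whose behaviour is not shown on this page; the function name and the corner cases handled (an empty field, a missing version, a parenthesised note in place of a version) are assumptions.

```python
def split_package_field(field):
    """Split a 'name version ...' string into (name, version).

    Hypothetical sketch: returns '' for any part that is missing.
    """
    parts = field.split()
    if not parts:
        return '', ''
    name = parts[0]
    version = parts[1] if len(parts) > 1 else ''
    # Assumed corner case: a parenthesised note where a version would be.
    if version.startswith('('):
        version = ''
    return name, version
```

Returning a pair in every case lets the caller test for an empty version instead of checking list lengths.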

> +    src_package = oops['SourcePackage']
>

Use oops.get here as well.

> +    srcversbucketsinsert((src_package, version), {oops_id : ''})
>

Typo :) (srcversbucketsinsert should be srcversbuckets.insert)

280. By Brian Murray

changes based on evan's feedback

Preview Diff

=== modified file 'daisy/utils.py'
--- daisy/utils.py 2013-03-26 23:58:48 +0000
+++ daisy/utils.py 2013-03-27 17:13:24 +0000
@@ -38,6 +38,7 @@
 def bucket(oops_config, oops_id, crash_signature, report_dict):
     release = report_dict.get('DistroRelease', '')
     package = report_dict.get('Package', '')
+    src_package = report_dict.get('SourcePackage', '')
     problem_type = report_dict.get('ProblemType', '')
     dependencies = report_dict.get('Dependencies', '')
     system_uuid = report_dict.get('SystemIdentifier', '')
@@ -78,6 +79,9 @@
     oopses.update_bucket_metadata(oops_config, crash_signature, package,
                                   version, apt.apt_pkg.version_compare,
                                   release)
+    if hasattr(oopses, 'update_source_version_buckets'):
+        oopses.update_source_version_buckets(oops_config, src_package,
+                                             version, crash_signature)
     if version:
         oopses.update_bucket_versions(oops_config, crash_signature, version)


=== added file 'tools/build_src_version_buckets.py'
--- tools/build_src_version_buckets.py 1970-01-01 00:00:00 +0000
+++ tools/build_src_version_buckets.py 2013-03-27 17:13:24 +0000
@@ -0,0 +1,42 @@
+#!/usr/bin/python
+
+import pycassa
+import uuid
+from daisy import config
+from utils import split_package_and_version
+from collections import Counter
+
+creds = {'username': config.cassandra_username,
+         'password': config.cassandra_password}
+pool = pycassa.ConnectionPool(config.cassandra_keyspace,
+                              config.cassandra_hosts, timeout=600,
+                              max_retries=100, credentials=creds)
+
+oops_cf = pycassa.ColumnFamily(pool, 'OOPS')
+srcversbuckets = pycassa.ColumnFamily(pool, 'SourceVersionBuckets')
+
+cols = ['SourcePackage', 'Package', 'DistroRelease']
+count = 0
+for key, oops in oops_cf.get_range(columns=cols):
+    count += 1
+    if count % 100000 == 0:
+        print 'processed', count
+
+    if Counter(cols) != Counter(oops.keys()):
+        continue
+
+    release = oops.get('DistroRelease', '')
+    if not release.startswith('Ubuntu ') or release == '':
+        continue
+    package = oops.get('Package', '')
+    if package:
+        package, version = split_package_and_version(package)
+    src_package = oops.get('SourcePackage', '')
+    if src_package == '' or version == '':
+        continue
+    oops_id = uuid.UUID(key)
+    #print('Would insert (%s, %s) = {%s, ""}' % (src_package, version,
+    #                                            oops_id))
+    srcversbuckets.insert((src_package, version), {oops_id : ''})
+
+print 'total processed', count
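
The per-row filtering in the backfill tool can also be sketched as a pure function, which makes it testable without a Cassandra connection. This is an illustration under stated assumptions, not the merged code: source_version_entry and naive_split are hypothetical names, naive_split is only a stand-in for daisy's split_package_and_version, and the version is derived afresh for every row.

```python
import uuid


def naive_split(package_field):
    """Stand-in splitter (assumption): 'name version' -> (name, version)."""
    parts = package_field.split()
    if len(parts) >= 2:
        return parts[0], parts[1]
    return package_field, ''


def source_version_entry(key, oops, split=naive_split):
    """Return ((src_package, version), {oops_id: ''}) or None to skip."""
    release = oops.get('DistroRelease', '')
    if not release.startswith('Ubuntu '):
        return None
    # The version is computed per row, so an empty Package yields ''.
    _, version = split(oops.get('Package', ''))
    src_package = oops.get('SourcePackage', '')
    if not src_package or not version:
        return None
    return (src_package, version), {uuid.UUID(key): ''}
```

A caller would pass each non-None result to the column family's insert, as in srcversbuckets.insert(*entry).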
