Merge lp:~ev/oops-repository/weighted-machines into lp:~daisy-pluckers/oops-repository/trunk

Proposed by Evan
Status: Merged
Merged at revision: 63
Proposed branch: lp:~ev/oops-repository/weighted-machines
Merge into: lp:~daisy-pluckers/oops-repository/trunk
Diff against target: 125 lines (+63/-4)
3 files modified
oopsrepository/oopses.py (+32/-0)
oopsrepository/schema.py (+14/-0)
oopsrepository/tests/test_oopses.py (+17/-4)
To merge this branch: bzr merge lp:~ev/oops-repository/weighted-machines
Reviewer         Review Type   Date Requested   Status
Daisy Pluckers                                  Pending
Review via email: mp+153887@code.launchpad.net

Description of the change

This adds a new method, update_errors_by_release, which populates the FirstError column family with the date of the first error for the given system identifier in the given Ubuntu release. It then writes that date into the ErrorsByRelease CF, in the row for the given Ubuntu release and today's date, under the OOPS ID. This lets us look up all the errors that occurred for an Ubuntu release on each day, weighting each error in the result set by how many days it has been since that system's first error, as discussed in bug 1077122.
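The flow described above can be sketched without a Cassandra cluster. This is only an illustration: plain dicts stand in for the FirstError and ErrorsByRelease column families, and the `today` parameter is an assumption added here so the example can be driven with fixed dates. It is not the code in this branch.

```python
import datetime

# Plain dicts stand in for the Cassandra column families (an assumption made
# so the sketch runs without a cluster).
first_error = {}        # release -> {system_token: date of first error}
errors_by_release = {}  # (release, day) -> {oops_id: date of first error}


def update_errors_by_release(oops_id, system_token, release, today=None):
    """Record one error, remembering when this system first reported one."""
    if today is None:
        today = datetime.datetime.today()
    # Truncate to midnight so every error on a given day shares one row key.
    today = today.replace(hour=0, minute=0, second=0, microsecond=0)

    # Only the first error for a system in a release sets its FirstError date;
    # later calls read the stored value back unchanged.
    first_error_date = first_error.setdefault(release, {}).setdefault(
        system_token, today)

    # One column per OOPS ID, each holding the system's first-error date.
    row = errors_by_release.setdefault((release, today), {})
    row[oops_id] = first_error_date
    return first_error_date
```

Calling it for the same system on two different days leaves the FirstError entry at the earlier date, while each day's ErrorsByRelease row gains a column for that day's OOPS ID.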

More complete details of the implementation can be found in bug 1077122. The big difference from that write-up is that instead of using a new uuid1() for the column name in ErrorsByRelease, we re-use the OOPS ID UUID, so that the script in https://code.launchpad.net/~ev/daisy/weighted-machines/+merge/153885 stays idempotent across multiple runs.
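To see why the column name matters for idempotence, here is a small sketch. A dict stands in for a single ErrorsByRelease row, and the `record` helper is hypothetical, written only for this illustration.

```python
import uuid


def record(row, first_error_date, column_name=None):
    """Insert one column; a fresh uuid1() name is used when none is given."""
    if column_name is None:
        column_name = uuid.uuid1()
    row[column_name] = first_error_date
    return row


oops_id = uuid.uuid1()

# Re-using the OOPS ID: a second run overwrites the same column.
reused = {}
record(reused, '2013-03-18', column_name=oops_id)
record(reused, '2013-03-18', column_name=oops_id)

# Fresh uuid1() per write: a second run adds a duplicate column.
fresh = {}
record(fresh, '2013-03-18')
record(fresh, '2013-03-18')

print(len(reused), len(fresh))
```

With the OOPS ID as the column name, re-running a backfill over the same OOPSes leaves the row unchanged, whereas a new uuid1() per write would count the same error again on every run.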

Brian Murray (brian-murray) wrote :

On Mon, Mar 18, 2013 at 05:52:20PM -0000, Evan Dandrea wrote:
> Evan Dandrea has proposed merging
> lp:~ev/oops-repository/weighted-machines into
> lp:~daisy-pluckers/oops-repository/trunk.
>
> Requested reviews:
> Daisy Pluckers (daisy-pluckers)
>
> For more details, see:
> https://code.launchpad.net/~ev/oops-repository/weighted-machines/+merge/153887
>
> This adds a new method, update_errors_by_release, which populates the
> FirstError column family with the first occurrence of an error for the
> given system identifier in the given Ubuntu release. It then writes
> this first occurrence into the ErrorsByRelease CF for the given Ubuntu
> release and today's date under the OOPS ID. This allows us to look up
> all the errors that occurred for an Ubuntu release for each day,
> weighting each error in the result set by how many days it's been since
> its first error, as discussed in bug 1077122.
>
> More complete details of the implementation can be found in bug
> 1077122. The big difference from that write up is that instead of
> using a new uuid1() for the column name in ErrorsByRelease, we re-use
> the OOPS ID UUID, so that the script in
> https://code.launchpad.net/~ev/daisy/weighted-machines/+merge/153885
> can be idempotent on multiple runs.

> --
> https://code.launchpad.net/~ev/oops-repository/weighted-machines/+merge/153887
> Your team Daisy Pluckers is requested to review the proposed merge of lp:~ev/oops-repository/weighted-machines into lp:~daisy-pluckers/oops-repository/trunk.

> === modified file 'oopsrepository/oopses.py'
> --- oopsrepository/oopses.py 2013-03-12 00:22:41 +0000
> +++ oopsrepository/oopses.py 2013-03-18 17:51:27 +0000
> @@ -9,6 +9,7 @@
> import json
> import time
> import uuid
> +import datetime
>
> import pycassa
> from pycassa.cassandra.ttypes import NotFoundException
> @@ -186,6 +187,37 @@
> except NotFoundException:
> return None
>
> +def update_errors_by_release(config, oops_id, system_token, release):
> + release = release.encode('utf8')
> + pool = connection_pool(config)
> + firsterror = pycassa.ColumnFamily(pool, 'FirstError')
> + errorsbyrelease = pycassa.ColumnFamily(pool, 'ErrorsByRelease')
> +
> + today = datetime.datetime.today()
> + today = today.replace(hour=0, minute=0, second=0, microsecond=0)
> + try:
> + first_error_date = firsterror.get(release, columns=[system_token])
> + first_error_date = first_error_date[system_token]
> + except NotFoundException:
> + firsterror.insert(release, {system_token: today})
> + first_error_date = today
> +
> + # We use the OOPS ID rather than the system identifier here because we want
> + # each crash from a system to take up a new column in this column family.
> + # Each one of those columns should be associated with the date of the first
> + # error for the system in this release.
> + #
> + # Remember, we're ultimately tracking errors here, not systems, but we need
> + # the system idnetifier to know the first occurance of an error in the

typo: identifier

> + # release for that machine.
> + #
> + # For the given release for today, the crash should be...


Evan (ev) wrote :

On Mon, Mar 18, 2013 at 11:31 PM, Brian Murray <email address hidden> wrote:

> > + # Remember, we're ultimately tracking errors here, not systems, but we need
> > + # the system idnetifier to know the first occurance of an error in the
>
> typo: identifier
>
> > + # release for that machine.
> > + #
> > + # For the given release for today, the crash should be weighted by the
> > + # first time an error occured in the release for the system this came from.
> > + # Multipled by their weight and summed together, these form the numerator
>
> typo: Multiplied
>

Fixed as r66.

> Otherwise the rest of the merge proposal looks good to me.

Excellent. Thanks for the review!

Preview Diff

=== modified file 'oopsrepository/oopses.py'
--- oopsrepository/oopses.py 2013-03-12 00:22:41 +0000
+++ oopsrepository/oopses.py 2013-03-19 08:50:27 +0000
@@ -9,6 +9,7 @@
 import json
 import time
 import uuid
+import datetime

 import pycassa
 from pycassa.cassandra.ttypes import NotFoundException
@@ -186,6 +187,37 @@
     except NotFoundException:
         return None

+def update_errors_by_release(config, oops_id, system_token, release):
+    release = release.encode('utf8')
+    pool = connection_pool(config)
+    firsterror = pycassa.ColumnFamily(pool, 'FirstError')
+    errorsbyrelease = pycassa.ColumnFamily(pool, 'ErrorsByRelease')
+
+    today = datetime.datetime.today()
+    today = today.replace(hour=0, minute=0, second=0, microsecond=0)
+    try:
+        first_error_date = firsterror.get(release, columns=[system_token])
+        first_error_date = first_error_date[system_token]
+    except NotFoundException:
+        firsterror.insert(release, {system_token: today})
+        first_error_date = today
+
+    # We use the OOPS ID rather than the system identifier here because we want
+    # each crash from a system to take up a new column in this column family.
+    # Each one of those columns should be associated with the date of the first
+    # error for the system in this release.
+    #
+    # Remember, we're ultimately tracking errors here, not systems, but we need
+    # the system identifier to know the first occurrence of an error in the
+    # release for that machine.
+    #
+    # For the given release for today, the crash should be weighted by the
+    # first time an error occurred in the release for the system this came
+    # from. Multiplied by their weight and summed together, these form the
+    # numerator of our average errors per calendar day calculation.
+
+    errorsbyrelease.insert((release, today), {oops_id: first_error_date})
+
 def update_bucket_metadata(config, bucketid, source, version, comparator, release=''):
     # We only update the first and last seen version fields. We do not update
     # the current version field as talking to Launchpad is an expensive

=== modified file 'oopsrepository/schema.py'
--- oopsrepository/schema.py 2013-03-05 16:57:16 +0000
+++ oopsrepository/schema.py 2013-03-19 08:50:27 +0000
@@ -8,9 +8,11 @@

 from pycassa.types import (
     CompositeType,
+    AsciiType,
     UTF8Type,
     CounterColumnType,
     IntegerType,
+    DateType,
     )
 from pycassa.system_manager import (
     SystemManager,
@@ -76,6 +78,18 @@
             workaround_1779(mgr.create_column_family, keyspace, 'Counters',
                 comparator_type=UTF8_TYPE,
                 default_validation_class=CounterColumnType())
+        if 'FirstError' not in cfs:
+            workaround_1779(mgr.create_column_family, keyspace, 'FirstError',
+                key_validation_class=ASCII_TYPE,
+                comparator_type=ASCII_TYPE,
+                default_validation_class=DateType())
+        if 'ErrorsByRelease' not in cfs:
+            composite = CompositeType(AsciiType(), DateType())
+            workaround_1779(mgr.create_column_family, keyspace,
+                'ErrorsByRelease',
+                default_validation_class=DateType(),
+                key_validation_class=composite,
+                comparator_type=TIME_UUID_TYPE)
     finally:
         mgr.close()


=== modified file 'oopsrepository/tests/test_oopses.py'
--- oopsrepository/tests/test_oopses.py 2013-03-09 00:22:38 +0000
+++ oopsrepository/tests/test_oopses.py 2013-03-19 08:50:27 +0000
@@ -108,15 +108,11 @@
         day_key = oopses.insert_dict(self.config, oopsid, oops, user_token)
         oops_count = counters_cf.get('oopses', [day_key])
         self.assertEqual([1], oops_count.values())
-        user_count = counters_cf.get('users', [day_key])
-        self.assertEqual([1], user_count.values())

         oopsid = str(uuid.uuid1())
         day_key = oopses.insert_dict(config, oopsid, oops, user_token)
         oops_count = counters_cf.get('oopses', [day_key])
         self.assertEqual([2], oops_count.values())
-        user_count = counters_cf.get('users', [day_key])
-        self.assertEqual([1], user_count.values())

 class TestBucket(ClearCache):

@@ -206,3 +202,20 @@
         bucketversions_cf = pycassa.ColumnFamily(self.pool, 'BucketVersions')
         oopses.update_bucket_versions(self.config, 'bucket-id', '1.2.3')
         self.assertEqual(bucketversions_cf.get('bucket-id')['1.2.3'], 1)
+
+    def test_update_errors_by_release(self):
+        keyspace = self.useFixture(TemporaryOOPSDB()).keyspace
+        firsterror = pycassa.ColumnFamily(self.pool, 'FirstError')
+        errorsbyrelease = pycassa.ColumnFamily(self.pool, 'ErrorsByRelease')
+        release = 'Ubuntu 12.04'
+        system_token = 'system-id'
+        oops_id = uuid.uuid1()
+        today = datetime.datetime.today()
+        today = today.replace(hour=0, minute=0, second=0, microsecond=0)
+        oopses.update_errors_by_release(self.config, oops_id, system_token,
+                                        release)
+
+        d = firsterror.get(release, columns=[system_token])[system_token]
+        self.assertEqual(today, d)
+        d = errorsbyrelease.get((release, today)).values()[0]
+        self.assertEqual(today, d)
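The comment in update_errors_by_release says the per-error weights are summed to form the numerator of the average errors per calendar day calculation. A rough sketch of that sum over one ErrorsByRelease row follows. The weighting function itself is not given on this page (it is discussed in bug 1077122), so `weight` and `RAMP_DAYS` below are hypothetical placeholders chosen only to make the example concrete.

```python
import datetime

RAMP_DAYS = 90  # hypothetical; the real weighting is described in bug 1077122


def weight(day, first_error_date, ramp_days=RAMP_DAYS):
    """Placeholder weight: grows linearly with days since the first error."""
    days = (day - first_error_date).days
    return min(days / float(ramp_days), 1.0)


def weighted_numerator(day, row):
    """Sum the weights of every error column in one (release, day) row.

    `row` maps OOPS ID -> first-error date, mirroring what
    errorsbyrelease.get((release, day)) would return.
    """
    return sum(weight(day, first) for first in row.values())
```

Because each column already carries the system's first-error date, the sum needs only one row read per release and day, with no further lookup in FirstError.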
