Merge lp:~ev/oops-repository/bucket-versions into lp:~daisy-pluckers/oops-repository/trunk
Status: | Merged |
---|---|
Merged at revision: | 69 |
Proposed branch: | lp:~ev/oops-repository/bucket-versions |
Merge into: | lp:~daisy-pluckers/oops-repository/trunk |
Diff against target: | 115 lines (+78/-4), 3 files modified: oopsrepository/oopses.py (+30/-4), oopsrepository/schema.py (+22/-0), oopsrepository/tests/test_oopses.py (+26/-0) |
To merge this branch: | bzr merge lp:~ev/oops-repository/bucket-versions |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Daisy Pluckers | | | Pending |
Review via email: mp+158946@code.launchpad.net |
Description of the change
This branch changes how the package versions are updated for each bucket. It now keeps track of which release, as well as which version, the bucket was updated for (ordering the latter using the dpkg binary version number comparator).
It uses three new column families:
BucketVersionsFull - A wide row of OOPS IDs for each combination of (bucket, DistroRelease, binary package version).
BucketVersionsCount - A counter column family that keeps track of the (DistroRelease, binary package version) count for each bucket. The binary package version is ordered using the dpkg comparator, which produces the following:
{('Ubuntu 12.04', '1.0~ppa1') : 12, ('Ubuntu 12.04', '1.0') : 9, ('Ubuntu 12.10', '1.0') : 36, ...}
BucketVersionsDay - A column family that's just used to keep track of which buckets we've updated the versions for today ("today" meaning when they were bucketed, not when the crash occurred; see the comment in the code). We'll use this to repair counters that were over- or under-counted.
I'm tempted to add to this only when an exception is encountered while modifying the counter, rather than letting it grow alongside the BucketVersionsFull CF. We could also use a TTL to expire the data after a week or two (since it's not a counter CF). Can you think of any reason we might not want to do this? Is there any reason you can think of to keep around the knowledge of when we incremented the counters for BucketVersionsC
I haven't been able to think up one myself yet, but I'm not ruling it out :)
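To make the relationship between the three column families concrete, here is a minimal sketch that uses plain Python dicts in place of Cassandra. Everything in it is an assumption for illustration: the names `record` and `repair`, the key layout, the day string format, and the repair strategy itself (resetting each counter to the size of the matching BucketVersionsFull row). It is not the code in this branch.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the three column families.
bucket_versions_full = defaultdict(set)   # (bucket, release, version) -> OOPS IDs
bucket_versions_count = defaultdict(int)  # (bucket, release, version) -> counter
bucket_versions_day = defaultdict(set)    # day -> keys touched on that day


def record(bucket, release, version, oops_id, day):
    """Record one OOPS against a (bucket, release, version) combination."""
    key = (bucket, release, version)
    bucket_versions_full[key].add(oops_id)
    bucket_versions_count[key] += 1
    bucket_versions_day[day].add(key)


def repair(day):
    """Reset every counter touched on `day` to the size of its wide row.

    Returns {key: (old_count, new_count)} for the counters that changed.
    """
    fixed = {}
    for key in bucket_versions_day.get(day, ()):
        actual = len(bucket_versions_full[key])
        if bucket_versions_count[key] != actual:
            fixed[key] = (bucket_versions_count[key], actual)
            bucket_versions_count[key] = actual
    return fixed
```

In this model the day index only has to live as long as a repair might still be needed, which is the property that would make a TTL workable.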
There's a script in lp:daisy that will be used to back-populate this data (which is updated to match this merge as https:/
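For reference, the dpkg ordering mentioned for BucketVersionsCount can be approximated in pure Python. This is a sketch of the Debian version comparison rules (epoch, upstream version, revision, with `~` sorting before everything, including the end of the string). It is not the comparator this branch uses; `dpkg_compare` and `key_compare` are hypothetical names, and the sketch assumes well-formed version strings.

```python
from functools import cmp_to_key

DIGITS = "0123456789"


def _order(c):
    """Sort weight of a non-digit character; "" stands for end of string."""
    if c == "" or c in DIGITS:
        return 0
    if c.isalpha():
        return ord(c)
    if c == "~":
        return -1
    return ord(c) + 256


def _compare_part(a, b):
    """Compare two upstream or revision strings the way dpkg does."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the leading non-digit runs character by character.
        while (i < len(a) and a[i] not in DIGITS) or (j < len(b) and b[j] not in DIGITS):
            ac = _order(a[i] if i < len(a) else "")
            bc = _order(b[j] if j < len(b) else "")
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Compare the following digit runs numerically (empty run counts as 0).
        si, sj = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na, nb = int(a[si:i] or "0"), int(b[sj:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version):
    """Split "[epoch:]upstream[-revision]" into its three parts."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def dpkg_compare(a, b):
    """Return -1, 0 or 1 as version a sorts before, equal to, or after b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


def key_compare(x, y):
    """Order (release, version) pairs: release as text, version via dpkg rules."""
    return (x[0] > y[0]) - (x[0] < y[0]) or dpkg_compare(x[1], y[1])


keys = [("Ubuntu 12.10", "1.0"), ("Ubuntu 12.04", "1.0"), ("Ubuntu 12.04", "1.0~ppa1")]
ordered = sorted(keys, key=cmp_to_key(key_compare))
```

With release as the first component, `ordered` lists all Ubuntu 12.04 entries before Ubuntu 12.10, and within 12.04 puts `1.0~ppa1` ahead of `1.0`, matching the example in the description. Swapping the components would group entries by version first instead.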
In the description you indicate that BucketVersionsCount is keyed by "(DistroRelease, binary package version)"; however, the code is written as "bv_count.add(bucketid, (version, release))". I'm not sure why, but I'd expect the release to come before the version rather than the other way around. Is there a specific reason it is done with version before release?