Merge lp:~cjwatson/launchpad/gina-dsc-binaries into lp:launchpad
| Status: | Merged |
|---|---|
| Approved by: | Gavin Panella on 2012-01-05 |
| Approved revision: | no longer in the source branch. |
| Merged at revision: | 14643 |
| Proposed branch: | lp:~cjwatson/launchpad/gina-dsc-binaries |
| Merge into: | lp:launchpad |
| Diff against target: |
153 lines (+78/-3) 5 files modified
database/schema/security.cfg (+1/-1) lib/lp/scripts/garbo.py (+37/-0) lib/lp/scripts/tests/test_garbo.py (+36/-0) lib/lp/soyuz/doc/gina.txt (+2/-0) lib/lp/soyuz/scripts/gina/handlers.py (+2/-2) |
| To merge this branch: | bzr merge lp:~cjwatson/launchpad/gina-dsc-binaries |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Gavin Panella (community) | 2012-01-05 | Approve on 2012-01-05 | |
|
Review via email:
|
|||
Commit Message
[r=allenap][bug=911943] Separate dsc_binaries correctly in gina, and add a garbo-daily job to clean up existing wreckage.
Description of the Change
== Summary ==
There is some wrong data in SourcePackageRe
== Proposed fix ==
Fix gina to separate dsc_binaries correctly, and add a garbo-daily job to clean up the wreckage.
== Pre-implementation notes ==
Robert and Stuart educated me a bit on how to write garbo jobs.
== Implementation details ==
I found a few rows on dogfood where the Binary field had been incorrectly written as a dependency relationship field, complete with "(>= ...)". This is wildly wrong and it came from PPA uploads rather than gina; but it's in the past and s/ /, /g would not improve matters, so I just excluded anything containing '(', and included a test that such rows remain unchanged.
== Tests ==
bin/test -vvc -t gina -t garbo
== Demo and Q/A ==
I'd like to run this garbo job on dogfood and check that it really does clean everything up, which is probably most easily done by manual inspection of:
SELECT dsc_binaries FROM SourcePackageRe
== lint ==
./lib/lp/
162: want exceeds 78 characters.
179: want exceeds 78 characters.
223: want exceeds 78 characters.
236: want exceeds 78 characters.
242: want exceeds 78 characters.
Pre-existing, fiddly to fix, and should be ignoreable for the purposes of this branch.
| Colin Watson (cjwatson) wrote : | # |
On Thu, Jan 05, 2012 at 04:16:30PM -0000, Gavin Panella wrote:
> + WHERE dsc_binaries ~ '[a-z0-9+.-] '
>
> Could the regex be simplified to '[^,] '?
No, because there exist dsc_binaries fields with embedded newlines.
Something like [^\n,] might be possible, but I actually felt the above
was a bit clearer. I checked on dogfood's database to make sure there
were no rows that slipped between the cracks.
> Fwiw, an alternative, simpler, implementation might be something like:
Thanks! This is indeed much easier to read. I've applied that.
> + three = [
> + self.factory.
> + self.factory.
> + self.factory.
> + ]
> + spr_three = self.factory.
> + dsc_binaries=" ".join(three))
>
> Is it important to use unique strings? If not it might be clearer to
> make them static:
Yes, you're probably right; I got a bit carried away with the factory
there. I've simplified this as you suggest.
Do you think you could land this for me, if you're satisfied with my
answer to your first point? TIA.
| Gavin Panella (allenap) wrote : | # |
> Do you think you could land this for me, if you're satisfied with my
> answer to your first point? TIA.
Sure, I'll send it off to ec2 now.

Nice :)
[1]
+ WHERE dsc_binaries ~ '[a-z0-9+.-] '
Could the regex be simplified to '[^,] '?
[2]
Fwiw, an alternative, simpler, implementation might be something like:
def __init__(self, log, abort_time=None):
super( SourcePackageRe leaseDscBinarie sUpdater, self).__init__( SourcePackageRe lease)
self. store.find(
SourcePackage Release. id, separated
SQL(" dsc_binaries ~ '[a-z0-9+.-] '"),
SQL(" dsc_binaries NOT LIKE '%(%'")))
log, abort_time)
self.store = IMasterStore(
self.ids = list(
# Get all SPR IDs which have an incorrectly-
# dsc_binaries value (space rather than comma-space).
# Skip rows with dsc_binaries in dependency relationship
# format. This is a different bug.
def isDone(self):
"""See `TunableLoop`."""
return len(self.ids) == 0
def __call__(self, chunk_size): :chunk_ size] :chunk_ size]
self.store. execute( """ lease
dsc_binaries, '([a-z0-9+.-]) ', E'\\\\1, ', 'g') chunk_ids) , noresult=True)
transaction. commit( )
"""See `TunableLoop`."""
chunk_ids = self.ids[
del self.ids[
UPDATE SourcePackageRe
SET dsc_binaries = REGEXP_REPLACE(
WHERE id IN %s""" % sqlvalues(
[3]
+ three = [ getUniqueString (), getUniqueString (), getUniqueString (), makeSourcePacka geRelease(
+ self.factory.
+ self.factory.
+ self.factory.
+ ]
+ spr_three = self.factory.
+ dsc_binaries=" ".join(three))
Is it important to use unique strings? If not it might be clearer to
make them static:
spr_three = self.factory. makeSourcePacka geRelease( "one1 two2 three3")
self.assertEqu al("one1, two2, three3", spr_three. dsc_binaries)
...