On Wed, 2009-11-25 at 23:03 +0000, Julian Edwards wrote: > Review: Needs Fixing code > > While your suggested solution would not work (at least not without the > > 'nonlocal' keyword), I believe I have found something better. I keep one > > dict mapping SPFTs to counts, and another mapping component names to > > counts. This means there's only one special case (for > > COMPONENT_ORIG_TARBALL), and the code for each format becomes much > > cleaner too. > > Ah, this is 100% better than my suggestion, nice one! > > > I've split it, and completely changed the verification style. The > > failure messages are much less verbose, but one would have to manually > > edit the .dsc to get any of them anyway. I feel the reduction in code > > makes things much better. > > Actually it is not only better, it's easier to test because there are > fewer cases! > > > This does make testing much easier. > > > > However, there are now just a few code paths, but a lot of potential > > failure cases. How completely should I test? > > We always test 100% code coverage at least and any peculiar corner cases > that are obvious. > > I'd be happy here to have 2 tests per format type; one that tests it accepts > correctly and one that tests it fails correctly, since that's pretty much > all the code is doing now. I've added tests in nascentuploadfile.txt. > > > Also, can you please change it to compare against the format enum instead > > > of strings. Then the assertion error below will make more sense. > > > > > > > else: > > > > raise AssertionError("Unknown source format.") > > > > It will now die with a KeyError when looking up the file checker, which > > seems as reasonable as the old explicit AssertionError. > > I don't really like that, it's not as explicit as the old error, particularly > because the error will appear in the rejection email (unlikely as it is). > > Can you catch the KeyError and re-raise it as AssertionError with some good > text. This will ensure if/when we add more format types, it will be easier to > track down why the failure occurred. Done. > > The two uses are on different types: this is a SourceUploadFile, while > > the other is a SourcePackageReleaseFile. Should I instead create a > > utility function alongside SPFT, perhaps? > > I would make a mix-in class that both can inherit from and put the property > on that. As you suggested on IRC, I've put it in lp.soyuz.model.files.SourceFileMixin. > > @@ -375,35 +377,32 @@ > > and checksum. > > """ > > > > - diff_count = 0 > > - debian_tar_count = 0 > > - orig_tar_count = 0 > > + file_type_counts = { > > + SourcePackageFileType.DIFF: 0, > > + SourcePackageFileType.ORIG_TARBALL: 0, > > + SourcePackageFileType.DEBIAN_TARBALL: 0, > > + SourcePackageFileType.NATIVE_TARBALL: 0, > > + } > > component_orig_tar_counts = {} > > - native_tar_count = 0 > > - > > bzip2_count = 0 > > - > > *Loads* better! > > > files_missing = False > > + > > for sub_dsc_file in self.files: > > - filetype = determine_source_file_type(sub_dsc_file.filename) > > - > > - if filetype == SourcePackageFileType.DIFF: > > - diff_count += 1 > > - elif filetype == SourcePackageFileType.ORIG_TARBALL: > > - orig_tar_count += 1 > > - elif filetype == SourcePackageFileType.COMPONENT_ORIG_TARBALL: > > + file_type = determine_source_file_type(sub_dsc_file.filename) > > + > > + if file_type is None: > > + yield UploadError('Unknown file: ' + sub_dsc_file.filename) > > + continue > > + > > + if file_type == SourcePackageFileType.COMPONENT_ORIG_TARBALL: > > + # Split the count by component name. > > component = re_is_component_orig_tar_ext.match( > > get_source_file_extension(sub_dsc_file.filename)).group(1) > > if component not in component_orig_tar_counts: > > component_orig_tar_counts[component] = 0 > > component_orig_tar_counts[component] += 1 > > - elif filetype == SourcePackageFileType.DEBIAN_TARBALL: > > - debian_tar_count += 1 > > - elif filetype == SourcePackageFileType.NATIVE_TARBALL: > > - native_tar_count += 1 > > else: > > - yield UploadError('Unknown file: ' + sub_dsc_file.filename) > > - continue > > + file_type_counts[file_type] += 1 > > This is one of the best bits of crappy code cleaned up I've seen in years :) > > > +def check_format_1_0_files(filename, file_type_counts, component_counts, > > + bzip2_count): > > + """Check that the given counts of each file type suit format 1.0. > > + > > + A 1.0 source must be native (with only one tar.gz), or have an orig.tar.gz > > + and a diff.gz. It cannot use bzip2 compression. > > + """ > > + if bzip2_count > 0: > > + yield UploadError( > > + "%s: is format 1.0 but uses bzip2 compression." > > + % filename) > > + > > + if file_type_counts not in ({ > > + SourcePackageFileType.NATIVE_TARBALL: 1, > > + SourcePackageFileType.ORIG_TARBALL: 0, > > + SourcePackageFileType.DEBIAN_TARBALL: 0, > > + SourcePackageFileType.DIFF: 0, > > + }, { > > + SourcePackageFileType.ORIG_TARBALL: 1, > > + SourcePackageFileType.DIFF: 1, > > + SourcePackageFileType.NATIVE_TARBALL: 0, > > + SourcePackageFileType.DEBIAN_TARBALL: 0, > > + }) or len(component_counts) > 0: > > This is a little ugly. Can you do something like: > > valid_file_type_counts = [ > { > SourcePackageFileType.NATIVE_TARBALL: 1, > ... > }, > { > ... > }, > ] > > if file_type_counts not in valid_file_type_counts ... > > And the same for the other check funcs as well. While it violates the style guide, it is indeed much more readable. I've done that. > [snip] > > Otherwise a great change! > > Let me see these few changes and then we'll get this puppy landed. Thanks. The intermediate diff is attached.