[I've not responded to your entire review point-by-point, but I believe I've addressed all points where I haven't given an explicit response.] On Wed, Dec 07, 2011 at 09:22:24AM -0000, Jeroen T. Vermeulen wrote: > + def runGerminate(self, override_file, series_name, arch, flavours, > + structures): > > Rightly or wrongly, the “run” in that name led me to expect that this > method would fire up a separate process. Maybe a very short docstring > to dispel that notion? The name doesn't actually make a lot of sense; I think originally I was in the mindset where this actually did fire up a separate process. I've renamed the script instance to germinateArch (with a docstring), and the test instance to fetchGerminatedOverrides. Does that help? > To me, your branch is of very high quality but this is its Achilles' > heel. The following loop body is massive! I can't see it all on one > page, so it becomes hard even to see just how long it is. If I judged > the indentations right, it makes up the entire rest of the method. That's fair; I went back and forward a bit on this, but clearly landed in the wrong place. Actually, now that I've renamed runGerminate to germinateArch, it's easier in my mind to turn a block of it into germinateArchFlavour in turn. (Regularised naming may be boring, but I like it anyway.) The methods do start ending up with quite a few parameters, but that's probably better than the alternative. > Once you've extracted the loop body, given its size, it's probably > still worth extracting chunks from that. I've done a good deal of this, although will probably end up restructuring a bit further per your later comments. > According to some, the pattern of “comment, block of code, blank line, > comment, block of code, blank line” is a strong indication that you > need to split things up. It shows that you have already done so in > your own mind, but leaves each individual reader to figure out what > the dependencies between consecutive blocks are. I don't necessarily agree with this in all cases (e.g. I don't know that the new writeGerminateOutput method would benefit much from being split further), but I agree that the original version of this code was rather too far in the other direction. > + # Add this to the germinate log as well so that that can be > + # debugged more easily. Log a separator line first. > + self.germinate_logger.info('', extra={'progress': True}) > + self.germinate_logger.info('Germinating for %s/%s/%s', > + flavour, series_name, arch, > + extra={'progress': True}) > > What does the “extra” parameter do? That attaches a dictionary of extra attributes to the log message which can then be picked up by the formatter. Germinate uses this for certain messages (arguably a wart in germinate's design; I should probably look into doing that more neatly at some point). I've moved this to its own method so that it can have a clarifying docstring. > That's a lot of string manipulation. It may be clearer as a regex: > > task_header_regex = re.compile("task-([^:]*):(.*)", flags=re.IGNORECASE) > for line in seedtext: > match = task_header_regex.match(line) > if match is not None: > key, value = match.groups() > task_headers[key.lower()] = value.strip() Thanks. I used this mostly verbatim, although I simplified using ".*?". > If flavours[0] has a special meaning, that's worth documenting. > Consider documenting it in a docstring, or even making it a separate > parameter. I've added a primary_flavour parameter. Regarding your question of whether Build-Essential overrides should be pulled out of the loop, it's a tough call, but I actually think it's clearer inside the loop now that we have the primary_flavour parameter. That's because otherwise I'd have to pass the results of germination out of the germinateArchFlavour method, and I think the flow control is clearer if I don't do that. Also, I think it's worth keeping the two "write overrides" pieces next to each other. > I think for now flavours[0] is always “ubuntu,” but will we want to > generalize this code for derived distros later? As you say. The shell code I was replacing hardcoded [ "$distro" = ubuntu ] tests, but I didn't see a need to repeat that hardcoding. Thanks for the detailed commentary on seed/task parsing here; as well as clearer code, it's resulted in the tests being a few seconds faster, and every little helps.