Code review comment for lp:~sinzui/launchpad/oil-and-pigment

Revision history for this message
Curtis Hovey (sinzui) wrote :

On Fri, 2010-04-23 at 13:41 +0000, Abel Deuring wrote:
> Review: Needs Information code
> Hi Curtis,
>
> overall, a nice branch. I have though a few qestions regarding the
> filtering of inactive projects for the karma display, so I'm marking
> the MP as "needs information".
>
> > === modified file 'lib/lp/registry/model/person.py'
> > --- lib/lp/registry/model/person.py 2010-04-19 21:16:12 +0000
> > +++ lib/lp/registry/model/person.py 2010-04-22 18:05:27 +0000
> > @@ -850,12 +850,16 @@
> > def getProjectsAndCategoriesContributedTo(self, limit=5):
> > """See `IPerson`."""
> > contributions = []
> > - results = self._getProjectsWithTheMostKarma(limit=limit)
> > - for pillar_name, karma in results:
> > + # Pillars names have no concept of active. Extra pillars names are
> > + # requested because deactivated pillars will be filtered out.
> > + extra_limit = limit + 5
> > + results = self._getProjectsWithTheMostKarma(limit=extra_limit)
> > + for pillar_name, karma in results[:limit]:
>
> While you retrieve 5 extra items calling _getProjectsWithTheMostKarma(),
> you remove these extra items here ;)

I am incompetent. I removed the limit on the iterator. I added a guard
in the loop to break at the limit.

> > pillar = getUtility(IPillarNameSet).getByName(pillar_name)
> > - contributions.append(
> > - {'project': pillar,
> > - 'categories': self._getContributedCategories(pillar)})
> > + if pillar.active:
> > + contributions.append(
> > + {'project': pillar,
> > + 'categories': self._getContributedCategories(pillar)})
> > return contributions
>
> I think that should be
>
> return contributions[:limit]
>
> combined with a loop over the full result set.

I prefer a guard in the loop. The loop may make 5 extra SQL queries for
data that will not be used. _getContributedCategories() is a call to the
db.

> But: Did you consider to change the SQL query in the method
> _getProjectsWithTheMostKarma() itself so that only activ projects
> are returned (if necessary, using an optional method parameter
> "active_only=False")?

Yes, the query would insane because it is often nothing to join on. We
are querying pillar names, not objects. that is why our code always
filters *after* the pillar_name is used to get the pillar (product,
distribution, project group)

> While adding five extra items to the result set looks reasonable,
> there is the (admittedly largely theoretical) situation that
> the result set contains more than five inactive projects, so that
> the length of the "effective" list is shorter than expected.

Yes. This hack is the same approach taken when getting a short list of
bugs that may have private ones filtered out. The complaints from users
is about 1 deactivated project in the list. We are providing coverage
for 5 deactivated projects.

> And I would prefer it anyway to do the filtering in
> _getProjectsWithTheMostKarma(), even if it would be too cumbersome
> to implement in SQL.

I do not think that is practical. As I said above, the schema does not
suit this and fixing it is out of scope for this bug. Pillar names and
karma are just strings because the value is the name for lookups. The
tables have no idea what kind of object is needed. If you review our
lookup code, you will see we make multiple checks for each object type
looking for a matching object.

> > === modified file 'lib/lp/registry/tests/test_person.py'
> > --- lib/lp/registry/tests/test_person.py 2010-02-11 19:26:26 +0000
> > +++ lib/lp/registry/tests/test_person.py 2010-04-22 18:05:27 +0000
...
> > + def _makeKarmaCache(self, person, product, category_name_values):
> > + """Create a KarmaCache entry with the given arguments.
> > +
> > + In order to create the KarmaCache record we must switch to the DB
> > + user 'karma'. This requires a commit and invalidates the product
> > + instance.
> > + """
> > + transaction.commit()
> > + reconnect_stores('karmacacheupdater')
> > + total = 0
> > + # Insert category total for person and project.
> > + for category_name_value in category_name_values:
> > + category_name, value = category_name_value
>
> Just an aesthetical remark, but I would prefer a simple
>
> for category_name, value in category_name_values:

Agreed.

{{{
=== modified file 'lib/lp/registry/model/person.py'
--- lib/lp/registry/model/person.py 2010-04-22 16:15:47 +0000
+++ lib/lp/registry/model/person.py 2010-04-23 14:14:22 +0000
@@ -854,12 +854,15 @@
         # requested because deactivated pillars will be filtered out.
         extra_limit = limit + 5
         results = self._getProjectsWithTheMostKarma(limit=extra_limit)
- for pillar_name, karma in results[:limit]:
- pillar = getUtility(IPillarNameSet).getByName(pillar_name)
- if pillar.active:
+ for pillar_name, karma in results:
+ pillar = getUtility(IPillarNameSet).getByName(
+ pillar_name, ignore_inactive=True)
+ if pillar is not None:
                 contributions.append(
                     {'project': pillar,
                      'categories': self._getContributedCategories(pillar)})
+ if len(contributions) == limit:
+ break
         return contributions

     def _getProjectsWithTheMostKarma(self, limit=10):

=== modified file 'lib/lp/registry/tests/test_person.py'
--- lib/lp/registry/tests/test_person.py 2010-04-22 17:42:06 +0000
+++ lib/lp/registry/tests/test_person.py 2010-04-23 14:13:06 +0000
@@ -630,8 +630,7 @@
         reconnect_stores('karmacacheupdater')
         total = 0
         # Insert category total for person and project.
- for category_name_value in category_name_values:
- category_name, value = category_name_value
+ for category_name, value in category_name_values:
             category = KarmaCategory.byName(category_name)
             self.cache_manager.new(
                 value, person.id, category.id, product_id=product.id)

}}}

--
__Curtis C. Hovey_________
http://launchpad.net/

« Back to merge proposal