Merge lp:~jcsackett/launchpad/plus-participation-additional-fixes into lp:launchpad
| Status: | Merged | ||||
|---|---|---|---|---|---|
| Approved by: | Edwin Grubbs on 2010-08-19 | ||||
| Approved revision: | no longer in the source branch. | ||||
| Merged at revision: | 11408 | ||||
| Proposed branch: | lp:~jcsackett/launchpad/plus-participation-additional-fixes | ||||
| Merge into: | lp:launchpad | ||||
| Diff against target: |
656 lines (+238/-87) 13 files modified
lib/canonical/launchpad/doc/sample-data-assertions.txt (+2/-2) lib/lp/bugs/doc/initial-bug-contacts.txt (+1/-1) lib/lp/bugs/templates/person-portlet-team-assignedbugs.pt (+1/-1) lib/lp/registry/browser/person.py (+34/-21) lib/lp/registry/browser/tests/test_person_view.py (+2/-2) lib/lp/registry/doc/person-account.txt (+2/-2) lib/lp/registry/doc/private-team-visibility.txt (+1/-1) lib/lp/registry/doc/teammembership.txt (+5/-5) lib/lp/registry/doc/vocabularies.txt (+1/-1) lib/lp/registry/interfaces/person.py (+7/-7) lib/lp/registry/model/person.py (+77/-15) lib/lp/registry/tests/test_person.py (+95/-7) lib/lp/registry/vocabularies.py (+10/-22) |
||||
| To merge this branch: | bzr merge lp:~jcsackett/launchpad/plus-participation-additional-fixes | ||||
| Related bugs: |
|
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Jelmer Vernooij (community) | code | Approve on 2010-08-22 | |
| Edwin Grubbs (community) | code | Approve on 2010-08-19 | |
| Māris Fogels (community) | 2010-08-16 | Needs Information on 2010-08-17 | |
|
Review via email:
|
|||
Commit Message
Attacks the +participation problem by stormifying the team_memberships method (previously myactivemembers
Description of the Change
Summary
Contributes to fixing the +participation time outs by converting team path look ups into one large SQL lookup followed by some graph traversal.
Previously the mp showed an implementation that largely cut-off the team_path lookup; the changes for it are still useful, so have not been removed from the branch, but the solution changed in the last few revisions.
Proposed fix
Stormifying where possible to assist db load; more importantly, using a graph of all teams the person participates in to create the team_paths for indirect teams as displayed.
Pre-implementation notes
Chatted with Curtis Hovey (sinzui)
Got schooled by Robert Collins (lifeless) in the ways of the graph, and agreed it was a much better solution.
Implementation details
Renames myactivememberships to team_memberships, per an XXX from kiko in the codebase. Updates call sites.
Stormifies team_memberships and updates callsites.
Uses a graph traversal in getPathsToTeams to find the path between the person and all teams, then filters out direct memberships from that to get indirect teams and their team_paths.
Demo and Q/A
bin/test -vvc -t TestPersonTeams
bin/test -vvc -t TestPersonParti
On launchpad.dev, checkout https:/
lint
= Launchpad lint =
Checking for conflicts and issues in changed files.
Linting changed files:
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
./lib/lp/
180: E302 expected 2 blank lines, found 1
230: E302 expected 2 blank lines, found 1
625: E302 expected 2 blank lines, found 1
1495: E301 expected 1 blank line, found 0
./lib/lp/
448: E302 expected 2 blank lines, found 1
The "expected n blank lines" were explained to me as a factor of lint having issues around comments and class definitions.
| Robert Collins (lifeless) wrote : | # |
| j.c.sackett (jcsackett) wrote : | # |
The findPathToTeam calculations were a problem b/c in the cases causing the timeout, the query gets called hundreds of times, leading to the largest set of time in the process. Because of the iteration through an unknown series of team relations, there's no easy way to predict how this will work.
As an example, in https:/
| Robert Collins (lifeless) wrote : | # |
On Wed, Aug 18, 2010 at 1:01 AM, j.c.sackett
<email address hidden> wrote:
> The findPathToTeam calculations were a problem b/c in the cases causing the timeout, the query gets called hundreds of times, leading to the largest set of time in the process. Because of the iteration through an unknown series of team relations, there's no easy way to predict how this will work.
>
> As an example, in https:/
Ok, so theres basic reason for it to be slow.
Have you considered this algorithm (psuedo code):
# 1 query, we want all the teams he participates in
# (that is, all rows that have the member == user, and the team == anything
# should return a list of tuples [(user, t1), (user, t2)] or something like that
memberships = list(get_
teams = [team for _, team in memberships]
# Get all the memberships of those teams themselves; because memberships
# are transitive, this will be a subset of the first result, and lets us infer
# a valid graph cheaply (there can be multiple paths to a team, we only
# need to show one)
team_temberships = get_memberships
graph_edges = list(team_
graph = dict(graph_edges)
graph.update(
# now render
...
| Robert Collins (lifeless) wrote : | # |
Bah, I got the direct and indirect person memberships the wrong way around.
Have you considered this algorithm (psuedo code):
# 1 query, we want all the teams he participates via the expanded memo:
# (that is, all rows that have the member == user, and the team == anything
# that is directly or indirectly in. This should return a list of tuples
# [(user, t1), (user, t2)] or something like that
memberships = list(get_
known_teams = [team for _, team in memberships]
known_members = known_teams + [user]
# Get all the memberships of those teams themselves; because memberships
# are transitive, we can query for the entire graph directly.
# Note that here we query direct memberships only
team_temberships = get_direct_
graph = dict(team_
# now render
...
| Māris Fogels (mars) wrote : | # |
Hi Jonathan,
I reviewed the branch and both the test and implementation of the changes look good. However, I did come upon one large question: did the IPerson.
Besides that, I only found minor points in the diff:
• Does the new 'via' argument to _asParticipation() need a epydoc stanza in the method's docstring? Otherwise the argument type is not obvious.
• Should there be a comment on the line 53 of the diff, before calling findPathToTeam(
• It looks like the 'from storm.expr' import in model/person.py was ordered alphabetically, but Desc isn't in order.
• You probably want to use an epydoc :param: stanza for the new limit argument in the IPerson.
I am marking this as Needs Information for now. When the IPerson interface questions are answered then we can approve this to land.
Maris
| Māris Fogels (mars) wrote : | # |
Looking at the test commands, I see that only the TestPersonTeams and TestPersonParti
| j.c.sackett (jcsackett) wrote : | # |
Maris--
You were right, I had borked the interface. I've since updated the interface to show the new parameters.
Per suggestion from Robert, a lot has changed in this code; I'm unsure if you want to continue this MP or kill it and have me submit another. All the code reviewed is still there, as the changes were still good ones, but many of the changes are no longer taken advantage of by person/
| j.c.sackett (jcsackett) wrote : | # |
Robert--
I've pushed up changes that I believe implement your idea. Rather than picking the shortest path though, it uses the oldest path to maintain consistency with the data previously presented (save of course those occasions when the shortest path is a direct_membership).
Take a look and let me know what you think.
| Robert Collins (lifeless) wrote : | # |
129 """Validate the person is a real person with no other restrictions."""
130 +
131 def validate(person):
That new vertical whitespace is non-PEP8 - this is a curried function, so its appropriate to have it adjacent and not break the eye's flow reading past it - please don't do that. If your editor is telling you to do it, give it a good spanking.
You also have some list reflows that make things less readable - our style guide encourages readability over sheer compactness; I'm on the fence myself, but wanted you to be aware that you have discretion there - what the code had was fine, with the lines and indentation clearly grouping the query clauses.
The implementation of a 2-query look up looks pretty nice, I'm glad it turned out well. I can't really judge the ui result trivially, perhaps a screenshot of a complexish situation?
I haven't done a full code review, I think you should carry on with Maris or whomever is reviewer-de-jour.
| j.c.sackett (jcsackett) wrote : | # |
Robert--
Thanks, I'll look into the reflows and that line break.
I agree on pursuing the review with Maris or whomever; I was interested in your view of my implementation of the two query graph thing. Really great idea, btw.
| Edwin Grubbs (edwin-grubbs) wrote : | # |
Hi JC,
The main thing I learned from reviewing this branch is that we should
not give new employees programming tasks concerning graph traversal
unless we want to scare them away. I think that is the most confusing
code I've ever seen in the registry.
I am impressed with all that you have accomplished in this branch. Most
of my comments are related to formatting, which is to be expected with
new employees. I felt your pain.
You really need to run "make lint" against the branch, especially
test_person.py.
-Edwin
>=== modified file 'lib/lp/
>--- lib/lp/
>+++ lib/lp/
>@@ -3078,31 +3078,34 @@
> def label(self):
> return 'Team participation for ' + self.context.
>
>- def _asParticipatio
>+ def _asParticipatio
> """Return a dict of participation information for the membership.
>
> Method requires membership or team, not both.
>+ :param via: The team through which the membership in the indirect
>+ is established.
"in the indirect" seems to be missing a noun.
> """
> if ((membership is None and team is None) or
> (membership is not None and team is not None)):
> raise AssertionError(
> "The membership or team argument must be provided, not both.")
>
>- if team_path is None:
>- via = None
>- else:
>- via = COMMASPACE.
>+ if via is not None:
>+ # When showing the path, it's unnecessary to show the team in
>+ # question at the beginning of the path, or the user at the
>+ # end of the path.
>+ via = COMMASPACE.join(
>+ [via_team.
>
> if membership is None:
>- #we have membership via an indirect team, and can use
>- #sane defaults
>+ # Membership is via an indirect team; sane defaults exist.
>
>- #the user can only be a member of this team
>+ # The user can only be a member of this team.
The space between these two comments makes it look like a line of
code disappeared. "sane defaults exist" kinda confuses me. I see what
you are going for with a comment for the block and a separate comment
for the first statement. It also took a second to understand what "can
only be a member" meant. Maybe, the two comments can be combined as:
# An indirect member cannot be the Owner or Admin of the team.
> role = 'Member'
>- #the user never joined, and can't have a join date
>+ # The user never joined, and can't have a join date.
"user" almost always refers to the logged-in user, so it is better
to say "person" if the function is not dealing with the logged-in user.
> datejoined = None
> else:
>- #the member is a direct member, so we can use membership data
>+ # The member is a direct memb...
| j.c.sackett (jcsackett) wrote : | # |
Edwin--
Thanks for all the comments; I'll make those changes and push them up soon.
On Aug 18, 2010, at 9:48 PM, Edwin Grubbs wrote:
> Review: Needs Fixing
> Hi JC,
>
> The main thing I learned from reviewing this branch is that we should
> not give new employees programming tasks concerning graph traversal
> unless we want to scare them away. I think that is the most confusing
> code I've ever seen in the registry.
>
> I am impressed with all that you have accomplished in this branch. Most
> of my comments are related to formatting, which is to be expected with
> new employees. I felt your pain.
>
> You really need to run "make lint" against the branch, especially
> test_person.py.
>
>
> -Edwin
>
>
>> === modified file 'lib/lp/
>> --- lib/lp/
>> +++ lib/lp/
>> @@ -3078,31 +3078,34 @@
>> def label(self):
>> return 'Team participation for ' + self.context.
>>
>> - def _asParticipatio
>> + def _asParticipatio
>> """Return a dict of participation information for the membership.
>>
>> Method requires membership or team, not both.
>> + :param via: The team through which the membership in the indirect
>> + is established.
>
>
> "in the indirect" seems to be missing a noun.
>
>
>> """
>> if ((membership is None and team is None) or
>> (membership is not None and team is not None)):
>> raise AssertionError(
>> "The membership or team argument must be provided, not both.")
>>
>> - if team_path is None:
>> - via = None
>> - else:
>> - via = COMMASPACE.
>> + if via is not None:
>> + # When showing the path, it's unnecessary to show the team in
>> + # question at the beginning of the path, or the user at the
>> + # end of the path.
>> + via = COMMASPACE.join(
>> + [via_team.
>>
>> if membership is None:
>> - #we have membership via an indirect team, and can use
>> - #sane defaults
>> + # Membership is via an indirect team; sane defaults exist.
>>
>> - #the user can only be a member of this team
>> + # The user can only be a member of this team.
>
>
>
> The space between these two comments makes it look like a line of
> code disappeared. "sane defaults exist" kinda confuses me. I see what
> you are going for with a comment for the block and a separate comment
> for the first statement. It also took a second to understand what "can
> only be a member" meant. Maybe, the two comments can be combined as:
> # An indirect member cannot be the Owner or Admin of the team.
>
>
>> role = 'Member'
>> - #the user never joined, and can't have a join date
>> + # The user never joined, and can't have a join date.
>
>
> "user" almost always ...
| j.c.sackett (jcsackett) wrote : | # |
More detailed reply with updated code...
> I am impressed with all that you have accomplished in this branch. Most
> of my comments are related to formatting, which is to be expected with
> new employees. I felt your pain.
Actually, once I ran through some experiments in a terminal, it wasn't *that* painful. But thanks. :-)
> You really need to run "make lint" against the branch, especially
> test_person.py.
I thought I had, but I think I ran make lint before committing some files, and so only got the issues with those files.
> The style guide doesn't explicitely mention this, but I think the
> formatting choices should be similar to a function call. Each line of
> the list comprehension should line up so they are easier to read:
> Either:
> result = [foo
> bar
> baz]
> Or:
> result = [
> foo
> bar
> baz]
>
> For actual lists, the ending square bracket has to be on a line by
> itself. There isn't an exact rule for list comprehensions, but
> pocketlint will complain if you put the square bracket on a line by
> itself without a comma on the previous line, and of course, that only
> works for lists.
I've broken up the list comprehension as
indirect_teams = [
team for team in paths.keys() if
team not in direct_teams]
It seems clean to break the conditional filter onto its own line.
> I don't see the limit being used anywhere? Is that feature still needed?
>
> If it's needed the docstring should describe what limit is. Look in other
> docstrings, which use the convention:
> :param PARAMNAME: DESCRIPTION
The limit isn't being used anywhere, but I had left it in b/c I could see use of this method
being a problem for someone else down the road. However, I can see it's presence now
is just confusing, and it's a sufficiently obvious solution that if it's actually needed down
the road it can be redone then.
>
>
>> # This is our guarantee that _getDirectMembe
>> # never return None
>> assert self.hasPartici
>> @@ -1687,19 +1699,16 @@
>> self.invited_
>> orderBy=
>>
>> - # XXX: kiko 2005-10-07:
>> - # myactivememberships should be renamed to team_memberships and be
>> - # described as the set of memberships for the object's teams.
>> @property
>> - def myactivemembers
>> + def team_membership
>> """See `IPerson`."""
>> - return TeamMembership.
>> - TeamMembership.
>> - Person.id = TeamMembership.team
>> - """ % sqlvalues(self.id, [TeamMembership
>> - TeamMembershipS
>> - clauseTables=
>> - orderBy=
>> + Team = ClassAlias(Person, "Team")
>> + store = Store.of(self)
>> + return store.find(
>> + And(TeamMembers
>> + Team.id == TeamMembership.
>> + TeamMembership.
>> + ...
| Edwin Grubbs (edwin-grubbs) wrote : | # |
Hi JC,
I just have a few comments more before it's ready to land.
-Edwin
> > The style guide doesn't explicitely mention this, but I think the
> > formatting choices should be similar to a function call. Each line of
> > the list comprehension should line up so they are easier to read:
> > Either:
> > result = [foo
> > bar
> > baz]
> > Or:
> > result = [
> > foo
> > bar
> > baz]
> >
> > For actual lists, the ending square bracket has to be on a line by
> > itself. There isn't an exact rule for list comprehensions, but
> > pocketlint will complain if you put the square bracket on a line by
> > itself without a comma on the previous line, and of course, that only
> > works for lists.
>
> I've broken up the list comprehension as
>
> indirect_teams = [
> team for team in paths.keys() if
> team not in direct_teams]
>
> It seems clean to break the conditional filter onto its own line.
When scanning code, a person often doesn't notice the very end of the line. Here is an extreme example to illustrate my point:
result = [
a in blist]
The user might think you are iterating over blist instead of alist. While your list comprehension isn't nearly as hazardous, I think putting the "if" at the start of the line will help readers parse it quickly.
> > I don't see the limit being used anywhere? Is that feature still needed?
> >
> > If it's needed the docstring should describe what limit is. Look in other
> > docstrings, which use the convention:
> > :param PARAMNAME: DESCRIPTION
>
> The limit isn't being used anywhere, but I had left it in b/c I could see use
> of this method
> being a problem for someone else down the road. However, I can see it's
> presence now
> is just confusing, and it's a sufficiently obvious solution that if it's
> actually needed down
> the road it can be redone then.
>
>
> >
> >
> >> # This is our guarantee that _getDirectMembe
> >> # never return None
> >> assert self.hasPartici
> >> @@ -1687,19 +1699,16 @@
> >> self.invited_
> >> orderBy=
> >>
> >> - # XXX: kiko 2005-10-07:
> >> - # myactivememberships should be renamed to team_memberships and be
> >> - # described as the set of memberships for the object's teams.
> >> @property
> >> - def myactivemembers
> >> + def team_membership
> >> """See `IPerson`."""
> >> - return TeamMembership.
> >> - TeamMembership.
> >> - Person.id = TeamMembership.team
> >> - """ % sqlvalues(self.id, [TeamMembership
> >> - TeamMembershipS
> >> - clauseTables=
> >> - orderBy=
> >> + Team = ClassAlias(Person, "Team")
> >> + store = Store.of(self)
> >> + return store.find(
> >> + And(TeamMembers
> >> + ...
| Edwin Grubbs (edwin-grubbs) wrote : | # |
Hi JC,
This looks good. BTW, it is customary to add a comment with the incremental diff of changes. This way the reviewer gets all the changes they need to look at in an email. I've included them below in case anybody else is following the progress on this.
-Edwin
=== modified file 'lib/lp/
--- lib/lp/
+++ lib/lp/
@@ -3101,7 +3101,7 @@
# Membership is via an indirect team so sane defaults exist.
# An indirect member cannot be an Owner or Admin of a team.
role = 'Member'
- # The Person joined, and can't have a join date.
+ # The Person never joined, and can't have a join date.
else:
# The member is a direct member; use the membership data.
@@ -3133,8 +3133,8 @@
paths, memberships = self.context.
- team for team in paths.keys() if
- team not in direct_teams]
+ team for team in paths.keys()
+ if team not in direct_teams]
# First, create a participation for all direct memberships.
=== modified file 'lib/lp/
--- lib/lp/
+++ lib/lp/
@@ -1712,7 +1712,6 @@
store = Store.of(self)
return store.find(
- Team.id == TeamMembership.
@@ -2080,7 +2079,7 @@
# Get all of the memberships for any of the teams this person is
# a participant of. This must be ordered by date and id because
- # because the graph the results will create needs to contain
+ # because the graph of the results will create needs to contain
# the oldest path information to be consistent with results from
# IPerson.
store = Store.of(self)
@@ -2114,7 +2113,7 @@
- return (paths, user_memberships)
+ return (paths, user_memberships)
@property
def teams_participa
=== modified file 'lib/lp/
--- lib/lp/
+++ lib/lp/
@@ -75,7 +75,7 @@
- def test_path_
+ def test_path_
path_to_a = self.user.
path_to_b = self.user.
path_to_c = self.user.
@@ -84,15 +84,6 @@
...
| j.c.sackett (jcsackett) wrote : | # |
=== modified file 'lib/lp/
--- lib/lp/
+++ lib/lp/
@@ -2203,6 +2203,10 @@
# Get all of the teams this person participates in.
teams = list(self.
+ # For cases where self is a team, we don't need self as a team
+ # participated in.
+ teams = [team for team in teams if team != self]
+
# Get all of the memberships for any of the teams this person is
# a participant of. This must be ordered by date and id because
# because the graph of the results will create needs to contain
@@ -2210,9 +2214,16 @@
# IPerson.
store = Store.of(self)
- TeamMembership.
- [team.id for team in teams] + [self.id]
- Desc(TeamMember
+ And(
+ TeamMembership.
+ [team.id for team in teams] + [self.id]),
+ TeamMembership.
+ TeamMembership.
+ TeamMembershipS
+ TeamMembershipS
+ ]))).order_by(
+ Desc(TeamMember
+ Desc(TeamMember
# Cast the results to list now, because they will be iterated over
# several times.
| Robert Collins (lifeless) wrote : | # |
On Sun, Aug 22, 2010 at 4:46 PM, j.c.sackett
<email address hidden> wrote:
> === modified file 'lib/lp/
> --- lib/lp/
> +++ lib/lp/
> @@ -2203,6 +2203,10 @@
> # Get all of the teams this person participates in.
> teams = list(self.
>
> + # For cases where self is a team, we don't need self as a team
> + # participated in.
> + teams = [team for team in teams if team != self]
This is not idiomatic python. Does the following work?
teams = [team for team in teams if team is not self]
It may not due to security proxies; worth a comment if it doesn't to
avoid cleanups taking someones time.
-Rob
| Jelmer Vernooij (jelmer) wrote : | # |
+1 on the improvements after the ec2 test run.

Could you explain why the indirect team path calculations were a problem?