Merge lp:~kfogel/launchpad/505845-affects-me-too-from-dups into lp:launchpad/db-devel
| Status: | Rejected |
|---|---|
| Rejected by: | Jonathan Lange on 2010-03-22 |
| Proposed branch: | lp:~kfogel/launchpad/505845-affects-me-too-from-dups |
| Merge into: | lp:launchpad/db-devel |
| Diff against target: |
68 lines (+21/-2) 4 files modified
database/schema/comments.sql (+1/-0) database/schema/patch-2207-99-0.sql (+9/-0) lib/lp/code/browser/branchmergeproposallisting.py (+3/-1) lib/lp/code/browser/tests/test_branchmergeproposallisting.py (+8/-1) |
| To merge this branch: | bzr merge lp:~kfogel/launchpad/505845-affects-me-too-from-dups |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Stuart Bishop | db | 2010-01-18 | Disapprove on 2010-01-19 |
| Jonathan Lange (community) | 2010-01-18 | Needs Fixing on 2010-01-18 | |
|
Review via email:
|
|||
| Karl Fogel (kfogel) wrote : | # |
| Jonathan Lange (jml) wrote : | # |
Hey Karl,
A couple of questions:
- why doesn't the new column have a foreign key referring to the bug table?
- what's the test for?
Also, if I understand this correctly, this patch denormalizes the database -- information about whether the bug is a dupe is already stored somewhere. I guess this change to the model allows you to support crazy things like the example described above, but why is this desirable?
If we do keep this patch, then I think (but am not sure) that we need a uniqueness constraint such that any given bugaffecstperso
Thanks,
jml
| Karl Fogel (kfogel) wrote : | # |
Good point on the foreign key, d'oh. Thanks.
It _looks_ like denormalization, but semantically it's not, really: this new field doesn't mean "is / is not a dup of", it means "this bug-affects-person relationship should / should not be passed through to". That doesn't map reliably to "is / is not a dup of", it's just that they happen to coincide a lot -- but not always, hence the need for a separate record. It's true that that data is already in the system, if you iterate over enough bugs and people, but the problem is performance issues.
Thoughts?
Re the uniqueness constraint: yes. Note that person may want to claim to be unaffected by M but to still be affected by D, and in that scenario the bugaffectsperso
There is probably a way to express this constraint in SQL (abel theorizes): say that bugaffectsperso
| Jonathan Lange (jml) wrote : | # |
Weeeeeeeeeell, I'm not sure how we feelr about this sort of denormalization, but I'll be guided by stub's opinion here.
If you can think of how to do the constraint, that'd be good. I'll save an approve vote for an updated patch.
| Stuart Bishop (stub) wrote : | # |
This seems a very complicated way of duplicating information already available in the database. It is a denormalization, which we should avoid doing except to fix performance problems. I don't think the existing structure is a performance problem for bug heat, as bug heat is being recalculated as a background task and not on the fly when handling a request.
| Karl Fogel (kfogel) wrote : | # |
Marked branch as 'abandoned'. We had a conversation here (adeuring, gmb, kfogel) and have come around to a solution that doesn't need a schema change. You're right: the fact that bug heat is calculated as a background task makes the difference; and gmb explained that current performance problems with the subscriber portlet are not due to database queries but to over-iteration after the query (so there is no need for us to try to address those problems here).
Unmerged revisions
- 10189. By Karl Fogel on 2010-01-18
-
DB schema change for bug #505845 ("Affected users should be carried
through from duplicates to master").* database/
schema/ patch-2207- 99-0.sql: New file, adding new column
'master_affects' to table 'bugaffectsperson'.* database/
schema/ comments. sql: Update accordingly.

Make the DB change necessary to implement the solution described in https:/ /bugs.edge. launchpad. net/malone/ +bug/505845/ comments/ 1 . Short version:
This new field will allow Launchpad to remember when a person is affected by a bug via duplication, versus directly; and will be able to represent perverse things like people claiming to be affected by a bug but not by its dup.
The basic plan is to add a new column:
ALTER table bugaffectsperson ADD COLUMN master_affects integer;
By default, the new column has the same value as the 'bug' column. But when a bug D is marked as a dup of bug M, D's 'master_affects' is set to M (unless there was already a record indicating M does not affect P).
We'll do most queries on the 'master_affects' column (instead of on 'bug' as currently). For example, when a person P visits the page for bug M, we figure out whether to show it as affecting them based on whether there are any rows with M in 'master_affects'. And when we're calculating total affects-person counts for bug heat, we query on the master_affects column. Thus, if D is later unduplicated from M, the affects-person count on M will automatically be decremented properly.
See the above-linked comment for more details.
NOTE: There will need to be some code updates following this patch -- we know about them. Until then, you'll see test errors like 'IntegrityError: null value in column "master_affects" violates not-null constraint'.
OTHER NOTE: There is a possibility this solution will help with the performance issues we have with the subscriber portlet on the bugs pages; however, we're not positive enough to say that with 100% certainty yet. We just wanted to point it out for warm fuzzies.