Merge lp:~zematynnad/ubuntu-webcatalog/duplicates_987826 into lp:ubuntu-webcatalog

Proposed by Danny Tamez
Status: Merged
Approved by: Anthony Lenton
Approved revision: 127
Merged at revision: 122
Proposed branch: lp:~zematynnad/ubuntu-webcatalog/duplicates_987826
Merge into: lp:ubuntu-webcatalog
Diff against target: 93 lines (+41/-10)
2 files modified
src/webcatalog/tests/test_views.py (+26/-5)
src/webcatalog/views.py (+15/-5)
To merge this branch: bzr merge lp:~zematynnad/ubuntu-webcatalog/duplicates_987826
Reviewer Review Type Date Requested Status
Anthony Lenton (community) Approve
Review via email: mp+107091@code.launchpad.net

Commit message

Removes duplicate apps from top rated apps widget.

Description of the change

Overview
=========
This branch is a fix for http://pad.lv/987826 where duplicate apps were showing up in the top rated apps carousel.

Details
========
The problem was that apps for the same package but for different distroseries were being returned in the query. Since django doesn't do distinct on individual fields (at least not without a patch) the duplicates have to be filtered out a different way. I just used a dictionary with the packagename as the key to avoid the duplicates.

To Test
========
$fab bootstrap test

To post a comment you must log in.
Revision history for this message
Anthony Lenton (elachuni) wrote :

Hi Danny,

I really don't think the code is doing what we want there. Comments about the changes in src/webcatalog/views.py:
 - You iterate over all of top_rated_apps (tens of thousands of apps) to keep just the first settings.NUMBER_TOP_RATED_APPS off the filtered list (tens of apps). See if there's a way to only filter enough apps as we need.
 - For each package name you currently keep the last item on the list that matches the name (as the dict is overridden each time you find a new Application instance with the same package name), so I you're effectively keeping the *lowest* rated app instance for each package name.
 - Finally, you're storing the apps in a dict and then asking for values(), so you can't expect any order in top_rated_apps.
 - And, something really minor, Count is being imported unnecessarily.

Kind regards,

achuni.

review: Needs Fixing
126. By Danny Tamez

Remove unused import.
Ensure that as few rows as necessary are iterated on.
Ensure that order is preserved in final result.

127. By Danny Tamez

Refactor to filter as few as needed.

Revision history for this message
Anthony Lenton (elachuni) wrote :

Thanks Danny!

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/webcatalog/tests/test_views.py'
--- src/webcatalog/tests/test_views.py 2012-05-08 07:03:56 +0000
+++ src/webcatalog/tests/test_views.py 2012-05-23 22:34:23 +0000
@@ -697,13 +697,17 @@
697 for link in links:697 for link in links:
698 self.assertContains(response, link)698 self.assertContains(response, link)
699699
700 def make_high_medium_and_low_apps(self):700 def make_high_medium_and_low_apps(self, score_high=5.0, score_mid=4.0,
701 score_low=1.0):
701 high = self.factory.make_application(package_name='high',702 high = self.factory.make_application(package_name='high',
702 wilson_score=5.0, ratings_average=5, ratings_total=500)703 wilson_score=score_high, ratings_average=5,
704 ratings_total=500)
703 mid = self.factory.make_application(package_name='mid',705 mid = self.factory.make_application(package_name='mid',
704 wilson_score=4.0, ratings_average=4, ratings_total=400)706 wilson_score=score_mid, ratings_average=4,
707 ratings_total=400)
705 low = self.factory.make_application(package_name='low',708 low = self.factory.make_application(package_name='low',
706 wilson_score=1.0, ratings_average=1, ratings_total=100)709 wilson_score=score_low, ratings_average=1,
710 ratings_total=100)
707 return (high, mid, low)711 return (high, mid, low)
708712
709 def test_top_rated_apps_in_response(self):713 def test_top_rated_apps_in_response(self):
@@ -712,7 +716,7 @@
712 with patch_settings(NUMBER_TOP_RATED_APPS=1):716 with patch_settings(NUMBER_TOP_RATED_APPS=1):
713 response = self.client.get(reverse('wc-index'))717 response = self.client.get(reverse('wc-index'))
714718
715 self.assertEqual(1, response.context[0]['top_rated_apps'].count())719 self.assertEqual(1, len(response.context[0]['top_rated_apps']))
716 self.assertEqual(high, response.context[0]['top_rated_apps'][0])720 self.assertEqual(high, response.context[0]['top_rated_apps'][0])
717721
718 def test_top_rated_apps_display_correctly(self):722 def test_top_rated_apps_display_correctly(self):
@@ -746,6 +750,23 @@
746 self.assertContains(response, expected, 2)750 self.assertContains(response, expected, 2)
747 self.assertNotContains(response, app_not_included.package_name)751 self.assertNotContains(response, app_not_included.package_name)
748752
753 def test_top_rated_apps_does_not_include_duplicate_packages(self):
754 high, mid, low = self.make_high_medium_and_low_apps(5.0, 4.0, 1.0)
755 high2, mid2, low2 = self.make_high_medium_and_low_apps(4.8, 3.8, 0.8)
756 high3, mid3, low3 = self.make_high_medium_and_low_apps(4.6, 3.6, 0.6)
757 high4, mid4, low4 = self.make_high_medium_and_low_apps(4.4, 3.4, 0.4)
758
759 with patch_settings(NUMBER_TOP_RATED_APPS=4):
760 response = self.client.get(reverse('wc-index'))
761
762 apps = response.context['top_rated_apps']
763 # should have only first high, first mid and first low
764 # and in that order
765 self.assertEqual(len(apps), 3)
766 self.assertEqual(apps[0], high)
767 self.assertEqual(apps[1], mid)
768 self.assertEqual(apps[2], low)
769
749 def test_includes_cache_headers(self):770 def test_includes_cache_headers(self):
750 with patch_settings(CACHE_MIDDLEWARE_SECONDS=600):771 with patch_settings(CACHE_MIDDLEWARE_SECONDS=600):
751 response = self.client.get(reverse('wc-index'))772 response = self.client.get(reverse('wc-index'))
752773
=== modified file 'src/webcatalog/views.py'
--- src/webcatalog/views.py 2012-05-08 07:03:56 +0000
+++ src/webcatalog/views.py 2012-05-23 22:34:23 +0000
@@ -116,15 +116,25 @@
116 featured_apps = [Application.objects.find_best(package_name=app)116 featured_apps = [Application.objects.find_best(package_name=app)
117 for app in settings.FEATURED_APPS]117 for app in settings.FEATURED_APPS]
118 featured_apps = [x for x in featured_apps if x]118 featured_apps = [x for x in featured_apps if x]
119 top_rated_apps = (Application.objects.filter(wilson_score__gt=0)119 apps = (Application.objects.filter(wilson_score__gt=0)
120 .order_by('-wilson_score'))120 .order_by('-wilson_score'))
121 num = settings.NUMBER_TOP_RATED_APPS121 # filter out apps that are top rated more than once because
122 top_rated_apps = top_rated_apps[:num]122 # of different distroseries
123
124 seen = set()
125 top_rated = []
126 for app in apps:
127 if not app.package_name in seen:
128 top_rated.append(app)
129 seen.add(app.package_name)
130 if len(top_rated) == settings.NUMBER_TOP_RATED_APPS:
131 break
132
123 context = RequestContext(request, dict={133 context = RequestContext(request, dict={
124 'depts': depts,134 'depts': depts,
125 'exhibits': exhibits,135 'exhibits': exhibits,
126 'featured_apps': featured_apps,136 'featured_apps': featured_apps,
127 'top_rated_apps': top_rated_apps,137 'top_rated_apps': top_rated,
128 })138 })
129 return render_to_response('webcatalog/index.html',139 return render_to_response('webcatalog/index.html',
130 context_instance=context)140 context_instance=context)

Subscribers

People subscribed via source and target branches