charmworld

Merge lp:~jcsackett/charmworld/better-latency-refactor into lp:~juju-jitsu/charmworld/trunk


Proposed by j.c.sackett on 2013-10-17

Status:	Merged
Approved by:	j.c.sackett on 2013-10-17
Approved revision:	427
Merged at revision:	422
Proposed branch:	lp:~jcsackett/charmworld/better-latency-refactor
Merge into:	lp:~juju-jitsu/charmworld/trunk
Diff against target:	364 lines (+180/-76), 2 files modified:
	charmworld/jobs/review.py (+73/-30)
	charmworld/jobs/tests/test_review.py (+107/-46)
To merge this branch:	bzr merge lp:~jcsackett/charmworld/better-latency-refactor

Reviewer	Review Type	Date Requested	Status
Juju Gui Bot	continuous-integration		Approve on 2013-10-17
Aaron Bentley (community)		2013-10-17	Approve on 2013-10-17
Review via email: mp+191615@code.launchpad.net

Commit message

Better latency calculation for tasks.

Description of the change

charmworld/jobs/review.py
-------------------------
The criteria for pulling bugtasks have changed: we now filter to only bugs
with linked branches. A bug without a linked branch is managed solely in
Launchpad; it only becomes a review queue item once a branch has been linked.

The latency calculations for both tasks and proposals have been broken out into
their own functions.

`calculate_task_latency` tries to calculate latency based on information about
when the branch was linked. This process can get complicated.
* If there's only one branch linked event, we can safely use its date as the
  start_date.
* If there are multiple events, whether b/c there are multiple branches
  currently linked or because several branches have been linked and unlinked,
  we have to find the first linked event that corresponds to a currently linked
  branch, as that is the "review item" that's actually being measured.
  `calculate_task_latency` filters the events against the list of
  linked_branches to do this.
* There is always the chance filtering results in zero branches; this is
  because a branch could have been linked, and then the name could be changed
  (e.g. the user could have changed his/her display name). Since the branch
  event is only recorded as a string, the string doesn't get updated, and we
  can't match it to the currently linked branch. An example of this is in bug
  #1125869. This should be very rare--there have to have been several link
  events, and the name of the branch has to have been changed. Because it's
  rare, and there's no real way to recover, we abort calculating latency in this
  case.
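
The three cases above can be sketched on their own, outside of Launchpad. This
is only an illustration, not the code in the branch: `find_start_date` and
`parse_lp_date` are hypothetical names, the entries are plain dicts shaped like
the ones used in the tests, and the date strings are assumed to look like
`2013-10-01T12:00:00.123456`. It also assumes entries arrive oldest first, so
the first match is the earliest link event.

```python
import datetime

DATE_FORMAT = '%Y-%m-%dT%H:%M:%S'


def parse_lp_date(value):
    # Drop any fractional seconds before parsing, as the branch does
    # with split('.')[0].
    return datetime.datetime.strptime(value.split('.')[0], DATE_FORMAT)


def find_start_date(link_entries, branch_names):
    """Return the datetime of the relevant 'branch linked' event, or None.

    Hypothetical helper: link_entries is a list of dicts that have already
    been filtered to 'branch linked' events; branch_names holds the display
    names of the branches that are linked right now.
    """
    if len(link_entries) == 1:
        # Only one link event: its date is the start date.
        return parse_lp_date(link_entries[0]['datechanged'])
    # Several events (or none): keep those for currently linked branches.
    current = [e for e in link_entries if e['newvalue'] in branch_names]
    if not current:
        # Nothing matches, e.g. the branch display name changed after it
        # was linked. There is no way to recover, so give up.
        return None
    return parse_lp_date(current[0]['datechanged'])


entries = [
    {'whatchanged': 'branch linked',
     'newvalue': 'lp:~foo/bar/old',
     'datechanged': '2013-10-01T12:00:00.123456'},
    {'whatchanged': 'branch linked',
     'newvalue': 'lp:~foo/bar/baz',
     'datechanged': '2013-10-02T09:30:00.000001'},
]
start = find_start_date(entries, ['lp:~foo/bar/baz'])
renamed = find_start_date(entries, ['lp:~foo/bar/renamed'])
```

Here `start` is the date of the second event, since `lp:~foo/bar/old` is no
longer linked, and `renamed` is `None` because no event matches the current
branch name.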

`calculate_proposal_latency` does the same thing the inlined code used to do,
but as part of the ongoing attempts to make the review code more manageable it
seemed wise to move it to its own function to mirror the task processing.

charmworld/jobs/tests/test_review.py
------------------------------------
Tests for the latency functions have been added.

The Mock* classes have been replaced by functions that return actual mocks,
which make it much easier to add things like `task.bug.activity.entries`.

Revision history for this message

Aaron Bentley (abentley) wrote on 2013-10-17:

As discussed on IRC, please don't calculate NOW and WINDOW at module import time (e.g. calculate it in update_review_queue and pass it into the other functions.)
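
A minimal sketch of the shape being asked for, assuming Python 3 and using
hypothetical names (`elapsed_seconds`, `summarize`) rather than the real
functions in review.py: "now" is computed once inside the entry point and
handed to the helpers, so tests can pin it to a fixed value.

```python
import datetime


def elapsed_seconds(start, now):
    # `now` is an argument, so the result depends only on the inputs.
    return (now - start).total_seconds()


def summarize(start_dates, now=None):
    # Compute "now" once per run, at call time rather than import time.
    if now is None:
        # Naive UTC datetime, matching the naive datetimes used here.
        now = datetime.datetime.now(
            datetime.timezone.utc).replace(tzinfo=None)
    window = now - datetime.timedelta(days=180)
    # Only items created inside the window contribute a latency.
    return [elapsed_seconds(s, now) for s in start_dates if s >= window]


fixed_now = datetime.datetime(2013, 10, 17)
result = summarize(
    [fixed_now - datetime.timedelta(days=1),
     fixed_now - datetime.timedelta(days=200)],
    now=fixed_now)
```

With `now` pinned, the one-day-old item yields 86400 seconds and the
200-day-old item falls outside the window, whenever the test happens to run.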

I actually prefer fakes[1] such as MockTask, MockBug, MockProposal, because they match the real implementation's behaviour better. They will puke if you try to access a member that doesn't exist, or use it the wrong way. For example, if you have a typo in a method name, a MagicMock won't notice, and your code may not work in production. However, I won't block on this basis, because it's a safety / convenience tradeoff, and ultimately a matter of personal preference.

[1] I'm using "fake" in this sense: http://www.martinfowler.com/bliki/TestDouble.html
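
To make the tradeoff concrete, here is a small sketch. It assumes Python 3's
`unittest.mock` (the branch imports the standalone `mock` package), and
`FakeTask` is a hypothetical stand-in rather than one of the classes from the
test module. The `spec` variant is a middle ground that is not part of this
branch.

```python
from unittest import mock


class FakeTask:
    """Hand-written fake with only the attributes the code needs."""

    def __init__(self, status="New"):
        self.status = status
        self.date_fix_committed = None


# A plain Mock creates attributes on demand, so a misspelled name
# ("commited" instead of "committed") goes unnoticed.
loose = mock.Mock(status="New")
typo_value = loose.date_fix_commited

# The fake raises AttributeError for the same typo.
fake = FakeTask()
try:
    fake.date_fix_commited
    fake_caught_typo = False
except AttributeError:
    fake_caught_typo = True

# A Mock built with spec= only allows attributes the spec object has,
# so it also rejects the typo.
strict = mock.Mock(spec=FakeTask(), status="New")
try:
    strict.date_fix_commited
    spec_caught_typo = False
except AttributeError:
    spec_caught_typo = True
```

The plain mock hands back another `Mock` for the misspelled attribute, while
both the fake and the spec'd mock fail loudly.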

review: Approve

Juju Gui Bot (juju-gui-bot) wrote on 2013-10-17:

FAILED: Autolanding.
More details in the following jenkins job:
http://162.213.35.27:8080/job/charmworld-autoland-lxc/43/
Executed test runs:

review: Needs Fixing (continuous-integration)

lp:~jcsackett/charmworld/better-latency-refactor updated on 2013-10-17

427. By j.c.sackett on 2013-10-17: Shut up, lint.

Juju Gui Bot (juju-gui-bot) on 2013-10-17:

review: Approve (continuous-integration)

Preview Diff

Subscribers

People subscribed via source and target branches

to all changes:

Antonio Rosales

Charmworld Developers

Deryck Hodge

j.c.sackett

 === modified file 'charmworld/jobs/review.py'
 --- charmworld/jobs/review.py	2013-10-04 19:01:39 +0000
 +++ charmworld/jobs/review.py	2013-10-17 20:16:11 +0000
@@ -38,10 +38,20 @@
      # older than the window that is still open. Right now this is probably the
      # more performant way to go about it, but that's going to change as the
      # data outside of our window gets larger.
++    try:
++        lp_credentials = get_ini()['lp_credentials_file']
++    except KeyError:
++        lp_credentials = None
      with Timer() as timer:
--        lp = Launchpad.login_anonymously('charm-tools',
--                                         'production',
--                                         version='devel')
++        if lp_credentials:
++            lp = Launchpad.login_with('charm-tools',
++                                      'production',
++                                      version='devel',
++                                      credentials_file=lp_credentials)
++        else:
++            lp = Launchpad.login_anonymously('charm-tools',
++                                             'production',
++                                             version='devel')
          # Entity roots for items of interests
          charm = lp.distributions['charms']
@@ -50,13 +60,13 @@
      log.debug("lp login time %0.2f", timer.duration())
++    branch_filter = "Show only Bugs with linked Branches"
      with Timer() as timer:
          bugs = charm.searchTasks(
              tags=['new-formula', 'new-charm'],
--            tags_combinator="Any")
--        # Best practice  dictates that charmers be subscribed to all bugs
--        # and merge proposals for official source package branches.
--        charmers_bugs = charmers.searchTasks()
++            tags_combinator="Any",
++            linked_branches=branch_filter)
++        charmers_bugs = charmers.searchTasks(linked_branches=branch_filter)
          proposals = charmers.getRequestedReviews()
          contributor_proposals = charm_contributors.getRequestedReviews()
@@ -69,27 +79,67 @@
              itertools.chain(proposals, contributor_proposals, doc_proposals))
++def calculate_proposal_latency(proposal, now):
++    if proposal.date_reviewed:
++        latency = (
++            proposal.date_reviewed - proposal.date_review_requested)
++    else:
++        # `now` has been calculated in UTC, and LP returns UTC TZ
++        # datetimes. We can safely remove the TZ, letting us get the
++        latency = (
++            now - proposal.date_review_requested.replace(tzinfo=None))
++    return latency.total_seconds()
++
++
++def calculate_task_latency(task, now):
++    link_entries = [e for e in task.bug.activity.entries if
++                    e['whatchanged'] == 'branch linked']
++    if len(link_entries) == 1:
++        start_date = link_entries[0]['datechanged']
++        start_date = datetime.datetime.strptime(
++            start_date.split('.')[0], '%Y-%m-%dT%H:%M:%S')
++    elif len(link_entries) > 1:
++        #Filter down to links associated with the currently linked branches
++        linked_branches = task.bug.linked_branches
++        branch_names = [link.branch.display_name for link in linked_branches]
++        current_entries = [e for e in link_entries if
++                           e['newvalue'] in branch_names]
++        if current_entries != []:
++            start_date = current_entries[0]['datechanged']
++            start_date = datetime.datetime.strptime(
++                start_date.split('.')[0], '%Y-%m-%dT%H:%M:%S')
++        else:
++            # At some point, the name of the branch changed, so we can't find
++            # the event when it was added. This happens rarely, so we just
++            # abandon stats for this task.
++            return
++    elif link_entries == []:
++        # We have no entries, probably b/c the lp step failed to login with
++        # authentication.
++        return
++
++    if task.date_fix_committed:
++        end_date = task.date_fix_committed.replace(tzinfo=None)
++    else:
++        end_date = now
++    latency = end_date - start_date
++    return latency.total_seconds()
++
++
  def update_review_queue(db, tasks, proposals, log):
      '''Updates the review queue information.'''
++    now = datetime.datetime.utcnow()
++    window = now - datetime.timedelta(days=180)  # 6 month window for latency
      bug_statuses = [
          'New', 'Confirmed', 'Triaged', 'In Progress', 'Fix Committed']
--    proposal_statuses = ['Needs review', 'Needs fixing', 'Needs information']
--    now = datetime.datetime.utcnow()
--    window = now - datetime.timedelta(days=180)  # 6 month window for latency
--
      r_queue = []
      latencies = []
      for task in tasks:
          if (task.date_created.replace(tzinfo=None) >= window or
              task.status in bug_statuses):
--            if task.date_confirmed:
--                latency = task.date_confirmed - task.date_created
--            else:
--                # Now has been calculated in UTC, and LP returns UTC TZ
--                # datetimes. We can safely remove the TZ, letting us get the
--                # timedelta.
--                latency = now - task.date_created.replace(tzinfo=None)
--            latencies.append(latency.total_seconds())
++            latency = calculate_task_latency(task, now)
++            if latency:
++                latencies.append(latency)
          if task.status in bug_statuses:
              entry = {
@@ -102,19 +152,13 @@
+             }
              r_queue.append(entry)
++    proposal_statuses = [
++        'Needs review', 'Needs fixing', 'Needs information']
      for proposal in proposals:
          if (proposal.date_created.replace(tzinfo=None) >= window or
              proposal.queue_status in proposal_statuses):
--            if proposal.date_reviewed:
--                latency = (
--                    proposal.date_reviewed - proposal.date_review_requested)
--            else:
--                # Now has been calculated in UTC, and LP returns UTC TZ
--                # datetimes. We can safely remove the TZ, letting us get the
--                latency = (
--                    now - proposal.date_review_requested.replace(tzinfo=None))
--            latencies.append(latency.total_seconds())
--
++            latency = calculate_proposal_latency(proposal, now)
++            latencies.append(latency)
          if proposal.queue_status in proposal_statuses:
              parts = proposal.target_branch_link.rsplit('/', 3)
              if parts[-1] == "trunk":
@@ -131,7 +175,6 @@
              entry = {
                  'type': 'proposal',
--                'latency': latency.total_seconds(),
                  'date_modified': last_modified,
                  'date_created': proposal.date_created,
                  'summary': "Merge for %s from %s" % (target, origin),
 === modified file 'charmworld/jobs/tests/test_review.py'
 --- charmworld/jobs/tests/test_review.py	2013-10-15 20:04:46 +0000
 +++ charmworld/jobs/tests/test_review.py	2013-10-17 20:16:11 +0000
@@ -3,7 +3,13 @@
  import datetime
  import logging
--from charmworld.jobs.review import update_review_queue
++import mock
++
++from charmworld.jobs.review import (
++    calculate_proposal_latency,
++    calculate_task_latency,
++    update_review_queue,
++)
  from charmworld.testing import MongoTestBase
@@ -12,45 +18,45 @@
  TWO_DAYS = NOW - datetime.timedelta(2)
--class MockBug:
--    date_last_updated = NOW
--
--
--class MockTask:
--
--    def __init__(self, confirmed=False):
--        self.date_created = TWO_DAYS
--        self.title = u'Bug #0 in a collection: "A bug task"'
--        self.web_link = "http://example.com"
--        self.bug = MockBug()
--        if confirmed:
--            self.status = "Confirmed"
--            self.date_confirmed = YESTERDAY
--        else:
--            self.date_confirmed = None
--            self.status = "New"
--
--
--class MockCommentCollection:
--    total_size = 0
--
--
--class MockProposal:
--
--    def __init__(self, reviewed=False):
--        base = 'https://api.launchpad.net/devel/'
--        self.target_branch_link = base + '~charmers/charms/precise/foo/trunk'
--        self.source_branch_link = base + '~bar/charms/precise/foo/fnord'
--        self.all_comments = MockCommentCollection()
--        self.date_review_requested = TWO_DAYS
--        self.date_created = TWO_DAYS
--        self.web_link = 'http://example.com'
--        if reviewed:
--            self.date_reviewed = YESTERDAY
--            self.queue_status = "Approved"
--        else:
--            self.date_reviewed = None
--            self.queue_status = "Needs review"
++def get_mock_task(confirmed=False, entries=[], linked_branches=[]):
++    if confirmed:
++        status = "Confirmed"
++        date_fix_committed = YESTERDAY
++    else:
++        date_fix_committed = None
++        status = "New"
++    return mock.Mock(
++        status=status,
++        date_fix_committed=date_fix_committed,
++        date_created=TWO_DAYS,
++        title=u'Bug #0 in a collection: "A bug task"',
++        web_link="http://example.com",
++        bug=mock.Mock(
++            date_last_updated=NOW,
++            linked_branches=linked_branches,
++            activity=mock.Mock(entries=entries)
++        ),
++    )
++
++
++def get_mock_proposal(reviewed=False):
++    base = 'https://api.launchpad.net/devel/'
++    if reviewed:
++        date_reviewed = YESTERDAY
++        queue_status = "Approved"
++    else:
++        date_reviewed = None
++        queue_status = "Needs review"
++    return mock.Mock(
++        target_branch_link=base + '~charmers/charms/precise/foo/trunk',
++        source_branch_link=base + '~bar/charms/precise/foo/fnord',
++        all_comments=mock.Mock(total_size=0),
++        date_review_requested=TWO_DAYS,
++        date_created=TWO_DAYS,
++        web_link='http://example.com',
++        date_reviewed=date_reviewed,
++        queue_status=queue_status,
++    )
  class ReviewTest(MongoTestBase):
@@ -63,7 +69,7 @@
      def test_review_handles_bugs(self):
          log = logging.getLogger('foo')
--        update_review_queue(self.db, [MockTask()], [], log)
++        update_review_queue(self.db, [get_mock_task()], [], log)
          self.assertEqual(1, self.db.review_queue.count())
          item = self.db.review_queue.find_one()
          self.assertEqual(NOW.date(), item['date_modified'].date())
@@ -74,7 +80,7 @@
      def test_review_handles_proposals(self):
          log = logging.getLogger('foo')
--        update_review_queue(self.db, [], [MockProposal()], log)
++        update_review_queue(self.db, [], [get_mock_proposal()], log)
          self.assertEqual(1, self.db.review_queue.count())
          item = self.db.review_queue.find_one()
          self.assertEqual(TWO_DAYS.date(), item['date_modified'].date())
@@ -84,10 +90,65 @@
          self.assertEqual('http://example.com', item['item'])
          self.assertEqual('Needs review', item['status'])
--    def test_review_calculates_latency(self):
++    def test_calculate_task_latency_with_two_valid_links(self):
++        linked_branches = [
++            mock.Mock(branch=mock.Mock(display_name='lp:~foo/bar/bjorn')),
++            mock.Mock(branch=mock.Mock(display_name='lp:~foo/bar/baz'))
++        ]
++        two_days = datetime.datetime.strftime(
++            TWO_DAYS, '%Y-%m-%dT%H:%M:%S')
++        yesterday = datetime.datetime.strftime(
++            YESTERDAY, '%Y-%m-%dT%H:%M:%S')
++        entries = [
++            {'whatchanged': 'branch linked',
++             'newvalue': 'lp:~foo/bar/baz',
++             'datechanged': two_days},
++            {'whatchanged': 'branch linked',
++             'newvalue': 'lp:~foo/bar/bjorn',
++             'datechanged': yesterday}]
++        latency = calculate_task_latency(
++            get_mock_task(
++                True, entries=entries, linked_branches=linked_branches), NOW)
++        self.assertEqual(60 * 60 * 24, int(latency))
++
++    def test_calculate_task_latency_bug_with_one_valid_link(self):
++        linked_branches = [
++            mock.Mock(branch=mock.Mock(display_name='lp:~foo/bar/bjorn'))
++        ]
++        two_days = datetime.datetime.strftime(
++            TWO_DAYS, '%Y-%m-%dT%H:%M:%S')
++        yesterday = datetime.datetime.strftime(
++            YESTERDAY, '%Y-%m-%dT%H:%M:%S')
++        entries = [
++            {'whatchanged': 'branch linked',
++             'newvalue': 'lp:~foo/bar/baz',
++             'datechanged': yesterday},
++            {'whatchanged': 'branch linked',
++             'newvalue': 'lp:~foo/bar/bjorn',
++             'datechanged': two_days}]
++        latency = calculate_task_latency(
++            get_mock_task(
++                True, entries=entries, linked_branches=linked_branches), NOW)
++        self.assertEqual(60 * 60 * 24, int(latency))
++
++    def test_calculate_task_latency_bug_with_one_link(self):
++        date_changed = datetime.datetime.strftime(
++            TWO_DAYS, '%Y-%m-%dT%H:%M:%S')
++        entries = [{'whatchanged': 'branch linked',
++                    'newvalue': 'lp:~foo/bar/baz',
++                    'datechanged': date_changed}]
++        latency = calculate_task_latency(
++            get_mock_task(True, entries=entries), NOW)
++        self.assertEqual(60 * 60 * 24, int(latency))
++
++    def test_calculate_proposal_latency(self):
++        latency = calculate_proposal_latency(get_mock_proposal(True), NOW)
++        self.assertEqual(60 * 60 * 24, int(latency))
++
++    def test_review_job_saves_latency(self):
          log = logging.getLogger('foo')
          self.assertEqual(0, self.db.review_queue_latency.count())
--        update_review_queue(self.db, [], [MockProposal(True)], log)
++        update_review_queue(self.db, [], [get_mock_proposal(True)], log)
          self.assertEqual(1, self.db.review_queue_latency.count())
          item = self.db.review_queue_latency.find_one()
          self.assertEqual(NOW.date(), item['date'].date())
@@ -96,14 +157,14 @@
          self.assertEqual(1, datetime.timedelta(seconds=item['min']).days)
      def test_review_discards_closed_bugs(self):
--        tasks = [MockTask(), MockTask()]
++        tasks = [get_mock_task(), get_mock_task()]
          tasks[0].status = 'Fix released'
          log = logging.getLogger('foo')
          update_review_queue(self.db, tasks, [], log)
          self.assertEqual(1, self.db.review_queue.count())
      def test_review_discards_approved_proposals(self):
--        proposals = [MockProposal(), MockProposal()]
++        proposals = [get_mock_proposal(), get_mock_proposal()]
          proposals[0].queue_status = 'Approved'
          log = logging.getLogger('foo')
          update_review_queue(self.db, [], proposals, log)