Merge into trunk : oops-integration : Code : pkgme service

Status:	Rejected
Rejected by:	James Westby on 2012-03-01
Proposed branch:	lp:~james-w/pkgme-service/oops-integration
Merge into:	lp:pkgme-service
Prerequisite:	lp:~james-w/pkgme-service/log-oopses
Diff against target:	311 lines (+123/-42) 11 files modified
	dev_config/manifests/pkgme_service.pp (+8/-14)
	dev_config/templates/django.wsgi.erb (+42/-10)
	dev_config/templates/production_credentials.cfg.erb (+15/-0)
	dev_config/templates/production_paths.py.erb (+0/-13)
	django_project/dev.cfg (+10/-0)
	django_project/main.cfg (+1/-0)
	django_project/manage.py (+22/-4)
	django_project/production.cfg (+5/-0)
	django_project/production_credentials.cfg.example (+15/-0)
	fabtasks/deploy.py (+3/-1)
	src/djpkgme/views.py (+2/-0)
To merge this branch:	bzr merge lp:~james-w/pkgme-service/oops-integration

Reviewer	Review Type	Date Requested	Status
Canonical Consumer Applications Hackers		2012-01-25	Pending
Review via email: mp+90033@code.launchpad.net

Description of the change

Hi,

Here's a sketch of how oops integration works for a Django app.

As it stands this hooks in at the WSGI layer, so it has no effect
in a local environment (with runserver).

As written it only writes the oops to the filesystem. I think it
would be easy to have it write the oops to rabbit instead.

You can spin up this branch in ec2, hit /pkgme/+oops and see
an oops appear in ~/pkgme-service/oopses on the instance.
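
To illustrate what hooking in at the WSGI layer means, here is a minimal
sketch that uses only the Python 3 standard library, not oops-wsgi itself.
ErrorReportMiddleware, write_report and the REPORT- id format are
hypothetical names made up for this example; the branch itself uses
oops_wsgi.make_app with a datedir publisher. Because the wrapper is applied
in the WSGI script, anything that does not go through that script (such as
runserver) never passes through it.

```python
import json
import os
import sys
import traceback
import uuid


class ErrorReportMiddleware:
    """Hypothetical stand-in for an OOPS-style WSGI wrapper (not oops_wsgi)."""

    def __init__(self, app, error_dir):
        self.app = app
        self.error_dir = error_dir

    def __call__(self, environ, start_response):
        try:
            # Only exceptions that escape the wrapped app are seen here.
            return self.app(environ, start_response)
        except Exception:
            exc_info = sys.exc_info()
            report_id = self.write_report(environ, exc_info)
            body = ("Internal error, report id: %s\n" % report_id).encode("utf-8")
            headers = [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ]
            start_response("500 Internal Server Error", headers, exc_info)
            return [body]

    def write_report(self, environ, exc_info):
        """Write one JSON report per error into error_dir and return its id."""
        report_id = "REPORT-%s" % uuid.uuid4().hex
        report = {
            "id": report_id,
            "path": environ.get("PATH_INFO", ""),
            "type": exc_info[0].__name__,
            "value": str(exc_info[1]),
            "traceback": "".join(traceback.format_exception(*exc_info)),
        }
        os.makedirs(self.error_dir, exist_ok=True)
        with open(os.path.join(self.error_dir, report_id + ".json"), "w") as f:
            json.dump(report, f)
        return report_id


def failing_app(environ, start_response):
    """Tiny WSGI app that always fails, like the /pkgme/+oops view."""
    raise AssertionError("Manually generated OOPS")
```

In a WSGI script the wrapping would be a single line, for example
application = ErrorReportMiddleware(application, "oopses"). Errors raised
while the response body is being iterated are not caught by this sketch.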

As it stands the caller isn't told the oops id; that's a bug in Django.
I'm appending an IRC log at the end in which lifeless and jamesh discuss
ways of doing it.

I'm going to set this to "Work in progress" as I don't think we should
land it as-is. I mainly did this to understand it better before discussing
it on the IS call tomorrow.

Thanks,

James

<james_w> lifeless, have you looked in to how to encourage django to use a response with the oops id on an uncaught exception?
<james_w> if you don't provide a 500 template then you get the oops one, but the oops it points to is an exception complaining that you don't have a 500 template
<lifeless> jamesh: has
<lifeless> oh, for oops-tools itself I know we need to provide a 500 template
<lifeless> there is a patch for django, last I checked it was a bit stalled
<lifeless> https://code.djangoproject.com/ticket/16674
<james_w> oh, I thought that was what the django.py fixed
<lifeless> it works around it
<james_w> http://ec2-107-22-92-8.compute-1.amazonaws.com/pkgme/+oops is what I'm getting currently
<lifeless> https://code.djangoproject.com/attachment/ticket/16674/wsgi-expose-exc-info-v2.patch
<lifeless> thats the actual fix
<lifeless> the django module in oops-wsgi sniffs the exception to ensure you get an oops
<lifeless> but it can't influence django to make it expose the error to the wsgi stack
<jamesh> james_w: you mean to display the OOPS ID on the error page?
<lifeless> so the main oops module just sees a successful response
<james_w> jamesh, yeah
<james_w> lifeless, ah, I see
<jamesh> james_w: I haven't been able to work out a good way to solve that with the new infrastructure
<james_w> so if it were properly fixed then oops' response would take over
<james_w> ok, it's not so important to us at this point
<lifeless> james_w: do you get an OOPS emitted to (rabbit|disk)
<james_w> lifeless, yep
<jamesh> the old oops code we have in U1 can preallocate the oops IDs on demand, so we can use that in the error page
<lifeless> jamesh: we could allow that too, if you wanted to use a timeuuid or something, and put the id in the context
<jamesh> one idea I had for the new code might be to set a short lived cookie in the WSGI middleware, and look that up with JS in the error page
<lifeless> jamesh: ooh, nice.
<jamesh> that has the downside that cookies are shared between all tabs in a browser
<james_w> lifeless, for this patch, how much do you care about a test?
<james_w> it's not the easiest thing to test after all
<lifeless> 'meh'
<jamesh> so if you open 20 pages at once and five fail, you might see the same OOPS ID on each
<lifeless> yeah
<lifeless> uhm
* poolie has quit (Quit: Ex-Chat)
<lifeless> wonder if you can inspect the response headers from the page itself
<lifeless> jamesh: shoving id in the context would be fine.
<lifeless> james_w: should be easy to test if you want to; just have an outer start_response(*args, **kwargs) and check kwargs is empty
<james_w> lifeless, ah, good point
<jamesh> I don't think there is a way to inspect response headers unfortunately
<lifeless> so, a django error template that puts content['oops.context']['id'] = str(uuid.uuid1()) => win
<lifeless> jamesh: ^
<wgrant> Hmm
<wgrant> Regression
<wgrant> 5633 AssertionError: Ambiguous view name.
<wgrant> GET: 5629 Other: 4 Robots: 98 Local: 239
<wgrant> 168 http://feeds.launchpad.net/%7Ebarry/latest-bugs.atom (Person:latest-bugs.atom)
<wgrant> OOPS-00033c67366d1e7513397953f71c9b29, OOPS-001af265e7d5322fdd12c9b12b4fed8c, OOPS-00ae6fedb93aa14a7fc508d51b9fa6a7, OOPS-010f606986cdc5008d999fabe085e6aa, OOPS-01432c961a9928b26bb0d210d723a53b
<wgrant> Ah
<wgrant> I bet it's the bug listings release
<wgrant> It doesn't work with feeds.
<wgrant> Which are always anonymous, so were never tested.
<wgrant> lifeless: ^^
<lifeless> grah
<lifeless> I guess we need to rollback
<jamesh> lifeless: I guess that would work. Do we lose anything by having the view assign an ID instead of the publisher doing so?
<wgrant> lifeless: It's been broken for a day, and it's a major thing.
<wgrant> "it" == bug listings
<wgrant> So I think we should keep it on.
<wgrant> It's not as if anybody uses feeds anyway.
<lifeless> well, 5633 polls for them
<lifeless> wgrant: lets start with a bug
<lifeless> jamesh: I think it's fine to allow the earliest point that knows an error is on the way to assign an id - the hash based approach is just one way to get a unique id
<lifeless> jamesh: oops can render errors itself
<lifeless> jamesh: or you could pass a django template into oops
<lifeless> jamesh: or you can assign the id earlier, if the framework really wants to render its own errors
<lifeless> jamesh: the existing facilities are the trivial string template that oops has itself or a callback you can supply to the middleware constructor
<jamesh> lifeless: I think giving frameworks the ability to render their own error pages makes python-oops-wsgi more appealing, since it can be dropped into an existing app
<lifeless> jamesh: I think the first thing I would attempt would be to discard the django error page but wire up the callback to let me use a django template to render the error page
<james_w> so, I can't do this patch as I am yet to get close to having a working buildout
<wgrant> lifeless: Hm
<lifeless> james_w: oh? what happens?
<wgrant> 'bugs.dynamic_bug_listings.enabled pageid:Person:latest-bugs.atom 2 '?
<lifeless> wgrant: might work, we should try
<wgrant> That seems to be the only popular affected page.
<jamesh> lifeless: well, once we've got that django bug report I filed fixed, that should be possible: if start_response() is called with an exc_info, just re-raise it
<lifeless> jamesh: that seems to be slightly different
<jamesh> then present an error page after the exception bubbles back up to the middleware
<lifeless> oh right
<lifeless> so yes, get the middleware's start_response to be called with exc_info
<jamesh> with that in mind, the debug error page Django presents is pretty nice
<lifeless> ah, make_app(template=...)
<jamesh> so when developing an app, I'd prefer to see that than an oops-wsgi error page
<lifeless> jamesh: sure, but not for deployed apps ?
<lifeless> jamesh: so have dev mode eat the exceptions and render itself
<lifeless> jamesh: anyhow, I'm happy to let the app allocate ids
<lifeless> pragmatic
<james_w> lifeless, confusing error messages
<jamesh> lifeless: I'm of two minds there. The error pages we've got wired up in U1 don't do much , so would be pretty easy to reimplement in the middleware
<lifeless> james_w: pastebin ?
<james_w> apparently when it said I might have misspelled a dependency, it really meant that it couldn't install from the cache because the cache is currently empty
<lifeless> james_w: you're using the LP cache, right ?
<jamesh> on the other hand, if I could just get/create an OOPS ID in the existing error page code, that would be a much smaller change
<james_w> lifeless, nope
<lifeless> james_w: use the LP cache
<jamesh> and I suspect other people trying to integrate python-oops would feel the same
<lifeless> jamesh: these are not exclusive options
<jamesh> lifeless: I understand that.
<lifeless> :)
<lifeless> I think we should make it easy to get going
<jamesh> I'm just thinking out loud about why I might pick one option over the other
<lifeless> with better/more organised stuff being something you can grow into
<lifeless> jamesh: I'm glad you are doing so :) I was just adding commentary
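
Two ideas from the log above can be sketched together: letting the earliest
code that knows about an error allocate the id (here with uuid.uuid1(), as
lifeless suggests), and having the middleware's start_response re-raise when
it is called with exc_info so the error bubbles back up to the middleware.
This is a standard-library sketch for Python 3 under stated assumptions, not
the oops-wsgi implementation: ReraisingMiddleware, allocate_report_id,
framework_app and the reports list are hypothetical, and storing the id
under environ["oops.context"]["id"] is an assumption based on the key
mentioned in the log.

```python
import sys
import uuid


def allocate_report_id(environ):
    """Return the id for this request, creating it on first use."""
    context = environ.setdefault("oops.context", {})
    return context.setdefault("id", str(uuid.uuid1()))


class ReraisingMiddleware:
    def __init__(self, app, reports):
        self.app = app
        self.reports = reports  # stand-in for a publisher (disk, AMQP, ...)

    def __call__(self, environ, start_response):
        def guarded_start_response(status, headers, exc_info=None):
            if exc_info is not None:
                # The framework is reporting an error: bubble it back up.
                raise exc_info[1].with_traceback(exc_info[2])
            return start_response(status, headers)

        try:
            return self.app(environ, guarded_start_response)
        except Exception as error:
            # Reuses the id if the app already allocated one.
            report_id = allocate_report_id(environ)
            self.reports.append({"id": report_id, "error": repr(error)})
            body = ("Oops: %s\n" % report_id).encode("utf-8")
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")])
            return [body]


def framework_app(environ, start_response):
    """Hypothetical framework that renders its own error page but passes exc_info."""
    try:
        raise AssertionError("Manually generated OOPS")
    except AssertionError:
        page_id = allocate_report_id(environ)  # id the framework would display
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")], sys.exc_info())
        return [page_id.encode("utf-8")]
```

In this sketch the framework's own page is discarded once the exception is
re-raised, but the id it allocated survives in the environ, so the report
and whatever page is finally rendered agree on the same id.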

lp:~james-w/pkgme-service/oops-integration updated on 2012-01-28

41. By James Westby on 2012-01-28: Try out the new oops stuff.

Unmerged revisions

41. By James Westby on 2012-01-28: Try out the new oops stuff.
40. By James Westby on 2012-01-25: Install the extra packages (until pkgme-service-dependencies depends on them)
39. By James Westby on 2012-01-25: Merge the log oops branch.
38. By James Westby on 2012-01-24: Add a basic oops integration to the wsgi for devstaging deployments.

Preview Diff

 === modified file 'dev_config/manifests/pkgme_service.pp'
 --- dev_config/manifests/pkgme_service.pp	2012-01-19 19:13:04 +0000
 +++ dev_config/manifests/pkgme_service.pp	2012-01-28 21:41:26 +0000
@@ -23,7 +23,7 @@
     ensure => 'present',
     gid => $unix_user,
     require => Group[$unix_user],
--}
++}
  exec { "psql-create-user-$postgres_user":
    command => "psql -c \"create user $postgres_user with password '$::postgres_password'\"",
@@ -58,17 +58,6 @@
    ],
  }
--file { "$::basedir/django_project/production_paths.py":
--  content => template("production_paths.py.erb"),
--  owner => $unix_user,
--  group => $unix_user,
--  mode => 644,
--  require => [
--      User[$unix_user],
--      Group[$unix_user],
--  ],
--}
--
  file { "$::basedir/pkgme.log":
    owner => $unix_user,
    group => $unix_user,
@@ -76,6 +65,13 @@
    ensure => present,
  }
++file { "$::basedir/oopses":
++  owner => $unix_user,
++  group => $unix_user,
++  mode => 755,
++  ensure => directory,
++}
++
  file { "$::basedir/celery.log":
    owner => $unix_user,
    group => $unix_user,
@@ -97,7 +93,6 @@
    user => $unix_user,
    require => [
               File["$::basedir/django_project/production_credentials.cfg"],
--             File["$::basedir/django_project/production_paths.py"],
               File["$::basedir/pkgme.log"],
               ],
    logoutput => on_failure,
@@ -117,7 +112,6 @@
    content => template("django.wsgi.erb"),
    require => [
      File["$::basedir/django_project/production_credentials.cfg"],
--    File["$::basedir/django_project/production_paths.py"],
    ]
  }
 === modified file 'dev_config/templates/django.wsgi.erb'
 --- dev_config/templates/django.wsgi.erb	2012-01-18 23:45:14 +0000
 +++ dev_config/templates/django.wsgi.erb	2012-01-28 21:41:26 +0000
@@ -1,19 +1,51 @@
  import os
  import sys
--extra_paths = [
--    '<%= basedir %>',
--    '<%= basedir %>/src',
--    '<%= basedir %>/django_project',
--    '<%= basedir %>/sourcecode/pkgme',
--    '<%= basedir %>/sourcecode/pkgme-binary',
--]
++basepath = '<%= basedir %>'
++extra_paths = [basepath, '<%= basedir %>/src', '<%= basedir %>/django_project']
++
++sourcecode_path = os.path.abspath(os.path.join(basepath, 'sourcecode'))
++for filename in os.listdir(sourcecode_path):
++    dependency_path = os.path.join(sourcecode_path, filename)
++    if os.path.isdir(dependency_path):
++        extra_paths.append(dependency_path)
++
  for path in extra_paths:
      if path not in sys.path:
--        sys.path.append(path)
++        sys.path.insert(0, path)
  os.environ['PKGME_LOG_DIR'] = '<%= basedir %>'
  os.environ['DJANGO_SETTINGS_MODULE'] = 'django_project.settings'
--import django.core.handlers.wsgi
--application = django.core.handlers.wsgi.WSGIHandler()
++from django.conf import settings
++
++# This is needed to workaround a bug in DJango, see the file it is
++# implemented in for more details.
++from oops_wsgi.django import OOPSWSGIHandler
++
++application = OOPSWSGIHandler()
++
++###
++# Wrap the application in the Oops wsgi app to catch unhandled exceptions
++# and create oops for them.
++#
++# First we create the config that defines what to do with the oopses.
++
++import oops_dictconfig
++import oops_timeline
++from oops_wsgi import make_app, install_hooks
++from timeline import wsgi as timeline_wsgi
++from timeline_django import hooks, middleware, timeline_cursor
++
++config = oops_dictconfig.config_from_dict(settings.OOPSES)
++install_hooks(config)
++oops_timeline.install_hooks(config)
++
++receiver = hooks.TimelineReceiver(middleware.get_timeline)
++receiver.connect_to_signals()
++timeline_cursor.insert_timeline_cursors(middleware.get_timeline)
++
++application = timeline_wsgi.make_app(application)
++
++# Then we wrap the django app in the oops one
++application = make_app(application, config, oops_on_status=['500'])
 === modified file 'dev_config/templates/production_credentials.cfg.erb'
 --- dev_config/templates/production_credentials.cfg.erb	2012-01-24 21:11:22 +0000
 +++ dev_config/templates/production_credentials.cfg.erb	2012-01-28 21:41:26 +0000
@@ -31,3 +31,18 @@
  class = logging.handlers.WatchedFileHandler
  formatter = verbose
  filename = <%= basedir %>/django.log
++
++[oops_amqp_publisher]
++type = amqp
++host = <%= rabbit_host %>:<%= rabbit_port %>
++user = <%= rabbit_user %>
++password = <%= rabbit_password %>
++vhost = <%= rabbit_vhost %>
++exchange_name = oopses
++routing_key = oopses
++
++[oops_datedir_publisher]
++type = datedir
++error_dir = <%= basedir %>/oopses
++instance_id = production
++only_new = True
 === removed file 'dev_config/templates/production_paths.py.erb'
 --- dev_config/templates/production_paths.py.erb	2012-01-18 23:45:14 +0000
 +++ dev_config/templates/production_paths.py.erb	1970-01-01 00:00:00 +0000
@@ -1,13 +0,0 @@
--import os
--import sys
--
--for path in [
--    '<%= basedir %>/sourcecode/pkgme-binary',
--    '<%= basedir %>/sourcecode/pkgme',
--    '<%= basedir %>/django_project',
--    '<%= basedir %>/src',
--    '<%= basedir %>',
--    ]:
--    sys.path.insert(0, path)
--
--os.environ['PKGME_LOG_DIR'] = '<%= basedir %>'
 === modified file 'django_project/dev.cfg'
 --- django_project/dev.cfg	2012-01-24 20:49:55 +0000
 +++ django_project/dev.cfg	2012-01-28 21:41:26 +0000
@@ -14,6 +14,16 @@
      django_configglue
      djkombu
  databases = databases
++debug = False
++oopses = oops_config
++
++[oops_config]
++publishers = oops_dev_publisher
++
++[oops_dev_publisher]
++type = datedir
++error_dir = oopses
++instance_id = dev
  [django_file_logging_handler]
  level = WARNING
 === modified file 'django_project/main.cfg'
 --- django_project/main.cfg	2012-01-24 21:16:59 +0000
 +++ django_project/main.cfg	2012-01-28 21:41:26 +0000
@@ -28,6 +28,7 @@
  middleware_classes =
      django.middleware.common.CommonMiddleware
      django.contrib.sessions.middleware.SessionMiddleware
++    timeline_django.request_tracker.TimelineMiddleware
      django.middleware.csrf.CsrfViewMiddleware
      django.contrib.auth.middleware.AuthenticationMiddleware
      django.contrib.messages.middleware.MessageMiddleware
 === modified file 'django_project/manage.py'
 --- django_project/manage.py	2012-01-13 01:22:19 +0000
 +++ django_project/manage.py	2012-01-28 21:41:26 +0000
@@ -1,8 +1,12 @@
  #!/usr/bin/env python
--try:
--    import production_paths
--except ImportError:
--    pass
++import os
++import sys
++
++sourcecode_path = os.path.abspath(os.path.join(os.path.dirname(os.path.dirname(__file__)), 'sourcecode'))
++for filename in os.listdir(sourcecode_path) + ['../src', '../django_project']:
++    dependency_path = os.path.join(sourcecode_path, filename)
++    if os.path.isdir(dependency_path):
++        sys.path.insert(0, dependency_path)
  from django.core.management import execute_manager
  import imp
@@ -15,5 +19,19 @@
  import settings
++from oops_celery import setup_oops_reporter
++from oops_dictconfig import config_from_dict
++from timeline import Timeline
++from timeline_django import hooks, timeline_cursor
++
++# TODO: OOPSES in settings
++config = config_from_dict(settings.OOPSES)
++setup_oops_reporter(config)
++timeline = Timeline()
++receiver = hooks.TimelineReceiver(lambda: timeline)
++receiver.connect_to_signals()
++timeline_cursor.insert_timeline_cursors(lambda: timeline)
++
  if __name__ == "__main__":
++    # TODO: catch errors and generate oopses for manage.py commands.
      execute_manager(settings)
 === modified file 'django_project/production.cfg'
 --- django_project/production.cfg	2012-01-23 15:38:33 +0000
 +++ django_project/production.cfg	2012-01-28 21:41:26 +0000
@@ -6,6 +6,11 @@
  #static_root =
  #static_url =
  #admin_media_prefix =
++oopses = oops_config
++
++[oops_config]
++publishers = oops_amqp_publisher
++             oops_datedir_publisher
  [djpkgme]
  pkgme_output_directory = /srv/pkgme-service.canonical.com/var/packaged-applications/
 === modified file 'django_project/production_credentials.cfg.example'
 --- django_project/production_credentials.cfg.example	2012-01-24 20:49:55 +0000
 +++ django_project/production_credentials.cfg.example	2012-01-28 21:41:26 +0000
@@ -37,3 +37,18 @@
  class = logging.handlers.WatchedFileHandler
  formatter = verbose
  filename = /var/log/pkgme-service/django.log
++
++[oops_amqp_publisher]
++type = amqp
++host = #amqp host and optional port, e.g. localhost:5867
++user = #amqp user
++password = #amqp password
++vhost = #amqp vhost
++exchange_name = #amqp exchange name
++routing_key = #amqp routing key
++
++[oops_datedir_publisher]
++type = datedir
++error_dir = /var/cache/oopses
++instance_id = production
++only_new = True
 === modified file 'fabtasks/deploy.py'
 --- fabtasks/deploy.py	2012-01-24 23:38:20 +0000
 +++ fabtasks/deploy.py	2012-01-28 21:41:26 +0000
@@ -218,12 +218,14 @@
      run('echo "postfix postfix/main_mailer_type select No configuration" | sudo debconf-set-selections')
      # Install the dependencies needed to get puppet going
      # TODO: move the rest of the dependencies to puppet
--    run('sudo apt-get install -q -y --force-yes pkgme-service-dependencies bzr apache2 libapache2-mod-wsgi rabbitmq-server postgresql-8.4 puppet')
++    run('sudo apt-get install -q -y --force-yes pkgme-service-dependencies bzr apache2 libapache2-mod-wsgi rabbitmq-server postgresql-8.4 puppet python-oops-wsgi python-oops-datedir-repo python-oops-celery python-oops-amqp python-timeline python-oops-timeline')
      # Grab the branches
      # TODO: investigate re-using IS' config-manager config
      run('bzr branch -q %s pkgme-service' % branch)
      run('bzr branch -q %s pkgme-service/sourcecode/pkgme' % pkgme_branch)
      run('bzr branch -q %s pkgme-service/sourcecode/pkgme-binary' % pkgme_binary_branch)
++    run('bzr branch -q lp:~james-w/+junk/python-timeline-django pkgme-service/sourcecode/python-timeline-django')
++    run('bzr branch -q lp:~james-w/python-oops-dictconfig/amqp-config pkgme-service/sourcecode/python-oops-dictconfig')
      run('cd pkgme-service/sourcecode/pkgme && python setup.py build')
      run('cd pkgme-service/sourcecode/pkgme-binary && python setup.py build')
      # Grab canonical-memento and use it?
 === modified file 'src/djpkgme/views.py'
 --- src/djpkgme/views.py	2012-01-24 20:49:55 +0000
 +++ src/djpkgme/views.py	2012-01-28 21:41:26 +0000
@@ -6,4 +6,6 @@
      return HttpResponse(ALL_CLEAR)
  def oops(request):
++    from django.contrib.auth.models import User
++    [a for a in User.objects.filter(first_name__contains="a")]
      raise AssertionError("Manually generated OOPS")
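
The sys.path handling appears twice in the diff above (django.wsgi.erb and
manage.py), with slightly different loops. A minimal sketch of how the two
could share one helper, assuming Python 3 and the directory layout used by
the branch; add_dependency_paths and its extra parameter are hypothetical
names, not part of pkgme-service.

```python
import os
import sys


def add_dependency_paths(basedir, extra=("src", "django_project")):
    """Put basedir, the extras, and every directory under sourcecode/ on sys.path.

    Returns the list of paths that were newly inserted.
    """
    candidates = [basedir] + [os.path.join(basedir, name) for name in extra]
    sourcecode = os.path.join(basedir, "sourcecode")
    if os.path.isdir(sourcecode):
        # Sort so the resulting order does not depend on the filesystem.
        for name in sorted(os.listdir(sourcecode)):
            candidates.append(os.path.join(sourcecode, name))
    added = []
    for path in candidates:
        path = os.path.abspath(path)
        # Skip plain files and anything already importable from.
        if os.path.isdir(path) and path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added
```

Both entry points could then call add_dependency_paths with the checkout
directory before importing settings.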