Merge lp:~cjwatson/python-oops-amqp/publisher-handle-channel-errors into lp:python-oops-amqp
Status: | Needs review |
---|---|
Proposed branch: | lp:~cjwatson/python-oops-amqp/publisher-handle-channel-errors |
Merge into: | lp:python-oops-amqp |
Diff against target: | 85 lines (+29/-2), 4 files modified: NEWS (+5/-0), oops_amqp/publisher.py (+11/-0), oops_amqp/tests/test_publisher.py (+8/-0), oops_amqp/utils.py (+5/-2) |
To merge this branch: | bzr merge lp:~cjwatson/python-oops-amqp/publisher-handle-channel-errors |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Launchpad code reviewers | Pending | ||
Review via email: mp+367748@code.launchpad.net |
Commit message
Handle AMQP channel errors (particularly NotFound) in the publisher.
Description of the change
amqp 2.4.0 included a change to drain events before publishing. This means that if we try to publish an OOPS to a nonexistent exchange, then some future publishing attempt will raise a NotFound exception, which is a channel error rather than a connection error and so wasn't previously handled.
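The delayed failure described above can be illustrated with a toy model. This is only a sketch: `ToyChannel` and the local `NotFound` class are hypothetical stand-ins written for this example, not amqp's own classes, and the model assumes the broker's rejection simply sits unread until the next drain.

```python
class NotFound(Exception):
    """Stand-in for an AMQP 404 channel error (not amqp's own class)."""


class ToyChannel:
    """Toy channel that drains pending broker replies before each publish."""

    def __init__(self, known_exchanges):
        self.known_exchanges = set(known_exchanges)
        self.pending = []  # broker replies that have not been read yet

    def drain_events(self):
        # Raise the oldest queued broker error, if there is one.
        if self.pending:
            raise self.pending.pop(0)

    def basic_publish(self, body, exchange):
        self.drain_events()  # models the drain-before-publish behaviour
        if exchange not in self.known_exchanges:
            # The rejection arrives asynchronously, so this call returns
            # normally and the error is only seen on a later drain.
            self.pending.append(NotFound("no exchange %r" % exchange))


channel = ToyChannel(known_exchanges={"oopses"})
channel.basic_publish(b"first", exchange="missing")  # appears to succeed
try:
    channel.basic_publish(b"second", exchange="oopses")  # valid, but fails
    outcome = "published"
except NotFound as e:
    outcome = "NotFound: %s" % e
```

In this model the second, perfectly valid publish is the one that raises, which is what makes the original mistake hard to trace.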
To try to minimise confusion resulting from this (which can be considerable - it took me several days to track down what was happening in Launchpad's test suite), spend a short time waiting for a response from the broker after publishing an OOPS. This will typically allow us to detect channel errors immediately, which we now handle; even if we don't manage to handle them immediately, they'll be handled the next time we try to publish something on the same channel.
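A minimal sketch of the publish-then-wait pattern follows. It is not the code in this branch: `Publisher`, `FakeConnection`, `FakeChannel` and the local `ChannelError` are hypothetical stand-ins, and the sketch assumes a `drain_events(timeout=...)` call that raises `socket.timeout` when the broker stays silent.

```python
import socket


class ChannelError(Exception):
    """Stand-in for an AMQP channel-level error such as NotFound."""


class FakeConnection:
    """Hypothetical connection: raises a queued error, else times out."""

    def __init__(self, errors=()):
        self.errors = list(errors)

    def drain_events(self, timeout=None):
        if self.errors:
            raise self.errors.pop(0)
        raise socket.timeout()  # nothing arrived within the timeout


class FakeChannel:
    def __init__(self, connection):
        self.connection = connection
        self.sent = []

    def basic_publish(self, body, exchange):
        self.sent.append((exchange, body))


class Publisher:
    def __init__(self, connect):
        self.connect = connect  # callable returning a fresh channel
        self.channel = None

    def publish(self, body, exchange, wait=1):
        if self.channel is None:
            self.channel = self.connect()
        try:
            self.channel.basic_publish(body, exchange)
            try:
                # Give the broker a moment to report a channel error.
                self.channel.connection.drain_events(timeout=wait)
            except socket.timeout:
                pass  # silence from the broker is treated as success
        except ChannelError:
            self.channel = None  # reconnect on the next publish
            return False
        return True


connections = [FakeConnection([ChannelError("404")]), FakeConnection()]
publisher = Publisher(lambda: FakeChannel(connections.pop(0)))
first = publisher.publish(b"oops-1", "missing")   # error seen immediately
second = publisher.publish(b"oops-2", "oopses")   # fresh connection works
```

The outer `try` also covers `basic_publish` itself, so an error left over from an earlier publish on the same channel is handled in the same way.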
It would be possible to handle channel errors more economically by just reopening a channel on the same connection rather than reopening the entire connection, but reopening the connection seems to work well enough for now.
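The trade-off between the two recovery strategies can be sketched as follows. `FakeConnection` and both helper functions are hypothetical names used only for illustration; the sketch assumes that a channel error leaves the underlying connection usable, so that a new channel can be opened on it.

```python
class FakeConnection:
    """Hypothetical stand-in that counts how many connections were made."""

    opened = 0

    def __init__(self):
        type(self).opened += 1
        self.channels = 0

    def channel(self):
        self.channels += 1
        return ("channel", self.channels)


def recover_by_reconnecting(connect):
    """Approach taken for now: drop everything and reconnect."""
    connection = connect()
    return connection, connection.channel()


def recover_by_reopening_channel(connection):
    """Cheaper alternative: keep the connection, open a new channel."""
    return connection, connection.channel()


conn = FakeConnection()
chan = conn.channel()

conn, chan = recover_by_reopening_channel(conn)
reopen_cost = FakeConnection.opened      # still only one connection

conn, chan = recover_by_reconnecting(FakeConnection)
reconnect_cost = FakeConnection.opened   # a second connection was made
```

Reopening only the channel avoids another connection handshake, at the cost of tracking connection and channel state separately.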
Unmerged revisions
- 24. By Colin Watson

Handle AMQP channel errors (particularly NotFound) in the publisher.

amqp 2.4.0 included a change to drain events before publishing. This
means that if we try to publish an OOPS to a nonexistent exchange, then
some future publishing attempt will raise a NotFound exception, which is
a channel error rather than a connection error and so wasn't previously
handled.

To try to minimise confusion resulting from this (which can be
considerable - it took me several days to track down what was happening
in Launchpad's test suite), spend a short time waiting for a response
from the broker after publishing an OOPS. This will typically allow us
to detect channel errors immediately, which we now handle; even if we
don't manage to handle them immediately, they'll be handled the next
time we try to publish something on the same channel.

It would be possible to handle channel errors more economically by just
reopening a channel on the same connection rather than reopening the
entire connection, but reopening the connection seems to work well
enough for now.
I do have some concerns about what would happen if the broker were up but slow and there was an OOPS storm: we'd incur an extra second in each case. I added the wait mainly because it was difficult to reproduce the problem in a test case otherwise. On the other hand, an extra second isn't terrible in that sort of situation, and might even allow the broker to recover more quickly by slowing down the input slightly.