Code review comment for lp:~rvb/maas/bug-1384001-redux

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Ok I just had a long chat with Raph about this and the upshot is:

1. We came to some middle (haha) ground with the middleware. Raph was referring to general exceptions that can get thrown from anywhere, such as NoConnectionsAvailable. I was referring to exceptions that are declared in the AMP message itself, which are *known* (or should be) to the call sites as being specific to that call. So the upshot is that I am happy for middleware to catch general errors but I still think call sites are responsible for catching known exceptions, because they are best-placed to deal with them. For example, in the case of PowerOn, if we get a PowerActionAlreadyInProgress error, because that code still deals with potentially many nodes at once, it needs to catch and deal with the error because otherwise you'd rollback the entire transaction and potentially leave nodes started up when they should not be. However we decided to compromise on this for now and leave this caught in the middleware, and let Graham look at how to deal with it when he gets around to finishing the work on making node operations work on one at a time.

2. My second and third points above still stand. I convinced Raph that the third needs callsite-specific values with the example of AMT vs IPMI failures; AMT is very much slower at everything and needs a slower retry period.

Did I cover everything Raph? :)

review: Approve

« Back to merge proposal