Merge lp:~allenap/maas/no-requests-for-simplestreams into lp:maas/trunk

Proposed by Gavin Panella on 2015-06-05
Status: Merged
Approved by: Gavin Panella on 2015-06-05
Approved revision: 3993
Merged at revision: 3997
Proposed branch: lp:~allenap/maas/no-requests-for-simplestreams
Merge into: lp:maas/trunk
Diff against target: 47 lines (+23/-0)
1 file modified
src/provisioningserver/plugin.py (+23/-0)
To merge this branch: bzr merge lp:~allenap/maas/no-requests-for-simplestreams
Reviewer Review Type Date Requested Status
Blake Rouse (community) 2015-06-05 Approve on 2015-06-05
Review via email: mp+261235@code.launchpad.net

Commit message

Experimentally force simplestreams to use `urllib2` instead of `requests`.

The latter appears to be allowing simplestreams to leak file-descriptors; streaming requests that are not explicitly closed remain referenced by a connection pool and thus not garbage-collected.

To post a comment you must log in.
Raphaël Badin (rvb) wrote :

Looks good, but let's get proof that this solves the problem we're seeing before landing this…

Ricardo Bánffy (rbanffy) wrote :

Seems OK.

Gavin Panella (allenap) wrote :

This is meant first as a cowboy-patch for OIL to see if it helps. If it does help I'll probably pursue a proper fix in simplestreams.

Thanks for the reviews :)

Blake Rouse (blake-rouse) wrote :

So I was able to test this branch. I setup the cluster to sync from the region every 10 seconds. With it syncing every 10 seconds with this change there where no CLOSE_WAIT sockets.

sudo lsof | grep maas | grep CLOSE_WAIT | wc -l
0

With this change disabled, so simplestreams using python-requests I got many CLOSE_WAITS

sudo lsof | grep maas | grep CLOSE_WAIT | wc -l
88
sudo lsof | grep maas | grep CLOSE_WAIT | wc -l
104

And it keeps growing. This fixes the problem lets land this now. So 1.8 is stable, without the cluster will need to keep being restarted. At some point people will reach a limit no matter the ulimit.

review: Approve
Scott Moser (smoser) wrote :

python-requests uses urllib3 which does connection pooling.
it'd seem easy enough to reproduce this outside of maas.
just look at sstream-query . you could run that in a loop, and we should see similar leakage.

Ricardo Bánffy (rbanffy) wrote :

I was left wondering if we should adopt a convention for naming "cowboy patches" (if that's the term we'll use) so that we don't confuse them with long-term solutions.

I've been using "bug-NNNNN-something-meaningful" as my personal sanity-keeping memory aid, but the "bug-" part could be morphed into something more interesting.

Gavin Panella (allenap) wrote :

> python-requests uses urllib3 which does connection pooling. it'd seem
> easy enough to reproduce this outside of maas. just look at
> sstream-query . you could run that in a loop, and we should see
> similar leakage.

Thanks Scott. I've tried this:

  $ python -i /usr/bin/sstream-query \
  > --keyring=/usr/share/keyrings/ubuntu-cloudimage-keyring.gpg \
  > http://maas.ubuntu.com/images/ephemeral-v2/daily/
  ...
  >>> import time
  >>> while True:
  ... main()
  ... time.sleep(0.1)

but I wasn't able to accumulate connections in CLOSE_WAIT. Could there
be a code path taken (or not) that might mean we're missing something?
I've found five code paths through contentsource.py that /might/ leak (I
have put together a patch to fix them, but it's not tested:
http://paste.ubuntu.com/11648573/).

Gavin Panella (allenap) wrote :

> I was left wondering if we should adopt a convention for naming
> "cowboy patches" (if that's the term we'll use) so that we don't
> confuse them with long-term solutions.
>
> I've been using "bug-NNNNN-something-meaningful" as my personal
> sanity-keeping memory aid, but the "bug-" part could be morphed into
> something more interesting.

Raphaël moved this into a module called provisioningserver.monkey today.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/plugin.py'
2--- src/provisioningserver/plugin.py 2015-05-18 20:17:28 +0000
3+++ src/provisioningserver/plugin.py 2015-06-05 15:01:24 +0000
4@@ -20,6 +20,7 @@
5 import os
6 import socket
7 from socket import error as socket_error
8+import sys
9
10 from provisioningserver.utils.debug import (
11 register_sigusr2_thread_dump_handler,
12@@ -45,6 +46,27 @@
13 from zope.interface import implementer
14
15
16+def force_simplestreams_to_use_urllib2():
17+ """Monkey-patch `simplestreams` to use `urllib2`.
18+
19+ This prevents the use of `requests` which /may/ be helping simplestreams
20+ to lose file-descriptors.
21+ """
22+ import simplestreams.contentsource
23+
24+ if sys.version_info > (3, 0):
25+ import urllib.request as urllib_request
26+ import urllib.error as urllib_error
27+ else:
28+ import urllib2 as urllib_request
29+ urllib_error = urllib_request
30+
31+ vars(simplestreams.contentsource).update(
32+ URL_READER=simplestreams.contentsource.Urllib2UrlReader,
33+ URL_READER_CLASSNAME="Urllib2UrlReader", urllib_error=urllib_error,
34+ urllib_request=urllib_request)
35+
36+
37 @implementer(ICredentialsChecker)
38 class SingleUsernamePasswordChecker:
39 """An `ICredentialsChecker` for a single username and password."""
40@@ -231,6 +253,7 @@
41 def makeService(self, options):
42 """Construct the MAAS Cluster service."""
43 register_sigusr2_thread_dump_handler()
44+ force_simplestreams_to_use_urllib2()
45
46 from provisioningserver import services
47 from provisioningserver.config import Config