MAAS stops working and deployment fails after `Loading ephemeral` step

Bug #1908452 reported by Patricia Domingues
22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
MAAS
Fix Released
High
Adam Collard
simplestreams
Fix Released
Undecided
Adam Collard
simplestreams (Ubuntu)
Fix Released
Undecided
Paride Legovini
Focal
Fix Released
Undecided
Paride Legovini
Groovy
Fix Released
Undecided
Paride Legovini
Hirsute
Fix Released
Undecided
Paride Legovini

Bug Description

[Impact]

The bug is about simplestreams possibly getting stuck waiting forever for an an HTTP response that never comes, e.g. because of networking issues. This can potentially affect any package depending on simplestreams, but specifically it was reported affecting MAAS, where it causes server deployments to timeout.

[Test Plan]
Install an iptables rule to block SSL handshaking w/ the MAAS simplestreams repo:

-------------------------
$ sudo iptables -A INPUT -p tcp -s 91.189.88.136 -m string --string maas.io --algo bm -j DROP
-------------------------

Run the reproducer described below, and verify that it hangs indefinitely (I recommend waiting 60s):

-------------------------
$ cat repro.py
#!/usr/bin/env python3

from simplestreams.contentsource import RequestsUrlReader

url = "https://images.maas.io/ephemeral-v3/stable/streams/v1/index.sjson"
r = RequestsUrlReader(url)
-------------------------

With the fix applied, verify that it does timeout in ~10s.

[Regression Potential]

Scenarios where it takes more than 10s to initiate a connection are unlikely, but possible. Code that does not properly handle a timeout exception in these situations may begin to fail.

[Original Description]

= How to determine you are seeing this problem =
Does your MAAS server seem to get "hung up", where deployments suddenly start failing w/ lots of connection timeouts to the MAAS server?

Get a list of pids of your regiond processes:
$ ps -ef | grep regiond

Run strace on each one to see if one is stuck in a connect() or recv() call:
$ sudo strace -p $pid
recv(...

(normally you should see a lot of epoll_ctl() calls go by if not hung)

If one is hung, use lsof to see what it is connected to:
sudo lsof -i -a -p $pid

If you see an open connection to your images server, then this maybe your problem. sudo kill -9 of the hung pid will cause it to respawn and recover.

Related branches

Revision history for this message
Patricia Domingues (patriciasd) wrote :

attaching rackd.log and regiond.log

Revision history for this message
Patricia Domingues (patriciasd) wrote :

attaching one server full history events log

Revision history for this message
Patricia Domingues (patriciasd) wrote :

attaching one server console log when it fails, but when we see this problem, all systems have the same log output

Revision history for this message
Andrew Cloke (andrew-cloke) wrote :

This issue has been seen again, causing automated test runs over the seasonal break to fail.

Revision history for this message
dann frazier (dannf) wrote :

I believe I've narrowed this down to a routing issue on our MAAS server which I've now fixed. I'll go ahead and mark as Invalid, and will reopen if it happens again with the correct routes installed.

Changed in maas:
status: New → Invalid
Changed in maas:
status: Invalid → New
Revision history for this message
Patricia Domingues (patriciasd) wrote :

I'm reopening as we have the same issue again:
```
 Thu, 07 Jan. 2021 10:22:18 Marking node failed - Node operation 'Deploying' timed out after 30 minutes.
  Thu, 07 Jan. 2021 09:52:35 Loading ephemeral
  Thu, 07 Jan. 2021 09:52:14 Performing PXE boot
```

MAAS was restarted 17h before it failed ^

Revision history for this message
Andrew Cloke (andrew-cloke) wrote :

Note machine that failed was starmie in Needham.

Revision history for this message
Andrew Cloke (andrew-cloke) wrote :

Failures also seen on ms10-35-mcdivittB0-kernel and ms10-34-mcdivittB0-kernel:

Thu, 07 Jan. 2021 10:18:18 Marking node failed - Node operation 'Deploying' timed out after 30 minutes.
Thu, 07 Jan. 2021 09:49:11 Loading ephemeral

Revision history for this message
Andrew Cloke (andrew-cloke) wrote :

Please find attached log from 7th Jan 2021 for starmie - very similar to the log posted in comment #3

Revision history for this message
dann frazier (dannf) wrote :

This issue is currently reproducing on our MAAS server again. One thing I've noticed is that the MAAS server isn't reliably responding on port 5240. When just trying to wget the MAAS UI homepage, it's a crap shoot whether or not it works or times out in SYN_SENT state:

root@maas:~# wget http://localhost:5240 -O - > /dev/null
--2021-01-11 21:04:57-- http://localhost:5240/
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:5240... connected.
HTTP request sent, awaiting response... 302 Found
Location: MAAS/ [following]
--2021-01-11 21:04:57-- http://localhost:5240/MAAS/
Reusing existing connection to localhost:5240.
HTTP request sent, awaiting response... 302 Found
Location: /MAAS/r/ [following]
--2021-01-11 21:04:57-- http://localhost:5240/MAAS/r/
Reusing existing connection to localhost:5240.
HTTP request sent, awaiting response... 200 OK
Length: 3724 (3.6K) [text/html]
Saving to: ‘STDOUT’

- 100%[===================>] 3.64K --.-KB/s in 0s

2021-01-11 21:04:57 (201 MB/s) - written to stdout [3724/3724]

root@maas:~# wget http://localhost:5240 -O - > /dev/null
--2021-01-11 21:04:59-- http://localhost:5240/
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:5240... ^C
root@maas:~# wget http://localhost:5240 -O - > /dev/null
--2021-01-11 21:05:01-- http://localhost:5240/
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:5240... ^C
root@maas:~# wget http://localhost:5240 -O - > /dev/null
--2021-01-11 21:05:03-- http://localhost:5240/
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:5240... connected.
HTTP request sent, awaiting response... 302 Found
Location: MAAS/ [following]
--2021-01-11 21:05:03-- http://localhost:5240/MAAS/
Reusing existing connection to localhost:5240.
HTTP request sent, awaiting response... 302 Found
Location: /MAAS/r/ [following]
--2021-01-11 21:05:03-- http://localhost:5240/MAAS/r/
Reusing existing connection to localhost:5240.
HTTP request sent, awaiting response... 200 OK
Length: 3724 (3.6K) [text/html]
Saving to: ‘STDOUT’

- 100%[===================>] 3.64K --.-KB/s in 0s

2021-01-11 21:05:03 (203 MB/s) - written to stdout [3724/3724]

Revision history for this message
dann frazier (dannf) wrote :
Download full text (5.2 KiB)

I ran tcpdump on lo during a failed/hanging wget attempt. This showed the SYN packet coming in, but no SYN|ACK response going out. So the packet is coming in, but nothing appears to be accept()ing it.

I took a look to see what pids are listening on port 5240:
tcp6 0 0 :::5240 :::* LISTEN 1684/python3
tcp6 0 0 :::5240 :::* LISTEN 1694/python3
tcp6 0 0 :::5240 :::* LISTEN 1703/python3
tcp6 51 0 :::5240 :::* LISTEN 1680/python3

I then ran strace on those processes, and ran wget a few times until it hung:

strace of hanging wget (mainly for timestamp):
19:12:28.826032 connect(4, {sa_family=AF_INET6, sin6_port=htons(5240), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28^Cstrace: Process 609353 detached
 <detached ...>

strace of the 4 python processes (regiond):
# strace -tt -p 1684 -p 1694 -p 1703 -p1680
[pid 1703] 19:12:28.089944 <... epoll_wait resumed>[], 10, 397) = 0
[pid 1703] 19:12:28.090230 epoll_wait(5, <unfinished ...>
[pid 1694] 19:12:28.178543 <... epoll_wait resumed>[], 9, 206) = 0
[pid 1694] 19:12:28.178798 epoll_wait(5, [], 9, 0) = 0
[pid 1694] 19:12:28.179059 epoll_wait(5, <unfinished ...>
[pid 1703] 19:12:28.190505 <... epoll_wait resumed>[], 10, 100) = 0
[pid 1703] 19:12:28.190798 epoll_wait(5, [], 10, 0) = 0
[pid 1703] 19:12:28.191106 epoll_wait(5, <unfinished ...>
[pid 1684] 19:12:28.247185 <... epoll_wait resumed>[], 10, 300) = 0
[pid 1684] 19:12:28.247463 epoll_wait(5, [], 10, 0) = 0
[pid 1684] 19:12:28.247775 epoll_wait(5, <unfinished ...>
[pid 1694] 19:12:28.677734 <... epoll_wait resumed>[], 9, 498) = 0
[pid 1694] 19:12:28.677978 epoll_wait(5, [], 9, 0) = 0
[pid 1694] 19:12:28.678228 epoll_wait(5, <unfinished ...>
[pid 1703] 19:12:28.690792 <... epoll_wait resumed>[], 10, 499) = 0
[pid 1703] 19:12:28.691054 epoll_wait(5, [], 10, 0) = 0
[pid 1703] 19:12:28.691333 epoll_wait(5, <unfinished ...>
[pid 1684] 19:12:28.747457 <... epoll_wait resumed>[], 10, 499) = 0
[pid 1684] 19:12:28.747698 epoll_wait(5, [], 10, 0) = 0
[pid 1684] 19:12:28.747951 epoll_wait(5, [], 10, 31) = 0
[pid 1684] 19:12:28.779333 epoll_wait(5, <unfinished ...>
[pid 1694] 19:12:28.972708 <... epoll_wait resumed>[], 9, 294) = 0
[pid 1694] 19:12:28.973067 epoll_wait(5, <unfinished ...>
[pid 1703] 19:12:29.090921 <... epoll_wait resumed>[], 10, 399) = 0

When the wget does not hang, we expect to see epoll return an fd followed by an accept(), such as:
[pid 1684] 19:12:30.247918 epoll_wait(5, [{EPOLLIN, {u32=19, u64=140711718551571}}], 10, 499) = 1
[pid 1684] 19:12:30.319911 accept4(19, {sa_family=AF_INET6, sin6_port=htons(55268), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_scope_id=0}, [28], SOCK_CLOEXEC) = 14

But we do not see that in the above output - each process just seems to be in an epoll_wait() loop, never noticing a new connection. But that's only 3 of the processes. We see nothing from pid 1680. What's it doing...

Read more...

Revision history for this message
dann frazier (dannf) wrote :

I tried a couple of things to create a reproducer based on the evidence in comment #11:

 1) Started an image sync and, while it was still in-progress, added an iptables rule to block access to images.maas.io.
 2) Created my own http server[*] where I made all GET requests hang, and directed MAAS to use it as its image server.

Neither was able to reproduce the issue. In both cases, strace shows all 4 regiond processes still entering epoll_wait(), and all wget commands to :5240 continued to be answered. Something more subtle must be going on here - but I don't have any more ideas about how to simulate it. Based on advice from the MAAS team, our next step will be to upgrade our server to 2.9 and see if the problem reoccurs.

[*]
#!/usr/bin/python3

import http.server
import socketserver
import time

PORT = 1978

class BrokenHTTPRequestHandler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        """Serve a GET request."""
        time.sleep(900000)

Handler = BrokenHTTPRequestHandler

with socketserver.TCPServer(("", PORT), Handler) as httpd:
    print("serving at port", PORT)
    httpd.serve_forever()

Revision history for this message
dann frazier (dannf) wrote :

The issue is still reproducible with MAAS 2.9.0 (9137-g.8e920a12b):

wget hangs:

ubuntu@maas:~$ wget --tries=1 http://localhost:5240/MAAS/rpc -O /dev/null
--2021-01-14 01:26:52-- http://localhost:5240/MAAS/rpc
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:5240... connected.
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Giving up.

One of the regiond processes is hung in a read():
ubuntu@maas:~$ sudo netstat -pan | grep LISTEN | grep 5240
tcp6 0 0 :::5240 :::* LISTEN 678/python3
tcp6 0 0 :::5240 :::* LISTEN 677/python3
tcp6 0 0 :::5240 :::* LISTEN 679/python3
tcp6 51 0 :::5240 :::* LISTEN 675/python3
ubuntu@maas:~$ sudo strace -p 675
strace: Process 675 attached
read(10,

Which is on a socket connected to images.maas.io:
ubuntu@maas:~$ sudo lsof -i -a -p 675 | grep 10u
python3 675 root 10u IPv4 2839669346 0t0 TCP maas:57546->images-maas-io.sawo.canonical.com:https (ESTABLISHED)

Revision history for this message
dann frazier (dannf) wrote :

Note that, in our environment, I've observed in 2 occasions that https connections to images.maas.io sometimes fail initially w/ an SSL error, then succeed on subsequent attempts[*]. We've no proxy configured in our environment, and I'm told no transparent one either, so it's possible the issue is on the server side. I wonder if this is related to the regiond infinite read() hang - and if so - why doesn't it eventually fail like wget does? I wonder if there's a way to simulate this w/ a standalone script that approximates whatever regiond is doing at this point. Would this do the trick?

sstream-query --keyring /snap/maas/current/usr/share/keyrings/ubuntu-cloudimage-keyring.gpg https://images.maas.io/ephemeral-v3/daily/

Unfortunately we can't readily reproduce the SSL failure, we just have to wait for it to happen again naturally.

[*]
ubuntu@maas:~$ wget https://images.maas.io
--2021-01-14 15:22:05-- https://images.maas.io/
Resolving images.maas.io (images.maas.io)... 91.189.88.136, 2001:67c:1360:8001:1::4
Connecting to images.maas.io (images.maas.io)|91.189.88.136|:443... connected.
Unable to establish SSL connection.
ubuntu@maas:~$ wget https://images.maas.io
--2021-01-14 15:23:25-- https://images.maas.io/
Resolving images.maas.io (images.maas.io)... 91.189.88.136, 2001:67c:1360:8001:1::4
Connecting to images.maas.io (images.maas.io)|91.189.88.136|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1460 (1.4K) [text/html]
Saving to: ‘index.html’

index.html 100%[===================>] 1.43K --.-KB/s in 0s

2021-01-14 15:23:25 (37.7 MB/s) - ‘index.html’ saved [1460/1460]

Revision history for this message
dann frazier (dannf) wrote :

I'm able to reproduce on a freshly installed focal + MAAS 1:2.9.0-9137-g.8e920a12b-0ubuntu1~20.04.1 PPA install on a system in the same lab. This will hopefully make things easier to debug than our snap install (both cause we can add debug code in-place, and because it won't impact our production server).

Revision history for this message
dann frazier (dannf) wrote :

Last week I instrumented the deb-based install I mentioned in Comment #15 so that I can get a traceback next time it hangs:

ubuntu@avoton01:~$ sudo grep FAULT /etc/systemd/system/maas-regiond.service
Environment="PYTHONFAULTHANDLER=1"

Unfortunately, the issue has not reoccurred there since.

Revision history for this message
dann frazier (dannf) wrote :
Download full text (9.6 KiB)

The issue has reproduced again - here's a backtrace I was able to capture from the hung regiond process:

Thread 0x00007fa414f8d700 (most recent call first):
  File "/usr/lib/python3.8/threading.py", line 306 in wait
  File "/usr/lib/python3.8/threading.py", line 558 in wait
  File "/usr/lib/python3/dist-packages/crochet/_eventloop.py", line 189 in _result
  File "/usr/lib/python3/dist-packages/crochet/_eventloop.py", line 229 in wait
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 126 in wrapper
  File "/usr/lib/python3/dist-packages/maasserver/bootresources.py", line 1359 in _import_resources_without_lock
  File "/usr/lib/python3/dist-packages/maasserver/bootresources.py", line 1429 in _import_resources_with_lock
  File "/usr/lib/python3/dist-packages/maasserver/utils/__init__.py", line 193 in call_with_lock
  File "/usr/lib/python3/dist-packages/maasserver/utils/orm.py", line 692 in call_with_connection
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 192 in wrapper
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 860 in callInContext
  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 85 in callWithContext
  File "/usr/lib/python3/dist-packages/twisted/python/context.py", line 122 in callWithContext
  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 266 in <lambda>
  File "/usr/lib/python3/dist-packages/twisted/python/threadpool.py", line 250 in inContext
  File "/usr/lib/python3/dist-packages/twisted/_threads/_team.py", line 190 in doWork
  File "/usr/lib/python3/dist-packages/twisted/_threads/_threadworker.py", line 46 in work
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 825 in worker
  File "/usr/lib/python3.8/threading.py", line 870 in run
  File "/usr/lib/python3.8/threading.py", line 932 in _bootstrap_inner
  File "/usr/lib/python3.8/threading.py", line 890 in _bootstrap

Thread 0x00007fa415bdd700 (most recent call first):
  File "/usr/lib/python3.8/threading.py", line 302 in wait
  File "/usr/lib/python3.8/queue.py", line 170 in get
  File "/usr/lib/python3/dist-packages/twisted/_threads/_threadworker.py", line 45 in work
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 825 in worker
  File "/usr/lib/python3.8/threading.py", line 870 in run
  File "/usr/lib/python3.8/threading.py", line 932 in _bootstrap_inner
  File "/usr/lib/python3.8/threading.py", line 890 in _bootstrap

Thread 0x00007fa41641e700 (most recent call first):
  File "/usr/lib/python3.8/threading.py", line 302 in wait
  File "/usr/lib/python3.8/queue.py", line 170 in get
  File "/usr/lib/python3/dist-packages/twisted/_threads/_threadworker.py", line 45 in work
  File "/usr/lib/python3/dist-packages/provisioningserver/utils/twisted.py", line 825 in worker
  File "/usr/lib/python3.8/threading.py", line 870 in run
  File "/usr/lib/python3.8/threading.py", line 932 in _bootstrap_inner
  File "/usr/lib/python3.8/threading.py", line 890 in _bootstrap

Thread 0x00007fa416c5f700 (most recent call first):
  File "/usr/lib/python3.8/threading.py", line 302 in wait
 ...

Read more...

Revision history for this message
dann frazier (dannf) wrote :

Here's a similar report of pythonssl hanging in the handshake:
  https://bugs.python.org/issue34438

Suggestion there from upstream is to add a timeout.

simplestreams seems to be using a requests object to make the connection. From simplestreams/contentsource.py:

381 self.req = requests.get(url, stream=True, auth=auth, headers=headers)

And while I'm not sure this is the canonical docs for this version, I did come across a statement at https://requests.readthedocs.io/en/master/user/quickstart/ that says:

= Note =
timeout is not a time limit on the entire response download; rather, an exception is raised if the server has not issued a response for timeout seconds (more precisely, if no bytes have been received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do not time out.

So it seems like adding a timeout might be the way to go. I'm not sure at what layer though - should simplestreams hardcode one itself? Should it allow one to be passed in?

Revision history for this message
dann frazier (dannf) wrote :
Download full text (3.5 KiB)

A reproducer would be nice you say? Well, you can point your MAAS server to this URL for MAAS images:
  https://images.maas.io/ephemeral-v3/daily/

Then you can simulate whatever-the-heck is going on in our lab environment by adding an iptables rule that will drop packets containing the server cert from the image server:

-------------
$ sudo iptables -A INPUT -p tcp -s 91.189.88.136 -m string --string maas.io --algo bm -j DROP
-------------

Then initiate an image sync. You'll find that roughly 25% of API connections hang indefinitely:

-------------
ubuntu@maas:~$ wget --tries=1 http://localhost:5240/MAAS/rpc -O /dev/null
--2021-01-29 20:59:42-- http://localhost:5240/MAAS/rpc
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:5240... connected.
HTTP request sent, awaiting response...
-------------

What's that? Would be nice if it were a standalone reproducer? Well, you can do the same thing with the same iptables rule and a simple script that makes the same simplestreams API call as regiond does:

-------------------------
$ cat repro.py
#!/usr/bin/env python3

from simplestreams.contentsource import RequestsUrlReader

url = "https://images.maas.io/ephemeral-v3/stable/streams/v1/index.sjson"
r = RequestsUrlReader(url)
-------------------------

This will hang seemingly forever. strace can be used to demonstrate it is also hung in a read call, as I showed regiond was in Comment #11:

-------------------------
$ sudo strace -p 562333
strace: Process 562333 attached
read(20, ^Cstrace: Process 562333 detached
 <detached ...>
-------------------------

And if you ^c it, it will show same backtrace I extracted from regiond in comment #17:

-------------------------
$ ./repro.py
^CTraceback (most recent call last):
  File "./repro.py", line 6, in <module>
    r = RequestsUrlReader(url)
  File "/usr/lib/python3/dist-packages/simplestreams/contentsource.py", line 381, in __init__
    self.req = requests.get(url, stream=True, auth=auth, headers=headers)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 996, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 366, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 370, in...

Read more...

Revision history for this message
dann frazier (dannf) wrote :
Download full text (4.2 KiB)

Here's a demo of adding a timeout. I'm not sure if this has to be done in the simplestreams library or if it can somehow be set by a caller:

$ diff -u /usr/lib/python3/dist-packages/simplestreams/contentsource.py.orig /usr/lib/python3/dist-packages/simplestreams/contentsource.py
--- /usr/lib/python3/dist-packages/simplestreams/contentsource.py.orig 2021-01-29 21:39:55.695029811 +0000
+++ /usr/lib/python3/dist-packages/simplestreams/contentsource.py 2021-01-29 21:40:57.327987600 +0000
@@ -378,7 +378,7 @@
         if headers == {}:
             headers = None

- self.req = requests.get(url, stream=True, auth=auth, headers=headers)
+ self.req = requests.get(url, stream=True, auth=auth, headers=headers, timeout=10)
         self.r_iter = None
         if buflen is None:
             buflen = READ_BUFFER_SIZE

------------------

$ ./repro.py
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 485, in wrap_socket
    cnx.do_handshake()
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1915, in do_handshake
    self._raise_ssl_error(self._ssl, result)
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1622, in _raise_ssl_error
    raise WantReadError()
OpenSSL.SSL.WantReadError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 996, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 366, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 488, in wrap_socket
    raise timeout("select timed out")
socket.timeout: select timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 400, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 379, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=conn.timeout)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 330, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='images.maas.io', port=443): Read timed out. (read timeout=10)

During handling of the above...

Read more...

Revision history for this message
Lee Trager (ltrager) wrote :

@dannf - Thanks for the great deep dive!

It seems the issue is in python3-simplestreams. According to [1] "Nearly all production code should use this parameter in nearly all requests. Failure to do so can cause your program to hang indefinitely"

The timeout needs to be set in RequestsUrlReader which is fairly deep in SimpleStreams code, MAAS does not directly interact with it at all. simplestreams uses Urllib2UrlReader as an alternative if requests isn't available and no timeout is set there either. RequestsUrlReader/Urllib2UrlReader is called by UrlMirrorReader which MAAS does call.

We need to figure out a good default timeout value for both RequestsUrlReader and Urllib2UrlReader. We also need to determine whether this should be configurable by the caller or not, MAAS will most likely use the default value.

[1] https://requests.readthedocs.io/en/master/user/quickstart/#timeouts

Changed in maas:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Lee Trager (ltrager)
Changed in simplestreams:
status: New → In Progress
assignee: nobody → Adam Collard (adam-collard)
Changed in simplestreams:
status: In Progress → Fix Committed
Paride Legovini (paride)
Changed in simplestreams (Ubuntu):
status: New → Triaged
Revision history for this message
dann frazier (dannf) wrote :

fyi, we saw this recur even after switching to an HTTP images mirror. We were seeing deployments fail, and I found a regiond process was again hung in a system call reading a socket from images.maas.io:80. This is our production MAAS so I didn't spend a lot of time debugging it - I kill -9'd the process and MAAS respawned/recovered. But the highlight here is that this does not appear to be specific to SSL connections.

dann frazier (dannf)
description: updated
dann frazier (dannf)
description: updated
Revision history for this message
dann frazier (dannf) wrote :
Download full text (3.3 KiB)

We've hit this again on our production MAAS server, see below. We're in a position where we regularly need to check if our regiond processes are hung and restart them. I'd therefore like to ask for a bump in priority here. Any chance for a fix soon on the 2.9 branch? Does MAAS need simplestreams to cut a release before this fix can be picked up?

ubuntu@maas:~$ ps -ef | grep region
root 633597 633491 0 14:07 ? 00:02:28 python3 /snap/maas/11778/sbin/regiond
root 633814 633597 0 14:07 ? 00:33:34 python3 /snap/maas/11778/sbin/regiond
root 633816 633597 0 14:07 ? 00:34:25 python3 /snap/maas/11778/sbin/regiond
root 633817 633597 0 14:07 ? 00:35:24 python3 /snap/maas/11778/sbin/regiond
root 633818 633597 0 14:07 ? 00:34:17 python3 /snap/maas/11778/sbin/regiond
ubuntu 648515 648489 0 18:19 pts/0 00:00:00 grep --color=auto region
ubuntu@maas:~$ sudo strace -p 633597
strace: Process 633597 attached
epoll_wait(3, [], 10, 94) = 0
epoll_wait(3, [], 10, 0) = 0
epoll_wait(3, [], 10, 99) = 0
epoll_wait(3, [], 10, 399) = 0
epoll_wait(3, [], 10, 0) = 0
epoll_wait(3, ^Cstrace: Process 633597 detached
 <detached ...>

ubuntu@maas:~$ sudo strace -p 633814
strace: Process 633814 attached
recvfrom(30, ^Cstrace: Process 633814 detached
 <detached ...>

ubuntu@maas:~$ sudo lsof -i -a -p 633814
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python3 633814 root 10u IPv6 696471999 0t0 TCP maas:5240->10.229.192.54:54706 (ESTABLISHED)
python3 633814 root 14u IPv6 680163730 0t0 TCP *:5251 (LISTEN)
python3 633814 root 15u IPv4 696151424 0t0 TCP localhost:51510->localhost:postgresql (ESTABLISHED)
python3 633814 root 16u IPv6 683429160 0t0 TCP maas:5240->1.general.morphis.uk.vpn:38664 (CLOSE_WAIT)
python3 633814 root 17u IPv4 680183835 0t0 TCP localhost:59674->localhost:postgresql (ESTABLISHED)
python3 633814 root 18u IPv4 696173835 0t0 TCP localhost:51520->localhost:postgresql (ESTABLISHED)
python3 633814 root 19u IPv6 680176399 0t0 TCP maas:5251->maas:45382 (CLOSE_WAIT)
python3 633814 root 20u IPv4 696078469 0t0 TCP localhost:51486->localhost:postgresql (ESTABLISHED)
python3 633814 root 21u IPv6 680176401 0t0 TCP *:5240 (LISTEN)
python3 633814 root 22u IPv4 696206297 0t0 TCP localhost:51932->localhost:postgresql (ESTABLISHED)
python3 633814 root 23u IPv6 680199940 0t0 TCP maas:5240->2.general.dannf.us.vpn:55866 (CLOSE_WAIT)
python3 633814 root 26u IPv6 680154669 0t0 TCP maas:5240->2.general.dannf.us.vpn:55848 (ESTABLISHED)
python3 633814 root 27u IPv4 696430193 0t0 TCP localhost:51948->localhost:postgresql (ESTABLISHED)
python3 633814 root 28u IPv6 680199946 0t0 TCP maas:5240->2.general.dannf.us.vpn:55868 (CLOSE_WAIT)
python3 633814 root 29u IPv4 696373443 0t0 TCP localhost:51938->localhost:postgresql (ESTABLISHED)
python3 633814 root 30u IPv4 696457905 0t0 TCP maas:49652->images-maas-io.sawo.canonical.com:http (ESTABLISHED)
python3 633814 root 31u IPv4 696394600 ...

Read more...

Paride Legovini (paride)
Changed in simplestreams (Ubuntu):
assignee: nobody → Paride Legovini (paride)
Paride Legovini (paride)
Changed in simplestreams (Ubuntu Focal):
status: New → Triaged
Changed in simplestreams (Ubuntu):
status: Triaged → In Progress
Revision history for this message
Paride Legovini (paride) wrote :

smoser sponsored the upload:

$ dput ubuntu ../out/simplestreams_0.1.0-46-gb1f40d99-0ubuntu1_source.changes
Checking signature on .changes
gpg: ../out/simplestreams_0.1.0-46-gb1f40d99-0ubuntu1_source.changes: Valid signature from 1E4410A4024BC6F0
Checking signature on .dsc
gpg: ../out/simplestreams_0.1.0-46-gb1f40d99-0ubuntu1.dsc: Valid signature from 1E4410A4024BC6F0
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading simplestreams_0.1.0-46-gb1f40d99-0ubuntu1.dsc: done.
  Uploading simplestreams_0.1.0-46-gb1f40d99.orig.tar.gz: done.
  Uploading simplestreams_0.1.0-46-gb1f40d99-0ubuntu1.debian.tar.xz: done.
  Uploading simplestreams_0.1.0-46-gb1f40d99-0ubuntu1_source.buildinfo: done.
  Uploading simplestreams_0.1.0-46-gb1f40d99-0ubuntu1_source.changes: done.
Successfully uploaded packages.

Unfortunately the fix for this bug (f37d2ed2) doesn't mention this bug with the right syntax, so this bug won't get auto-updated by Launchpad. Will update it manually.

Changed in simplestreams:
status: Fix Committed → Fix Released
Changed in simplestreams (Ubuntu):
status: In Progress → Fix Released
Paride Legovini (paride)
Changed in simplestreams (Ubuntu Focal):
assignee: nobody → Paride Legovini (paride)
Paride Legovini (paride)
description: updated
Paride Legovini (paride)
Changed in simplestreams (Ubuntu Focal):
status: Triaged → In Progress
Paride Legovini (paride)
Changed in simplestreams (Ubuntu Groovy):
assignee: nobody → Paride Legovini (paride)
Changed in simplestreams (Ubuntu Hirsute):
assignee: nobody → Paride Legovini (paride)
Changed in simplestreams (Ubuntu Groovy):
status: New → In Progress
Changed in simplestreams (Ubuntu Hirsute):
status: New → In Progress
Paride Legovini (paride)
Changed in simplestreams (Ubuntu):
status: Fix Released → Confirmed
status: Confirmed → Fix Released
Paride Legovini (paride)
Changed in simplestreams (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in simplestreams (Ubuntu Groovy):
status: In Progress → Fix Committed
Changed in simplestreams (Ubuntu Hirsute):
status: In Progress → Fix Committed
Revision history for this message
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Patricia, or anyone else affected,

Accepted simplestreams into hirsute-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/simplestreams/0.1.0-30-g3cc8988a-0ubuntu1.21.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-hirsute to verification-done-hirsute. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-hirsute. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-needed verification-needed-hirsute
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

Hello Patricia, or anyone else affected,

Accepted simplestreams into groovy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/simplestreams/0.1.0-30-g3cc8988a-0ubuntu1.20.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-groovy to verification-done-groovy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-groovy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-needed-groovy
dann frazier (dannf)
description: updated
Revision history for this message
dann frazier (dannf) wrote :
Download full text (7.4 KiB)

= hirsute verification =
# Start by showing we can still reproduce the problem w/o the -proposed packages:
ubuntu@avoton02:~$ sudo iptables -A INPUT -p tcp -s 91.189.88.136 -m string --string maas.io --algo bm -j DROP
ubuntu@avoton02:~$ python3 ./repro.py & sleep 60
[1] 3386
# 60 seconds have passed, still hung:
ubuntu@avoton02:~$ sudo strace -p 3386
strace: Process 3386 attached
read(3, ^Cstrace: Process 3386 detached
 <detached ...>

ubuntu@avoton02:~$ fg
python3 ./repro.py
^CTraceback (most recent call last):
  File "/home/ubuntu/./repro.py", line 6, in <module>
    r = RequestsUrlReader(url)
  File "/usr/lib/python3/dist-packages/simplestreams/contentsource.py", line 381, in __init__
    self.req = requests.get(url, stream=True, auth=auth, headers=headers)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 1012, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 411, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 428, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 472, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3.9/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib/python3.9/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/usr/lib/python3.9/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
KeyboardInterrupt

# Now upgrade and demonstrate the problem is fixed

ubuntu@avoton02:~$ sudo apt install python3-simplestreams simplestreams -y
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be upgraded:
  python3-simplestreams simplestreams
2 upgraded, 0 newly installed, 0 to remove and 68 not upgraded.
Need to get 31.8 kB/37.8 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu hirsute-proposed/main amd64 python3-simplestreams all 0.1.0-30-g3cc8988a-0ubuntu1.21.04.1 [31.8 kB]
Fetched 31.8 kB in 0s (119 kB/s)
(Reading database ... 79414 files and directories currently instal...

Read more...

tags: added: verification-done-hirsute
removed: verification-needed-hirsute
Revision history for this message
dann frazier (dannf) wrote :
Download full text (7.2 KiB)

= groovy verification =
# Start by showing we can still reproduce the problem w/o the -proposed packages:
ubuntu@avoton07:~$ sudo iptables -A INPUT -p tcp -s 91.189.88.136 -m string --string maas.io --algo bm -j DROP
ubuntu@avoton07:~$ cat > repro.py
#!/usr/bin/env python3

from simplestreams.contentsource import RequestsUrlReader

url = "https://images.maas.io/ephemeral-v3/stable/streams/v1/index.sjson"
r = RequestsUrlReader(url)
ubuntu@avoton07:~$ python3 ./repro.py & sleep 60
[1] 3225

# 60 seconds have passed, still hung:
ubuntu@avoton07:~$ sudo strace -p 3225
strace: Process 3225 attached
read(3, ^Cstrace: Process 3225 detached
 <detached ...>

ubuntu@avoton07:~$ fg
python3 ./repro.py
^CTraceback (most recent call last):
  File "./repro.py", line 6, in <module>
    r = RequestsUrlReader(url)
  File "/usr/lib/python3/dist-packages/simplestreams/contentsource.py", line 381, in __init__
    self.req = requests.get(url, stream=True, auth=auth, headers=headers)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 978, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 361, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 377, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 488, in wrap_socket
    cnx.do_handshake()
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1933, in do_handshake
    result = _lib.SSL_do_handshake(self._ssl)
KeyboardInterrupt

# Now upgrade and demonstrate the problem is fixed
ubuntu@avoton07:~$ sudo apt install simplestreams python3-simplestreams
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be upgraded:
  python3-simplestreams simplestreams
2 upgraded, 0 newly installed, 0 to remove and 67 not upgraded.
Need to get 37.8 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu groovy-proposed/main amd64 python3-simplestreams all 0.1.0-30-g3cc8988a-0ubuntu1.20.10.1 [31.8 kB]
Get:2 http://archive.ubuntu.com/ubuntu groovy-proposed/universe amd64 simplestreams all 0.1.0-30...

Read more...

tags: added: verification-done-groovy
removed: verification-needed-groovy
Revision history for this message
Brian Murray (brian-murray) wrote :

Hello Patricia, or anyone else affected,

Accepted simplestreams into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/simplestreams/0.1.0-30-g3cc8988a-0ubuntu1.20.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

tags: added: verification-needed-focal
Revision history for this message
dann frazier (dannf) wrote :
Download full text (7.1 KiB)

= focal verification =
# Start by showing we can still reproduce the problem w/o the -proposed packages:
ubuntu@avoton07:~$ sudo iptables -A INPUT -p tcp -s 91.189.88.136 -m string --string maas.io --algo bm -j DROP
ubuntu@avoton07:~$ cat > repro.py
#!/usr/bin/env python3

from simplestreams.contentsource import RequestsUrlReader

url = "https://images.maas.io/ephemeral-v3/stable/streams/v1/index.sjson"
r = RequestsUrlReader(url)
ubuntu@avoton07:~$ python3 ./repro.py & sleep 60
[1] 3225
ubuntu@avoton07:~$ sudo strace -p 3225
strace: Process 3225 attached
read(3, ^Cstrace: Process 3225 detached
 <detached ...>

ubuntu@avoton07:~$ fg
python3 ./repro.py
^CTraceback (most recent call last):
  File "./repro.py", line 6, in <module>
    r = RequestsUrlReader(url)
  File "/usr/lib/python3/dist-packages/simplestreams/contentsource.py", line 381, in __init__
    self.req = requests.get(url, stream=True, auth=auth, headers=headers)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 978, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 361, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 377, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 488, in wrap_socket
    cnx.do_handshake()
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1933, in do_handshake
    result = _lib.SSL_do_handshake(self._ssl)
KeyboardInterrupt

# Now upgrade and demonstrate the problem is fixed
ubuntu@avoton07:~$ sudo apt install simplestreams python3-simplestreams
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be upgraded:
  python3-simplestreams simplestreams
2 upgraded, 0 newly installed, 0 to remove and 67 not upgraded.
Need to get 37.8 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu groovy-proposed/main amd64 python3-simplestreams all 0.1.0-30-g3cc8988a-0ubuntu1.20.10.1 [31.8 kB]
Get:2 http://archive.ubuntu.com/ubuntu groovy-proposed/universe amd64 simplestreams all 0.1.0-30-g3cc8988a-0ubuntu1.20.10.1 [6012 B]
Fet...

Read more...

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
dann frazier (dannf) wrote :

Restoring verification-needed-focal tag; bdmurray correctly points out my focal verification was actually done on groovy :(

tags: added: verification-needed verification-needed-focal
removed: verification-done verification-done-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package simplestreams - 0.1.0-30-g3cc8988a-0ubuntu1.21.04.1

---------------
simplestreams (0.1.0-30-g3cc8988a-0ubuntu1.21.04.1) hirsute; urgency=medium

  * Add 10s timeout to out-going requests to avoid blocking.
    - Via: d/patches/0001-Add-10s-timeout-to-out-going-requests.patch
      Upstream cherry-pick. Thanks: Adam Collard. (LP: #1908452)

 -- Paride Legovini <email address hidden> Fri, 11 Jun 2021 16:51:14 +0200

Changed in simplestreams (Ubuntu Hirsute):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for simplestreams has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package simplestreams - 0.1.0-30-g3cc8988a-0ubuntu1.20.10.1

---------------
simplestreams (0.1.0-30-g3cc8988a-0ubuntu1.20.10.1) groovy; urgency=medium

  * Add 10s timeout to out-going requests to avoid blocking.
    - Via: d/patches/0001-Add-10s-timeout-to-out-going-requests.patch
      Upstream cherry-pick. Thanks: Adam Collard. (LP: #1908452)

 -- Paride Legovini <email address hidden> Fri, 11 Jun 2021 16:51:14 +0200

Changed in simplestreams (Ubuntu Groovy):
status: Fix Committed → Fix Released
Revision history for this message
dann frazier (dannf) wrote :
Download full text (7.1 KiB)

= focal verification =
# Start by showing we can still reproduce the problem w/o the -proposed packages:
ubuntu@dannf-test:~$ sudo iptables -A INPUT -p tcp -s 91.189.88.136 -m string --string maas.io --algo bm -j DROP
ubuntu@dannf-test:~$ cat > repro.py
#!/usr/bin/env python3

from simplestreams.contentsource import RequestsUrlReader

url = "https://images.maas.io/ephemeral-v3/stable/streams/v1/index.sjson"
r = RequestsUrlReader(url)
ubuntu@dannf-test:~$ python3 ./repro.py & sleep 60
[1] 1687
ubuntu@dannf-test:~$ sudo strace -p 1687
strace: Process 1687 attached
read(3, ^Cstrace: Process 1687 detached
 <detached ...>

ubuntu@dannf-test:~$ fg
python3 ./repro.py
^CTraceback (most recent call last):
  File "./repro.py", line 6, in <module>
    r = RequestsUrlReader(url)
  File "/usr/lib/python3/dist-packages/simplestreams/contentsource.py", line 381, in __init__
    self.req = requests.get(url, stream=True, auth=auth, headers=headers)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 996, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 366, in connect
    self.sock = ssl_wrap_socket(
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 485, in wrap_socket
    cnx.do_handshake()
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1914, in do_handshake
    result = _lib.SSL_do_handshake(self._ssl)
KeyboardInterrupt

# Now upgrade and demonstrate the problem is fixed

ubuntu@dannf-test:~$ sudo apt install python3-simplestreams simplestreams
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be upgraded:
  python3-simplestreams simplestreams
2 upgraded, 0 newly installed, 0 to remove and 24 not upgraded.
Need to get 37.8 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu focal-proposed/main amd64 python3-simplestreams all 0.1.0-30-g3cc8988a-0ubuntu1.20.04.1 [31.8 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-proposed/universe amd64 simplestreams all 0.1.0-30-g3cc8988a-0ubuntu1.20.04.1 [...

Read more...

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package simplestreams - 0.1.0-30-g3cc8988a-0ubuntu1.20.04.1

---------------
simplestreams (0.1.0-30-g3cc8988a-0ubuntu1.20.04.1) focal; urgency=medium

  * Add 10s timeout to out-going requests to avoid blocking.
    - Via: d/patches/0001-Add-10s-timeout-to-out-going-requests.patch
      Upstream cherry-pick. Thanks: Adam Collard. (LP: #1908452)

 -- Paride Legovini <email address hidden> Fri, 11 Jun 2021 16:51:14 +0200

Changed in simplestreams (Ubuntu Focal):
status: Fix Committed → Fix Released
Changed in maas:
assignee: Lee Trager (ltrager) → nobody
Changed in maas:
status: In Progress → Fix Committed
milestone: none → 3.4.0
Changed in maas:
milestone: 3.4.0 → 3.4.0-rc2
Changed in maas:
assignee: nobody → Adam Collard (adam-collard)
Alberto Donato (ack)
Changed in maas:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.