Merge ~andersson123/autopkgtest-cloud:cache-amqp-fix into autopkgtest-cloud:master

Proposed by Tim Andersson
Status: Merged
Merged at revision: ea3422504760925873da0b96b8a6ef59576f64c4
Proposed branch: ~andersson123/autopkgtest-cloud:cache-amqp-fix
Merge into: autopkgtest-cloud:master
Diff against target: 108 lines (+46/-2)
4 files modified
charms/focal/autopkgtest-web/config.yaml (+6/-0)
charms/focal/autopkgtest-web/reactive/autopkgtest_web.py (+3/-0)
charms/focal/autopkgtest-web/webcontrol/cache-amqp (+36/-2)
mojo/service-bundle (+1/-0)
Reviewer Review Type Date Requested Status
Brian Murray Approve
Paride Legovini Disapprove
Review via email: mp+456445@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Paride Legovini (paride) wrote :

AFAICT currently the autopkgtest-cloud service would keep working fine if the web workers are down, as those are only used for user interaction. Here we add an dependency on the web units being up, in order for the process to work, and I think we don't want this.

Also: loading the queue contents in two very different ways depending on whether the unit is leader or not will make debugging difficult.

The AutopkgtestQueueContents class already has checks for leader, adding an "external" check like proposed here calls for other changes in that logic.

What I think is that there should be another way to query rabbitmq for queue length, without popping / pushing all the queue items.

review: Disapprove
Revision history for this message
Tim Andersson (andersson123) wrote :

I don't understand how this adds a dependency on the web units being up. The only dependency is that the non-leader web unit depends on the leader web-unit to be functional, somewhat. If there's something I'm missing let me know, please elaborate.

I don't see why loading it in two different ways makes debugging difficult - the tracebacks would always be explicitly different (one would involve urllib, one would involve amqplib), and we can easily find out which web unit is the leader, thus helping debugging. We can also make the debug logging more verbose.

Using the leader check from AutopkgtestQueueContents does make sense, I'll amend for that.

As far as I understand, that's not how rabbitmq works. See the link below:
https://stackoverflow.com/questions/10709533/is-it-possible-to-view-rabbitmq-message-contents-directly-from-the-command-line

I think there's maybe ways with rabbitmqctl and rabbitmq_admin but we're not running this script from the rabbitmq machine.

Revision history for this message
Tim Andersson (andersson123) :
Revision history for this message
Brian Murray (brian-murray) wrote :

Paride raises a valid about hard coding the cookies for the servers in the script in part because this would need updating whenever we add or replace web servers. (Something we've had to do in the past!)

An alternate approach would be to write the IP addresses of all the autopkgtest-web servers to ~/autopkgtest-cloud.conf. This file is already read by cache-amqp and could be used to connect to the other apache2 servers directly and grab the queues.json file and by passing the haproxy server.

Does that sound like a workable solution to everyone?

review: Needs Fixing
Revision history for this message
Tim Andersson (andersson123) wrote :

I thought about what you suggest but you can't ssh between web units, so that'd be quite involved to make that possible.

I think we could *not* hard code the cookies with something like this (pseudocode):

for i in range(x):
    # get queues.json until it's different to the one on disk
    srv_name = "S" + str(i)
    etc

What do you think? I think incorporating the ip addresses is much more involved IMO

Revision history for this message
Tim Andersson (andersson123) wrote :

Or, actually, we could incorporate the cookies in ~/autopkgtest-cloud.conf ?

I think that's probably the best approach

Revision history for this message
Tim Andersson (andersson123) wrote :
Revision history for this message
Brian Murray (brian-murray) wrote :

This looks good I've just one in-line comment regarding clarifying the types of cookies in the configuration. Feel free to merge this after making that change. Thanks!

review: Approve
Revision history for this message
Tim Andersson (andersson123) wrote :

Great, thanks! I'll merge once CI has passed.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/charms/focal/autopkgtest-web/config.yaml b/charms/focal/autopkgtest-web/config.yaml
2index cc97824..20def8c 100644
3--- a/charms/focal/autopkgtest-web/config.yaml
4+++ b/charms/focal/autopkgtest-web/config.yaml
5@@ -34,3 +34,9 @@ options:
6 default:
7 description: "$no_proxy environment variable (for accessing sites \
8 other than github, like login.ubuntu.com and launchpad.net)"
9+ cookies:
10+ type: string
11+ default:
12+ description: "SRVNAME cookies for each autopkgtest-web unit. \
13+ Each web unit has a cookie assigned which tells the \
14+ haproxy server which web unit to redirect a request to."
15diff --git a/charms/focal/autopkgtest-web/reactive/autopkgtest_web.py b/charms/focal/autopkgtest-web/reactive/autopkgtest_web.py
16index db60fcb..93eaa21 100644
17--- a/charms/focal/autopkgtest-web/reactive/autopkgtest_web.py
18+++ b/charms/focal/autopkgtest-web/reactive/autopkgtest_web.py
19@@ -54,10 +54,12 @@ def setup_rabbitmq(rabbitmq):
20 "amqp.available",
21 "config.set.storage-url-internal",
22 "config.set.hostname",
23+ "config.set.cookies",
24 )
25 def write_autopkgtest_cloud_conf(rabbitmq):
26 swiftinternal = config().get("storage-url-internal")
27 hostname = config().get("hostname")
28+ cookies = config().get("cookies")
29 rabbituser = rabbitmq.username()
30 rabbitpassword = rabbitmq.password()
31 rabbithost = rabbitmq.private_address()
32@@ -71,6 +73,7 @@ def write_autopkgtest_cloud_conf(rabbitmq):
33 database_ro=/home/ubuntu/public/autopkgtest.db
34 SwiftURL={swiftinternal}
35 ExternalURL=https://{hostname}/results
36+ cookies={cookies}
37
38 [amqp]
39 uri=amqp://{rabbituser}:{rabbitpassword}@{rabbithost}""".format(
40diff --git a/charms/focal/autopkgtest-web/webcontrol/cache-amqp b/charms/focal/autopkgtest-web/webcontrol/cache-amqp
41index 512b495..33eb496 100755
42--- a/charms/focal/autopkgtest-web/webcontrol/cache-amqp
43+++ b/charms/focal/autopkgtest-web/webcontrol/cache-amqp
44@@ -10,6 +10,7 @@ import sys
45 import tempfile
46 import time
47 import urllib.parse
48+import urllib.request
49
50 import amqplib.client_0_8 as amqp
51 from amqplib.client_0_8.exceptions import AMQPChannelException
52@@ -259,9 +260,42 @@ if __name__ == "__main__":
53 except KeyError:
54 print("No database found", file=sys.stderr)
55 sys.exit(1)
56+ try:
57+ webpage = cp["web"]["ExternalURL"].replace("/results", "")
58+ except KeyError:
59+ print("No external url found!")
60+ sys.exit(1)
61+ try:
62+ cookies = cp["web"]["cookies"]
63+ except KeyError:
64+ print("No cookies in config!")
65+ sys.exit(1)
66
67- aq = AutopkgtestQueueContents(amqp_uri, database)
68- queue_contents = aq.get_queue_contents()
69+ queue_contents = {}
70+ # We only actually want to interact with the rabbitmq queue if
71+ # we're the leader - otherwise these two cache-amqp processes
72+ # make the queue size go crazy in the KPI
73+ if os.path.isfile("/run/autopkgtest-web-is-leader"):
74+ # Get queue details from rabbitmq directly
75+ aq = AutopkgtestQueueContents(amqp_uri, database)
76+ queue_contents = aq.get_queue_contents()
77+ else:
78+ # We get queues.json from autopkgtest.ubuntu.com, see if it's
79+ # the same as the version on disk. If it's not, then this webworker
80+ # (the non leader one) has acquired queues.json from the leader.
81+ # If they're the same we've queried ourselves for queues.json
82+ # and we increment the SRVNAME cookie to query the other
83+ # webworker (the leader)
84+ queued_local = {}
85+ if os.path.isfile(args.output):
86+ with open(args.output, "r") as f:
87+ queued_local = json.load(f)
88+ for srvname_cookie in cookies.split(" "):
89+ req = urllib.request.Request(webpage + "/queues.json")
90+ req.add_header("Cookie", "SRVNAME=" + srvname_cookie)
91+ queue_contents = json.loads(urllib.request.urlopen(req).read())
92+ if queue_contents != queued_local:
93+ break
94
95 with tempfile.NamedTemporaryFile(
96 mode="w", dir=os.path.dirname(args.output), delete=False
97diff --git a/mojo/service-bundle b/mojo/service-bundle
98index 17d8bbd..42d0566 100644
99--- a/mojo/service-bundle
100+++ b/mojo/service-bundle
101@@ -191,6 +191,7 @@ applications:
102 swift-web-credentials: include-file://{{local_dir}}/swift-web-credentials.conf
103 https-proxy: {{ https_proxy }}
104 no-proxy: {{ no_proxy }}
105+ cookies: S0 S1
106 {%- endif %}
107 haproxy:
108 charm: cs:haproxy

Subscribers

People subscribed via source and target branches