Merge ~r00ta/maas:lp-2058377-fix-ha-architectures into maas:master

Proposed by Jacopo Rota
Status: Merged
Approved by: Jacopo Rota
Approved revision: fab53ebca4a870a481dc772288823122ddf38a61
Merge reported by: MAAS Lander
Merged at revision: not available
Proposed branch: ~r00ta/maas:lp-2058377-fix-ha-architectures
Merge into: maas:master
Diff against target: 181 lines (+110/-5)
3 files modified
src/maasserver/models/bootresource.py (+22/-2)
src/maasserver/testing/factory.py (+3/-3)
src/maasserver/tests/test_bootresources.py (+85/-0)
Reviewer Review Type Date Requested Status
MAAS Lander Approve
Alexsander de Souza Approve
Review via email: mp+462700@code.launchpad.net

Commit message

fix: lp-2058377. Fix get_hwe_kernels, get_latest_complete_set queries.

Description of the change

Currently, the query

resource_sets = self.sets.order_by("-id").annotate(
   files_count=Count("files__id", distinct=True),
   files_size=Sum("files__size"),
   sync_size=Sum("files__bootresourcefilesync__size"),
   )

is calculating the wrong sum for the files size because the `Sum("files__bootresourcefilesync__size")` is performing a FULL OUTER JOIN on the bootresourcefilesync set, ending up with duplicated rows and duplicating the files_size by consequence.

The original query in 3.4 was

>>> print(b.sets.order_by("-id").annotate(
... files_count=Count("files__id"),
... files_size=Sum("files__largefile__size"),
... files_total_size=Sum("files__largefile__total_size"),
... ).query)

SELECT
  "maasserver_bootresourceset"."id",
  "maasserver_bootresourceset"."created",
  "maasserver_bootresourceset"."updated",
  "maasserver_bootresourceset"."resource_id",
  "maasserver_bootresourceset"."version",
  "maasserver_bootresourceset"."label",
  COUNT(
    "maasserver_bootresourcefile"."id"
  ) AS "files_count",
  SUM("maasserver_largefile"."size") AS "files_size",
  SUM(
    "maasserver_largefile"."total_size"
  ) AS "files_total_size"
FROM
  "maasserver_bootresourceset"
  LEFT OUTER JOIN "maasserver_bootresourcefile" ON (
    "maasserver_bootresourceset"."id" = "maasserver_bootresourcefile"."resource_set_id"
  )
  LEFT OUTER JOIN "maasserver_largefile" ON (
    "maasserver_bootresourcefile"."largefile_id" = "maasserver_largefile"."id"
  )
WHERE
  "maasserver_bootresourceset"."resource_id" = 9
GROUP BY
  "maasserver_bootresourceset"."id"
ORDER BY
  "maasserver_bootresourceset"."id" DESC

and in the case of multiple resource sets it would have produced (on a 3.4 env)

 id | created | updated | resource_id | version | label | files_count | files_size | files_total_size
----+-------------------------------+-------------------------------+-------------+----------+-----------+-------------+------------+------------------
 21 | 2024-03-20 09:22:53.089051+00 | 2024-03-20 09:22:53.109408+00 | 9 | 20240319 | candidate | 3 | 11563368 | 592121637
 15 | 2024-03-20 06:43:25.761506+00 | 2024-03-20 08:58:58.392482+00 | 9 | 20240318 | stable | 3 | 592011128 | 592011128

The proposed query leverages on a subquery to sum the files sizes in the resource set using a subquery. The resulting query is

 SELECT
  "maasserver_bootresourceset"."id",
  "maasserver_bootresourceset"."created",
  "maasserver_bootresourceset"."updated",
  "maasserver_bootresourceset"."resource_id",
  "maasserver_bootresourceset"."version",
  "maasserver_bootresourceset"."label",
  COUNT(
    DISTINCT "maasserver_bootresourcefile"."id"
  ) AS "files_count",
  (
    SELECT
      SUM(U0."size") AS "group_size"
    FROM
      "maasserver_bootresourcefile" U0
    WHERE
      U0."resource_set_id" = "maasserver_bootresourceset"."id"
    GROUP BY
      U0."resource_set_id"
    LIMIT
      1
  ) AS "files_size",
  SUM(
    "maasserver_bootresourcefilesync"."size"
  ) AS "sync_size"
FROM
  "maasserver_bootresourceset"
  LEFT OUTER JOIN "maasserver_bootresourcefile" ON (
    "maasserver_bootresourceset"."id" = "maasserver_bootresourcefile"."resource_set_id"
  )
  LEFT OUTER JOIN "maasserver_bootresourcefilesync" ON (
    "maasserver_bootresourcefile"."id" = "maasserver_bootresourcefilesync"."file_id"
  )
WHERE
  "maasserver_bootresourceset"."resource_id" = 9
GROUP BY
  "maasserver_bootresourceset"."id"
ORDER BY
  "maasserver_bootresourceset"."id" DESC

and it would produce (on a 3.5 env)

 id | created | updated | resource_id | version | label | files_count | files_size | sync_size
----+-------------------------------+-------------------------------+-------------+----------+-----------+-------------+------------+------------
 20 | 2024-03-20 09:27:41.751309+00 | 2024-03-20 09:27:41.770736+00 | 8 | 20240318 | stable | 3 | 592105406 | 11528040
 14 | 2024-03-20 08:04:51.163702+00 | 2024-03-20 09:27:01.581783+00 | 8 | 20240319 | candidate | 3 | 592215851 | 1184431702
(2 rows)

To post a comment you must log in.
Revision history for this message
Alexsander de Souza (alexsander-souza) wrote :

+1

review: Approve
Revision history for this message
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b lp-2058377-fix-ha-architectures lp:~r00ta/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: fab53ebca4a870a481dc772288823122ddf38a61

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/src/maasserver/models/bootresource.py b/src/maasserver/models/bootresource.py
2index 7d46aae..a5a94f6 100644
3--- a/src/maasserver/models/bootresource.py
4+++ b/src/maasserver/models/bootresource.py
5@@ -14,7 +14,9 @@ from django.db.models import (
6 IntegerField,
7 JSONField,
8 Manager,
9+ OuterRef,
10 Prefetch,
11+ Subquery,
12 Sum,
13 )
14
15@@ -200,6 +202,7 @@ class BootResourceManager(Manager):
16 UPD: because of 3.4 transition from "subarches" to "platforms",
17 expect some confusion ahead.
18 """
19+ from maasserver.models.bootresourcefile import BootResourceFile
20 from maasserver.utils.osystems import (
21 get_release_version_from_string,
22 parse_subarch_kernel_string,
23@@ -210,9 +213,17 @@ class BootResourceManager(Manager):
24 if not architecture:
25 architecture = ""
26
27+ # In order to calculate the sum of the bootresourcefilesync we have to FULL OUTER JOIN the tables and the rows are
28+ # duplicated: this is why we need a subquery to calculate the total files size.
29+ subquery = (
30+ BootResourceFile.objects.filter(resource_set_id=OuterRef("pk"))
31+ .values("resource_set_id")
32+ .annotate(group_size=Sum("size"))
33+ .values("group_size")[:1]
34+ )
35 sets_prefetch = BootResourceSet.objects.annotate(
36 files_count=Count("files__id", distinct=True),
37- files_size=Sum("files__size"),
38+ files_size=Subquery(subquery),
39 sync_size=Sum("files__bootresourcefilesync__size"),
40 )
41 sets_prefetch = sets_prefetch.prefetch_related("files")
42@@ -571,15 +582,24 @@ class BootResource(CleanSave, TimestampedModel):
43 def get_latest_complete_set(self):
44 """Return latest `BootResourceSet` where all `BootResouceFile`'s
45 are complete."""
46+ from maasserver.models.bootresourcefile import BootResourceFile
47 from maasserver.models.node import RegionController
48
49 if (
50 not hasattr(self, "_prefetched_objects_cache")
51 or "sets" not in self._prefetched_objects_cache
52 ):
53+ # In order to calculate the sum of the bootresourcefilesync we have to FULL OUTER JOIN the tables and the rows are
54+ # duplicated: this is why we need a subquery to calculate the total files size.
55+ subquery = (
56+ BootResourceFile.objects.filter(resource_set_id=OuterRef("pk"))
57+ .values("resource_set_id")
58+ .annotate(group_size=Sum("size"))
59+ .values("group_size")[:1]
60+ )
61 resource_sets = self.sets.order_by("-id").annotate(
62 files_count=Count("files__id", distinct=True),
63- files_size=Sum("files__size"),
64+ files_size=Subquery(subquery),
65 sync_size=Sum("files__bootresourcefilesync__size"),
66 )
67 else:
68diff --git a/src/maasserver/testing/factory.py b/src/maasserver/testing/factory.py
69index 8ddeb37..71d7ced 100644
70--- a/src/maasserver/testing/factory.py
71+++ b/src/maasserver/testing/factory.py
72@@ -2490,10 +2490,10 @@ class Factory(maastesting.factory.Factory):
73 )
74
75 if synced:
76- for st in synced:
77- sync_size = rfile.size if st[1] < 0 else st[1]
78+ for region_controller, size in synced:
79+ sync_size = rfile.size if size < 0 else size
80 rfile.bootresourcefilesync_set.create(
81- region=st[0], size=sync_size
82+ region=region_controller, size=sync_size
83 )
84
85 return rfile
86diff --git a/src/maasserver/tests/test_bootresources.py b/src/maasserver/tests/test_bootresources.py
87index 8a2ae38..b9e5902 100644
88--- a/src/maasserver/tests/test_bootresources.py
89+++ b/src/maasserver/tests/test_bootresources.py
90@@ -715,6 +715,91 @@ class TestBootResourceTransactional(MAASTransactionServerTestCase):
91 logger.output,
92 )
93
94+ def test_get_latest_complete_set_not_in_sync_between_regions(self):
95+ self.patch(bootresources.Event.objects, "create_region_event")
96+ name, architecture, product = make_product()
97+ second_region = factory.make_RegionController()
98+ resource = factory.make_BootResource(
99+ rtype=BOOT_RESOURCE_TYPE.SYNCED,
100+ name=name,
101+ architecture=architecture,
102+ kflavor="generic",
103+ )
104+ resource_set = factory.make_BootResourceSet(
105+ resource,
106+ version=product["version_name"],
107+ )
108+ factory.make_boot_resource_file_with_content(
109+ resource_set,
110+ filename=product["ftype"],
111+ filetype=product["ftype"],
112+ synced=[
113+ (self.region, -1),
114+ (second_region, 0),
115+ ], # Not in sync between regions
116+ )
117+ self.assertIsNone(resource.get_latest_complete_set())
118+
119+ def test_get_latest_complete_set_has_completed(self):
120+ self.patch(bootresources.Event.objects, "create_region_event")
121+ name, architecture, product = make_product()
122+ second_region = factory.make_RegionController()
123+ resource = factory.make_BootResource(
124+ rtype=BOOT_RESOURCE_TYPE.SYNCED,
125+ name=name,
126+ architecture=architecture,
127+ kflavor="generic",
128+ )
129+ resource_set = factory.make_BootResourceSet(
130+ resource,
131+ version=product["version_name"],
132+ )
133+ factory.make_boot_resource_file_with_content(
134+ resource_set,
135+ filename=product["ftype"],
136+ filetype=product["ftype"],
137+ synced=[
138+ (self.region, -1),
139+ (second_region, -1),
140+ ], # In Sync between regions
141+ )
142+ self.assertEqual(resource_set, resource.get_latest_complete_set())
143+
144+ def test_get_latest_complete_multiple_sets_incomplete(self):
145+ self.patch(bootresources.Event.objects, "create_region_event")
146+ name, architecture, product = make_product()
147+ second_region = factory.make_RegionController()
148+ resource = factory.make_BootResource(
149+ rtype=BOOT_RESOURCE_TYPE.SYNCED,
150+ name=name,
151+ architecture=architecture,
152+ kflavor="generic",
153+ )
154+ resource_set = factory.make_BootResourceSet(
155+ resource,
156+ version=product["version_name"],
157+ )
158+ incomplete_resource_set = factory.make_BootResourceSet(
159+ resource,
160+ version="alpha",
161+ )
162+ factory.make_boot_resource_file_with_content(
163+ resource_set,
164+ filename=product["ftype"],
165+ filetype=product["ftype"],
166+ synced=[
167+ (self.region, -1),
168+ (second_region, -1),
169+ ], # This is the latest completed
170+ )
171+ factory.make_boot_resource_file_with_content(
172+ incomplete_resource_set,
173+ filename=product["ftype"],
174+ filetype=product["ftype"],
175+ synced=[(self.region, 0)], # Not in sync yet
176+ )
177+ self.assertEqual(resource_set, resource.get_latest_complete_set())
178+
179 def test_insert_doesnt_print_error_when_first_import(self):
180 name, architecture, product = make_product()
181 with transaction.atomic():

Subscribers

People subscribed via source and target branches