Merge ~mdeslaur/ubuntu-cve-tracker:check-cves-perf1 into ubuntu-cve-tracker:master

Proposed by Marc Deslauriers
Status: Merged
Merged at revision: eb8e65266e3f7a038f589060a9dff7be4cd69e0c
Proposed branch: ~mdeslaur/ubuntu-cve-tracker:check-cves-perf1
Merge into: ubuntu-cve-tracker:master
Diff against target: 82 lines (+17/-15)
1 file modified
scripts/check-cves (+17/-15)
Reviewer Review Type Date Requested Status
Steve Beattie Approve
Review via email: mp+462263@code.launchpad.net

Commit message

commit cb191ce1d077e41c34d4c27766e07bf97a54dc0a
Author: Marc Deslauriers <email address hidden>
Date: Tue Mar 12 15:02:34 2024 -0400

    check-cves: improve performance with sets instead of lists

    CVEIgnoreNotForUsList and CVEIgnoreMistriagedList were gigantic
    lists of CVEs and check-cves keeps searching for cves in them.
    Switching them to sets dramatically improves performance.

commit bcc2a35fad29a1dd31900e70ea4f44f38fd20f31
Author: Marc Deslauriers <email address hidden>
Date: Tue Mar 12 12:55:12 2024 -0400

    check-cves: simplify mistriaged() logic

    We don't need to call source_map.madison here and perform expensive
    searches, we just need to know if we can see this package in the
    source map. Bail out as soon as we do.

Description of the change

This merge proposal dramatically improves check-cves performance:

BEFORE:
$ time ./scripts/check-cves --import-missing-debian

real 3m17.930s
user 3m17.660s
sys 0m0.274s

AFTER:
$ time ./scripts/check-cves --import-missing-debian

real 0m5.845s
user 0m5.571s
sys 0m0.292s

To post a comment you must log in.
Revision history for this message
Steve Beattie (sbeattie) wrote :

Awesome catch, thanks! Merged.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/scripts/check-cves b/scripts/check-cves
2index 8a9e8fc..2abf4cc 100755
3--- a/scripts/check-cves
4+++ b/scripts/check-cves
5@@ -223,22 +223,24 @@ def import_debian(handler):
6 today = datetime_date.today()
7 known = set(CVEKnownList + CVEIgnoreList)
8
9+ def ever_existed(pkg):
10+ for rel in source:
11+ if pkg in source[rel]:
12+ return True
13+ return False
14+
15 def mistriaged(cve):
16- if cve in CVEIgnoreNotForUsList and \
17- cve not in CVEIgnoreMistriagedList and \
18+ if cve in CVEIgnoreNotForUsSet and \
19+ cve not in CVEIgnoreMistriagedSet and \
20 handler.debian[cve]['state'] == 'FOUND':
21 # check that at least one of the assigned packages exist
22 # in Ubuntu
23- pkgs = {}
24 for pkg in handler.debian[cve]['pkgs'].keys():
25 # ignore package if debian says is not affected
26 if handler.debian[cve]['pkgs'][pkg]['state'] == '<not-affected>':
27 continue
28- answer = source_map.madison(source, pkg)
29- if len(answer) > 0:
30- pkgs[pkg] = answer
31- if len(pkgs) > 0:
32- return True
33+ if ever_existed(pkg):
34+ return True
35 return False
36
37 # pull in CVEs from data/DSA/list
38@@ -1347,7 +1349,7 @@ class CVEHandler(xml.sax.handler.ContentHandler):
39 def add_cve(self, cve, packages, priority=None):
40 # remove from not-for-us.txt if adding and ensure we remove any
41 # mistriaged_hint from the description
42- if cve in CVEIgnoreNotForUsList:
43+ if cve in CVEIgnoreNotForUsSet:
44 cmd = ['sed', '-i', '/^%s #.*$/d' % cve, './ignored/not-for-us.txt']
45 subprocess.call(cmd)
46 self.cve_data[cve]['desc'] = self.cve_data[cve]['desc'].replace(mistriaged_hint, '')
47@@ -1469,7 +1471,7 @@ class CVEHandler(xml.sax.handler.ContentHandler):
48 def ignore_cve(self, cve, reason):
49 # Append to ignore list unless is already in CVEIgnoreList and then
50 # append to the ignored/ignore-mistriaged.txt
51- txtfile = 'ignore-mistriaged.txt' if cve in CVEIgnoreNotForUsList else 'not-for-us.txt'
52+ txtfile = 'ignore-mistriaged.txt' if cve in CVEIgnoreNotForUsSet else 'not-for-us.txt'
53 with open('%s/ignored/%s' % (destdir, txtfile), 'a') as f:
54 f.write('%s # %s\n' % (cve, reason))
55
56@@ -1588,21 +1590,21 @@ class CheckCVETest(unittest.TestCase):
57 ignored_notforus_path = 'ignored/not-for-us.txt'
58 if destdir != './' and destdir != '.':
59 ignored_notforus_path = os.path.join(destdir, ignored_notforus_path)
60-# CVEIgnoreNotForUsList is a list of all CVEs that we have previously
61+# CVEIgnoreNotForUsSet is a set of all CVEs that we have previously
62 # chosen to ignore since they don't apply to software in Ubuntu
63-CVEIgnoreNotForUsList = cve_lib.parse_CVEs_from_uri(ignored_notforus_path)
64+CVEIgnoreNotForUsSet = set(cve_lib.parse_CVEs_from_uri(ignored_notforus_path))
65
66 ignored_mistriaged_path = 'ignored/ignore-mistriaged.txt'
67 if destdir != './' and destdir != '.':
68 ignored_mistriaged_path = os.path.join(destdir, ignored_mistriaged_path)
69-# CVEIgnoreMistriagedList is a list of all CVEs that we want to definitely
70+# CVEIgnoreMistriagedSet is a set of all CVEs that we want to definitely
71 # ignore when doing mistriaged CVE detection - they should exist in both
72 # CVEIgnoreNotForUsList and CVEIgnoreMistriagedList
73-CVEIgnoreMistriagedList = cve_lib.parse_CVEs_from_uri(ignored_mistriaged_path)
74+CVEIgnoreMistriagedSet = set(cve_lib.parse_CVEs_from_uri(ignored_mistriaged_path))
75
76 # CVEIgnoreList is a list of all CVEs we know about already. These will be
77 # ignored when checking MITRE for new CVEs
78-CVEIgnoreList = list(CVEIgnoreNotForUsList)
79+CVEIgnoreList = list(CVEIgnoreNotForUsSet)
80
81 CVEKnownList = []
82 CVEKnownList += [cve for cve in os.listdir(destdir + "/ignored/") if cve.startswith('CVE-')]

Subscribers

People subscribed via source and target branches