Merge ~emitorino/ubuntu-cve-tracker:sec_586_extend_subprojects into ubuntu-cve-tracker:master
- Git
- lp:~emitorino/ubuntu-cve-tracker
- sec_586_extend_subprojects
- Merge into master
Status: | Merged |
---|---|
Merged at revision: | 0871352e9c7385fa781393a17224d059c40300bf |
Proposed branch: | ~emitorino/ubuntu-cve-tracker:sec_586_extend_subprojects |
Merge into: | ubuntu-cve-tracker:master |
Diff against target: |
230 lines (+102/-31) 2 files modified
scripts/active_edit (+9/-3) scripts/check-cves (+93/-28) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Ubuntu Security Team | Pending | ||
Alex Murray | Pending | ||
Review via email: mp+419425@code.launchpad.net |
Commit message
SEC-586: Help detect external software during cve triage
Description of the change
Since the loading of external subprojects is already supported in $UCT, this MP is suggesting if one or more packages present in the corresponding supported.txt files of the external subprojects could be affected by a CVE being processed.
Spyros Seimenis (sespiros) wrote : | # |
Emilia Torino (emitorino) wrote (last edit ): | # |
Hey Spyros, thanks a lot for your feedback! Please find some comments/questions above.
> Nice feature! Overall it makes the output clearer and organized. I tested the
> features locally and I have some first functionality related comments:
> - changes in the embargoed repo don't seem to get reported similarly to the
> subprojects case (detect_
> new file in both the embargoed and subprojects repos to test this, only the
> one in subprojects was reported.
Ah right, we could add emargoed cves were created. Thanks!
> - The subprojects section that reports potential subprojects suffers from the
> same issues that the normal process does, struggling to match packages to
> descriptions. The normal process though performs an extra step "hints = words
> & allsrcs" which 1) filters the hints list using known package names and 2)
> filters out some common words such as "the" "and" etc. Maybe try and apply a
> similar filtering process to the subprojects section. In my test run I had
> subprojects incorrectly matching CVEs to project that had just the word "the"
> in common.
So I refactored out the logic to get hints from the cve description, in get_software_
https:/
https:/
https:/
We need to decide if we prefer to filter those out while confirming the pkgs affected, or we prefer to ignored those at all. The issue with ignoring those is the high probability of missing affected cves. The counterpart is the high volume of false positives.
>
> Also I have a couple of other opinionated cosmetic changes/ideas:
> - section banners (===) could be shorter (compared to the *** sections) so
> that it is easier to parse output between multiple CVEs
Sure
> - section banners could have spaces in titles like "=== Something ===" instead
> of "===something==="
> - content inside sections (under the === lines) could be idented by 1 space
> like the rest of the text
> - ===cve details=== -> === CVE details ===
> - "subproject summary" => "Potentially affected sub-projects" (and maybe the
> extra line is not needed)
I tried to be consistent with the current format used. If you see there are other occurrences like "print(
" in https:/
> - Affected subprojects section could be under the debian section since the
> debian one probably is more important and potentially provides more info that
> are useful to take an action on the CVE
The point here is that a CVE could not affect debian at all so it will be incorrect to keep it there. Correct?
> - "triage summary" could use a larger banner to make it easier to parse when
> viewing all of the scrollback (instead of having the same banner like the per-
> CVE sections)
Sure
> - the "ple...
Emilia Torino (emitorino) wrote : | # |
> Hey Spyros, thanks a lot for your feedback! Please find some
> comments/questions above.
>
> > Nice feature! Overall it makes the output clearer and organized. I tested
> the
> > features locally and I have some first functionality related comments:
> > - changes in the embargoed repo don't seem to get reported similarly to the
> > subprojects case (detect_
> a
> > new file in both the embargoed and subprojects repos to test this, only the
> > one in subprojects was reported.
>
> Ah right, we could add emargoed cves were created. Thanks!
>
> > - The subprojects section that reports potential subprojects suffers from
> the
> > same issues that the normal process does, struggling to match packages to
> > descriptions. The normal process though performs an extra step "hints =
> words
> > & allsrcs" which 1) filters the hints list using known package names and 2)
> > filters out some common words such as "the" "and" etc. Maybe try and apply a
> > similar filtering process to the subprojects section. In my test run I had
> > subprojects incorrectly matching CVEs to project that had just the word
> "the"
> > in common.
>
> So I refactored out the logic to get hints from the cve description, in
> get_software_
> that not only universe packages can have those names but also pypi ones:
>
> https:/
> https:/
> https:/
>
> We need to decide if we prefer to filter those out while confirming the pkgs
> affected, or we prefer to ignored those at all. The issue with ignoring those
> is the high probability of missing affected cves. The counterpart is the high
> volume of false positives.
>
I have also removed the "common words" from the suggestion. Thanks! I though that we could miss the CVE but you can manually add the package when you are reading the CVE description.
> >
> > Also I have a couple of other opinionated cosmetic changes/ideas:
> > - section banners (===) could be shorter (compared to the *** sections) so
> > that it is easier to parse output between multiple CVEs
>
> Sure
Done
>
> > - section banners could have spaces in titles like "=== Something ==="
> instead
> > of "===something==="
> > - content inside sections (under the === lines) could be idented by 1 space
> > like the rest of the text
> > - ===cve details=== -> === CVE details ===
> > - "subproject summary" => "Potentially affected sub-projects" (and maybe the
> > extra line is not needed)
>
> I tried to be consistent with the current format used. If you see there are
> other occurrences like "print(
> entry==
> " in https:/
> tracker/
> have a preference and I am happy to update it if preferred.
Done
>
> > - Affected subprojects section could be under the debian section since the
> > debian one probably is more important and potentially provides more info
> that
> > are useful to take a...
Emilia Torino (emitorino) wrote : | # |
> Hey Spyros, thanks a lot for your feedback! Please find some
> comments/questions above.
>
> > Nice feature! Overall it makes the output clearer and organized. I tested
> the
> > features locally and I have some first functionality related comments:
> > - changes in the embargoed repo don't seem to get reported similarly to the
> > subprojects case (detect_
> a
> > new file in both the embargoed and subprojects repos to test this, only the
> > one in subprojects was reported.
>
> Ah right, we could add emargoed cves were created. Thanks!
>
> > - The subprojects section that reports potential subprojects suffers from
> the
> > same issues that the normal process does, struggling to match packages to
> > descriptions. The normal process though performs an extra step "hints =
> words
> > & allsrcs" which 1) filters the hints list using known package names and 2)
> > filters out some common words such as "the" "and" etc. Maybe try and apply a
> > similar filtering process to the subprojects section. In my test run I had
> > subprojects incorrectly matching CVEs to project that had just the word
> "the"
> > in common.
>
> So I refactored out the logic to get hints from the cve description, in
> get_software_
> that not only universe packages can have those names but also pypi ones:
>
> https:/
> https:/
> https:/
>
> We need to decide if we prefer to filter those out while confirming the pkgs
> affected, or we prefer to ignored those at all. The issue with ignoring those
> is the high probability of missing affected cves. The counterpart is the high
> volume of false positives.
>
I have also removed the "common words" from the suggestion. Thanks! I though that we could miss the CVE but you can manually add the package when you are reading the CVE description.
> >
> > Also I have a couple of other opinionated cosmetic changes/ideas:
> > - section banners (===) could be shorter (compared to the *** sections) so
> > that it is easier to parse output between multiple CVEs
>
> Sure
Done
>
> > - section banners could have spaces in titles like "=== Something ==="
> instead
> > of "===something==="
> > - content inside sections (under the === lines) could be idented by 1 space
> > like the rest of the text
> > - ===cve details=== -> === CVE details ===
> > - "subproject summary" => "Potentially affected sub-projects" (and maybe the
> > extra line is not needed)
>
> I tried to be consistent with the current format used. If you see there are
> other occurrences like "print(
> entry==
> " in https:/
> tracker/
> have a preference and I am happy to update it if preferred.
Done
>
> > - Affected subprojects section could be under the debian section since the
> > debian one probably is more important and potentially provides more info
> that
> > are useful to take a...
Emilia Torino (emitorino) wrote : | # |
> Hey Spyros, thanks a lot for your feedback! Please find some
> comments/questions above.
>
> > Nice feature! Overall it makes the output clearer and organized. I tested
> the
> > features locally and I have some first functionality related comments:
> > - changes in the embargoed repo don't seem to get reported similarly to the
> > subprojects case (detect_
> a
> > new file in both the embargoed and subprojects repos to test this, only the
> > one in subprojects was reported.
>
> Ah right, we could add emargoed cves were created. Thanks!
>
> > - The subprojects section that reports potential subprojects suffers from
> the
> > same issues that the normal process does, struggling to match packages to
> > descriptions. The normal process though performs an extra step "hints =
> words
> > & allsrcs" which 1) filters the hints list using known package names and 2)
> > filters out some common words such as "the" "and" etc. Maybe try and apply a
> > similar filtering process to the subprojects section. In my test run I had
> > subprojects incorrectly matching CVEs to project that had just the word
> "the"
> > in common.
>
> So I refactored out the logic to get hints from the cve description, in
> get_software_
> that not only universe packages can have those names but also pypi ones:
>
> https:/
> https:/
> https:/
>
> We need to decide if we prefer to filter those out while confirming the pkgs
> affected, or we prefer to ignored those at all. The issue with ignoring those
> is the high probability of missing affected cves. The counterpart is the high
> volume of false positives.
>
I have also removed the "common words" from the suggestion. Thanks! I though that we could miss the CVE but you can manually add the package when you are reading the CVE description.
> >
> > Also I have a couple of other opinionated cosmetic changes/ideas:
> > - section banners (===) could be shorter (compared to the *** sections) so
> > that it is easier to parse output between multiple CVEs
>
> Sure
Done
>
> > - section banners could have spaces in titles like "=== Something ==="
> instead
> > of "===something==="
> > - content inside sections (under the === lines) could be idented by 1 space
> > like the rest of the text
> > - ===cve details=== -> === CVE details ===
> > - "subproject summary" => "Potentially affected sub-projects" (and maybe the
> > extra line is not needed)
>
> I tried to be consistent with the current format used. If you see there are
> other occurrences like "print(
> entry==
> " in https:/
> tracker/
> have a preference and I am happy to update it if preferred.
Done
>
> > - Affected subprojects section could be under the debian section since the
> > debian one probably is more important and potentially provides more info
> that
> > are useful to take a...
Preview Diff
1 | diff --git a/scripts/active_edit b/scripts/active_edit |
2 | index af98ce6..3371340 100755 |
3 | --- a/scripts/active_edit |
4 | +++ b/scripts/active_edit |
5 | @@ -75,12 +75,18 @@ def create_or_update_external_subproject_cves(cve, pkgname): |
6 | print("\n") |
7 | |
8 | if options.autoconfirm or ans.startswith("y"): |
9 | + if affected_releases: |
10 | + print('\n================= External subproject update ==================') |
11 | + print(" %s - %s: " % (cve, pkgname)) |
12 | + |
13 | for release in affected_releases: |
14 | with open(os.path.join(cve_lib.get_external_subproject_cve_dir(release), cve), "a") as f: |
15 | state = "needs-triage" |
16 | entry = "%s_%s: %s\n" % (release, pkgname, state) |
17 | f.write(entry) |
18 | f.close() |
19 | + print(" %s" % release) |
20 | + |
21 | else: |
22 | print("Aborted\n") |
23 | |
24 | @@ -146,10 +152,10 @@ def update_cve(cve, pkgname, fixed_in=None, fixed_in_release=None, fixed_in_rele |
25 | and (not cve_lib.is_active_esm_release(r) or r == 'precise'): |
26 | continue |
27 | state = "needs-triage" |
28 | - if cve_lib.is_active_esm_release(r): |
29 | - state = "ignored (out of standard support)" |
30 | - elif not pkg_in_rel(pkgname, release): |
31 | + if not pkg_in_rel(pkgname, release): |
32 | continue |
33 | + elif cve_lib.is_active_esm_release(r): |
34 | + state = "ignored (out of standard support)" |
35 | elif r == 'upstream' and fixed_in is not None: |
36 | state = "released (%s)" % fixed_in |
37 | elif fixed_in_release_version and r == fixed_in_release: |
38 | diff --git a/scripts/check-cves b/scripts/check-cves |
39 | index a08a998..6c3d820 100755 |
40 | --- a/scripts/check-cves |
41 | +++ b/scripts/check-cves |
42 | @@ -72,7 +72,8 @@ for release in list(source.keys()): |
43 | # remove common words which also happen to be names |
44 | # of source packages since our ignore suggestion is |
45 | # likely to sometimes contain these |
46 | -allsrcs.difference_update(set(['an', 'and', 'context', 'file', 'modules', 'the', 'when'])) |
47 | +common_words = ['an', 'and', 'context', 'file', 'modules', 'the', 'when'] |
48 | +allsrcs.difference_update(set(common_words)) |
49 | # add boilerplate names too so we can get mysql or postgresql |
50 | # etc even if they don't exist as source package names |
51 | allsrcs.update(set(cve_lib.load_boilerplates().keys())) |
52 | @@ -711,11 +712,40 @@ class CVEHandler(xml.sax.handler.ContentHandler): |
53 | (timestamp, self.num_added, self.num_ignored, self.num_skipped, self.num_added + self.num_ignored, [os.path.basename(x) for x in args])) |
54 | |
55 | def printReport(self): |
56 | + print('\n============================ Triage summary =============================') |
57 | print("\n %4d CVEs added" % self.num_added) |
58 | print(" %4d CVEs ignored" % self.num_ignored) |
59 | print(" %4d CVEs skipped" % self.num_skipped) |
60 | print("---------------------------") |
61 | print("%5d total CVEs triaged" % (self.num_added + self.num_ignored)) |
62 | + updates_detected, updates_details = self.detect_updates_to_external_repositories() |
63 | + if updates_detected: |
64 | + print('\n====================== External updates detected ========================') |
65 | + print(updates_details) |
66 | + print("\n Please remember to push the above changes if appropriate") |
67 | + |
68 | + def detect_updates_to_external_repositories(self): |
69 | + external_repositories = [cve_lib.subprojects_dir, cve_lib.embargoed_dir] |
70 | + updates_detected = False |
71 | + updates_details = "" |
72 | + for repository_dir in external_repositories: |
73 | + cmd = ['git', '-C', repository_dir, 'status', '--porcelain'] |
74 | + try: |
75 | + process = subprocess.run( |
76 | + cmd, |
77 | + stdout=subprocess.PIPE, |
78 | + stderr=subprocess.PIPE, |
79 | + universal_newlines=True, |
80 | + check=True, |
81 | + ) |
82 | + if process.returncode == 0 and process.stdout: |
83 | + if process.stdout: |
84 | + updates_detected = True |
85 | + updates_details += "\n %s \n %s" % (repository_dir, process.stdout) |
86 | + except subprocess.CalledProcessError as exception: |
87 | + print(exception) |
88 | + updates_detected = False |
89 | + return updates_detected, updates_details |
90 | |
91 | def parse_json(self, fp): |
92 | template_nvd = {"CVE_data_type": "CVE", |
93 | @@ -940,6 +970,15 @@ class CVEHandler(xml.sax.handler.ContentHandler): |
94 | f.write('%s <CVE> skip\n\n' % (line_prefix)) |
95 | f.flush() |
96 | |
97 | + def find_hint_in_external_subprojects(self, software_hints_from_cve_description): |
98 | + external_subprojects = {} |
99 | + for subproject in cve_lib.external_releases: |
100 | + if subproject in source: |
101 | + for hint in software_hints_from_cve_description: |
102 | + if hint in source[subproject] and hint not in common_words: |
103 | + external_subprojects.setdefault(subproject, set()).add(hint) |
104 | + return external_subprojects |
105 | + |
106 | def display_cve(self, cve, file=sys.stdout, line_prefix=None, wrap_desc=False): |
107 | class CVEOutput: |
108 | def __init__(self, file, line_prefix=None): |
109 | @@ -984,11 +1023,14 @@ class CVEHandler(xml.sax.handler.ContentHandler): |
110 | if wrap_desc: |
111 | print('%s' % _wrap_desc(self.cve_data[cve]['desc'])) |
112 | else: |
113 | - print('%s' % (self.cve_data[cve]['desc'])) |
114 | + print('\n======================== CVE details ==========================') |
115 | + print(' %s' % cve) |
116 | + print(' %s' % (self.cve_data[cve]['desc'])) |
117 | for cvss in self.cve_data[cve]['cvss']: |
118 | print(' CVSS (%s): %s [%.1f]' % (cvss['source'], cvss['vector'], cvss['baseScore'])) |
119 | if self.debian and cve in self.debian: |
120 | - print('Debian CVE Tracker: %s' % (self.debian[cve]['state'])) |
121 | + print('\n======================= Debian details ========================') |
122 | + print(' Debian CVE Tracker: %s' % (self.debian[cve]['state'])) |
123 | if len(self.debian[cve]['note']): |
124 | print("\t" + "\n\t".join(self.debian[cve]['note'])) |
125 | for pkg in self.debian[cve]['pkgs']: |
126 | @@ -1013,6 +1055,16 @@ class CVEHandler(xml.sax.handler.ContentHandler): |
127 | else: |
128 | proposed_ignore = self.get_ignore_suggestion(self.cve_data[cve]['desc']) |
129 | print('%s ignore %s' % (cve, proposed_ignore)) |
130 | + |
131 | + software_hints_from_cve_desc = self.get_software_hints_from_cve_description(self.cve_data[cve]["desc"]) |
132 | + if software_hints_from_cve_desc: |
133 | + software_hints_per_external_releases = self.find_hint_in_external_subprojects(software_hints_from_cve_desc) |
134 | + if software_hints_per_external_releases: |
135 | + print('\n==================== Subprojects details ======================') |
136 | + print(' External subprojects possibly affected: ') |
137 | + for affected_subproject in software_hints_per_external_releases: |
138 | + print(" - " + affected_subproject + ": " + " - ".join( |
139 | + software_hints_per_external_releases[affected_subproject])) |
140 | # once again, announce formerly embargoed status |
141 | if cve in EmbargoList: |
142 | print('**!!** no longer embargoed **!!**') |
143 | @@ -1035,30 +1087,37 @@ class CVEHandler(xml.sax.handler.ContentHandler): |
144 | if cve in EmbargoList: |
145 | action = 'unembargo' |
146 | reason = "" |
147 | - # Default to Debian state, if available |
148 | - elif self.debian and cve in self.debian and self.debian[cve]['state']: |
149 | - if self.debian[cve]['state'].startswith('NOT-FOR-US:'): |
150 | - action = 'ignore' |
151 | - reason = self.debian[cve]['state'].split(':', 1)[1].lstrip() |
152 | - if self.debian[cve]['state'] == 'FOUND': |
153 | - action = 'add' |
154 | - packages = list(self.debian[cve]['pkgs'].keys()) |
155 | - if len(self.debian[cve]['pkgs']) == 1: |
156 | - pkg = packages[0] |
157 | - if self.debian[cve]['pkgs'][pkg]['state'] == '<itp>': |
158 | - # software has not been admitted into debian |
159 | - action = 'ignore' |
160 | - reason = '%s: <itp> (dbug %s)' % (pkg, self.debian[cve]['pkgs'][pkg]['bug']) |
161 | - |
162 | else: |
163 | - # try and hint if the detected product name contains any known |
164 | - # package names |
165 | - desc = self.get_ignore_suggestion(self.cve_data[cve]['desc']) |
166 | - words = set(desc.lower().split(" ")) |
167 | - hints = words & allsrcs |
168 | - if len(hints) > 0: |
169 | - packages = list(hints) |
170 | - action = 'add' |
171 | + words = self.get_software_hints_from_cve_description(self.cve_data[cve]['desc']) |
172 | + # Default to Debian state, if available |
173 | + if self.debian and cve in self.debian and self.debian[cve]['state']: |
174 | + if self.debian[cve]['state'].startswith('NOT-FOR-US:'): |
175 | + action = 'ignore' |
176 | + reason = self.debian[cve]['state'].split(':', 1)[1].lstrip() |
177 | + if self.debian[cve]['state'] == 'FOUND': |
178 | + action = 'add' |
179 | + packages = list(self.debian[cve]['pkgs'].keys()) |
180 | + if len(self.debian[cve]['pkgs']) == 1: |
181 | + pkg = packages[0] |
182 | + if self.debian[cve]['pkgs'][pkg]['state'] == '<itp>': |
183 | + # software has not been admitted into debian |
184 | + action = 'ignore' |
185 | + reason = '%s: <itp> (dbug %s)' % (pkg, self.debian[cve]['pkgs'][pkg]['bug']) |
186 | + |
187 | + else: |
188 | + # try and hint if the detected product name contains any known |
189 | + # package names |
190 | + hints = words & allsrcs |
191 | + if len(hints) > 0: |
192 | + packages = list(hints) |
193 | + action = 'add' |
194 | + |
195 | + # Regardless of Ubuntu software, try and hint if the detected product |
196 | + # name contains any external software name |
197 | + external_software_hints = self.find_hint_in_external_subprojects(words) |
198 | + if external_software_hints: |
199 | + for software in external_software_hints.values(): |
200 | + packages.extend(list(software)) |
201 | |
202 | # remove any duplicates |
203 | uniq = [] |
204 | @@ -1071,10 +1130,15 @@ class CVEHandler(xml.sax.handler.ContentHandler): |
205 | |
206 | return (action, reason, packages) |
207 | |
208 | + def get_software_hints_from_cve_description(self, cve_description): |
209 | + desc = self.get_ignore_suggestion(cve_description) |
210 | + words = set(desc.lower().split(" ")) |
211 | + return words |
212 | + |
213 | def human_process_cve(self, cve, action='skip', reason='', package=''): |
214 | info = '' |
215 | while info == "" or not info[0] in ['i', 'a', 's', 'q', 'r']: |
216 | - prompt_user('A]dd (or R]epeat), I]gnore forever, S]kip for now, or Q]uit? [%s] ' % (action)) |
217 | + prompt_user('\nA]dd (or R]epeat), I]gnore forever, S]kip for now, or Q]uit? [%s] ' % (action)) |
218 | info = sys.stdin.readline().strip().lower() |
219 | if info == "": |
220 | info = action |
221 | @@ -1107,7 +1171,8 @@ class CVEHandler(xml.sax.handler.ContentHandler): |
222 | _spawn_editor(dst) |
223 | self.saved_cve = cve |
224 | |
225 | - print('Detecting packages built using: %s...' % info, end='') |
226 | + print('\n===================== Dependant packages ======================') |
227 | + print(' Detecting packages built using: %s...' % info, end='') |
228 | sys.stdout.flush() |
229 | built_using = "" |
230 | try: |
Nice feature! Overall it makes the output clearer and organized. I tested the features locally and I have some first functionality related comments: updates_ to_external_ subprojects) . I just "touched" a new file in both the embargoed and subprojects repos to test this, only the one in subprojects was reported.
- changes in the embargoed repo don't seem to get reported similarly to the subprojects case (detect_
- The subprojects section that reports potential subprojects suffers from the same issues that the normal process does, struggling to match packages to descriptions. The normal process though performs an extra step "hints = words & allsrcs" which 1) filters the hints list using known package names and 2) filters out some common words such as "the" "and" etc. Maybe try and apply a similar filtering process to the subprojects section. In my test run I had subprojects incorrectly matching CVEs to project that had just the word "the" in common.
Also I have a couple of other opinionated cosmetic changes/ideas:
- section banners (===) could be shorter (compared to the *** sections) so that it is easier to parse output between multiple CVEs
- section banners could have spaces in titles like "=== Something ===" instead of "===something==="
- content inside sections (under the === lines) could be idented by 1 space like the rest of the text
- ===cve details=== -> === CVE details ===
- "subproject summary" => "Potentially affected sub-projects" (and maybe the extra line is not needed)
- Affected subprojects section could be under the debian section since the debian one probably is more important and potentially provides more info that are useful to take an action on the CVE
- "triage summary" could use a larger banner to make it easier to parse when viewing all of the scrollback (instead of having the same banner like the per-CVE sections)
- the "please choose one option" section is a bit confusing since unlike the other sections this is a section that requires an action. I am thinking the output would look good even without it.
- the "subproject summary" section itself is fine but the extra message "please remember to push" could also be shown at the very end of the output (when running process_cves) similar to the "IMPORTANT" message that you get for example if you had check-syntax failures.