ubuntu-security-tools

Merge ubuntu-security-tools:json-output into ubuntu-security-tools:master

Proposed by Mark Esler on 2022-10-12

Status:	Rejected
Rejected by:	Mark Esler on 2022-10-21
Proposed branch:	ubuntu-security-tools:json-output
Merge into:	ubuntu-security-tools:master
Diff against target:	245 lines (+120/-66)
2 files modified
audits/shellcheck-json.sh (+2/-0)
audits/uaudit (+118/-66)

Reviewer	Review Type	Date Requested	Status
Alex Murray		2022-10-12	Needs Fixing on 2022-10-17
Review via email: mp+431459@code.launchpad.net

This proposal has been superseded by a proposal from 2022-10-14.

Commit message

extends StaticAnalysisTool to allow optional JSON output

Description of the change

Primarily extends the StaticAnalysisTool class. StaticAnalysisTool.cmd_json sets the command used to generate JSON output. Also adds StaticAnalysisTool.stderr, which controls whether a tool's stderr is included in the captured output.

bandit and scc have cmd_json examples. Most other tools support JSON output.
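
As an illustration of the stderr toggle described above, here is a minimal sketch, not the code from this branch. It assumes a helper named `run` (hypothetical) and uses `subprocess.run` for brevity where uaudit's `cmd()` uses `Popen`. The child command is a stand-in for a tool that prints JSON on stdout and diagnostics on stderr.

```python
import subprocess
import sys


def run(command, stderr=True):
    """Return [returncode, output]; stderr is merged into output only if requested."""
    # Hypothetical helper: merge stderr into stdout, or discard it entirely.
    err_target = subprocess.STDOUT if stderr else subprocess.DEVNULL
    try:
        sp = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=err_target, text=True)
    except OSError as ex:
        return [127, str(ex)]
    return [sp.returncode, sp.stdout]


# Stand-in for a tool that writes JSON to stdout and a warning to stderr.
child = [sys.executable, "-c",
         "import sys; print('{}'); print('warning', file=sys.stderr)"]

rc_merged, out_merged = run(child)               # warning ends up in the output
rc_quiet, out_quiet = run(child, stderr=False)   # output stays parseable as JSON
```

Discarding stderr matters for JSON mode because any diagnostic line mixed into the output would make the file invalid JSON.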

Revision history for this message

Alex Murray (alexmurray) wrote on 2022-10-13:

Since both tools support `--format json`, why not have a `supports_format_json` boolean parameter which defaults to False but, when set, means the tool gets run twice (with and without the `--format json` parameters) so that we always generate both JSON and text output? Then you don't have to add the `--json` top-level command-line flag.

review: Needs Fixing

ubuntu-security-tools:json-output updated on 2022-10-14

aca84bd... by Mark Esler on 2022-10-14: uaudit: add json output to all supported static_analysis_tools and clean formatting

Mark Esler (eslerm) wrote on 2022-10-14 (last edit on 2022-10-14):

@alexmurray thanks for the review and suggestions.

I added cmd_json for all the tools we use that support json output. Not all of them use `--format json`.

It would be nice to only maintain a single or base command per tool, but I don't believe that is possible. I was hoping we could splice json specific arguments into StaticAnalysisTool.cmd between element 0 and 1, but shellcheck and scc throw a wrench into that.

I reformatted static_analysis_tools to help make it easier to read and lower technical debt.
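
To illustrate the splice idea mentioned above, here is a minimal sketch. The helper name `splice_json_args` is hypothetical. The scc argument lists are the ones used in this proposal's diff.

```python
def splice_json_args(cmd, json_args):
    """Insert json_args directly after the executable name (element 0)."""
    return cmd[:1] + json_args + cmd[1:]


# Text-mode command for scc, as it appears in uaudit.
scc_cmd = ["scc", "--no-cocomo", "--ci", "--exclude-dir", ".git,.hg,.svn,.pc", "."]

# JSON-mode command that this branch actually uses for scc.
wanted = ["scc", "--format", "json", "--exclude-dir", ".git,.hg,.svn,.pc", "."]

spliced = splice_json_args(scc_cmd, ["--format", "json"])

# Arguments that a pure splice leaves behind but the JSON command omits.
leftover = [arg for arg in spliced if arg not in wanted]
print(leftover)
```

The splice only adds arguments. The text-only flags are still present afterwards, so scc would need per-tool removal logic on top.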

ubuntu-security-tools:json-output updated on 2022-10-14

fb4bd69... by Mark Esler on 2022-10-14: uaudit: force json output when supported

Mark Esler (eslerm) wrote on 2022-10-14:

@alexmurray I removed `--json` so that every tool that supports json output runs StaticAnalysisTool.cmd_json

Alex Murray (alexmurray) wrote on 2022-10-17:

Thanks @eslerm - however I would really like to avoid having to duplicate the arguments between cmd and cmd_json - hence my suggestion for a single boolean above. If we have these duplicated then it will only be a matter of time before they start to diverge - so better to have a single source of truth and then a way to make this do json IMO.

If that can't work, how about a `json_flags` argument or similar which would be a list of command-line options to use to have the tool output json instead - and then you could update the existing shellcheck.sh to look for these and DTRT so we don't have to have two shellcheck wrapper scripts as well.

Thoughts?

review: Needs Fixing

Mark Esler (eslerm) wrote on 2022-10-18:

I want to deduplicate arguments, but I do not see an elegant way to do so.

In my previous commit, I attempted ~`json_flags` but abandoned that approach. In addition to adding arguments, `json_flags` would need to remove arguments from StaticAnalysisTool.cmd which requires logic about each tool--unless we also add ~`cmd_flags`. Using a ~`json_flags` approach would need something like:
```
+ cmd_base=["scc", "--exclude-dir", ".git,.hg,.svn,.pc"],
+ cmd_txt=["--no-cocomo", "--ci"],
+ cmd_json=["--format", "json"],
+ cmd_end=["."],
```
(Hopefully no other positional arguments come along!)
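
For reference, the four-part layout above could be assembled along these lines. This is a sketch only: `build_cmd` and the plain-dict representation are hypothetical and not part of uaudit.

```python
def build_cmd(tool, output_type):
    """Assemble base + format-specific flags + trailing positional arguments."""
    middle = tool["cmd_json"] if output_type == "json" else tool["cmd_txt"]
    return tool["cmd_base"] + middle + tool["cmd_end"]


# The scc pieces from the snippet above, held in a plain dict for illustration.
scc = {
    "cmd_base": ["scc", "--exclude-dir", ".git,.hg,.svn,.pc"],
    "cmd_txt": ["--no-cocomo", "--ci"],
    "cmd_json": ["--format", "json"],
    "cmd_end": ["."],
}

txt_cmd = build_cmd(scc, "txt")
json_cmd = build_cmd(scc, "json")
```

The shared arguments live in one place, at the cost of four fields per tool instead of two.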

Alex Murray (alexmurray) wrote on 2022-10-19:

What about an `output_formats` parameter which takes a dict of output format names along with the command-line arguments that produce each one? Then, depending on which output format is specified to uaudit, we use that, and fall back to a "txt" one if the specified one is not found?
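
A minimal sketch of the `output_formats` idea, under the assumption that each dict value holds the full argument list for that format (the thread does not settle this). The helper `select_cmd` is hypothetical.

```python
def select_cmd(output_formats, requested):
    """Return (format_name, command), falling back to "txt" when needed."""
    if requested in output_formats:
        return requested, output_formats[requested]
    return "txt", output_formats["txt"]


# Assumed layout: one full command per format name.
scc_formats = {
    "txt": ["scc", "--no-cocomo", "--ci", "--exclude-dir", ".git,.hg,.svn,.pc", "."],
    "json": ["scc", "--format", "json", "--exclude-dir", ".git,.hg,.svn,.pc", "."],
}
# cppcheck has no JSON command in this branch, so it only offers "txt".
cppcheck_formats = {
    "txt": ["cppcheck", "--max-configs=15", "-j 8", "-q", "-i", ".pc", "."],
}

scc_choice = select_cmd(scc_formats, "json")
cppcheck_choice = select_cmd(cppcheck_formats, "json")  # falls back to "txt"
```

Returning the format name alongside the command lets the caller pick the right output file extension after a fallback. With full commands as values, the shared arguments are still duplicated between formats.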

Unmerged commits

fb4bd69... by Mark Esler on 2022-10-14: uaudit: force json output when supported
aca84bd... by Mark Esler on 2022-10-14: uaudit: add json output to all supported static_analysis_tools and clean formatting
421d63c... by Mark Esler on 2022-10-12: uaudit: add ignored directories for json commands
c82a8df... by Mark Esler on 2022-10-05: uaudit: add cmd_json to StaticAnalysisTool
585f6d6... by Mark Esler on 2022-09-15: uaudit add json bandit
63276e0... by Mark Esler on 2022-09-15: uaudit make cmd and tool stderr optional
5174f91... by Mark Esler on 2022-09-15: init additional json output mode

Preview Diff

 diff --git a/audits/shellcheck-json.sh b/audits/shellcheck-json.sh
 new file mode 100755
 index 0000000..b061d08
 --- /dev/null
 +++ b/audits/shellcheck-json.sh
@@ -0,0 +1,2 @@
++#!/bin/bash
++shellcheck -f json1 $(find . -type f -exec file {} \; | awk -F: '{if ($0 ~ /shell script/) print $1;}' )
 diff --git a/audits/uaudit b/audits/uaudit
 index f9430df..ed65c43 100755
 --- a/audits/uaudit
 +++ b/audits/uaudit
@@ -50,11 +50,15 @@ class StaticAnalysisTool(object):
                   name: str,
                   source: StaticAnalysisToolSource=StaticAnalysisToolSource.SNAP,
                   cmd: list=list(),
++                 cmd_json: list=list(),
++                 stderr: bool=True,
                   emacs_vars: str="",
                   summary: list=list(["cat", OUTPUT_FILE])):
          self.name = name
          self._source = source
          self._cmd = cmd if len(cmd) > 0 else [self.name, '.']
++        self._cmd_json = cmd_json
++        self._stderr = stderr
          self.header = "# -*- mode: compilation; default-directory: \"$PWD\"; %s -*-\n" % emacs_vars
          self._summary = summary
@@ -65,55 +69,87 @@ class StaticAnalysisTool(object):
      def exec_cmd(self) -> list:
          return self._cmd
++    def exec_cmd_json(self) -> list:
++        return self._cmd_json
++
      def summary_cmd(self, out) -> list:
          return [out if i==OUTPUT_FILE else i for i in self._summary]
  static_analysis_tools = [
--    StaticAnalysisTool("cppcheck",
--                       # the ignore path for the .pc directory may not
--                       # strictly be necessary
--                       cmd=['cppcheck', '--max-configs=15', '-j 8', '-q', '-i', '.pc', '.'],
--                       summary=['grep', '-c', '^[a-z]', OUTPUT_FILE],
--                       source=StaticAnalysisToolSource.DEB),
--    StaticAnalysisTool("bandit",
--                       # output in a format which we can easily read (-f
--                       # custom) and which doesn't truncate results (ie -n
--                       # -1)
--                       # https://github.com/PyCQA/bandit/issues/371#issuecomment-413704112
--                       #
--                       # arguments to -x (excluded paths) have to be relative or absolute
--                       # paths so ./.pc rather than just .pc
--                       # https://github.com/PyCQA/bandit/issues/488#issuecomment-583144793
--                       cmd=['bandit', '-r', '-n', '-1', '-x', './.pc', '-f', 'custom', '.'],
--                       summary=['grep', '-c', '^/', OUTPUT_FILE]),
--    StaticAnalysisTool("brakeman",
--                       # output in text format without the pager and turn
--                       # on all warnings
--                       cmd=['brakeman', '--force', '--all', '--quiet', '--no-pager',
--                             '-o', '/dev/stdout', '.'],
--                       emacs_vars=("eval: (defun brakeman-backward-search-filename ()" +
--                                      "         (save-match-data " +
--                                      "           (save-excursion " +
--                                      "            (when (re-search-backward \"^File: \\\\(.*\\\\)$\" (point-min) t)" +
--                                      "               (list (match-string-no-properties 1)))))); " +
--                                      "eval: (setq-local compilation-error-regexp-alist " +
--                                      "        '((\"^Line: \\\\([[:digit:]]+\\\\)$\" brakeman-backward-search-filename 1)));"),
--                       summary=['grep', '-c', '^Message:', OUTPUT_FILE]),
--    StaticAnalysisTool("flawfinder",
--                       summary=['grep', '-c', '^[a-z]', OUTPUT_FILE]),
--    StaticAnalysisTool("shellcheck",
--                       cmd=["shellcheck.sh"],
--                       summary=['grep', '-c', '^\\./', OUTPUT_FILE]),
--    StaticAnalysisTool("gosec",
--                       cmd=['gosec', '-quiet', '-out', '/dev/stdout', './...'],
--                       emacs_vars=("eval: (setq-local compilation-error-regexp-alist " +
--                                   "        '((\"^\\\\[\\\\(.*\\\\):\\\\([[:digit:]]+\\\\)\\\\].*$\" 1 2)));"),
--                       summary=['grep', '-c', '\\[/.*\\]', OUTPUT_FILE]),
--    StaticAnalysisTool("scc",
--                       cmd=['scc', '--no-cocomo', '--ci', '--exclude-dir', '.git,.hg,.svn,.pc', '.'],
--                       summary=["sed", "/^Processed/q", OUTPUT_FILE]),
++    StaticAnalysisTool(
++        "cppcheck",
++        # the ignore path for the .pc directory may not
++        # strictly be necessary
++        cmd=["cppcheck", "--max-configs=15", "-j 8", "-q", "-i", ".pc", "."],
++        # json not yet supported
++        # https://sourceforge.net/p/cppcheck/discussion/development/thread/11df29af5b/?limit=25
++        summary=["grep", "-c", "^[a-z]", OUTPUT_FILE],
++        source=StaticAnalysisToolSource.DEB,
++    ),
++    StaticAnalysisTool(
++        "bandit",
++        # output in a format which we can easily read (-f
++        # custom) and which doesn't truncate results (ie -n
++        # -1)
++        # https://github.com/PyCQA/bandit/issues/371#issuecomment-413704112
++        #
++        # arguments to -x (excluded paths) have to be relative or absolute
++        # paths so ./.pc rather than just .pc
++        # https://github.com/PyCQA/bandit/issues/488#issuecomment-583144793
++        cmd=["bandit", "-r", "-n", "-1", "-x", "./.pc", "-f", "custom", "."],
++        cmd_json=["bandit", "-r", "-n", "-1", "-x", "./.pc", "-f", "json", "."],
++        stderr=False,
++        summary=["grep", "-c", "^/", OUTPUT_FILE],
++    ),
++    StaticAnalysisTool(
++        "brakeman",
++        # output in text format without the pager and turn
++        # on all warnings
++        cmd=["brakeman", "--force", "--all", "--quiet", "--no-pager", "-o", "/dev/stdout"],
++        cmd_json=["brakeman", "--force", "--all", "--quiet", "--no-pager", "-o", "/dev/stdout", "-f", "json"],
++        emacs_vars=(
++            "eval: (defun brakeman-backward-search-filename ()"
++            + "         (save-match-data "
++            + "           (save-excursion "
++            + '            (when (re-search-backward "^File: \\\\(.*\\\\)$" (point-min) t)'
++            + "               (list (match-string-no-properties 1)))))); "
++            + "eval: (setq-local compilation-error-regexp-alist "
++            + '        \'(("^Line: \\\\([[:digit:]]+\\\\)$" brakeman-backward-search-filename 1)));'
++        ),
++        summary=["grep", "-c", "^Message:", OUTPUT_FILE],
++    ),
++    StaticAnalysisTool(
++        "flawfinder",
++        # nb: using sarif style json
++        cmd_json=["flawfinder", "--sarif", "./"],
++        summary=["grep", "-c", "^[a-z]", OUTPUT_FILE],
++    ),
++    StaticAnalysisTool(
++        "shellcheck",
++        cmd=["shellcheck.sh"],
++        cmd_json=["shellcheck-json.sh"],
++        summary=["grep", "-c", "^\\./", OUTPUT_FILE],
++    ),
++    StaticAnalysisTool(
++        "gosec",
++        cmd=["gosec", "-quiet", "-out", "/dev/stdout", "./..."],
++        # nb: using sarif style json
++        cmd_json=["gosec", "-quiet", "-out", "/dev/stdout", "-fmt=sarif", "./..."],
++        emacs_vars=(
++            "eval: (setq-local compilation-error-regexp-alist "
++            + '        \'(("^\\\\[\\\\(.*\\\\):\\\\([[:digit:]]+\\\\)\\\\].*$" 1 2)));'
++        ),
++        summary=["grep", "-c", "\\[/.*\\]", OUTPUT_FILE],
++    ),
++    StaticAnalysisTool(
++        "scc",
++        cmd=["scc", "--no-cocomo", "--ci", "--exclude-dir", ".git,.hg,.svn,.pc", "."],
++        cmd_json=["scc", "--format", "json", "--exclude-dir", ".git,.hg,.svn,.pc", "."],
++        summary=["sed", "/^Processed/q", OUTPUT_FILE],
++    ),
+ ]
++
+ #
  # Utility functions
+ #
@@ -169,12 +205,16 @@ def debug(out):
              pass
--def cmd(command):
++def cmd(command, stderr = True):
      '''Try to execute the given command.'''
      debug(command)
      try:
--        sp = subprocess.Popen(command, stdout=subprocess.PIPE,
--                              stderr=subprocess.STDOUT)
++        if stderr:
++            sp = subprocess.Popen(command, stdout=subprocess.PIPE,
++                                  stderr=subprocess.STDOUT)
++        else:
++            sp = subprocess.Popen(command, stdout=subprocess.PIPE,
++                                  stderr=subprocess.DEVNULL)
      except OSError as ex:
          return [127, str(ex)]
@@ -541,6 +581,35 @@ def audit_packaging(audit_dir):
      write_file(out_fn, out)
      msg("Package audit: %s" % (out_fn_rel))
++def run_static_analysis(tool, output_type):
++        out_fn = os.path.join(audit_dir, tool.name.lower() + "." + output_type)
++        out_fn_rel = '/'.join(out_fn.split('/')[-2:])
++        if os.path.exists(out_fn):
++            warn("Skipping %s. '%s' already exists" % (tool.name, out_fn))
++        else:
++            if output_type == "txt":
++                rc, out = cmd(tool.exec_cmd())
++                out = tool.header.replace("$PWD", os.getcwd()) + out
++            elif output_type == "json":
++                if tool._stderr:
++                    rc, out = cmd(tool.exec_cmd_json())
++                else:
++                    rc, out = cmd(tool.exec_cmd_json(), False)
++            else:
++                error(f"Unknown output_type ({output_type}) while running {tool.name}")
++            # TODO: gosec exits 1 if no go files and actual output not saved
++            if rc != 0:
++                # note: above loop uses error instead of warn
++                warn(f"Attempting to run {tool.name} exited with: {rc}")
++            write_file(out_fn, out)
++            msg(f"Static analysis ({tool.name}): {out_fn_rel}")
++
++        rc, summary = cmd(tool.summary_cmd(out_fn))
++        # strip any trailing newline
++        if summary[-1] == "\n":
++            summary = summary[:-1]
++        details[tool.name] = summary
++
  def audit_code(audit_dir, details, disable_coverity=False):
      '''Audit code'''
@@ -581,26 +650,9 @@ def audit_code(audit_dir, details, disable_coverity=False):
              msg("Code audit (%s): %s" % (i, out_fn_rel))
      for tool in static_analysis_tools:
--        out_fn = os.path.join(audit_dir, tool.name.lower() + ".txt")
--        out_fn_rel = '/'.join(out_fn.split('/')[-2:])
--        if os.path.exists(out_fn):
--            warn("Skipping %s. '%s' already exists" % (tool.name, out_fn))
--        else:
--            rc, out = cmd(tool.exec_cmd())
--            if rc != 0:
--                # note: above loop uses error instead of warn
--                warn(f"Attempting to run {tool.name} exited with: {rc}")
--            # TODO: gosec exits 1 if no go files and actual output not saved
--            out = tool.header.replace("$PWD", os.getcwd()) + out
--            write_file(out_fn, out)
--            msg(f"Static analysis ({tool.name}): {out_fn_rel}")
--
--        rc, summary = cmd(tool.summary_cmd(out_fn))
--        # strip any trailing newline
--        if summary[-1] == "\n":
--            summary = summary[:-1]
--        details[tool.name] = summary
--
++        run_static_analysis(tool, "txt")
++        if tool._cmd_json:
++            run_static_analysis(tool, "json")
      if not disable_coverity:
          # perform coverity analysis if the directory exists (assume in this