Merge into master : testlib_assert_file_type : lp:~pkopylov/qa-regression-testing : Git : Code : QA Regression Testing

Status:	Merged
Merged at revision:	503c71149c028e840958d702aad5271b2f643018
Proposed branch:	~pkopylov/qa-regression-testing:testlib_assert_file_type
Merge into:	qa-regression-testing:master
Diff against target:	39 lines (+13/-5) 1 file modified scripts/testlib.py (+13/-5)
Related bugs:	Link a bug report

Reviewer	Date Requested	Status
Steve Beattie		Approve on 2023-01-27
Alex Murray	2022-12-22	Needs Fixing on 2023-01-18
Review via email: mp+434935@code.launchpad.net

Commit message

scripts/testlib.py: the function assertFileType() was improved in order to handle files that have two or more recognized file-types correctly

Description of the change

When I was running test-bzip2.py on ubuntu 16.04 I met the situation when the /usr/bin/file utility detected an unexpected file type for some file during the test. Actually, that compressed file had two types, but the mentioned utility output only the first one. However, the expected type was the second known type, so it was not printed at all.

Fortunately, the /usr/bin/file utility can output all recognized file's types, not only the first.

This commit changes the utility's command line adding two options: -k and -r, as a result, all recognized types will be output and if even one of them matches to expected then an assertion will be true.

Revision history for this message

Alex Murray (alexmurray) wrote on 2023-01-03:

#

Since the output of file will now include multiple lines, would it be better to explicitly iterate over each line and test them separately to find the match? That would make the code a bit easier to understand IMO rather than trying to include a newline as part of the regex to match.

Revision history for this message

Pavel Kopylov (pkopylov) wrote on 2023-01-09:

#

I refactored the regexp scanning to do it over splitted lines.

Please, pay attention that the second and all the next types of some files are out by the file utility starting with '- ' combination, that is why the regexp has become more complicated.

Revision history for this message

Alex Murray (alexmurray) wrote on 2023-01-18:

#

Thanks - this is looking better - just a couple minor cosmetic changes and I think it should be good to go.

review: Needs Fixing

Revision history for this message

Steve Beattie (sbeattie) wrote on 2023-01-27:

#

Hey Pavel, thanks for this. One issue I hit is that our team still supports Ubuntu 14.04 LTS packages as part of the Ubuntu ESM product, and on 14.04, the file command does not support the -r option. I added the following commit to address the issue:

https://git.launchpad.net/qa-regression-testing/commit/?id=9938b4d6cf7c4be8e585e6296ff90d9d4370367f

as part of the merged branch.

For qa-r-t maintainers, the test-librsvg.py test checks against svg (aka XML) files, and are a good validation case for this.

Ultimately, the more robust solution for this might be to convert the file invocation to pass --mime, though that is not without variances in output across versions of file and requires changing all of the tests expected output. (Python 3's magic module is also a possibility, but s only available as the python-magic package in Python 2, and the 14.04/16.04 era version of that module has a drastically different api.)

review: Approve

Revision history for this message

Marc Deslauriers (mdeslaur) wrote on 2023-06-05:

#

FYI, this caused regressions in a bunch of tests as when file is run with -b, it will add extra line breaks in the output. For example:

Netpbm image data, size = 3684 x 2760, rawbits, pixmap

turns into:

Netpbm image data, size = 3684 x 2760
- , rawbits, pixmap
- data

And:

TIFF image data, little-endian, direntries=21, height=1380, bps=16, compression=none, PhotometricIntepretation=BlackIsZero, description= , manufacturer=Canon, model=PowerShot S95, width=1842

turns into:

TIFF image data, little-endian, direntries=21, height=1380, bps=16, compression=none, PhotometricInterpretation=BlackIsZero, description= , manufacturer=Canon, model=PowerShot S95
- , width=1842
- data

I have hidden the multiple file types functionality behind an option for now until a better solution can be found.

Revision history for this message

Pavel Kopylov (pkopylov) wrote on 2023-06-27 (last edit on 2023-06-27):

#

Dear Marc, it is bad news about regression.

However, I couldn't reproduce the problem. So could you please write an OS version you met the problem?

I guess that the new changes to the test 'test-libraw.py' revealed the problem on Ubuntu 23.04. Unfortunately, I don't have this Ubuntu version available to install on the VM. Could you please send here an output of the file utility for both 'file -b' and 'file -bkr' variants with suspicious files?

PS: Sorry for the long pause, I was on vacation last month.

 diff --git a/scripts/testlib.py b/scripts/testlib.py
 index b18a74c..f30427e 100644
 --- a/scripts/testlib.py
 +++ b/scripts/testlib.py
@@ -1121,7 +1121,10 @@ class TestlibCase(unittest.TestCase):
      def assertFileType(self, filename, filetype, strict=True):
          '''Checks the file type of the file specified'''
--        (rc, report, out) = self._testlib_shell_cmd(["/usr/bin/file", "-b", filename])
++        # Try to see all recognized types of file (thanks to -k option), not only the first.
++        # File can have several detected types, it will be OK if one of those types
++        # equals to an expected filetype
++        (rc, report, out) = self._testlib_shell_cmd(["/usr/bin/file", "-bkr", filename])
          out = out.strip()
          expected = 0
          # Absolutely no idea why this happens on Hardy
@@ -1130,13 +1133,18 @@ class TestlibCase(unittest.TestCase):
          result = 'Got exit code %d, expected %d:\n%s\n' % (rc, expected, report)
          self.assertEqual(expected, rc, result)
++        found = False
          if strict:
--            filetype = '^%s$' % (filetype)
++            # some line extracted from the out should match to the filetype exactly
++            filetype = '(^|- )%s$' % (filetype)
          else:
--            # accept if the beginning of the line matches
--            filetype = '^%s' % (filetype)
++            # just a start portion of some line extracted from the out matches to the filetype
++            filetype = '(^|- )%s' % (filetype)
++        for some_line in out.splitlines(True):
++            if re.search(filetype, some_line) is not None:
++                found = True
          result = 'File type reported by file: [%s], expected regex: [%s]\n' % (out, filetype)
--        self.assertNotEqual(None, re.search(filetype, out), result)
++        self.assertNotEqual(False, found, result)
      def yank_commonname_from_cert(self, certfile):
          '''Extract the commonName from a given PEM'''

QA Regression Testing

Merge ~pkopylov/qa-regression-testing:testlib_assert_file_type into qa-regression-testing:master

Commit message

Description of the change

Preview Diff

Subscribers