smartctl verify fails due to Unicode in Disk Vendor Name

Bug #1773150 reported by KingJ
This bug affects 3 people
Affects   Status         Importance   Assigned to      Milestone
MAAS      Fix Released   Medium       Alberto Donato
3.1       Fix Released   Medium       Alberto Donato
3.2       Fix Released   Medium       Alberto Donato
3.3       Fix Released   Medium       Alberto Donato

Bug Description

When running a test against a disk that has a Unicode character in the vendor's name, the smartctl-validate script is unable to parse the output and marks the test as failed:

INFO: Veriying SMART support for the following drive: /dev/sdc
INFO: Running command: sudo -n smartctl --all /dev/sdc
Traceback (most recent call last):
  File "/tmp/user_data.sh.63aqA3/scripts/testing/smartctl-validate", line 189, in <module>
    sys.exit(run_smartctl(args.storage, test))
  File "/tmp/user_data.sh.63aqA3/scripts/testing/smartctl-validate", line 143, in run_smartctl
    check_SMART_support(storage)
  File "/tmp/user_data.sh.63aqA3/scripts/testing/smartctl-validate", line 85, in check_SMART_support
    match = smart_support_regex.search(output.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 377: invalid start byte

Specifically, the Additional Product ID field has a ▒ character after "DELL":

Add. Product Id: DELL▒

I must admit, I don't know *why* this character is present in smartctl's output; however, it does cause MAAS to fail testing for the disk despite otherwise perfectly fine SMART data.

I've attached the full smartctl output.

Tags: tests smartctl

Related branches

KingJ (kj-kingj) wrote :
KingJ (kj-kingj) wrote :
Changed in maas:
milestone: none → 2.5.0
status: New → Triaged
Lee Trager (ltrager)
Changed in maas:
assignee: nobody → Lee Trager (ltrager)
status: Triaged → In Progress
importance: Undecided → High
importance: High → Medium
Lee Trager (ltrager) wrote :

I'm having some trouble reproducing this. I downloaded the uploaded output and was able to read it with:

import re
smart_support_regex = re.compile(r'SMART support is:\s+Available')
output = open('smartctl unicode.txt', 'rb').read()
match = smart_support_regex.search(output.decode('utf-8'))
match is not None
True

If you change

match = smart_support_regex.search(output.decode('utf-8'))

to

match = smart_support_regex.search(output.decode('utf-8', 'replace'))

does the smartctl test pass?
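A minimal sketch of what the suggested change does, using a made-up output fragment rather than real smartctl output: with the 'replace' error handler the stray byte becomes U+FFFD and the regex search still runs over the rest of the text.

```python
import re

smart_support_regex = re.compile(r"SMART support is:\s+Available")

# Hypothetical output fragment; the stray 0x99 byte mimics the reported field.
output = (
    b"Add. Product Id: DELL\x99\n"
    b"SMART support is: Available - device has SMART capability.\n"
)

# 'replace' substitutes U+FFFD for each undecodable byte instead of raising.
text = output.decode("utf-8", "replace")
match = smart_support_regex.search(text)

print(match is not None)
print("\ufffd" in text)
```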

Changed in maas:
status: In Progress → Incomplete
KingJ (kj-kingj) wrote :

Hi Lee,

Sorry for the slow reply, but I can still reproduce this with MAAS 2.4.2 testing an Ubuntu 18.04.01 system. Using your code:

Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> smart_support_regex = re.compile('SMART support is:\s+Available')
>>> output = open('smartctl.txt', 'rb').read()
>>> match = smart_support_regex.search(output.decode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 377: invalid start byte

After adding 'replace' to that line, it works:

Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> smart_support_regex = re.compile('SMART support is:\s+Available')
>>> output = open('smartctl.txt', 'rb').read()
>>> match = smart_support_regex.search(output.decode('utf-8', 'replace'))
>>> match is not None
True

In the MAAS testing script, if I change the match line (line 85) to the one above, that line no longer fails. However, the script then crashes on a later line when it attempts to print the results:

./smartctl-validate --storage /dev/sdc
INFO: Veriying SMART support for the following drive: /dev/sdc
INFO: Running command: sudo -n smartctl --all /dev/sdc

INFO: SMART support is available; continuing...
INFO: Verifying and/or validating SMART tests...
INFO: Running command: sudo -n smartctl --xall /dev/sdc

FAILURE: SMART tests have FAILED for: /dev/sdc
The test exited with return code 64! See the smarctl manpage for information on the return code meaning. For more information on the test failures, review the test output provided below.
---------------------------------------------------

Traceback (most recent call last):
  File "./smartctl-validate", line 189, in <module>
    sys.exit(run_smartctl(args.storage, test))
  File "./smartctl-validate", line 171, in run_smartctl
    print(output.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 377: invalid start byte

If I update line 171 to also have the replace argument (i.e. `print(output.decode('utf-8', 'replace'))` ), the script works and returns the test results (attached).

Changed in maas:
milestone: 2.5.0 → 2.5.x
Eduardo Pérez (eperezf) wrote :

I'm having this same problem as well.

Testing through MAAS gives the error:
-----------------------------------------

INFO: Veriying SMART support for the following drive: /dev/sdb
INFO: Running command: sudo -n smartctl --all /dev/sdb
Traceback (most recent call last):
  File "/tmp/user_data.sh.cWbGJt/scripts/testing/smartctl-validate", line 189, in <module>
    sys.exit(run_smartctl(args.storage, test))
  File "/tmp/user_data.sh.cWbGJt/scripts/testing/smartctl-validate", line 143, in run_smartctl
    check_SMART_support(storage)
  File "/tmp/user_data.sh.cWbGJt/scripts/testing/smartctl-validate", line 85, in check_SMART_support
    match = smart_support_regex.search(output.decode('utf-8'))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 373: invalid start byte

-------------------------

When I run `sudo -n smartctl --all /dev/sdb` via SSH, it reports no errors, but in the information section the Add. Product Id is DELL�.

That's a very weird value. I'm running a Dell PowerEdge 1950 server with 2 drives: a Western Digital (the one that throws that error) and a Seagate one that passes without problems.

Brendan Johnson (brendan.r.johnson) wrote :

I am also having the problem.

-----------------------------------------
INFO: Veriying SMART support for the following drive: /dev/sdf
INFO: Running command: sudo -n smartctl --all /dev/sdf
Traceback (most recent call last):
  File "/tmp/user_data.sh.WYaUVg/scripts/testing/smartctl-validate", line 338, in <module>
    if not execute_smartctl(args.blockdevice, args.test):
  File "/tmp/user_data.sh.WYaUVg/scripts/testing/smartctl-validate", line 275, in execute_smartctl
    device_type, bus_ids = check_SMART_support(blockdevice)
  File "/tmp/user_data.sh.WYaUVg/scripts/testing/smartctl-validate", line 168, in check_SMART_support
    blockdevice, ['--all'], device, output=True, stderr=STDOUT)
  File "/tmp/user_data.sh.WYaUVg/scripts/testing/smartctl-validate", line 69, in run_smartctl
    return check_output(cmd, timeout=TIMEOUT, **kwargs).decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x99 in position 358: invalid start byte

When I run sudo -n smartctl --all /dev/sdX on one of the drives returning the above error, I see the following for the product ID:

Add. Product Id: DELL▒ EQ
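In this newer version of the script the failure comes from calling `.decode()` on the result of `check_output`. One way to avoid it, sketched below, is to let `subprocess` do the decoding with an explicit error handler. The `run_command` helper, the `TIMEOUT` value and the fake child process standing in for smartctl are all illustrative assumptions, not the code from the MAAS script.

```python
import subprocess
import sys

TIMEOUT = 60  # hypothetical value; the real script defines its own TIMEOUT


def run_command(cmd, **kwargs):
    """Run cmd and return its output as text without raising on bad bytes."""
    return subprocess.check_output(
        cmd, timeout=TIMEOUT, encoding="utf-8", errors="replace", **kwargs
    )


# Stand-in for smartctl: a child process that emits the problematic byte.
fake_smartctl = [
    sys.executable,
    "-c",
    r"import sys; sys.stdout.buffer.write(b'Add. Product Id: DELL\x99 EQ\n')",
]

output = run_command(fake_smartctl)
print("\ufffd" in output)
```

Passing `encoding` and `errors` puts the pipe in text mode, so no separate `.decode()` call is needed afterwards.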

Björn Tillenius (bjornt) wrote :

This is something we need to fix. I'm not sure the attached smartctl output is a valid test case, since it doesn't contain any 0x99 bytes.

It would be nice if someone would attach the exact output of smartctl, but I think we can fix the bug without it.

Changed in maas:
status: Incomplete → Triaged
milestone: 2.5.x → none
no longer affects: maas/2.4
Changed in maas:
assignee: Lee Trager (ltrager) → nobody
Jason Kraus (zbyte64) wrote :

Attached is the output of the smartctl command with the funky "DELL?" string inside.

This triggered the error for me.

Marcus Gaskamp (mgpanther) wrote :

I managed to export the default scripts and get them working.

The issue seems to be the call .decode('utf-8'): by default the error handling is strict, which raises an exception on the first invalid byte.

As a result, a single undecodable byte anywhere in the SMART output aborts the script.

This is overly cautious, as the SMART output is only compared to a regex anyway. If the command output doesn't match the regex, the test should still fail, only more gracefully.

Considering how it's used, I think all instances in this script can safely be replaced with .decode('utf-8', 'replace'), which will not stop script execution dead, as previously noted.

Perhaps some additional parsing may need to be done but I think this will patch the script to work as expected the vast majority of the time.

There are three places where .decode will break execution. I've included my patch.
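For comparison, here is how Python's standard decode error handlers treat the same byte. This is a small sketch over a made-up field value, not an excerpt from the patch; it only illustrates why 'replace' keeps the script running while 'strict' does not.

```python
raw = b"DELL\x99 EQ"  # hypothetical field value containing the reported byte


def decode_with(handler: str) -> str:
    """Decode raw as UTF-8 with the given error handler."""
    try:
        return raw.decode("utf-8", handler)
    except UnicodeDecodeError:
        return "<UnicodeDecodeError>"


handlers = ("strict", "replace", "ignore", "backslashreplace")
results = {handler: decode_with(handler) for handler in handlers}

for handler, value in results.items():
    # !a prints an ASCII-safe representation of each result
    print(f"{handler:>16}: {value!a}")
```

'replace' keeps a visible marker (U+FFFD) where the bad byte was, which seems preferable to 'ignore' for test output that a person may later read.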

Marcus Gaskamp (mgpanther) wrote :

Here's the patch

Marcus Gaskamp (mgpanther) wrote :

I should add that the workaround in my case was downloading the default storage script and creating user-uploaded scripts with the patch included.

The script uses "-short", "-long" and "-conveyance" in the name of the test to determine which type of test to run, so be sure to pay attention to the name of the new script and the corresponding YAML metadata.
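The name-based selection described above could look roughly like the following. This is a guess at the logic for illustration only: the function name `smart_test_type`, the example script names, and the choice to return None when no marker is present are assumptions, not the code from the MAAS script.

```python
from typing import Optional


def smart_test_type(script_name: str) -> Optional[str]:
    """Guess the SMART self-test type from a marker in the script name."""
    for marker in ("short", "long", "conveyance"):
        if f"-{marker}" in script_name:
            return marker
    # Assumption: no marker means no self-test was requested by name.
    return None


# Hypothetical names for user-uploaded copies of the patched script.
for name in ("smartctl-short-patched", "smartctl-validate-patched"):
    print(name, "->", smart_test_type(name))
```

Under this reading, a patched copy uploaded under a name without one of the markers would not run the intended self-test.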

summary: - [2.4.0~rc1] smartctl verify fails due to Unicode in Disk Vendor Name
+ smartctl verify fails due to Unicode in Disk Vendor Name
Changed in maas:
milestone: none → 3.3.0
Changed in maas:
milestone: 3.3.0 → 3.4.0
Alberto Donato (ack)
Changed in maas:
assignee: nobody → Alberto Donato (ack)
status: Triaged → In Progress
Changed in maas:
status: In Progress → Fix Committed
Alberto Donato (ack)
Changed in maas:
milestone: 3.4.0 → 3.4.0-beta1
Alberto Donato (ack)
Changed in maas:
status: Fix Committed → Fix Released
Mauricio Faria de Oliveira (mfo) wrote :

Marking 3.2 as Fix Released (in 3.2.8).

$ git log --oneline -1 f5d99824a595
f5d99824a595 LP:1773150 replace invalid UTF-8 chars in smartctl output

$ git describe --contains f5d99824a595
3.2.8~12
