smartctl-validate fails to detect that NVME device is SMART-capable

Bug #1904329 reported by Danny Campbell
This bug affects 1 person
Affects  Status        Importance  Assigned to  Milestone
MAAS     Fix Released  Medium      Lee Trager
2.8      Fix Released  Medium      Lee Trager
2.9      Fix Released  Medium      Lee Trager

Bug Description

I have WD Black gaming NVMe drives in many systems, and on each one the smartctl-validate test run during commissioning fails to detect SMART capability.

When the same machines are deployed with Ubuntu 20.04, I can manually confirm that SMART is enabled and read out the attributes.

version:
MAAS version: 2.8.2 (8577-g.a3e674063)

smartctl-validate output:
INFO: Veriying SMART support for the following drive: /dev/nvme0n1
INFO: Running command: sudo -n smartctl --all /dev/nvme0n1
INFO: Unable to run test. The following drive does not support SMART: /dev/nvme0n1
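
To illustrate the kind of check that could produce this result: the report does not show how smartctl-validate decides whether a drive supports SMART, so the following is only a minimal sketch under stated assumptions. It assumes detection is done by scanning the text of `smartctl --all`. ATA/SATA drives typically print a "SMART support is: Available" line, while the NVMe output in this report contains no such line and instead prints a "SMART overall-health self-assessment test result" line. The function name and both patterns are hypothetical and are not taken from MAAS.

```python
import re

# Hypothetical markers, not taken from the MAAS source.
# ATA/SATA drives usually report support with this line.
ATA_MARKER = re.compile(r"^SMART support is:\s+Available", re.MULTILINE)
# The NVMe output in this report has no "SMART support is:" line,
# but it does print an overall-health result.
NVME_MARKER = re.compile(
    r"^SMART overall-health self-assessment test result:", re.MULTILINE
)


def looks_smart_capable(smartctl_output: str) -> bool:
    """Return True if the text output of `smartctl --all` suggests SMART support."""
    return bool(
        ATA_MARKER.search(smartctl_output) or NVME_MARKER.search(smartctl_output)
    )
```

A check that looked only for the ATA-style line would report a drive like this one as unsupported even though smartctl returns SMART data for it.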

NVME PN:
WDS500G3X0C-00SJG0

Manual output:
$ sudo apt install -y smartmontools
...
$ sudo -n smartctl --all /dev/nvme0n1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-53-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: WDS500G3X0C-00SJG0
Serial Number: 20155F801224
Firmware Version: 102000WD
PCI Vendor/Subsystem ID: 0x15b7
IEEE OUI Identifier: 0x001b44
Total NVM Capacity: 500,107,862,016 [500 GB]
Unallocated NVM Capacity: 0
Controller ID: 8215
Number of Namespaces: 1
Namespace 1 Size/Capacity: 500,107,862,016 [500 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 001b44 8b460ae51d
Local Time is: Sun Nov 15 15:16:50 2020 UTC
Firmware Updates (0x14): 2 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x001f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size: 128 Pages
Warning Comp. Temp. Threshold: 80 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Namespace 1 Features (0x02): NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     5.50W       -        -    0  0  0  0        0       0
 1 +     3.50W       -        -    1  1  1  1        0       0
 2 +     3.00W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     4000   10000
 4 -   0.0025W       -        -    4  4  4  4     4000   45000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 30 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 1,982,008 [1.01 TB]
Data Units Written: 2,032,199 [1.04 TB]
Host Read Commands: 2,412,892
Host Write Commands: 2,959,947
Controller Busy Time: 36
Power Cycles: 86
Power On Hours: 243
Unsafe Shutdowns: 37
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0

Error Information (NVMe Log 0x01, max 256 entries)
No Errors Logged
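
For anyone who wants to read these attributes programmatically when reproducing the issue, here is a minimal sketch that pulls the "SMART/Health Information" block out of the text output above into a dictionary. It assumes the layout shown in this report (a header line starting with "SMART/Health Information", then `Key: value` lines, ended by a blank line). The function name is hypothetical and this is not part of MAAS or smartmontools.

```python
def parse_nvme_health(smartctl_output: str) -> dict:
    """Collect the Key: value lines of the SMART/Health Information block."""
    fields = {}
    in_section = False
    for line in smartctl_output.splitlines():
        if line.startswith("SMART/Health Information"):
            in_section = True
            continue
        if in_section:
            if not line.strip():
                break  # a blank line ends the block
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
    return fields
```

Values are kept as strings (for example "30 Celsius") because the units and thousands separators vary by field.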

Lee Trager (ltrager)
Changed in maas:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Lee Trager (ltrager)
milestone: none → 2.9.0rc2
Changed in maas:
status: Triaged → Fix Committed
Changed in maas:
milestone: 2.9.0rc2 → 2.10-next
Changed in maas:
milestone: 3.0.0 → none
status: Fix Committed → Fix Released