Re-commissioning doesn't detect storage changes

Bug #1575567 reported by Jacek Nykis
This bug affects 13 people
Affects  Status        Importance  Assigned to    Milestone
MAAS     Fix Released  Critical    Mike Pontillo
1.9      Fix Released  Critical    Mike Pontillo

Bug Description

I have a node that recently had more disks added, including a new SSD. I re-commissioned it, but the new storage devices are not showing in the MAAS UI (see attachment).

When I look at the commissioning output, the relevant hardware shows up there. For example:
         <lshw:node id="disk:1" claimed="true" class="disk" handle="SCSI:02:00:00:01">
           <lshw:description>SCSI Disk</lshw:description>
           <lshw:product>LOGICAL VOLUME</lshw:product>
           <lshw:vendor>HP</lshw:vendor>
           <lshw:physid>0.0.1</lshw:physid>
           <lshw:businfo>scsi@2:0.0.1</lshw:businfo>
           <lshw:logicalname>/dev/sdb</lshw:logicalname>
           <lshw:dev>8:16</lshw:dev>
           <lshw:version>6.00</lshw:version>
           <lshw:serial>XXX</lshw:serial>
           <lshw:size units="bytes">900151926784</lshw:size>
           <lshw:configuration>
            <lshw:setting id="ansiversion" value="5"/>
            <lshw:setting id="sectorsize" value="512"/>
           </lshw:configuration>
           <lshw:capabilities>
            <lshw:capability id="15000rpm">15000 rotations per minute</lshw:capability>
           </lshw:capabilities>
          </lshw:node>

un maas <none> <none> (no description available)
ii maas-cli 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS command line API tool
un maas-cluster-controller <none> <none> (no description available)
ii maas-common 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server common files
un maas-dhcp <none> <none> (no description available)
ii maas-dns 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS DNS server
ii maas-proxy 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS Caching Proxy
ii maas-region-controller 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server complete region controller
ii maas-region-controller-min 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS Server minimum region controller
ii python-django-maas 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server Django web framework
ii python-maas-client 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS python API client
ii python-maas-provisioningserver 1.9.1+bzr4543-0ubuntu2~trusty1 all MAAS server provisioning libraries

Jacek Nykis (jacekn) wrote :

I uploaded full commissioning output here:
https://private-fileshare.canonical.com/~jacek/lp1575567.xml

Changed in maas:
importance: Undecided → Critical
status: New → Triaged
milestone: none → 1.9.3
Jacek Nykis (jacekn) wrote :

Update: after I removed the node and added it back, I was able to commission it and the hardware was detected properly. Of course, re-commissioning should still work as expected.

Blake Rouse (blake-rouse) wrote :

We do not gather the disk information from lshw. Please provide the output of the commissioning script for "block-devices.out" in the "Commissioning Output" on the node details page. The output for initial commissioning followed by the output from the re-commissioning would be best.

Changed in maas:
status: Triaged → Incomplete
Changed in maas:
milestone: 1.9.3 → 2.0.0
status: Incomplete → Triaged
Mike Pontillo (mpontillo) wrote :

I tested this on MAAS 2.0 and didn't see the issue. Checking 1.9 now...

Changed in maas:
status: Triaged → Incomplete
Mike Pontillo (mpontillo) wrote :

After looking at a packet capture, I found that on MAAS 1.9.1 the following error occurs during posting of the storage data:

POST /MAAS/metadata//2012-03-01/ HTTP/1.1
Accept-Encoding: identity
Content-Length: 1444
Connection: close
User-Agent: Python-urllib/2.7
Host: 172.16.100.10
Content-Type: multipart/form-data; boundary=fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Authorization: OAuth realm="", oauth_nonce="4c370cd98ceb4cf5933a7f1191ef5fe8", oauth_timestamp="1462832639", oauth_consumer_key="LjsQhh8TPyBhPkFHc8", oauth_signature_method="PLAINTEXT", oauth_version="1.0", oauth_token="uK7JkPEeY8wvePayhx", oauth_signature="%26ZNc6eeddjSWSFSnq5tC9q5Tgxe6t9VQf"

--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="status"

WORKING
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="script_result"

0
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="error"

finished 00-maas-07-block-devices [6/9]: 0
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="op"

signal
--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss
Content-Disposition: form-data; name="00-maas-07-block-devices.out"; filename="00-maas-07-block-devices.out"
Content-Type: application/octet-stream

[
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sda",
  "ID_PATH": "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-0-0-0",
  "PATH": "/dev/sda",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "QEMU HARDDISK",
  "RO": "0",
  "SERIAL": "drive-scsi1-0-0-0",
  "SIZE": "1073741824"
 },
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sdb",
  "ID_PATH": "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-0-0",
  "PATH": "/dev/sdb",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "QEMU HARDDISK",
  "RO": "0",
  "SERIAL": "drive-scsi0-0-0",
  "SIZE": "8589934592"
 },
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sdc",
  "ID_PATH": "/dev/disk/by-id/wwn-0x3000000100000001",
  "PATH": "/dev/sdc",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "VIRTUAL-DISK",
  "RO": "1",
  "SERIAL": "3000000100000001",
  "SIZE": "1468006400"
 }
]

--fdxUFynvFVvxTcXYEYRiTjCQlzgfrss--
HTTP/1.1 400 BAD REQUEST
Date: Mon, 09 May 2016 22:24:00 GMT
Server: TwistedWeb/13.2.0
Content-Type: application/json
X-Frame-Options: SAMEORIGIN
Vary: Cookie
Connection: close
Transfer-Encoding: chunked

45
{"__all__": ["Block device with this Node and Name already exists."]}
0
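
The response body in the capture above uses chunked transfer encoding: `45` is the chunk size in hexadecimal (69 bytes) and the trailing `0` marks the end of the body. The following is a minimal sketch, not part of MAAS, of how that body could be decoded to pull out the error message. The helper name `decode_chunked` is hypothetical, and the sketch assumes no chunk extensions or trailers. The `__all__` key is how Django reports validation errors that are not tied to a single form field.

```python
import json


def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body (no trailers, no chunk extensions)."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol], 16)  # chunk size is written in hexadecimal
        if size == 0:
            break  # a zero-length chunk terminates the body
        start = eol + 2
        out += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that closes the chunk
    return bytes(out)


# The response body from the capture, with explicit CRLF line endings.
raw = (
    b"45\r\n"
    b'{"__all__": ["Block device with this Node and Name already exists."]}\r\n'
    b"0\r\n\r\n"
)

payload = json.loads(decode_chunked(raw))
errors = payload["__all__"]  # non-field validation errors
print(errors[0])
```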

Mike Pontillo (mpontillo) wrote :

I think I figured out why this is happening.

In the database, we have a unique constraint on the combination of node and device name for each storage device. For example, if you have {node1, sda} and you add {node1, sdb}, that's fine.

But if you add a disk and the kernel then assigns the device names differently, then when we go to insert the new device, its name might clash with the name already recorded for an old device.
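
The clash described above can be reproduced with a toy model. This is a sketch under stated assumptions, not the MAAS code or the fix that was committed: `BlockDeviceTable`, `naive_update`, and `reconcile` are hypothetical names, the table is an in-memory stand-in for a database table with a unique (node, name) constraint, and disks are matched by the `SERIAL` field from the commissioning output.

```python
class UniqueViolation(Exception):
    """Raised when a (node, name) pair would be duplicated."""


class BlockDeviceTable:
    """Toy stand-in for a table with a unique (node, name) constraint."""

    def __init__(self):
        self.rows = {}  # row id -> {"node", "name", "serial"}
        self._next_id = 1

    def _check_unique(self, node, name, ignore_id=None):
        for row_id, row in self.rows.items():
            if row_id != ignore_id and (row["node"], row["name"]) == (node, name):
                raise UniqueViolation(
                    "Block device with this Node and Name already exists.")

    def insert(self, node, name, serial):
        self._check_unique(node, name)
        row_id = self._next_id
        self._next_id += 1
        self.rows[row_id] = {"node": node, "name": name, "serial": serial}
        return row_id

    def rename(self, row_id, name):
        self._check_unique(self.rows[row_id]["node"], name, ignore_id=row_id)
        self.rows[row_id]["name"] = name


def naive_update(table, node, reported):
    """Insert every device with an unknown serial; fails if its name is taken."""
    known = {r["serial"] for r in table.rows.values() if r["node"] == node}
    for dev in reported:
        if dev["SERIAL"] not in known:
            table.insert(node, dev["NAME"], dev["SERIAL"])


def reconcile(table, node, reported):
    """Match rows by serial, park the old names, then apply the reported names."""
    by_serial = {r["serial"]: rid for rid, r in table.rows.items()
                 if r["node"] == node}
    reported_serials = {dev["SERIAL"] for dev in reported}
    # Drop rows for disks that are no longer reported.
    for serial, rid in list(by_serial.items()):
        if serial not in reported_serials:
            del table.rows[rid]
            del by_serial[serial]
    # Park surviving rows under temporary names so no final name is occupied.
    for rid in by_serial.values():
        table.rename(rid, "__tmp_%d" % rid)
    # Apply the final names, inserting rows for newly seen serials.
    for dev in reported:
        rid = by_serial.get(dev["SERIAL"])
        if rid is None:
            table.insert(node, dev["NAME"], dev["SERIAL"])
        else:
            table.rename(rid, dev["NAME"])


# node1 starts with one disk recorded as sda. After a disk is added, the
# kernel names the new disk sda and the old one sdb.
table = BlockDeviceTable()
table.insert("node1", "sda", "OLD-DISK")
after = [{"NAME": "sda", "SERIAL": "NEW-DISK"},
         {"NAME": "sdb", "SERIAL": "OLD-DISK"}]

try:
    naive_update(table, "node1", after)
    clashed = False
except UniqueViolation:
    clashed = True  # the new disk's name collides with the old row

reconcile(table, "node1", after)
names = sorted((r["name"], r["serial"]) for r in table.rows.values())
print(clashed, names)
```

The sketch assumes every reported device carries a `SERIAL`. That does not always hold (the `vda` entry posted later in this report has none), so a real implementation would need a fallback way to match devices.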

Changed in maas:
status: Incomplete → Triaged
assignee: nobody → Mike Pontillo (mpontillo)
Mike Pontillo (mpontillo) wrote :

Note: you've got a 50/50 chance of hitting this bug, depending on whether the kernel decides to insert the new device before or after your old one. ;-)

Changed in maas:
status: Triaged → Fix Committed
Trent Lloyd (lathiat) wrote :

Tested on maas/proposed 1.9.3 and the issue is resolved for me. In my case, the existing device details were being updated.

Mike Pontillo (mpontillo) wrote :

Glad to hear it; thanks for confirming!

Virginie Dotta (vdotta) wrote :

Hi Mike,

I am seeing a similar bug in MAAS 1.9.3.

No disk information is shown in the GUI.

Virginie Dotta (vdotta) wrote :

Here is the output of 00-maas-07-block-devices.out:

[
 {
  "BLOCK_SIZE": "4096",
  "NAME": "sda",
  "ID_PATH": "/dev/disk/by-id/wwn-0x3000000600000001",
  "PATH": "/dev/sda",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "VIRTUAL-DISK",
  "RO": "1",
  "SERIAL": "3000000600000001",
  "SIZE": "1468006400"
 },
 {
  "BLOCK_SIZE": "4096",
  "NAME": "vda",
  "PATH": "/dev/vda",
  "ROTA": "1",
  "RM": "0",
  "MODEL": "",
  "RO": "0",
  "SIZE": "128849018880"
 }
]

Changed in maas:
status: Fix Committed → Fix Released
Aymen Frikha (aym-frikha) wrote :

I also have the same issue using MAAS version 2.1.0+bzr5480-0ubuntu1.
Some disks were added to the nodes and others were removed. After recommissioning, the new devices were not detected.
The hardware is ProLiant DL 380 Gen9.

Grant Slater (firefishy) wrote :

Seeing this issue in MAAS 2.2.2-6099-g8751f91-0ubuntu1~16.04.1:

"Block device with this Node and Name already exists."
