MAAS WebUI crashes when installing maas-region-controller only

Bug #1103195 reported by Andres Rodriguez
This bug affects 1 person
Affects         Status         Importance   Assigned to        Milestone
MAAS            Invalid        Critical     Raphaël Badin
maas (Ubuntu)   Fix Released   Medium       Andres Rodriguez

Bug Description

MAAS WebUI crashes when installing maas-region-controller only:

[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] mod_wsgi (pid=7256): Target WSGI script '/usr/share/maas/wsgi.py' cannot be loaded as Python module.
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] mod_wsgi (pid=7256): Exception occurred processing WSGI script '/usr/share/maas/wsgi.py'.
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] Traceback (most recent call last):
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/share/maas/wsgi.py", line 30, in <module>
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] start_up()
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/dist-packages/maasserver/start_up.py", line 59, in start_up
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] inner_start_up()
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/dist-packages/maasserver/start_up.py", line 84, in inner_start_up
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] NodeGroup.objects.ensure_master()
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/dist-packages/maasserver/models/nodegroup.py", line 95, in ensure_master
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] 'master', 'master', '127.0.0.1', dhcp_key=generate_omapi_key(),
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/dist-packages/provisioningserver/omshell.py", line 95, in generate_omapi_key
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] key = run_repeated_keygen(tmpdir)
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/dist-packages/provisioningserver/omshell.py", line 56, in run_repeated_keygen
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] key_id = call_dnssec_keygen(tmpdir)
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/dist-packages/provisioningserver/omshell.py", line 45, in call_dnssec_keygen
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] env=env)
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/subprocess.py", line 537, in check_output
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] process = Popen(stdout=PIPE, *popenargs, **kwargs)
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/subprocess.py", line 680, in __init__
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] errread, errwrite)
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] File "/usr/lib/python2.7/subprocess.py", line 1277, in _execute_child
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] raise child_exception
[Tue Jan 22 16:49:58 2013] [error] [client 192.168.123.3] OSError: [Errno 2] No such file or directory

Installing maas-dns fixes the problem. However, this should not happen. The MAAS region controller seems to be trying to start the DNS service when the cluster controller is not installed alongside the region, even though it has not been configured to do so.
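
The traceback above ends in a bare OSError because subprocess could not find the executable it was asked to run. The following is a minimal sketch of that failure mode and of a clearer way to report it. It is written for Python 3 rather than the Python 2.7 shown in the log, and the names MissingToolError, run_tool and demo_missing are hypothetical; they are not part of MAAS.

```python
import errno
import shutil
import subprocess


class MissingToolError(RuntimeError):
    """Raised when a required external executable is not installed."""


def run_tool(argv):
    """Run an external command, reporting a missing executable clearly."""
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(argv[0]) is None:
        raise MissingToolError("%s is not installed or not on PATH" % argv[0])
    return subprocess.check_output(argv)


def demo_missing(name="no-such-tool-for-demo"):
    """Reproduce the raw failure: running a missing binary raises OSError."""
    try:
        subprocess.check_output([name])
    except OSError as error:
        # Errno 2 is ENOENT, i.e. "No such file or directory".
        return error.errno == errno.ENOENT
    return False
```

A check of this kind does not remove the need for the tool; it only turns an opaque start-up crash into a message that names the missing executable.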

Changed in maas (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Andres Rodriguez (andreserl)
description: updated
Raphaël Badin (rvb) wrote :

We generate the omapi_key when the nodegroup objects are created. For the master nodegroup (corresponding to the main cluster controller), this happens when the application starts up. If the maas-dns package is not installed, the tool to generate the key might not be installed (it's a dependency of maas-dns)… hence the error.

I suggest we refactor the code to generate the keys "on-demand" (i.e. when an action implying that maas-dns is installed is triggered) rather than in advance.
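
A minimal sketch of what "on-demand" generation could look like, assuming the key generator is available as a zero-argument callable. LazyKey is a hypothetical helper for illustration, not code from the MAAS tree.

```python
class LazyKey(object):
    """Hold a key that is generated only the first time it is requested."""

    def __init__(self, generate):
        # `generate` is any zero-argument callable that returns the key.
        self._generate = generate
        self._key = None

    def get(self):
        if self._key is None:
            # The external tool (which may be missing) runs here,
            # not when the application starts up.
            self._key = self._generate()
        return self._key
```

With this shape, a region controller that never performs an action needing the key would never invoke the key generation tool, so a missing tool would not break start-up.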

Changed in maas:
status: New → Triaged
importance: Undecided → Critical
Raphaël Badin (rvb)
Changed in maas:
assignee: nobody → Raphaël Badin (rvb)
status: Triaged → In Progress
Raphaël Badin (rvb) wrote :

Actually, as suggested by Andres, adding bind9utils as a dependency to the region controller package is a much simpler way to fix this. Since bind9utils is a tiny collection of utilities, it's really harmless.

Raphaël Badin (rvb) wrote :

Doing what I suggest above would indeed fix the immediate problem but this leads to bigger problems:

The code assumes in two places (see below) that the first cluster to connect is the cluster controller installed on the same machine as the region controller. This is based on the assumption that there is always a "master" cluster controller installed alongside the region controller:
a) the first controller to connect is automatically accepted (this might not be what we want if the first controller is connecting from a remote location)
b) the code that accepts the first cluster assumes it's connecting from localhost and so does not bother trying to update nodegroup.maas_url (the url that the cluster should use to contact the MAAS server).

I think we've got two choices here:
- in the packaging, we reflect the assumption I described above and add maas-cluster-controller as a dependency of maas-region-controller.
- we fix the two problems I mention above ( a): only accept the first controller to connect if it's connecting from localhost, b) update nodegroup.maas_url for the first controller if the controller does not connect from localhost).
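
The second option can be sketched roughly as follows. This is an illustration under stated assumptions, not the MAAS implementation: is_loopback and handle_first_cluster are hypothetical names, and the sketch assumes the region can see the connecting address as a string.

```python
import ipaddress


def is_loopback(remote_addr):
    """Return True if remote_addr is 'localhost' or a loopback IP address."""
    if remote_addr == "localhost":
        return True
    try:
        return ipaddress.ip_address(remote_addr).is_loopback
    except ValueError:
        # Not a literal IP address; treat it as remote.
        return False


def handle_first_cluster(remote_addr, region_url):
    """Decide how to treat the first cluster controller to connect.

    Returns (accepted, maas_url). maas_url is None when no update is needed.
    """
    if is_loopback(remote_addr):
        # (a) Local controller: accept it automatically, leave maas_url alone.
        return True, None
    # (a) Remote controller: do not accept it automatically.
    # (b) Record the URL it should use to contact the MAAS server.
    return False, region_url
```

For example, handle_first_cluster("127.0.0.1", url) accepts the controller without touching maas_url, while a connection from 192.168.123.3 is left pending and gets maas_url set.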

Jeroen T. Vermeulen (jtv) wrote :

(FWIW somebody landed a fix recently to suppress the initialization of NodeGroup.maas_url if the given URL had localhost as its hostname. It seems like a better way to deal with the maas_url part of the problem.)

Jeroen T. Vermeulen (jtv) wrote :

(Correction: I meant to say that AIUI, that fix suppresses _any_ initialization of maas_url, regardless of whether the calling code thinks it's dealing with the master cluster)

Raphaël Badin (rvb) wrote :

The fix you're talking about did not touch the code used when a cluster connects for the first time.

Raphaël Badin (rvb) wrote :

I guess this is in fact a bug distinct from this one; I've filed bug 1104215.

Gavin Panella (allenap) wrote :

To ensure that we accept as "master" only the cluster controller on
the local machine, can the region compare the UUID it's given against
the UUID it can see on the filesystem?

Even if the region is in an HA configuration, the cluster controller
will connect to localhost and therefore the region app server reached
will be the one on the same machine, and thus able to see the same
UUID.

Then we can probably get rid of the special cases for maas_url and
whatnot.

If this isn't suitable, then I suggest going with rvba's second
proposal:

> - we fix the two problems I mention above ( a): only accept the
> first controller to connect if it's connecting from localhost, b)
> update nodegroup.maas_url for the first controller if the
> controller does not connect from localhost).
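
The UUID comparison suggested at the top of this comment might look something like the sketch below. The file location is passed in as a parameter because the thread does not say where the UUID is stored; read_local_uuid and is_master_candidate are hypothetical names.

```python
def read_local_uuid(path):
    """Read the cluster UUID stored on the local filesystem."""
    with open(path) as uuid_file:
        return uuid_file.read().strip()


def is_master_candidate(presented_uuid, path):
    """Return True only if the presented UUID matches the local one."""
    try:
        local_uuid = read_local_uuid(path)
    except (IOError, OSError):
        # No readable local UUID: assume no cluster controller
        # is installed on this machine.
        return False
    # An empty file must never match an empty presented UUID.
    return bool(local_uuid) and presented_uuid == local_uuid
```

A remote cluster controller would present a UUID that the region cannot find on its own filesystem, so it would not be treated as the master.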

Raphaël Badin (rvb) wrote :

> To ensure that we accept as "master" only the cluster controller on
> the local machine, can the region compare the UUID it's given against
> the UUID it can see on the filesystem?

A simpler way (which we already use to know if we need to update nodegroup.maas_url or not when a cluster connects) is to see if the cluster connects from 'localhost' or a remote host.

(Note that this conversation should happen on bug 1104215)

Julian Edwards (julian-edwards) wrote :

Marking the maas task invalid as it's fixed in packaging.

Changed in maas (Ubuntu):
status: Confirmed → In Progress
Changed in maas:
status: In Progress → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package maas - 1.2+bzr1351+dfsg-0ubuntu1

---------------
maas (1.2+bzr1351+dfsg-0ubuntu1) raring; urgency=low

  * New upstream bugfix release.

  [ Raphaël Badin ]
  * debian/control: maas-region-controller depends on bind9utils.
    (LP: #1103195)

  [ Andres Rodriguez ]
  * debian/control: Depends on distro-info for maas-cluster-controller
    instead of maas-region-controller (LP: #1103194)
  * debian/maas-region-controller.install: Install commissioning-user-data
    instead of maas-cluster-controller. (LP: #1103203)
 -- Andres Rodriguez <email address hidden> Tue, 22 Jan 2013 17:51:26 -0500

Changed in maas (Ubuntu):
status: In Progress → Fix Released