Node always belongs to the same nodegroup/cluster

Bug #1148016 reported by Andres Rodriguez
This bug affects 1 person
Affects: MAAS
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

While trying to work on the FPI, I found out that the nodes always belong to the same nodegroup/cluster controller even though they have been registered with different clusters.

My setup is the following:

1. maas region node (node: maas)
2. maas cluster node (node: cluster)
3. Added a 2nd maas-cluster-controller in the maas-region node (node: maas).

Then I proceeded with enlistment/commissioning:

1. Enlisted/Commissioned node01 with maas-cluster in cluster node (192.168.123.3)
2. Enlisted/Commissioned node02 with maas-cluster in maas node (192.168.123.2)

Then I did the following:

>>> from maasserver.models import Node
>>> node = Node.objects.get(hostname='node01')
>>> node.nodegroup.get_any_interface().ip
'192.168.123.3'
>>> node = Node.objects.get(hostname='node02')
>>> node.nodegroup.get_any_interface().ip
'192.168.123.3'

Note that I added a function, get_any_interface, that returns the first interface in a nodegroup/cluster. This is unrelated to the issue, however, because I also did the following:

>>> from maasserver.models import Node
>>> node = Node.objects.get(hostname='node01')
>>> node.nodegroup.uuid
u'5f26e851-6404-4895-9378-ad653d0327ac'

>>> node = Node.objects.get(hostname='node02')
>>> node.nodegroup.uuid
u'5f26e851-6404-4895-9378-ad653d0327ac'

As you can see, both UUIDs are the same. The UUID belongs to the cluster controller running on node 'cluster'.

Changed in maas:
importance: Undecided → High
importance: High → Undecided
Raphaël Badin (rvb) wrote :

When a node enlists, we decide which cluster it should be attached to by checking the IP address of the node against the managed interfaces configured on the cluster (see src/maasserver/api.py: find_nodegroup).
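
To illustrate the check described above, here is a minimal sketch. It is not the actual find_nodegroup code from src/maasserver/api.py: the Interface tuple, its fields, and find_cluster_for_ip are hypothetical names, and the sketch only assumes that a managed interface can be described by a cluster UUID and a network in CIDR form.

```python
import ipaddress
from collections import namedtuple

# Hypothetical stand-in for a cluster interface record; not the MAAS model.
Interface = namedtuple("Interface", ["cluster_uuid", "network", "managed"])


def find_cluster_for_ip(node_ip, interfaces):
    """Return the UUID of the first cluster with a managed interface whose
    network contains node_ip, or None when nothing matches."""
    address = ipaddress.ip_address(node_ip)
    for interface in interfaces:
        if not interface.managed:
            continue  # Unmanaged interfaces are ignored by this check.
        if address in ipaddress.ip_network(interface.network):
            return interface.cluster_uuid
    return None
```

With no managed interfaces at all, a function like this returns None for every node, whichever cluster the node booted from.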

Can you tell us how the interfaces of the 2 clusters are configured?

Andres Rodriguez (andreserl) wrote :

Hi Raphael... that's the thing... there are no managed interfaces for the cluster, as I'm using external DHCP servers. However, this should work regardless of whether we are managing or not, because the nodes will still boot from a particular cluster, making them belong to a nodegroup.

Raphaël Badin (rvb) wrote :

Ok, this is related to bug 1085823 then. The problem was that, since we detect all the interfaces of the server where MAAS is installed, we might get conflicting interfaces if we use *all* the interfaces (and not just the *managed* interfaces) to detect which nodegroup a node should be attached to.

Gavin Panella (allenap) wrote :

My understanding of this is the following:

- Neither cluster controller manages DHCP or DNS,

- initialize_node_group() is called with an enlisting node,

- find_nodegroup() cannot find a matching nodegroup for the node,

- initialize_node_group() therefore sets the node's nodegroup to the
  master nodegroup.
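
The flow in the list above can be sketched roughly as follows. This is only an illustration of the fallback as described there, not the real initialize_node_group(): assign_nodegroup, no_managed_interfaces, and the "master" string are hypothetical stand-ins.

```python
def assign_nodegroup(node_ip, find_nodegroup, master_nodegroup):
    """Pick a nodegroup for an enlisting node, falling back to the master."""
    nodegroup = find_nodegroup(node_ip)
    if nodegroup is None:
        # No cluster claimed this address, so the master nodegroup is used.
        return master_nodegroup
    return nodegroup


def no_managed_interfaces(node_ip):
    """Finder for a setup where no cluster manages DHCP or DNS."""
    return None


# Both nodes get the same nodegroup, whichever cluster they booted from.
assignments = {
    ip: assign_nodegroup(ip, no_managed_interfaces, "master")
    for ip in ("192.168.123.10", "192.168.123.20")
}
```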

I'm not sure what we can do better here yet.

Andres, how much does this matter for the purposes of FPI? Can you
work around this?

Andres Rodriguez (andreserl) wrote :

Hi Gavin,

This is important because the root.tar.gz images used by the FPI are stored in the Cluster Controllers, since they are currently generated based on the ephemeral images.

Now, since each node in an FPI installation is told where to obtain the root.tar.gz image from, it is important to determine the nodegroup/cluster controller the node belongs to. By doing so, we ensure that the node contacts the correct Cluster Controller (the one it booted from) and can successfully obtain the image.

The proposed way to do this is via the cluster_host variable made available for the preseeds.

Unfortunately, there's no easy way to work around this.

Cheers.

Andres Rodriguez (andreserl) wrote :

So I have tested the attached branch, and having 2 cluster controllers on different networks with DNS/DHCP not being managed by MAAS seems to work just fine:

>>> node = Node.objects.get(hostname='node1-1')
>>> node.nodegroup.get_any_interface().ip
'192.168.123.2'
>>> node = Node.objects.get(hostname='node2-1')
>>> node.nodegroup.get_any_interface().ip
'192.168.124.2'

Changed in maas:
status: New → Fix Committed
importance: Undecided → Critical
Changed in maas:
status: Fix Committed → Fix Released