Tagged VLAN on aliased NIC breaks migration 0099

Bug #1391139 reported by Nicolas Thomas
This bug affects 1 person
Affects: MAAS
Status: Fix Released
Importance: Critical
Assigned to: Jeroen T. Vermeulen

Bug Description

Find below the log of errors when upgrading to 1.7

Setting up maas-region-controller (1.7.0~rc2+bzr3297-0ubuntu1~trusty1) ...
rsyslog stop/waiting
rsyslog start/running, process 4104
 * Stopping web server apache2
 *
 * Restarting PostgreSQL 9.3 database server
   ...done.
Considering dependency proxy for proxy_http:
Module proxy already enabled
Module proxy_http already enabled
Module expires already enabled
Module wsgi already enabled
Syncing...
Creating tables ...
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)

Synced:
 > django.contrib.auth
 > django.contrib.contenttypes
 > django.contrib.sessions
 > django.contrib.sites
 > django.contrib.messages
 > django.contrib.staticfiles
 > piston
 > south

Not synced (use migrations):
 - maasserver
 - metadataserver
(use ./manage.py migrate to migrate these)
Running migrations for maasserver:
 - Migrating forwards to 0114_add_pxe_mac_to_node.
 > maasserver:0099_convert_cluster_interfaces_to_networks
Error in migration: maasserver:0099_convert_cluster_interfaces_to_networks
Traceback (most recent call last):
  File "/usr/sbin/maas-region-admin", line 16, in <module>
    management.execute_from_command_line()
  File "/usr/lib/python2.7/dist-packages/django/core/management/__init__.py", line 399, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python2.7/dist-packages/django/core/management/__init__.py", line 392, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python2.7/dist-packages/django/core/management/base.py", line 242, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/usr/lib/python2.7/dist-packages/django/core/management/base.py", line 285, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python2.7/dist-packages/south/management/commands/migrate.py", line 107, in handle
    ignore_ghosts = ignore_ghosts,
  File "/usr/lib/python2.7/dist-packages/south/migration/__init__.py", line 219, in migrate_app
    success = migrator.migrate_many(target, workplan, database)
  File "/usr/lib/python2.7/dist-packages/south/migration/migrators.py", line 235, in migrate_many
    result = migrator.__class__.migrate_many(migrator, target, migrations, database)
  File "/usr/lib/python2.7/dist-packages/south/migration/migrators.py", line 310, in migrate_many
    result = self.migrate(migration, database)
  File "/usr/lib/python2.7/dist-packages/south/migration/migrators.py", line 133, in migrate
    result = self.run(migration)
  File "/usr/lib/python2.7/dist-packages/south/migration/migrators.py", line 107, in run
    return self.run_migration(migration)
  File "/usr/lib/python2.7/dist-packages/south/migration/migrators.py", line 81, in run_migration
    migration_function()
  File "/usr/lib/python2.7/dist-packages/south/migration/migrators.py", line 57, in <lambda>
    return (lambda: direction(orm))
  File "/usr/lib/python2.7/dist-packages/maasserver/migrations/0099_convert_cluster_interfaces_to_networks.py", line 52, in forwards
    network.save()
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 545, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 573, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 654, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 687, in _do_insert
    using=using, raw=raw)
  File "/usr/lib/python2.7/dist-packages/django/db/models/manager.py", line 232, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/usr/lib/python2.7/dist-packages/django/db/models/query.py", line 1511, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/lib/python2.7/dist-packages/django/db/models/sql/compiler.py", line 897, in execute_sql
    for sql, params in self.as_sql():
  File "/usr/lib/python2.7/dist-packages/django/db/models/sql/compiler.py", line 855, in as_sql
    for obj in self.query.objs
  File "/usr/lib/python2.7/dist-packages/django/db/models/fields/__init__.py", line 350, in get_db_prep_save
    prepared=False)
  File "/usr/lib/python2.7/dist-packages/django/db/models/fields/__init__.py", line 342, in get_db_prep_value
    value = self.get_prep_value(value)
  File "/usr/lib/python2.7/dist-packages/django/db/models/fields/__init__.py", line 1079, in get_prep_value
    return int(value)
ValueError: invalid literal for int() with base 10: '200:0'

Revision history for this message
Graham Binns (gmb) wrote :

This needs a bit more information. Specifically:

 - Was this an upgrade from a vanilla 1.5 install, or had it been in use?
 - Can you reproduce this, or did it happen just once?

Changed in maas:
status: New → Triaged
importance: Undecided → Critical
milestone: none → 1.7.1
status: Triaged → Incomplete
tags: added: crash upgrade
Revision history for this message
Nicolas Thomas (thomnico) wrote : Re: [Bug 1391139] Re: maas region controller upgrade from 1.5 to 1?7 script fail

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/11/2014 17:09, Graham Binns wrote:
> This needs a bit more information. Specifically:
>
> - Was this an upgrade from a vanilla 1.5 install, or had it been in
> use?

One that was in use; I'm actively using MAAS.

> - Can you reproduce this, or did it happen just once?

Yes, I could reproduce it very consistently.

I had to move fast, so I purged the maas-region-controller package and
the DB and recommissioned the nodes; it is a lab.

I noticed more errors in the package script when reinstalling after the
purge, namely:
/var/log/maas and /var/log/maas/oops did not exist.

- --
Best Regards,
Nicolas Thomas - Solution Architect - Canonical
http://insights.ubuntu.com/?p=889
GPG FPR: D592 4185 F099 9031 6590 6292 492F C740 F03A 7EB9
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJUYPPvAAoJEEkvx0DwOn657OEH/jHzU2PIV+IPysu68EmhXmVM
BzKACpx5meeg66bQIRAsAXKa4Nn82ArF2saKQJQa+n3Tpr87l+Wi/1XE8wP5QIOt
woyt0x9qnrnzLWK0bHXabC6izlYYU8DwDpB9rkixvJ75oHBvMnSVfxx5zwbqQjLN
4FbIa0kgJbCV91Gm28rNxeCJtCUcheq7OUha+NVGtscGAmeWRmzVxslDwrUTJ0Y5
OljEl2Jb/pNhsY7E4Xk1PG6273ruLwqNKWR80neOGHPLC5XoRCPy/bA9YhbiTOQD
/y2swZ3QVXBLH6PMaJHZJywB7/Jyiie1vsGUPNaKw9lWOAAkcX0l4G2WIDU9mi0=
=0i67
-----END PGP SIGNATURE-----

Revision history for this message
Graham Binns (gmb) wrote :

On 10 November 2014 17:20, Nicolas Thomas <email address hidden> wrote:
> One in use ... I'm using maas ..

Right; I wasn't clear if you'd installed 1.5 and upgraded immediately.
Thanks for clarifying.

Can you give us a series of steps to reproduce this? I haven't had a
chance to try to reproduce it locally yet, but I'll give it a shot
tomorrow. (We could always try this out on an Orange Box in Austin if
there's one available that exhibits this behaviour).

If the "series of steps" is basically "use MAAS, then, later, upgrade
MAAS." then fair enough… is this a standard orange box setup as
created by the scripts in the orange box PPA?

Revision history for this message
Julian Edwards (julian-edwards) wrote :

It looks to me like the vlan tag has got a colon in its definition, which
would result from a cluster interface name of the form:

eth0.200:0

The code tries to convert this into a Network definition with name "eth0" and
vlan tag "200:0".

Can you confirm whether this was the case, Nicolas?
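
The failure mode described in this comment can be sketched as follows. This is a minimal illustration, not the MAAS migration code: the function name `naive_name_and_vlan` is hypothetical, and it assumes the conversion simply splits the interface name at the first dot and later passes the remainder to `int()`, as the traceback suggests.

```python
def naive_name_and_vlan(interface):
    """Split 'eth0.200' into ('eth0', '200').

    Hypothetical helper: it knows nothing about alias suffixes such as ':0'.
    """
    name, _, vlan_tag = interface.partition(".")
    return name, (vlan_tag or None)


# An aliased, VLAN-tagged interface keeps its alias suffix in the tag.
name, vlan_tag = naive_name_and_vlan("eth0.200:0")

try:
    int(vlan_tag)
except ValueError as error:
    # Same kind of error as the one at the end of the traceback.
    print("Conversion failed: %s" % error)
```

With a plain tagged interface such as `eth0.200` the same helper yields a tag that `int()` accepts, which would explain why only installations with an aliased interface hit the error.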

Revision history for this message
Nicolas Thomas (thomnico) wrote :

First, this is not on an Orange Box but on a partner's premises.

I had to remove all the maas packages and the DB and re-commission the
nodes (acceptable only because this is a test environment). I cannot
change anything now, as it basically works.


Revision history for this message
Christian Reis (kiko) wrote : Re: maas region controller upgrade from 1.5 to 1?7 script fail

Aren't colons an indication of aliases? If so, does that look like a vlan with an alias?

Revision history for this message
Julian Edwards (julian-edwards) wrote :

It's a VLAN on an aliased network interface, not a VLAN with an alias. Perfectly valid, I guess, but the notation surprises me: I would have expected something like eth0:0.200.

We'll have to dig up an Ubuntu expert to see how this gets presented in userspace.
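
Until the userspace notation is confirmed, a parser could accept both orderings discussed here (`eth0.200:0` and `eth0:0.200`). The sketch below is an assumption-laden illustration, not the fix that was eventually committed: `split_interface` is a hypothetical name, and it assumes a dot always introduces a numeric VLAN tag and a colon always introduces an alias.

```python
import re

# Base name, followed by any number of ".<value>" or ":<value>" suffixes.
_INTERFACE = re.compile(r"([^.:]+)((?:[.:][^.:]+)*)")


def split_interface(interface):
    """Return (name, vlan_tag, alias) for a cluster interface name.

    Hypothetical helper: treats '.<n>' as a VLAN tag and ':<x>' as an alias,
    in whichever order they appear.
    """
    match = _INTERFACE.fullmatch(interface)
    if match is None:
        raise ValueError("Unrecognised interface name: %r" % interface)
    name, suffixes = match.groups()
    vlan_tag = None
    alias = None
    for separator, value in re.findall(r"([.:])([^.:]+)", suffixes):
        if separator == ".":
            vlan_tag = int(value)  # still raises ValueError for a non-numeric tag
        else:
            alias = value
    return name, vlan_tag, alias
```

Keeping the alias as a separate return value means the VLAN tag stored on the Network is always a clean integer, whichever notation the system reports.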

Changed in maas:
status: Incomplete → Triaged
summary: - maas region controller upgrade from 1.5 to 1?7 script fail
+ Tagged VLAN on aliased NIC breaks migration 0099
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

The function that's failing is used both in the codebase and in the migration. The two can't share model-aware code, so the function is duplicated between them. That makes this fix a bit awkward, especially because we can't cover migrations in the test suite.

My branch in https://code.launchpad.net/~jtv/maas/unify-get_name_and_vlan_from_cluster_interface/+merge/241738 unifies the two versions of the function. Should make it a bit more robust to maintain. Consider using that as the basis for a fix.

Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

While I was at it, I wrote up a fix based on the unified function. It is attached to this bug. I'm not assigning the bug to myself yet because I haven't looked into it much and may still be on the wrong track.

Christian Reis (kiko)
Changed in maas:
assignee: nobody → Jeroen T. Vermeulen (jtv)
Changed in maas:
status: Triaged → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released