network state is lost if the cluster controller (CC) is stopped

Bug #460089 reported by Daniel Nurmi
This bug affects 1 person
Affects              Status        Importance  Assigned to  Milestone
eucalyptus (Ubuntu)  Fix Released  Medium      Unassigned
Karmic               Fix Released  Medium      Mathias Gug

Bug Description

in the eucalyptus-cc upstart script, the line:

 rm -f /var/lib/eucalyptus/CC/*

will always clear all CC state when the service is stopped. The upstream init scripts use:

stop/start/restart

to control the service while maintaining CC state (stored in /var/lib/eucalyptus/CC/), while:

cleanstop/cleanstart/cleanrestart

will remove the state. With upstart, one idea is to wrap the above line in a conditional that is true only if a variable is defined on the command line. The default behavior would then be to leave the state in place; an admin who really wanted to start from a clean state, or re-read the config file, would have to set the variable on the command line:

start eucalyptus-cc CLEAN=1
stop eucalyptus-cc CLEAN=1
restart eucalyptus-cc CLEAN=1

and in the file:

if [ "$CLEAN" = "1" ]; then
   rm -f /var/lib/eucalyptus/CC/*
fi

-Dan

===================
SRU verification

Impact:
If the CC is restarted or rebooted (to be exact, whenever eucalyptus-cc is stopped), all the network configuration for running instances is lost.

How the bug has been addressed:
The post script of the eucalyptus-cc upstart job has been modified to only delete the network state file if the CLEAN environment variable has been set to 1.
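The guarded cleanup described above could look roughly like the following shell fragment. This is a minimal sketch, not the packaged upstart job: the names STATE_DIR and clean_cc_state are illustrative assumptions, and only the state path and the CLEAN=1 convention come from this report.

```shell
#!/bin/sh
# Sketch only: STATE_DIR and clean_cc_state are illustrative names and are
# not taken from the packaged eucalyptus-cc upstart job.
STATE_DIR="${STATE_DIR:-/var/lib/eucalyptus/CC}"

clean_cc_state() {
    # Remove the CC state files only when explicitly requested with CLEAN=1.
    # "=" is the POSIX string comparison; "==" is not accepted by every /bin/sh.
    if [ "${CLEAN:-}" = "1" ]; then
        rm -f "$STATE_DIR"/*
    fi
}
```

With CLEAN unset or set to anything other than 1, the function is a no-op, so a plain stop or reboot leaves the network state in place.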

To reproduce the bug:
1. Install a CC and one NC.
2. Start an instance with a public IP and log into the running instance.
3. Reboot the CC.
4. After the CC has rebooted, the running instance cannot be accessed via its previous public IP.
5. Once the CC has been upgraded, a reboot of the CC should not prevent login into running instances using their previously assigned public IP.
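The connectivity check in steps 4 and 5 could be scripted along these lines. This is a hedged sketch: PROBE and instance_reachable are hypothetical names, and ping is only an assumed stand-in for the login test described in the steps.

```shell
#!/bin/sh
# Sketch only: PROBE and instance_reachable are illustrative names.
# PROBE is the command used to test connectivity; the IP is appended as its
# last argument. ping is an assumption; an SSH login would match the steps
# above more closely.
PROBE="${PROBE:-ping -c 1 -W 2}"

instance_reachable() {
    # $1: public IP previously assigned to the running instance
    if $PROBE "$1" >/dev/null 2>&1; then
        echo "reachable: $1"
    else
        echo "unreachable: $1"
        return 1
    fi
}
```

Running it against the instance's public IP before and after the CC reboot gives a quick pass/fail signal for the verification.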

Regression potential:
The discoverability of the new CLEAN option could be improved.

===================

Revision history for this message
Thierry Carrez (ttx) wrote :

Could you explain what exactly this "network state" covers and why we would generally like to keep it over CC restarts or system reboots?

Changed in eucalyptus (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Daniel Nurmi (nurmi) wrote :

Network state refers to instance IP addresses and active security group rules (user-defined instance network ingress policies). It is important to be able to maintain this state in the following scenario:

Eucalyptus is up and running
Users are running VMs and logging in/accessing them via the network
The machine on which the CC is running fails or reboots

Currently, if this happens, all currently running VMs will lose network connectivity when the CC state is cleared. If the files in '/var/lib/eucalyptus/CC' are not removed, then the CC will restore the system's network state when the machine comes back and the CC is restarted. Otherwise, all instances will need to be terminated and restarted.

This issue arises when Eucalyptus is in MANAGED, MANAGED-NOVLAN, and STATIC networking modes.
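To check whether a given CC runs in one of the affected modes, something like the following could be used. It is a sketch under assumptions: the VNET_MODE key and the /etc/eucalyptus/eucalyptus.conf path are assumed to match the local installation, and mode_is_affected is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: mode_is_affected is an illustrative name. The VNET_MODE key
# and the default CONF path are assumptions about the installation.
CONF="${CONF:-/etc/eucalyptus/eucalyptus.conf}"

mode_is_affected() {
    # Take the last VNET_MODE assignment and strip optional quotes.
    mode=$(sed -n 's/^VNET_MODE=//p' "$CONF" | tail -n 1 | tr -d "\"'")
    case "$mode" in
        MANAGED|MANAGED-NOVLAN|STATIC) return 0 ;;
        *) return 1 ;;
    esac
}
```

The function returns success only for the three modes named above; any other value, or a missing setting, is treated as not affected.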

Thierry Carrez (ttx)
Changed in eucalyptus (Ubuntu):
status: Incomplete → Triaged
Thierry Carrez (ttx)
Changed in eucalyptus (Ubuntu Karmic):
milestone: none → karmic-updates
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Thierry Carrez (ttx) wrote :

Agreed, filed bug 464384 to track potential improvement in making CLEAN=1 a little more discoverable.

Mathias Gug (mathiaz)
Changed in eucalyptus (Ubuntu Karmic):
assignee: nobody → Mathias Gug (mathiaz)
Mathias Gug (mathiaz)
Changed in eucalyptus (Ubuntu Karmic):
status: Triaged → In Progress
Mathias Gug (mathiaz)
description: updated
Mathias Gug (mathiaz)
Changed in eucalyptus (Ubuntu Karmic):
status: In Progress → Fix Committed
Revision history for this message
Steve Langasek (vorlon) wrote :

please don't set 'fix committed' on bugs for SRUs, this conflicts with the SRU team's use of this state.

Changed in eucalyptus (Ubuntu Karmic):
status: Fix Committed → In Progress
Mathias Gug (mathiaz)
tags: added: uec
Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Accepted eucalyptus into karmic-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in eucalyptus (Ubuntu Karmic):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Thierry Carrez (ttx) wrote :

Verified fixed

tags: added: verification-done
removed: verification-needed
Revision history for this message
Thierry Carrez (ttx) wrote :

Still fixed in 7.2

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6~bzr931-0ubuntu7.3

---------------
eucalyptus (1.6~bzr931-0ubuntu7.3) karmic-proposed; urgency=low

  * debian/eucalyptus-cc.postinst: restart avahi daemon so that it uses
    eucalyptus specific configuration file (LP: #458904).
  * debian/eucalyptus-cc.eucalyptus-cc-publication{,-ip}.upstart: Respawn
    avahi publication jobs if they die (LP: #480885).

eucalyptus (1.6~bzr931-0ubuntu7.2) karmic-proposed; urgency=low

  [ Dustin Kirkland ]
  * cluster/handlers.c: euca_rootwrap rework did not whitelist powerwake;
    however, powerwake does *not* need root privs, drop euca_rootwrap wrapper
    (LP: #458163)
  * debian/rules, debian/euclayptus-cc.install: install the avahi-daemon.conf
    in /etc/eucalyptus, (LP: #458904).

  [ Thierry Carrez ]
  * clc/modules/www/src/main/java/edu/ucsb/eucalyptus/admin/public/EucalyptusWebInterface.html:
    Fix HTML title in the web UI for more consistency in naming (LP: #455293)
  * debian/eucalyptus-common.eucalyptus.upstart: Add -l to eucalyptus-cloud
    options so that cloud-output.log is affected by LOGLEVEL (LP: #458001)

  [ Mathias Gug ]
  * cluster/handlers.c: Fix the networkIndex returned by describeInstances.
    (LP: #454405 - upstream revno 933).
  * debian/eucalyptus-cc.eucalyptus-cc-publication{,-ip}.upstart: add an
    upstart job to explicitly publish the IP/CC hostname mapping via avahi
    instead of publishing the CC IP address via the service name (LP: #458904).
  * debian/avahi-daemon.conf: ship a specific avahi-daemon configuration file
    that doesn't publish IP addresses by default. (LP: #458904).
  * debian/eucalyptus-cloud.postinst: Fix postfix configuration to accept
    confirmation emails sent by eucalyptus (LP: #459101)
  * debian/eucalyptus-cc.upstart: Don't clean the CC network state when the CC is
    stopped by default (LP: #460089).
 -- Mathias Gug <email address hidden> Wed, 11 Nov 2009 15:15:48 -0500

Changed in eucalyptus (Ubuntu Karmic):
status: Fix Committed → Fix Released
Revision history for this message
Martin Pitt (pitti) wrote :

I copied the karmic-proposed package to lucid. For karmic-updates it is still missing two verifications.

Changed in eucalyptus (Ubuntu):
status: Triaged → Fix Released
Changed in eucalyptus (Ubuntu Karmic):
status: Fix Released → Fix Committed
Revision history for this message
Thierry Carrez (ttx) wrote :

The fix introduced a small regression, bug 490382

tags: added: verification-failed
removed: uec verification-done
Revision history for this message
Thierry Carrez (ttx) wrote :

Also, the fix should implement "start eucalyptus CLEAN=1", so it's incomplete.

Revision history for this message
Martin Pitt (pitti) wrote :

The 1 vs. "1" turned out to be a red herring, and it's just the same race condition as in the final version. (Discussed with Dustin). So this should be good to go.

tags: added: verification-done
removed: verification-failed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6~bzr931-0ubuntu7.3

Changed in eucalyptus (Ubuntu Karmic):
status: Fix Committed → Fix Released