Log curtin config for debugging purposes.

Bug #1768911 reported by Jason Hobbs
This bug affects 1 person
Affects   Status         Importance   Assigned to        Milestone
MAAS      Fix Released   Medium       Andres Rodriguez
curtin    Invalid        Undecided    Unassigned

Bug Description

As in bug 1768893, it's common for curtin config information to be requested to debug issues with maas/curtin.

Since this is dynamically generated, it's not stored anywhere on the maas file system or db. Additionally, it can only be retrieved while a node is in DEPLOYING state. Once the node has entered FAILED DEPLOYMENT, it's too late.

So, we don't have a good way to capture this during automated tests. The lack of this ability prevents us from getting the required logs to debug blocker issues, so this issue is itself a blocker.

tags: added: cdo-qa cdo-qa-blocker foundations-engine
description: updated
Blake Rouse (blake-rouse) wrote :

You can use the get-curtin-config API to get the config even when the machine is in failed deployment. It is not too late.
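
For reference, a minimal sketch of how a test harness might call this from Python through the MAAS CLI. It assumes the CLI exposes the operation as `maas <profile> machine get-curtin-config <system_id>` (worth confirming with the CLI help on your MAAS version) and that the profile is already logged in. The function names, profile name and system ID are placeholders, not part of MAAS.

```python
import subprocess


def curtin_config_command(profile, system_id):
    """Build the CLI invocation for fetching a machine's curtin config.

    Assumption: the MAAS CLI exposes the get-curtin-config operation as
    `maas <profile> machine get-curtin-config <system_id>`.
    """
    return ["maas", profile, "machine", "get-curtin-config", system_id]


def fetch_curtin_config(profile, system_id, run=subprocess.run):
    """Run the command and return whatever config text the CLI prints.

    `run` is injectable so the function can be exercised without a MAAS server.
    """
    result = run(
        curtin_config_command(profile, system_id),
        check=True,           # raise if the CLI exits non-zero
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout to str
    )
    return result.stdout
```

Calling `fetch_curtin_config("admin", "abc123")` would then return the config for that (hypothetical) machine as a string, ready to be written alongside the other collected logs.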

Changed in maas:
status: New → Invalid
Changed in curtin:
status: New → Invalid
Jason Hobbs (jason-hobbs) wrote :

Ok - that's great. I'll implement that. I still shouldn't have to, so this is still a valid bug. curtin and maas should be logging stuff like this on their own without everyone using them needing to implement that via the API. It's no different than capturing curtin's output logs.

Changed in curtin:
status: Invalid → New
Changed in maas:
status: Invalid → New
Blake Rouse (blake-rouse) wrote :

The curtin output is logged in MAAS; you can pull it over the API.

maas {profile} installation-results

I am confused about what more you want or need. You can get both the config and the logs.

Changed in maas:
status: New → Incomplete
Jason Hobbs (jason-hobbs) wrote :

Blake,

I do not want to add more log collection code every time we hit a new bug. Now that we're capturing both the DB and the maas logs for every run, that really should be sufficient.

To capture this output, I have to add some code to check whether we have failed deployments, then add API calls to grab the curtin config, and then some more code to put the curtin config somewhere it will get archived with the rest of the logs. That's fine, I'll do it because I want to get this bug fixed, but I consider it a workaround for something missing in maas and something we don't want to carry in our code base in the long run.

I've also seen the maas team ask lots of other people to go and capture curtin config to debug issues. Anyone using MAAS in a CI loop would have to implement the API calls I mentioned above in their own way. Anyone not using a CI loop might have already released the machines and moved on by the time you've asked for the config info. If it's a race condition, it might not be easy to reproduce again. All of that could be avoided if maas just persisted this stuff in the first place, like it does the curtin output. The curtin input is at least as important to persist as the output.
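
The workaround described above (find failed deployments, fetch each curtin config, store it with the archived logs) might be sketched roughly as follows. This is an illustration under assumptions, not the code actually used in the test suite: it assumes each machine record from the MAAS API carries `system_id` and `status_name` fields and that a failed deployment reads "Failed deployment". `fetch_config` and the output file naming are hypothetical.

```python
from pathlib import Path

# Assumption: machine records from the MAAS API include a human-readable
# "status_name", and a failed deployment is reported as "Failed deployment".
FAILED_STATUS = "Failed deployment"


def failed_deployments(machines):
    """Return the machine records whose deployment failed."""
    return [m for m in machines if m.get("status_name") == FAILED_STATUS]


def archive_curtin_configs(machines, fetch_config, archive_dir):
    """Save the curtin config of every failed machine under archive_dir.

    fetch_config is a caller-supplied (hypothetical) function that takes a
    system_id and returns the curtin config as text, for example by calling
    the get-curtin-config API.
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for machine in failed_deployments(machines):
        system_id = machine["system_id"]
        path = archive_dir / f"curtin-config-{system_id}.yaml"
        path.write_text(fetch_config(system_id))
        written.append(path)
    return written
```

Here `machines` would be the parsed JSON list of machines from the API. Every MAAS user running a CI loop would need some variant of this, which is the maintenance burden the comment objects to.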

Changed in maas:
status: Incomplete → New
Jason Hobbs (jason-hobbs) wrote :

I removed the cdo-qa-blocker tag since we have a way to get the curtin config via the API. This should stay open, though, until the curtin config is captured either via a db dump or in /var/log/maas.

tags: removed: cdo-qa-blocker
Ryan Harper (raharper) wrote : Re: [Bug 1768911] Re: curtin configuration for node during installation should be logged by default

Curtin already logs config by default.

The merged configuration is logged and sent to serial console.

[ 61.403846] cloud-init[1445]: curtin: Installation started. (17.1-62-g9dbe45e5)
[ 61.406329] cloud-init[1445]: LANG=None
[ 61.407138] cloud-init[1445]: merged config: {'apt_proxy': '', 'sources': {'00_cmdline': {'type': 'tgz', 'uri': 'cp:///media/root-ro'}},

Additionally, on error, curtin now collects debug information and can
post that back to maas.

I think the part that may be missing is having maas configure the
install error_tarfile setting:

error_tarfile: <path to write a tar of Curtin’s log and configuration
data in the event of an error>

If error_tarfile is not None and curtin encounters an error, this
tarfile will be created. It includes logs, configuration and system
info to aid triage and bug filing. When unset, error_tarfile defaults
to /var/log/curtin/curtin-logs.tar.

Maas could do:

install:
   log_file: /tmp/install.log
   error_tarfile: /tmp/curtin-error-logs.tar
   post_files:
     - /tmp/install.log
     - /var/log/syslog
     - /tmp/curtin-error-logs.tar

On error, this will get posted back like the install log and it
includes the configs, and other system data that's helpful for
debugging.
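
Once such a tarball has been posted back and retrieved, a quick way to see what it captured is to list its members. A minimal sketch using only the Python standard library follows. The function name is illustrative, the path in the usage note is the one from the example config above, and nothing here assumes particular file names inside the archive.

```python
import tarfile


def list_error_tarfile(path):
    """Return the sorted names of the regular files stored in the tarball."""
    with tarfile.open(path) as tar:
        # Skip directories and other non-file entries.
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

For example, `list_error_tarfile("/tmp/curtin-error-logs.tar")` returns the archived file names, which is usually enough to confirm that the configuration was captured before attaching the tarball to a bug.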

Finally, it would be good to extend the maas api to get the
install.log file (and error-logs.tar if present) to aid in bug filing.

Ryan

On Fri, May 4, 2018 at 1:06 AM, Jason Hobbs <email address hidden> wrote:
> I removed the cdo-qa-blocker tag since we have a way to get the curtin
> config via the API. This should stay open, though, until the curtin
> config is captured either via a db dump or in /var/log/maas.

Changed in maas:
status: New → Triaged
milestone: none → 2.4.x
Changed in maas:
milestone: 2.4.x → 2.4.0rc1
summary: - curtin configuration for node during installation should be logged by
- default
+ Log curtin config for debugging purposes.
Andres Rodriguez (andreserl) wrote :

The curtin side is tracked in: https://bugs.launchpad.net/maas/+bug/1746760

Changed in maas:
importance: Undecided → Medium
importance: Medium → Low
importance: Low → Medium
assignee: nobody → Andres Rodriguez (andreserl)
status: Triaged → In Progress
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Changed in curtin:
status: New → Invalid