[2.1,2.2] cloud-init/curtin http status updates cause high CPU usage

Bug #1648456 reported by David Britton
This bug affects 2 people
Affects: MAAS
Status: Fix Released
Importance: Critical
Assigned to: Blake Rouse

Bug Description

With MAAS 2.1+ and cloud-init and curtin on xenial, lots of HTTP POSTs come back for all actions. According to the MAAS folks, each POST opens its own database connection, and each one in turn goes to the web client to update the UI event log.

This combination pegs all CPUs even while only a few nodes are deploying.

To repro:

Use MAAS 2.1+ and deploy 10 xenial machines. You will notice very large CPU spikes.
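
To put a number on those spikes, something like the following could be left running on the region controller during the deploy. It is only a rough sketch that reads the aggregate `cpu` line of `/proc/stat` (Linux only); the function names `parse_cpu_line`, `busy_fraction` and `watch_cpu` are made up for this example and are not part of MAAS.

```python
import time


def parse_cpu_line(line):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Columns: user nice system idle iowait irq softirq steal.
    # Guest time is already included in user/nice, so stop at eight fields.
    values = [int(v) for v in fields[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return idle, sum(values)


def busy_fraction(before, after):
    """Fraction of CPU time spent non-idle between two (idle, total) samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 1.0 - idle_delta / total_delta


def watch_cpu(count=60, interval=1.0, path="/proc/stat"):
    """Print one overall CPU busy reading per interval, count times."""
    with open(path) as f:
        previous = parse_cpu_line(f.readline())
    for _ in range(count):
        time.sleep(interval)
        with open(path) as f:
            current = parse_cpu_line(f.readline())
        print("{:.0%} busy".format(busy_fraction(previous, current)))
        previous = current
```

Calling `watch_cpu()` before starting the deploy and comparing the readings against an idle baseline should make the spike visible without any extra packages.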

The slowness is the contributing factor in bug #1604962.

Tags: landscape


Changed in maas:
status: New → Triaged
importance: Undecided → Critical
milestone: none → 2.2.0
summary: - maas2, xenial, cloud-init/curtin http status updates cause extreme
- slowness in MAAS
+ [2.1,2.2] cloud-init/curtin http status updates cause high CPU usage
tags: removed: kanban-cross-team
David Britton (dpb)
Changed in landscape:
milestone: none → 16.11
Mike Pontillo (mpontillo) wrote :

This is a shot in the dark, but could you try the following to see if this improves performance:

    $ sudo maas-region dbshell

    maasdb=# ALTER DATABASE maasdb SET synchronous_commit TO off;

If that doesn't help, then try this (`commit_delay` is in microseconds, so this sets a 10 ms delay):

    maasdb=# ALTER DATABASE maasdb SET commit_delay TO 10000;

If you later want to return to the default settings, you can do:

    maasdb=# ALTER DATABASE maasdb RESET synchronous_commit;
    maasdb=# ALTER DATABASE maasdb RESET commit_delay;

Be sure not to leave the `dbshell` open. I've seen weird behavior in MAAS when this happens (maybe because it's taking up a database connection; not sure).

The other thing you can try is persistent database connections in Django (often loosely called connection pooling). Open the MAAS Django settings file, such as:

    $ sudo vi $(dpkg -L maas-region-api | grep settings.py$)

Then find the spot in the file that configures the database. It should look like this:

    # Database access configuration.
    try:
        with RegionConfiguration.open() as config:
            DATABASES = {
                'default': {
                    'ENGINE': 'django.db.backends.postgresql_psycopg2',
                    'NAME': config.database_name,
                    'USER': config.database_user,
                    'PASSWORD': config.database_pass,
                    'HOST': config.database_host,
                    'PORT': str(config.database_port),
                }
            }

Change it to look like this:

    # Database access configuration.
    try:
        with RegionConfiguration.open() as config:
            DATABASES = {
                'default': {
                    'ENGINE': 'django.db.backends.postgresql_psycopg2',
                    'NAME': config.database_name,
                    'USER': config.database_user,
                    'PASSWORD': config.database_pass,
                    'HOST': config.database_host,
                    'PORT': str(config.database_port),
                    'CONN_MAX_AGE': 600,
                }
            }

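Note the addition of CONN_MAX_AGE. Strictly speaking this enables Django persistent connections rather than a true pool: each worker thread keeps its connection open and reuses it for up to 600 seconds instead of reconnecting on every request.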
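
To illustrate why this might help, here is a toy model of the documented CONN_MAX_AGE semantics (0 closes the connection after every request, a positive number keeps it for that many seconds, None keeps it indefinitely). It is not Django's or MAAS's code; `FakeDatabase`, `Worker` and `count_connections` are hypothetical names, and the model only counts how many connections a single worker would open.

```python
class FakeDatabase:
    """Stand-in for PostgreSQL that only counts opened connections."""

    def __init__(self):
        self.opened = 0

    def connect(self):
        self.opened += 1
        return object()


class Worker:
    """One request-handling thread following the CONN_MAX_AGE rules."""

    def __init__(self, db, conn_max_age):
        self.db = db
        self.conn_max_age = conn_max_age
        self.conn = None
        self.opened_at = None

    def handle_request(self, now):
        # A connection is obsolete once it has reached its maximum age.
        expired = (
            self.conn is not None
            and self.conn_max_age is not None
            and now - self.opened_at >= self.conn_max_age
        )
        if self.conn is None or expired:
            self.conn = self.db.connect()
            self.opened_at = now
        # ... the request would use self.conn here ...
        if self.conn_max_age == 0:
            # Default behaviour: drop the connection when the request ends.
            self.conn = None


def count_connections(conn_max_age, request_times):
    """Return how many connections one worker opens for the given requests."""
    db = FakeDatabase()
    worker = Worker(db, conn_max_age)
    for now in request_times:
        worker.handle_request(now)
    return db.opened


# One status POST every 2 seconds for 20 minutes (600 requests).
posts = range(0, 1200, 2)
print(count_connections(0, posts))    # default: one connection per request
print(count_connections(600, posts))  # with 'CONN_MAX_AGE': 600
```

In this model the default opens 600 connections and the 600-second setting opens 2. Real savings depend on how many threads and processes the region controller runs, since each one holds its own connection.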

Disclaimer: all of the above is at your own risk. While my limited research leads me to believe it might help, I have not tested these options extensively and make no guarantees. But please let me know if it works. ;-)

Changed in landscape:
milestone: 16.11 → 16.12
Changed in landscape:
status: New → Triaged
importance: Undecided → Critical
importance: Critical → High
Changed in landscape:
milestone: 16.12 → 17.01
Chad Smith (chad.smith)
Changed in landscape:
milestone: 17.01 → 17.02
Changed in maas:
status: Triaged → In Progress
assignee: nobody → Blake Rouse (blake-rouse)
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Chad Smith (chad.smith)
Changed in landscape:
milestone: 17.02 → 17.03
David Britton (dpb)
no longer affects: landscape