juju destroy-environment -y local --force does not stop jujud/mongod processes on broken bootstrap

Bug #1292275 reported by Dave Cheney
This bug affects 6 people
Affects     Status         Importance   Assigned to   Milestone
juju-core   Fix Released   High         Tim Penhey

Bug Description

Situation.

$ juju bootstrap -e local --upload-tools

failed because /usr/bin/mongod is not installed, see LP #1271937

$ juju destroy-environment -y local

fails because the environment is not bootstrapped

ERROR state/api: websocket.Dial wss://10.0.3.1:17070/: dial tcp 10.0.3.1:17070: connection refused
ERROR state/api: websocket.Dial wss://10.0.3.1:17070/: dial tcp 10.0.3.1:17070: connection refused
ERROR state/api: websocket.Dial wss://10.0.3.1:17070/: dial tcp 10.0.3.1:17070: connection refused
ERROR state/api: websocket.Dial wss://10.0.3.1:17070/: dial tcp 10.0.3.1:17070: connection refused
ERROR state/api: websocket.Dial wss://10.0.3.1:17070/: dial tcp 10.0.3.1:17070: connection refused
ERROR state/api: websocket.Dial wss://10.0.3.1:17070/: dial tcp 10.0.3.1:17070: connection refused
ERROR state/api: websocket.Dial wss://10.0.3.1:17070/: dial tcp 10.0.3.1:17070: connection refused
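
The repeated "connection refused" errors mean nothing is listening on the API port. A minimal sketch of checking that directly, assuming a plain TCP connect is an acceptable stand-in for the websocket dial (it skips TLS and the API handshake). `port_is_open` is a hypothetical helper, not part of juju:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts and unreachable hosts.
        return False
```

With the address from the errors above, `port_is_open("10.0.3.1", 17070)` would be expected to return False while the bootstrap is broken.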

$ juju destroy-environment -y local --force

works, but does not stop the jujud processes

ubuntu@winton-02:~/src/launchpad.net/juju-core$ pstree
init─┬─acpid
     ├─atd
     ├─cron
     ├─dbus-daemon
     ├─dnsmasq
     ├─7*[getty]
     ├─irqbalance
     ├─rsyslogd───3*[{rsyslogd}]
     ├─sshd───sshd───sshd───bash───tmux
     ├─systemd-logind
     ├─systemd-udevd
     ├─tmux─┬─bash───vim───{vim}
     │      └─bash─┬─juju─┬─sudo───bash───bash───jujud───6*[{jujud}] // oops
     │             │      └─6*[{juju}]
     │             └─pstree
     ├─upstart-file-br
     ├─upstart-socket-
     └─upstart-udev-br

ubuntu@winton-02:~/src/launchpad.net/juju-core$ pgrep -lf jujud
15368 jujud
ubuntu@winton-02:~/src/launchpad.net/juju-core$ sudo !!
sudo pgrep -lf jujud
15368 jujud
15516 sudo
ubuntu@winton-02:~/src/launchpad.net/juju-core$ cat /proc/15368/cmdline
/home/ubuntu/.juju/local/tools/1.17.5.1-trusty-ppc64/jujudbootstrap-state--data-dir/home/ubuntu/.juju/local--env-configYWRtaW4tc2VjcmV0OiAiIgphZ2VudC12ZXJzaW9uOiAxLjE3LjUuMQphcGktcG9ydDogMTcwNzAKYXB0LWZ0cC1wcm94eTogIiIKYXB0LWh0dHAtcHJveHk6ICIiCmFwdC1odHRwcy1wcm94eTogIiIKYXV0aG9yaXplZC1rZXlzOiAnc3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEQ2EzL2lDRHBwQkZQemlZbzdqblJDdFRwRkdLdnppQm1FKzloaDFiUnd3eXNjanczTkR5UmxWY2NJTFB1ZVczS3BicG5ZOEx3d3BRN3FhVVNscVFGcTkyTnRGUEdhYTJlMndGSVFIYnlxc1VQQ1B4dEY1d1lxUVhYY0dUcFRqK3pRS3dNNmxmeWxKUStZeWJ6Nk1jbGRKYUx2VitYcWFvRXFBYXNpclBmMytBZ0duVE5KL25md2Fkd3RnbVF5cDMyUUY0REplZGdnSGxwWk92ejVmVm5PYjMzdUhOTU0xS2x4Q0VJaDNFTWgxWklwdG1BaDhyV3I1T0w4WlM0eCt6RjdVL01VSnp6K1hsYnFZYVI3SkdPT2tKUmMyTWtpNlBOb0FLQ09rRlNnOHJNdFF5NW5UcXpNakR5K0VwdFM4WEFkLzhLeFUxWi81dXQyZk5UM2k4VkoKICBqdWp1LWNsaWVudC1rZXkKCiAgc3NoLXJzYSBBQUFBQjNOemFDMXljMkVBQUFBREFRQUJBQUFCQVFEeDAzN3JsYTd6ZDFXaE10ZzM3dlB5Rk5pYmhYZmZGbVJyT0c5YzRzbENMSjBxaVRNMGFsREkzT2dEdnpzQWU0ZmVGa085WEREMTFDbWtxcXdYV29Gd2QzTlAzNlVzNWI4Tys1bis2bExzNGVHL1pZYy9MQzdpeVBNdk12MHcvM2JwU2J6QU1CUlI3UjdndXEzUzZQM1FzNC9hVWtTWVBNbW9WcENLYW1LN1B2MlVUbEVlWnFJRkRIN1hvUjI2WHlGcFZ2QWFidndQY0JmbFJkUkJseGJsRWNSRVVaVWJCQ1R3dG1tK2k4KzBPd25JM2o4ODhGVktoazZsbUNjbzc4YWVvVXJ0NUs4c3JlMzZIL1hLdnhyenpLUHR2S24rbDlvOWVzM3NTejZwRmpWNjJ5NUszK3pzaXBMU0xqKzF0Ry9iRkZvUHNQSVMySVBRblJwV3lBSUwKICBqdWp1LXN5c3RlbS1rZXkKCicKYm9vdHN0cmFwLWFkZHJlc3Nlcy1kZWxheTogMTAKYm9vdHN0cmFwLWlwOiAxMC4wLjMuMQpib290c3RyYXAtcmV0cnktZGVsYXk6IDUKYm9vdHN0cmFwLXRpbWVvdXQ6IDYwMApjYS1jZXJ0OiAnLS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCgogIE1JSUNXekNDQWNhZ0F3SUJBZ0lCQURBTEJna3Foa2lHOXcwQkFRVXdRekVOTUFzR0ExVUVDaE1FYW5WcWRURXkKCiAgTURBR0ExVUVBd3dwYW5WcWRTMW5aVzVsY21GMFpXUWdRMEVnWm05eUlHVnVkbWx5YjI1dFpXNTBJQ0pzYjJOaAoKICBiQ0l3SGhjTk1UUXdNekV6TWpNek5qUXpXaGNOTWpRd016RXpNak0wTVRReVdqQkRNUTB3Q3dZRFZRUUtFd1JxCgogIGRXcDFNVEl3TUFZRFZRUUREQ2xxZFdwMUxXZGxibVZ5WVhSbFpDQkRRU0JtYjNJZ1pXNTJhWEp2Ym0xbGJuUWcKCiAgSW14dlkyRnNJakNCbnpBTkJna3Foa2lHOXcwQkFRRUZBQU9CalFBd2dZa0NnWUVBeXkvWS9OMmxEWjY3S
2xhTQoKICBCOGtZcGdNcXcvSFp2V203RHVVUDdsT1JmWVJMMWMzS0lYQlJDR1NPOVFhdEQ4UHVOUFhFOVhFYm9oNjl5YlEyCgogIGxTQUVmUjdJclNvenBqSklZSkV0WURkcWFRRFhZclF2NDgrcnlLTXJNZ21jMGtiYUJhN0pXTWtHcDBGRlJqdkcKCiAgalBMU0tXS3pCZ3Rrem1FVkFRdVNDWXpGTGhVQ0F3RUFBYU5qTUdFd0RnWURWUjBQQVFIL0JBUURBZ0NrTUE4RwoKICBBMVVkRXdFQi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZKSFhPOTEzRGV4SElmNHVNRklDWktlZ1NRTFFNQjhHCgogIEExVWRJd1FZTUJhQUZKSFhPOTEzRGV4SElmNHVNRklDWktlZ1NRTFFNQXNHQ1NxR1NJYjNEUUVCQlFPQmdRQkoKCiAgMFVYeUFsRFhXWklmazhkSVJjOGFqTzVyc2o2Qmx0RUhneFFKRFcvSlFTQUdoSWRSYzdjQlNqdm9vWllKQ1BRNQoKICBJMTBUeFd5NXdKSUhtaHVUVlNvN3kzU3UzMVBNRUs2RUkyZEFnL3pQRzdSNm1haWUzaTV3L3V2MVg5ZElzUmNsCgogIFBscFkvbDFpSXZEd0VJMFY0eCtNSUpWTElkcFB6OTdLTUZZNUpmNUNqZz09CgogIC0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0KCicKY2EtcHJpdmF0ZS1rZXk6ICIiCmNoYXJtLXN0b3JlLWF1dGg6ICIiCmNvbnRhaW5lcjogbHhjCmRlZmF1bHQtc2VyaWVzOiBwcmVjaXNlCmRldmVsb3BtZW50OiBmYWxzZQpmaXJld2FsbC1tb2RlOiBpbnN0YW5jZQpmdHAtcHJveHk6ICIiCmh0dHAtcHJveHk6IGh0dHA6Ly8xMC4yNDUuNjQuMTozMTI4LwpodHRwcy1wcm94eTogaHR0cDovLzEwLjI0NS42NC4xOjMxMjgvCmltYWdlLW1ldGFkYXRhLXVybDogIiIKaW1hZ2Utc3RyZWFtOiAiIgpsb2dnaW5nLWNvbmZpZzogPHJvb3Q+PURFQlVHO3VuaXQ9REVCVUcKbmFtZTogbG9jYWwKbmFtZXNwYWNlOiB1YnVudHUtbG9jYWwKbmV0d29yay1icmlkZ2U6IGx4Y2JyMApuby1wcm94eTogIiIKcm9vdC1kaXI6IC9ob21lL3VidW50dS8uanVqdS9sb2NhbApzc2wtaG9zdG5hbWUtdmVyaWZpY2F0aW9uOiB0cnVlCnN0YXRlLXBvcnQ6IDM3MDE3CnN0b3JhZ2UtcG9ydDogODA0MApzeXNsb2ctcG9ydDogNjUxNAp0ZXN0LW1vZGU6IGZhbHNlCnRvb2xzLW1ldGFkYXRhLXVybDogIiIKdG9vbHMtdXJsOiAiIgp0eXBlOiBsb2NhbAo=--debug
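
The arguments above run together because /proc/<pid>/cmdline separates them with NUL bytes, which `cat` prints as nothing. A rough sketch of splitting the file and decoding the value that follows `--env-config`, assuming (as the output suggests) that the value is base64-encoded text. Both function names are made up for illustration:

```python
import base64
from typing import List


def split_cmdline(raw: bytes) -> List[str]:
    """Split the raw contents of /proc/<pid>/cmdline into arguments."""
    parts = raw.split(b"\0")
    # The final argument is NUL-terminated too, leaving one empty piece.
    if parts and parts[-1] == b"":
        parts.pop()
    return [p.decode("utf-8", errors="replace") for p in parts]


def decode_env_config(args: List[str]) -> str:
    """Return the decoded text after --env-config, or '' if it is absent."""
    for i, arg in enumerate(args[:-1]):
        if arg == "--env-config":
            return base64.b64decode(args[i + 1]).decode("utf-8")
    return ""
```

Usage would be along the lines of `split_cmdline(open("/proc/15368/cmdline", "rb").read())`. Note that the decoded config includes keys and certificates, so it is worth redacting before pasting it anywhere public.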

So, now we can bootstrap again and have TWO sets of local provider processes running

ubuntu@winton-02:~/src/launchpad.net/juju-core$ pgrep -lf juju
15310 juju
15368 jujud
15521 juju
15579 jujud

This will not end well
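
A small sketch for spotting this situation from `pgrep -l` style output ("PID name" per line), assuming that more than one PID per name means a leftover process. The helper names are hypothetical:

```python
from collections import defaultdict
from typing import Dict, List


def group_pids_by_name(pgrep_output: str) -> Dict[str, List[int]]:
    """Group PIDs by process name from lines of the form 'PID name'."""
    groups = defaultdict(list)
    for line in pgrep_output.splitlines():
        line = line.strip()
        if not line:
            continue
        pid, _, name = line.partition(" ")
        groups[name].append(int(pid))
    return dict(groups)


def duplicated_names(groups: Dict[str, List[int]]) -> List[str]:
    """Return the process names that have more than one PID."""
    return sorted(name for name, pids in groups.items() if len(pids) > 1)
```

Fed the four lines above, this reports both `juju` and `jujud` as duplicated.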

Dave Cheney (dave-cheney) wrote : Re: [Bug 1292275] [NEW] juju destroy-environment -y local --force does not stop jujud processes

top - 00:06:40 up 2 days, 23:09, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 82 total, 1 running, 76 sleeping, 5 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.2 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 8255588 total, 1005912 used, 7249676 free, 88848 buffers
KiB Swap: 0 total, 0 used, 0 free. 685768 cached Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15310 ubuntu 20 0 370348 78212 12620 T 0.0 0.9 0:10.98 juju
15368 root 20 0 403920 34504 11592 T 0.0 0.4 0:00.16 jujud

^ T state, ooh err.

Dave Cheney (dave-cheney) wrote : Re: juju destroy-environment -y local --force does not stop jujud processes

T state means someone sent STOP to the process, or it's being traced (unlikely)

Possibly the shutdown code is sending SIGSTOP rather than SIGTERM or SIGKILL. Possibly this is coming from upstart rather than the local provider.
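
One way to tell the two cases apart, sketched under the assumption of a Linux /proc layout: proc(5) documents `T` as stopped on a signal and, on Linux 2.6.33 and later, lowercase `t` as tracing stop, and a nonzero `TracerPid` in /proc/<pid>/status means the process is being traced. The function names here are illustrative only:

```python
def parse_stat_state(stat_line: str) -> str:
    """Return the one-letter state from the text of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split after the last closing parenthesis.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def parse_tracer_pid(status_text: str) -> int:
    """Return TracerPid from the text of /proc/<pid>/status (0 = untraced)."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1])
    return 0
```

For the jujud process above, that would mean reading /proc/15368/stat and /proc/15368/status and passing the contents to these helpers.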

John A Meinel (jameinel) wrote :

I think, actually, that Upstart uses ptrace for processes it spawns so that it can see when that process spawns children:
https://blueprints.launchpad.net/ubuntu/+spec/foundations-q-upstart-overcome-ptrace-limitations

So probably the T is because it actually *is* being traced.

Tim Penhey (thumper)
summary: juju destroy-environment -y local --force does not stop jujud processes
+ on broken bootstrap
John A Meinel (jameinel)
Changed in juju-core:
milestone: 1.18.0 → 1.17.6
Dave Cheney (dave-cheney) wrote : Re: [Bug 1292275] Re: juju destroy-environment -y local --force does not stop jujud processes

oh. my. god. That is going to end terribly. Is there a switch to disable
that?

On Mon, Mar 17, 2014 at 3:25 PM, John A Meinel <email address hidden> wrote:

> I think, actually, that Upstart uses ptrace for processes it spawns so
> that it can see when that process spawns children:
>
> https://blueprints.launchpad.net/ubuntu/+spec/foundations-q-upstart-overcome-ptrace-limitations
>
> So probably the T is because it actually *is* being traced.

John A Meinel (jameinel)
Changed in juju-core:
assignee: nobody → Andrew Wilkins (axwalk)
status: Triaged → In Progress
Andrew Wilkins (axwalk) wrote : Re: juju destroy-environment -y local --force does not stop jujud processes on broken bootstrap

The termination code sends SIGABRT (slightly abusive, but SIGTERM is used by upstart).

The problem here is that since mongod doesn't exist, the bootstrap agent never finishes. The agent never gets as far as starting the signal-handler worker, which is why termination/uninstall doesn't work.

Since this only happened on a momentarily broken trunk (trunk is fixed now, you can't bootstrap without mongod anymore), I'm not convinced there's anything to do.

Andrew Wilkins (axwalk) wrote :

BTW the bootstrap agent is run directly, not through upstart. Seems likely to me that it just got ctrl-z'd.

John A Meinel (jameinel)
Changed in juju-core:
importance: Critical → High
Andrew Wilkins (axwalk) wrote :

John, I don't think there's anything to do here for now. If the bootstrap agent hangs then something is fundamentally broken, and requires user intervention.

There's an argument to be made for having a more intelligent destroy-env to handle this and similarly broken cases, but I think this should be marked Low and have its milestone removed. I'll wait for you before doing that.

Andrew Wilkins (axwalk)
Changed in juju-core:
status: In Progress → Triaged
Andrew Wilkins (axwalk) wrote :

I've just confirmed that destroy-environment --force *does* kill the bootstrap agent, but obviously not if the process is stopped (in which case the signal won't be acted upon until the process wakes up). I'm closing this; if someone disagrees they can reopen it.
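
The stopped-process behaviour can be reproduced in isolation. This is only a sketch on a throwaway child process, not juju's termination code: it uses SIGTERM by default rather than the SIGABRT mentioned earlier, and `signal_while_stopped` is a made-up name. It assumes a POSIX system:

```python
import os
import signal
import subprocess
import sys
import time


def signal_while_stopped(sig=signal.SIGTERM):
    """Stop a child, signal it, then continue it.

    Returns (alive_while_stopped, return_code_after_continue).
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        os.kill(child.pid, signal.SIGSTOP)
        # Block until the kernel reports the child as stopped, so the
        # next signal cannot be handled before the stop takes effect.
        os.waitpid(child.pid, os.WUNTRACED)
        child.send_signal(sig)
        time.sleep(0.2)
        # The signal stays pending, so the child still exists.
        alive_while_stopped = child.poll() is None
        child.send_signal(signal.SIGCONT)
        # Once continued, the pending signal is delivered.
        return alive_while_stopped, child.wait(timeout=5)
    finally:
        if child.poll() is None:
            child.kill()
            child.wait()
```

SIGKILL is the exception: it takes effect even on a stopped process, which is one option for a forced teardown that cannot assume the agent is running.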

Changed in juju-core:
status: Triaged → Invalid
John A Meinel (jameinel) wrote :

I've just seen that it leaves mongodb running, so I'm going to reopen this.
I'm not sure how I got into this situation, but I bootstrapped the local provider, upgraded it, killed the machine-0 agent a few times, and in the end I had a mongod running that I had to kill manually.

John A Meinel (jameinel) wrote :

I don't think this has to block 1.17.6, but it does still seem to be a bug.

Changed in juju-core:
milestone: 1.17.6 → 2.0
status: Invalid → Triaged
summary: - juju destroy-environment -y local --force does not stop jujud processes
- on broken bootstrap
+ juju destroy-environment -y local --force does not stop jujud/mongod
+ processes on broken bootstrap
John A Meinel (jameinel) wrote :

Note that I do still have:
$ ls /etc/init/juju*
/etc/init/juju-agent-jameinel-local.conf /etc/init/juju-db-jameinel-local.conf

However, those are now listed as "stop/waiting", though that was after I manually killed mongod (upstart didn't seem to want to auto-restart it, which surprises me a bit).
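
A sketch of the same `ls /etc/init/juju*` check as a function, assuming the usual Upstart convention that the job name is the file name without `.conf`. `leftover_juju_jobs` is a hypothetical helper and the directory is a parameter so it can be pointed elsewhere:

```python
import glob
import os
from typing import List


def leftover_juju_jobs(init_dir: str = "/etc/init") -> List[str]:
    """Return the names of juju* Upstart jobs still defined in init_dir."""
    paths = sorted(glob.glob(os.path.join(init_dir, "juju*.conf")))
    return [os.path.splitext(os.path.basename(p))[0] for p in paths]
```

On the machine above this would be expected to list juju-agent-jameinel-local and juju-db-jameinel-local.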

Andrew Wilkins (axwalk)
Changed in juju-core:
status: Triaged → In Progress
Curtis Hovey (sinzui)
tags: added: destroy-environment mongodb
Changed in juju-core:
milestone: 2.0 → 1.17.7
Tim Penhey (thumper)
Changed in juju-core:
assignee: Andrew Wilkins (axwalk) → Tim Penhey (thumper)
Tim Penhey (thumper)
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released