Merge lp:~seyeongkim/charms/trusty/ceph/lp1411652 into lp:~openstack-charmers-archive/charms/trusty/ceph/next

Proposed by Seyeong Kim
Status: Rejected
Rejected by: Edward Hope-Morley
Proposed branch: lp:~seyeongkim/charms/trusty/ceph/lp1411652
Merge into: lp:~openstack-charmers-archive/charms/trusty/ceph/next
Diff against target: 49 lines (+31/-0)
2 files modified
hooks/ceph.py (+22/-0)
hooks/ceph_hooks.py (+9/-0)
To merge this branch: bzr merge lp:~seyeongkim/charms/trusty/ceph/lp1411652
Reviewer Review Type Date Requested Status
Edward Hope-Morley Needs Fixing
Billy Olsen Needs Fixing
Chris Holcombe (community) Needs Fixing
Chris MacNaughton (community) Needs Information
Review via email: mp+281732@code.launchpad.net
Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #15563 ceph-next for xtrusia mp281732
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/15563/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #16664 ceph-next for xtrusia mp281732
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/16664/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #8538 ceph-next for xtrusia mp281732
    AMULET OK: passed

Build: http://10.245.162.77:8080/job/charm_amulet_test/8538/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #15701 ceph-next for xtrusia mp281732
    UNIT OK: passed

Build: http://10.245.162.77:8080/job/charm_unit_test/15701/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #16815 ceph-next for xtrusia mp281732
    LINT OK: passed

Build: http://10.245.162.77:8080/job/charm_lint_check/16815/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #8586 ceph-next for xtrusia mp281732
    AMULET OK: passed

Build: http://10.245.162.77:8080/job/charm_amulet_test/8586/

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

I'm very nervous about unilaterally removing monitors and OSDs in the stop hook. I think we may want to make these stop-hook steps actions instead of a hook that can be called implicitly.

What is the goal with this change? To remove nodes from a production Ceph cluster or maybe to remove OSDs for maintenance?

review: Needs Information
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Thanks for your review.

As you can see in the bug report, the mon and osd (plus their crushmap and auth entries) still remain after a unit has been added and removed several times.

Do you mean this needs some kind of "shutdown" hook instead of the stop hook?
That doesn't seem like something I can do..

Thanks.

> I'm very nervous about unilaterally removing monitors and OSDs on the stop
> hook. I think that we may want to make this stop hook actions instead of a
> hook that can be called implicitly.
>
> What is the goal with this change? To remove nodes from a production Ceph
> cluster or maybe to remove OSDs for maintenance?

Revision history for this message
James Page (james-page) wrote :

A charm's 'stop' hook is called as the instance of the charm is destroyed on
a unit - so I think it's appropriate to remove the OSD from the cluster at
this point, as the next step taken by juju will be to destroy the machine
it's running on; dropping the OSD from the cluster configuration here LGTM.

On Mon, Jan 11, 2016 at 4:38 AM, Seyeong Kim <email address hidden>
wrote:

> Thanks for your review.
>
> As you can see the bug report, mon, osd ( + crushmap, auth ) are still
> remains after unit added & removed several times.
>
> You mean this needs kind of "shutdown" hook instead of stop hook?
> This seems not which i can do..
>
> Thanks.
>
> > I'm very nervous about unilaterally removing monitors and OSDs on the
> stop
> > hook. I think that we may want to make this stop hook actions instead of
> a
> > hook that can be called implicitly.
> >
> > What is the goal with this change? To remove nodes from a production Ceph
> > cluster or maybe to remove OSDs for maintenance?
> --
>
> https://code.launchpad.net/~xtrusia/charms/trusty/ceph/lp1411652/+merge/281732
> Your team OpenStack Charmers is requested to review the proposed merge of
> lp:~xtrusia/charms/trusty/ceph/lp1411652 into
> lp:~openstack-charmers/charms/trusty/ceph/next.
>

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

> On Jan 11, 2016, at 1:24 AM, James Page <email address hidden> wrote:
>
> A charms 'stop' hook is called as the instance of the charm is destroyed on
> a unit

Is there no other possible time that the stop hook will be called? I'm still nervous about it, but I suppose that when you're destroying machines you are hopefully aware that the cluster may become permanently unhealthy?

Chris

Revision history for this message
Chris Holcombe (xfactor973) wrote :

I agree with Chris MacNaughton on this one. It would be really nice if Juju could say "no, I can't remove this OSD or Monitor" but allow a --force flag that lets you take it down anyway. It doesn't look like this destroys any data. Could you test whether the cluster will form again if you remove the OSDs and then add them back? I believe it will, but that's just a guess.

review: Needs Fixing
Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

According to the documentation:

"stop runs immediately before the end of the unit's destruction sequence. It should be used to ensure that the charm's software is not running, and will not start again on reboot.

This hook is called when a service removal is requested by the client. It should implement the following logic:

Stop the service
Remove any files/configuration created during the service lifecycle
Prepare any backup(s) of the service that are required for restore purposes."

I think, then, that if re-adding the OSD works after this, we can say most of the logic is handled. I'd also like to see the stop hook remove the ceph-mon-all and ceph-osd-all services from upstart so that they don't run when the machine reboots.
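Something along these lines would do it (just a sketch — the job names are assumptions based on the trusty ceph packaging, and the override-file trick is standard Upstart rather than anything this proposal currently uses):

```python
import subprocess


def disable_ceph_upstart_jobs():
    # Assumed job names for the all-in-one ceph charm on trusty.
    for job in ('ceph-mon-all', 'ceph-osd-all'):
        # Ignore failures in case the job is already stopped.
        subprocess.call(['service', job, 'stop'])
        # An Upstart override file containing "manual" prevents the job from
        # being started automatically at boot without deleting its definition.
        with open('/etc/init/{0}.override'.format(job), 'w') as f:
            f.write('manual\n')
```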

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

@xfactor973
Hello. I have only tested the charms on an OpenStack environment and on MAAS, with one LXC container per unit.
There was no particular problem, but after your comment I think there could be a problem when one machine hosts multiple units.
I haven't checked that case yet; I'll do so this week and modify the code if anything turns up.
Thanks.

> I agree with Chris MacNaughton on this one. It would be really nice if Juju
> could say no I can't remove this OSD or Monitor but allows a --force flag that
> allows you to take it down. It doesn't look like this destroys any data.
> Could you test to see if you remove the OSDs and then add them again if the
> cluster will form again? I believe it will but it's just a guess.

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

Ok.. I was wrong; there seems to be no way to deploy multiple ceph OSD units to one machine.

In that case, it works fine across several rounds of destroying and adding ceph.

But I'm still testing more now..

Revision history for this message
Seyeong Kim (seyeongkim) wrote :

I tried this several times.

It was fine and the cluster stayed stable.

I think it also needs testing by someone other than me.

Revision history for this message
Chris Holcombe (xfactor973) wrote :

@xtrusia I'll try to deploy this on ec2 in the next day or so to see how it performs. I'm sure it's fine. :)

Revision history for this message
Chris Holcombe (xfactor973) wrote :

Ok, I ran this patch through its paces. I have an idea that shouldn't be hard to implement and that I think will improve things.

One thing I noticed when I removed a node was that it wasn't completely taken out of the crush map. I had to do that manually:

```
root@ip-172-31-36-39:~# ceph osd tree
# id  weight  type name                 up/down  reweight
-1    3       root default
-2    0           host ip-172-31-30-72
-3    1           host ip-172-31-9-197
2     1               osd.2             up       1
-4    1           host ip-172-31-36-39
1     1               osd.1             up       1
-5    1           host ip-172-31-46-49
3     1               osd.3             up       1
-6    0           host ip-172-31-24-18
root@ip-172-31-36-39:~# ceph osd crush remove ip-172-31-30-72
removed item id -2 name 'ip-172-31-30-72' from crush map
```
I'd really love it if this patch polled to let ceph remap the data while the OSDs are being removed. I wrote some pseudo code to give you the idea: https://gist.github.com/cholcombe973/468572a9bf73efb6f87d We could give the user an option to wait x minutes, or to remove the node regardless of whether the cluster is unhealthy. This would at least give the cluster a chance to rebalance onto the other OSDs. The only downside is that rebalances sometimes take days or weeks to complete.
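To make the idea concrete here (a rough sketch, not the gist verbatim — the timeout and interval values are arbitrary examples):

```python
import json
import subprocess
import time


def wait_for_health_ok(timeout=3600, interval=30):
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.check_output(['ceph', 'health', '--format=json'])
        status = json.loads(out)
        # Older ceph releases report 'overall_status'; newer ones report 'status'.
        if status.get('overall_status', status.get('status')) == 'HEALTH_OK':
            return True
        time.sleep(interval)
    return False


def remove_osd_gracefully(osd_id):
    # Marking the OSD out triggers data remapping away from it.
    subprocess.check_call(['ceph', 'osd', 'out', 'osd.{0}'.format(osd_id)])
    if not wait_for_health_ok():
        # The caller decides whether to force removal from an unhealthy cluster.
        raise RuntimeError('cluster did not return to HEALTH_OK in time')
    subprocess.check_call(['ceph', 'osd', 'crush', 'remove',
                           'osd.{0}'.format(osd_id)])
```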

review: Needs Fixing
Revision history for this message
Chris Holcombe (xfactor973) wrote :

A possible future patch we could roll is having the ceph charm re-add disks that were previously removed from the cluster. We can do this by checking the fsid in /var/lib/ceph/osd/ceph-*/ceph_fsid: if it matches the current ceph cluster, then add the disk back in. That would mean removing a unit by accident isn't the end of the world for your cluster.
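Roughly something like this (just a sketch — it only identifies the OSD directories that belong to the running cluster; the re-activation step would still need to be written):

```python
import glob
import os
import subprocess


def osds_belonging_to_this_cluster():
    # 'ceph fsid' prints the fsid of the cluster this node can reach.
    cluster_fsid = subprocess.check_output(['ceph', 'fsid']).decode().strip()
    matching = []
    # Each OSD records the fsid of its cluster in ceph_fsid.
    for fsid_file in glob.glob('/var/lib/ceph/osd/ceph-*/ceph_fsid'):
        with open(fsid_file) as f:
            if f.read().strip() == cluster_fsid:
                matching.append(os.path.dirname(fsid_file))
    return matching
```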

Revision history for this message
Billy Olsen (billy-olsen) wrote :

This change generally looks good to me, minus the couple of changes needed per the inline comments and some unit tests.

The OSDs and Mons should have their auth removed so they couldn't access the cluster even if they wanted to. If they try to connect because a daemon is started, they'll get denied, which is expected. It would be nice if the related ceph systemd/upstart services were disabled (not removed, unless the package is removed), but I don't see that as a requirement for this particular change.

In the current code, removing a ceph unit completely trashes the unit and doesn't clean up anything properly within ceph. This change significantly improves things in that it removes almost everything related to the node from the crushmap, the monitor cluster, etc. This is goodness. (Remove the crush bucket for the host to get the last dangling bits; it won't have any weight, so it shouldn't be *that* big of a deal in the crush algorithm, but it could certainly affect retries.)

As far as polling for a healthy cluster and for the placement groups to be rebalanced goes, I think that's a great feature, but I don't think it should be part of this particular merge proposal. The ceph service units have auth access to the cluster, but the ceph-osd units do not (and in a real-world deployment there would be ~3 to 5 ceph units to >10 ceph-osd units). This would introduce an inconsistency in how the ceph and ceph-osd units behave with respect to one another. I think this is a change that should be implemented outside of this merge proposal.

So bottom line, I'm marking this as 'Needs Fixing' - but I don't think we need to add the polling or service removal as part of this particular change. Address the points inline below, add unit tests, and we can land an incremental improvement and use future MPs to address the enhancements requested in the review.

review: Needs Fixing
Revision history for this message
Billy Olsen (billy-olsen) wrote :

fwiw, there are a few things at play here... the data on the OSD is intentionally not touched so that it can be reused later. The charm defaults to not reformatting the disk in an effort to prevent data loss, so a user can bring the OSD back into the cluster. We'd need to check whether this is handled automagically upon adding the device. In a MAAS environment, I believe the disks are wiped clean when the node is reclaimed by MAAS (at least as an option). Thus, a juju destroy-unit ceph would instead be more catastrophic for the OSDs themselves.

Revision history for this message
Chris MacNaughton (chris.macnaughton) wrote :

Hey Billy,

Some of this may change in the near future, as we're planning to deprecate the current ceph charm in favor of a ceph-mon charm plus a ceph-osd charm to clarify what deployments should look like. At that point we can cleanly separate the service-removal logic between OSD-specific and monitor-specific deployments.

Chris
> On Feb 3, 2016, at 5:14 AM, Billy Olsen <email address hidden> wrote:
>
> fwiw, there's a few things at play here... the data on the OSD is intentionally not touched so that it can be reused later. Charm defaults to not reformatting the disk in an effort to prevent data loss so a user can bring the OSD back into the cluster. We'd need to check if this is handled automagically upon adding the device. In a MAAS environment, I believe the disks are wiped clean when the node is reclaimed by MAAS (an option at least). Thus, a juju destroy-unit ceph will instead be more catastrophic to the OSD themselves.
> --
> https://code.launchpad.net/~xtrusia/charms/trusty/ceph/lp1411652/+merge/281732
> You are reviewing the proposed merge of lp:~xtrusia/charms/trusty/ceph/lp1411652 into lp:~openstack-charmers/charms/trusty/ceph/next.

Revision history for this message
Chris Holcombe (xfactor973) wrote :

I agree, lets skip the polling feature for now and add it in a future merge.

128. By Seyeong Kim

add removing crush map by hostname

129. By Seyeong Kim

merge from upstream

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #115 ceph-next for xtrusia mp281732
    UNIT OK: passed

Build: http://10.245.162.36:8080/job/charm_unit_test/115/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #132 ceph-next for xtrusia mp281732
    LINT OK: passed

Build: http://10.245.162.36:8080/job/charm_lint_check/132/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #9 ceph-next for xtrusia mp281732
    AMULET OK: passed

Build: http://10.245.162.36:8080/job/charm_amulet_test/9/

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Hi Seyeong, the OpenStack charm development workflow has changed and is now done through git repos under the OpenStack project [0]. Can you please resubmit this to both of these locations:

  * http://github.com/openstack/charm-ceph
  * http://github.com/openstack/charm-ceph-mon

Note that you need to apply this to both charms, since we have not yet deprecated the ceph charm and so still need to keep the two in sync in terms of mon code. Thanks.

[0] https://github.com/openstack-charmers/openstack-community/blob/master/README.dev-charms.md

review: Needs Fixing

Unmerged revisions

129. By Seyeong Kim

merge from upstream

128. By Seyeong Kim

add removing crush map by hostname

127. By Seyeong Kim

fix lint

126. By Seyeong Kim

add stop hook

Preview Diff

=== modified file 'hooks/ceph.py'
--- hooks/ceph.py	2016-01-12 14:17:36 +0000
+++ hooks/ceph.py	2016-02-04 08:22:15 +0000
@@ -487,3 +487,25 @@
 
 def filesystem_mounted(fs):
     return subprocess.call(['grep', '-wqs', fs, '/proc/mounts']) == 0
+
+
+def remove_mon():
+    hostname = get_unit_hostname()
+    subprocess.check_call(["ceph", "mon", "remove", hostname])
+
+
+def get_osdids():
+    osddir = os.listdir("/var/lib/ceph/osd/")
+    osdids = []
+    for osd in osddir:
+        osdids.append(osd.split('-')[1])
+    return osdids
+
+
+def shutdown_osd(osd_id):
+    hostname = get_unit_hostname()
+    subprocess.check_call(["ceph", "osd", "crush", "remove", "osd."+osd_id])
+    subprocess.check_call(["ceph", "osd", "crush", "remove", hostname])
+    subprocess.check_call(["ceph", "auth", "del", "osd."+osd_id])
+    subprocess.check_call(["service", "ceph-osd", "stop", "id="+osd_id])
+    subprocess.check_call(["ceph", "osd", "rm", osd_id])

=== modified file 'hooks/ceph_hooks.py'
--- hooks/ceph_hooks.py	2016-01-22 15:37:01 +0000
+++ hooks/ceph_hooks.py	2016-02-04 08:22:15 +0000
@@ -413,6 +413,15 @@
     ceph.start_osds(get_devices())
 
 
+@hooks.hook('stop')
+def stop():
+    osdids = ceph.get_osdids()
+    for osd_id in osdids:
+        ceph.shutdown_osd(osd_id)
+
+    ceph.remove_mon()
+
+
 @hooks.hook('nrpe-external-master-relation-joined')
 @hooks.hook('nrpe-external-master-relation-changed')
 def update_nrpe_config():
