Code review comment for lp:~seyeongkim/charms/trusty/ceph/lp1411652

Chris Holcombe (xfactor973) wrote:

Ok, I ran this patch through its paces. I have an idea that shouldn't be hard to implement and that I think will improve things.

One thing I noticed when I removed a node was that it wasn't completely taken out of the CRUSH map. I had to do that manually:

```
root@ip-172-31-36-39:~# ceph osd tree
# id    weight  type name                 up/down reweight
-1      3       root default
-2      0           host ip-172-31-30-72
-3      1           host ip-172-31-9-197
2       1               osd.2             up      1
-4      1           host ip-172-31-36-39
1       1               osd.1             up      1
-5      1           host ip-172-31-46-49
3       1               osd.3             up      1
-6      0           host ip-172-31-24-18
root@ip-172-31-36-39:~# ceph osd crush remove ip-172-31-30-72
removed item id -2 name 'ip-172-31-30-72' from crush map
```
I would really love it if this patch polled the cluster so that Ceph can remap the data while the OSDs are being removed. I wrote some pseudo code to give you the idea: https://gist.github.com/cholcombe973/468572a9bf73efb6f87d (see the sketch below for roughly what I mean). We could give the user an option to either wait x minutes or remove the node regardless of whether the cluster is unhealthy. This would at least give the cluster a chance to rebalance onto the other OSDs. The only downside is that rebalances sometimes take days or weeks to complete.
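To make the suggestion concrete, here is a minimal Python sketch of that polling loop, assuming hypothetical `timeout_minutes` and `force` options; the names are illustrative and not part of the existing charm:

```python
# Rough sketch of the polling idea: after marking a node's OSDs out,
# wait for the cluster to return to HEALTH_OK (or for a user-supplied
# timeout to expire) so the data has a chance to rebalance onto the
# remaining OSDs before the node is torn down.

import subprocess
import time


def cluster_healthy():
    """Return True if `ceph health` reports HEALTH_OK."""
    output = subprocess.check_output(['ceph', 'health']).decode('utf-8')
    return output.strip().startswith('HEALTH_OK')


def wait_for_rebalance(timeout_minutes=60, force=False, poll_seconds=30):
    """Block until the cluster is healthy or the timeout expires.

    Returns True if it is safe to proceed with removing the node,
    False if the cluster is still unhealthy and `force` is not set.
    """
    deadline = time.time() + timeout_minutes * 60
    while time.time() < deadline:
        if cluster_healthy():
            return True
        time.sleep(poll_seconds)
    # Timed out: only proceed if the operator explicitly asked us to.
    return force
```

Something like this could be called from the removal path before the final `ceph osd crush remove`, with the timeout and force behaviour exposed as charm options.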

review: Needs Fixing
