OK, I ran this patch through a few paces. I have an idea that shouldn't be hard to implement and that I think will improve things.
One thing I noticed when I removed a node was that it wasn't completely taken out of the crush map; I had to do that manually:
```
root@ip-172-31-36-39:~# ceph osd tree
# id weight type name up/down reweight
-1 3 root default
-2 0 host ip-172-31-30-72
-3 1 host ip-172-31-9-197
2 1 osd.2 up 1
-4 1 host ip-172-31-36-39
1 1 osd.1 up 1
-5 1 host ip-172-31-46-49
3 1 osd.3 up 1
-6 0 host ip-172-31-24-18
root@ip-172-31-36-39:~# ceph osd crush remove ip-172-31-30-72
removed item id -2 name 'ip-172-31-30-72' from crush map
```
I'd really love it if this patch polled the cluster to let Ceph remap the data while the OSDs are being removed. I wrote some pseudo code to give you the idea: https://gist.github.com/cholcombe973/468572a9bf73efb6f87d We could give the user an option to wait x minutes, or to remove the node regardless of whether or not the cluster is unhealthy. This at least gives the cluster a chance to rebalance onto the other OSDs. The only downside is that rebalances sometimes take days or weeks to complete.
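To make the idea concrete, here is a minimal sketch of the polling loop from the gist, assuming we can shell out to the `ceph` CLI. The function name, the timeout parameter, and the injectable `check` hook are all illustrative, not part of the patch:

```python
import subprocess
import time


def wait_for_healthy(timeout_minutes=30, poll_seconds=10, check=None):
    """Poll cluster health until it reports HEALTH_OK or the timeout expires.

    Returns True if the cluster became healthy in time, False otherwise.
    `check` defaults to shelling out to `ceph health`; it is injectable so
    the loop can be exercised without a live cluster.
    """
    if check is None:
        check = lambda: subprocess.check_output(["ceph", "health"]).decode()
    deadline = time.time() + timeout_minutes * 60
    while time.time() < deadline:
        if check().strip().startswith("HEALTH_OK"):
            return True
        time.sleep(poll_seconds)
    # Timed out: the caller decides whether to remove the node anyway.
    return False
```

The patch could call something like this between marking the OSDs out and actually removing them; on a False return it would either abort or proceed, depending on the "remove regardless" option mentioned above.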