Merge lp:~hopem/charms/precise/cinder/lp1226823 into lp:charms/raring/cinder

Proposed by Edward Hope-Morley
Status: Merged
Merged at revision: 22
Proposed branch: lp:~hopem/charms/precise/cinder/lp1226823
Merge into: lp:charms/raring/cinder
Diff against target: 51 lines (+23/-2)
3 files modified
config.yaml (+10/-0)
hooks/cinder-hooks (+12/-1)
revision (+1/-1)
To merge this branch: bzr merge lp:~hopem/charms/precise/cinder/lp1226823
Reviewer Review Type Date Requested Status
James Page Pending
Adam Gandelman Pending
Review via email: mp+187289@code.launchpad.net

This proposal supersedes a proposal from 2013-09-17.

Description of the change

* Sets the minimum recommended PG count when creating the cinder rbd pool.
* Sets a replication count of 3 when creating the new cinder pool.

Fixes: bug 1226823

Revision history for this message
Adam Gandelman (gandelman-a) wrote : Posted in a previous version of this proposal

Edward and I chatted about this on IRC and decided that the minimal fix for the bug here is to set the PG number on the pool, not the replication count. Hard-coding the replication count to 3 on the new pool makes assumptions about the ceph cluster on the other end. We'd still like to support deploying single-node ceph clusters for testing, and adding this to the cinder charm would block that. As far as I can tell from reading the docs, a PG number of 1024 should be supported on any size of ceph cluster.

We need a better way of managing these details. I don't believe the remote client services should be dictating the particulars of the ceph pools they use. I'd be happier if, via the relation, a service like cinder could simply request the creation of some pools, e.g. ['cinder', 'images', 'otherstuff'], and leave it up to the ceph charm to create those pools with parameters that are sensible for the size/capacity of the deployed ceph cluster, probably based on its charm config.
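
For illustration only, a rough sketch of what that interface might look like (the relation key, config option and variables here are hypothetical; nothing in the current charms implements this):

 # cinder side: only name the pools it needs on the relation.
 relation-set requested-pools="cinder"

 # ceph side: create each requested pool, choosing PG/replica counts
 # itself from its own charm config and cluster size.
 for pool in $(relation-get requested-pools); do
     ceph osd pool create "$pool" "$PG_NUM" "$PG_NUM"
 done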

That said, if it is safe to use a default PG count of 1024 for all ceph clusters, I would be okay with having cinder set this itself when creating its pool. But I think we both agree the replication count should stay unspecified until we can come up with a better way of setting it.

I'm in the process of testing this change.

review: Needs Fixing
Revision history for this message
James Page (james-page) wrote : Posted in a previous version of this proposal

I don't think 1024 is the minimum recommended PG count - see the upstream docs:

 # Ensure you have a realistic number of placement groups. We recommend
 # approximately 100 per OSD. E.g., total number of OSDs multiplied by 100
 # divided by the number of replicas (i.e., osd pool default size). So for
 # 10 OSDs and osd pool default size = 3, we'd recommend approximately
 # (100 * 10) / 3 = 333.

The PG count needs to be configured based on the number of OSDs; the OSD count can be retrieved using:

 num_osds=$(ceph --id $SERVICE_NAME osd ls | wc -l)

This can then be used with the replica count to correctly configure the pool.
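
As a minimal sketch of that calculation (assuming the cinder-hooks shell context, with $SERVICE_NAME set and a cephx key that permits these commands):

 num_osds=$(ceph --id $SERVICE_NAME osd ls | wc -l)
 rep_count=2                                  # replica count; see discussion below
 num_pgs=$(( (num_osds * 100) / rep_count ))  # ~100 PGs per OSD, divided by replicas
 ceph --id $SERVICE_NAME osd pool create cinder $num_pgs $num_pgs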

I agree that for the ceph-next charm updates, Adam's suggestion is a good one.

review: Needs Fixing
Revision history for this message
Edward Hope-Morley (hopem) wrote : Posted in a previous version of this proposal

Fair enough; I'm sure I saw that value used in the past, but your method based on the number of OSDs makes more sense, so I'll adjust that. I'm not clear on what we want to do for the replication count, though. What we have right now is a count of 2, and Adam mentions single-node testing, which would still be broken by a count of two since we replicate by host by default. I will remove the replication count setting for now since it is not currently breaking anything, but we really need the ability to modify this since production deployments will likely want a replication count of 3.

Revision history for this message
James Page (james-page) wrote : Posted in a previous version of this proposal

Ed

For the replica count - please add that as a configuration option to the charm with a default of 2; that should give you what you need.

It's worth noting in the config description that you can't change this value post-association.

Revision history for this message
Adam Gandelman (gandelman-a) wrote : Posted in a previous version of this proposal

Is there no way to determine the replication count by querying the remote ceph cluster, similar to how you can determine the PG number/OSD data? I'm really worried we're creating a situation where we are loading charms with config that depends on the state of other services for things to work correctly. This introduces lots of potential for user error that could be avoided if it was handled for them in the interfaces, e.g. setting the replication count on the cinder side to something crazy or something that just isn't supported on the ceph side.

Revision history for this message
Edward Hope-Morley (hopem) wrote : Posted in a previous version of this proposal

There are two ways to get the replication sizes in use:

1. sudo ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config show | grep osd_pool_default_size
2. ceph osd dump | grep "rep size"

The first method gives you the default value, but that command can only be executed on a node that is running a ceph daemon (i.e. a MON or OSD), so it will not work in this case.

The second method can be run from the cinder node, but it gives the replication sizes in use for existing pools, so (a) it assumes that a pool already exists and (b) the sizes on existing pools may not be the default. This will not work either.

The only way to get the default would be to have the ceph node/service provide it when the relation is joined.
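
For example (purely illustrative; the relation key and ceph config option named here are hypothetical, and nothing in the current charms sets them):

 # ceph charm, in its client relation hook: publish the default.
 relation-set osd_pool_default_size="$(config-get osd-pool-default-size)"

 # cinder charm, in ceph-relation-changed: consume it, falling back to 2.
 rep_count=$(relation-get osd_pool_default_size)
 [ -n "$rep_count" ] || rep_count=2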

Revision history for this message
Adam Gandelman (gandelman-a) wrote : Posted in a previous version of this proposal

If the ceph relation simply passed on the configured monitor-count value to clients, would that be enough to make a sane decision on the cinder side about what to use for a replication size?

Revision history for this message
Adam Gandelman (gandelman-a) wrote : Posted in a previous version of this proposal

The required ceph changes have been merged to support this. I'm okay with merging this into the current charm now. Edward, can you please also file a merge against the lp:~openstack-charmers/charms/precise/cinder/python-redux branch that implements the same hook and config changes? I've updated those branches with the latest charm-helpers, so the ceph helpers should support specifying the replica count, and they already calculate the PG number accordingly.

review: Approve
Revision history for this message
Edward Hope-Morley (hopem) wrote : Posted in a previous version of this proposal

Preview Diff

=== modified file 'config.yaml'
--- config.yaml	2013-02-20 01:42:47 +0000
+++ config.yaml	2013-09-24 17:14:14 +0000
@@ -28,6 +28,16 @@
       The *available* block device on which to create LVM volume group.
       May also be set to None for deployments that will not need local
       storage (eg, Ceph/RBD-backed volumes).
+  ceph-osd-replication-count:
+    default: 2
+    type: int
+    description: |
+      This value dictates the number of replicas ceph must make of any
+      object it stores within the cinder rbd pool. Of course, this only
+      applies if using Ceph as a backend store. Note that once the cinder
+      rbd pool has been created, changing this value will not have any
+      effect (although it can be changed in ceph by manually configuring
+      your ceph cluster).
   volume-group:
     default: cinder-volumes
     type: string

=== modified file 'hooks/cinder-hooks'
--- hooks/cinder-hooks	2013-05-22 00:53:45 +0000
+++ hooks/cinder-hooks	2013-09-24 17:14:14 +0000
@@ -202,7 +202,18 @@
   if eligible_leader 'res_cinder_vip'; then
     # Create the cinder pool if it does not already exist
     if ! rados --id $SERVICE_NAME lspools | grep -q cinder; then
-      rados --id $SERVICE_NAME mkpool cinder
+      local num_osds=$(ceph --id $SERVICE_NAME osd ls | egrep "[^\s]" | wc -l)
+      local cfg_key='ceph-osd-replication-count'
+      local rep_count="$(config-get $cfg_key)"
+      if [ -z "$rep_count" ]
+      then
+        rep_count=2
+        juju-log "config returned empty string for $cfg_key - using value of 2"
+      fi
+      local num_pgs=$(((num_osds*100)/rep_count))
+      ceph --id $SERVICE_NAME osd pool create cinder $num_pgs $num_pgs
+      ceph --id $SERVICE_NAME osd pool set cinder size $rep_count
+      # TODO: set appropriate crush ruleset
     fi
   fi

=== modified file 'revision'
--- revision	2013-05-22 00:19:16 +0000
+++ revision	2013-09-24 17:14:14 +0000
@@ -1,1 +1,1 @@
-27
+28
