=== added file 'docs/source/drafts/security.rst'
--- docs/source/drafts/security.rst	1970-01-01 00:00:00 +0000
+++ docs/source/drafts/security.rst	2012-06-22 15:29:20 +0000
@@ -0,0 +1,260 @@
Security Overview
=================

Ensemble is committed to providing a reliable, secure mechanism for
deploying services. What follows is an overview of the mechanisms
involved and how they contribute to keeping an ensemble environment
secure.

Glossary
--------

First, a glossary of terms used in this document.

Principal
~~~~~~~~~

A principal in the context of ensemble can represent any actor or
group of actors within the system. Each principal is authenticated via
a user name/principal id and password. For example, a service unit
agent associates its own unique principal information with its
connection, and thus has access to all nodes whose ACL mapping
explicitly grants access to that principal.

Token Database
~~~~~~~~~~~~~~

A mapping of principal ids to their ACL identity tokens. An identity
token has the form username:identity_scheme:checksum, where the
checksum is an md5 checksum of the username and password and
identity_scheme is md5 in the case of ensemble. The mapping is stored
in a zookeeper node which is world readable but only writable by the
security (ticket granting) agent, which is responsible for creating
principals.
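As a minimal sketch of the token format described above: the exact
input to the checksum is not spelled out in this document, so the code
assumes it is the hex md5 of ``username:password``. The function name,
the principal id and the password are hypothetical.

```python
import hashlib

IDENTITY_SCHEME = "md5"  # the scheme named in this document


def identity_token(username, password):
    """Build a token of the form username:identity_scheme:checksum."""
    # Assumption: the checksum covers "username:password", hex encoded.
    raw = f"{username}:{password}".encode("utf-8")
    checksum = hashlib.md5(raw).hexdigest()
    return f"{username}:{IDENTITY_SCHEME}:{checksum}"


# The token database maps a principal id to its identity token.
token_db = {
    "unit-wordpress-0": identity_token("unit-wordpress-0", "s3cret"),
}
```

Only the token, never the password, ends up in the world-readable
mapping.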

Zookeeper ACLs
~~~~~~~~~~~~~~

Ensemble relies on the security facilities provided by zookeeper's
coordination storage, whereby zookeeper automatically restricts access
to each node based on the ACL permission map on that node. This ACL
facility maps permissions to principal identity tokens. Zookeeper
provides permissions for read, write, delete, create, and admin access
to each node. Every zookeeper connection can associate principal
credentials with itself, and all access by that connection is
validated against the per-node ACL mapping.

Additional documentation is available here:
http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#sc_ZooKeeperAccessControl
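The per-node check can be modelled in a few lines. This is only an
illustration of the idea, not zookeeper's implementation: the
permission bit values are those zookeeper defines, while
``is_allowed``, the shape of ``acl_map`` and the sample tokens are
assumptions made for the example.

```python
# Permission bits as defined by zookeeper.
READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16
ALL = READ | WRITE | CREATE | DELETE | ADMIN


def is_allowed(acl_map, connection_tokens, permission):
    """Return True if any token on the connection grants `permission`.

    `acl_map` maps identity token -> permission bitmask for one node.
    """
    return any(
        acl_map.get(token, 0) & permission
        for token in connection_tokens
    )


# Hypothetical ACL map for a single node.
node_acl = {
    "admin:md5:aaaa": ALL,
    "unit-0:md5:bbbb": READ | WRITE,
}
```

A connection that holds no token listed in the map is denied every
permission on that node.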

Security Agent
~~~~~~~~~~~~~~

An additional zookeeper-connected actor responsible for creating
principals and providing an up-to-date token database.

The security agent manages the token database (defined above),
provides for the creation of new principals, and hands out their
hash tokens to inquiring parties.

Security Policy
~~~~~~~~~~~~~~~

Each actor employs a security policy to determine the ACL map for a
given node path that it may create. The policy simply takes the path
of the node to be created and returns an ACL map that can be set on
the node.
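A path-based policy of this kind might look like the sketch below.
The class name, the glob-style rule matching, the node paths and the
tokens are all assumptions chosen for illustration; the document only
specifies that a path goes in and an ACL map comes out.

```python
import fnmatch

READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16
ALL = READ | WRITE | CREATE | DELETE | ADMIN


class SecurityPolicy:
    """Maps a node path to an ACL map (identity token -> permissions)."""

    def __init__(self, rules, default):
        # `rules` is an ordered list of (glob pattern, acl map) pairs;
        # the first matching pattern wins.
        self.rules = rules
        self.default = default

    def __call__(self, path):
        for pattern, acl_map in self.rules:
            if fnmatch.fnmatchcase(path, pattern):
                return dict(acl_map)
        return dict(self.default)


# Hypothetical rules: machine nodes are also open to the provisioner.
policy = SecurityPolicy(
    rules=[("/machines/*", {"provisioning:md5:aaaa": ALL,
                            "admin:md5:bbbb": ALL})],
    default={"admin:md5:bbbb": ALL},
)
```

An actor would call ``policy(path)`` just before creating a node and
pass the result along with the create request.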

Identity
--------

How the system passes credentials to an actor is a critical aspect of
managing principals securely. Every actor in the system needs its own
unique principal to provide an auth identity. The credentials for a
principal are known only to the actor utilizing them and, transiently,
to the security agent when they are created.

Instead of passing a principal's credentials directly via insecure
channels, an actor creating another actor also establishes a principal
creation token via the security agent. The principal creation token is
a one-time-use string which can be used to create a principal and its
password, and to update the token database.

The security agent has a simple policy in place regarding principal
names and which actors can create them, e.g. a provisioning agent can
create machine principals, but not service unit principals.
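Such a creation policy could be as simple as a lookup table. In the
sketch below only the provisioning rule comes from this document; the
``machine`` entry, the kind names and the function name are
assumptions added to make the example complete.

```python
# Which kind of actor may create which kinds of principal.
CREATION_POLICY = {
    "provisioning": {"machine"},  # stated in this document
    "machine": {"unit"},          # assumption, for illustration only
}


def may_create(creator_kind, principal_kind):
    """Return True if `creator_kind` may create `principal_kind`."""
    return principal_kind in CREATION_POLICY.get(creator_kind, set())
```

Unknown creator kinds fall through to an empty set, so the default is
to refuse.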

A malicious user could still intercept the token and use it, but
compared with passing credentials directly, a one-time token minimizes
the time that a third party has to perform such an interception.
Moreover, invalid use of a token can be logged as forensic information.
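The principal creation token flow described in this section can be
sketched with an in-memory stand-in for the security agent. The class
and method names are hypothetical, the token database is a plain
dict rather than a zookeeper node, and the checksum input
(``principal_id:password``) is an assumption.

```python
import hashlib
import logging
import secrets

log = logging.getLogger("security-agent")


class SecurityAgent:
    """In-memory stand-in for the security agent (illustrative only)."""

    def __init__(self):
        self.pending = {}   # one-time token -> principal id
        self.token_db = {}  # principal id -> identity token

    def issue_creation_token(self, principal_id):
        token = secrets.token_urlsafe(32)
        self.pending[token] = principal_id
        return token

    def redeem(self, token, password):
        # pop() makes the token single use: a replay finds nothing.
        principal_id = self.pending.pop(token, None)
        if principal_id is None:
            log.warning("invalid or reused creation token presented")
            return None
        raw = f"{principal_id}:{password}".encode("utf-8")
        checksum = hashlib.md5(raw).hexdigest()
        self.token_db[principal_id] = f"{principal_id}:md5:{checksum}"
        return principal_id
```

The new actor chooses its own password when redeeming, so the
creating actor never learns the credentials, and any second attempt
to redeem the same token leaves a log entry.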

One question that emerges with the use of a separate agent for creating
identities is how the agents needed for bootstrap receive their
credentials.

 - The bootstrap can utilize a specialized OTP interface with a
   precreated known value, which it can use to initialize the tree.

Transport level security
------------------------

As zookeeper does not currently support SSL/TLS transport level
security, Ensemble utilizes SSH port forwarding to ensure encrypted
communications to zookeeper. One significant shortcoming of this
approach is that any process on the set of ensemble machines can
attempt to connect to zookeeper and brute force principal passwords.

See also Alternatives#NodeEncryption
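For illustration, the SSH port forward mentioned in this section could
be assembled as below. The helper name, the ``ubuntu`` user and the
host name are assumptions; ``-N`` and ``-L`` are standard OpenSSH
options and 2181 is zookeeper's default client port.

```python
def zk_tunnel_command(host, user="ubuntu", local_port=2181,
                      remote_port=2181):
    """Build an ssh command forwarding a local port to zookeeper."""
    return [
        "ssh",
        "-N",  # no remote command, forwarding only
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{host}",
    ]


# Hypothetical host name for the machine running zookeeper.
command = zk_tunnel_command("zk.example.internal")
```

The list can be handed to ``subprocess.Popen``, after which a client
connects to ``localhost:2181`` and its traffic is carried over the
encrypted channel.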

Privileged Data
---------------

Certain data stored within zookeeper is by its nature privileged and
should only be shared with agents requiring it for their function. For
example, the Ensemble provider credentials should only be exposed to
the provisioning agent, as it requires them to function. Any
additional access to the data would be regarded as a data escalation
vulnerability.

Additionally, services utilize relations to communicate with each
other. Every service unit of the services participating in a
relation gets write access only to its own node within the relation,
and has read access to all service unit relation settings. An
unrelated service unit from a different service is not allowed to
read any settings from the relation.
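The relation access rule above translates into one ACL map per
settings node. The sketch below assumes the identity tokens of the
participating units are already known; the function name, unit names
and tokens are hypothetical.

```python
READ, WRITE = 1, 2  # zookeeper permission bits


def relation_settings_acls(unit_tokens):
    """Return {unit name: acl map} for each unit's settings node.

    `unit_tokens` maps unit name -> identity token for every unit of
    the services participating in the relation.
    """
    acls = {}
    for owner in unit_tokens:
        acl_map = {}
        for unit, token in unit_tokens.items():
            # The owner may read and write; its peers may only read.
            acl_map[token] = (READ | WRITE) if unit == owner else READ
        acls[owner] = acl_map
    return acls


acls = relation_settings_acls({
    "wordpress/0": "wp0:md5:aaaa",
    "mysql/0": "db0:md5:bbbb",
})
```

Units of unrelated services never appear in any of the maps, so
zookeeper refuses them read access.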


Data Security
-------------

A pub key/priv key can be associated to the OTP to


OTP Security
------------

The OTP is not secure without additional enforcement, as it does not
exist as a native capability of the zookeeper interface. An additional
actor responsible for creating identities and processing OTP tokens
would be an alternative (see futures).


Relations attacks
-----------------

Ensemble is composed of a number of actors connecting to and
communicating via a shared storage. When two services enter into a
relation, a private bidirectional channel is created for them to
exchange data.

Ensemble ensures that the zookeeper nodes used for this communication
are subject to the proper ACL constraints such that unrelated services
are unable to access them.

But these relations represent ad hoc inter-machine communication
protocols, which are formula defined. A malicious agent could possibly
abuse one of these protocols to compromise additional agents. Unlike
other attack vectors in ensemble, this is one about which ensemble can
make only minimal safety guarantees, outside of perhaps a simple
validation of relation data (currently treated as a binary blob)
against schemas associated with the relation type.
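A simple schema validation of the sort suggested above might look
like this. No schema format is defined in this document, so the
registry, the ``mysql`` schema and the function name are all
hypothetical, and relation data is assumed to have been decoded into
a dict first.

```python
# Hypothetical schemas keyed by relation type:
# setting name -> expected Python type.
RELATION_SCHEMAS = {
    "mysql": {"host": str, "port": int, "database": str},
}


def validate_relation_data(relation_type, settings):
    """Return a list of problems; an empty list means the data conforms."""
    schema = RELATION_SCHEMAS.get(relation_type)
    if schema is None:
        return [f"no schema registered for relation type {relation_type!r}"]
    problems = []
    for key, value in settings.items():
        expected = schema.get(key)
        if expected is None:
            problems.append(f"unexpected setting {key!r}")
        elif not isinstance(value, expected):
            problems.append(f"setting {key!r} should be {expected.__name__}")
    return problems
```

This only rejects malformed data. A well-formed but malicious value
still passes, which is why the guarantee remains minimal.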

The formulas executed by the unit agent allow user-supplied code to be
run within an lxc container (with root privileges). LXC provides
limited protection against root in a container, so a container
compromise can escalate to a compromise of the machine and of the
other units on that machine.


Privilege Escalation Scenarios
------------------------------

We have several different levels of escalation within ensemble for
malicious code that need to be considered.

Container escalation
++++++++++++++++++++

All formula hooks are executed within an lxc container to give a
minimally isolated environment. This lxc container is rather trivially
exploitable to gain root access on the machine, as formulas execute
as root within the container and lxc provides minimal security
guarantees at present, which leads to the next escalation level.

Future work is needed to provide better security around lxc
integration, perhaps via integration of apparmor and ongoing lxc
isolation work.

Machine escalation
++++++++++++++++++

A machine is considered compromised if malicious code has root access
on the machine. All service units colocated on the machine are also
considered compromised if this occurs.

Agent escalation
++++++++++++++++

An agent is considered compromised if malicious code has an open
zookeeper connection with a valid actor principal identity. The
malicious code has access to all data exposed via ACL to the
compromised identity.

Beyond these generic scenarios we have particular escalations which
are effectively fatal, as they entail access to sensitive data that
spans the ensemble environment or machine provider.

A bootstrap machine compromise which allows disk access could be
considered fatal, as the Ensemble shared state (zookeeper) data is
resident on disk.

Compromise of the identity of certain agents, such as the provisioning
agent, would allow malicious code to utilize the machine provider
credentials.


Access to Deployed services
---------------------------

A plan for controlled public access to deployed services is provided
separately by the expose-services specification.

Currently all internal access within a machine provider environment
like ec2 is unfiltered.

In future we should have machine level firewalling to allow access
between services based on their relations.

Alternatives
------------


Node Encryption
~~~~~~~~~~~~~~~

Principal Agent
~~~~~~~~~~~~~~~

A security agent responsible for 
 for transport security via node encryption.

Next Steps
----------

SSH Host Identity Checks

We should pull the ssh key of the machine into zk, so connections to a
given machine can be verified against the valid keys of environment
machines.
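One way the host identity check might work is sketched below. The
``known_keys`` dict stands in for whatever mapping would be read from
zookeeper; that mapping, the function names and the choice of SHA-256
fingerprints are assumptions, not part of the current design.

```python
import base64
import hashlib
import hmac


def fingerprint(public_key_b64):
    """SHA-256 fingerprint of a base64-encoded ssh public key blob."""
    digest = hashlib.sha256(base64.b64decode(public_key_b64)).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def host_key_is_known(machine_id, presented_key_b64, known_keys):
    """Check a presented host key against the key recorded for a machine.

    `known_keys` maps machine id -> base64 public key (hypothetical
    stand-in for data stored in zookeeper).
    """
    expected = known_keys.get(machine_id)
    if expected is None:
        return False
    return hmac.compare_digest(
        fingerprint(expected), fingerprint(presented_key_b64)
    )
```

A connection to a machine with no recorded key is refused rather than
trusted on first use.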

Formula Storage URLs

Currently the formula storage access is referenced by a storage key
which is retrieved via the machine provider storage interface. This
requires machine agents to have access to the machine provider
credentials in order to reach formula storage, which they shouldn't
need.

- Security Agent & Token Database
- Security Policy (Path Based ACL generator)
- Connections w/ Principal

Reviewer	Review Type	Date Requested	Status
Gustavo Niemeyer		2011-06-08	Needs Fixing on 2011-06-09
Review via email: mp+63921@code.launchpad.net