Merge lp:~wallyworld/juju-core/environ-resource-catalogs into lp:~go-bot/juju-core/trunk

Proposed by Ian Booth
Status: Work in progress
Proposed branch: lp:~wallyworld/juju-core/environ-resource-catalogs
Merge into: lp:~go-bot/juju-core/trunk
Diff against target: 383 lines (+351/-0)
4 files modified
state/interface.go (+7/-0)
state/storage/interface.go (+16/-0)
state/storage/resourcecatalog.go (+158/-0)
state/storage/resourcecatalog_test.go (+170/-0)
To merge this branch: bzr merge lp:~wallyworld/juju-core/environ-resource-catalogs
Reviewer Review Type Date Requested Status
Juju Engineering Pending
Review via email: mp+221484@code.launchpad.net

Description of the change

New environ ResourceCatalog

A ResourceCatalog is added, which knows how to persist
resource records. A resource record contains a path and
a hash and is used to catalog the data blobs stored in
the recently added ResourceStorage. A category attribute
is also provided and will be used later for defining
resources are tools vs charms etc.

When a resource entry with the same path is saved, a new
record is not created but instead the existing one has
its ref count incremented. The opposite occurs with
removal. Removing the last record deletes it.

The above will be used by a new ManagedResource implemention
which will know how to store data for environs and users and
do things like de-duping using checksums etc.

https://codereview.appspot.com/104720046/

To post a comment you must log in.
Revision history for this message
Ian Booth (wallyworld) wrote :

Reviewers: mp+221484_code.launchpad.net,

Message:
Please take a look.

Description:
New environ ResourceCatalog

A ResourceCatalog is added, which knows how to persist
resource records. A resource record contains a path and
a hash and is used to catalog the data blobs stored in
the recently added ResourceStorage. A category attribute
is also provided and will be used later for defining
resources are tools vs charms etc.

When a resource entry with the same path is saved, a new
record is not created but instead the existing one has
its ref count incremented. The opposite occurs with
removal. Removing the last record deletes it.

The above will be used by a new ManagedResource implemention
which will know how to store data for environs and users and
do things like de-duping using checksums etc.

https://code.launchpad.net/~wallyworld/juju-core/environ-resource-catalogs/+merge/221484

(do not edit description out of merge proposal)

Please review this at https://codereview.appspot.com/104720046/

Affected files (+353, -0 lines):
   A [revision details]
   M state/interface.go
   M state/storage/interface.go
   A state/storage/resourcecatalog.go
   A state/storage/resourcecatalog_test.go

Revision history for this message
William Reade (fwereade) wrote :

Let's talk about this when you get back. I think it would be more
profitable to start by just straight-up implementing the Storage
interface, backed by gridfs, and leaving all considerations of
deduplication out.

https://codereview.appspot.com/104720046/diff/1/state/interface.go
File state/interface.go (right):

https://codereview.appspot.com/104720046/diff/1/state/interface.go#newcode197
state/interface.go:197: // MongoTransactionRunner instances execute
mongo transactions.
/me freaks out, and it's not pretty. I really don't think we should let
this implementation detail leak out of state.

https://codereview.appspot.com/104720046/diff/1/state/storage/interface.go
File state/storage/interface.go (right):

https://codereview.appspot.com/104720046/diff/1/state/storage/interface.go#newcode23
state/storage/interface.go:23: // Resources with the same Path value are
not duplicated; instead a reference count is incremented.
I think this is not quite at the correct level of abstraction. Path is a
distraction: it's the *content* that's important, and if the first cut
puts the emphasis on path I think it'll only make it harder to do the
deduplication right when we get to that phase.

https://codereview.appspot.com/104720046/diff/1/state/storage/resourcecatalog.go
File state/storage/resourcecatalog.go (right):

https://codereview.appspot.com/104720046/diff/1/state/storage/resourcecatalog.go#newcode52
state/storage/resourcecatalog.go:52: func NewResourceCatalog(collection
*mgo.Collection, txnRunner state.MongoTransactionRunner) ResourceCatalog
{
OK, I can see why you're doing this, but I still don't really like it.
It *could* be fixed by layering -- ie putting the runTransaction
functionality into "state/txn", and having both State and
ResourceCatalog use it -- but exposing it to other random clients of
state freaks me right out.

https://codereview.appspot.com/104720046/diff/1/state/storage/resourcecatalog.go#newcode93
state/storage/resourcecatalog.go:93: if ops, err :=
rc.resourceDecRefOps(id); err == mgo.ErrNotFound {
Hmm. Is this definitely a "you already did this", or is it perhaps
evidence that everything's totally screwed?

https://codereview.appspot.com/104720046/

Unmerged revisions

2811. By Ian Booth

New ResourceCatalog implementation

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'state/interface.go'
2--- state/interface.go 2014-03-26 00:24:30 +0000
3+++ state/interface.go 2014-05-30 06:23:54 +0000
4@@ -4,6 +4,8 @@
5 package state
6
7 import (
8+ "labix.org/v2/mgo/txn"
9+
10 "launchpad.net/juju-core/environs/config"
11 "launchpad.net/juju-core/instance"
12 "launchpad.net/juju-core/state/api/params"
13@@ -191,3 +193,8 @@
14 }
15
16 var _ InstanceIdGetter = (*Machine)(nil)
17+
18+// MongoTransactionRunner instances execute mongo transactions.
19+type MongoTransactionRunner interface {
20+ RunTransaction(ops []txn.Op) error
21+}
22
23=== modified file 'state/storage/interface.go'
24--- state/storage/interface.go 2014-05-29 05:35:37 +0000
25+++ state/storage/interface.go 2014-05-30 06:23:54 +0000
26@@ -18,3 +18,19 @@
27 // Remove deletes the data at the specified path.
28 Remove(path string) error
29 }
30+
31+// ResourceCatalog instances persist Resources.
32+// Resources with the same Path value are not duplicated; instead a reference count is incremented.
33+// Similarly, when a Resources is removed, the reference count is decremented. When the reference
34+// count reaches zero, the Resource is deleted.
35+type ResourceCatalog interface {
36+ // Get fetches a Resource with the given id.
37+ Get(id string) (*Resource, error)
38+
39+ // Put stores a Resource and returns its id.
40+ Put(r *Resource) (id string, err error)
41+
42+ // Remove decrements the reference count for a Resource with the given id, deleting it
43+ // if the reference count reaches zero.
44+ Remove(id string) error
45+}
46
47=== added file 'state/storage/resourcecatalog.go'
48--- state/storage/resourcecatalog.go 1970-01-01 00:00:00 +0000
49+++ state/storage/resourcecatalog.go 2014-05-30 06:23:54 +0000
50@@ -0,0 +1,158 @@
51+// Copyright 2014 Canonical Ltd.
52+// Licensed under the AGPLv3, see LICENCE file for details.
53+
54+package storage
55+
56+import (
57+ "github.com/juju/errors"
58+ "labix.org/v2/mgo"
59+ "labix.org/v2/mgo/bson"
60+ "labix.org/v2/mgo/txn"
61+
62+ "launchpad.net/juju-core/state"
63+)
64+
65+// Resource is a catalog entry for stored data.
66+type Resource struct {
67+ Category string
68+ Path string
69+ Hash string
70+}
71+
72+// resourceDoc is the persistent representation of a Resource.
73+type resourceDoc struct {
74+ Id bson.ObjectId `bson:"_id"`
75+ Category string
76+ Path string
77+ Hash string
78+ RefCount int64
79+}
80+
81+// resourceCatalog is a mongo backed ResourceCatalog instance.
82+type resourceCatalog struct {
83+ txnRunner state.MongoTransactionRunner
84+ collection *mgo.Collection
85+}
86+
87+var _ ResourceCatalog = (*resourceCatalog)(nil)
88+
89+// newResourceDoc constructs a resourceDoc from a Resource.
90+func newResourceDoc(r *Resource) resourceDoc {
91+ return resourceDoc{
92+ Id: bson.NewObjectId(),
93+ Category: r.Category,
94+ Path: r.Path,
95+ Hash: r.Hash,
96+ RefCount: 1,
97+ }
98+}
99+
100+// NewResourceCatalog creates a new ResourceCatalog using the transaction runner and
101+// storing resource entries in the mongo collection.
102+func NewResourceCatalog(collection *mgo.Collection, txnRunner state.MongoTransactionRunner) ResourceCatalog {
103+ return &resourceCatalog{
104+ txnRunner: txnRunner,
105+ collection: collection,
106+ }
107+}
108+
109+// Get is defined on the ResourceCatalog interface.
110+func (rc *resourceCatalog) Get(id string) (*Resource, error) {
111+ var doc resourceDoc
112+ objId := bson.ObjectIdHex(id)
113+ if err := rc.collection.FindId(objId).One(&doc); err == mgo.ErrNotFound {
114+ return nil, errors.NotFoundf("resource with id %q", id)
115+ } else if err != nil {
116+ return nil, err
117+ }
118+ return &Resource{
119+ Category: doc.Category,
120+ Path: doc.Path,
121+ Hash: doc.Hash,
122+ }, nil
123+}
124+
125+// Put is defined on the ResourceCatalog interface.
126+func (rc *resourceCatalog) Put(r *Resource) (string, error) {
127+ if id, ops, err := rc.resourceIncRefOps(r); err != nil {
128+ return "", err
129+ } else {
130+ for i := 0; i < 3; i++ {
131+ if err := rc.txnRunner.RunTransaction(ops); err == nil {
132+ return id, nil
133+ } else if err != txn.ErrAborted {
134+ return "", err
135+ }
136+ }
137+ }
138+ return "", state.ErrExcessiveContention
139+}
140+
141+// Remove is defined on the ResourceCatalog interface.
142+func (rc *resourceCatalog) Remove(id string) error {
143+ if ops, err := rc.resourceDecRefOps(id); err == mgo.ErrNotFound {
144+ return nil
145+ } else if err != nil {
146+ return err
147+ } else {
148+ for i := 0; i < 3; i++ {
149+ if err := rc.txnRunner.RunTransaction(ops); err == nil {
150+ return nil
151+ } else if err != txn.ErrAborted {
152+ return err
153+ }
154+ }
155+ }
156+ return state.ErrExcessiveContention
157+}
158+
159+func (rc *resourceCatalog) resourceIncRefOps(r *Resource) (string, []txn.Op, error) {
160+ var doc resourceDoc
161+ exists := false
162+ err := rc.collection.Find(bson.M{"path": r.Path}).One(&doc)
163+ if err != nil && err != mgo.ErrNotFound {
164+ return "", nil, err
165+ } else if err == nil {
166+ exists = true
167+ }
168+ if !exists {
169+ doc := newResourceDoc(r)
170+ return doc.Id.Hex(), []txn.Op{{
171+ C: rc.collection.Name,
172+ Id: doc.Id,
173+ Assert: txn.DocMissing,
174+ Insert: doc,
175+ }}, nil
176+ }
177+ // If we are incrementing the ref count, then the checksums must match.
178+ if r.Hash != doc.Hash {
179+ return "", nil, errors.Errorf("checksums do not match for resource %q: %s != %s", r.Path, r.Hash, doc.Hash)
180+ }
181+ return doc.Id.Hex(), []txn.Op{{
182+ C: rc.collection.Name,
183+ Id: doc.Id,
184+ Assert: txn.DocExists,
185+ Update: bson.D{{"$inc", bson.D{{"refcount", 1}}}},
186+ }}, nil
187+}
188+
189+func (rc *resourceCatalog) resourceDecRefOps(id string) ([]txn.Op, error) {
190+ var doc resourceDoc
191+ if err := rc.collection.FindId(bson.ObjectIdHex(id)).One(&doc); err != nil {
192+ return nil, err
193+ }
194+ if doc.RefCount == 1 {
195+ return []txn.Op{{
196+ C: rc.collection.Name,
197+ Id: doc.Id,
198+ Assert: bson.D{{"refcount", 1}},
199+ Remove: true,
200+ }}, nil
201+ }
202+ return []txn.Op{{
203+ C: rc.collection.Name,
204+ Id: doc.Id,
205+ Assert: bson.D{{"refcount", bson.D{{"$gt", 1}}}},
206+ Update: bson.D{{"$inc", bson.D{{"refcount", -1}}}},
207+ }}, nil
208+}
209
210=== added file 'state/storage/resourcecatalog_test.go'
211--- state/storage/resourcecatalog_test.go 1970-01-01 00:00:00 +0000
212+++ state/storage/resourcecatalog_test.go 2014-05-30 06:23:54 +0000
213@@ -0,0 +1,170 @@
214+// Copyright 2014 Canonical Ltd.
215+// Licensed under the AGPLv3, see LICENCE file for details.
216+
217+package storage_test
218+
219+import (
220+ "labix.org/v2/mgo"
221+ "labix.org/v2/mgo/bson"
222+ "labix.org/v2/mgo/txn"
223+ gc "launchpad.net/gocheck"
224+
225+ "launchpad.net/juju-core/state"
226+ "launchpad.net/juju-core/state/storage"
227+ "launchpad.net/juju-core/testing"
228+)
229+
230+var _ = gc.Suite(&resourceCatalogSuite{})
231+
232+type resourceCatalogSuite struct {
233+ testing.BaseSuite
234+ testing.MgoSuite
235+ rCatalog storage.ResourceCatalog
236+ collection *mgo.Collection
237+}
238+
239+func (s *resourceCatalogSuite) SetUpSuite(c *gc.C) {
240+ s.BaseSuite.SetUpSuite(c)
241+ s.MgoSuite.SetUpSuite(c)
242+}
243+
244+func (s *resourceCatalogSuite) TearDownSuite(c *gc.C) {
245+ s.MgoSuite.TearDownSuite(c)
246+ s.BaseSuite.TearDownSuite(c)
247+}
248+
249+type simpleTxnRunner struct {
250+ runner *txn.Runner
251+}
252+
253+var _ state.MongoTransactionRunner = (*simpleTxnRunner)(nil)
254+
255+func (tr *simpleTxnRunner) RunTransaction(ops []txn.Op) error {
256+ return tr.runner.Run(ops, "", nil)
257+}
258+
259+func (s *resourceCatalogSuite) SetUpTest(c *gc.C) {
260+ s.BaseSuite.SetUpTest(c)
261+ s.MgoSuite.SetUpTest(c)
262+ db := s.Session.DB("juju")
263+ s.collection = db.C("resourceCatalog")
264+ txnRunner := &simpleTxnRunner{txn.NewRunner(db.C("txns"))}
265+ s.rCatalog = storage.NewResourceCatalog(s.collection, txnRunner)
266+}
267+
268+func (s *resourceCatalogSuite) TearDownTest(c *gc.C) {
269+ s.MgoSuite.TearDownTest(c)
270+ s.BaseSuite.TearDownTest(c)
271+}
272+
273+func (s *resourceCatalogSuite) assertPut(c *gc.C, category, path, hash string) string {
274+ r := &storage.Resource{
275+ Category: category,
276+ Path: path,
277+ Hash: hash,
278+ }
279+ id, err := s.rCatalog.Put(r)
280+ c.Assert(err, gc.IsNil)
281+ c.Assert(id, gc.Not(gc.Equals), "")
282+ s.assertGet(c, r, id)
283+ return id
284+}
285+
286+func (s *resourceCatalogSuite) assertGet(c *gc.C, expected *storage.Resource, id string) {
287+ r, err := s.rCatalog.Get(id)
288+ c.Assert(err, gc.IsNil)
289+ c.Assert(*r, gc.DeepEquals, *expected)
290+}
291+
292+type resourceDoc struct {
293+ Id bson.ObjectId `bson:"_id"`
294+ RefCount int64
295+}
296+
297+func (s *resourceCatalogSuite) assertRefCount(c *gc.C, id string, expected int64) {
298+ var doc resourceDoc
299+ objId := bson.ObjectIdHex(id)
300+ err := s.collection.FindId(objId).One(&doc)
301+ c.Assert(err, gc.IsNil)
302+ c.Assert(doc.RefCount, gc.Equals, expected)
303+}
304+
305+func (s *resourceCatalogSuite) TestPut(c *gc.C) {
306+ id := s.assertPut(c, "cat", "/path/to/file", "1234567890")
307+ s.assertRefCount(c, id, 1)
308+}
309+
310+func (s *resourceCatalogSuite) TestPutSamePathIncRefCount(c *gc.C) {
311+ id := s.assertPut(c, "cat", "/path/to/file", "1234567890")
312+ s.assertPut(c, "cat", "/path/to/file", "1234567890")
313+ s.assertRefCount(c, id, 2)
314+}
315+
316+func (s *resourceCatalogSuite) TestPutSamePathMismatchedChecksums(c *gc.C) {
317+ s.assertPut(c, "cat", "/path/to/file", "1234567890")
318+ r := &storage.Resource{
319+ Category: "cat",
320+ Path: "/path/to/file",
321+ Hash: "abcdef",
322+ }
323+ _, err := s.rCatalog.Put(r)
324+ c.Assert(err, gc.ErrorMatches, `checksums do not match .*`)
325+}
326+
327+func (s *resourceCatalogSuite) TestGetNonExistent(c *gc.C) {
328+ _, err := s.rCatalog.Get(bson.NewObjectId().Hex())
329+ c.Assert(err, gc.ErrorMatches, `resource with id ".*" not found`)
330+}
331+
332+func (s *resourceCatalogSuite) TestGet(c *gc.C) {
333+ r := &storage.Resource{
334+ Category: "cat",
335+ Path: "/path/to/file",
336+ Hash: "12345",
337+ }
338+ id, err := s.rCatalog.Put(r)
339+ c.Assert(err, gc.IsNil)
340+ s.assertGet(c, r, id)
341+}
342+
343+func (s *resourceCatalogSuite) TestRemoveOnlyRecord(c *gc.C) {
344+ id := s.assertPut(c, "cat", "/path/to/file", "1234567890")
345+ err := s.rCatalog.Remove(id)
346+ c.Assert(err, gc.IsNil)
347+ _, err = s.rCatalog.Get(id)
348+ c.Assert(err, gc.ErrorMatches, `resource with id ".*" not found`)
349+}
350+
351+func (s *resourceCatalogSuite) TestRemoveDecRefCount(c *gc.C) {
352+ id := s.assertPut(c, "cat", "/path/to/file", "1234567890")
353+ s.assertPut(c, "cat", "/path/to/file", "1234567890")
354+ s.assertRefCount(c, id, 2)
355+ err := s.rCatalog.Remove(id)
356+ c.Assert(err, gc.IsNil)
357+ s.assertRefCount(c, id, 1)
358+ // The record still exists.
359+ r := &storage.Resource{
360+ Category: "cat",
361+ Path: "/path/to/file",
362+ Hash: "1234567890",
363+ }
364+ s.assertGet(c, r, id)
365+}
366+
367+func (s *resourceCatalogSuite) TestRemoveLastCopy(c *gc.C) {
368+ id := s.assertPut(c, "cat", "/path/to/file", "1234567890")
369+ s.assertPut(c, "cat", "/path/to/file", "1234567890")
370+ s.assertRefCount(c, id, 2)
371+ err := s.rCatalog.Remove(id)
372+ c.Assert(err, gc.IsNil)
373+ s.assertRefCount(c, id, 1)
374+ err = s.rCatalog.Remove(id)
375+ c.Assert(err, gc.IsNil)
376+ _, err = s.rCatalog.Get(id)
377+ c.Assert(err, gc.ErrorMatches, `resource with id ".*" not found`)
378+}
379+
380+func (s *resourceCatalogSuite) TestRemoveNonExistent(c *gc.C) {
381+ err := s.rCatalog.Remove(bson.NewObjectId().Hex())
382+ c.Assert(err, gc.IsNil)
383+}

Subscribers

People subscribed via source and target branches

to status/vote changes: