Merge lp:~thisfred/u1db/doc-fixes into lp:u1db

Proposed by Eric Casteleijn
Status: Merged
Approved by: Eric Casteleijn
Approved revision: 394
Merged at revision: 391
Proposed branch: lp:~thisfred/u1db/doc-fixes
Merge into: lp:u1db
Diff against target: 139 lines (+43/-27)
2 files modified
html-docs/conflicts.rst (+41/-27)
html-docs/high-level-api.rst (+2/-0)
To merge this branch: bzr merge lp:~thisfred/u1db/doc-fixes
Reviewer: Stuart Langridge (community)
Review status: Approve
Review via email: mp+120825@code.launchpad.net

Commit message

-fixed and clarified documentation

Description of the change

-fixed and clarified documentation

lp:~thisfred/u1db/doc-fixes updated
392. By Eric Casteleijn

moar fixes

393. By Eric Casteleijn

s/can/should/

Revision history for this message
Stuart Langridge (sil) :
review: Approve
lp:~thisfred/u1db/doc-fixes updated
394. By Eric Casteleijn

footnotes sections

Preview Diff

=== modified file 'html-docs/conflicts.rst'
--- html-docs/conflicts.rst 2012-08-22 12:50:24 +0000
+++ html-docs/conflicts.rst 2012-08-22 17:05:21 +0000
@@ -87,8 +87,8 @@
 Synchronisation between two u1db replicas consists of the following steps:

 1. The source replica asks the target replica for the information it has
- stored about the last time these two replicas were synchronised (If
- ever.)
+ stored about the last time these two replicas were synchronised (if
+ ever).

 2. The source replica validates that its information regarding the last
 synchronisation is consistent with the target's information, and
@@ -131,15 +131,19 @@
 * The generation and transaction id of *this* replica at the time of the
 most recent succesfully completed synchronisation with the other replica.

-The generation is a counter that increases with each change to the database.
-The transaction id is a unique random string that is paired with a particular
-generation to identify cases where one of the replicas has been copied or
-reverted to an earlier state by a restore from backup, and then diverged from
-the known state on the other side of the synchronisation.
+Any change to any document in a database constitutes a transaction. Each
+transaction increases the database generation by 1, and u1db implementations
+should [#]_ assign a transaction id, which is meant to be a unique random string
+paired with each generation, that can be used to detect the case where replica
+A and replica B have previously synchronised at generation N, and subsequently
+replica B is somehow reverted to an earlier generation (say, a restore from
+backup, or somebody made a copy of the database file of replica B at generation
+< N, and tries to synchronise that), and then new changes are made to it. It
+could end up at generation N again, but with completely different data. Having
+random unique transaction ids will allow replica A to detect this situation,
+and refuse to synchronise to prevent data loss. (Lesson to be learned from
+this: do not copy databases around, that is what synchronisation is for.)

-Implementations are not required to use transaction ids. If they don't they
-should return an empty string when asked for a transaction id. All
-implementations should accept an empty string as a valid transaction id.

 Synchronisation Over HTTP
 -------------------------
@@ -193,8 +197,9 @@
 {"last_known_generation": 12, "last_known_trans_id": "T-39299sdsfla8"},\r\n

 and then for each document that it has changes for that are more recent
- than generation 23, it sends, on a single line, followed by a comma and
- a newline character, the following JSON object::
+ than generation 23, ordered by generation in ascending order, it sends,
+ on a single line, followed by a comma and a newline character, the
+ following JSON object::

 {"id": "mydocid", "rev": "my_replica_uid:4", "content": "{}", "generation": 48, "trans_id": "T-88djlahhhd"},\r\n

@@ -217,8 +222,8 @@
 which tells the source what the target's new generation and transaction
 id are, now that it processed the changes it received from the source.
 Then it starts streaming *its* changes since its last generation that
- was synced (12 in this case,) in exactly the same format as the source
- did in step 3.
+ was synced (12 in this case), in exactly the same format (and order) as
+ the source did in step 3.

 5. When the source has processed all the changes it received from the
 target, *and* it detects that there have been no changes to its database
@@ -245,34 +250,34 @@
 If you are writing a new u1db implementation, understanding revisions is
 important, and this is where you find out about them.

-To keep track of document revisions u1db uses vector versions. Each
+To keep track of document revisions u1db uses vector clocks. Each
 synchronised instance of the same database is called a replica and has a unique
 identifier (``replica uid``) assigned to it (currently the reference
 implementation by default uses UUID4s for that); a revision is a mapping
-between ``replica uids`` and ``generations``, as follows: ``rev
-= <replica_uid:generation...>``, or using a functional notation
-``rev(replica_uid) = generation``. The current concrete format is a string
+between ``replica uids`` and ``revisions``, as follows: ``rev
+= <replica_uid:revision...>``, or using a functional notation
+``rev(replica_uid) = revision``. The current concrete format is a string
 built out of each ``replica_uid`` concatenated with ``':'`` and with its
-generation in decimal, sorted lexicographically by ``replica_uid`` and then all
+revision in decimal, sorted lexicographically by ``replica_uid`` and then all
 joined with ``'|'``, for example: ``'replicaA:1|replicaB:3'`` . Absent
-``replica uids`` in a revision mapping are implicitly mapped to generation 0.
+``replica uids`` in a revision mapping are implicitly mapped to revison 0.

 The new revision of a document modified locally in a replica, is the
-modification of the old revision where the generation mapped to the editing
+modification of the old revision where the revision mapped to the editing
 ``replica uid`` is increased by 1.

 When syncing one needs to establish whether an incoming revision is newer than
 the current one or in conflict. A revision

-``rev1 = <replica_1i:generation1i|i=1..n>``
+``rev1 = <replica_1i:revision1i|i=1..n>``

 is newer than a different

-``rev2 = <replica_2j:generation2j|j=1..m>``
-
-if for all ``i=1..n``, ``rev2(replica_1i) <= generation1i``
-
-and for all ``j=1..m``, ``rev1(replica_2j) >= generation2j``.
+``rev2 = <replica_2j:revision2j|j=1..m>``
+
+if for all ``i=1..n``, ``rev2(replica_1i) <= revision1i``
+
+and for all ``j=1..m``, ``rev1(replica_2j) >= revision2j``.

 Two revisions which are not equal nor one newer than the other are in conflict.

@@ -285,3 +290,12 @@
 ``rev_resol(r) = max(rev1(r)...revN(r))`` for all ``r`` in ``R``, with ``r != rev_resol``

 ``rev_resol(replica_resol) = max(rev1(replica_resol)...revN(replica_resol))+1``
+
+.. rubric:: footnotes
+
+.. [#] Implementations are not required to use transaction ids. If they don't
+ they should return an empty string when asked for a transaction id. All
+ implementations should accept an empty string as a valid transaction id.
+ We suggest to implement transaction ids where possible though, since
+ omitting them can lead to data loss in scenarios like the ones described
+ above.

=== modified file 'html-docs/high-level-api.rst'
--- html-docs/high-level-api.rst 2012-08-21 18:46:01 +0000
+++ html-docs/high-level-api.rst 2012-08-22 17:05:21 +0000
@@ -381,6 +381,8 @@
 * :py:meth:`~u1db.Database.get_doc_conflicts`
 * :py:meth:`~u1db.Database.resolve_doc`

+.. rubric:: footnotes
+
 .. [#] Alternatively if a factory function was passed into
 :py:func:`u1db.open`, :py:meth:`~u1db.Database.get_doc` will return
 whatever type of object the factory function returns.
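
The revision rules described in the conflicts.rst hunks above (the ``'replicaA:1|replicaB:3'`` string format, the "newer than" test, conflict detection, and the resolution formula) can be sketched in Python roughly as follows. This is only an illustration of the documented rules, not the reference implementation; the function names ``parse_rev``, ``format_rev``, ``is_newer``, ``in_conflict``, ``increment`` and ``resolve`` are hypothetical and do not come from u1db.

```python
# Hypothetical helpers illustrating the documented revision rules.
# None of these names are part of the u1db API.

def parse_rev(rev):
    """Turn 'replicaA:1|replicaB:3' into {'replicaA': 1, 'replicaB': 3}."""
    mapping = {}
    if rev:
        for part in rev.split("|"):
            uid, _, counter = part.rpartition(":")
            mapping[uid] = int(counter)
    return mapping


def format_rev(mapping):
    """Serialise a mapping, sorted lexicographically by replica uid."""
    return "|".join("%s:%d" % (uid, mapping[uid]) for uid in sorted(mapping))


def is_newer(rev1, rev2):
    """True if rev1 differs from rev2 and is >= rev2 for every replica uid."""
    a, b = parse_rev(rev1), parse_rev(rev2)
    uids = set(a) | set(b)
    # Absent replica uids are implicitly mapped to 0.
    at_least = all(a.get(uid, 0) >= b.get(uid, 0) for uid in uids)
    different = any(a.get(uid, 0) != b.get(uid, 0) for uid in uids)
    return at_least and different


def in_conflict(rev1, rev2):
    """Two revisions conflict if they are not equal and neither is newer."""
    a, b = parse_rev(rev1), parse_rev(rev2)
    uids = set(a) | set(b)
    equal = all(a.get(uid, 0) == b.get(uid, 0) for uid in uids)
    return not equal and not is_newer(rev1, rev2) and not is_newer(rev2, rev1)


def increment(rev, replica_uid):
    """New revision after a local edit on replica_uid."""
    mapping = parse_rev(rev)
    mapping[replica_uid] = mapping.get(replica_uid, 0) + 1
    return format_rev(mapping)


def resolve(revs, replica_resol):
    """Element-wise maximum of revs, then bump the resolving replica by 1."""
    merged = {}
    for rev in revs:
        for uid, counter in parse_rev(rev).items():
            merged[uid] = max(merged.get(uid, 0), counter)
    merged[replica_resol] = merged.get(replica_resol, 0) + 1
    return format_rev(merged)
```

Note that the sketch reads the condition on the first resolution formula as "for all ``r`` other than ``replica_resol``", which is an interpretation of the ``r != rev_resol`` wording in the diff.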

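The transaction id check that the new conflicts.rst paragraph describes (replica A refusing to synchronise when replica B comes back at a known generation with different data) might look something like the sketch below. Everything here is an assumption made for illustration: the exception classes, ``new_transaction_id``, ``validate_sync_info``, and the idea of keeping the transaction log as a plain list are hypothetical, and the ``T-`` prefix simply mirrors the example ids shown in the diff.

```python
import uuid


class InvalidGeneration(Exception):
    """Hypothetical error: the peer remembers a generation we never reached."""


class InvalidTransactionId(Exception):
    """Hypothetical error: same generation, but a different transaction id."""


def new_transaction_id():
    # Assumed format: 'T-' plus a random hex string, one per generation.
    return "T-" + uuid.uuid4().hex


def validate_sync_info(known_generation, known_trans_id, transaction_log):
    """Check what a peer remembers about us against our own history.

    transaction_log is assumed to be a list of transaction ids, where
    transaction_log[n - 1] is the id assigned at generation n.
    """
    if known_generation == 0:
        return  # never synchronised before, nothing to validate
    if known_generation > len(transaction_log):
        # The peer saw a generation we do not have: we were probably
        # reverted to an earlier state (for example, restored from backup).
        raise InvalidGeneration(known_generation)
    our_trans_id = transaction_log[known_generation - 1]
    # An empty string is accepted as valid, for implementations that
    # do not use transaction ids.
    if known_trans_id == "" or our_trans_id == "":
        return
    if our_trans_id != known_trans_id:
        # Same generation number but different history: refuse to sync.
        raise InvalidTransactionId(known_trans_id)
```

With this shape, a replica that was rolled back and then edited again reaches the old generation number with a fresh random id, so the mismatch is detected rather than silently overwriting data.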