Merge lp:~spiv/bzr/insert-stream-check-chks-part-2 into lp:bzr/2.0
Status: Merged
Merged at revision: not available
Proposed branch: lp:~spiv/bzr/insert-stream-check-chks-part-2
Merge into: lp:bzr/2.0
Diff against target: None lines
To merge this branch: bzr merge lp:~spiv/bzr/insert-stream-check-chks-part-2
Related bugs:
Reviewer: Martin Pool (status: Needs Fixing)
Review via email: mp+11290@code.launchpad.net
Commit message
Description of the change
Andrew Bennetts (spiv) wrote:
Robert Collins (lifeless) wrote:
So a few things.
Firstly, check is trying to move away from raising a single error at the first sign of trouble; raising BzrCheckError in code that check uses takes that back a few steps. You could pass down a flag asking for an exception, or something. This occurs at least twice. I may be misreading, though - the code in question might not be used by check at all.
254 +        repo = self.make_repository('damaged-repo')
255 +        if not repo._format.supports_chks:
256 +            raise TestNotApplicable('requires repository with chk_bytes')
Please move these tests to per_repository_chks
Other than that it looks plausible to me.
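Robert's first point - returning a list of problems, with an optional flag for callers that want fail-fast behaviour - might be sketched like this (illustrative names only, not bzrlib's actual API):

```python
def find_missing_keys(expected, present):
    """Return problems as a list instead of raising at first trouble.

    Returning a list lets 'bzr check' accumulate every problem it finds;
    fail-fast callers can ask for an exception via the flag below.
    """
    return [('missing', key) for key in sorted(expected - present)]


def verify_keys(expected, present, raise_on_error=False):
    # The flag Robert suggests: commit_write_group-style callers can ask
    # for an exception, while check keeps collecting problems.
    problems = find_missing_keys(expected, present)
    if problems and raise_on_error:
        raise ValueError("missing keys: %r" % (problems,))
    return problems
```

With this shape, `verify_keys({('a',), ('b',)}, {('a',)})` simply returns the problem list, and only the `raise_on_error=True` path raises.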
Martin Pool (mbp) wrote:
It seems like you should have a news entry before you merge.
=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py	2009-09-04 07:39:08 +0000
+++ bzrlib/groupcompress.py	2009-09-07 05:52:04 +0000
@@ -1182,6 +1182,12 @@
+    def without_fallbacks(self):
OK, it's fairly obvious, but a docstring would still be good.
+        gcvf = GroupCompressVersionedFiles(
+            self._index, self._access, self._delta)
+        gcvf._unadded_refs = dict(self._unadded_refs)
+        return gcvf
+
It seems a bit potentially problematic to poke this in post
construction, if someone for instance changes the constructor so that it
does actually start opening things. How about instead adding a private
parameter to the constructor to pass this in?
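Martin's alternative might look roughly like this (a sketch only, not bzrlib's actual code; the real constructor takes more arguments, and the `_unadded_refs` parameter is assumed from the diff):

```python
class GroupCompressVersionedFiles(object):
    """Sketch of the suggested private-constructor-parameter approach."""

    def __init__(self, index, access, delta=True, _unadded_refs=None):
        # Private parameter per Martin's suggestion: state is handed over
        # at construction time rather than poked in afterwards, so the
        # clone stays valid even if __init__ later starts opening things.
        self._index = index
        self._access = access
        self._delta = delta
        self._unadded_refs = dict(_unadded_refs or {})
        self._fallback_vfs = []

    def without_fallbacks(self):
        """Return a copy of this object configured with no fallbacks."""
        return GroupCompressVersionedFiles(
            self._index, self._access, self._delta,
            _unadded_refs=self._unadded_refs)
```

The clone then gets its own copy of `_unadded_refs` and an empty fallback list without any post-construction mutation.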
def add_lines(self, key, parents, lines, parent_texts=None,
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py	2009-09-07 03:00:23 +0000
+++ bzrlib/repofmt/groupcompress_repo.py	2009-09-07 08:39:36 +0000
@@ -591,8 +591,6 @@
        :returns: set of missing keys. Note that not every missing key is
            guaranteed to be reported.
        """
-        if getattr(self.repo, 'chk_bytes', None) is None:
-            return set()
# Ensure that all revisions added in this write group have:
# - corresponding inventories,
# - chk root entries for those inventories,
@@ -603,6 +601,7 @@
+        no_fallback_texts_index = self.repo.texts._index
# Are any inventories for corresponding to the new revisions missing?
@@ -610,7 +609,9 @@
         if missing_corresponding:
-            return [('inventories', key) for key in missing_corresponding]
+            raise errors.BzrCheckError(
+                "Repository %s missing inventories for new revisions %r "
+                % (self.repo, sorted(missing_corresponding)))
As Robert says, it seems like a step backwards to be raising an
exception directly rather than returning a list of problems - is there a
reason why you have to?
# Are any chk root entries missing for any inventories? This includes
# any present parent inventories, which may be used when calculating
# deltas for streaming.
@@ -620,17 +621,57 @@
# Filter out ghost parents.
+            parent_invs_only_keys = all_inv_keys.symmetric_difference(
+                corresponding_invs)
inv_ids = [key[-1] for key in all_inv_keys]
...
Martin Pool (mbp) wrote:
That status is meant to mean 'tweak', but feel free to call me to talk about it.
Preview Diff
1 | === modified file 'bzrlib/groupcompress.py' |
2 | --- bzrlib/groupcompress.py 2009-09-04 07:39:08 +0000 |
3 | +++ bzrlib/groupcompress.py 2009-09-07 05:52:04 +0000 |
4 | @@ -1182,6 +1182,12 @@ |
5 | self._group_cache = LRUSizeCache(max_size=50*1024*1024) |
6 | self._fallback_vfs = [] |
7 | |
8 | + def without_fallbacks(self): |
9 | + gcvf = GroupCompressVersionedFiles( |
10 | + self._index, self._access, self._delta) |
11 | + gcvf._unadded_refs = dict(self._unadded_refs) |
12 | + return gcvf |
13 | + |
14 | def add_lines(self, key, parents, lines, parent_texts=None, |
15 | left_matching_blocks=None, nostore_sha=None, random_id=False, |
16 | check_content=True): |
17 | |
18 | === modified file 'bzrlib/repofmt/groupcompress_repo.py' |
19 | --- bzrlib/repofmt/groupcompress_repo.py 2009-09-07 03:00:23 +0000 |
20 | +++ bzrlib/repofmt/groupcompress_repo.py 2009-09-07 08:39:36 +0000 |
21 | @@ -591,8 +591,6 @@ |
22 | :returns: set of missing keys. Note that not every missing key is |
23 | guaranteed to be reported. |
24 | """ |
25 | - if getattr(self.repo, 'chk_bytes', None) is None: |
26 | - return set() |
27 | # Ensure that all revisions added in this write group have: |
28 | # - corresponding inventories, |
29 | # - chk root entries for those inventories, |
30 | @@ -603,6 +601,7 @@ |
31 | new_revisions_keys = key_deps.get_new_keys() |
32 | no_fallback_inv_index = self.repo.inventories._index |
33 | no_fallback_chk_bytes_index = self.repo.chk_bytes._index |
34 | + no_fallback_texts_index = self.repo.texts._index |
35 | inv_parent_map = no_fallback_inv_index.get_parent_map( |
36 | new_revisions_keys) |
37 | # Are any inventories for corresponding to the new revisions missing? |
38 | @@ -610,7 +609,9 @@ |
39 | missing_corresponding = set(new_revisions_keys) |
40 | missing_corresponding.difference_update(corresponding_invs) |
41 | if missing_corresponding: |
42 | - return [('inventories', key) for key in missing_corresponding] |
43 | + raise errors.BzrCheckError( |
44 | + "Repository %s missing inventories for new revisions %r " |
45 | + % (self.repo, sorted(missing_corresponding))) |
46 | # Are any chk root entries missing for any inventories? This includes |
47 | # any present parent inventories, which may be used when calculating |
48 | # deltas for streaming. |
49 | @@ -620,17 +621,57 @@ |
50 | # Filter out ghost parents. |
51 | all_inv_keys.intersection_update( |
52 | no_fallback_inv_index.get_parent_map(all_inv_keys)) |
53 | + parent_invs_only_keys = all_inv_keys.symmetric_difference( |
54 | + corresponding_invs) |
55 | all_missing = set() |
56 | inv_ids = [key[-1] for key in all_inv_keys] |
57 | - for inv in self.repo.iter_inventories(inv_ids, 'unordered'): |
58 | - root_keys = set([inv.id_to_entry.key()]) |
59 | - if inv.parent_id_basename_to_file_id is not None: |
60 | - root_keys.add(inv.parent_id_basename_to_file_id.key()) |
61 | - present = no_fallback_chk_bytes_index.get_parent_map(root_keys) |
62 | - missing = root_keys.difference(present) |
63 | - all_missing.update([('chk_bytes',) + key for key in missing]) |
64 | - return all_missing |
65 | - |
66 | + parent_invs_only_ids = [key[-1] for key in parent_invs_only_keys] |
67 | + root_key_info = _build_interesting_key_sets( |
68 | + self.repo, inv_ids, parent_invs_only_ids) |
69 | + expected_chk_roots = root_key_info.all_keys() |
70 | + present_chk_roots = no_fallback_chk_bytes_index.get_parent_map( |
71 | + expected_chk_roots) |
72 | + missing_chk_roots = expected_chk_roots.difference(present_chk_roots) |
73 | + if missing_chk_roots: |
74 | + # Don't bother checking any further. |
75 | + raise errors.BzrCheckError( |
76 | + "Repository %s missing chk root keys %r for new revisions" |
77 | + % (self.repo, sorted(missing_chk_roots))) |
78 | + # Find all interesting chk_bytes records, and make sure they are |
79 | + # present, as well as the text keys they reference. |
80 | + chk_bytes_no_fallbacks = self.repo.chk_bytes.without_fallbacks() |
81 | + chk_bytes_no_fallbacks._search_key_func = \ |
82 | + self.repo.chk_bytes._search_key_func |
83 | + chk_diff = chk_map.iter_interesting_nodes( |
84 | + chk_bytes_no_fallbacks, root_key_info.interesting_root_keys, |
85 | + root_key_info.uninteresting_root_keys) |
86 | + bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key |
87 | + text_keys = set() |
88 | + try: |
89 | + for record in _filter_text_keys(chk_diff, text_keys, bytes_to_info): |
90 | + pass |
91 | + except errors.NoSuchRevision, e: |
92 | + # XXX: It would be nice if we could give a more precise error here. |
93 | + raise errors.BzrCheckError( |
94 | + "Repository %s missing chk node(s) for new revisions." |
95 | + % (self.repo,)) |
96 | + chk_diff = chk_map.iter_interesting_nodes( |
97 | + chk_bytes_no_fallbacks, root_key_info.interesting_pid_root_keys, |
98 | + root_key_info.uninteresting_pid_root_keys) |
99 | + try: |
100 | + for interesting_rec, interesting_map in chk_diff: |
101 | + pass |
102 | + except errors.NoSuchRevision, e: |
103 | + raise errors.BzrCheckError( |
104 | + "Repository %s missing chk node(s) for new revisions." |
105 | + % (self.repo,)) |
106 | + present_text_keys = no_fallback_texts_index.get_parent_map(text_keys) |
107 | + missing_text_keys = text_keys.difference(present_text_keys) |
108 | + if missing_text_keys: |
109 | + raise errors.BzrCheckError( |
110 | + "Repository %s missing text keys %r for new revisions" |
111 | + % (self.repo, sorted(missing_text_keys))) |
112 | + |
113 | def _execute_pack_operations(self, pack_operations, |
114 | _packer_class=GCCHKPacker, |
115 | reload_func=None): |
116 | @@ -898,17 +939,12 @@ |
117 | parent_keys) |
118 | present_parent_inv_ids = set( |
119 | [k[-1] for k in present_parent_inv_keys]) |
120 | - uninteresting_root_keys = set() |
121 | - interesting_root_keys = set() |
122 | inventories_to_read = set(revision_ids) |
123 | inventories_to_read.update(present_parent_inv_ids) |
124 | - for inv in self.iter_inventories(inventories_to_read): |
125 | - entry_chk_root_key = inv.id_to_entry.key() |
126 | - if inv.revision_id in present_parent_inv_ids: |
127 | - uninteresting_root_keys.add(entry_chk_root_key) |
128 | - else: |
129 | - interesting_root_keys.add(entry_chk_root_key) |
130 | - |
131 | + root_key_info = _build_interesting_key_sets( |
132 | + self, inventories_to_read, present_parent_inv_ids) |
133 | + interesting_root_keys = root_key_info.interesting_root_keys |
134 | + uninteresting_root_keys = root_key_info.uninteresting_root_keys |
135 | chk_bytes = self.chk_bytes |
136 | for record, items in chk_map.iter_interesting_nodes(chk_bytes, |
137 | interesting_root_keys, uninteresting_root_keys, |
138 | @@ -1048,13 +1084,10 @@ |
139 | bytes_to_info = inventory.CHKInventory._bytes_to_utf8name_key |
140 | chk_bytes = self.from_repository.chk_bytes |
141 | def _filter_id_to_entry(): |
142 | - for record, items in chk_map.iter_interesting_nodes(chk_bytes, |
143 | - self._chk_id_roots, uninteresting_root_keys): |
144 | - for name, bytes in items: |
145 | - # Note: we don't care about name_utf8, because we are always |
146 | - # rich-root = True |
147 | - _, file_id, revision_id = bytes_to_info(bytes) |
148 | - self._text_keys.add((file_id, revision_id)) |
149 | + interesting_nodes = chk_map.iter_interesting_nodes(chk_bytes, |
150 | + self._chk_id_roots, uninteresting_root_keys) |
151 | + for record in _filter_text_keys(interesting_nodes, self._text_keys, |
152 | + bytes_to_info): |
153 | if record is not None: |
154 | yield record |
155 | # Consumed |
156 | @@ -1098,7 +1131,7 @@ |
157 | missing_inventory_keys.add(key[1:]) |
158 | if self._chk_id_roots or self._chk_p_id_roots: |
159 | raise AssertionError('Cannot call get_stream_for_missing_keys' |
160 | - ' untill all of get_stream() has been consumed.') |
161 | + ' until all of get_stream() has been consumed.') |
162 | # Yield the inventory stream, so we can find the chk stream |
163 | # Some of the missing_keys will be missing because they are ghosts. |
164 | # As such, we can ignore them. The Sink is required to verify there are |
165 | @@ -1111,6 +1144,54 @@ |
166 | yield stream_info |
167 | |
168 | |
169 | +class _InterestingKeyInfo(object): |
170 | + def __init__(self): |
171 | + self.interesting_root_keys = set() |
172 | + self.interesting_pid_root_keys = set() |
173 | + self.uninteresting_root_keys = set() |
174 | + self.uninteresting_pid_root_keys = set() |
175 | + |
176 | + def all_interesting(self): |
177 | + return self.interesting_root_keys.union(self.interesting_pid_root_keys) |
178 | + |
179 | + def all_uninteresting(self): |
180 | + return self.uninteresting_root_keys.union( |
181 | + self.uninteresting_pid_root_keys) |
182 | + |
183 | + def all_keys(self): |
184 | + return self.all_interesting().union(self.all_uninteresting()) |
185 | + |
186 | + |
187 | +def _build_interesting_key_sets(repo, inventory_ids, parent_only_inv_ids): |
188 | + result = _InterestingKeyInfo() |
189 | + for inv in repo.iter_inventories(inventory_ids, 'unordered'): |
190 | + root_key = inv.id_to_entry.key() |
191 | + pid_root_key = inv.parent_id_basename_to_file_id.key() |
192 | + if inv.revision_id in parent_only_inv_ids: |
193 | + result.uninteresting_root_keys.add(root_key) |
194 | + result.uninteresting_pid_root_keys.add(pid_root_key) |
195 | + else: |
196 | + result.interesting_root_keys.add(root_key) |
197 | + result.interesting_pid_root_keys.add(pid_root_key) |
198 | + return result |
199 | + |
200 | + |
201 | +def _filter_text_keys(interesting_nodes_iterable, text_keys, bytes_to_info): |
202 | + """Iterate the result of iter_interesting_nodes, yielding the records |
203 | + and adding to text_keys. |
204 | + """ |
205 | + for record, items in interesting_nodes_iterable: |
206 | + for name, bytes in items: |
207 | + # Note: we don't care about name_utf8, because groupcompress repos |
208 | + # are always rich-root, so there are no synthesised root records to |
209 | + # ignore. |
210 | + _, file_id, revision_id = bytes_to_info(bytes) |
211 | + text_keys.add((file_id, revision_id)) |
212 | + yield record |
213 | + |
214 | + |
215 | + |
216 | + |
217 | class RepositoryFormatCHK1(RepositoryFormatPack): |
218 | """A hashed CHK+group compress pack repository.""" |
219 | |
220 | |
221 | === modified file 'bzrlib/repofmt/pack_repo.py' |
222 | --- bzrlib/repofmt/pack_repo.py 2009-09-07 03:00:23 +0000 |
223 | +++ bzrlib/repofmt/pack_repo.py 2009-09-07 06:01:48 +0000 |
224 | @@ -2071,7 +2071,7 @@ |
225 | """ |
226 | # The base implementation does no checks. GCRepositoryPackCollection |
227 | # overrides this. |
228 | - return set() |
229 | + pass |
230 | |
231 | def _commit_write_group(self): |
232 | all_missing = set() |
233 | @@ -2087,11 +2087,7 @@ |
234 | raise errors.BzrCheckError( |
235 | "Repository %s has missing compression parent(s) %r " |
236 | % (self.repo, sorted(all_missing))) |
237 | - all_missing = self._check_new_inventories() |
238 | - if all_missing: |
239 | - raise errors.BzrCheckError( |
240 | - "Repository %s missing keys for new revisions %r " |
241 | - % (self.repo, sorted(all_missing))) |
242 | + self._check_new_inventories() |
243 | self._remove_pack_indices(self._new_pack) |
244 | any_new_content = False |
245 | if self._new_pack.data_inserted(): |
246 | |
247 | === modified file 'bzrlib/tests/per_repository/test_write_group.py' |
248 | --- bzrlib/tests/per_repository/test_write_group.py 2009-09-02 03:07:23 +0000 |
249 | +++ bzrlib/tests/per_repository/test_write_group.py 2009-09-07 08:06:25 +0000 |
250 | @@ -365,16 +365,16 @@ |
251 | """commit_write_group fails with BzrCheckError when the chk root record |
252 | for a new inventory is missing. |
253 | """ |
254 | + repo = self.make_repository('damaged-repo') |
255 | + if not repo._format.supports_chks: |
256 | + raise TestNotApplicable('requires repository with chk_bytes') |
257 | builder = self.make_branch_builder('simple-branch') |
258 | builder.build_snapshot('A-id', None, [ |
259 | ('add', ('', 'root-id', 'directory', None)), |
260 | ('add', ('file', 'file-id', 'file', 'content\n'))]) |
261 | b = builder.get_branch() |
262 | - if not b.repository._format.supports_chks: |
263 | - raise TestNotApplicable('requires repository with chk_bytes') |
264 | b.lock_read() |
265 | self.addCleanup(b.unlock) |
266 | - repo = self.make_repository('damaged-repo') |
267 | repo.lock_write() |
268 | repo.start_write_group() |
269 | # Now, add the objects manually |
270 | @@ -411,6 +411,9 @@ |
271 | (In principle the chk records are unnecessary in this case, but in |
272 | practice bzr 2.0rc1 (at least) expects to find them.) |
273 | """ |
274 | + repo = self.make_repository('damaged-repo') |
275 | + if not repo._format.supports_chks: |
276 | + raise TestNotApplicable('requires repository with chk_bytes') |
277 | # Make a branch where the last two revisions have identical |
278 | # inventories. |
279 | builder = self.make_branch_builder('simple-branch') |
280 | @@ -420,8 +423,6 @@ |
281 | builder.build_snapshot('B-id', None, []) |
282 | builder.build_snapshot('C-id', None, []) |
283 | b = builder.get_branch() |
284 | - if not b.repository._format.supports_chks: |
285 | - raise TestNotApplicable('requires repository with chk_bytes') |
286 | b.lock_read() |
287 | self.addCleanup(b.unlock) |
288 | # check our setup: B-id and C-id should have identical chk root keys. |
289 | @@ -433,10 +434,71 @@ |
290 | # We need ('revisions', 'C-id'), ('inventories', 'C-id'), |
291 | # ('inventories', 'B-id'), and the corresponding chk roots for those |
292 | # inventories. |
293 | + repo.lock_write() |
294 | + repo.start_write_group() |
295 | + src_repo = b.repository |
296 | + repo.inventories.insert_record_stream( |
297 | + src_repo.inventories.get_record_stream( |
298 | + [('B-id',), ('C-id',)], 'unordered', True)) |
299 | + repo.revisions.insert_record_stream( |
300 | + src_repo.revisions.get_record_stream( |
301 | + [('C-id',)], 'unordered', True)) |
302 | + # Make sure the presence of the missing data in a fallback does not |
303 | + # avoid the error. |
304 | + repo.add_fallback_repository(b.repository) |
305 | + self.assertRaises(errors.BzrCheckError, repo.commit_write_group) |
306 | + reopened_repo = self.reopen_repo_and_resume_write_group(repo) |
307 | + self.assertRaises( |
308 | + errors.BzrCheckError, reopened_repo.commit_write_group) |
309 | + reopened_repo.abort_write_group() |
310 | + |
311 | + def test_missing_chk_leaf_for_inventory(self): |
312 | + """commit_write_group fails with BzrCheckError when the chk root record |
313 | + for a parent inventory of a new revision is missing. |
314 | + """ |
315 | repo = self.make_repository('damaged-repo') |
316 | + if not repo._format.supports_chks: |
317 | + raise TestNotApplicable('requires repository with chk_bytes') |
318 | + builder = self.make_branch_builder('simple-branch') |
319 | + # add and modify files with very long file-ids, so that the chk map |
320 | + # will need more than just a root node. |
321 | + file_adds = [] |
322 | + file_modifies = [] |
323 | + for char in 'abc': |
324 | + name = char * 10000 |
325 | + file_adds.append( |
326 | + ('add', ('file-' + name, 'file-%s-id' % name, 'file', |
327 | + 'content %s\n' % name))) |
328 | + file_modifies.append( |
329 | + ('modify', ('file-%s-id' % name, 'new content %s\n' % name))) |
330 | + builder.build_snapshot('A-id', None, [ |
331 | + ('add', ('', 'root-id', 'directory', None))] + |
332 | + file_adds) |
333 | + builder.build_snapshot('B-id', None, []) |
334 | + builder.build_snapshot('C-id', None, file_modifies) |
335 | + b = builder.get_branch() |
336 | + src_repo = b.repository |
337 | + src_repo.lock_read() |
338 | + self.addCleanup(src_repo.unlock) |
339 | + # Now, manually insert objects for a stacked repo with only revision |
340 | + # C-id, *except* drop the non-root chk records. |
341 | + inv_b = src_repo.get_inventory('B-id') |
342 | + inv_c = src_repo.get_inventory('C-id') |
343 | + chk_root_keys_only = [ |
344 | + inv_b.id_to_entry.key(), inv_b.parent_id_basename_to_file_id.key(), |
345 | + inv_c.id_to_entry.key(), inv_c.parent_id_basename_to_file_id.key()] |
346 | + all_chks = src_repo.chk_bytes.keys() |
347 | + # Pick a non-root key to drop |
348 | + key_to_drop = all_chks.difference(chk_root_keys_only).pop() |
349 | + all_chks.discard(key_to_drop) |
350 | repo.lock_write() |
351 | repo.start_write_group() |
352 | - src_repo = b.repository |
353 | + repo.chk_bytes.insert_record_stream( |
354 | + src_repo.chk_bytes.get_record_stream( |
355 | + all_chks, 'unordered', True)) |
356 | + repo.texts.insert_record_stream( |
357 | + src_repo.texts.get_record_stream( |
358 | + src_repo.texts.keys(), 'unordered', True)) |
359 | repo.inventories.insert_record_stream( |
360 | src_repo.inventories.get_record_stream( |
361 | [('B-id',), ('C-id',)], 'unordered', True)) |
362 | @@ -456,16 +518,10 @@ |
363 | """commit_write_group fails with BzrCheckError when the chk root record |
364 | for a parent inventory of a new revision is missing. |
365 | """ |
366 | - builder = self.make_branch_builder('simple-branch') |
367 | - builder.build_snapshot('A-id', None, [ |
368 | - ('add', ('', 'root-id', 'directory', None)), |
369 | - ('add', ('file', 'file-id', 'file', 'content\n'))]) |
370 | - builder.build_snapshot('B-id', None, []) |
371 | - builder.build_snapshot('C-id', None, [ |
372 | - ('modify', ('file-id', 'new-content'))]) |
373 | - b = builder.get_branch() |
374 | - if not b.repository._format.supports_chks: |
375 | + repo = self.make_repository('damaged-repo') |
376 | + if not repo._format.supports_chks: |
377 | raise TestNotApplicable('requires repository with chk_bytes') |
378 | + b = self.make_branch_with_multiple_chk_nodes() |
379 | b.lock_read() |
380 | self.addCleanup(b.unlock) |
381 | # Now, manually insert objects for a stacked repo with only revision |
382 | @@ -476,7 +532,6 @@ |
383 | inv_c = b.repository.get_inventory('C-id') |
384 | chk_keys_for_c_only = [ |
385 | inv_c.id_to_entry.key(), inv_c.parent_id_basename_to_file_id.key()] |
386 | - repo = self.make_repository('damaged-repo') |
387 | repo.lock_write() |
388 | repo.start_write_group() |
389 | src_repo = b.repository |
390 | @@ -498,6 +553,63 @@ |
391 | errors.BzrCheckError, reopened_repo.commit_write_group) |
392 | reopened_repo.abort_write_group() |
393 | |
394 | + def make_branch_with_multiple_chk_nodes(self): |
395 | + # add and modify files with very long file-ids, so that the chk map |
396 | + # will need more than just a root node. |
397 | + builder = self.make_branch_builder('simple-branch') |
398 | + file_adds = [] |
399 | + file_modifies = [] |
400 | + for char in 'abc': |
401 | + name = char * 10000 |
402 | + file_adds.append( |
403 | + ('add', ('file-' + name, 'file-%s-id' % name, 'file', |
404 | + 'content %s\n' % name))) |
405 | + file_modifies.append( |
406 | + ('modify', ('file-%s-id' % name, 'new content %s\n' % name))) |
407 | + builder.build_snapshot('A-id', None, [ |
408 | + ('add', ('', 'root-id', 'directory', None))] + |
409 | + file_adds) |
410 | + builder.build_snapshot('B-id', None, []) |
411 | + builder.build_snapshot('C-id', None, file_modifies) |
412 | + return builder.get_branch() |
413 | + |
414 | + def test_missing_text_record(self): |
415 | + """commit_write_group fails with BzrCheckError when a text is missing. |
416 | + """ |
417 | + repo = self.make_repository('damaged-repo') |
418 | + if not repo._format.supports_chks: |
419 | + raise TestNotApplicable('requires repository with chk_bytes') |
420 | + b = self.make_branch_with_multiple_chk_nodes() |
421 | + src_repo = b.repository |
422 | + src_repo.lock_read() |
423 | + self.addCleanup(src_repo.unlock) |
424 | + # Now, manually insert objects for a stacked repo with only revision |
425 | + # C-id, *except* drop one changed text. |
426 | + all_texts = src_repo.texts.keys() |
427 | + all_texts.remove(('file-%s-id' % ('c'*10000,), 'C-id')) |
428 | + repo.lock_write() |
429 | + repo.start_write_group() |
430 | + repo.chk_bytes.insert_record_stream( |
431 | + src_repo.chk_bytes.get_record_stream( |
432 | + src_repo.chk_bytes.keys(), 'unordered', True)) |
433 | + repo.texts.insert_record_stream( |
434 | + src_repo.texts.get_record_stream( |
435 | + all_texts, 'unordered', True)) |
436 | + repo.inventories.insert_record_stream( |
437 | + src_repo.inventories.get_record_stream( |
438 | + [('B-id',), ('C-id',)], 'unordered', True)) |
439 | + repo.revisions.insert_record_stream( |
440 | + src_repo.revisions.get_record_stream( |
441 | + [('C-id',)], 'unordered', True)) |
442 | + # Make sure the presence of the missing data in a fallback does not |
443 | + # avoid the error. |
444 | + repo.add_fallback_repository(b.repository) |
445 | + self.assertRaises(errors.BzrCheckError, repo.commit_write_group) |
446 | + reopened_repo = self.reopen_repo_and_resume_write_group(repo) |
447 | + self.assertRaises( |
448 | + errors.BzrCheckError, reopened_repo.commit_write_group) |
449 | + reopened_repo.abort_write_group() |
450 | + |
451 | |
452 | class TestResumeableWriteGroup(TestCaseWithRepository): |
453 |
This completes the fix for bug 406687.
We could go much further, I think, in terms of reusing code. I've taken a nibble at that in this branch, but there's room to do much, much better. We now have three code paths in groupcompress_repo.py that are doing essentially the same logic of finding text references from a set of revisions:
- GroupCHKStreamSource
- fileids_altered_by_revision_ids
- _check_new_inventories (the new checks in this patch)
However, I think the incremental improvement in this patch is landable without tackling all the duplication at once. (But I'm happy to do any more low-hanging fruit that a reviewer considers appropriate.)
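The shared logic those three paths repeat could eventually be factored into one helper along these lines (purely illustrative; the real code walks CHK maps via chk_map.iter_interesting_nodes, and `bytes_to_key` here stands in for CHKInventory._bytes_to_utf8name_key):

```python
def iter_text_keys(leaf_items, bytes_to_key):
    """Yield (file_id, revision_id) text keys from CHK leaf items.

    A single shared generator like this could serve all three callers,
    instead of each one reimplementing the extraction loop.
    """
    for name, value in leaf_items:
        # bytes_to_key is assumed to return (name_utf8, file_id,
        # revision_id); we only care about the text key itself.
        _, file_id, revision_id = bytes_to_key(value)
        yield (file_id, revision_id)
```

Each caller would then only differ in how it obtains the leaf items and what it does with the resulting text keys.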