Merge lp:~pbeaman/akiban-persistit/fix_1017957_stress_tests_corruption into lp:akiban-persistit

Proposed by Peter Beaman
Status: Merged
Approved by: Peter Beaman
Approved revision: 332
Merged at revision: 329
Proposed branch: lp:~pbeaman/akiban-persistit/fix_1017957_stress_tests_corruption
Merge into: lp:akiban-persistit
Diff against target: 1021 lines (+436/-148)
12 files modified
src/main/java/com/persistit/Buffer.java (+42/-47)
src/main/java/com/persistit/BufferPool.java (+1/-0)
src/main/java/com/persistit/CLI.java (+42/-14)
src/main/java/com/persistit/Exchange.java (+41/-75)
src/main/java/com/persistit/IntegrityCheck.java (+16/-9)
src/main/java/com/persistit/VolumeStorageV2.java (+1/-0)
src/main/java/com/persistit/exception/RetryException.java (+0/-3)
src/main/java/com/persistit/util/Debug.java (+26/-0)
src/main/java/com/persistit/util/SequencerConstants.java (+1/-0)
src/main/java/com/persistit/util/Util.java (+8/-0)
src/test/java/com/persistit/Bug1017957Test.java (+235/-0)
src/test/java/com/persistit/IntegrityCheckTest.java (+23/-0)
To merge this branch: bzr merge lp:~pbeaman/akiban-persistit/fix_1017957_stress_tests_corruption
Reviewer Review Type Date Requested Status
Nathan Williams Approve
Review via email: mp+112647@code.launchpad.net

Description of the change

Fix bug 1017957 - B-Tree corruption occasionally detected in stress tests.

This branch contains a new test, Bug1018526Test, that demonstrates the bug mechanism we thing is responsible for the failures.

Also included: incremental corrections in IntegrityCheck discovered when it failed to report the corrupt garbage chain correctly.

To post a comment you must log in.
330. By Peter Beaman

Version fixes corruption bug

331. By Peter Beaman

Cleanup

332. By Peter Beaman

Cleanup

333. By Peter Beaman

Cleanup

Revision history for this message
Peter Beaman (pbeaman) wrote :

Turns out there were two problems in the Exchange class that contributed to this bug, making diagnosis somewhat difficult.

This branch fixes both:

(a) There is no longer a concept of "deferred reindexing" in raw_removeKeyRangeInternal. That was a relic of the fix-length write ahead journal of the past. One of the bug mechanisms used that code path to reinsert a page pointer to a page that had changed due to releasing all locks.

(b) there was a code path that failed to bump the tree generation counter upon completing a structure delete.

To support diagnosing this bug, I added better formatting and a new, useful CLI command that walks a pointer chain and displays selected information about a page. Some of these changes are located in Buffer#toStringDetail which is now simpler and more useful.

Also, while diagnosing this bug I came across Exceptions while reporting garbage chain structure errors in IntegrityCheck. Fixed that problem and wrote a unit test to verify it.

And of course the Eclipse random line-length formatter made its usual contribution...

Revision history for this message
Nathan Williams (nwilliams) wrote :

The fix looks reasonable and seems to match the explanation.

I only skimmed the Buffer toString or IntegrityCheck changes but they look fine.

Please change the first sentence of the merge prop to something descriptive (it is used for the commit message) and feel free to Approve.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/main/java/com/persistit/Buffer.java'
2--- src/main/java/com/persistit/Buffer.java 2012-06-25 15:41:14 +0000
3+++ src/main/java/com/persistit/Buffer.java 2012-06-29 21:06:19 +0000
4@@ -28,7 +28,6 @@
5
6 import java.nio.ByteBuffer;
7 import java.util.ArrayList;
8-import java.util.BitSet;
9 import java.util.List;
10 import java.util.Set;
11
12@@ -269,7 +268,7 @@
13 public final static int MAX_KEY_RATIO = 16;
14
15 private final static int BINARY_SEARCH_THRESHOLD = 6;
16-
17+
18 abstract static class VerifyVisitor {
19
20 protected void visitPage(long timestamp, Volume volume, long page, int type, int bufferSize, int keyBlockStart,
21@@ -330,7 +329,7 @@
22 * _byteBuffer.
23 */
24 private byte[] _bytes;
25-
26+
27 /**
28 * FastIndex structure used for rapid page searching
29 */
30@@ -2318,6 +2317,7 @@
31 assertVerify();
32 rightSibling.assertVerify();
33 }
34+
35 return whereInserted;
36 }
37
38@@ -2594,6 +2594,7 @@
39 buffer.assertVerify();
40 }
41 }
42+
43 return result;
44 }
45
46@@ -2815,7 +2816,7 @@
47 }
48
49 synchronized void invalidateFastIndex() {
50- _fastIndex.invalidate();
51+ _fastIndex.invalidate();
52 }
53
54 synchronized FastIndex getFastIndex() {
55@@ -3775,7 +3776,7 @@
56
57 public String toString() {
58 if (_toStringDebug) {
59- return toStringDetail(-1);
60+ return toStringDetail();
61 }
62 return String.format("Page %,d in volume %s at index %,d timestamp=%,d status=%s type=%s", _page, _vol,
63 _poolIndex, _timestamp, getStatusDisplayString(), getPageTypeName());
64@@ -3785,7 +3786,7 @@
65 * @return a human-readable inventory of the contents of this buffer
66 */
67 public String toStringDetail() {
68- return toStringDetail(-1);
69+ return toStringDetail(-1, 42, 42, 0, true);
70 }
71
72 /**
73@@ -3795,7 +3796,8 @@
74 * the one that points to findPointer. This provides a way to
75 * quickly find pointer paths in pages.
76 */
77- String toStringDetail(final long findPointer) {
78+ String toStringDetail(final long findPointer, final int maxKeyDisplayLength, final int maxValueDisplayLength,
79+ final int contextLines, final boolean all) {
80 final StringBuilder sb = new StringBuilder(String.format(
81 "Page %,d in volume %s at index @%,d status %s type %s", _page, _vol, _poolIndex,
82 getStatusDisplayString(), getPageTypeName()));
83@@ -3805,52 +3807,53 @@
84 sb.append(String.format("\n type=%,d alloc=%,d slack=%,d " + "keyBlockStart=%,d keyBlockEnd=%,d "
85 + "timestamp=%,d generation=%,d right=%,d hash=%,d", _type, _alloc, _slack, KEY_BLOCK_START,
86 _keyBlockEnd, getTimestamp(), getGeneration(), getRightSibling(), _pool.hashIndex(_vol, _page)));
87-
88 try {
89 final Key key = new Key(_persistit);
90 final Value value = new Value(_persistit);
91 final RecordInfo[] records = getRecords();
92- BitSet bits = new BitSet();
93+ int foundPointerRecord = -1;
94 if (isIndexPage() && findPointer >= 0) {
95 for (int index = 0; index < records.length; index++) {
96 if (records[index].getPointerValue() == findPointer) {
97- bits.set(index);
98+ foundPointerRecord = index;
99 }
100 }
101 }
102- boolean elisionMarked = false;
103+ boolean elision = false;
104 for (int index = 0; index < records.length; index++) {
105 RecordInfo r = records[index];
106 r.getKeyState().copyTo(key);
107- if (isDataPage()) {
108- r.getValueState().copyTo(value);
109- sb.append(String.format("\n %5d: db=%3d ebc=%3d tb=%,5d [%,d]%s=[%,d]%s", r.getKbOffset(), r
110- .getDb(), r.getEbc(), r.getTbOffset(), r.getKLength(), key, r.getValueState()
111- .getEncodedBytes().length, abridge(value)));
112+ String mark = " ";
113+ boolean selected = all | index < contextLines || index >= records.length - contextLines;
114+ if (foundPointerRecord >= 0) {
115+ selected |= (index >= foundPointerRecord - contextLines)
116+ && (index <= foundPointerRecord + contextLines);
117+ if (index == foundPointerRecord) {
118+ mark = "*";
119+ }
120+ }
121+
122+ if (selected) {
123+ if (elision) {
124+ sb.append(String.format("\n ..."));
125+ elision = false;
126+ }
127+ if (isDataPage()) {
128+ r.getValueState().copyTo(value);
129+ sb.append(String.format("\n%s %5d: db=%3d ebc=%3d tb=%,5d [%,d]%s=[%,d]%s", mark, r
130+ .getKbOffset(), r.getDb(), r.getEbc(), r.getTbOffset(), r.getKLength(), Util
131+ .abridge(key.toString(), maxKeyDisplayLength),
132+ r.getValueState().getEncodedBytes().length, Util.abridge(value.toString(),
133+ maxValueDisplayLength)));
134+ } else {
135+ sb.append(String.format("\n%s %5d: db=%3d ebc=%3d tb=%,5d [%,d]%s->%,d", mark, r
136+ .getKbOffset(), r.getDb(), r.getEbc(), r.getTbOffset(), r.getKLength(), Util
137+ .abridge(key.toString(), maxKeyDisplayLength), r.getPointerValue()));
138+ }
139 } else {
140- boolean selected = true;
141- if (findPointer >= 0) {
142- if (index > 2 && index < records.length - 2) {
143- boolean bit = false;
144- for (int p = index - 2; p < index + 3; p++) {
145- if (bits.get(p)) {
146- bit = true;
147- }
148- }
149- selected &= bit;
150- }
151- }
152- if (selected) {
153- sb.append(String.format("\n %5d: db=%3d ebc=%3d tb=%,5d [%,d]%s->%,d", r.getKbOffset(), r
154- .getDb(), r.getEbc(), r.getTbOffset(), r.getKLength(), key, r.getPointerValue()));
155- elisionMarked = false;
156- } else {
157- if (!elisionMarked) {
158- sb.append(String.format("\n ..."));
159- elisionMarked = true;
160- }
161- }
162+ elision = true;
163 }
164+
165 }
166 } catch (Exception e) {
167 sb.append(" - " + e);
168@@ -3875,15 +3878,6 @@
169 return sb.toString();
170 }
171
172- private String abridge(final Value value) {
173- String s = value.toString();
174- if (s.length() > 120) {
175- return s.substring(0, 120) + "...";
176- } else {
177- return s;
178- }
179- }
180-
181 String foundAtString(int p) {
182 StringBuilder sb = new StringBuilder("<");
183 sb.append(p & P_MASK);
184@@ -4316,4 +4310,5 @@
185 ++_mvvCount;
186 }
187 }
188+
189 }
190
191=== modified file 'src/main/java/com/persistit/BufferPool.java'
192--- src/main/java/com/persistit/BufferPool.java 2012-06-14 17:55:08 +0000
193+++ src/main/java/com/persistit/BufferPool.java 2012-06-29 21:06:19 +0000
194@@ -1054,6 +1054,7 @@
195 }
196
197 int selectDirtyBuffers(final int[] priorities, final BufferHolder[] holders) throws PersistitException {
198+ Debug.suspend();
199 int count = 0;
200 final int clock = _clock.get();
201
202
203=== modified file 'src/main/java/com/persistit/CLI.java'
204--- src/main/java/com/persistit/CLI.java 2012-06-14 20:15:48 +0000
205+++ src/main/java/com/persistit/CLI.java 2012-06-29 21:06:19 +0000
206@@ -1021,6 +1021,9 @@
207 final @Arg("pageSize|int:16384:1024:16384|Buffer pool index") int pageSize,
208 final @Arg("level|int:0:0:30|Tree level") int level, final @Arg("key|string|Key") String keyString,
209 final @Arg("find|long:-1:0:99999999999999999|Optional page pointer to find") long findPointer,
210+ final @Arg("maxkey|int:42:4:10000|Maximum displayed key length") int maxkey,
211+ final @Arg("maxvalue|int:42:4:100000|Maximum displayed value length") int maxvalue,
212+ final @Arg("context|int:3:0:100000|Context lines") int context,
213 final @Arg("_flag|a|All lines") boolean allLines, final @Arg("_flag|s|Summary only") boolean summary)
214 throws Exception {
215
216@@ -1084,21 +1087,8 @@
217 if (summary) {
218 postMessage(buffer.toString(), LOG_NORMAL);
219 return;
220- }
221- String detail = buffer.toStringDetail(findPointer);
222- if (allLines) {
223- postMessage(detail, LOG_NORMAL);
224- return;
225 } else {
226- int p = -1;
227- for (int i = 0; i < 20; i++) {
228- p = detail.indexOf('\n', p + 1);
229- if (p == -1) {
230- p = detail.length();
231- break;
232- }
233- }
234- postMessage(detail.substring(0, p), LOG_NORMAL);
235+ postMessage(buffer.toStringDetail(findPointer, maxkey, maxvalue, context, allLines), LOG_NORMAL);
236 return;
237 }
238 }
239@@ -1108,6 +1098,44 @@
240 }
241 };
242 }
243+
244+ @Cmd("pviewchain")
245+ Task pviewchain(final @Arg("page|long:0:0:99999999999999999|Starting page address") long pageAddress,
246+ final @Arg("find|long:-1:0:99999999999999999|Optional page pointer to find") long findPointer,
247+ final @Arg("count|int:32:1:1000000|Maximum number of pages to display") long maxcount,
248+ final @Arg("maxkey|int:42:4:10000|Maximum displayed key length") int maxkey,
249+ final @Arg("maxvalue|int:42:4:100000|Maximum displayed value length") int maxvalue,
250+ final @Arg("context|int:3:0:100000|Context lines") int context,
251+ final @Arg("_flag|a|All lines") boolean allLines, final @Arg("_flag|s|Summary only") boolean summary) {
252+
253+ return new Task() {
254+
255+ @Override
256+ protected void runTask() throws Exception {
257+ long currentPage = pageAddress;
258+ int count = 0;
259+ while (currentPage > 0 && count++ < maxcount) {
260+ if (_currentVolume == null) {
261+ postMessage("Select a volume", LOG_NORMAL);
262+ return;
263+ }
264+ final Buffer buffer = _currentVolume.getPool().getBufferCopy(_currentVolume, currentPage);
265+ if (summary) {
266+ postMessage(buffer.toString(), LOG_NORMAL);
267+ } else {
268+ postMessage(buffer.toStringDetail(findPointer, maxkey, maxvalue, context, allLines), LOG_NORMAL);
269+ }
270+ currentPage = buffer.getRightSibling();
271+ }
272+ }
273+
274+ @Override
275+ public String getStatus() {
276+ return "";
277+ }
278+
279+ };
280+ }
281
282 @Cmd("jquery")
283 Task jquery(final @Arg("page|long:-1|Page address for PageNode to look up") long pageAddress,
284
285=== modified file 'src/main/java/com/persistit/Exchange.java'
286--- src/main/java/com/persistit/Exchange.java 2012-06-18 16:35:52 +0000
287+++ src/main/java/com/persistit/Exchange.java 2012-06-29 21:06:19 +0000
288@@ -38,7 +38,7 @@
289 import static com.persistit.Key.LTEQ;
290 import static com.persistit.Key.RIGHT_GUARD_KEY;
291 import static com.persistit.Key.maxStorableKeySize;
292-import static com.persistit.util.SequencerConstants.WRITE_WRITE_STORE_A;
293+import static com.persistit.util.SequencerConstants.*;
294 import static com.persistit.util.ThreadSequencer.sequence;
295
296 import java.util.ArrayList;
297@@ -526,8 +526,7 @@
298 int _flags;
299 long _deallocLeftPage;
300 long _deallocRightPage;
301- long _deferredReindexPage;
302- long _deferredReindexChangeCount;
303+
304
305 private LevelCache(int level) {
306 _level = level;
307@@ -1330,7 +1329,8 @@
308
309 final int maxSimpleValueSize = maxValueSize(key.getEncodedSize());
310 final Value spareValue = _persistit.getThreadLocalValue();
311- assert !(doMVCC & value == spareValue || doFetch && value == _spareValue): "storeInternal may be use the supplied Value: " + value;
312+ assert !(doMVCC & value == spareValue || doFetch && value == _spareValue) : "storeInternal may be use the supplied Value: "
313+ + value;
314
315 //
316 // First insert the record in the data page
317@@ -1356,7 +1356,7 @@
318 // This method may delay significantly for I/O and must
319 // be called when there are no other claimed resources.
320 //
321- newLongRecordPointer = getLongRecordHelper().storeLongRecord(value, _transaction.isActive());
322+ newLongRecordPointer = getLongRecordHelper().storeLongRecord(value, _transaction.isActive());
323 }
324
325 if (!_ignoreTransactions && ((options & StoreOptions.DONT_JOURNAL) == 0)) {
326@@ -1408,7 +1408,7 @@
327 if (!treeClaimAcquired || !_treeHolder.upgradeClaim()) {
328 treeClaimRequired = true;
329 treeWriterClaimRequired = true;
330- throw new RetryException();
331+ throw RetryException.SINGLE;
332 }
333
334 Debug.$assert0.t(valueToStore.getPointerValue() > 0);
335@@ -1460,9 +1460,8 @@
336 }
337 /*
338 * If it was a long MVV we saved it into the
339- * variable above. Otherwise it is a
340- * primordial value that we can't get rid
341- * of.
342+ * variable above. Otherwise it is a primordial
343+ * value that we can't get rid of.
344 */
345 oldLongRecordPointer = 0;
346
347@@ -1521,7 +1520,8 @@
348 spareValue.setEncodedSize(storedLength);
349
350 if (spareValue.getEncodedSize() > maxSimpleValueSize) {
351- newLongRecordPointerMVV = getLongRecordHelper().storeLongRecord(spareValue, _transaction.isActive());
352+ newLongRecordPointerMVV = getLongRecordHelper().storeLongRecord(spareValue,
353+ _transaction.isActive());
354 }
355 }
356 }
357@@ -1797,13 +1797,15 @@
358 // level cache for this level will become
359 // (appropriately) invalid.
360 //
361+
362+
363 int at = buffer.split(rightSibling, key, valueWriter, foundAt, _spareKey1, sequence, _splitPolicy);
364 if (at < 0) {
365 lc.updateInsert(rightSibling, key, -at);
366 } else {
367 lc.updateInsert(buffer, key, at);
368 }
369-
370+
371 long oldRightSibling = buffer.getRightSibling();
372 long newRightSibling = rightSibling.getPageAddress();
373
374@@ -2148,7 +2150,8 @@
375 if (matches) {
376 index = _key.nextElementIndex(parentIndex);
377 if (index > 0) {
378- boolean isVisibleMatch = fetchFromBufferInternal(buffer, outValue, foundAt, minimumBytes);
379+ boolean isVisibleMatch = fetchFromBufferInternal(buffer, outValue, foundAt,
380+ minimumBytes);
381 //
382 // In any case (matching sibling, child or
383 // niece/nephew) we need to ignore this
384@@ -2736,7 +2739,8 @@
385 * As thrown from any internal method.
386 * @return <code>true</code> if the value was visible.
387 */
388- private boolean fetchFromBufferInternal(Buffer buffer, Value value, int foundAt, int minimumBytes) throws PersistitException {
389+ private boolean fetchFromBufferInternal(Buffer buffer, Value value, int foundAt, int minimumBytes)
390+ throws PersistitException {
391 buffer.fetch(foundAt, value);
392 return fetchFromValueInternal(value, minimumBytes, buffer);
393 }
394@@ -2744,20 +2748,21 @@
395 /**
396 * Helper for finalizing the value to return from a, potentially, MVV
397 * contained in the given Value.
398- *
399+ *
400 * @param value
401 * Value to finalize.
402 * @param minimumBytes
403 * Minimum amount of LONG_RECORD to fetch. If &lt;0, the
404 * <code>value</code> will contain just the descriptor portion.
405 * @param bufferForPruning
406- * If not <code>null</code> and <code>Value</code> did contain
407- * an MVV, call {@link Buffer#enqueuePruningAction(int)}.
408+ * If not <code>null</code> and <code>Value</code> did contain an
409+ * MVV, call {@link Buffer#enqueuePruningAction(int)}.
410 * @throws PersistitException
411 * As thrown from any internal method.
412 * @return <code>true</code> if the value was visible.
413 */
414- private boolean fetchFromValueInternal(Value value, int minimumBytes, Buffer bufferForPruning) throws PersistitException {
415+ private boolean fetchFromValueInternal(Value value, int minimumBytes, Buffer bufferForPruning)
416+ throws PersistitException {
417 boolean visible = true;
418 /*
419 * We must fetch the full LONG_RECORD, if needed, while buffer is
420@@ -3042,7 +3047,7 @@
421
422 /**
423 * Removes all records with keys falling between <code>key1</code> and
424- * </code>key2</code>, lefty-inclusive. Validity checks and Key value
425+ * </code>key2</code>, left-inclusive. Validity checks and Key value
426 * adjustments have been done by caller - this method does the work.
427 *
428 * @param key1
429@@ -3145,7 +3150,6 @@
430 boolean result = false;
431
432 boolean deallocationRequired = true; // assume until proven false
433- boolean deferredReindexRequired = false;
434 boolean tryQuickDelete = true;
435
436 if (!_ignoreTransactions) {
437@@ -3227,9 +3231,6 @@
438 + " failed to get writer claim on " + _tree);
439 }
440 treeClaimAcquired = true;
441- _tree.bumpGeneration();
442- // Because we actually haven't changed anything yet.
443- _cachedTreeGeneration++;
444 }
445 //
446 // Need to redo this check now that we have a
447@@ -3265,12 +3266,6 @@
448 lc._rightBuffer = buffer;
449 lc._rightFoundAt = foundAt2;
450 } else {
451- //
452- // Since we are spanning pages we need an
453- // exclusive claim on the tree to prevent
454- // an insertion from propagating upward through
455- // the deletion range.
456- //
457 pageAddr2 = buffer.getRightSibling();
458 samePage = false;
459 }
460@@ -3358,11 +3353,13 @@
461 _volume.getStructure().harvestLongRecords(buffer1, foundAt1, Integer.MAX_VALUE);
462 _volume.getStructure().harvestLongRecords(buffer2, 0, foundAt2);
463
464+ Debug.$assert0.t(_tree.isMine() && buffer1.isMine() && buffer2.isMine());
465 boolean rebalanced = buffer1.join(buffer2, foundAt1, foundAt2, _spareKey1, _spareKey2,
466 _joinPolicy);
467 if (buffer1.isDataPage()) {
468 _tree.bumpChangeCount();
469 }
470+
471 buffer1.setDirtyAtTimestamp(timestamp);
472 buffer2.setDirtyAtTimestamp(timestamp);
473
474@@ -3419,9 +3416,11 @@
475 }
476 }
477 if (needsReindex) {
478- lc._deferredReindexPage = buffer2.getPageAddress();
479- lc._deferredReindexChangeCount = buffer2.getGeneration();
480- deferredReindexRequired = true;
481+ _spareKey1.copyTo(_spareKey2);
482+ _value.setPointerValue(buffer2.getPageAddress());
483+ _value.setPointerPageType(buffer2.getPageType());
484+
485+ storeInternal(_spareKey2, _value, level + 1, StoreOptions.NONE);
486 needsReindex = false;
487 }
488 }
489@@ -3449,7 +3448,7 @@
490 }
491 break;
492 } catch (RetryException re) {
493- // handled below
494+ // handled below by releasing claims and retrying
495 } finally {
496 //
497 // Release all buffers.
498@@ -3459,10 +3458,17 @@
499 }
500
501 if (treeClaimAcquired) {
502+ if (treeWriterClaimRequired) {
503+ _tree.bumpGeneration();
504+ }
505 _treeHolder.release();
506 treeClaimAcquired = false;
507 }
508 }
509+ /*
510+ * Having released all prior claims, now acquire an exclusive
511+ * claim on the Tree.
512+ */
513 if (treeWriterClaimRequired) {
514 if (!_treeHolder.claim(true)) {
515 Debug.$assert0.t(false);
516@@ -3490,51 +3496,11 @@
517 deallocationRequired = false;
518 break;
519 }
520- while (deferredReindexRequired) {
521- Buffer buffer = null;
522- try {
523- for (int level = _cacheDepth; --level >= 0;) {
524- LevelCache lc = _levelCache[level];
525- if (lc._deferredReindexPage != 0) {
526- if (!treeClaimAcquired) {
527- if (!_treeHolder.claim(treeWriterClaimRequired)) {
528- Debug.$assert0.t(false);
529- throw new InUseException("Thread " + Thread.currentThread().getName()
530- + " failed to get writer claim on " + _tree);
531- }
532- treeClaimAcquired = true;
533- }
534-
535- long deferredPage = lc._deferredReindexPage;
536- buffer = _pool.get(_volume, deferredPage, false, true);
537- if (buffer.getGeneration() == lc._deferredReindexChangeCount) {
538- checkPageType(buffer, level + PAGE_TYPE_DATA, false);
539- buffer.nextKey(_spareKey2, buffer.toKeyBlock(0));
540- _value.setPointerValue(buffer.getPageAddress());
541- _value.setPointerPageType(buffer.getPageType());
542- storeInternal(_spareKey2, _value, level + 1, StoreOptions.NONE);
543- } else {
544- _persistit.getLogBase().unindexedPage.log(deferredPage, _volume, _tree.getName());
545- }
546- lc._deferredReindexPage = 0;
547- buffer.releaseTouched();
548- buffer = null;
549- }
550- }
551- deferredReindexRequired = false;
552- } catch (RetryException re) {
553- // can this even be thrown here?
554- } finally {
555- if (buffer != null) {
556- buffer.releaseTouched();
557- buffer = null;
558- }
559- }
560- }
561-
562 } finally {
563 if (treeClaimAcquired) {
564- _tree.bumpGeneration();
565+ if (treeWriterClaimRequired) {
566+ _tree.bumpGeneration();
567+ }
568 _treeHolder.release();
569 treeClaimAcquired = false;
570 }
571
572=== modified file 'src/main/java/com/persistit/IntegrityCheck.java'
573--- src/main/java/com/persistit/IntegrityCheck.java 2012-06-26 19:36:26 +0000
574+++ src/main/java/com/persistit/IntegrityCheck.java 2012-06-29 21:06:19 +0000
575@@ -339,7 +339,14 @@
576 }
577
578 private void addFault(String description, long page, int level, int position) {
579- Fault fault = new Fault(resourceName(), this, description, page, level, position);
580+ Fault fault = new Fault(resourceName(), this, description, page, _treeDepth, level, position);
581+ if (_faults.size() < MAX_FAULTS)
582+ _faults.add(fault);
583+ postMessage(fault.toString(), LOG_VERBOSE);
584+ }
585+
586+ private void addGarbageFault(String description, long page, int level, int position) {
587+ Fault fault = new Fault(resourceName(), this, description, page, 3, level, position);
588 if (_faults.size() < MAX_FAULTS)
589 _faults.add(fault);
590 postMessage(fault.toString(), LOG_VERBOSE);
591@@ -645,10 +652,10 @@
592 int _depth;
593 int _position;
594
595- Fault(String treeName, IntegrityCheck work, String description, long page, int level, int position) {
596+ Fault(String treeName, IntegrityCheck work, String description, long page, int depth, int level, int position) {
597 _treeName = treeName;
598 _description = description;
599- _depth = work._treeDepth;
600+ _depth = depth;
601 _path = new long[_depth - level];
602 for (int index = _depth; --index > level;) {
603 if (index >= work._edgeBuffers.length) {
604@@ -657,7 +664,7 @@
605 _path[index - level] = work._edgePages[index];
606 }
607 }
608- _path[level] = page;
609+ _path[0] = page;
610 _level = level;
611 _depth = work._treeDepth;
612 _position = position;
613@@ -994,11 +1001,11 @@
614 private void checkGarbagePage(Buffer garbageBuffer) throws PersistitException {
615 long page = garbageBuffer.getPageAddress();
616 if (!garbageBuffer.isGarbagePage()) {
617- addFault("Unexpected page type " + garbageBuffer.getPageType() + " expected a garbage page", page, 1, 0);
618+ addGarbageFault("Unexpected page type " + garbageBuffer.getPageType() + " expected a garbage page", page, 1, 0);
619 return;
620 }
621 if (_usedPageBits.get(page)) {
622- addFault("Garbage page is referenced by multiple parents", page, 1, 0);
623+ addGarbageFault("Garbage page is referenced by multiple parents", page, 1, 0);
624 return;
625 }
626
627@@ -1007,7 +1014,7 @@
628 int size = garbageBuffer.getBufferSize();
629 int count = (size - next) / Buffer.GARBAGE_BLOCK_SIZE;
630 if (count * Buffer.GARBAGE_BLOCK_SIZE != (size - next)) {
631- addFault("Garbage page is malformed: _alloc=" + next + " is not at a multiple of "
632+ addGarbageFault("Garbage page is malformed: _alloc=" + next + " is not at a multiple of "
633 + Buffer.GARBAGE_BLOCK_SIZE + " bytes", page, 1, 0);
634 }
635 _usedPageBits.set(page, true);
636@@ -1026,12 +1033,12 @@
637 _edgePages[2] = page;
638 while (page != 0 && page != right) {
639 if (_usedPageBits.get(page)) {
640- addFault("Page on garbage chain is referenced by multiple parents", page, 3, 0);
641+ addGarbageFault("Page on garbage chain is referenced by multiple parents", page, 0, 0);
642 return;
643 }
644 Buffer buffer = getPage(page);
645 if (!buffer.isDataPage() && !buffer.isIndexPage() && !buffer.isLongRecordPage()) {
646- addFault("Page of type " + buffer.getPageTypeName() + " found on garbage page", page, 3, 0);
647+ addGarbageFault("Page of type " + buffer.getPageTypeName() + " found on garbage page", page, 0, 0);
648 }
649 _counters._garbagePageCount++;
650 _pagesVisited++;
651
652=== modified file 'src/main/java/com/persistit/VolumeStorageV2.java'
653--- src/main/java/com/persistit/VolumeStorageV2.java 2012-06-22 19:07:13 +0000
654+++ src/main/java/com/persistit/VolumeStorageV2.java 2012-06-29 21:06:19 +0000
655@@ -363,6 +363,7 @@
656 _headBuffer = _volume.getStructure().getPool().get(_volume, 0, true, false);
657 boolean truncated = false;
658 try {
659+ _headBuffer.init(Buffer.PAGE_TYPE_HEAD);
660 _headBuffer.setFixed();
661
662 initMetaData(_headBuffer.getBytes());
663
664=== modified file 'src/main/java/com/persistit/exception/RetryException.java'
665--- src/main/java/com/persistit/exception/RetryException.java 2012-05-25 18:50:59 +0000
666+++ src/main/java/com/persistit/exception/RetryException.java 2012-06-29 21:06:19 +0000
667@@ -30,7 +30,4 @@
668
669 public final static RetryException SINGLE = new RetryException();
670
671- public RetryException() {
672-
673- }
674 }
675
676=== modified file 'src/main/java/com/persistit/util/Debug.java'
677--- src/main/java/com/persistit/util/Debug.java 2012-06-04 18:58:23 +0000
678+++ src/main/java/com/persistit/util/Debug.java 2012-06-29 21:06:19 +0000
679@@ -106,6 +106,32 @@
680 }
681 System.err.println("Debug " + sb.toString());
682 }
683+
684+ public static String trace(int from, int to) {
685+ return " " + Thread.currentThread().getName() + " {" + Debug.callStack(from + 2, to + 2) + "}";
686+ }
687+
688+ public static String callStack(final int from, final int to) {
689+ RuntimeException exception = new RuntimeException();
690+ exception.fillInStackTrace();
691+ StackTraceElement[] elements = exception.getStackTrace();
692+ int a = Math.max(0, from);
693+ int b = Math.min(to, elements.length);
694+ StringBuilder sb = new StringBuilder();
695+ for (int index = b; index >= a; index--) {
696+ StackTraceElement t = exception.getStackTrace()[index];
697+ if (index != b) {
698+ sb.append("->");
699+ }
700+ sb.append(t.getClassName());
701+ sb.append('#');
702+ sb.append(t.getMethodName());
703+ sb.append('[');
704+ sb.append(t.getLineNumber());
705+ sb.append("]");
706+ }
707+ return sb.toString();
708+ }
709
710 /**
711 * Set the suspend flag so that callers to the suspend method either do or
712
713=== modified file 'src/main/java/com/persistit/util/SequencerConstants.java'
714--- src/main/java/com/persistit/util/SequencerConstants.java 2012-05-25 18:50:59 +0000
715+++ src/main/java/com/persistit/util/SequencerConstants.java 2012-06-29 21:06:19 +0000
716@@ -97,4 +97,5 @@
717 int LONG_RECORD_ALLOCATE_A = allocate("LONG_RECORD_ALLOCATE_A");
718 int LONG_RECORD_ALLOCATE_B = allocate("LONG_RECORD_ALLOCATE_B");
719 int[][] LONG_RECORD_ALLOCATE_SCHEDULED = new int[][]{array(LONG_RECORD_ALLOCATE_B), array(LONG_RECORD_ALLOCATE_A, LONG_RECORD_ALLOCATE_B)};
720+
721 }
722
723=== modified file 'src/main/java/com/persistit/util/Util.java'
724--- src/main/java/com/persistit/util/Util.java 2012-05-25 18:50:59 +0000
725+++ src/main/java/com/persistit/util/Util.java 2012-06-29 21:06:19 +0000
726@@ -234,6 +234,14 @@
727 return format(Long.toString(i), width, true);
728 }
729
730+ public static String abridge(final String s, final int maxLength) {
731+ if (s.length() > maxLength) {
732+ return s.substring(0, maxLength - 3) + "...";
733+ } else {
734+ return s;
735+ }
736+ }
737+
738 public static boolean equalsByteSubarray(byte[] source, int next, byte[] target) {
739 return equalsByteSubarray(source, next, target, 0, target.length);
740 }
741
742=== added file 'src/test/java/com/persistit/Bug1017957Test.java'
743--- src/test/java/com/persistit/Bug1017957Test.java 1970-01-01 00:00:00 +0000
744+++ src/test/java/com/persistit/Bug1017957Test.java 2012-06-29 21:06:19 +0000
745@@ -0,0 +1,235 @@
746+/**
747+ * Copyright © 2012 Akiban Technologies, Inc. All rights reserved.
748+ *
749+ * This program is free software: you can redistribute it and/or modify
750+ * it under the terms of the GNU Affero General Public License as
751+ * published by the Free Software Foundation, version 3 (only) of the
752+ * License.
753+ *
754+ * This program is distributed in the hope that it will be useful,
755+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
756+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
757+ * GNU Affero General Public License for more details.
758+ *
759+ * You should have received a copy of the GNU Affero General Public License
760+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
761+ *
762+ * This program may also be available under different license terms. For more
763+ * information, see www.akiban.com or contact licensing@akiban.com.
764+ */
765+
766+package com.persistit;
767+
768+import static org.junit.Assert.assertEquals;
769+
770+import java.io.PrintWriter;
771+import java.util.Properties;
772+import java.util.concurrent.atomic.AtomicInteger;
773+
774+import org.junit.Test;
775+
776+import com.persistit.mxbeans.ManagementMXBean;
777+import com.persistit.unit.PersistitUnitTestCase;
778+import com.persistit.unit.UnitTestProperties;
779+
780+/**
781+ * https://bugs.launchpad.net/akiban-persistit/+bug/1017957
782+ *
783+ * During the past week the 8-hour stress test suite has generated several
784+ * CorruptVolumeExceptions and other related phenomena. Examples:
785+ *
786+ * Stress6 [main] FAILED: com.persistit.exception.CorruptVolumeException: Volume
787+ * persistit(/tmp/persistit_tests/persistit) level=0 page=15684
788+ * initialPage=57164
789+ * key=<{"stress6",98,5,"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}> walked
790+ * right more than 50 pages last page visited=81324 at
791+ * com.persistit.Exchange.corrupt(Exchange.java:3884) at
792+ * com.persistit.Exchange.searchLevel(Exchange.java:1250) at
793+ * com.persistit.Exchange.searchTree(Exchange.java:1125) at
794+ * com.persistit.Exchange.storeInternal(Exchange.java:1443) at
795+ * com.persistit.Exchange.store(Exchange.java:1294) at
796+ * com.persistit.Exchange.store(Exchange.java:2534) at
797+ * com.persistit.stress.unit.Stress6.executeTest(Stress6.java:98) at
798+ * com.persistit.stress.AbstractStressTest.run(AbstractStressTest.java:93) at
799+ * java.lang.Thread.run(Thread.java:662)
800+ *
801+ * Stress2txn [main] FAILED: com.persistit.exception.RebalanceException at
802+ * com.persistit.Buffer.join(Buffer.java:2523) at
803+ * com.persistit.Exchange.raw_removeKeyRangeInternal(Exchange.java:3367) at
804+ * com.persistit.Exchange.removeKeyRangeInternal(Exchange.java:3070) at
805+ * com.persistit.Exchange.removeInternal(Exchange.java:2999) at
806+ * com.persistit.Exchange.remove(Exchange.java:2927) at
807+ * com.persistit.stress.unit.Stress2txn.executeTest(Stress2txn.java:231) at
808+ * com.persistit.stress.AbstractStressTest.run(AbstractStressTest.java:93) at
809+ * java.lang.Thread.run(Thread.java:662)
810+ *
811+ * Stress2txn [main] FAILED: com.persistit.exception.CorruptVolumeException:
812+ * LONG_RECORD chain is invalid at page 111919 - invalid page type: Page 111,919
813+ * in volume persistit(/tmp/persistit_tests/persistit) at index 1,559
814+ * timestamp=909,787,072 status=vr1 type=Data at
815+ * com.persistit.LongRecordHelper.corrupt(LongRecordHelper.java:243) at
816+ * com.persistit.LongRecordHelper.fetchLongRecord(LongRecordHelper.java:103) at
817+ * com.persistit.Exchange.fetchFixupForLongRecords(Exchange.java:2841) at
818+ * com.persistit.Exchange.fetchFromValueInternal(Exchange.java:2778) at
819+ * com.persistit.Exchange.fetchFromBufferInternal(Exchange.java:2747) at
820+ * com.persistit.Exchange.traverse(Exchange.java:2157) at
821+ * com.persistit.Exchange.traverse(Exchange.java:1960) at
822+ * com.persistit.Exchange.traverse(Exchange.java:1897) at
823+ * com.persistit.Exchange.next(Exchange.java:2330) at
824+ * com.persistit.stress.unit.Stress2txn.executeTest(Stress2txn.java:188) at
825+ * com.persistit.stress.AbstractStressTest.run(AbstractStressTest.java:93) at
826+ * java.lang.Thread.run(Thread.java:662)
827+ *
828+ * Bug mechanism #1:
829+ *
830+ * An obscure path through Exchange#raw_removeKeyRangeInternal inserts a
831+ * key-pointer pair into an index page. It does so after removing all claims on
832+ * pages and the tree itself. After removing claims, before inserting the
833+ * key-pointer pair we believe the page itself gets put unto a garbage chain. So
834+ * after the re-insertion, the index page now has a pointer to a page that will
835+ * be reused and will contain unrelated data.
836+ *
837+ *
838+ * Bug mechanism #2: An obscure path through Exchange#raw_removeKeyRangeInternal
839+ * performs a structure delete (i.e., joins one more pairs of pages) but fails
840+ * to bump the Tree generation. The allows use of a stale LevelCache array.
841+ *
842+ * This test procedure exhibited both bug mechanisms reliably within 10 seconds
843+ * prior to fixing the code. We also implemented a test method based on the
844+ * ThreadSequencer to precisely elaborate the sequence of interactions between
845+ * two threads that cause the failure. However, the bug fix eliminates the code
846+ * path that allows the sequencer to work, so the test was removed.
847+ *
848+ * @author peter
849+ *
850+ */
851+public class Bug1017957Test extends PersistitUnitTestCase {
852+
853+ @Override
854+ protected Properties getProperties(boolean cleanup) {
855+ return UnitTestProperties.getBiggerProperties(cleanup);
856+ }
857+
858+ private final long STRESS_NANOS = 10L * 1000000000L;
859+
860+ /**
861+ *
862+ * @throws Exception
863+ */
864+ @Test
865+ public void induceCorruptionByStress() throws Exception {
866+ final long expiresAt = System.nanoTime() + STRESS_NANOS;
867+ final AtomicInteger totalErrors = new AtomicInteger();
868+ Thread t1 = new Thread(new Runnable() {
869+ public void run() {
870+ int count = 0;
871+ int errors = 0;
872+ try {
873+ Exchange ex = _persistit.getExchange("persistit", "Bug1017957Test", true);
874+ while (System.nanoTime() < expiresAt) {
875+ try {
876+ Key key = createUnsafeStructure(ex);
877+ removeInterestingKey(ex, key);
878+ if (++count % 5000 == 0) {
879+ System.out.printf("T1 iterations %,d\n", count);
880+ }
881+ } catch (Exception e) {
882+ if (++errors < 10) {
883+ e.printStackTrace();
884+ }
885+ totalErrors.incrementAndGet();
886+ }
887+ }
888+ } catch (Exception e) {
889+ throw new RuntimeException(e);
890+ }
891+ }
892+ });
893+
894+ Thread t2 = new Thread(new Runnable() {
895+ public void run() {
896+ int count = 0;
897+ int errors = 0;
898+ try {
899+ Exchange ex = _persistit.getExchange("persistit", "Bug1017957Test", true);
900+ while (System.nanoTime() < expiresAt) {
901+ try {
902+ removeCoveringRange(ex);
903+ insertOtherStuff(ex);
904+ if (++count % 5000 == 0) {
905+ System.out.printf("T2 iterations %,d\n", count);
906+ }
907+ } catch (Exception e) {
908+ if (++errors < 10) {
909+ e.printStackTrace();
910+ }
911+ totalErrors.incrementAndGet();
912+ }
913+ }
914+ } catch (Exception e) {
915+ if (++errors < 10) {
916+ e.printStackTrace();
917+ }
918+ }
919+ }
920+ });
921+
922+ t1.start();
923+ t2.start();
924+ t1.join();
925+ t2.join();
926+
927+ IntegrityCheck icheck = new IntegrityCheck(_persistit);
928+ icheck.setMessageLogVerbosity(Task.LOG_VERBOSE);
929+ icheck.setMessageWriter(new PrintWriter(System.out));
930+ icheck.checkVolume(_persistit.getVolume("persistit"));
931+ System.out.printf("\nTotal errors %d", totalErrors.get());
932+ assertEquals("Corrupt volume", 0, icheck.getFaults().length);
933+ assertEquals("Exception occurred", 0, totalErrors.get());
934+ }
935+
936+ /**
937+ * Create a B-Tree with a structure that will induce a deferred index
938+ * insertion on removal of key. We need an index page that's pretty full
939+ * such that removing a key and inserting a different one will result in
940+ * splitting the index page.
941+ *
942+ * @throws Exception
943+ */
944+ private Key createUnsafeStructure(final Exchange ex) throws Exception {
945+ Key result = null;
946+ final String v = createString(5500); // less than long record
947+ final String k = createString(1040);
948+ for (int i = 1000; i < 1019; i++) {
949+ if (i == 1009) {
950+ ex.clear().append(i).append(k.substring(0, 20));
951+ ex.getValue().put("interesting");
952+ ex.store();
953+ result = new Key(ex.getKey());
954+ }
955+ ex.clear().append(i).append(k);
956+ ex.getValue().put(v);
957+ ex.store();
958+ }
959+ return result;
960+ }
961+
962+ private void removeInterestingKey(final Exchange ex, final Key interestingKey) throws Exception {
963+ interestingKey.copyTo(ex.getKey());
964+ ex.remove();
965+ }
966+
967+ private void removeCoveringRange(final Exchange ex) throws Exception {
968+ final Key key1 = new Key(_persistit).append(1005);
969+ final Key key2 = new Key(_persistit).append(1015);
970+ ex.removeKeyRange(key1, key2);
971+ }
972+
973+ private void insertOtherStuff(final Exchange ex) throws Exception {
974+ for (int k = 0; k < 100; k++) {
975+ ex.clear().append(1009).append(k).append(RED_FOX);
976+ ex.getValue().put(RED_FOX);
977+ ex.store();
978+ }
979+ }
980+}
981
982=== modified file 'src/test/java/com/persistit/IntegrityCheckTest.java'
983--- src/test/java/com/persistit/IntegrityCheckTest.java 2012-05-25 18:50:59 +0000
984+++ src/test/java/com/persistit/IntegrityCheckTest.java 2012-06-29 21:06:19 +0000
985@@ -211,6 +211,16 @@
986 _persistit.getTransactionIndex().cleanup();
987 assertEquals(0, _persistit.getTransactionIndex().getAbortedCount());
988 }
989+
990+ @Test
991+ public void testCorruptGarbageChain() throws PersistitException {
992+ final Exchange ex = _persistit.getExchange(_volumeName, "mvv", true);
993+ nonTransactionalStore(ex);
994+ corrupt4(ex);
995+ IntegrityCheck icheck = icheck();
996+ icheck.checkVolume(ex.getVolume());
997+ assertTrue(icheck.getFaults().length > 0);
998+ }
999
1000 private String key(final int i) {
1001 return String.format("%05d%s", i, RED_FOX);
1002@@ -314,6 +324,19 @@
1003 buffer.setDirtyAtTimestamp(_persistit.getTimestampAllocator().updateTimestamp());
1004 buffer.release();
1005 }
1006+
1007+ /**
1008+ * Corrupts garbage chain by adding a live data chain to it
1009+ * @throw PersistitException
1010+ */
1011+ private void corrupt4(final Exchange ex) throws PersistitException {
1012+ Key key = ex.getKey();
1013+ ex.clear().to(key(500));
1014+ Buffer copy = ex.fetchBufferCopy(0);
1015+ Buffer buffer = ex.getBufferPool().get(ex.getVolume(), copy.getPageAddress(), true, true);
1016+ ex.getVolume().getStructure().deallocateGarbageChain(buffer.getPageAddress(), buffer.getRightSibling());
1017+ buffer.release();
1018+ }
1019
1020 private IntegrityCheck icheck() {
1021 IntegrityCheck icheck = new IntegrityCheck(_persistit);

Subscribers

People subscribed via source and target branches