Code review comment for lp:~jameinel/bzr/2.0-490228-gc-segfault

Revision history for this message
John A Meinel (jameinel) wrote :

Bug fix for the groupcompress code. The actual diff here also cleans up the specific code to be a bit more readable. But the minimal change is just:
=== modified file 'bzrlib/diff-delta.c'
--- bzrlib/diff-delta.c 2009-08-03 16:54:36 +0000
+++ bzrlib/diff-delta.c 2009-12-14 15:52:24 +0000
@@ -804,8 +804,8 @@
             old_entry--;
         }
         old_entry++;
- if (old_entry->ptr != NULL
- || old_entry >= old_index->hash[hash_offset + 1]) {
+ if (old_entry >= old_index->hash[hash_offset + 1]
+ || old_entry->ptr != NULL) {
             /* There is no room for this entry, we have to resize */
             // char buff[128];
             // get_text(buff, entry->ptr);

I don't have a good way to test this. We adapted this code from a 3rd-party source and getting at the specific structures is pretty difficult. And replicating it with real data would be very difficult. You would need:

1) Custom text data that gives a RABIN_HASH that RABIN_HASH & (hash_map_mask) falls into the last bucket.
2) Enough of them to fill the final hash bucket.
3) A memory allocator that doesn't provide any extra bytes after the end of the entry table, such that reading "entry_table[len(entry_table)]->ptr" gives a segfault. (len() versus len()-1)

I would like to revisit the groupcompress data structures, but I don't think that is a profitable use of my time right now.

« Back to merge proposal