Merge lp:~mterry/duplicity/disappearing-source into lp:duplicity/0.6-series

Proposed by Michael Terry on 2013-11-16
Status: Merged
Merged at revision: 935
Proposed branch: lp:~mterry/duplicity/disappearing-source
Merge into: lp:duplicity/0.6-series
Prerequisite: lp:~mterry/duplicity/manifest-oddities
Diff against target: 103 lines (+42/-2)
3 files modified
bin/duplicity (+5/-2)
duplicity/diffdir.py (+13/-0)
testing/tests/restarttest.py (+24/-0)
To merge this branch: bzr merge lp:~mterry/duplicity/disappearing-source
Reviewer: duplicity-team
Date requested: 2013-11-16
Status: Pending
Review via email: mp+195461@code.launchpad.net

Description of the change

When restarting a backup, we may accidentally skip the first chunk of one of the source files. To reproduce this:
1) interrupt a backup
2) delete the source file it was in the middle of backing up
3) restart the backup

When replaying the source iterator to find where to resume from, we can't notice that the file is gone until we've already iterated past where it would be!

The solution I came up with is to let duplicity stuff the data it accidentally read back into the source iterator, so the restarted backup still sees it.

This is actually a data loss bug, because it's possible to back up corrupted files (that are missing their first chunk).
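The replay-and-pushback idea can be sketched in a few lines. This is a simplified illustration, not duplicity's real code: `FakeBlockIter` and `resume_replay` are hypothetical stand-ins for `TarBlockIter` and the restart loop in `bin/duplicity`; only the `queue_index_data` hook mirrors the patch.

```python
class FakeBlockIter:
    """Tiny stand-in for duplicity's TarBlockIter: yields (index, data)
    blocks, tracks previous_index, and supports the pushback hook."""
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.pos = 0
        self.previous_index = None
        self.queued_data = None  # data to return on the next next() call

    def next(self):
        if self.queued_data is not None:
            # Return the pushed-back block instead of advancing;
            # previous_index is deliberately left as-is.
            result, self.queued_data = self.queued_data, None
            return result
        if self.pos >= len(self.blocks):
            raise StopIteration
        result = self.blocks[self.pos]
        self.pos += 1
        self.previous_index = result[0]
        return result

    def queue_index_data(self, data):
        # Next time next() is called, return this instead of advancing
        self.queued_data = data


def resume_replay(it, last_index):
    """Spin past blocks already on the backend.  If last_index's file has
    vanished from the source, we only notice after reading past it; push
    the overshot block back so the restarted backup still writes it."""
    block = it.next()
    while block:
        if it.previous_index == last_index:
            break  # resumed exactly where the interrupted run stopped
        if it.previous_index > last_index:
            it.queue_index_data(block)  # went too far: put it back
            break
        block = it.next()


# Source file ('b',) was deleted between runs; last_index == ('b',).
# Without the pushback, the first chunk of 'c' would be lost.
it = FakeBlockIter([(('a',), 'A0'), (('c',), 'C0')])
resume_replay(it, ('b',))
print(it.next())  # (('c',), 'C0') -- first chunk of 'c' is preserved
```

The key design point is that the pushback lives inside the iterator itself, so the downstream volume-writing loop needs no changes: its next call to `next()` transparently receives the block that the replay loop over-consumed.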

933. By Michael Terry on 2013-11-16

Merge latest manifest-oddities changes

Michael Terry (mterry) wrote:

I wrote in https://code.launchpad.net/~mterry/duplicity/catch-seq-copy-error/+merge/186106:

"I'm getting the feeling that all three bugs have the same root cause: a missing sequence object. 662442 can be explained by a missing patch in the sequence, and the other two can be explained by a missing first snapshot..."

This bug might explain those problems: a missing first snapshot if the file is tiny, or a missing first patch in the sequence if it's larger.

Preview Diff

=== modified file 'bin/duplicity'
--- bin/duplicity	2013-09-20 01:32:31 +0000
+++ bin/duplicity	2013-11-16 02:23:59 +0000
@@ -223,7 +223,8 @@
     last_block = globals.restart.last_block
     try:
         # Just spin our wheels
-        while tarblock_iter.next():
+        iter_result = tarblock_iter.next()
+        while iter_result:
             if (tarblock_iter.previous_index == last_index):
                 # If both the previous index and this index are done, exit now
                 # before we hit the next index, to prevent skipping its first
@@ -238,13 +239,15 @@
                            "Continuing restart on file %s.") %
                          ("/".join(last_index), "/".join(tarblock_iter.previous_index)),
                          log.ErrorCode.restart_file_not_found)
+                # We went too far! Stuff the data back into place before restarting
+                tarblock_iter.queue_index_data(iter_result)
                 break
+            iter_result = tarblock_iter.next()
     except StopIteration:
         log.Warn(_("File %s missing in backup set.\n"
                    "Continuing restart on file %s.") %
                  ("/".join(last_index), "/".join(tarblock_iter.previous_index)),
                  log.ErrorCode.restart_file_not_found)
-    return 0


 def write_multivol(backup_type, tarblock_iter, man_outfp, sig_outfp, backend):

=== modified file 'duplicity/diffdir.py'
--- duplicity/diffdir.py	2013-04-27 14:48:39 +0000
+++ duplicity/diffdir.py	2013-11-16 02:23:59 +0000
@@ -469,6 +469,7 @@
         self.remember_next = False  # see remember_next_index()
         self.remember_value = None  # holds index of next block
         self.remember_block = None  # holds block of next block
+        self.queued_data = None  # data to return in next next() call

     def tarinfo2tarblock(self, index, tarinfo, file_data = ""):
         """
@@ -504,6 +505,12 @@
         """
         Return next block and update offset
         """
+        if self.queued_data is not None:
+            result = self.queued_data
+            self.queued_data = None
+            # Keep rest of metadata as is (like previous_index)
+            return result
+
         if self.process_waiting:
             result = self.process_continued()
         else:
@@ -532,6 +539,12 @@
         """
         return self.previous_index, self.previous_block

+    def queue_index_data(self, data):
+        """
+        Next time next() is called, we will return data instead of processing
+        """
+        self.queued_data = data
+
     def remember_next_index(self):
         """
         When called, remember the index of the next block iterated

=== modified file 'testing/tests/restarttest.py'
--- testing/tests/restarttest.py	2013-11-16 02:23:59 +0000
+++ testing/tests/restarttest.py	2013-11-16 02:23:59 +0000
@@ -471,6 +471,30 @@
         # and verify we can restore
         self.restore()

+    def test_changed_source_file_disappears(self):
+        """
+        Make sure we correctly handle restarting a backup when a file
+        disappears when we had been in the middle of backing it up.  It's
+        possible that the first chunk of the next file will be skipped unless
+        we're careful.
+        """
+        source = 'testfiles/largefiles'
+        self.make_largefiles(count=1)
+        # intentionally interrupt initial backup
+        try:
+            self.backup("full", source, options = ["--vol 1", "--fail 2"])
+            self.fail()
+        except CmdError, e:
+            self.assertEqual(30, e.exit_status)
+        # now remove starting source data and make sure we add something after
+        assert not os.system("rm %s/*" % source)
+        assert not os.system("echo hello > %s/z" % source)
+        # finish backup
+        self.backup("full", source)
+        # and verify we can restore
+        self.restore()
+        assert not os.system("diff %s/z testfiles/restore_out/z" % source)
+

 # Note that this class duplicates all the tests in RestartTest
 class RestartTestWithoutEncryption(RestartTest):
