Merge lp:~dawgfoto/duplicity/replicate into lp:~duplicity-team/duplicity/0.8-series

Proposed by Martin Nowak
Status: Merged
Merged at revision: 1209
Proposed branch: lp:~dawgfoto/duplicity/replicate
Merge into: lp:~duplicity-team/duplicity/0.8-series
Diff against target: 380 lines (+190/-24)
5 files modified
bin/duplicity (+116/-2)
bin/duplicity.1 (+17/-0)
duplicity/collections.py (+12/-3)
duplicity/commandline.py (+36/-19)
duplicity/file_naming.py (+9/-0)
To merge this branch: bzr merge lp:~dawgfoto/duplicity/replicate
Reviewer: duplicity-team (status: Pending)
Review via email: mp+322836@code.launchpad.net

Description of the change

Initial request for feedback.

Add replicate command to replicate a backup (or backup sets older than a given time) to another backend, leveraging duplicity's backend and compression/encryption infrastructure.
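The selection rule behind the command (copy only sets the target lacks, and only sets older than the requested time) can be sketched in a few lines. The tuple representation and function name below are illustrative stand-ins, not duplicity's actual data structures:

```python
# Minimal sketch of replicate's selection rule. Backup sets are modeled
# as (start_time, end_time) tuples; real duplicity compares BackupSet
# objects by value via the __eq__ added in this branch.

def sets_to_replicate(src_sets, tgt_sets, cutoff_time):
    """Return source sets older than cutoff_time that the target lacks."""
    present = set(tgt_sets)
    return [s for s in src_sets
            if s[1] < cutoff_time and s not in present]

# Target already holds the first set; the newest set is excluded by --time,
# so only the middle set still needs copying.
src = [(100, 200), (200, 300), (300, 400)]
tgt = [(100, 200)]
print(sets_to_replicate(src, tgt, cutoff_time=350))  # [(200, 300)]
```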

lp:~dawgfoto/duplicity/replicate updated
1191. By Kenneth Loafman

2017-04-20 Kenneth Loafman <email address hidden>

    * Fixed bug #1680682 with patch supplied from Dave Allan
      - Only specify --pinentry-mode=loopback when --use-agent is not specified
    * Fixed man page that had 'cancel' instead of 'loopback' for pinentry mode
    * Fixed bug #1684312 with suggestion from Wade Rossman
      - Use shutil.copyfile instead of os.system('cp ...')
      - Should reduce the memory overhead of os.system().

1192. By Kenneth Loafman

* Merged in lp:~dernils/duplicity/testing
  - Fixed minor stuff in requirements.txt.
  - Added a Dockerfile for testing.
  - Minor changes to README files.
  - Added README-TESTING with some information on testing.

1193. By Kenneth Loafman

* Merged in lp:~dernils/duplicity/documentation
  - Minor changes to README-REPO, README-TESTING
  - Also redo the changes to requirements.txt and Dockerfile

1194. By ken

* Add rdiff install and newline at end of file.

1195. By Kenneth Loafman

Move pep8 and pylint to requirements.

1196. By Kenneth Loafman

Whoops, deleted too much. Add rdiff again.

1197. By Kenneth Loafman

Merged in lp:~dernils/duplicity/Dockerfile
Fixed variable name change in commandline.py

1198. By Kenneth Loafman

More changes for testing:
- keep gpg1 version for future testing
- some changes for debugging functional tests
- add gpg-agent.conf with allow-loopback-pinentry

1199. By ken

A little reorg, just keeping pip things together.

1200. By ken

Quick fix for bug #1680682 and gnupg v1, add missing comma.

1201. By ken

- Simplify Dockerfile per https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
- Add a .dockerignore file
- Uncomment some debug prints
- quick fix for bug #1680682 and gnupg v1, add missing comma

1202. By ken

Move branch duplicity up the food chain.

1203. By ken

Add test user and swap to non-privileged.

1204. By ken

- Remove dependencies we did not need

1205. By ken

Merged in lp:~dernils/duplicity/Dockerfile
- separated requirements into requirements for duplicity (in requirements.txt) and for testing (in tox.ini)

1206. By ken

Add libffi-dev back. My bad.

1207. By ken

You need tox to run tox. Doh!

1208. By ken

We need tzdata (timezone data).

1209. By Kenneth Loafman

* Merged in lp:~dawgfoto/duplicity/replicate
  - Add replicate command to replicate a backup (or backup
    sets older than a given time) to another backend, leveraging
    duplicity's backend and compression/encryption infrastructure.
* Fixed some incoming PyLint and PEP-8 errors.

Revision history for this message
Aaron Whitehouse (aaron-whitehouse) wrote :

Many thanks Martin!

Could you please submit some tests that ensure your code keeps working as expected? There should be an example of most things you want to test and I would be happy to help navigate them to the extent that I can.

Just an end-to-end functional test would be a great start.
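One possible shape for such a test, sketched against the plain duplicity CLI with hypothetical directory names rather than duplicity's internal test harness (whose helpers may differ):

```python
# Hedged sketch of an end-to-end replicate test: back up a directory to a
# "primary" file:// backend, replicate it to a "replica" backend, and check
# the replica holds the same archive files. Skipped when duplicity is absent.
import os
import subprocess
import tempfile
import unittest


def duplicity_available():
    """True if a duplicity executable is on PATH."""
    return any(os.access(os.path.join(d, "duplicity"), os.X_OK)
               for d in os.environ.get("PATH", "").split(os.pathsep))


def run_duplicity(*args):
    """Invoke the duplicity CLI with encryption disabled; raise on failure."""
    subprocess.check_call(("duplicity",) + args + ("--no-encryption",))


@unittest.skipUnless(duplicity_available(), "duplicity not installed")
class ReplicateTest(unittest.TestCase):

    def test_replicate_copies_all_sets(self):
        work = tempfile.mkdtemp()
        src_dir = os.path.join(work, "data")
        os.mkdir(src_dir)
        with open(os.path.join(src_dir, "hello.txt"), "w") as f:
            f.write("hello")
        primary = os.path.join(work, "primary")
        replica = os.path.join(work, "replica")
        run_duplicity("full", src_dir, "file://" + primary)
        run_duplicity("replicate", "file://" + primary, "file://" + replica)
        # With identical encryption/compression options on both sides, the
        # replica should end up with the same file names as the primary.
        self.assertEqual(sorted(os.listdir(primary)),
                         sorted(os.listdir(replica)))
```

Run with a test runner, e.g. `python -m unittest <module>`.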

Preview Diff

=== modified file 'bin/duplicity'
--- bin/duplicity 2017-03-02 22:38:47 +0000
+++ bin/duplicity 2017-04-24 16:35:20 +0000
@@ -28,6 +28,7 @@
 # any suggestions.
 
 import duplicity.errors
+import copy
 import gzip
 import os
 import platform
@@ -1006,6 +1007,116 @@
                    "\n" + chain_times_str(chainlist) + "\n" +
                    _("Rerun command with --force option to actually delete."))
 
+def replicate():
+    """
+    Replicate backup files from one remote to another, possibly encrypting or adding parity.
+
+    @rtype: void
+    @return: void
+    """
+    time = globals.restore_time or dup_time.curtime
+    src_stats = collections.CollectionsStatus(globals.src_backend, None).set_values(sig_chain_warning=None)
+    tgt_stats = collections.CollectionsStatus(globals.backend, None).set_values(sig_chain_warning=None)
+
+    src_list = globals.src_backend.list()
+    tgt_list = globals.backend.list()
+
+    src_chainlist = src_stats.get_signature_chains(local=False, filelist=src_list)[0]
+    tgt_chainlist = tgt_stats.get_signature_chains(local=False, filelist=tgt_list)[0]
+    sorted(src_chainlist, key=lambda chain: chain.start_time)
+    sorted(tgt_chainlist, key=lambda chain: chain.start_time)
+    if not src_chainlist:
+        log.Notice(_("No old backup sets found."))
+        return
+    for src_chain in src_chainlist:
+        try:
+            tgt_chain = filter(lambda chain: chain.start_time == src_chain.start_time, tgt_chainlist)[0]
+        except IndexError:
+            tgt_chain = None
+
+        tgt_sigs = map(file_naming.parse, tgt_chain.get_filenames()) if tgt_chain else []
+        for src_sig_filename in src_chain.get_filenames():
+            src_sig = file_naming.parse(src_sig_filename)
+            if not (src_sig.time or src_sig.end_time) < time:
+                continue
+            try:
+                tgt_sigs.remove(src_sig)
+                log.Info(_("Signature %s already replicated") % (src_sig_filename,))
+                continue
+            except ValueError:
+                pass
+            if src_sig.type == 'new-sig':
+                dup_time.setprevtime(src_sig.start_time)
+            dup_time.setcurtime(src_sig.time or src_sig.end_time)
+            log.Notice(_("Replicating %s.") % (src_sig_filename,))
+            fileobj = globals.src_backend.get_fileobj_read(src_sig_filename)
+            filename = file_naming.get(src_sig.type, encrypted=globals.encryption, gzipped=globals.compression)
+            tdp = dup_temp.new_tempduppath(file_naming.parse(filename))
+            tmpobj = tdp.filtered_open(mode='wb')
+            util.copyfileobj(fileobj, tmpobj)  # decrypt, compress, (re)-encrypt
+            fileobj.close()
+            tmpobj.close()
+            globals.backend.put(tdp, filename)
+            tdp.delete()
+
+    src_chainlist = src_stats.get_backup_chains(filename_list = src_list)[0]
+    tgt_chainlist = tgt_stats.get_backup_chains(filename_list = tgt_list)[0]
+    sorted(src_chainlist, key=lambda chain: chain.start_time)
+    sorted(tgt_chainlist, key=lambda chain: chain.start_time)
+    for src_chain in src_chainlist:
+        try:
+            tgt_chain = filter(lambda chain: chain.start_time == src_chain.start_time, tgt_chainlist)[0]
+        except IndexError:
+            tgt_chain = None
+
+        tgt_sets = tgt_chain.get_all_sets() if tgt_chain else []
+        for src_set in src_chain.get_all_sets():
+            if not src_set.get_time() < time:
+                continue
+            try:
+                tgt_sets.remove(src_set)
+                log.Info(_("Backupset %s already replicated") % (src_set.remote_manifest_name,))
+                continue
+            except ValueError:
+                pass
+            if src_set.type == 'inc':
+                dup_time.setprevtime(src_set.start_time)
+            dup_time.setcurtime(src_set.get_time())
+            rmf = src_set.get_remote_manifest()
+            mf_filename = file_naming.get(src_set.type, manifest=True)
+            mf_tdp = dup_temp.new_tempduppath(file_naming.parse(mf_filename))
+            mf = manifest.Manifest(fh=mf_tdp.filtered_open(mode='wb'))
+            for i, filename in src_set.volume_name_dict.iteritems():
+                log.Notice(_("Replicating %s.") % (filename,))
+                fileobj = restore_get_enc_fileobj(globals.src_backend, filename, rmf.volume_info_dict[i])
+                filename = file_naming.get(src_set.type, i, encrypted=globals.encryption, gzipped=globals.compression)
+                tdp = dup_temp.new_tempduppath(file_naming.parse(filename))
+                tmpobj = tdp.filtered_open(mode='wb')
+                util.copyfileobj(fileobj, tmpobj)  # decrypt, compress, (re)-encrypt
+                fileobj.close()
+                tmpobj.close()
+                globals.backend.put(tdp, filename)
+
+                vi = copy.copy(rmf.volume_info_dict[i])
+                vi.set_hash("SHA1", gpg.get_hash("SHA1", tdp))
+                mf.add_volume_info(vi)
+
+                tdp.delete()
+
+            mf.fh.close()
+            # incremental GPG writes hang on close, so do any encryption here at once
+            mf_fileobj = mf_tdp.filtered_open_with_delete(mode='rb')
+            mf_final_filename = file_naming.get(src_set.type, manifest=True, encrypted=globals.encryption, gzipped=globals.compression)
+            mf_final_tdp = dup_temp.new_tempduppath(file_naming.parse(mf_final_filename))
+            mf_final_fileobj = mf_final_tdp.filtered_open(mode='wb')
+            util.copyfileobj(mf_fileobj, mf_final_fileobj)  # compress, encrypt
+            mf_fileobj.close()
+            mf_final_fileobj.close()
+            globals.backend.put(mf_final_tdp, mf_final_filename)
+            mf_final_tdp.delete()
+
+    globals.src_backend.close()
+    globals.backend.close()
 
 def sync_archive(decrypt):
     """
@@ -1408,8 +1519,9 @@
     check_resources(action)
 
     # check archive synch with remote, fix if needed
-    decrypt = action not in ["collection-status"]
-    sync_archive(decrypt)
+    if not action == "replicate":
+        decrypt = action not in ["collection-status"]
+        sync_archive(decrypt)
 
     # get current collection status
     col_stats = collections.CollectionsStatus(globals.backend,
@@ -1483,6 +1595,8 @@
         remove_all_but_n_full(col_stats)
     elif action == "sync":
         sync_archive(True)
+    elif action == "replicate":
+        replicate()
     else:
         assert action == "inc" or action == "full", action
         # the passphrase for full and inc is used by --sign-key

=== modified file 'bin/duplicity.1'
--- bin/duplicity.1 2017-04-22 19:30:28 +0000
+++ bin/duplicity.1 2017-04-24 16:35:20 +0000
@@ -48,6 +48,10 @@
 .I [options] [--force] [--extra-clean]
 target_url
 
+.B duplicity replicate
+.I [options] [--time time]
+source_url target_url
+
 .SH DESCRIPTION
 Duplicity incrementally backs up files and folders into
 tar-format volumes encrypted with GnuPG and places them to a
@@ -243,6 +247,19 @@
 .I --force
 will be needed to delete the files instead of just listing them.
 
+.TP
+.BI "replicate " "[--time time] <source_url> <target_url>"
+Replicate backup sets from source to target backend. Files will be
+(re)-encrypted and (re)-compressed depending on normal backend
+options. Signatures and volumes will not get recomputed, thus options like
+.BI --volsize
+or
+.BI --max-blocksize
+have no effect.
+When
+.I --time time
+is given, only backup sets older than time will be replicated.
+
 .SH OPTIONS
 
 .TP

=== modified file 'duplicity/collections.py'
--- duplicity/collections.py 2017-02-27 13:18:57 +0000
+++ duplicity/collections.py 2017-04-24 16:35:20 +0000
@@ -294,6 +294,15 @@
         """
         return len(self.volume_name_dict.keys())
 
+    def __eq__(self, other):
+        """
+        Return whether this backup set is equal to other
+        """
+        return self.type == other.type and \
+            self.time == other.time and \
+            self.start_time == other.start_time and \
+            self.end_time == other.end_time and \
+            len(self) == len(other)
 
 class BackupChain:
     """
@@ -642,7 +651,7 @@
              u"-----------------",
              _("Connecting with backend: %s") %
             (self.backend.__class__.__name__,),
-             _("Archive dir: %s") % (util.ufn(self.archive_dir_path.name),)]
+             _("Archive dir: %s") % (util.ufn(self.archive_dir_path.name if self.archive_dir_path else 'None'),)]
 
         l.append("\n" +
                  ngettext("Found %d secondary backup chain.",
@@ -697,7 +706,7 @@
                   len(backend_filename_list))
 
         # get local filename list
-        local_filename_list = self.archive_dir_path.listdir()
+        local_filename_list = self.archive_dir_path.listdir() if self.archive_dir_path else []
         log.Debug(ngettext("%d file exists in cache",
                            "%d files exist in cache",
                            len(local_filename_list)) %
@@ -894,7 +903,7 @@
         if filelist is not None:
             return filelist
         elif local:
-            return self.archive_dir_path.listdir()
+            return self.archive_dir_path.listdir() if self.archive_dir_path else []
         else:
             return self.backend.list()
 

=== modified file 'duplicity/commandline.py'
--- duplicity/commandline.py 2017-02-27 13:18:57 +0000
+++ duplicity/commandline.py 2017-04-24 16:35:20 +0000
@@ -54,6 +54,7 @@
 collection_status = None  # Will be set to true if collection-status command given
 cleanup = None  # Set to true if cleanup command given
 verify = None  # Set to true if verify command given
+replicate = None  # Set to true if replicate command given
 
 commands = ["cleanup",
             "collection-status",
@@ -65,6 +66,7 @@
             "remove-all-inc-of-but-n-full",
             "restore",
             "verify",
+            "replicate"
             ]
 
 
@@ -236,7 +238,7 @@
 def parse_cmdline_options(arglist):
     """Parse argument list"""
     global select_opts, select_files, full_backup
-    global list_current, collection_status, cleanup, remove_time, verify
+    global list_current, collection_status, cleanup, remove_time, verify, replicate
 
     def set_log_fd(fd):
         if fd < 1:
@@ -706,6 +708,9 @@
             num_expect = 1
         elif cmd == "verify":
             verify = True
+        elif cmd == "replicate":
+            replicate = True
+            num_expect = 2
 
         if len(args) != num_expect:
             command_line_error("Expected %d args, got %d" % (num_expect, len(args)))
@@ -724,7 +729,12 @@
     elif len(args) == 1:
         backend_url = args[0]
     elif len(args) == 2:
-        lpath, backend_url = args_to_path_backend(args[0], args[1])  # @UnusedVariable
+        if replicate:
+            if not backend.is_backend_url(args[0]) or not backend.is_backend_url(args[1]):
+                command_line_error("Two URLs expected for replicate.")
+            src_backend_url, backend_url= args[0], args[1]
+        else:
+            lpath, backend_url = args_to_path_backend(args[0], args[1])  # @UnusedVariable
     else:
         command_line_error("Too many arguments")
 
@@ -899,6 +909,7 @@
     duplicity remove-older-than %(time)s [%(options)s] %(target_url)s
     duplicity remove-all-but-n-full %(count)s [%(options)s] %(target_url)s
     duplicity remove-all-inc-of-but-n-full %(count)s [%(options)s] %(target_url)s
+    duplicity replicate %(source_url)s %(target_url)s
 
 """ % dict
 
@@ -944,7 +955,8 @@
   remove-older-than <%(time)s> <%(target_url)s>
   remove-all-but-n-full <%(count)s> <%(target_url)s>
   remove-all-inc-of-but-n-full <%(count)s> <%(target_url)s>
-  verify <%(target_url)s> <%(source_dir)s>""" % dict
+  verify <%(target_url)s> <%(source_dir)s>
+  replicate <%(source_url)s> <%(target_url)s>""" % dict
 
     return msg
 
@@ -1047,7 +1059,7 @@
 
 def check_consistency(action):
     """Final consistency check, see if something wrong with command line"""
-    global full_backup, select_opts, list_current
+    global full_backup, select_opts, list_current, collection_status, cleanup, replicate
 
     def assert_only_one(arglist):
         """Raises error if two or more of the elements of arglist are true"""
@@ -1058,8 +1070,8 @@
         assert n <= 1, "Invalid syntax, two conflicting modes specified"
 
     if action in ["list-current", "collection-status",
-                  "cleanup", "remove-old", "remove-all-but-n-full", "remove-all-inc-of-but-n-full"]:
-        assert_only_one([list_current, collection_status, cleanup,
+                  "cleanup", "remove-old", "remove-all-but-n-full", "remove-all-inc-of-but-n-full", "replicate"]:
+        assert_only_one([list_current, collection_status, cleanup, replicate,
                         globals.remove_time is not None])
     elif action == "restore" or action == "verify":
         if full_backup:
@@ -1137,22 +1149,27 @@
                    "file:///usr/local". See the man page for more information.""") % (args[0],),
                    log.ErrorCode.bad_url)
     elif len(args) == 2:
-        # Figure out whether backup or restore
-        backup, local_pathname = set_backend(args[0], args[1])
-        if backup:
-            if full_backup:
-                action = "full"
-            else:
-                action = "inc"
+        if replicate:
+            globals.src_backend = backend.get_backend(args[0])
+            globals.backend = backend.get_backend(args[1])
+            action = "replicate"
         else:
-            if verify:
-                action = "verify"
+            # Figure out whether backup or restore
+            backup, local_pathname = set_backend(args[0], args[1])
+            if backup:
+                if full_backup:
+                    action = "full"
+                else:
+                    action = "inc"
             else:
-                action = "restore"
+                if verify:
+                    action = "verify"
+                else:
+                    action = "restore"
 
-        process_local_dir(action, local_pathname)
-        if action in ['full', 'inc', 'verify']:
-            set_selection()
+            process_local_dir(action, local_pathname)
+            if action in ['full', 'inc', 'verify']:
+                set_selection()
     elif len(args) > 2:
         raise AssertionError("this code should not be reachable")
 

=== modified file 'duplicity/file_naming.py'
--- duplicity/file_naming.py 2016-06-28 21:03:46 +0000
+++ duplicity/file_naming.py 2017-04-24 16:35:20 +0000
@@ -436,3 +436,12 @@
         self.encrypted = encrypted  # true if gpg encrypted
 
         self.partial = partial
+
+    def __eq__(self, other):
+        return self.type == other.type and \
+            self.manifest == other.manifest and \
+            self.time == other.time and \
+            self.start_time == other.start_time and \
+            self.end_time == other.end_time and \
+            self.partial == other.partial
+
