Merge lp:~dawgfoto/duplicity/replicate into lp:duplicity

Proposed by Martin Nowak on 2017-04-20
Status: Merged
Merged at revision: 1209
Proposed branch: lp:~dawgfoto/duplicity/replicate
Merge into: lp:duplicity
Diff against target: 380 lines (+190/-24)
5 files modified
bin/duplicity (+116/-2)
bin/duplicity.1 (+17/-0)
duplicity/collections.py (+12/-3)
duplicity/commandline.py (+36/-19)
duplicity/file_naming.py (+9/-0)
To merge this branch: bzr merge lp:~dawgfoto/duplicity/replicate
Reviewer: duplicity-team
Date requested: 2017-04-20
Status: Pending
Review via email: mp+322836@code.launchpad.net

Description of the change

Initial request for feedback.

Add replicate command to replicate a backup (or backup sets older than a given time) to another backend, leveraging duplicity's backend and compression/encryption infrastructure.
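Based on the synopsis added to the man page in this branch, a hypothetical invocation might look like the following (the URLs and the 30-day cutoff are made up for illustration):

```shell
# Mirror all backup sets older than 30 days from a local backend to a
# remote one; volumes are re-compressed/re-encrypted per current options.
duplicity replicate --time 30D \
    file:///var/backups/duplicity \
    sftp://user@mirror.example.com/duplicity
```

Since signatures and volumes are copied rather than recomputed, options such as --volsize have no effect on the replicated data.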

lp:~dawgfoto/duplicity/replicate updated on 2017-05-04
1191. By Kenneth Loafman on 2017-04-22

2017-04-20 Kenneth Loafman <email address hidden>

    * Fixed bug #1680682 with patch supplied from Dave Allan
      - Only specify --pinentry-mode=loopback when --use-agent is not specified
    * Fixed man page that had 'cancel' instead of 'loopback' for pinentry mode
    * Fixed bug #1684312 with suggestion from Wade Rossman
      - Use shutil.copyfile instead of os.system('cp ...')
      - Should reduce overhead of os.system() memory usage.

1192. By Kenneth Loafman on 2017-04-23

* Merged in lp:~dernils/duplicity/testing
  - Fixed minor stuff in requirements.txt.
  - Added a Dockerfile for testing.
  - Minor changes to README files.
  - Added README-TESTING with some information on testing.

1193. By Kenneth Loafman on 2017-04-23

* Merged in lp:~dernils/duplicity/documentation
  - Minor changes to README-REPO, README-TESTING
  - Also redo the changes to requirements.txt and Dockerfile

1194. By ken on 2017-04-25

* Add rdiff install and newline at end of file.

1195. By Kenneth Loafman on 2017-04-25

Move pep8 and pylint to requirements.

1196. By Kenneth Loafman on 2017-04-26

Whoops, deleted too much. Add rdiff again.

1197. By Kenneth Loafman on 2017-04-26

Merged in lp:~dernils/duplicity/Dockerfile
Fixed variable name change in commandline.py

1198. By Kenneth Loafman on 2017-04-27

More changes for testing:
- keep gpg1 version for future testing
- some changes for debugging functional tests
- add gpg-agent.conf with allow-loopback-pinentry

1199. By ken on 2017-04-27

A little reorg, just keeping pip things together.

1200. By ken on 2017-04-28

Quick fix for bug #1680682 and gnupg v1, add missing comma.

1201. By ken on 2017-04-28

- Simplify Dockerfile per https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
- Add a .dockerignore file
- Uncomment some debug prints
- quick fix for bug #1680682 and gnupg v1, add missing comma

1202. By ken on 2017-04-28

Move branch duplicity up the food chain.

1203. By ken on 2017-04-28

Add test user and swap to non-privileged.

1204. By ken on 2017-04-29

- Remove dependencies we did not need

1205. By ken on 2017-04-30

Merged in lp:~dernils/duplicity/Dockerfile
- separated requirements into requirements for duplicity (in requirements.txt) and for testing (in tox.ini)

1206. By ken on 2017-04-30

Add libffi-dev back. My bad.

1207. By ken on 2017-04-30

You need tox to run tox. Doh!

1208. By ken on 2017-05-03

We need tzdata (timezone data).

1209. By Kenneth Loafman on 2017-05-04

* Merged in lp:~dawgfoto/duplicity/replicate
  - Add replicate command to replicate a backup (or backup
    sets older than a given time) to another backend, leveraging
    duplicity's backend and compression/encryption infrastructure.
* Fixed some incoming PyLint and PEP-8 errors.

Aaron Whitehouse (aaron-whitehouse) wrote:

Many thanks Martin!

Could you please submit some tests that ensure your code keeps working as expected? There should be an example of most things you want to test and I would be happy to help navigate them to the extent that I can.

Just an end-to-end functional test would be a great start.

Preview Diff

=== modified file 'bin/duplicity'
--- bin/duplicity 2017-03-02 22:38:47 +0000
+++ bin/duplicity 2017-04-24 16:35:20 +0000
@@ -28,6 +28,7 @@
 # any suggestions.
 
 import duplicity.errors
+import copy
 import gzip
 import os
 import platform
@@ -1006,6 +1007,116 @@
                    "\n" + chain_times_str(chainlist) + "\n" +
                    _("Rerun command with --force option to actually delete."))
 
+def replicate():
+    """
+    Replicate backup files from one remote to another, possibly encrypting or adding parity.
+
+    @rtype: void
+    @return: void
+    """
+    time = globals.restore_time or dup_time.curtime
+    src_stats = collections.CollectionsStatus(globals.src_backend, None).set_values(sig_chain_warning=None)
+    tgt_stats = collections.CollectionsStatus(globals.backend, None).set_values(sig_chain_warning=None)
+
+    src_list = globals.src_backend.list()
+    tgt_list = globals.backend.list()
+
+    src_chainlist = src_stats.get_signature_chains(local=False, filelist=src_list)[0]
+    tgt_chainlist = tgt_stats.get_signature_chains(local=False, filelist=tgt_list)[0]
+    sorted(src_chainlist, key=lambda chain: chain.start_time)
+    sorted(tgt_chainlist, key=lambda chain: chain.start_time)
+    if not src_chainlist:
+        log.Notice(_("No old backup sets found."))
+        return
+    for src_chain in src_chainlist:
+        try:
+            tgt_chain = filter(lambda chain: chain.start_time == src_chain.start_time, tgt_chainlist)[0]
+        except IndexError:
+            tgt_chain = None
+
+        tgt_sigs = map(file_naming.parse, tgt_chain.get_filenames()) if tgt_chain else []
+        for src_sig_filename in src_chain.get_filenames():
+            src_sig = file_naming.parse(src_sig_filename)
+            if not (src_sig.time or src_sig.end_time) < time:
+                continue
+            try:
+                tgt_sigs.remove(src_sig)
+                log.Info(_("Signature %s already replicated") % (src_sig_filename,))
+                continue
+            except ValueError:
+                pass
+            if src_sig.type == 'new-sig':
+                dup_time.setprevtime(src_sig.start_time)
+            dup_time.setcurtime(src_sig.time or src_sig.end_time)
+            log.Notice(_("Replicating %s.") % (src_sig_filename,))
+            fileobj = globals.src_backend.get_fileobj_read(src_sig_filename)
+            filename = file_naming.get(src_sig.type, encrypted=globals.encryption, gzipped=globals.compression)
+            tdp = dup_temp.new_tempduppath(file_naming.parse(filename))
+            tmpobj = tdp.filtered_open(mode='wb')
+            util.copyfileobj(fileobj, tmpobj) # decrypt, compress, (re)-encrypt
+            fileobj.close()
+            tmpobj.close()
+            globals.backend.put(tdp, filename)
+            tdp.delete()
+
+    src_chainlist = src_stats.get_backup_chains(filename_list = src_list)[0]
+    tgt_chainlist = tgt_stats.get_backup_chains(filename_list = tgt_list)[0]
+    sorted(src_chainlist, key=lambda chain: chain.start_time)
+    sorted(tgt_chainlist, key=lambda chain: chain.start_time)
+    for src_chain in src_chainlist:
+        try:
+            tgt_chain = filter(lambda chain: chain.start_time == src_chain.start_time, tgt_chainlist)[0]
+        except IndexError:
+            tgt_chain = None
+
+        tgt_sets = tgt_chain.get_all_sets() if tgt_chain else []
+        for src_set in src_chain.get_all_sets():
+            if not src_set.get_time() < time:
+                continue
+            try:
+                tgt_sets.remove(src_set)
+                log.Info(_("Backupset %s already replicated") % (src_set.remote_manifest_name,))
+                continue
+            except ValueError:
+                pass
+            if src_set.type == 'inc':
+                dup_time.setprevtime(src_set.start_time)
+            dup_time.setcurtime(src_set.get_time())
+            rmf = src_set.get_remote_manifest()
+            mf_filename = file_naming.get(src_set.type, manifest=True)
+            mf_tdp = dup_temp.new_tempduppath(file_naming.parse(mf_filename))
+            mf = manifest.Manifest(fh=mf_tdp.filtered_open(mode='wb'))
+            for i, filename in src_set.volume_name_dict.iteritems():
+                log.Notice(_("Replicating %s.") % (filename,))
+                fileobj = restore_get_enc_fileobj(globals.src_backend, filename, rmf.volume_info_dict[i])
+                filename = file_naming.get(src_set.type, i, encrypted=globals.encryption, gzipped=globals.compression)
+                tdp = dup_temp.new_tempduppath(file_naming.parse(filename))
+                tmpobj = tdp.filtered_open(mode='wb')
+                util.copyfileobj(fileobj, tmpobj) # decrypt, compress, (re)-encrypt
+                fileobj.close()
+                tmpobj.close()
+                globals.backend.put(tdp, filename)
+
+                vi = copy.copy(rmf.volume_info_dict[i])
+                vi.set_hash("SHA1", gpg.get_hash("SHA1", tdp))
+                mf.add_volume_info(vi)
+
+                tdp.delete()
+
+            mf.fh.close()
+            # incremental GPG writes hang on close, so do any encryption here at once
+            mf_fileobj = mf_tdp.filtered_open_with_delete(mode='rb')
+            mf_final_filename = file_naming.get(src_set.type, manifest=True, encrypted=globals.encryption, gzipped=globals.compression)
+            mf_final_tdp = dup_temp.new_tempduppath(file_naming.parse(mf_final_filename))
+            mf_final_fileobj = mf_final_tdp.filtered_open(mode='wb')
+            util.copyfileobj(mf_fileobj, mf_final_fileobj) # compress, encrypt
+            mf_fileobj.close()
+            mf_final_fileobj.close()
+            globals.backend.put(mf_final_tdp, mf_final_filename)
+            mf_final_tdp.delete()
+
+    globals.src_backend.close()
+    globals.backend.close()
 
 def sync_archive(decrypt):
     """
@@ -1408,8 +1519,9 @@
     check_resources(action)
 
     # check archive synch with remote, fix if needed
-    decrypt = action not in ["collection-status"]
-    sync_archive(decrypt)
+    if not action == "replicate":
+        decrypt = action not in ["collection-status"]
+        sync_archive(decrypt)
 
     # get current collection status
     col_stats = collections.CollectionsStatus(globals.backend,
@@ -1483,6 +1595,8 @@
         remove_all_but_n_full(col_stats)
     elif action == "sync":
         sync_archive(True)
+    elif action == "replicate":
+        replicate()
     else:
         assert action == "inc" or action == "full", action
         # the passphrase for full and inc is used by --sign-key

=== modified file 'bin/duplicity.1'
--- bin/duplicity.1 2017-04-22 19:30:28 +0000
+++ bin/duplicity.1 2017-04-24 16:35:20 +0000
@@ -48,6 +48,10 @@
 .I [options] [--force] [--extra-clean]
 target_url
 
+.B duplicity replicate
+.I [options] [--time time]
+source_url target_url
+
 .SH DESCRIPTION
 Duplicity incrementally backs up files and folders into
 tar-format volumes encrypted with GnuPG and places them to a
@@ -243,6 +247,19 @@
 .I --force
 will be needed to delete the files instead of just listing them.
 
+.TP
+.BI "replicate " "[--time time] <source_url> <target_url>"
+Replicate backup sets from source to target backend. Files will be
+(re)-encrypted and (re)-compressed depending on normal backend
+options. Signatures and volumes will not get recomputed, thus options like
+.BI --volsize
+or
+.BI --max-blocksize
+have no effect.
+When
+.I --time time
+is given, only backup sets older than time will be replicated.
+
 .SH OPTIONS
 
 .TP

=== modified file 'duplicity/collections.py'
--- duplicity/collections.py 2017-02-27 13:18:57 +0000
+++ duplicity/collections.py 2017-04-24 16:35:20 +0000
@@ -294,6 +294,15 @@
         """
         return len(self.volume_name_dict.keys())
 
+    def __eq__(self, other):
+        """
+        Return whether this backup set is equal to other
+        """
+        return self.type == other.type and \
+            self.time == other.time and \
+            self.start_time == other.start_time and \
+            self.end_time == other.end_time and \
+            len(self) == len(other)
 
 class BackupChain:
     """
@@ -642,7 +651,7 @@
              u"-----------------",
              _("Connecting with backend: %s") %
             (self.backend.__class__.__name__,),
-             _("Archive dir: %s") % (util.ufn(self.archive_dir_path.name),)]
+             _("Archive dir: %s") % (util.ufn(self.archive_dir_path.name if self.archive_dir_path else 'None'),)]
 
         l.append("\n" +
                  ngettext("Found %d secondary backup chain.",
@@ -697,7 +706,7 @@
                  len(backend_filename_list))
 
         # get local filename list
-        local_filename_list = self.archive_dir_path.listdir()
+        local_filename_list = self.archive_dir_path.listdir() if self.archive_dir_path else []
         log.Debug(ngettext("%d file exists in cache",
                            "%d files exist in cache",
                            len(local_filename_list)) %
@@ -894,7 +903,7 @@
         if filelist is not None:
             return filelist
         elif local:
-            return self.archive_dir_path.listdir()
+            return self.archive_dir_path.listdir() if self.archive_dir_path else []
         else:
             return self.backend.list()
 

=== modified file 'duplicity/commandline.py'
--- duplicity/commandline.py 2017-02-27 13:18:57 +0000
+++ duplicity/commandline.py 2017-04-24 16:35:20 +0000
@@ -54,6 +54,7 @@
 collection_status = None  # Will be set to true if collection-status command given
 cleanup = None  # Set to true if cleanup command given
 verify = None  # Set to true if verify command given
+replicate = None  # Set to true if replicate command given
 
 commands = ["cleanup",
             "collection-status",
@@ -65,6 +66,7 @@
             "remove-all-inc-of-but-n-full",
             "restore",
             "verify",
+            "replicate"
             ]
 
 
@@ -236,7 +238,7 @@
 def parse_cmdline_options(arglist):
     """Parse argument list"""
     global select_opts, select_files, full_backup
-    global list_current, collection_status, cleanup, remove_time, verify
+    global list_current, collection_status, cleanup, remove_time, verify, replicate
 
     def set_log_fd(fd):
         if fd < 1:
@@ -706,6 +708,9 @@
         num_expect = 1
     elif cmd == "verify":
         verify = True
+    elif cmd == "replicate":
+        replicate = True
+        num_expect = 2
 
     if len(args) != num_expect:
         command_line_error("Expected %d args, got %d" % (num_expect, len(args)))
@@ -724,7 +729,12 @@
     elif len(args) == 1:
         backend_url = args[0]
     elif len(args) == 2:
-        lpath, backend_url = args_to_path_backend(args[0], args[1])  # @UnusedVariable
+        if replicate:
+            if not backend.is_backend_url(args[0]) or not backend.is_backend_url(args[1]):
+                command_line_error("Two URLs expected for replicate.")
+            src_backend_url, backend_url= args[0], args[1]
+        else:
+            lpath, backend_url = args_to_path_backend(args[0], args[1])  # @UnusedVariable
     else:
         command_line_error("Too many arguments")
 
@@ -899,6 +909,7 @@
     duplicity remove-older-than %(time)s [%(options)s] %(target_url)s
     duplicity remove-all-but-n-full %(count)s [%(options)s] %(target_url)s
     duplicity remove-all-inc-of-but-n-full %(count)s [%(options)s] %(target_url)s
+    duplicity replicate %(source_url)s %(target_url)s
 
 """ % dict
 
@@ -944,7 +955,8 @@
   remove-older-than <%(time)s> <%(target_url)s>
   remove-all-but-n-full <%(count)s> <%(target_url)s>
   remove-all-inc-of-but-n-full <%(count)s> <%(target_url)s>
-  verify <%(target_url)s> <%(source_dir)s>""" % dict
+  verify <%(target_url)s> <%(source_dir)s>
+  replicate <%(source_url)s> <%(target_url)s>""" % dict
 
     return msg
 
@@ -1047,7 +1059,7 @@
 
 def check_consistency(action):
     """Final consistency check, see if something wrong with command line"""
-    global full_backup, select_opts, list_current
+    global full_backup, select_opts, list_current, collection_status, cleanup, replicate
 
     def assert_only_one(arglist):
         """Raises error if two or more of the elements of arglist are true"""
@@ -1058,8 +1070,8 @@
         assert n <= 1, "Invalid syntax, two conflicting modes specified"
 
     if action in ["list-current", "collection-status",
-                  "cleanup", "remove-old", "remove-all-but-n-full", "remove-all-inc-of-but-n-full"]:
-        assert_only_one([list_current, collection_status, cleanup,
+                  "cleanup", "remove-old", "remove-all-but-n-full", "remove-all-inc-of-but-n-full", "replicate"]:
+        assert_only_one([list_current, collection_status, cleanup, replicate,
                          globals.remove_time is not None])
     elif action == "restore" or action == "verify":
         if full_backup:
@@ -1137,22 +1149,27 @@
 "file:///usr/local". See the man page for more information.""") % (args[0],),
                            log.ErrorCode.bad_url)
     elif len(args) == 2:
-        # Figure out whether backup or restore
-        backup, local_pathname = set_backend(args[0], args[1])
-        if backup:
-            if full_backup:
-                action = "full"
-            else:
-                action = "inc"
+        if replicate:
+            globals.src_backend = backend.get_backend(args[0])
+            globals.backend = backend.get_backend(args[1])
+            action = "replicate"
         else:
-            if verify:
-                action = "verify"
+            # Figure out whether backup or restore
+            backup, local_pathname = set_backend(args[0], args[1])
+            if backup:
+                if full_backup:
+                    action = "full"
+                else:
+                    action = "inc"
             else:
-                action = "restore"
+                if verify:
+                    action = "verify"
+                else:
+                    action = "restore"
 
-        process_local_dir(action, local_pathname)
-        if action in ['full', 'inc', 'verify']:
-            set_selection()
+            process_local_dir(action, local_pathname)
+            if action in ['full', 'inc', 'verify']:
+                set_selection()
     elif len(args) > 2:
         raise AssertionError("this code should not be reachable")
 

=== modified file 'duplicity/file_naming.py'
--- duplicity/file_naming.py 2016-06-28 21:03:46 +0000
+++ duplicity/file_naming.py 2017-04-24 16:35:20 +0000
@@ -436,3 +436,12 @@
         self.encrypted = encrypted  # true if gpg encrypted
 
         self.partial = partial
+
+    def __eq__(self, other):
+        return self.type == other.type and \
+            self.manifest == other.manifest and \
+            self.time == other.time and \
+            self.start_time == other.start_time and \
+            self.end_time == other.end_time and \
+            self.partial == other.partial
+
