Name: lp:~stragerneds/duplicity/duplicity-0.7-perf
Status: Development
Last Modified: 2019-07-06 03:52:03 UTC
Last Commit: 1377. Optimize loading backup chains; reduc...

Author: Matthew Glazar
Revision Date: 2019-07-01 06:45:01 UTC

Optimize loading backup chains; reduce file_naming.parse calls

For each filename in filename_list,
CollectionsStatus.get_backup_chains calls file_naming.parse
(through BackupSet.add_filename) between 0 and len(sets)*2
times. In the worst case, that adds up to
len(filename_list)*len(sets)*2 calls to file_naming.parse,
nearly all of them redundant.

For example, when running 'duplicity collection-status' on
one of my backup directories:

* filename_list contains 7545 files
* get_backup_chains creates 2515 BackupSet-s
* get_backup_chains calls file_naming.parse 12650450 times!

This command took 9 minutes and 32 seconds. Similar
commands, like no-op incremental backups, also take a long
time. (The directory being backed up contains only 9 MiB
across 30 files.)
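
As a rough sanity check on the figures above, the following Python
sketch compares the observed call count with the len(sets)*2 upper
bound per filename. The variable names are illustrative and are not
taken from duplicity; only the three input numbers and the elapsed
time come from the measurements quoted above.

```python
# Figures quoted in the commit message.
filenames = 7545                 # entries in filename_list
backup_sets = 2515               # BackupSet-s created by get_backup_chains
observed_parse_calls = 12650450  # measured calls to file_naming.parse
elapsed_seconds = 9 * 60 + 32    # 9 minutes 32 seconds

# Upper bound: each filename may be parsed up to len(sets)*2 times.
worst_case_calls = filenames * backup_sets * 2

# Average number of parse calls spent on each filename.
calls_per_filename = observed_parse_calls / filenames

print("worst-case bound:  ", worst_case_calls)
print("observed calls:    ", observed_parse_calls)
print("observed / bound:  ", observed_parse_calls / worst_case_calls)
print("calls per filename:", round(calls_per_filename, 1))
print("elapsed seconds:   ", elapsed_seconds)
```

With these inputs the observed count is exactly one third of the
upper bound, and a single parse per filename would need only 7545
calls.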

Avoid many redundant calls to file_naming.parse by hoisting
the call outside the loop over BackupSet-s. This
optimization makes 'duplicity collection-status' about *20
times faster* for me (572 seconds -> 29 seconds).
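
The hoisting described above can be illustrated with a minimal,
self-contained Python sketch. This is not duplicity's code: parse,
ToySet, add_parsed, group_slow and group_fast are hypothetical
stand-ins, and the filename pattern is invented for the example. It
only shows the general shape of the change, namely parsing each
filename once and passing the parsed result to every candidate set.

```python
import re

CALLS = {"parse": 0}  # counts calls to the toy parser


def parse(filename):
    """Toy stand-in for a filename parser: (timestamp, volume) or None."""
    CALLS["parse"] += 1
    m = re.match(r"backup\.(\d+)\.vol(\d+)$", filename)
    return (int(m.group(1)), int(m.group(2))) if m else None


class ToySet:
    """Hypothetical backup set: groups volumes sharing one timestamp."""

    def __init__(self):
        self.timestamp = None
        self.volumes = {}

    def add_parsed(self, filename, parsed):
        """Accept the file if it parsed and matches this set's timestamp."""
        if parsed is None:
            return False
        timestamp, volume = parsed
        if self.timestamp is None:
            self.timestamp = timestamp
        elif self.timestamp != timestamp:
            return False
        self.volumes[volume] = filename
        return True

    def add_filename(self, filename):
        # Unhoisted variant: every set re-parses the same filename.
        return self.add_parsed(filename, parse(filename))


def group_slow(filenames):
    """Parse inside the loop over sets (the costly pattern)."""
    sets = []
    for name in filenames:
        for s in sets:
            if s.add_filename(name):
                break
        else:
            new = ToySet()
            if new.add_filename(name):
                sets.append(new)
    return sets


def group_fast(filenames):
    """Parse once per filename, outside the loop over sets."""
    sets = []
    for name in filenames:
        parsed = parse(name)  # hoisted call
        for s in sets:
            if s.add_parsed(name, parsed):
                break
        else:
            new = ToySet()
            if new.add_parsed(name, parsed):
                sets.append(new)
    return sets


def count_calls(group, filenames):
    """Run a grouping function and report (parse calls, resulting sets)."""
    CALLS["parse"] = 0
    sets = group(filenames)
    return CALLS["parse"], sets


names = ["backup.%d.vol%d" % (t, v) for t in range(50) for v in range(3)]
slow_calls, slow_sets = count_calls(group_slow, names)
fast_calls, fast_sets = count_calls(group_fast, names)
print("slow:", slow_calls, "parse calls;", "fast:", fast_calls, "parse calls")
```

In this toy setup both functions build the same groups, but the
hoisted version calls parse exactly once per filename, while the
unhoisted version's call count grows with the number of sets already
created.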

Aside from improving performance, this commit should not
change behavior.
