Comment 17 for bug 412470

Bart de Koning (bratdaking) wrote : Re: [Bug 412470] Re: hardlinking doesnt work with schedule per included folder enabled

Sounds like a solution.
The latest_snapshot folder will always hold the complete state as of the
most recent run, and in the folder named after the snapshot-id we store,
via hardlinks, only the folders that were included in that run.
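
A minimal sketch of that idea in Python, to make the two steps concrete. The function names (build_rsync_cmd, hardlink_included) are made up for illustration and are not part of BackInTime, and the rsync options are only one plausible choice. The rsync command is built but not run here; the hardlinking uses shutil.copytree with os.link as the copy function, which requires both folders to be on the same filesystem.

```python
import os
import shutil


def build_rsync_cmd(included, latest):
    """Build a single rsync call that refreshes latest_snapshot.

    --relative keeps the full source path below the destination, so
    /home/peter/documents ends up in <latest>/home/peter/documents.
    The exact option set is an assumption for this sketch.
    """
    return ["rsync", "-a", "--delete", "--relative",
            *included, latest.rstrip("/") + "/"]


def hardlink_included(latest, snapshot, included):
    """Hardlink only the included folders from latest into snapshot.

    Directories are created for real; regular files become hardlinks,
    so the new snapshot folder takes almost no extra space.
    """
    for folder in included:
        rel = folder.lstrip("/")
        src = os.path.join(latest, rel)
        dst = os.path.join(snapshot, rel)
        # copytree creates dst (and its parents) and calls os.link(src, dst)
        # for every file instead of copying the data.
        shutil.copytree(src, dst, symlinks=True, copy_function=os.link)
```

An hourly run would then pass only ["/home/peter/documents"] as included, and the monthly run would pass every configured folder.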

Advantages:

   1. we use less space than by copying everything each time
   2. we need only one rsync command (the --dry-run is unnecessary)
   3. we copy (hardlink) only the included folders

Disadvantages:

   1. right now every file is visible in every snapshot; it might not have
   been updated in that round, but it is visible, which makes it easy to
   scroll through the snapshot folders and find the particular version you
   want. If we use your proposed trick, a lot of the snapshot folders will
   show "file not present in this snapshot" (for files in less frequently
   backed-up folders)
   2. we use one complete snapshot more than before (the latest_snapshot)
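
The first disadvantage could be softened with a small lookup helper. The sketch below is only an illustration: the function name find_in_snapshots is hypothetical, and it assumes that snapshot folders are named by timestamp (like 20090831-204000) so that sorting the names sorts them by time. Starting at the snapshot the user is looking at, it walks back to the most recent snapshot that actually contains the file.

```python
import os


def find_in_snapshots(backup_root, snapshot_id, path):
    """Return the id of the newest snapshot at or before snapshot_id
    that contains path, or None if no such snapshot exists."""
    rel = path.lstrip("/")
    ids = sorted(
        name for name in os.listdir(backup_root)
        if name != "latest_snapshot"
        and name <= snapshot_id
        and os.path.isdir(os.path.join(backup_root, name))
    )
    # Walk from the requested snapshot back towards older ones.
    for sid in reversed(ids):
        if os.path.lexists(os.path.join(backup_root, sid, rel)):
            return sid
    return None
```

The GUI could call this when a file is missing from an hourly snapshot and offer the copy from the last monthly one instead.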

We could avoid the second disadvantage, though, by first making the
hardlinked copy under the name of the previous run and only then doing the
rsync for the next round (so the actual snapshot folder is created during
the next scheduled round). This gets pretty complicated, so I actually vote
against that option and for spending the extra space on an extra snapshot...

2009/9/3 Borph <email address hidden>

> I think we have to be clear what BackInTime should be able to do and
> which kind of program it is..
>
> It's nice that BiT can be run by any user, but as root you have more
> possibilities: you could use anacron or back up the whole system. Maybe
> this can be detected and BiT could behave differently?
>
> Should BiT support FAT32 and be (quite) efficient on it? This would mean
> to have only folders in the snapshot directory, which are rsynced. But
> in the GUI, it's nice to see each snapshot as a full folder set, even if
> only one of them was rsynced. I mean a good example is: backup (almost)
> the whole system monthly, but a specified folder like
> /home/peter/documents every hour.
>
> On FAT32, this takes a lot of space, which is completely wasted. On
> ext3/ext4, hardlinking solves this problem, but is still an overhead
> when it comes to a lot of files.
>
> If you say goodbye to the a-snapshot-has-all-files approach, an hourly
> snapshot would contain only /home/peter/documents, the monthly all files
> and the "latest_snapshot" also (like I wrote under answers). But of
> course it can get much less convenient to look for a specific file, or
> extra logic is needed to support the user. Actually, this extra logic
> could be a soft link! I will explain my idea:
>
> There is more than one folder configured and they have different
> frequencies. Let's say "/" monthly and "/home/peter/documents" hourly.
>
> In the backintime directory, there is a folder "latest_snapshot" which
> always has the latest status of the files which got rsync'ed. Also there
> are the snapshots like 20090831-204000 or so. A cron job would work like
> this:
>
> If it's the monthly run, rsync including both folders mentioned using
> root and the "latest_snapshot". Then copy (cp -al) it to the snapshot
> (with the date in the name).
>
> If it's the hourly run, rsync only including the
> "/home/peter/documents", also using "latest_snapshot". Then copy (cp
> -al) _this_ folder to the named snapshot.
>
> It knows which snapshot was taken at the next-lower frequency (here
> monthly) and soft-links the ignored folders into the current
> snapshot. Soft links have a big disadvantage: we have to be careful
> when removing snapshots! If we don't create the links, the new snapshot
> only contains 'documents', which is ok for me and saves space on FAT32.
>
> About the problem of non-disjoint folders: If the subfolder has a lower
> frequency, it can be excluded with "--exclude" and won't be rsynced.
> Don't do the softlinks in that case. If the subfolder has higher
> frequency, this shouldn't be a problem, as only the subfolder got
> rsynced and copied into the new snapshot.
>
> --
> hardlinking doesnt work with schedule per included folder enabled
> https://bugs.launchpad.net/bugs/412470
>
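
For the non-disjoint folders in the quoted proposal, the selection logic could look roughly like the sketch below. It assumes each configured folder has an interval in hours (1 for hourly, 720 for roughly monthly) and that a run at a given interval syncs every folder whose interval is that short or shorter, which matches the monthly run syncing both folders. The names plan_run and is_inside are hypothetical. The function only returns the two lists; turning the excluded folders into correctly anchored rsync --exclude patterns is left out.

```python
import os


def is_inside(child, parent):
    """True if child is a strict subfolder of parent."""
    child = os.path.normpath(child)
    parent = os.path.normpath(parent)
    return child != parent and os.path.commonpath([child, parent]) == parent


def plan_run(schedule, run_interval):
    """Split the configured folders for one scheduled run.

    schedule maps folder -> interval in hours. Folders due in this run
    are included; subfolders of an included folder that have a longer
    interval (lower frequency) are excluded from the rsync.
    """
    included = sorted(f for f, iv in schedule.items() if iv <= run_interval)
    excluded = sorted(
        f for f, iv in schedule.items()
        if iv > run_interval and any(is_inside(f, inc) for inc in included)
    )
    return included, excluded
```

With "/home/peter" hourly and "/home/peter/videos" monthly, the hourly run would include "/home/peter" and exclude "/home/peter/videos", while the monthly run would include both and exclude nothing.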