Koozali.org: home of the SME Server

Affa hardlinking questions

JohnG
« on: December 06, 2012, 05:45:28 AM »
I realize hardlinking occurs between multiple Affa runs, but are hardlinks created when a single Affa job archives folders that contain identical files? Also, are hardlinks created between separate Affa jobs that contain identical files?

mmccarn
« Reply #1 on: December 06, 2012, 01:13:54 PM »
Every Affa backup folder is a complete backup.

Files found to be identical on the source and in the previous backup are hard linked rather than re-copied.

If your first backup run contains:
Folder1
+filea
+fileb
+filec

... and you edit "fileb" and back up again, your second backup folder will contain:

Folder1
+filea (hardlink to identical filea from preceding run)
+fileb (new file; only changes transferred)
+filec (hardlink to identical filec from preceding run)
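
One way to confirm which files in two backup folders are really hard links to each other is to compare device and inode numbers. The following is only a minimal Python sketch: it does not call Affa or rsync, it imitates the two runs above with os.link, and the folder names "first" and "second" and the helper names are made up for illustration.

```python
import os
import tempfile


def same_file(path_a, path_b):
    """True when both paths are hard links to one inode on one device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def simulate_two_runs(root):
    """Build hypothetical 'first' and 'second' backup folders under root."""
    first = os.path.join(root, "first", "Folder1")
    second = os.path.join(root, "second", "Folder1")
    os.makedirs(first)
    os.makedirs(second)
    for name in ("filea", "fileb", "filec"):
        with open(os.path.join(first, name), "w") as fh:
            fh.write("original " + name)
    # Unchanged files are hard linked; the edited file is written anew.
    for name in ("filea", "filec"):
        os.link(os.path.join(first, name), os.path.join(second, name))
    with open(os.path.join(second, "fileb"), "w") as fh:
        fh.write("edited fileb")
    return first, second


with tempfile.TemporaryDirectory() as root:
    first, second = simulate_two_runs(root)
    report = {
        name: same_file(os.path.join(first, name), os.path.join(second, name))
        for name in ("filea", "fileb", "filec")
    }
    print(report)
```

On a filesystem that supports hard links, filea and filec should report True and fileb False. The same same_file check can be pointed at two real archive folders.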

If I remember correctly, Affa uses rsync's "--link-dest=..." option. From the rsync man page:
--link-dest=DIR
    This option behaves like --copy-dest, but unchanged files are hard linked from DIR to the destination directory. The files must be identical in all preserved attributes (e.g. permissions, possibly ownership) in order for the files to be linked together. An example:

        rsync -av --link-dest=$PWD/prior_dir host:src_dir/ new_dir/

    If files aren't linking, double-check their attributes. Also check if some attributes are getting forced outside of rsync's control, such as a mount option that squishes root to a single user, or mounts a removable drive with generic ownership (such as OS X's "Ignore ownership on this volume" option).

    Beginning in version 2.6.4, multiple --link-dest directories may be provided, which will cause rsync to search the list in the order specified for an exact match. If a match is found that differs only in attributes, a local copy is made and the attributes updated. If a match is not found, a basis file from one of the DIRs will be selected to try to speed up the transfer.

    This option works best when copying into an empty destination hierarchy, as rsync treats existing files as definitive (so it never looks in the link-dest dirs when a destination file already exists), and as malleable (so it might change the attributes of a destination file, which affects all the hard-linked versions).

    Note that if you combine this option with --ignore-times, rsync will not link any files together because it only links identical files together as a substitute for transferring the file, never as an additional check after the file is updated.

    If DIR is a relative path, it is relative to the destination directory. See also --compare-dest and --copy-dest.

    Note that rsync versions prior to 2.6.1 had a bug that could prevent --link-dest from working properly for a non-super-user when -o was specified (or implied by -a). You can work around this bug by avoiding the -o option when sending to an old rsync.

JohnG
« Reply #2 on: December 06, 2012, 07:51:48 PM »
Thanks. I was actually referring to within a run rather than subsequent runs.

If Folder2 contains identical copies of all the files from Folder1, then during the first Affa backup, does it hardlink all of Folder2's files back to Folder1?

In other words, is there hardlinking within a single Affa run, or only when comparing subsequent runs?

mmccarn
« Reply #3 on: December 06, 2012, 08:52:42 PM »
Between runs there is always hardlinking unless your storage device doesn't support hardlinks (an Affa backup to a Windows share, anyone?).
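
If you are unsure whether a backup destination supports hardlinks, one rough test is to try creating one there. This is only a sketch in Python, not something Affa provides: the function name supports_hardlinks is hypothetical, and some network shares may accept the call while not storing a true hard link, so treat a positive result with some caution.

```python
import os
import tempfile


def supports_hardlinks(directory):
    """Try to create a hard link inside directory; report whether it worked."""
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        source = os.path.join(scratch, "probe")
        link = os.path.join(scratch, "probe.link")
        with open(source, "w") as fh:
            fh.write("probe")
        try:
            os.link(source, link)
        except OSError:
            # The filesystem (or the mount) refused to create the link.
            return False
        # A real hard link raises the link count of the source to 2.
        return os.stat(source).st_nlink == 2


# Example: probe the system temp directory; replace it with your backup root.
result = supports_hardlinks(tempfile.gettempdir())
print(result)
```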

Hardlinks within a single run would be a function of "deduplication":

dedup
Name:    dedup
Value:    yes or no
Multivalue:    no
Default:    no
Description:    Deduplication removes duplicate files to save backup space. When set to 'yes', file deduplication runs after the synchronization has completed. It looks for files that have identical content, user, group and permissions, and replaces duplicates with hardlinks. Deduplication scans the just-completed archive and the previous one, usually scheduled.0 and daily.0 or scheduled.0 and scheduled.1. Consider this scenario: a user has renamed directories or files. Rsync sees those as new ones and copies them. Deduplication finds the identical copies in the previous archive and replaces them with hardlinks. To use deduplication, the Freedup program needs to be installed. Affa actually runs freedup -upg scheduled.0 <previous_archive>.
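
To make the idea concrete, here is a rough Python sketch of content-based deduplication. It is not how freedup is implemented and it is not part of Affa; the names file_key and dedup_tree are hypothetical. It assumes, following the description above, that files count as duplicates when content, user, group and permissions all match, and it reads each file fully into memory, which is fine for a demonstration but not for large archives.

```python
import hashlib
import os
import tempfile


def file_key(path):
    """Key on content hash plus owner, group and permission bits."""
    st = os.stat(path)
    with open(path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    return (digest, st.st_uid, st.st_gid, st.st_mode)


def dedup_tree(*roots):
    """Replace duplicate regular files under roots with hard links.

    Returns the number of files that were replaced.
    """
    first_seen = {}
    replaced = 0
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in sorted(files):
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                keeper = first_seen.setdefault(file_key(path), path)
                if keeper == path or os.path.samefile(keeper, path):
                    continue  # first copy seen, or already hard linked
                tmp = path + ".dedup-tmp"
                os.link(keeper, tmp)   # create the link under a temporary name
                os.replace(tmp, path)  # then swap it in over the duplicate
                replaced += 1
    return replaced


with tempfile.TemporaryDirectory() as root:
    # Folder1 and Folder2 hold identical copies, as in the question above.
    for folder in ("Folder1", "Folder2"):
        os.makedirs(os.path.join(root, folder))
        with open(os.path.join(root, folder, "filea"), "w") as fh:
            fh.write("same content")
    replaced = dedup_tree(root)
    linked = os.path.samefile(
        os.path.join(root, "Folder1", "filea"),
        os.path.join(root, "Folder2", "filea"),
    )
    print(replaced, linked)
```

Passing two directories, for example the newest archive and the previous one, would link duplicates across both, which is the general shape of what the description says the freedup call does.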