This is useful when you have multiple snapshot backups of a directory tree and want to remove unnecessary duplication. It works only on filesystems that support hard links, which includes pretty much any default filesystem under UNIX, but not Windows/DOS type filesystems such as FAT.
One tree is the 'target', in which separate copies of identical files are replaced with hard links, and the other tree is the 'reference', against which each file is compared and potentially linked. In practice, it makes no difference to the result which tree is specified first, because both will end up with files pointing to the same disk blocks. However, when one tree is much larger than the other, linkify will run faster if the smaller one is the target tree (specified first). When de-duplicating backups, it is typical for the newer tree to be larger (as files accumulate over time), so it is usually faster to specify the older tree first and use the newer tree as the reference tree.
You can run linkify multiple times on the same pair of directories. Any files that have already been linkified will not be re-linkified, because the two filenames will have the same inode number and will therefore be skipped.
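The inode test described above can be sketched in a few lines of Python. This is only an illustration and not the code linkify itself uses; the function name already_linked is made up for this example.

```python
import os


def already_linked(path_a, path_b):
    """Return True if both names refer to the same inode on the same device."""
    stat_a = os.lstat(path_a)
    stat_b = os.lstat(path_b)
    # Inode numbers are only unique within one filesystem, so compare the
    # device ID as well as the inode number.
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)
```

Two names created with a hard link (for example via os.link) make this function return True, while two independent copies with identical contents make it return False.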
Due to the way hard links work on POSIX/UNIX systems, both directories must be on the same filesystem. Timestamps are preserved by setting the timestamp of the new "link" filename to that of the original (not the reference) file, so the new link entry should look exactly the same as the original one.
File ownership and permission settings are similarly preserved - each linkified file retains the ownership and permissions of the original file.
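If you want to confirm the same-filesystem requirement before a run, comparing the device IDs of the two tree roots is one way to do it. The Python sketch below is an assumption about how such a check could look, not part of linkify; the name same_filesystem is hypothetical.

```python
import os


def same_filesystem(dir_a, dir_b):
    """Return True if both directories report the same device ID."""
    # st_dev identifies the device (filesystem) that holds each directory.
    return os.stat(dir_a).st_dev == os.stat(dir_b).st_dev
```

This only inspects the two top-level directories. A tree that has another filesystem mounted somewhere beneath it would still pass the check, so treat it as a quick sanity test rather than a guarantee.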
Disclaimer: I'm not responsible for your loss of data under any circumstances, including but not limited to use or mis-use of this software. You use this software entirely at your own risk. This software changes data on your drive, and using it might cause you to lose data.
Usage: linkify [ -v | -d ] target-tree reference-tree
The "-v" and "-d" options enable verbose and debugging modes, respectively. Debugging mode prints more detailed debugging information.
linkify -v /var/spool/backups/20100104 /var/spool/backups/20100105
This example shows verbose output, so every file that is linkified will be displayed. Every filename in the first (target) tree is looked up in the same place in the second (reference) tree. A file is linkified only if the reference tree has a file with the same name there, the two are not already hard links to the same file (same inode number), and they have the same contents (compared using an md5 checksum).
The "linkification" process consists of the following steps for each file:
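The three conditions above (same relative name, different inode, same md5 checksum) can be sketched in Python as follows. This is a minimal illustration under assumptions, not linkify's actual code: the function names are invented, and both the regular-file test and the size comparison before hashing are additions made for this example.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_link_candidate(target_root, reference_root, rel_path):
    """Return True if the file at rel_path could be replaced by a hard link."""
    target = os.path.join(target_root, rel_path)
    reference = os.path.join(reference_root, rel_path)
    # Assumption: only regular files are considered, never symlinks.
    for path in (target, reference):
        if os.path.islink(path) or not os.path.isfile(path):
            return False
    # Same inode already: nothing to do.
    if os.path.samefile(target, reference):
        return False
    # Assumption: a cheap size check avoids hashing files that cannot match.
    if os.path.getsize(target) != os.path.getsize(reference):
        return False
    return md5_of(target) == md5_of(reference)
```

A caller would typically walk the target tree (for instance with os.walk), compute each file's path relative to the target root, and pass that path to is_link_candidate.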
If you think there might have been a problem, simply search the target tree for filenames matching the format used, for example:
find /var/spool/backups/20100104 -name .linkify.\* -ls
Then rename each file found back to its original name and you should be back to how things were before.