Beginner's Guide to Unison
Unison is a file-synchronization tool for Unix and Windows. It allows two replicas of a collection of files and directories to be stored on different hosts (or different disks on the same host), modified separately, and then brought up to date by propagating the changes in each replica to the other.
Note: This review is a short summary of the official manual. Please use version 2.52 or newer to avoid version interoperability issues.
A Demo
Unison can be used with either of two user interfaces: a textual interface and a graphical interface.
Let's consider a simple scenario and see how to synchronize two directories on a single machine.
- Install Unison. Basically, we need only two executable binary files, unison and unison-gui, downloaded from the proper release tarball in its GitHub repository. Set up a work directory and a mirror directory for our illustration:

      mkdir work
      touch work/a work/b
      mkdir work/d
      touch work/d/f
      cp -r work mirror
- Try synchronizing work and mirror by typing unison work mirror. Since they are identical, synchronizing them won’t propagate any changes, but Unison will remember the current state of both directories so that it can tell next time what has changed.
  - Textual interface: you should see a message notifying you that all the files are actually equal, and then be returned to the command line. You may also get a warning about creating archives (the private data structure used by Unison), as this is the first run of Unison.
  - Graphical interface: you should get a big empty window with a message at the bottom notifying you that everything is up-to-date.
- Make some changes in work and mirror:

      rm work/a
      echo "Hello" > work/b
      echo "Hello" > mirror/b
      date > mirror/c
      echo "Hi there" > work/d/h
      echo "Hello there" > mirror/d/h
- Try synchronizing work and mirror again by typing unison work mirror. Let us walk through the behavior of the textual interface in this case.
- Unison will display only the files that are different and ask for actions one by one. If a file has been changed in the same way and remains identical in both directories, Unison will simply note that the file is up-to-date and nothing will be shown. So we expect three changes to be decided: the deleted file a in work, the new file c in mirror, and the conflicting changes on d/h.

  Unison will report the creation of c in mirror and prompt with a line like

      <--- new file c [f]
  We can follow Unison’s recommendation by pressing f or [ENTER] at the prompt, or we can ignore this file and leave both replicas alone by pressing /. Press ? for a list of possible responses and their meanings. See also this question for an explanation of the key L and matching conditions.

  Similarly, Unison will report the deletion of a in work and prompt with a line like

      deleted ---> a [f]
  For the conflicting changes on d/h, Unison will prompt with a line like

      new file <-?-> new file d/h []
  Suppose we skip the file d/h and accept the changes to files a and c. Unison will then briefly summarize the actions it is about to perform and ask for confirmation:

      2 items will be synced, 1 skipped
      0 B to be synced from work to mirror
      32 B to be synced from mirror to work

      Proceed with propagating updates? []
- Finally, if we confirm, Unison will apply the changes and print a log of the process.

The usage of the graphical interface is similar. The main window shows all the files that have been modified. To override a default action (or to select an action when there is no default), first select the file by clicking on its name, then press the desired action key mentioned before. When you are satisfied with the propagation of changes as shown in the main window, click the Go button to set them in motion.
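The walkthrough above can also be replayed without any prompts. The script below is a minimal sketch, not part of the official manual: it assumes a POSIX shell, builds the replicas in a scratch directory, and uses the -batch preference so that Unison propagates non-conflicting changes and skips conflicts on its own. The helper name sync_replicas is made up for this example, and the sync step is simply skipped if no unison binary is found.

```shell
#!/bin/sh
# Non-interactive version of the demo. Everything lives in a scratch
# directory so that nothing in your home directory is touched.
set -eu

demo=$(mktemp -d)
cd "$demo"

# Keep Unison's archives inside the scratch directory too
# (the UNISON environment variable overrides the default ~/.unison).
UNISON="$demo/.unison"
export UNISON

# Step 1: two identical replicas.
mkdir work
touch work/a work/b
mkdir work/d
touch work/d/f
cp -r work mirror

# Hypothetical helper: run a batch sync if the unison binary is available.
sync_replicas() {
    if command -v unison >/dev/null 2>&1; then
        # -batch asks no questions. A non-zero exit status (for example
        # because an item was skipped) should not abort this script.
        unison work mirror -batch || true
    else
        echo "unison not found; skipping sync" >&2
    fi
}

sync_replicas   # first run: creates the archives

# Step 2: the same edits as in the walkthrough.
rm work/a
echo "Hello" > work/b
echo "Hello" > mirror/b
date > mirror/c
echo "Hi there" > work/d/h
echo "Hello there" > mirror/d/h

sync_replicas   # second run: a and c are propagated, d/h is left alone
```

After the second run, the conflicting d/h still differs between the two directories, which matches the "1 skipped" line in the interactive summary.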
Basic Concepts
Below is a short summary of the official manual.
- Roots. A replica’s root tells Unison where to find a set of files to be synchronized, either on the local machine or on a remote host. The pattern of the root is [protocol:]//[user@][host][:port][path]. When path is given without any protocol prefix, the protocol is assumed to be file. Other possible protocol arguments include ssh and socket. If path is a relative path, then it specifies a local root relative to the directory where Unison is started.
- Paths. A path refers to a point within a set of files being synchronized; it is specified relative to the root of the replica. Formally, a path is just a sequence of names, separated by /. The empty path (i.e., the empty sequence of names) denotes the whole replica. Unison displays the empty path as [root].
- Descendants. If p is a path and q is a path beginning with p, then q is said to be a descendant of p. Thus, each path is also a descendant of itself.
- Contents. The contents of a path p in a particular replica could be a file, a directory, a symbolic link, or absent (if p does not refer to anything at all in that replica). More specifically:
  - If p refers to an ordinary file, then the contents of p are the actual contents of this file (a string of bytes) plus the current permission bits of the file.
  - If p refers to a symbolic link, then the contents of p are just the string specifying where the link points.
  - If p refers to a directory, then the contents of p are just the token DIRECTORY plus the current permission bits of the directory.
  - If p does not refer to anything in this replica, then the contents of p are the token ABSENT.

  Unison keeps a record (called the archive) of the contents of each path after each successful synchronization of that path (i.e., it remembers the contents at the last moment when they were the same in the two replicas).
- Update. A path is updated (in some replica) if its current contents are different from its contents the last time it was successfully synchronized.
- Conflicts. A path is said to be conflicting if the following conditions all hold:
- it has been updated in one replica,
- any of its descendants has been updated in the other replica,
- its contents in the two replicas are not identical.
- Reconciliation. Unison operates in several distinct stages:
- On each host, it compares its archive file (which records the state of each path in the replica when it was last synchronized) with the current contents of the replica, to determine which paths have been updated.
- It checks for false conflicts — paths that have been updated on both replicas, but whose current values are identical. These paths are silently marked as synchronized in the archive files in both replicas.
- It displays all the updated paths to the user. For updates that do not conflict, it suggests a default action (propagating the new contents from the updated replica to the other). Conflicting updates are just displayed. The user is given an opportunity to examine the current state of affairs, change the default actions for nonconflicting updates, and choose actions for conflicting updates.
- It performs the selected actions, one at a time. Each action is performed by first transferring the new contents to a temporary file on the receiving host, then atomically moving them into place.
- It updates its archive files to reflect the new state of the replicas.
- Invariants. Unison is careful to protect both its internal state and the state of the replicas at every point in this process. Specifically, the following guarantees are enforced:
- At every moment, each path in each replica has either
- its original contents (i.e., no change at all has been made to this path), or
- its correct final contents (i.e., the value that the user expected to be propagated from the other replica).
- At every moment, the information stored on disk about Unison’s private state can be either
- unchanged, or
- updated to reflect those paths that have been successfully synchronized.
If Unison is interrupted while enforcing these guarantees, some manual cleanup may be required. In this case, a file called DANGER.README will be left in the .unison directory, containing information about the operation that was interrupted. The next time you try to run Unison, it will notice this file and warn you about it.

If Unison is interrupted, it may sometimes leave temporary working files (with suffix .tmp) in the replicas. It is safe to delete these files. Also, if the backups flag is set, Unison will leave around old versions of files that it overwrites, with names like file.0.unison.bak. These can be deleted safely when they are no longer wanted.

If Unison finds that its archive files have been deleted (or that the archive format has changed and they cannot be read, or that they don’t exist because this is the first run of Unison on these particular roots), it takes a conservative approach: it behaves as though the replicas had both been completely empty at the point of the last synchronization. Thus, it is also safe to delete the archive files on both replicas. The next time Unison runs, it will assume that all the files it sees in the replicas are new.
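The definitions of update, false conflict, and conflict above boil down to a three-way comparison between the archived contents and the current contents in each replica. The function below is a simplified sketch of that comparison for a single path, not Unison's actual implementation: the name classify and the output strings are invented for this example, descendants are ignored, and each argument is just a string standing in for the contents (a checksum or one of the tokens DIRECTORY and ABSENT).

```shell
#!/bin/sh
# classify ARCHIVED A B
# ARCHIVED is the value recorded at the last successful synchronization;
# A and B are the current values in the two replicas.
classify() {
    archived=$1 a=$2 b=$3
    if [ "$a" = "$b" ]; then
        if [ "$a" = "$archived" ]; then
            echo "unchanged"
        else
            echo "false conflict"      # updated on both sides, same result
        fi
    elif [ "$a" = "$archived" ]; then
        echo "propagate B -> A"        # only B was updated
    elif [ "$b" = "$archived" ]; then
        echo "propagate A -> B"        # only A was updated
    else
        echo "conflict"                # both updated, contents differ
    fi
}

# The paths from the demo, with A = work and B = mirror:
classify file   ABSENT file     # a   -> propagate A -> B
classify empty  Hello  Hello    # b   -> false conflict
classify ABSENT ABSENT date     # c   -> propagate B -> A
classify ABSENT hi     hello    # d/h -> conflict
```

Note that a deletion is just another update here: for a, the "new contents" propagated from work to mirror is the token ABSENT.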
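The fourth reconciliation stage (transfer to a temporary file on the receiving side, then move it into place) is what makes the "original contents or correct final contents" guarantee possible. Here is a minimal local sketch of the same idea, assuming a POSIX shell with mktemp available. The function name install_atomically and the .sync.XXXXXX template are made up for this example and are not the names Unison uses.

```shell
#!/bin/sh
# install_atomically SRC DEST
# Copy SRC to a temporary file next to DEST, then rename it over DEST.
install_atomically() {
    src=$1 dest=$2
    # Create the temporary file in DEST's directory so that the final mv
    # stays on one filesystem, where it is a rename rather than a copy.
    tmp=$(mktemp "$(dirname "$dest")/.sync.XXXXXX") || return 1
    if cp -p "$src" "$tmp" && mv -f "$tmp" "$dest"; then
        return 0
    fi
    rm -f "$tmp"    # transfer failed: DEST is left untouched
    return 1
}
```

A reader of DEST therefore sees either the old file or the complete new one, never a half-written copy; an interruption between the two steps leaves only a stray temporary file behind.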
Typical Usage
Once you are comfortable with the basic operation of Unison, you may find yourself wanting to use it regularly to synchronize your commonly used files. There are several possible ways of going about this:
- Synchronize your whole home directory, using the Ignore facility to avoid synchronizing particular directories and files.
- Synchronize your whole home directory, but tell Unison to synchronize only some of the files and subdirectories within it. This can be accomplished by specifying path preferences in your profile (or -path arguments on the command line).
- Create another directory called shared (or current, or whatever) on each host, and put all the files you want to synchronize into this directory. Tell Unison to synchronize shared among different hosts.
- Create another directory called shared (or current, or whatever) on each host, and put links to all the files you want to synchronize into this directory. Use the follow preference to make Unison treat these links as transparent.
Unison is designed for synchronizing pairs of replicas. However, it is possible to use it to keep larger groups of machines in sync by performing multiple pairwise synchronizations. If you need to do this, the most reliable way to set things up is to organize the machines into a star topology with one machine designated as the hub and the rest as spokes and with each spoke machine synchronizing only with the hub.
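As an illustration of the first two approaches, the snippet below writes a small profile that combines path and ignore preferences. It is only a sketch: the user alice, the host hub.example.org, and the directory names are placeholders, and the profile is written to a scratch directory (or to the directory given as the first argument) rather than straight into your real .unison directory.

```shell
#!/bin/sh
set -eu

# Target directory for the profile: first argument, or a scratch directory.
profile_dir=${1:-$(mktemp -d)}

cat > "$profile_dir/home.prf" <<'EOF'
# Roots: the local home directory and the same directory on a remote hub.
# (alice and hub.example.org are placeholders.)
root = /home/alice
root = ssh://alice@hub.example.org//home/alice

# Only these paths (relative to the roots) are synchronized.
path = Documents
path = projects

# Skip editor backups and a cache directory.
ignore = Name *~
ignore = Path projects/cache
EOF

echo "wrote $profile_dir/home.prf"
```

Once a file like this sits in the .unison directory, running unison home selects it by name. In a star topology, each spoke would carry a profile of this shape pointing at the hub.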
Caveats and Shortcomings
Here are some things to be careful of when using Unison.
- Unison does not understand renames; it sees a rename as a delete and a separate create. You need to be very CAREFUL when renaming directories containing ignored files.

  For example, suppose Unison is synchronizing directory A between two machines, called the local and the remote machine; suppose directory A contains a subdirectory D; and suppose D on the local machine contains a file or subdirectory P that matches an ignore directive in the profile used to synchronize. Thus path A/D/P exists on the local machine but not on the remote machine.

  If D is renamed to Dnew on the remote machine, and this change is propagated to the local machine, all such files or subdirectories P will be deleted. This is because Unison sees the rename as a delete and a separate create: it deletes the old directory (including the ignored files) and creates a new one (not including the ignored files, since they are completely invisible to it).
- It can be very DANGEROUS to use Unison with removable media such as USB drives unless you are careful. If you synchronize a directory that is stored on removable media when the media is not present, it will look to Unison as though the whole directory has been deleted, and it will proceed to delete the directory from the other replica!
- Archives are keyed on the names of the roots (among other information), so renaming a root makes Unison think the pair has never been synchronized before. For example, assume you have run Unison to sync work and mirror, and you then rename mirror to backup and change some files in backup. Running unison work backup will now create new archives and ask you to resolve conflicts. In this case, you may find the option -prefer backup useful; it resolves any conflicts in favor of the files in backup.
- If you want to run Unison periodically as a crontab task, you have to ensure that the script is not started again until its previous run has finished. Otherwise two Unison instances will be working on the same targets and interfering with each other. For example, if a new sync is scheduled every 10 minutes and a sync of big files takes longer than that, the runs will overlap.
- The graphical user interface is single-threaded. This means that if Unison is performing some long-running operation, the display will not be repainted until it finishes. We recommend not trying to do anything with the user interface while Unison is in the middle of detecting changes or propagating files.
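The removable-media and crontab caveats can both be softened with a small wrapper around the sync command. The sketch below is one possible approach, not something Unison ships: guarded_sync, the lock directory, and the sentinel file are all names invented for this example. The sentinel is a file you create once inside the replica on the removable drive, and the lock relies on mkdir either creating the directory or failing.

```shell
#!/bin/sh
# guarded_sync LOCKDIR SENTINEL COMMAND [ARGS...]
# Run COMMAND only if SENTINEL exists and no other guarded_sync holds LOCKDIR.
guarded_sync() {
    lockdir=$1 sentinel=$2
    shift 2

    # Removable-media guard: if the drive is not mounted, the sentinel file
    # is missing and we refuse to sync instead of propagating a "deletion".
    if [ ! -e "$sentinel" ]; then
        echo "sentinel $sentinel missing; is the drive mounted?" >&2
        return 2
    fi

    # Overlap guard: only one caller at a time can create the directory.
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous sync still running; skipping this run" >&2
        return 3
    fi

    status=0
    "$@" || status=$?
    rmdir "$lockdir"
    return $status
}

# Hypothetical cron usage (all paths are placeholders):
# guarded_sync /tmp/unison-work.lock /media/usb/work/.present \
#     unison work /media/usb/work -batch
```

One limitation of this sketch: if the script is killed while the sync is running, the lock directory is left behind and has to be removed by hand before the next run can start. On Linux, flock(1) is a common alternative that does not have this problem.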
Going Further
The official manual is here and the FAQ is here.
Besides the basic concepts mentioned in this blog, you may also want to look at the following sections in the official manual:
- Section 6.1 Running Unison
- Section 6.2 The .unison Directory
- Section 6.4 Preferences
- Section 6.5 Profiles
- Section 6.6 Sample Profiles
- Section 6.7 Keeping Backups
- Section 6.8 Merging Conflicting Versions
- Section 6.12 Path Specification
- Section 6.13 Ignoring Paths