Sync command guide

Ze Qian Zhang edited this page Mar 1, 2019 · 4 revisions

Scenario

AzCopy offers a convenient sync command that saves on costs when transferring large amounts of data. It enumerates both the source and destination directories and compares the last modified times of the source and destination files to determine which files need to be transferred.

Behavior

For a given file at the source, AzCopy transfers it to the destination if:

  1. it does not exist at the destination, or
  2. it exists at the destination, but the destination copy's last modified time is earlier than that of the source file
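
The two conditions above can be sketched as a small decision function. This is only an illustration of the rule as described on this page, not AzCopy's actual implementation; the function name `should_transfer` and the use of `None` for a missing destination file are assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Optional


def should_transfer(source_mtime: datetime, dest_mtime: Optional[datetime]) -> bool:
    """Hypothetical helper: decide whether a source file would be transferred.

    dest_mtime is None when the file does not exist at the destination
    (an assumption of this sketch).
    """
    # Condition 1: the file does not exist at the destination.
    if dest_mtime is None:
        return True
    # Condition 2: the destination copy is older than the source file.
    return dest_mtime < source_mtime


# Example timestamps, chosen only for illustration.
older = datetime(2019, 3, 1, 10, 0, tzinfo=timezone.utc)
newer = datetime(2019, 3, 1, 12, 0, tzinfo=timezone.utc)

missing_at_destination = should_transfer(newer, None)
destination_is_stale = should_transfer(newer, older)
destination_is_fresh = should_transfer(older, newer)
```

Note that under this rule a destination file with the same last modified time as the source is not transferred.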

By default, AzCopy leaves alone any extra files at the destination that are not present at the source. However, the user can choose to delete them by setting --delete-destination to either True (delete silently) or Prompt (ask for permission if extra files are detected). Please be careful if your sync targets an entire container, since you might end up deleting a lot of files! On a side note, users are encouraged to enable the soft-delete feature to guard against data loss in general.
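
The handling of extra destination files can be sketched roughly as follows. This is an illustrative model of the behavior described above, not AzCopy's code: the function `plan_deletions`, the `confirm` callback, and the use of the string "false" for the default mode are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Optional


def plan_deletions(
    source_names: Iterable[str],
    dest_names: Iterable[str],
    delete_destination: str = "false",  # assumed spelling of the default mode
    confirm: Optional[Callable[[List[str]], bool]] = None,  # hypothetical prompt hook
) -> List[str]:
    """Hypothetical helper: list the destination files that would be deleted."""
    # Extra files exist at the destination but not at the source.
    extras = sorted(set(dest_names) - set(source_names))
    mode = delete_destination.lower()
    if mode == "false":
        return []  # default: leave extra files alone
    if mode == "true":
        return extras  # delete without asking
    if mode == "prompt":
        # Ask only when extra files were actually detected.
        if extras and confirm is not None and confirm(extras):
            return extras
        return []
    raise ValueError(f"unsupported mode: {delete_destination!r}")


source = ["a.txt", "b.txt"]
destination = ["a.txt", "c.txt", "d.txt"]

kept_by_default = plan_deletions(source, destination)
deleted_silently = plan_deletions(source, destination, "True")
deleted_after_yes = plan_deletions(source, destination, "Prompt", confirm=lambda extras: True)
```

Listing the extra files before deleting anything, as `plan_deletions` does, is also a reasonable way to review the impact of a sync that targets an entire container.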

Examples

Please run ./azcopy sync --help to see the examples.

Caveats

  • The command syntax has been significantly simplified in 10.0.8:
    • Source and destination pairs can be:
      1. file <-> blob
      2. local dir <-> blob container/virtual dir
    • Note: for dir <-> dir sync, the command compares the contents of the source and destination directories. If the same pair were given to the copy command instead, the source dir would be placed under the destination dir.
    • The recursive flag is on by default, since that is the most common scenario for sync.
    • Include/exclude patterns apply to file names only
      • Ex: for a given file /Users/foo/bar/file_name.txt, any pattern (include or exclude) is matched only against the file name part of the path ("file_name.txt"). Please refer to the examples for illustration.
  • For the sync command to work properly, the machine on which it runs should have a reasonably accurate system clock, since last modified times are critical in determining whether a file should be transferred. If there is significant clock skew, the user is advised not to modify the destination shortly before or after running a sync command.
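
The file-name-only matching of include/exclude patterns noted in the caveats above can be illustrated with a short sketch. It assumes glob-style wildcards (via Python's `fnmatch`), which may differ from the pattern syntax AzCopy actually accepts; the helper names `name_matches` and `is_selected`, and the rule that exclude takes precedence over include, are likewise assumptions of this example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Sequence


def name_matches(path: str, patterns: Sequence[str]) -> bool:
    """Hypothetical helper: match patterns against the file name only."""
    # Only the final path component takes part in matching.
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def is_selected(path: str, include: Sequence[str] = (), exclude: Sequence[str] = ()) -> bool:
    """Hypothetical helper: apply exclude first, then include (assumed order)."""
    if exclude and name_matches(path, exclude):
        return False
    if include:
        return name_matches(path, include)
    return True


path = "/Users/foo/bar/file_name.txt"

matches_on_name = name_matches(path, ["*.txt"])       # pattern fits "file_name.txt"
ignores_directories = name_matches(path, ["bar*"])    # "bar" is a directory, not the name
excluded_by_name = is_selected(path, exclude=["file_*"])
```

In this sketch a pattern such as `bar*` does not select the file even though `bar` appears in its path, because directory components are never compared.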

Features to come

  • Support for symbolic links when uploading, which will enable users to upload data from SMB shares too
  • Support for more source/destination pairs