Dropbox Syncing – How does it work?

I am trying to understand what Dropbox sync is doing with a complex file backup.

I use Dropbox to back up a copy of my Outlook 2010 .pst file every 6 months.

I recently uploaded a backup .pst which was 6.8 GB. Dropbox uploaded at full speed (100 KB/s on ADSL2+) for less than 8 hours, then stopped with sync complete. By my calculations (8 hours x 360 MB/hr), at most about 3 GB of data was uploaded.
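For anyone wanting to check the estimate above, here is a minimal sketch of the arithmetic. It assumes a steady 100 KB/s for the full 8 hours and decimal units (1 MB = 1000 KB); the function name is made up for illustration.

```python
def max_uploaded_gb(rate_kb_per_s, hours):
    """Upper bound on data sent at a constant upload rate, in GB (decimal units)."""
    kb_total = rate_kb_per_s * 3600 * hours  # KB sent over the whole period
    return kb_total / 1_000_000              # KB -> GB


file_size_gb = 6.8                 # size of the .pst as reported in the post
sent_gb = max_uploaded_gb(100, 8)  # figures from the post

print(f"At most {sent_gb:.2f} GB sent, {file_size_gb - sent_gb:.2f} GB unaccounted for")
```

So even in the best case the line could not have carried more than roughly 2.9 GB, which leaves nearly 4 GB of the file that was never actually transmitted.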

The file shows up successfully in Dropbox as 6.8 GB.

I know Dropbox is capable of incremental updates, and I did have copies of past .pst file backups in other Dropbox folders. However, this backup was definitely a new copy, in a new folder. Without actually testing it, I am starting to think the backup might be corrupted. Any ideas what is going on?

Comments

  • +2

    Dropbox uses librsync. It basically produces a compressed difference between two files, and I guess the result would be smaller than transmitting the entire file again.

    • Much more succinct and timely than my belated post :)

  • +1

    I don't know enough about the .pst format to give a definitive answer but this is possibly normal behaviour.

    Some file formats, especially on Mac OS X and some Linux systems (e.g. OS X .pkg files, certain iWork files, Mac applications), are at a file-system level actually just directories. While the OS will present them to the user in a different way, services like Dropbox will see them as a folder with an extension.

    While the examples I've given are Mac-centric, Microsoft's .docx and .xlsx formats are actually zip files (see here for example). These files contain a directory with an XML version of the text/worksheets plus all of the other file resources.

    I know that Dropbox keeps checksums on all files and uses a deduplication process to reduce the amount of space lost to hosting identical files. It is possible .pst files aren't flat files (with unique checksums) but are instead some form of obscured directory or ZIP file, as with the two examples above. If this is the case, I'd hypothesise that Dropbox is merely uploading the new unique content.

    It's also possible that your upload speed increased significantly in your absence, or that the checksumming process for Dropbox happens in chunks or segments smaller than the .pst's total size and that dedupe has kicked in pre-upload.
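To illustrate the chunk-level dedupe idea, here is a rough sketch, not Dropbox's actual implementation. The 4 MiB chunk size, the use of SHA-256 and both function names are assumptions made purely for the example.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed chunk size, for illustration only


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 digest of each fixed-size chunk of data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def bytes_to_upload(data, known_hashes, chunk_size=CHUNK_SIZE):
    """Count the bytes in chunks whose digest the server has not seen before."""
    seen = set(known_hashes)
    total = 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in seen:
            total += len(chunk)
            seen.add(digest)  # an identical chunk later in the file is sent once
    return total
```

Under this model, a new .pst that shares most of its chunks with an older backup already stored in the account would only need its changed chunks sent, even if it lives in a different folder. Note that fixed-size chunks only match when the shared data sits at the same offsets in both files.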

    Finally, if you're especially worried I'd make an md5 checksum of a local copy and compare it with a checksum from a version retrieved from Dropbox.
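A minimal sketch of that comparison, using Python's standard hashlib. The file is read in blocks so a 6.8 GB .pst doesn't have to fit in memory; the two paths at the bottom are placeholders, not real locations.

```python
import hashlib


def md5_of_file(path, block_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it one block at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def files_match(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)


# Example usage with placeholder paths:
# print(files_match("backup-local.pst", "backup-from-dropbox.pst"))
```

For the check to mean anything, the second file should be a fresh download (e.g. from the Dropbox website), not the copy already sitting in the local Dropbox folder.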

    • The checksum should not be required, as the Dropbox service inherently uses a checksum at both ends to make sure they have the same file, so if it was corrupted it would correct itself.

      • +2

        True, which is why I suggested it only in the case he is "especially worried". Even still - every backup system should be subjected to an actual restore scenario once in a while.

        • +1

          Sounds like a plan - guess I should run a "fire drill" on my files at some stage.
