Rsync to S3 in 5 Minutes

A short while ago I needed to move a few terabytes of data to the Amazon cloud. I wanted to consolidate several different sets of files into one place and give the users access to it. I was out of space!

Some of the data is stored on a Linux file system and some on a Windows file system, and it is backed up to a ZFS filestore on a Solaris 11 box. Some of the data I wanted to use was within the ZFS snapshots. I use rsync scripts to perform backups, and on this Linux box I interact with the Windows boxes via Samba shares.

No Other Quick Solution

I googled the subject of using rsync to move data to S3, but did not find any solution that I could quickly and easily implement without digging in and putting in a respectable amount of time.

User Side

To give the users access to the future data store, I found a Windows-based solution called Cloudberry Drive that gave my users a Windows share backed by an S3 bucket, and that led me to my rsync solution:

A hack, really:

Mount the new Windows share (courtesy of Cloudberry Drive) from the Linux box using Samba, and bingo, you can rsync your data straight to S3!

Security

I looked into the security of the share and found it to be domain-centric: basically, only authenticated domain users could access this share. In this case that is sufficient for this data, so I was a go there.

The steps to complete were:

  1. Download and install the Cloudberry Drive software on a Windows box and configure it to access your S3 bucket. This will set up a Windows share that any Windows box can reach, for example: \\server\data
  2. From the Linux box, mount the new Windows share using Samba.
  3. Fire up rsync and move that data!
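
Steps 2 and 3 might look roughly like the sketch below. It is an illustration under a few assumptions, not the exact commands behind this post: the share is mounted with mount -t cifs (from the cifs-utils package), and the mount point, credentials file, and source directory are hypothetical placeholders. Only the share name comes from the example above (\\server\data is written //server/data on Linux). By default the script only prints the commands; set DRY_RUN=0 and run it as root to execute them.

```shell
#!/bin/sh
# Sketch only. Every value below is a hypothetical placeholder except the
# share name, which mirrors the \\server\data example from step 1.
SHARE="${SHARE:-//server/data}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/s3data}"   # must already exist
CREDS="${CREDS:-/root/.smbcredentials}"     # holds username=, password=, domain=
SRC="${SRC:-/srv/data/}"                    # trailing slash: copy the contents
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Step 2: mount the Windows share over SMB/CIFS (needs root and cifs-utils).
mount_share() {
    run mount -t cifs "$SHARE" "$MOUNT_POINT" -o "credentials=$CREDS"
}

# Step 3: copy the data onto the mounted share.
# -r recurse, -t keep modification times, -v verbose. Assumption: -a is
# avoided because a CIFS mount may not store Unix owners and permissions.
sync_data() {
    run rsync -rtv "$SRC" "$MOUNT_POINT/"
}

mount_share && sync_data
```

Keeping the domain credentials in a root-only file rather than on the command line fits the domain-authenticated share described under Security. Whether modification times survive the trip through the share depends on the software behind it, so it is worth checking a small test directory first.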

So in reality, this is a workaround, but it was fast to set up, it gives the end users the ability to reach the data easily, and thousands of gigabytes of data later, I am still using it.

So I thought it was worth sharing.