AWS Snowball User Guide
Speeding Up Data Transfer
2. Batch small files together – Each copy operation has some overhead because of encryption.
Therefore, performing many transfers on individual files has slower overall performance than
transferring the same data in larger files. You can significantly improve your transfer speed for small
files by batching them in a single snowball cp command. Batching of small files is enabled by
default. During the import process into Amazon S3, these batched files are automatically extracted to
their original state. For more information, see Options for the snowball cp Command (p. 60).
3. Perform multiple copy operations at one time – If your workstation is powerful enough, you can
perform multiple snowball cp commands at one time. You can do this by running each command
from a separate terminal window, in separate instances of the Snowball client, all connected to the
same Snowball.
4. Copy from multiple workstations – You can connect a single Snowball to multiple workstations. Each
workstation can host a separate instance of the Snowball client.
5. Transfer directories, not files – Because there is overhead for each snowball cp command, we don't
recommend that you queue a large number of individual copy commands. Queuing many commands
has a significant negative impact on your transfer performance.
For example, say that you have a directory called C:\\MyFiles that only contains three files, file1.txt,
file2.txt, and file3.txt. Suppose that you issue the following three commands.
snowball cp C:\\MyFiles\file1.txt s3://mybucket
snowball cp C:\\MyFiles\file2.txt s3://mybucket
snowball cp C:\\MyFiles\file3.txt s3://mybucket
In this scenario, you have three times as much overhead as if you transferred the entire directory with
the following copy command.
snowball cp -r C:\\MyFiles\* s3://mybucket
6. Don't perform other operations on files during transfer – Renaming files during transfer, changing
their metadata, or writing data to the files during a copy operation has a significant negative impact
on transfer performance. We recommend that your files remain in a static state while you transfer
them.
7. Reduce local network use – Your Snowball communicates across your local network. Because of
this, reducing other local network traffic between the Snowball, the switch it's connected to, and the
workstation that hosts your data source can improve data transfer speeds.
8. Eliminate unnecessary hops – We recommend that you set up your Snowball, your data source, and
your workstation so that they're the only machines communicating across a single switch. Doing so
can result in a significant improvement of data transfer speeds.
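Tips 3 and 5 above can be combined in a small script. The following is a minimal sketch, not part of the Snowball client: it assumes the snowball executable is on the PATH, that starting each command as its own process is equivalent to running it from a separate terminal window, and that passing a directory to snowball cp -r copies its contents. The function names build_copy_command and run_copies, and the runner parameter, are hypothetical and exist only for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_copy_command(source_dir, destination):
    """Return ONE recursive copy command for a whole directory.

    Issuing one command per directory, rather than one per file,
    avoids paying the per-command overhead for every file.
    """
    return ["snowball", "cp", "-r", source_dir, destination]


def run_copies(source_dirs, destination, max_parallel=2, runner=subprocess.run):
    """Run one recursive copy per directory, at most max_parallel at a time.

    `runner` is a hypothetical hook: it defaults to subprocess.run, and can
    be replaced with a stand-in to preview the commands without copying.
    """
    commands = [build_copy_command(d, destination) for d in source_dirs]
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Each worker thread blocks on one snowball cp process.
        results = list(pool.map(runner, commands))
    return commands, results
```

To preview the commands without starting a transfer, pass a stand-in runner such as runner=print. The max_parallel value is a starting point only. The right number depends on how powerful your workstation is.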
Experimenting to Get Better Performance
Your performance results will vary based on your hardware, your network, how many and how large
your files are, and how they're stored. Therefore, we suggest that you experiment with your performance
metrics if you're not getting the performance that you want.
First, add concurrent copy operations one at a time until you see a reduction in overall transfer performance.
Performing multiple copy operations at once can have a significant positive impact on your overall
transfer performance. For example, suppose that you have a single snowball cp command running in
a terminal window, and you note that it's transferring data at 30 MB/second. You open a second terminal
window, and run a second snowball cp command on another set of files that you want to transfer. You
see that both commands are performing at 30 MB/second. In this case, your total transfer performance
is 60 MB/second.
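The arithmetic in this example is a sum of the per-command rates. The sketch below records that sum for each level of concurrency you try and reports the level with the highest total. The function names are hypothetical. In the usage note, only the 30 MB/second figures come from the example above, and the other rates are invented to show the comparison.

```python
def total_throughput(rates_mb_per_s):
    """Aggregate throughput is the sum of the rates of all running commands."""
    return sum(rates_mb_per_s)


def best_parallelism(measurements):
    """Pick the concurrency level with the highest aggregate throughput.

    `measurements` is a list of trials; each trial is the list of
    per-command rates (MB/second) observed while that many commands ran.
    Returns (number_of_commands, aggregate_rate) for the best trial.
    """
    best_count, best_total = 0, 0.0
    for rates in measurements:
        total = total_throughput(rates)
        if total > best_total:
            best_count, best_total = len(rates), total
    return best_count, best_total
```

For the two commands in the example, total_throughput([30, 30]) gives 60. With invented trials such as [[30], [30, 30], [25, 25, 25], [15, 15, 15, 15]], best_parallelism selects three commands, because the fourth command lowers the total.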
Now, suppose that you connect to the Snowball from a separate workstation. You run the Snowball
client from that workstation to execute a third snowball cp command on another set of files that