AWS Snowball User Guide
Step 5: Create Your Jobs Using the AWS Snowball Management Console
Now that you know how many Snowballs you need, you can create an import job for each appliance.
Because each Snowball import job involves a single Snowball, you create multiple import jobs. For more
information, see Create an Import Job (p. 17).
Step 6: Separate Your Data into Transfer Segments
As a best practice for large data transfers involving multiple jobs, we recommend that you separate
your data into a number of smaller, manageable data transfer segments. If you separate the data this
way, you can transfer each segment one at a time, or multiple segments in parallel. When planning your
segments, make sure that the combined size of all segments assigned to a job fits on the Snowball for
that job. When segmenting your data transfer, take care not to copy the same files or directories multiple
times. Some examples of separating your transfer into segments are as follows:
• You can make 10 segments of 4 TB each for a 50 TB Snowball.
• For large files, each file can be an individual segment.
• Each segment can be a different size, and each individual segment can be made of the same kind of
data—for example, batched small files in one segment, large files in another segment, and so on. This
approach helps you determine your average transfer rate for different types of files.
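One way to draft such a plan is sketched below in Python. It is an illustration only and is not part of the Snowball client: the names plan_segments and Segment and the parameters segment_limit and capacity are hypothetical, and you supply the usable capacity of your appliance yourself. The sketch gives each large file its own segment, groups smaller files until a segment reaches the size limit, and rejects a plan whose combined size exceeds the capacity. Because every path is a dictionary key, no file is assigned to more than one segment.

```python
from dataclasses import dataclass, field

TB = 10**12  # decimal terabyte, in bytes


@dataclass
class Segment:
    """A hypothetical transfer segment: a list of paths and their total size."""
    paths: list = field(default_factory=list)
    size: int = 0


def plan_segments(file_sizes, segment_limit, capacity):
    """Group files into segments of at most segment_limit bytes.

    file_sizes maps each path to its size in bytes. capacity is the usable
    space on the Snowball for this job (an input you provide, in bytes).
    """
    total = sum(file_sizes.values())
    if total > capacity:
        raise ValueError(
            f"Segments total {total} bytes, which exceeds capacity {capacity}"
        )

    segments = []
    current = Segment()
    for path, size in sorted(file_sizes.items()):
        if size >= segment_limit:
            # A large file becomes an individual segment.
            segments.append(Segment([path], size))
            continue
        if current.paths and current.size + size > segment_limit:
            # The current segment is full; close it and start a new one.
            segments.append(current)
            current = Segment()
        current.paths.append(path)
        current.size += size
    if current.paths:
        segments.append(current)
    return segments
```

For example, plan_segments(sizes, segment_limit=4 * TB, capacity=usable_bytes) would produce segments of up to 4 TB, where sizes and usable_bytes are values you have gathered for your own data and appliance.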
Note
Metadata operations are performed for each file transferred. Regardless of a file's size, this
overhead remains the same. Therefore, you get faster performance by batching small files
together. For implementation information on batching small files, see Options for the snowball
cp Command (p. 60).
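The effect of this fixed per-file overhead can be illustrated with a deliberately simplified model, sketched here in Python. The model and every number passed to it are assumptions made for illustration, not measured Snowball figures: it treats transfer time as a constant cost per file plus the time needed to move the bytes at a steady rate.

```python
def estimated_seconds(file_count, total_bytes, per_file_overhead_s, bytes_per_second):
    """Simplified estimate: fixed overhead per file plus raw byte transfer time.

    All inputs are hypothetical values that you would replace with your own
    calibration measurements.
    """
    return file_count * per_file_overhead_s + total_bytes / bytes_per_second


def batching_saving_seconds(file_count, batch_count, per_file_overhead_s):
    """Overhead avoided when file_count small files travel as batch_count batches."""
    return (file_count - batch_count) * per_file_overhead_s
```

Under this model the byte transfer time is unchanged by batching, so the whole saving comes from paying the per-file overhead once per batch instead of once per small file.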
Creating these data transfer segments makes it easier for you to quickly resolve any transfer issues,
because trying to troubleshoot a large transfer after the transfer has run for a day or more can be
complex.
When you've finished planning your petabyte-scale data transfer, we recommend that you transfer a few
segments onto the Snowball from your workstation to calibrate your speed and total transfer time.
Calibrating a Large Transfer
You can calibrate a large transfer by running the snowball cp command with a representative set of
your data transfer segments. In other words, choose a number of the data segments that you defined
following the guidelines in the previous section and transfer them to a Snowball. At the same time, record
the transfer speed and total transfer time for each operation.
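Keeping this record can be automated with a small wrapper, sketched below in Python. The function name time_transfer is hypothetical, and the sketch assumes that you already know the size of the segment in bytes and that the command you pass is your own snowball cp invocation for that segment; its arguments are left to you and are not shown here.

```python
import subprocess
import time


def time_transfer(command, total_bytes):
    """Run one transfer command and return its duration and average rate.

    command is the argument list for the transfer (for example, your
    snowball cp invocation for one segment). total_bytes is the size of the
    segment being copied, which you supply.
    """
    start = time.perf_counter()
    # check=True raises CalledProcessError if the transfer command fails,
    # so a failed run is never recorded as a valid measurement.
    subprocess.run(command, check=True)
    elapsed = time.perf_counter() - start
    return {
        "seconds": elapsed,
        "bytes_per_second": total_bytes / elapsed,
    }
```

Calling time_transfer once per segment and saving the returned values gives you the per-operation speed and total time to compare across different types of data.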
Note
You can also use the snowball test command to perform calibration before receiving a
Snowball. For more information about using that command, see Testing Your Data Transfer with
the Snowball Client (p. 53).
While the calibration is running, monitor the workstation's CPU and memory utilization. If the
calibrated transfer rate is lower than your target transfer rate, you might be able to copy multiple parts of
your data transfer in parallel on the same workstation. In this case, repeat the calibration with additional
data transfer segments, using two or more instances of the Snowball client connected to the same
Snowball. Each running instance of the Snowball client should be transferring a different segment to the
Snowball.
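A parallel calibration run of this kind could be measured as sketched below in Python. The function name run_parallel is hypothetical, and the sketch assumes that each command is a separate Snowball client invocation copying a different segment and that you supply each segment's size in bytes. It starts all commands at once, waits for them to finish, and reports the combined rate over the wall-clock time.

```python
import subprocess
import time


def run_parallel(jobs):
    """Run several transfer commands concurrently and return the combined rate.

    jobs is a list of (command, total_bytes) pairs, one per client instance.
    Each command should transfer a different segment.
    """
    start = time.perf_counter()
    # Start every client instance before waiting on any of them.
    processes = [subprocess.Popen(command) for command, _ in jobs]
    exit_codes = [process.wait() for process in processes]
    elapsed = time.perf_counter() - start

    if any(code != 0 for code in exit_codes):
        raise RuntimeError(f"At least one transfer failed: {exit_codes}")

    total_bytes = sum(size for _, size in jobs)
    return total_bytes / elapsed  # combined bytes per second
```

Comparing the value returned for one, two, and more concurrent jobs against your target transfer rate shows whether an additional client instance on the same workstation is helping.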
Continue adding instances of the Snowball client during calibration until you see diminishing
returns in the combined transfer speed of all Snowball client instances currently transferring data. At