You can request a Transfer Appliance directly from your GCP console. The service will be available in beta in the EU in a 100TB configuration with total usable capacity of 200TB. And it’ll soon be available in a 480TB configuration with a total usable capacity of a petabyte.
Moving HDFS clusters with Transfer Appliance
Customers have been using Transfer Appliance to move everything from audio and satellite imagery archives to geographic and wind data. One popular use case is migrating Hadoop Distributed File System (HDFS) clusters to GCP.
We see lots of users run their powerful Apache Spark and Apache Hadoop clusters on GCP with Cloud Dataproc, a managed Spark and Hadoop service that allows you to create clusters quickly, then hand off cluster management to the service. Transfer Appliance is an easy way to migrate petabytes of data from on-premises HDFS clusters to GCP.
Earlier this year, we announced the ability to configure Transfer Appliance with one or more NFS volumes. This lets you push HDFS data to Transfer Appliance using Apache DistCp (also known as Distributed Copy), an open-source tool commonly used for intra/inter-cluster data copy. To copy HDFS data onto a Transfer Appliance, configure it with an NFS volume and mount that volume from the HDFS cluster. Then run DistCp with the mount point as the copy target. Once your data is copied to Transfer Appliance, ship it to us and we’ll load your data into Cloud Storage.
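To make the DistCp step concrete, here is a minimal Python sketch that assembles the command line for such a copy. It rests on several assumptions that are not part of the announcement: the NameNode address `hdfs://namenode:8020`, the mount point `/mnt/transfer-appliance`, and the helper name `build_distcp_command` are all hypothetical placeholders. It also assumes the NFS volume is mounted at the same path on every node that runs DistCp map tasks, since those tasks write to the target directly.

```python
import subprocess
from pathlib import PurePosixPath


def build_distcp_command(hdfs_source, nfs_mount_point, dest_subdir=None, max_maps=None):
    """Build the argv list for copying an HDFS path onto an NFS mount with DistCp.

    All paths passed in are placeholders chosen by the caller; nothing here
    is specific to Transfer Appliance beyond "the target is a mounted volume".
    """
    target = PurePosixPath(nfs_mount_point)
    if not target.is_absolute():
        raise ValueError("nfs_mount_point must be an absolute path")
    if dest_subdir:
        target = target / dest_subdir

    command = ["hadoop", "distcp"]
    if max_maps is not None:
        # -m caps the number of simultaneous map tasks doing the copy.
        command += ["-m", str(max_maps)]
    # The file:// scheme tells Hadoop to write to the local filesystem,
    # which in this setup is the NFS-mounted appliance volume.
    command += [hdfs_source, f"file://{target}"]
    return command


def run_distcp(command):
    """Run the assembled command and raise if DistCp exits with an error."""
    subprocess.run(command, check=True)
```

Calling `build_distcp_command("hdfs://namenode:8020/data/archive", "/mnt/transfer-appliance", "archive", max_maps=20)` produces the equivalent of `hadoop distcp -m 20 hdfs://namenode:8020/data/archive file:///mnt/transfer-appliance/archive`. Keeping command construction separate from execution lets you review or log the command before starting a long-running copy.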
Using Transfer Appliance in production
EU customers such as Candour Creative, which helps its clients tell stories through films and photographs, wanted to take advantage of having their content readily available in the cloud. But Zac Crawley, Director at Candour, was facing some challenges with the move.
“Multiple physical backups of our data were taking up space and becoming costly,” Crawley says. “But when we looked at our network, we figured it would take a matter of months to move the 40TBs of large file data. Transfer Appliance reduced that time significantly.”