Peak performance: How retailers used Google Cloud during Black Friday/Cyber Monday
Sr. Director, Cloud Support
Developing applications today comes with lots of choices and plenty to learn, whether you’re exploring serverless computing or managing a raft of APIs. In today’s post, we’re sharing some of our top videos on what’s new in application development on Google Cloud Platform (GCP), so you can find tips and tricks to use in your own work.
This demo-packed session walks you through the use of Knative, our Kubernetes-based platform for building and deploying serverless apps. This session goes through how to get started with using Knative to further the goal of focusing on writing code. You’ll see how it uses APIs that are familiar from GKE, and auto-scales and auto-builds to remove added tasks and overhead. The demos show how Knative spins up prebuilt containers, builds custom images, previews new versions of your apps, migrates traffic to those versions, and auto-scales to meet unpredictable usage patterns, among other steps in the build and deploy pipeline. You’ll see the cold start experience, along with preconfigured monitoring dashboards and how auto-termination works.
The takeaway: Get an up-close view into how a serverless platform like Knative works, and what it looks like to further abstract code from the underlying infrastructure.
You have a lot of key choices to make when deciding how and which technology to adopt to meet your application development needs. In this session, you’ll hear about various options for running code and the tradeoffs that may come with your decisions. Considerations include what your code is used for: Does it connect to the internet? Are there licensing considerations? Is it part of a push toward CI/CD? Is it language-dependent or kernel-limited? It’s also important to consider your team’s skills and interests as you decide where you want to focus, and where you want to run your code.
The takeaway: Understand the full spectrum of compute models (and related Google Cloud products) first, then consider the right tool for the job when choosing where to run your code.
Kubernetes empowers developers by making hard tasks possible. In this session, we introduce Kubernetes as a workload-level abstraction that lets you build your own deployment pipeline, starting from the premise that Kubernetes makes hard tasks possible rather than making simple tasks easier. The session walks through how to deploy containers with Kubernetes and how to configure a deployment pipeline with Cloud Build. Deployment strategy advice includes using probes to check container integrity and connectedness, using configuration as code for a robust production deployment environment, setting up a CI/CD pipeline, and requesting that the scheduler provision the right resources for your container. It concludes with some tips on preparing for growth by configuring automated scaling using the requests per second (RPS) metric.
The takeaway: Kubernetes can help you automate deployment operations in a highly flexible and customizable way, but needs to be configured correctly for maximum benefit. Help Kubernetes help you for best results.
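To make the RPS-based scaling tip concrete: the Kubernetes Horizontal Pod Autoscaler documentation describes its core rule as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below applies that rule to a per-pod requests-per-second metric. The function name, the replica bounds, and the sample numbers are illustrative assumptions, not details from the session, and the real autoscaler also applies a tolerance band and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_rps_per_pod: float,
                     target_rps_per_pod: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Replica count suggested by the basic HPA rule, clamped to bounds.

    min_replicas and max_replicas are hypothetical defaults for this sketch.
    """
    if target_rps_per_pod <= 0:
        raise ValueError("target_rps_per_pod must be positive")
    # ceil(currentReplicas * currentMetricValue / desiredMetricValue)
    raw = math.ceil(current_replicas * current_rps_per_pod / target_rps_per_pod)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Three pods, each serving 150 RPS, against a target of 100 RPS per pod.
    print(desired_replicas(3, 150.0, 100.0))
```

The clamp mirrors the minimum and maximum replica settings of an autoscaler, so a traffic spike cannot scale the workload without limit.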
There’s a lot of advice out there about APIs, so this session recommends focusing on what your goals are for each API you create. That could be updating or integrating software, among others. Choose a problem that’s important to solve with your API, and weigh your team and organization’s particular priorities when you’re creating that API. This session also points out some areas where common API mistakes happen, like version control or naming, and recommends using uniform API structure. When in doubt, keep it simple and don’t mess up how HTTP is actually used.
The takeaway: APIs have to do a lot of heavy lifting these days. Design the right API for the job and future-proof it as much as you can for the people and organizations who will use it down the road.
This session takes a top-to-bottom look at how we define and run serverless here at Google. Serverless compute platforms make it easy to quickly build applications, but sometimes identifying and diagnosing issues can be difficult without a good understanding of how the underlying machinery is working. In this session, you’ll learn how Google runs untrusted code at scale in a shared computing infrastructure, and what that means for you and your applications. You’ll learn how to build serverless applications that are optimized for high performance at scale, learn the tips and pitfalls associated with this, and see a live demo of optimization on Cloud Functions.
The takeaway: When you’re running apps on a serverless platform, you’re focusing on managing those things that elevate your business. See how it actually works so you’re ready for this stage of cloud computing.
Here’s a look at what serverless is, and what it is specifically on GCP. The bottom line is that serverless brings invisible infrastructure that automatically scales, and where you’re only charged for what you use. Serverless tools from GCP are designed to spring to life when they’re needed, and scale very closely to usage needs. In this session, you’ll get a look at how the serverless pieces come together with machine learning in a few interesting use cases, including medical data transcription and building an e-commerce recommendation engine that works even when no historical data is available. Make sure to stay for the cool demo from the CEO of Smart Parking, who shows a real-time, industrial-grade IoT system that’s improving parking for cities and drivers—without a server to be found.
The takeaway: Serverless helps workloads beyond just compute: learn how, why, and when you might use it for your own apps.
Using data and ML to better track wildfire and assess its threat levels
Cloud ML Blog Editor
Cloud Functions pro tips: Retries and idempotency in action
Software Engineer
Cloud Identity for Customers and Partners (CICP) is now in beta and ready to use
Product Manager, Cloud Identity
In October at Next ’18 London, we announced Cloud Identity for Customers and Partners (CICP) to help you add Google-grade identity and access management (IAM) functionality to your apps, protect user accounts, and scale with confidence—even if those users are customers, partners, and vendors who might be outside of your organization. CICP is now available in public beta.
Adding Google-grade authentication to your apps
All users expect simple and secure sign-up, sign-in, and self-service experiences from all their devices. While you could build an IAM system for your apps, it can be hard and expensive. Just think about the complexity of building and maintaining an IAM system that stays up-to-date with evolving authentication requirements, keeping user accounts secure in the face of threats that increase in occurrence and sophistication, and scaling the system reliably when the demand for your app grows.
Knative: bringing serverless to Kubernetes everywhere
Director of Product Management, Google Cloud
Group Product Manager
How to connect Cloudera’s CDH to Cloud Storage
Strategic Cloud Engineer, Google Cloud Professional Services
In this post, we’ll help you get started deploying the Cloud Storage connector for your CDH clusters. The methods and steps we discuss here will apply to both on-premise clusters and cloud-based clusters. Keep in mind that the Cloud Storage connector uses Java, so you’ll want to make sure that the appropriate Java 8 packages are installed on your CDH cluster. Java 8 should come pre-configured as your default Java Development Kit. [Check out this post if you’re deciding how and when to use Cloud Storage over the Hadoop Distributed File System (HDFS).]
Here’s how to get started:
Distribute using the Cloudera parcel
If you’re running a large Hadoop cluster or more than one cluster, it can be hard to deploy libraries and configure Hadoop services to use those libraries without making mistakes. Fortunately, Cloudera Manager provides a way to install packages with parcels. A parcel is a binary distribution format that consists of a gzipped (compressed) tar archive file with metadata.
We recommend using the CDH parcel to install the Cloud Storage connector. There are some big advantages of using a parcel instead of manual deployment and configuration to deploy the Cloud Storage connector on your Hadoop cluster:
Self-contained distribution: All related libraries, scripts and metadata are packaged into a single parcel file. You can host it at an internal location that is accessible to the cluster or even upload it directly to the Cloudera Manager node.
No need for sudo access or root: The parcel is not deployed under /usr or any of the system directories. Cloudera Manager deploys it through its agents, which eliminates the need for sudo or root access during deployment.
Create your own Cloud Storage connector parcel
To create the parcel for your clusters, download and use this script. You can do this on any machine with access to the internet.
This script will execute the following actions:
Download Cloud Storage connector to a local drive
Package the connector Java Archive (JAR) file into a parcel
Place the parcel under the Cloudera Manager’s parcel repo directory
If you’re connecting an on-premise CDH cluster or cluster on a cloud provider other than Google Cloud Platform (GCP), follow the instructions from this page to create a service account and download its JSON key file.
Create the Cloud Storage parcel
Next, you’ll want to run the script to create the parcel file and checksum file and let Cloudera Manager find it with the following steps:
1. Place the service account JSON key file and the create_parcel.sh script under the same directory. Make sure that there are no other files under this directory.
2. Run the script, which will look something like this:
$ ./create_parcel.sh -f <parcel_name> -v <version> -o <os_distro_suffix>
- parcel_name is the name of the parcel as a single string without any spaces or special characters (e.g., gcsconnector).
- version is the version of the parcel in the format x.x.x (e.g., 1.0.0).
- os_distro_suffix is the distribution suffix of the parcel. As with RPM or deb packages, parcels need to follow a naming convention. A full list of possible distribution suffixes can be found here.
- -d is an optional flag that deploys the parcel to the Cloudera Manager parcel repo folder. If it is not provided, the parcel file is created in the same directory where the script ran.
3. Logs of the script can be found in /var/log/build_script.log
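If you drive the script from automation, it can help to validate the arguments before calling it. The following is a minimal Python sketch that checks the naming rules described above and assembles the argument list. The function name is hypothetical, the value el7 is only an example suffix, and treating -d as a switch that takes no value is an assumption about the script.

```python
import re


def build_create_parcel_command(parcel_name: str, version: str,
                                os_distro_suffix: str,
                                deploy: bool = False) -> list:
    """Validate the arguments and return the argv list for create_parcel.sh."""
    # The parcel name must be a single string without spaces or special characters.
    if not re.fullmatch(r"[A-Za-z0-9]+", parcel_name):
        raise ValueError("parcel_name must have no spaces or special characters")
    # The version must follow the x.x.x format, for example 1.0.0.
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        raise ValueError("version must look like x.x.x, for example 1.0.0")
    if not os_distro_suffix:
        raise ValueError("os_distro_suffix is required")
    argv = ["./create_parcel.sh", "-f", parcel_name, "-v", version,
            "-o", os_distro_suffix]
    if deploy:
        # Assumption: -d is a bare switch with no value.
        argv.append("-d")
    return argv


if __name__ == "__main__":
    print(build_create_parcel_command("gcsconnector", "1.0.0", "el7"))
```

The returned list can be passed to subprocess.run(argv, check=True) from the directory that holds the script and the service account key file.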
Distribute and activate the parcel
Once you’ve created the Cloud Storage parcel, Cloudera Manager has to recognize the parcel and install it on the cluster.
The script you ran generated a .parcel file and a .parcel.sha checksum file. Put these two files on the Cloudera Manager node under directory /opt/cloudera/parcel-repo. If you already host Cloudera parcels somewhere, you can just place these files there and add an entry in the manifest.json file.
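Before copying the two files into the parcel repo, you may want to confirm that the checksum file still matches the parcel, for example after moving it between machines. The sketch below assumes the .parcel.sha file holds the SHA-1 hex digest of the parcel as its first token, which is the convention Cloudera parcels commonly use; verify this against the file your script produced. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(parcel_path: Path, sha_path: Path) -> bool:
    """Compare the parcel's digest with the first token of the checksum file."""
    tokens = sha_path.read_text().split()
    if not tokens:
        return False  # an empty checksum file cannot match anything
    return sha1_of_file(parcel_path) == tokens[0].lower()
```

A mismatch usually means the parcel was truncated or modified in transit, in which case regenerating it is safer than editing the checksum by hand.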
On the Cloudera Manager interface, go to Hosts -> Parcels and click Check for New Parcels to refresh the list to load any new parcels. The Cloud Storage connector parcel should show up like this:
8 common reasons why enterprises migrate to the cloud
Cloud migration team
Product Manager
[Editor’s note: This post originally appeared on the Velostrata blog. Velostrata has since come into the Google Cloud fold, and we’re pleased to now bring you their seasoned perspective on deciding to migrate to cloud. There’s more here on how Velostrata’s accelerated migration technology works. ]
At Velostrata, we’ve spent a lot of time talking about how to optimize the cloud migration process. But one of the questions we also get a lot is: What drives an enterprise’s cloud migration in the first place? For this post, we chatted with customers and dug into our own data, along with market data from organizations like RightScale and others to find the most common reasons businesses move to the cloud. If you think moving to the cloud may be in your future, this can help you determine what kinds of events may result in starting a migration plan.
1. Data center contract renewals
Many enterprises have contracts with private data centers that need to be periodically renewed. When you get to renegotiation time for these contracts, considerations like cost adjustments or other limiting factors often come up. Consequently, it’s during these contract renewal periods that many businesses begin to consider moving to the cloud.
2. Mergers
When companies merge, it’s often a challenge to match up application landscapes and data—and doing this across multiple on-prem data centers can be all the more challenging. Lots of enterprises undergoing mergers find that moving key applications and data into the cloud makes the process easier. Using cloud also makes it easier to accommodate new geographies and employees, ultimately resulting in a smoother transition.
3. Increased capacity requirements
Whether it’s the normal progression of a growing business or the need to accommodate huge capacity jumps during seasonal shifts, your enterprise can benefit from being able to rapidly increase or decrease compute. Instead of having to pay the maximum for on-prem capacity, you can shift your capacity on-demand with cloud and pay as you go.
4. Software and hardware refresh cycles
When you manage an on-prem data center, it’s up to you to keep everything up to date. This can mean costly on-prem software licenses and hardware upgrades to handle the requirements of newly upgraded software. We’ve seen that when evaluating an upcoming refresh cycle, many enterprises find it’s significantly less expensive to decommission on-prem software and hardware and consider either a SaaS subscription or a lift-and-shift of that application into the public cloud. Which path you choose will depend greatly on the app (and available SaaS options), but either way it’s the beginning of a cloud migration project.
5. Security threats
With security threats only increasing in scale and severity, we know many enterprises that are migrating to the cloud to mitigate risk. Public cloud providers offer vast resources for protecting against threats—more than nearly any single company could invest in.
6. Compliance needs
If you’re working in industries like financial services and healthcare, ensuring data compliance is essential for business operations. Moving to the cloud means businesses are using cloud-based tools and services that are already compliant, helping remove some of the burden of compliance from enterprise IT teams.
7. Product development benefits
By taking advantage of benefits like a pay-as-you-go cost model and dynamic provisioning for product development and testing, many enterprises are finding that the cloud helps them get products to market faster. We see businesses migrating to the cloud not just to save time and money, but also to realize revenue faster.
8. End-of-life events
All good things must come to an end—software included. Increasingly, when critical data center software has an end-of-life event announcement, it can be a natural time for enterprise IT teams to look for ways to replicate those services in the cloud instead of trying to extend the life cycle on-prem. This means enterprises can decommission old licenses and hardware along with getting the other benefits of cloud.
As you can see, there are a lot of reasons why organizations decide to kick off their cloud journeys. In some cases, they’re already in the migration process when they find even more ways to make good use of cloud services. Understanding the types of events that frequently result in a cloud migration can help you determine the right cloud architecture and migration strategy to get your workloads to the cloud.
Learn more here about cloud migration with Velostrata.
Using upstream Apache Airflow Hooks and Operators in Cloud Composer
Google Cloud Customer Engineer
For engineers or developers in charge of integrating, transforming, and loading a variety of data from an ever-growing collection of sources and systems, Cloud Composer has dramatically reduced the number of cycles spent on workflow logistics. Built on Apache Airflow, Cloud Composer makes it easy to author, schedule, and monitor data pipelines across multiple clouds and on-premises data centers.
Let’s walk through an example of how Cloud Composer makes building a pipeline across public clouds easier. As you design your new workflow that’s going to bring data from another cloud (Microsoft Azure’s ADLS, for example) into Google Cloud, you notice that upstream Apache Airflow already has an ADLS hook that you can use to copy data. You insert an import statement into your DAG file, save, and attempt to test your workflow. “ImportError – no module named x.” Now what?
As it turns out, functionality that has been committed upstream—such as brand new Hooks and Operators—might not have made its way into Cloud Composer just yet. Don’t worry, though: you can still use these upstream additions by leveraging the Apache Airflow Plugin interface.
Using the upstream AzureDataLakeHook as an example, all you have to do is the following:
Copy the code into a separate file (ensuring adherence to the Apache License)
Import the AirflowPlugin class (from airflow.plugins_manager import AirflowPlugin)
Add the below snippet to the bottom of the file:
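A plugin registration snippet of this kind might look like the following sketch. It relies on the name and hooks attributes that the Airflow plugin interface defines on AirflowPlugin. The plugin class name, the value of name, and the empty AzureDataLakeHook placeholder are assumptions for illustration; in a real plugin file the placeholder would be the hook code copied from upstream.

```python
try:
    from airflow.plugins_manager import AirflowPlugin
except ImportError:
    # Stand-in so the sketch loads outside an Airflow environment;
    # inside Cloud Composer the real class is imported instead.
    class AirflowPlugin:
        name = None
        hooks = []


class AzureDataLakeHook:
    """Placeholder for the hook code copied from upstream Airflow."""


class AzureDataLakePlugin(AirflowPlugin):
    # Name under which Airflow registers the plugin (hypothetical value).
    name = "azure_data_lake_plugin"
    # Hooks that this plugin makes available.
    hooks = [AzureDataLakeHook]
```

Keeping the hook and the plugin class in one file means a single upload to the environment's plugins folder is enough to make the hook importable.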
AWS also has a free tier; it’s like giving someone the first hit of ecstasy for free. Why not use this free server? Then that server needs to expand, you make plans, and you’re hooked and know the AWS cloud better than Google.
Google Cloud offers a credit of $300 right now to try to get you involved, but it’s not the same as a free tier of service. Once the $300 is gone it’s always going to cost you, whereas you can downgrade a server back to the free tier if you decide to do that.
There are also some wonky decisions Google made that leave me annoyed almost daily. The fact that you can’t use the SMTP ports of the servers leaves me having to go all around to get a WordPress site to send emails, and there’s no easy way to transfer a project between accounts. I’ve landed myself in a situation where I transferred ownership but didn’t remember to transfer billing; since I was no longer a project owner, I couldn’t transfer billing anymore. Customer service just acted like it made sense that I couldn’t use or configure the resource while my credit card was still going to be used.
SSH and SFTP into AWS are fairly standardized and relatively seamless. Google makes these difficult.
Then there’s the way they only give out one static IP address per zone. They have a beta project to decide whether to allow multiple IPs, but what took so long? IP aliases or multiple network IP addresses... on AWS I just added the IP addresses. Why do I need more than one? Because my name servers need to have different IP addresses, but again I can’t do it right now.
So with all these limits here and there, I personally pay for my servers with AWS (it’s just easier to use), but I use Google Cloud for short experiments where I may need more than 1 IP, and for a site that doesn’t ever send an email.
Introducing Transfer Appliance in the EU for cloud data migration
Product Manager
You can request a Transfer Appliance directly from your GCP console. The service will be available in beta in the EU in a 100TB configuration with total usable capacity of 200TB. And it’ll soon be available in a 480TB configuration with a total usable capacity of a petabyte.
Moving HDFS clusters with Transfer Appliance
Customers have been using Transfer Appliance to move everything from audio and satellite imagery archives to geographic and wind data. One popular use case is migrating Hadoop Distributed File System (HDFS) clusters to GCP.
We see lots of users run their powerful Apache Spark and Apache Hadoop clusters on GCP with Cloud Dataproc, a managed Spark and Hadoop service that allows you to create clusters quickly, then hand off cluster management to the service. Transfer Appliance is an easy way to migrate petabytes of data from on-premise HDFS clusters to GCP.
Earlier this year, we announced the ability to configure Transfer Appliance with one or more NFS volumes. This lets you push HDFS data to Transfer Appliance using Apache DistCp (also known as Distributed Copy)—an open source tool commonly used for intra/inter-cluster data copy. To copy HDFS data onto a Transfer Appliance, configure it with an NFS volume and mount it from the HDFS cluster. Then run DistCp with the mount point as the copy target. Once your data is copied to Transfer Appliance, ship it to us and we’ll load your data into Cloud Storage.
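The copy step described above can be sketched as a small Python helper that assembles a DistCp invocation with the NFS mount as a file:// target. The function name, the source URI, and the mount path are hypothetical; -m is DistCp's option for the maximum number of simultaneous copy tasks. Because DistCp runs its copy tasks on the cluster's worker nodes, this sketch assumes the NFS volume is mounted at the same path on every node that may run a task.

```python
def build_distcp_command(hdfs_source: str, nfs_mount_point: str,
                         max_maps: int = 20) -> list:
    """Return the argv list for copying an HDFS path to an NFS mount."""
    if not hdfs_source.startswith("hdfs://"):
        raise ValueError("hdfs_source must be an hdfs:// URI")
    if not nfs_mount_point.startswith("/"):
        raise ValueError("nfs_mount_point must be an absolute path")
    if max_maps < 1:
        raise ValueError("max_maps must be at least 1")
    # The mounted NFS volume is addressed as a local file system target.
    target = "file://" + nfs_mount_point
    return ["hadoop", "distcp", "-m", str(max_maps), hdfs_source, target]


if __name__ == "__main__":
    # Hypothetical NameNode address and mount point.
    print(" ".join(build_distcp_command(
        "hdfs://namenode:8020/data", "/mnt/transfer-appliance")))
```

Capping the number of map tasks is one way to keep the copy from saturating the single NFS endpoint; the right value depends on the cluster and network.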
Using Transfer Appliance in production
EU customers such as Candour Creative, which helps their clients tell stories through films and photographs, wanted to take advantage of having their content readily available in the cloud. But Zac Crawley, Director at Candour, was facing some challenges with the move.
“Multiple physical backups of our data were taking up space and becoming costly,” Crawley says. “But when we looked at our network, we figured it would take a matter of months to move the 40TBs of large file data. Transfer Appliance reduced that time significantly.”
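For a rough sense of why a network transfer of this size can stretch into months, the time needed is simply the data volume divided by the sustained link speed. The sketch below does that arithmetic in Python. The link speeds in the example are hypothetical and are not Candour's actual figures, terabytes are taken as decimal (10^12 bytes), and the utilization parameter is an assumption that models a link shared with other traffic.

```python
def transfer_days(data_terabytes: float, link_megabits_per_second: float,
                  utilization: float = 1.0) -> float:
    """Estimate the days needed to move data over a network link."""
    if link_megabits_per_second <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = data_terabytes * 1e12 * 8  # decimal terabytes to bits
    seconds = bits / (link_megabits_per_second * 1e6 * utilization)
    return seconds / 86400  # seconds per day


if __name__ == "__main__":
    # Hypothetical sustained link speeds for a 40 TB archive.
    for mbps in (20, 100, 1000):
        print(mbps, "Mbps:", round(transfer_days(40, mbps), 1), "days")
```

At a few tens of megabits per second of sustained throughput, the estimate for 40TB lands in the range of months, which is consistent with the situation described in the quote.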