Enable ML Inferencing at Scale using Amazon SageMaker | Amazon Robotics case study

Although Amazon Robotics could tap into ample compute resources on AWS, the company still had to handle model hosting itself. When AWS announced Amazon SageMaker at AWS re:Invent 2017, Amazon Robotics quickly adopted it, avoiding the need to build a costly hosting solution of its own. Amazon Robotics was the first company to deploy on Amazon SageMaker at large scale and remains one of the service's largest deployments as of January 2021.
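The case study doesn't include code, but deploying a trained model to a SageMaker real-time endpoint looks roughly like the sketch below, using the SageMaker Python SDK. The model artifact path, entry point, framework versions, and instance type are illustrative placeholders, not details from Amazon Robotics' deployment.

```python
# Minimal sketch: hosting a trained model on a SageMaker real-time
# endpoint instead of a self-managed inference fleet. All names and
# versions here are illustrative.
import sagemaker
from sagemaker.pytorch import PyTorchModel

role = sagemaker.get_execution_role()  # IAM role that SageMaker assumes

model = PyTorchModel(
    model_data="s3://example-bucket/models/example-model/model.tar.gz",
    role=role,
    entry_point="inference.py",  # custom model-loading and predict handlers
    framework_version="1.8",
    py_version="py3",
)

# SageMaker provisions and manages the hosting instances behind the endpoint.
predictor = model.deploy(
    initial_instance_count=2,
    instance_type="ml.g4dn.xlarge",
)
```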

At first, the team primarily used Amazon SageMaker to host models. Amazon Robotics adapted its service usage as needed, initially running a hybrid architecture with some algorithms on premises and some in the cloud. “We built a core set of functionalities that enabled us to deliver the Intent Detection System,” says Tim Stallman, a senior software manager at Amazon Robotics. “And then as Amazon SageMaker features came online, we slowly started adopting those.” For example, the team adopted Amazon SageMaker Experiments, a capability for organizing, tracking, comparing, and evaluating ML experiments and model versions.
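The article doesn't show how the team used Experiments, but as a rough illustration, tracking a run with the sagemaker-experiments Python package looks like the following; the experiment, trial, parameter, and metric names are invented for the example.

```python
# Minimal sketch: recording a model version with SageMaker Experiments
# so runs can be compared side by side. Names are illustrative.
from smexperiments.experiment import Experiment
from smexperiments.trial import Trial
from smexperiments.tracker import Tracker

# An experiment groups related training runs.
experiment = Experiment.create(
    experiment_name="example-experiment",
    description="Candidate model versions",
)

# Each candidate model version becomes a trial under the experiment.
trial = Trial.create(
    trial_name="example-model-v2",
    experiment_name=experiment.experiment_name,
)

# Log hyperparameters and evaluation metrics for this run.
with Tracker.create(display_name="evaluation") as tracker:
    tracker.log_parameters({"learning_rate": 1e-4, "batch_size": 64})
    tracker.log_metric(metric_name="validation:accuracy", value=0.94)
    trial.add_trial_component(tracker.trial_component)
```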

Amazon Robotics also used Amazon SageMaker automatic scaling. “Amazon SageMaker doesn’t just manage the hosts we use for inferencing,” says Gallaudet. “It also automatically adds or removes hosts as needed to support the workload.” Because it doesn’t need to procure or manage its own fleet of over 500 GPUs, the company has saved close to 50 percent on its inferencing costs.
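Endpoint scaling of this kind is configured through Application Auto Scaling. The sketch below shows a target-tracking policy on an endpoint's production variant; the endpoint name, capacity bounds, and target value are illustrative assumptions, not figures from this deployment.

```python
# Minimal sketch: enabling automatic scaling on a SageMaker endpoint
# variant via Application Auto Scaling. Names and values are illustrative.
import boto3

autoscaling = boto3.client("application-autoscaling")

resource_id = "endpoint/example-endpoint/variant/AllTraffic"

# Register the variant's instance count as a scalable target.
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=16,
)

# Track invocations per instance; hosts are added or removed to hold
# the metric near the target value.
autoscaling.put_scaling_policy(
    PolicyName="invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 1000.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```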