Introduction
Deploying machine learning (ML) models from proof of concept to production has always been a challenge for data scientists and ML engineers. The transition is often plagued by performance bottlenecks, latency issues, and infrastructure concerns. Amazon SageMaker Inference addresses these challenges with a suite of services and tools that streamline model deployment on high-performance, optimized infrastructure.
FastAPI, a modern, high-performance web framework for building APIs in Python, has become increasingly popular for building RESTful microservices in recent years. It is well suited for scalable ML inference across various industries, offering features such as automatic API documentation and built-in request validation that make it both user-friendly and powerful.
In this article, we will show you how to deploy serverless ML inference using FastAPI, Docker, AWS Lambda, and Amazon API Gateway, and how to automate the deployment with the AWS Cloud Development Kit (AWS CDK).
Benefits of Amazon SageMaker Inference
Amazon SageMaker Inference offers several benefits for ML deployment:
- A wide selection of ML infrastructure and deployment options tailored to meet different workloads and performance requirements.
- Serverless Inference endpoints for workloads that have idle periods and can tolerate cold starts, where pay-per-use pricing offers cost savings.
- Integration with AWS Lambda for flexible, cost-effective deployment of your ML models.
FastAPI Overview
FastAPI is a modern, high-performance web framework for building APIs in Python. It has become increasingly popular for building RESTful microservices and serving ML inference at scale across various industries. Key features of FastAPI include the following (a minimal example appears after the list):
- Automatic generation of interactive API documentation
- Out-of-the-box functionality such as dependency injection and request validation
- Ease of use and a gentle learning curve
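To make these features concrete, here is a minimal sketch (the route, fields, and settings dependency are illustrative, not taken from this article). The Pydantic model gives request validation for free, the `Depends` parameter shows dependency injection, and FastAPI serves interactive documentation for the endpoint without any extra code.

```python
from fastapi import Depends, FastAPI
from pydantic import BaseModel

app = FastAPI(title="Feature demo")

class Item(BaseModel):
    # Request bodies are parsed and validated against these typed fields automatically.
    name: str
    price: float

def get_settings() -> dict:
    # A trivial dependency; FastAPI injects its return value into the route below.
    return {"currency": "USD"}

@app.post("/items")
def create_item(item: Item, settings: dict = Depends(get_settings)):
    # Invalid payloads never reach this point; FastAPI returns a 422 response with details instead.
    return {"name": item.name, "price": item.price, "currency": settings["currency"]}

# Run with: uvicorn demo:app --reload   (assuming this file is saved as demo.py)
# Interactive docs are generated automatically at http://127.0.0.1:8000/docs
```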
Solution Architecture
The proposed solution architecture can be summarized in the following diagram. Client requests reach Amazon API Gateway, which invokes an AWS Lambda function running the containerized FastAPI application; the function performs inference and returns predictions to the caller.
[Insert solution architecture diagram]
Prerequisites
To follow the steps in this guide, you will need:
- Python 3
- virtualenv
- AWS CDK v2
- Docker
Setting Up the Environment
Before getting started, you will need to set up your environment:
- Create a Python virtual environment to isolate dependencies.
- Install and configure the AWS CLI to interact with AWS services.
- Verify that the necessary software (Python 3, virtualenv, AWS CDK, Docker) is installed and configured.
Developing the FastAPI Application
Next, we will build a FastAPI application to serve predictions from our trained ML model (a minimal sketch follows this list):
- Create a FastAPI application with required dependencies.
- Organize routes for serving the ML model’s predictions.
- Test the application locally to ensure correct functionality.
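A minimal sketch of such an application is shown below, assuming a scikit-learn style model serialized with joblib, an app/main.py file layout, and the Mangum adapter so the app can later be invoked from AWS Lambda. All of these are illustrative assumptions rather than requirements; swap in your own model loading and prediction logic.

```python
# app/main.py -- illustrative layout; adapt paths and model loading to your project
import joblib
from fastapi import FastAPI
from mangum import Mangum  # adapter that lets AWS Lambda invoke this ASGI app (used for deployment later)
from pydantic import BaseModel

app = FastAPI(title="ML inference service")

# Load the trained model once at import time so it is reused across requests.
model = joblib.load("model/model.joblib")  # hypothetical artifact path

class PredictRequest(BaseModel):
    features: list[float]

class PredictResponse(BaseModel):
    prediction: float

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict", response_model=PredictResponse)
def predict(request: PredictRequest):
    # scikit-learn style predict(); replace with your framework's inference call
    prediction = model.predict([request.features])[0]
    return PredictResponse(prediction=float(prediction))

# Entry point used when the app runs on AWS Lambda; Mangum translates API Gateway
# events into ASGI requests for FastAPI.
handler = Mangum(app)
```

To test locally, run `uvicorn app.main:app --reload` and send a request, for example: `curl -X POST http://127.0.0.1:8000/predict -H 'Content-Type: application/json' -d '{"features": [1.0, 2.0, 3.0]}'`.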
Containerizing the Application Using Docker
Once the FastAPI application is working locally, we can containerize it using Docker (an example Dockerfile follows this list):
- Create a Dockerfile for your FastAPI application.
- Build the Docker image and run the container locally to verify it.
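A sketch of such a Dockerfile is shown below. It assumes the AWS-provided Lambda base image for Python and the app/main.py layout with a Mangum handler from the previous section; the Python version, paths, and dependency list are assumptions to adapt to your project.

```dockerfile
# Start from the AWS-provided Lambda base image for Python
FROM public.ecr.aws/lambda/python:3.11

# Install Python dependencies (e.g. fastapi, mangum, joblib, scikit-learn)
COPY requirements.txt ${LAMBDA_TASK_ROOT}
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and the model artifact into the Lambda task root
COPY app/ ${LAMBDA_TASK_ROOT}/app/
COPY model/ ${LAMBDA_TASK_ROOT}/model/

# Tell Lambda which handler to invoke (the Mangum handler in app/main.py)
CMD ["app.main.handler"]
```

To test locally, build the image with `docker build -t fastapi-inference .` and start it with `docker run -p 9000:8080 fastapi-inference`; the Lambda Runtime Interface Emulator bundled in the base image then accepts test invocations at http://localhost:9000/2015-03-31/functions/function/invocations.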
Deploying the FastAPI Application on AWS Lambda
With our Docker image built, we can deploy it to AWS Lambda (a sketch of the CDK stack follows this list):
- Create a new AWS CDK project.
- Configure the AWS CDK to deploy your container image as a Lambda function.
- Deploy the AWS CDK project.
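Below is a sketch of what such a CDK stack might look like in Python. The construct names, memory size, timeout, and the assumption that the Dockerfile sits at the project root are illustrative; `DockerImageFunction` builds the image and pushes it to an Amazon ECR asset repository for you during deployment.

```python
# stack.py -- illustrative AWS CDK v2 stack in Python; adjust names and sizing to your project
from aws_cdk import Duration, Stack
from aws_cdk import aws_lambda as _lambda
from constructs import Construct

class ServerlessInferenceStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Build the container image from the local Dockerfile and deploy it as a Lambda function.
        self.inference_fn = _lambda.DockerImageFunction(
            self,
            "FastApiInferenceFunction",
            code=_lambda.DockerImageCode.from_image_asset(directory="."),
            memory_size=2048,              # ML models often need more than the default 128 MB
            timeout=Duration.seconds(30),  # allow time for model loading on cold starts
        )
```

After wiring the stack into your CDK app (app.py), run `cdk bootstrap` once per account and Region, then `cdk deploy`.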
Setting Up Amazon API Gateway
Finally, we need to expose our FastAPI application through Amazon API Gateway (a sketch follows this list):
- Create an API Gateway to handle incoming requests for your ML model.
- Configure the API Gateway to forward requests to the serverless ML inference function deployed on AWS Lambda.
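One way to express this in the same CDK stack is sketched below, using a REST API with Lambda proxy integration so that every path and method is forwarded to the FastAPI application. The helper function, construct names, and the choice of `LambdaRestApi` over an HTTP API are assumptions for illustration; it would be called from the stack's `__init__` as `add_api_gateway(self, self.inference_fn)`.

```python
from aws_cdk import CfnOutput, Stack
from aws_cdk import aws_apigateway as apigw
from aws_cdk import aws_lambda as _lambda

def add_api_gateway(stack: Stack, inference_fn: _lambda.IFunction) -> apigw.LambdaRestApi:
    """Expose the inference Lambda function through an API Gateway REST API."""
    api = apigw.LambdaRestApi(
        stack,
        "FastApiGateway",
        handler=inference_fn,
        proxy=True,  # forward every HTTP method and path to the FastAPI application
    )
    # Emit the invoke URL as a stack output so it is printed after `cdk deploy`.
    CfnOutput(stack, "InferenceApiUrl", value=api.url)
    return api
```

Once deployed, the FastAPI routes are reachable under the API's invoke URL, for example a POST to <invoke-url>/predict with the same JSON body used earlier during local testing.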