How to Create and Deploy Custom Python Models to SageMaker

by David Hren, Lead Data Scientist

Overview of SageMaker Models

SageMaker uses Docker containers to compartmentalize machine learning algorithms. This container approach allows SageMaker to offer a wide range of readily available algorithms for common use-cases while remaining flexible enough to support models developed using common libraries or custom written models. The model containers can be used on three basic levels:

  1. Pre-built Algorithms – fixed class of algorithms fully maintained by AWS
  2. “Script Mode” – allows popular ML frameworks to be utilized via a script
  3. “Container Mode” – allows for a fully customized ML algorithm to be used

These modes offer various degrees of both complexity and ease of use.

Below you’ll find a brief rundown of each mode. The focus for this tutorial will be on the step-by-step process of using container mode to deploy a machine learning model. If you are new to using SageMaker, AWS has produced a series of deep dive videos that you can reference.

In addition to the standard AWS SDKs, Amazon also has a higher level Python package, the SageMaker Python SDK, for training and deploying models using SageMaker, which we will use here.

Pre-Built Algorithms

SageMaker offers pre-built algorithms that can tackle a wide range of problem types and use cases. AWS maintains all of the containers associated with these algorithms. You can find the full list of available algorithms and read more about each one in the SageMaker docs.

Script Mode

Script mode allows you to write Python scripts against commonly used machine learning frameworks. AWS still maintains the underlying container hosting whichever framework you choose, and your script is embedded into the container and used to direct the logic during runtime. In order for your script to be compatible with the AWS-maintained container, the script must meet certain design requirements.

Container Mode

Container mode allows you to use custom logic to define a model and deploy it into the SageMaker ecosystem; in this mode you are responsible for maintaining both the container and the underlying logic it implements. This mode is the most flexible and gives you access to the many Python libraries and machine learning tools available. In order for the container to be compatible with SageMaker, your container must meet certain design requirements. This can be accomplished in one of two ways:

  1. Define your custom container by extending one of the existing ones maintained by AWS
  2. Use the SageMaker Containers Library to define your container

We will focus on using method 1 here, but AWS really has made every effort to make it as easy as possible to use your own custom logic within SageMaker.

After designing your container, you must upload it to the AWS Elastic Container Registry (ECR). This is the model image you will point SageMaker to when training or deploying a model.

Steps Outline

Here we will outline the basic steps involved in creating and deploying a custom model in SageMaker:

  1. Define the logic of the machine learning model
  2. Define the model image
  3. Build and Push the container image to Amazon Elastic Container Registry (ECR)
  4. Train and deploy the model image

As an overview, the entire structure of our custom model will look something like this:

├── container
│   ├──
│   ├── code
│   │   ├──
│   │   └── requirements.txt
│   ├── Dockerfile
│   ├── gam_model
│   │   ├── gam_model
│   │   │   ├──
│   │   │   └──
│   │   ├──
│   │   └──

The directory gam_model contains the core logic of the custom model. The directory code contains the code that instructs our container on how to use the model within SageMaker (model training, saving, loading, and inference). Of the remaining files, Dockerfile defines the Docker image, and the other is a helper bash script (that I found here) that pushes our container to ECR so we can use it within SageMaker. We will look at each piece in more detail as we go through each step.

Defining the Logic of the Model

For our custom machine learning model, we will be using a generalized additive model (or GAM). GAMs are powerful yet interpretable models that can detect non-linear relationships and, possibly, interactions. If you aren’t familiar with GAMs, Kim Larson and Michael Clark both provide helpful introductions to them. Also note that there is a Python package implementing GAMs with robust features, pyGAM. For our purposes, we will make use of the statsmodels package.

When creating a container with a custom model, I generally like to put the actual implementation of the machine learning algorithm within its own Python package. This allows me to separate the logic of the model from the logic needed to run it in SageMaker, and to modify and test each part independently. The model can then be reused in other environments as well.

We will call our package gam_model. I’ve included it within our container definition directory just to make it simpler to include it within the container. We will define it here shortly.

In this case, our package will look like this:

├── gam_model
│   ├──
│   └──

This is a fairly simple Python module that wraps the statsmodels GAM implementation into a scikit-learn-like model. Its contents read:

import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_is_fitted
from sklearn.utils import check_array
from statsmodels.gam.api import GLMGam, BSplines

class GAMRegressor(BaseEstimator, RegressorMixin):
    def __init__(self, df = 15, alpha = 1.0, degree = 3):
        self.df = df
        self.alpha = alpha
        self.degree = degree

    def fit(self, X, y):
        X, y = self._validate_data(X, y, y_numeric=True)
        # one B-spline smoother per feature
        self.spline = BSplines(
            X, df = [self.df] * self.n_features_in_, 
            degree = [self.degree] * self.n_features_in_, 
            include_intercept = False
        )
        # intercept-only parametric part plus the penalized splines
        gam = GLMGam(
            y, exog = np.ones(X.shape[0]), 
            smoother = self.spline, alpha = self.alpha
        )
        self.gam_predictor = gam.fit()
        return self

    def predict(self, X):
        check_is_fitted(self, attributes = "gam_predictor")
        X = check_array(X)
        return self.gam_predictor.predict(
            exog = np.ones(X.shape[0]), 
            exog_smooth = X
        )

    def summary(self):
        return self.gam_predictor.summary() if \
               hasattr(self, "gam_predictor") else None

The package also includes a helper script that rebuilds and installs the package every time I need to modify it. The other components of the package are fairly standard and can be found on the corresponding GitHub page.

Defining the Model Image

Now that we have our model implemented and put into a package, the next step is to define the Docker container image that will house our model within the AWS ecosystem. To do this, we first write our Dockerfile:

ARG REGION=us-east-1


ENV PATH="/opt/ml/code:${PATH}"

COPY /code /opt/ml/code
COPY gam_model/dist/gam_model-0.0.1-py3-none-any.whl /opt/gam_model-0.0.1-py3-none-any.whl
RUN pip install -r /opt/ml/code/requirements.txt /opt/gam_model-0.0.1-py3-none-any.whl

Here we are using one of the container images that AWS has created and maintains for the scikit-learn framework. You can find the current framework containers in the SageMaker documentation pages (here and here). By extending their container, we can take advantage of everything that they have already done to set it up and just worry about including our additional code and features (we’ll review this more shortly). We then copy in the wheel file of the gam_model package and install it and other dependencies. Lastly, we set the Python script in the code directory as the entry point for the container.

Since we are extending one of AWS’s framework containers, we need to make sure that the instructions for the logic the container should run meet the design requirements laid out in the sagemaker-python-sdk documentation. You can read more about the general SageMaker container requirements in their documentation as well as on the sagemaker-containers page.

In our case, the code directory looks like:

└── requirements.txt

requirements.txt contains some additional packages we need to install in the container, and the Python script contains the instructions on how we want the container to train, load, and serve the model.

The training portion looks like this:

import argparse
import os
import json
import pandas as pd
import numpy as np
import joblib
from gam_model import GAMRegressor

if __name__ =='__main__':

    parser = argparse.ArgumentParser()
    gam = GAMRegressor()
    gam_dict = gam.get_params()

    # Data, model, and output directories
    parser.add_argument('--output-data-dir',
                        type = str, 
                        default = os.environ.get('SM_OUTPUT_DATA_DIR'))
    parser.add_argument('--model-dir',
                        type = str, 
                        default = os.environ.get('SM_MODEL_DIR'))
    parser.add_argument('--train',
                        type = str, 
                        default = os.environ.get('SM_CHANNEL_TRAIN'))
    parser.add_argument('--train-file', type = str)
    parser.add_argument('--test',
                        type = str, 
                        default = os.environ.get('SM_CHANNEL_TEST'))
    parser.add_argument('--test-file', type = str, default = None)
    # expose every model parameter as a command-line hyperparameter
    for argument, default_value in gam_dict.items():
        parser.add_argument(f'--{argument}',
                            type = type(default_value),
                            default = default_value)

    print('reading arguments')
    args, _ = parser.parse_known_args()

    print('setting parameters')
    gam_dict.update({key: value for key, value in vars(args).items()\
                    if key in gam_dict and value is not None})

    print('reading training data') 
    # assume there's no headers and the target is the last column
    data = np.loadtxt(os.path.join(args.train, args.train_file), delimiter = ',')
    X = data[:, :-1]
    y = data[:, -1]
    print("X shape:", X.shape)
    print("y shape:", y.shape)

    if args.test_file is not None:
        print('reading test data') 
        # assume there's no headers and the target is the last column
        data = np.loadtxt(os.path.join(args.test, args.test_file), delimiter = ',')
        X_test = data[:, :-1]
        y_test = data[:, -1]

        print("X_test shape:", X_test.shape)
        print("y_test shape:", y_test.shape)
    else:
        X_test = None
        y_test = None

    print('fitting model')
    # apply the parsed hyperparameters before fitting
    gam.set_params(**gam_dict)
    gam.fit(X, y)
    print("R2 (train):", gam.score(X, y))
    if X_test is not None:
        print("R2 (test):", gam.score(X_test, y_test))
    print('saving model') 
    path = os.path.join(args.model_dir, "model.joblib")
    print(f"saving to {path}")
    joblib.dump(gam, path)
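
SageMaker passes hyperparameters to the training script as command-line flags, which is why the script registers one argument per model parameter. The following is a minimal, standalone sketch of that pattern. The `build_parser` helper, the defaults dictionary, and the sample argument list are illustrative assumptions and not part of the script above.

```python
import argparse

def build_parser(defaults):
    """Create one --flag per entry, typed after its default value."""
    parser = argparse.ArgumentParser()
    for name, default in defaults.items():
        parser.add_argument(f"--{name}", type=type(default), default=default)
    return parser

# hypothetical defaults mirroring the GAMRegressor signature
defaults = {"df": 15, "alpha": 1.0, "degree": 3}
parser = build_parser(defaults)

# parse_known_args tolerates flags the parser does not know about
args, unknown = parser.parse_known_args(["--df", "20", "--train-file", "data.csv"])

# keep only the entries that correspond to model parameters
params = {key: value for key, value in vars(args).items() if key in defaults}
print(params)
```

Because each flag takes the type of its default, the string "20" arrives as the integer 20, and the unrelated flag is collected separately instead of raising an error.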

Loading the model:

def model_fn(model_dir):
    model = joblib.load(os.path.join(model_dir, "model.joblib"))
    return model

Using the model to make predictions:

def predict_fn(input_object, model):
    return model.predict(input_object)

Note that if we wanted to be able to use different serialization/deserialization techniques with our model within SageMaker, we could also define input_fn and output_fn, but we will make use of the default implementations. 
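
For reference, a custom pair of these functions might look roughly like the sketch below, which accepts CSV text and returns JSON. This is an assumption-laden illustration rather than the default implementation: the supported content types are chosen arbitrarily here, and the exact return type the serving container expects from output_fn should be checked against the SageMaker documentation.

```python
import csv
import io
import json

def input_fn(request_body, request_content_type):
    """Parse a CSV request body into a list of rows of floats."""
    if request_content_type != "text/csv":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    if isinstance(request_body, bytes):
        request_body = request_body.decode("utf-8")
    reader = csv.reader(io.StringIO(request_body))
    return [[float(value) for value in row] for row in reader if row]

def output_fn(prediction, accept):
    """Serialize predictions as a JSON array."""
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    # numpy arrays expose tolist(); plain sequences are copied into a list
    values = prediction.tolist() if hasattr(prediction, "tolist") else list(prediction)
    return json.dumps(values)

# quick local check with a made-up payload
rows = input_fn("1.0,2.0\n3.0,4.0\n", "text/csv")
body = output_fn([1.5, 2.5], "application/json")
print(rows, body)
```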

Build and Push the Container Image to Amazon Elastic Container Registry (ECR)

Now that we have all the ingredients for our container, we can build it and push it to ECR. 

./ gam-model

Note: I have hard-coded the region in both my Dockerfile and the helper script to pull from us-east-1 (account id 683313688378). You can adjust this to another region by referencing the docs.

Once you have done this, go to your AWS Console, navigate to ECR, and make note of your model image’s URI.
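
ECR image URIs follow a fixed pattern, so the URI can also be assembled by hand. The sketch below shows that pattern; the `ecr_image_uri` helper and the account ID are hypothetical placeholders, and the repository name assumes the image was pushed as gam-model.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Assemble an ECR image URI from its parts."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"

# placeholder account ID; substitute your own
uri = ecr_image_uri("123456789012", "us-east-1", "gam-model")
print(uri)
```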

Training and Deploying the Custom Model

Now that we have defined our model image and registered it with ECR, we can use SageMaker to train and deploy our model! You can follow this process referencing the example.ipynb notebook.

For this example, we will use a small, relatively simple dataset that will display a GAM’s ability to model nonlinear relationships: the Gauss3 dataset.

We first import the necessary libraries and run some initialization:

import requests
import sagemaker
import boto3
import s3fs
import json
import io

import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

from sagemaker.estimator import Estimator
from sagemaker.predictor import Predictor
from sagemaker.serializers import NumpySerializer
from sagemaker.deserializers import NumpyDeserializer
from sagemaker.local import LocalSession

from matplotlib import pyplot as plt
import matplotlib as mpl
import seaborn as sns

%matplotlib inline

seed = 42
rand = np.random.RandomState(seed)

local_mode = False # activate to use local mode

with open("config.json") as f:
    configs = json.load(f)
default_bucket = configs["default_bucket"] #put your bucket name here
role = configs["role_arn"] # put your sagemaker role arn here

boto_session = boto3.Session()
if local_mode:
    sagemaker_session = LocalSession(boto_session = boto_session)
    sagemaker_session._default_bucket = default_bucket
else:
    sagemaker_session = sagemaker.Session(
        boto_session = boto_session,
        default_bucket = default_bucket
    )
ecr_image = configs["image_arn"] #put the image uri from ECR here

prefix = "modeling/sagemaker"

data_name = f"gauss3"
test_name = "gam-demo"
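
The setup code reads three keys from config.json. A minimal sketch of creating and loading such a file is shown here; every value is a made-up placeholder, and the file is written to a temporary directory only so the example can run anywhere.

```python
import json
import os
import tempfile

# placeholder values; replace them with your own bucket, role, and image
example_config = {
    "default_bucket": "my-example-bucket",
    "role_arn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    "image_arn": "123456789012.dkr.ecr.us-east-1.amazonaws.com/gam-model:latest",
}

config_path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(config_path, "w") as f:
    json.dump(example_config, f, indent=2)

# load it back the same way the notebook does
with open(config_path) as f:
    loaded = json.load(f)
print(sorted(loaded))
```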

Note that I’m using a configs file to store my S3 bucket name, SageMaker role, and training image URI, but you can set these directly. Next, we define two helper functions. I also include logic to train and deploy the model locally or on SageMaker instances.

def get_s3fs():
    return s3fs.S3FileSystem(key = boto_session.get_credentials().access_key,
                             secret = boto_session.get_credentials().secret_key,
                             token = boto_session.get_credentials().token)

def plot_and_clear():
    # assumed minimal body: show the current figure, then close it
    plt.show()
    plt.close()

We can retrieve the Gauss data using the requests module and apply a train-test-split.

url = ""

r = requests.get(url)

y, x = np.loadtxt(
    io.StringIO(r.text[r.text.index("Data:   y          x"):]), 
    skiprows=1, unpack=True
)

x = x.reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size = 0.25, 
    random_state = rand
)
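
The training script expects a headerless CSV with the target in the last column. The sketch below round-trips a tiny synthetic dataset through that layout in memory; the helper names and the sample values are illustrative assumptions.

```python
import io
import numpy as np

def to_training_csv(X, y):
    """Write features and target as headerless CSV, target last."""
    buffer = io.BytesIO()
    np.savetxt(buffer, np.c_[X, y], delimiter=",")
    return buffer.getvalue()

def from_training_csv(payload):
    """Split the CSV back into features and target, as the script does."""
    data = np.loadtxt(io.BytesIO(payload), delimiter=",", ndmin=2)
    return data[:, :-1], data[:, -1]

# synthetic stand-in data, not the Gauss3 values
X_demo = np.array([[1.0], [2.0], [3.0]])
y_demo = np.array([10.0, 20.0, 30.0])

payload = to_training_csv(X_demo, y_demo)
X_back, y_back = from_training_csv(payload)
print(X_back.shape, y_back.shape)
```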

After writing the training data to our S3 bucket,

file_fn = f"{default_bucket}/{prefix}/{data_name}/train/data.csv"
file_path = f"s3://{file_fn}"

s3 = get_s3fs()
with s3.open(file_fn, 'wb') as f:
    np.savetxt(f, np.c_[X_train, y_train], delimiter = ',')

we can train our model:

hyperparameters = {
    "train-file": "data.csv",
    "df": "20"
}

data_channels = {
    "train": file_path
}

estimator = Estimator(
    role = role,
    sagemaker_session = sagemaker_session,
    instance_count = 1,
    instance_type = "local" if local_mode else "ml.m5.large",
    image_uri = ecr_image,
    base_job_name = f'{data_name}-{test_name}',
    hyperparameters = hyperparameters,
    output_path = f"s3://{default_bucket}/{prefix}/{data_name}/model"
)

estimator.fit(data_channels, wait = True, logs = "None")
job_name = estimator.latest_training_job.name
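
When a training job finishes on SageMaker instances, the contents of the model directory are packaged as model.tar.gz beneath the estimator's output path. The sketch below shows how that location is laid out; the `model_artifact_uri` helper and the sample job name are hypothetical, and in practice the estimator's model_data attribute reports the same location.

```python
def model_artifact_uri(output_path, job_name):
    """Return where SageMaker stores the packaged model for a training job."""
    return f"{output_path.rstrip('/')}/{job_name}/output/model.tar.gz"

# placeholder bucket and job name for illustration
artifact = model_artifact_uri(
    "s3://my-example-bucket/modeling/sagemaker/gauss3/model",
    "gauss3-gam-demo-2021-01-01-00-00-00-000",
)
print(artifact)
```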

Once the model is trained, we can deploy it to make real-time inferences.

np_serialize = NumpySerializer()
np_deserialize = NumpyDeserializer()

predictor = estimator.deploy(
    initial_instance_count = 1,
    instance_type = "local" if local_mode else "ml.t2.medium",
    serializer = np_serialize,
    deserializer = np_deserialize
)

Now let’s get model predictions on the training and testing data and compare them against the actual data.

y_hat_train = predictor.predict(X_train)
y_hat_test = predictor.predict(X_test)

We can see that the GAM found a smooth representation which captures the non-linearity of the data.

Be sure to delete the model endpoint when you are done testing the model.
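
One way to make the cleanup hard to forget is to wrap the endpoint in a context manager that calls the predictor's delete_endpoint method on exit. This is only a sketch: `temporary_endpoint` is a hypothetical helper, and the stand-in class exists purely so the example runs without AWS access.

```python
from contextlib import contextmanager

@contextmanager
def temporary_endpoint(predictor):
    """Yield the predictor and delete its endpoint afterwards, even on error."""
    try:
        yield predictor
    finally:
        predictor.delete_endpoint()

class FakePredictor:
    """Stand-in for a SageMaker Predictor, used only for this demonstration."""
    def __init__(self):
        self.deleted = False

    def predict(self, data):
        return [0.0 for _ in data]

    def delete_endpoint(self):
        self.deleted = True

fake = FakePredictor()
with temporary_endpoint(fake) as p:
    predictions = p.predict([[1.0], [2.0]])
print(fake.deleted)
```

With a real endpoint, the object returned by estimator.deploy would take the place of the stand-in.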



This tutorial has outlined the process of creating a unique container image in SageMaker and showed how it can be used to train and deploy a custom machine learning model. Hopefully this has been helpful and will serve as a useful reference.

Note: If you have trouble during this process, be sure to check the CloudWatch log groups for your SageMaker building, training, and deployment instances. They are your best friend for finding and resolving issues!
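
To know where to look, it helps that SageMaker writes to predictable log groups: training jobs log under /aws/sagemaker/TrainingJobs, and each endpoint has its own group under /aws/sagemaker/Endpoints. The small helper below, a hypothetical convenience and not part of any SDK, simply builds those names.

```python
def sagemaker_log_groups(endpoint_name=None):
    """Return the CloudWatch log group names used by SageMaker."""
    groups = {"training": "/aws/sagemaker/TrainingJobs"}
    if endpoint_name is not None:
        groups["endpoint"] = f"/aws/sagemaker/Endpoints/{endpoint_name}"
    return groups

# placeholder endpoint name for illustration
log_groups = sagemaker_log_groups("gauss3-gam-demo")
print(log_groups)
```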

PREDICTif is a Select Consulting Partner with Amazon Web Services. For more information, check out our Amazon partner page.
