Deploy ML Models with Flask on VPS

By Anurag Singh

Updated on Oct 16, 2024

In this tutorial, we'll learn how to deploy ML models with Flask and TensorFlow on a VPS.

Deploying machine learning models is a crucial step in delivering AI-powered applications. Flask, a lightweight Python web framework, makes it easy to expose a TensorFlow model behind a simple HTTP API. This tutorial covers the complete process of setting up a production-ready ML application on a VPS or dedicated server, using Flask to serve a TensorFlow model behind Gunicorn and Nginx.

Prerequisites

  • A KVM VPS or dedicated server with Ubuntu 24.04.
  • Basic knowledge of Python, Flask, and TensorFlow.
  • SSH access to the server with a sudo-enabled user.


Let's break this tutorial into logical steps:

1. Set Up the Server

Before deploying the application, ensure your server is up-to-date and ready for Python development.

Step 1.1: Update the server

Connect to your server via SSH and run the following commands to update it:

sudo apt update
sudo apt upgrade

Step 1.2: Install Python 3 and pip

Ensure Python 3 is installed. If not, install it along with pip:

sudo apt install python3 python3-pip

You can verify the installations with:

python3 --version
pip3 --version

Step 1.3: Set Up a Virtual Environment

It's recommended to create a Python virtual environment for the Flask application to avoid conflicts between packages.

sudo apt install python3-venv
mkdir ~/flask_ml_app
cd ~/flask_ml_app
python3 -m venv venv
source venv/bin/activate

After activating the virtual environment, your prompt should indicate the environment name (e.g., (venv)).

Step 1.4: Install Flask and TensorFlow

With the virtual environment active, install Flask and TensorFlow using pip:

pip install flask tensorflow

At this point, you’ve set up the base environment for serving your machine learning model.
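
As a quick sanity check, you can confirm that TensorFlow imports correctly and record the exact package versions for reproducible reinstalls later:

python -c "import tensorflow as tf; print(tf.__version__)"
pip freeze > requirements.txt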

2. Build and Save a TensorFlow Model

For this guide, we’ll create a simple TensorFlow model, but you can substitute any trained model.

Step 2.1: Create a Simple TensorFlow Model

Let's build a basic model to classify handwritten digits using the MNIST dataset. You can write a Python script to create and train the model:

nano model_train.py

Add the following content:

# model_train.py
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist

# Load dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Preprocess data
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255

# Build the model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))

# Save the model
model.save('mnist_model.h5')

Run this script to train the model and save it as mnist_model.h5:

python model_train.py
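
Before wiring the model into Flask, it's worth a quick check that the saved file loads and performs as expected. Here is a minimal verification sketch (verify_model.py is a hypothetical helper; it assumes mnist_model.h5 is in the current directory):

# verify_model.py - sanity-check the saved model before serving it
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import load_model

# Reload the model from disk exactly as the Flask app will
model = load_model('mnist_model.h5')

# Rebuild the test set with the same preprocessing used during training
(_, _), (test_images, test_labels) = mnist.load_data()
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255

# Accuracy should roughly match the validation accuracy seen in training
loss, accuracy = model.evaluate(test_images, test_labels, verbose=0)
print(f'Test accuracy: {accuracy:.4f}')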

Step 2.2: Transfer the Model to the Server

If the model is trained locally, transfer it to your server using scp:

scp mnist_model.h5 username@server_ip:~/flask_ml_app/

3. Build the Flask Application to Serve the Model

Next, we’ll create a Flask API that loads the TensorFlow model and processes predictions.

Step 3.1: Create the Flask Application

In the same directory (~/flask_ml_app), create the Flask app:

nano app.py

Add the following code to app.py:

from flask import Flask, request, jsonify
import numpy as np
from tensorflow.keras.models import load_model

# Initialize the Flask app
app = Flask(__name__)

# Load the saved model
model = load_model('mnist_model.h5')

# Define the predict function
@app.route('/predict', methods=['POST'])
def predict():
    # Get the data from the request
    data = request.get_json(force=True)
    
    # Convert the data into a numpy array and reshape it to fit the model's input
    image = np.array(data['image']).reshape(1, 28, 28, 1)
    
    # Normalize the image data
    image = image.astype('float32') / 255
    
    # Make a prediction
    predictions = model.predict(image)
    
    # Return the predicted class
    predicted_class = np.argmax(predictions)
    
    return jsonify({'predicted_class': int(predicted_class)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
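
The handler above assumes well-formed input. In production you would typically validate the payload before reshaping it; here is a minimal sketch (not exhaustive) of a guard that could replace the start of predict():

    # Sketch: basic input validation inside predict()
    data = request.get_json(force=True)
    if not data or 'image' not in data:
        return jsonify({'error': 'missing "image" field'}), 400

    image = np.array(data['image'], dtype='float32')
    if image.size != 28 * 28:
        return jsonify({'error': 'expected a 28x28 pixel image'}), 400

    image = image.reshape(1, 28, 28, 1) / 255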

Step 3.2: Test the Flask App Locally

Before deploying the app for production, you can test it locally. First, run the Flask app:

python app.py

The Flask server should be running on http://0.0.0.0:5000. You can test it using a tool like Postman or cURL.

For example, using cURL:

curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"image": [[...]]}'

Replace [...] with a valid 28x28 pixel image array (a nested list of pixel values); the test-client sketch below shows one way to build such a request.
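
Hand-typing a 784-value array is tedious, so here is a minimal sketch of a test client that pulls a real digit from the MNIST test set and posts it to the API. It assumes the requests library is installed (pip install requests) and that the server is running locally; test_client.py is a hypothetical helper name:

# test_client.py - send one MNIST test image to the /predict endpoint
import requests
from tensorflow.keras.datasets import mnist

# Grab a single 28x28 test image and its true label
(_, _), (test_images, test_labels) = mnist.load_data()
image = test_images[0].tolist()  # nested 28x28 list of raw pixel values (0-255)

# POST as JSON; the server reshapes and normalizes the values itself
response = requests.post('http://localhost:5000/predict', json={'image': image})

print('True label:', test_labels[0])
print('Response:', response.json())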

4. Configure Gunicorn for Production

Flask’s built-in development server is not suitable for production, so we’ll use Gunicorn, a WSGI server, to handle production requests.

Step 4.1: Install Gunicorn

With the virtual environment still active, install Gunicorn:

pip install gunicorn

Step 4.2: Run the Flask App with Gunicorn

Run the Flask application using Gunicorn to handle multiple requests:

gunicorn --bind 0.0.0.0:5000 app:app
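
The --workers flag controls how many processes handle requests; the Gunicorn documentation suggests (2 x CPU cores) + 1 as a starting point. For model-serving workloads, the --preload flag can also help: it loads the app (and therefore the TensorFlow model) once in the master process before forking workers, so the worker processes can share that memory. It's worth verifying your model still behaves correctly after forking:

gunicorn --workers 3 --preload --bind 0.0.0.0:5000 app:app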

At this stage, your app is ready for production deployment but still needs to be secured and managed by a process manager.

5. Configure Nginx as a Reverse Proxy

Nginx will act as a reverse proxy, forwarding client requests to Gunicorn.

Step 5.1: Install Nginx

Install Nginx on your server if you don’t have it installed already:

sudo apt install nginx

Step 5.2: Configure Nginx

Create a new Nginx configuration file for the Flask app:

sudo nano /etc/nginx/sites-available/flask_ml_app

Add the following configuration:

server {
    listen 80;
    server_name your_domain_or_IP;

    location / {
        proxy_pass http://127.0.0.1:5000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Replace your_domain_or_IP with your server’s domain or IP address.

Step 5.3: Enable the Nginx Configuration

Link the configuration file to sites-enabled, check the configuration for syntax errors, and restart Nginx:

sudo ln -s /etc/nginx/sites-available/flask_ml_app /etc/nginx/sites-enabled
sudo nginx -t
sudo systemctl restart nginx

6. Manage the Application with Systemd

We’ll configure Systemd to keep the application running, even after server reboots.

Step 6.1: Create a Systemd Service

Create a new service file:

sudo nano /etc/systemd/system/flask_ml_app.service

Add the following content:

[Unit]
Description=Gunicorn instance to serve Flask ML app
After=network.target

[Service]
User=username
Group=www-data
WorkingDirectory=/home/username/flask_ml_app
Environment="PATH=/home/username/flask_ml_app/venv/bin"
ExecStart=/home/username/flask_ml_app/venv/bin/gunicorn --workers 3 --bind 127.0.0.1:5000 app:app

[Install]
WantedBy=multi-user.target

Replace username with your actual username. Note that Gunicorn binds to 127.0.0.1:5000 here so that it matches the proxy_pass address in the Nginx configuration from the previous step; if you prefer a Unix socket instead, the Nginx proxy_pass must be updated to point at that socket.

Step 6.2: Start and Enable the Service

Reload Systemd, start the service, and enable it to run on boot:

sudo systemctl daemon-reload
sudo systemctl start flask_ml_app
sudo systemctl enable flask_ml_app
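
You can confirm the service started cleanly and follow its logs with:

sudo systemctl status flask_ml_app
sudo journalctl -u flask_ml_app -f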

7. Testing and Final Adjustments

Your Flask app is now served through Nginx at your server's IP or domain. Since the app exposes only a POST /predict endpoint, test it with cURL or an API client rather than a browser.

Step 7.1: Test the Model API

Use cURL or Postman to send a request to your deployed app:

curl -X POST http://your_domain_or_IP/predict -H "Content-Type: application/json" -d '{"image": [[...]]}'

You should receive a JSON response with the predicted class.
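
For example, if you send the first image from the MNIST test set (a handwritten 7, as in the test-client sketch above), the response should look something like:

{"predicted_class": 7}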

8. Securing the Application with SSL

To protect traffic to the API, secure the app with a free Let's Encrypt SSL certificate. Note that Let's Encrypt issues certificates for domain names only, so this step requires a domain pointing at your server.

Step 8.1: Install Certbot

Install Certbot, the Let’s Encrypt client, to obtain and manage your SSL certificates:

sudo apt install certbot python3-certbot-nginx

Step 8.2: Obtain the SSL Certificate

Run the following command to automatically configure Nginx with SSL:

sudo certbot --nginx -d your_domain

Follow the prompts to complete the SSL setup.
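
Certbot also sets up automatic renewal. You can verify that renewal will work with a dry run:

sudo certbot renew --dry-run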

Conclusion

In this tutorial, we walked through the entire process of deploying a machine learning model using Flask and TensorFlow on a VPS or dedicated server. You learned how to:

  • Set up a server with Python and TensorFlow.
  • Train and save a machine learning model.
  • Serve the model using Flask.
  • Use Gunicorn and Nginx for production.
  • Secure the app with SSL.

This setup is now ready to handle production traffic and can be extended to more complex models.