In this tutorial, we'll learn how to deploy machine learning models with Flask and TensorFlow on a VPS or dedicated server.
Deploying machine learning models is a crucial step in delivering AI-powered applications. Flask, a lightweight Python web framework, is well suited to serving TensorFlow models over a simple HTTP API. This tutorial covers the complete process of setting up a production-ready ML application on a VPS or dedicated server, using Flask to serve a TensorFlow model behind Gunicorn and Nginx.
Prerequisites
- A KVM VPS or dedicated server with Ubuntu 24.04.
- Basic knowledge of Python, Flask, and TensorFlow.
- TensorFlow, Flask, and the other required libraries (we'll install these on the server during the tutorial).
Deploy ML Models with Flask on VPS
Let's break this tutorial into logical steps:
1. Set Up the Server
Before deploying the application, ensure your server is up-to-date and ready for Python development.
Step 1.1: Update the server
Connect to your server via SSH and run the following commands to update it:
sudo apt update
sudo apt upgrade
Step 1.2: Install Python 3 and pip
Ensure Python 3 is installed. If not, install it along with pip:
sudo apt install python3 python3-pip
You can verify the installations with:
python3 --version
pip3 --version
Step 1.3: Set Up a Virtual Environment
It's recommended to create a Python virtual environment for the Flask application to avoid conflicts between packages.
sudo apt install python3-venv
mkdir ~/flask_ml_app
cd ~/flask_ml_app
python3 -m venv venv
source venv/bin/activate
After activating the virtual environment, your prompt should indicate the environment name (e.g., (venv)).
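For example (your username and hostname will differ):
(venv) user@server:~/flask_ml_app$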
Step 1.4: Install Flask and TensorFlow
With the virtual environment active, install Flask and TensorFlow using pip:
pip install flask tensorflow
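TensorFlow is a large package, so this step can take a few minutes. You can confirm the installation with a quick import check (the printed version will vary):
python -c "import tensorflow as tf; print(tf.__version__)"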
At this point, you’ve set up the base environment for serving your machine learning model.
2. Build and Save a TensorFlow Model
For this guide, we’ll create a simple TensorFlow model, but you can substitute any trained model.
Step 2.1: Create a Simple TensorFlow Model
Let's build a basic model to classify handwritten digits using the MNIST dataset. You can write a Python script to create and train the model:
nano model_train.py
Add the following content:
# model_train.py
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
# Load dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Preprocess data
train_images = train_images.reshape((60000, 28, 28, 1)).astype('float32') / 255
test_images = test_images.reshape((10000, 28, 28, 1)).astype('float32') / 255
# Build the model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])
# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# Train the model
model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
# Save the model
model.save('mnist_model.h5')
Run this script to train the model and save it as mnist_model.h5 (recent TensorFlow versions may warn that HDF5 is a legacy format and recommend .keras; the .h5 file works fine for this tutorial):
python model_train.py
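Training runs for five epochs and can take several minutes on a CPU. Keras prints per-epoch progress as it trains; for this architecture, the final validation accuracy typically lands around 99%, with the last progress line resembling (exact numbers will differ between runs):
Epoch 5/5 ... loss: 0.0190 - accuracy: 0.9940 - val_accuracy: 0.9912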
Step 2.2: Transfer the Model to the Server
If the model is trained locally, transfer it to your server using scp:
scp mnist_model.h5 username@server_ip:~/flask_ml_app/
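If the model was trained on the server itself, it's already in place. Either way, you can confirm the file exists before moving on:
ls -lh ~/flask_ml_app/mnist_model.h5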
3. Build the Flask Application to Serve the Model
Next, we’ll create a Flask API that loads the TensorFlow model and processes predictions.
Step 3.1: Create the Flask Application
In the same directory (~/flask_ml_app), create the Flask app:
nano app.py
Add the following code to app.py:
from flask import Flask, request, jsonify
import tensorflow as tf
import numpy as np
from tensorflow.keras.models import load_model
# Initialize the Flask app
app = Flask(__name__)
# Load the saved model
model = load_model('mnist_model.h5')
# Define the predict function
@app.route('/predict', methods=['POST'])
def predict():
    # Get the data from the request
    data = request.get_json(force=True)
    # Convert the data into a numpy array and reshape it to fit the model's input
    image = np.array(data['image']).reshape(1, 28, 28, 1)
    # Normalize the image data
    image = image.astype('float32') / 255
    # Make a prediction
    predictions = model.predict(image)
    # Return the predicted class
    predicted_class = np.argmax(predictions)
    return jsonify({'predicted_class': int(predicted_class)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
Step 3.2: Test the Flask App Locally
Before deploying the app for production, you can test it locally. First, run the Flask app:
python app.py
The Flask server should now be listening on http://0.0.0.0:5000. You can test it using a tool like Postman or cURL.
For example, using cURL:
curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"image": [[...]]}'
Replace [[...]] with a valid 28x28 array of pixel values. The app normalizes the input itself, so raw 0-255 grayscale values work.
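Constructing the 28x28 JSON array by hand is tedious, so here is a minimal Python test client as a sketch. It assumes the requests library is installed (pip install requests) and that you run it inside the same virtual environment, so the MNIST dataset is available for pulling a sample image:
# predict_client.py - minimal test client (assumes `requests` is installed)
import requests
from tensorflow.keras.datasets import mnist

# Load one image from the MNIST test set
(_, _), (test_images, test_labels) = mnist.load_data()
image = test_images[0].tolist()  # 28x28 list of 0-255 pixel values

# POST it to the prediction endpoint; the server handles normalization
response = requests.post('http://localhost:5000/predict', json={'image': image})
print('Expected:', int(test_labels[0]), 'Predicted:', response.json()['predicted_class'])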
4. Configure Gunicorn for Production
Flask’s built-in development server is not suitable for production, so we’ll use Gunicorn, a WSGI server, to handle production requests.
Step 4.1: Install Gunicorn
With the virtual environment still active, install Gunicorn:
pip install gunicorn
Step 4.2: Run the Flask App with Gunicorn
Run the Flask application using Gunicorn to handle multiple requests:
gunicorn --bind 0.0.0.0:5000 app:app
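Gunicorn runs a single worker by default. A common rule of thumb is (2 × CPU cores) + 1 workers; for a single-core VPS that gives 3:
gunicorn --workers 3 --bind 0.0.0.0:5000 app:app
Keep in mind that each worker process loads its own copy of the TensorFlow model, so balance the worker count against your server's available RAM.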
At this stage, your app is ready for production deployment but still needs to be secured and managed by a process manager.
5. Configure Nginx as a Reverse Proxy
Nginx will act as a reverse proxy, forwarding client requests to Gunicorn.
Step 5.1: Install Nginx
Install Nginx on your server if you don’t have it installed already:
sudo apt install nginx
Step 5.2: Configure Nginx
Create a new Nginx configuration file for the Flask app:
sudo nano /etc/nginx/sites-available/flask_ml_app
Add the following configuration:
server {
listen 80;
server_name your_domain_or_IP;
location / {
proxy_pass http://127.0.0.1:5000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Replace your_domain_or_IP with your server's domain name or IP address.
Step 5.3: Enable the Nginx Configuration
Link the configuration file to sites-enabled and restart Nginx:
sudo ln -s /etc/nginx/sites-available/flask_ml_app /etc/nginx/sites-enabled
sudo systemctl restart nginx
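If the restart fails, check the configuration for syntax errors first:
sudo nginx -t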
6. Manage the Application with Systemd
We’ll configure Systemd to keep the application running, even after server reboots.
Step 6.1: Create a Systemd Service
Create a new service file:
sudo nano /etc/systemd/system/flask_ml_app.service
Add the following content:
[Unit]
Description=Gunicorn instance to serve Flask ML app
After=network.target
[Service]
User=username
Group=www-data
WorkingDirectory=/home/username/flask_ml_app
Environment="PATH=/home/username/flask_ml_app/venv/bin"
ExecStart=/home/username/flask_ml_app/venv/bin/gunicorn --workers 3 --bind 127.0.0.1:5000 app:app
[Install]
WantedBy=multi-user.target
Replace username with your actual username. Gunicorn binds to 127.0.0.1:5000 here, which matches the proxy_pass address in the Nginx configuration from the previous step.
Step 6.2: Start and Enable the Service
Reload Systemd, start the service, and enable it to run on boot:
sudo systemctl daemon-reload
sudo systemctl start flask_ml_app
sudo systemctl enable flask_ml_app
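You can verify that the service started correctly with:
sudo systemctl status flask_ml_app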
7. Testing and Final Adjustments
Your Flask app is now reachable through Nginx at your server's domain or IP. Note that the app only defines a POST /predict route, so opening the root URL in a browser will return a 404; use an API client instead.
Step 7.1: Test the Model API
Use cURL or Postman to send a request to your deployed app:
curl -X POST http://your_domain_or_IP/predict -H "Content-Type: application/json" -d '{"image": [[...]]}'
You should receive a JSON response with the predicted class.
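For example (the exact class depends on the image you send):
{"predicted_class": 7}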
8. Securing the Application with SSL
For security, it's critical to serve the app over HTTPS. We'll use a free SSL certificate from Let's Encrypt.
Step 8.1: Install Certbot
Install Certbot, the Let’s Encrypt client, to obtain and manage your SSL certificates:
sudo apt install certbot python3-certbot-nginx
Step 8.2: Obtain the SSL Certificate
Run the following command to automatically obtain a certificate and configure Nginx with SSL. Note that Let's Encrypt only issues certificates for domain names, not bare IP addresses, so this step requires a domain pointing to your server:
sudo certbot --nginx -d your_domain
Replace your_domain with your actual domain, then follow the prompts to complete the SSL setup.
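Certbot's Ubuntu package also sets up automatic certificate renewal; you can confirm that renewal will work with a dry run:
sudo certbot renew --dry-run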
Conclusion
In this tutorial, we walked through the entire process of deploying a machine learning model using Flask and TensorFlow on a VPS or dedicated server. You learned how to:
- Set up a server with Python and TensorFlow.
- Train and save a machine learning model.
- Serve the model using Flask.
- Use Gunicorn and Nginx for production.
- Secure the app with SSL.
This setup is now ready to handle production traffic and can be extended to more complex models.