Streamlit for Machine Learning: Build ML Web Apps in Minutes

You’ve built a great ML model. Your manager wants a demo. Your client needs an interface to test it. Your portfolio needs something more impressive than “here’s my Jupyter notebook.” You know you need a web app, but you’re a data scientist, not a web developer. Learning React, Flask templates, CSS, and deployment infrastructure sounds like a three-month detour from actual ML work.

I spent two weeks building my first ML demo with Flask before discovering Streamlit. What took 500 lines of HTML, CSS, JavaScript, and Python boilerplate became 50 lines of pure Python. No frontend knowledge required. No CSS debugging. No JavaScript wrestling. Just Python code that turns into a functional web app in minutes. Streamlit is the difference between “I’ll build that demo eventually” and “I built that demo during lunch.”

Let me show you how to stop avoiding demos and start shipping ML applications.

Streamlit for Machine Learning

What Is Streamlit and Why It Exists

Streamlit is a Python framework for creating web applications with pure Python — no HTML, CSS, or JavaScript required. For ML practitioners, it’s the fastest way to turn models into shareable applications.

What Streamlit provides:

  • Pure Python web app development
  • Automatic UI from Python code
  • Built-in widgets (sliders, file uploaders, etc.)
  • Real-time interactivity
  • Easy deployment
  • Data visualization integration

What problems it solves:

  • Need frontend skills for ML demos
  • Slow iteration on ML applications
  • Difficulty sharing models with non-technical users
  • Complex deployment requirements
  • Notebook-to-production gap

Think of Streamlit as “Jupyter notebooks but for end users” — interactive, visual, and code-centric, but actually shareable as real applications.

Installation and Your First App

Getting started takes literally 2 minutes:

bash

pip install streamlit

Create app.py:

python

import streamlit as st
st.title("My First ML App")
st.write("Hello, Streamlit!")
name = st.text_input("What's your name?")
st.write(f"Hello, {name}!")

Run it:

bash

streamlit run app.py

A browser window opens with your app running. Change the code, save, and the app updates automatically. That’s it — you’ve built a web app and have it running locally.

Real ML Demo: Image Classification

Let’s build an actual ML application:

python

import streamlit as st
import torch
import torchvision.transforms as transforms
import torchvision.models as models
from PIL import Image

# Page config
st.set_page_config(
    page_title="Image Classifier",
    page_icon="🖼️",
    layout="wide"
)

# Title
st.title("🖼️ Image Classification Demo")
st.write("Upload an image to classify it using ResNet50")

# Load model (cached to avoid reloading)
@st.cache_resource
def load_model():
    model = models.resnet50(pretrained=True)
    model.eval()
    return model

model = load_model()

# Load ImageNet labels
@st.cache_data
def load_labels():
    with open('imagenet_classes.txt') as f:
        return [line.strip() for line in f.readlines()]

labels = load_labels()

# File uploader
uploaded_file = st.file_uploader(
    "Choose an image...",
    type=['jpg', 'jpeg', 'png']
)

if uploaded_file is not None:
    # Display image (convert to RGB so PNGs with an alpha channel don't break preprocessing)
    image = Image.open(uploaded_file).convert('RGB')
    st.image(image, caption='Uploaded Image', use_column_width=True)

    # Preprocess
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        )
    ])

    input_tensor = transform(image).unsqueeze(0)

    # Predict
    with st.spinner('Classifying...'):
        with torch.no_grad():
            output = model(input_tensor)

    # Get top 5 predictions
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    top5_prob, top5_idx = torch.topk(probabilities, 5)

    # Display results
    st.subheader("Top 5 Predictions:")
    for i, (prob, idx) in enumerate(zip(top5_prob, top5_idx)):
        st.write(f"{i+1}. **{labels[idx]}**: {prob.item()*100:.2f}%")
        st.progress(prob.item())

Run this and you have a functional image classifier with:

  • File upload
  • Image display
  • Model inference
  • Results visualization
  • Progress indicators

All in ~60 lines of Python. No HTML, CSS, or JavaScript.

Core Streamlit Widgets

Streamlit provides widgets for everything you need:

Input Widgets

python

# Text input
name = st.text_input("Enter your name")
# Number input
age = st.number_input("Enter your age", min_value=0, max_value=120)
# Slider
temperature = st.slider("Select temperature", 0.0, 1.0, 0.5)
# Select box
model = st.selectbox("Choose model", ["ResNet50", "VGG16", "EfficientNet"])
# Multi-select
features = st.multiselect(
    "Select features",
    ["feature1", "feature2", "feature3"]
)
# Checkbox
show_raw_data = st.checkbox("Show raw data")
# Radio buttons
option = st.radio("Pick one", ["Option A", "Option B", "Option C"])
# File uploader
file = st.file_uploader("Upload CSV", type=['csv'])
# Date input
date = st.date_input("Select date")
# Time input
time = st.time_input("Select time")
# Text area (multi-line)
feedback = st.text_area("Enter feedback")
# Color picker
color = st.color_picker("Pick a color")

Display Elements

python

# Text
st.write("Regular text")
st.markdown("**Bold** and *italic*")
st.title("Title")
st.header("Header")
st.subheader("Subheader")
st.caption("Caption text")
st.code("print('hello')", language='python')
# Data
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
st.dataframe(df) # Interactive table
st.table(df) # Static table
st.json({'key': 'value'}) # JSON
# Metrics
st.metric("Accuracy", "94.2%", "2.1%") # Shows delta
# Charts
st.line_chart(df)
st.bar_chart(df)
st.area_chart(df)
# Media
st.image("image.jpg")
st.audio("audio.mp3")
st.video("video.mp4")

Layout and Organization

Control your app’s layout:

Columns

python

col1, col2, col3 = st.columns(3)

with col1:
    st.header("Column 1")
    st.write("Content for column 1")

with col2:
    st.header("Column 2")
    st.write("Content for column 2")

with col3:
    st.header("Column 3")
    st.write("Content for column 3")

Tabs

python

tab1, tab2, tab3 = st.tabs(["Model", "Data", "About"])

with tab1:
    st.write("Model information here")
with tab2:
    st.dataframe(df)
with tab3:
    st.write("About this app")

Sidebar

python

st.sidebar.title("Settings")
model = st.sidebar.selectbox("Model", ["ResNet", "VGG"])
threshold = st.sidebar.slider("Threshold", 0.0, 1.0, 0.5)
# Main content
st.title("Main Content")
st.write(f"Using {model} with threshold {threshold}")

Expanders

python

with st.expander("See explanation"):
    st.write("Detailed explanation here")
    st.image("diagram.png")

Containers

python

header = st.container()
body = st.container()

with header:
    st.title("My App")
    st.write("Description")

with body:
    st.write("Main content")

Caching: Make Apps Fast

Streamlit reruns your entire script on every interaction. Caching prevents expensive operations from repeating:

Cache Data

python

@st.cache_data
def load_data(file_path):
    """Cache data loading."""
    return pd.read_csv(file_path)

df = load_data("large_dataset.csv")  # Only loads once

Cache Resources (Models)

python

@st.cache_resource
def load_model():
    """Cache model loading."""
    model = torch.load("model.pth")
    model.eval()
    return model

model = load_model()  # Only loads once

When to use which:

  • @st.cache_data: For data (DataFrames, arrays, etc.)
  • @st.cache_resource: For models, connections, expensive objects

Without caching, your app reloads the model on every interaction. With caching, it loads once and reuses it. This is crucial for performance.

Real-World Example: Complete ML Dashboard

Let’s build a full ML dashboard with multiple features:

python

import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Page config
st.set_page_config(
    page_title="ML Dashboard",
    page_icon="📊",
    layout="wide"
)

# Title
st.title("📊 Machine Learning Dashboard")

# Sidebar for settings
st.sidebar.header("Settings")
uploaded_file = st.sidebar.file_uploader("Upload CSV", type=['csv'])

if uploaded_file:
    # Load data
    @st.cache_data
    def load_data(file):
        return pd.read_csv(file)

    df = load_data(uploaded_file)

    # Tabs
    tab1, tab2, tab3, tab4 = st.tabs(["📈 Data", "🔍 EDA", "🤖 Model", "📊 Results"])

    # Tab 1: Data Overview
    with tab1:
        st.header("Data Overview")

        col1, col2, col3 = st.columns(3)
        col1.metric("Rows", df.shape[0])
        col2.metric("Columns", df.shape[1])
        col3.metric("Missing Values", df.isnull().sum().sum())

        st.subheader("Dataset Preview")
        st.dataframe(df.head(), use_container_width=True)

        st.subheader("Dataset Statistics")
        st.dataframe(df.describe(), use_container_width=True)

        if st.checkbox("Show missing values"):
            st.write(df.isnull().sum())

    # Tab 2: Exploratory Data Analysis
    with tab2:
        st.header("Exploratory Data Analysis")

        # Select columns for visualization
        numeric_cols = df.select_dtypes(include=[np.number]).columns.tolist()

        if numeric_cols:
            col1, col2 = st.columns(2)

            with col1:
                x_axis = st.selectbox("X-axis", numeric_cols)
            with col2:
                y_axis = st.selectbox("Y-axis", numeric_cols)

            # Scatter plot
            fig = px.scatter(df, x=x_axis, y=y_axis, title=f"{x_axis} vs {y_axis}")
            st.plotly_chart(fig, use_container_width=True)

            # Distribution plot
            selected_col = st.selectbox("Select column for distribution", numeric_cols)
            fig2 = px.histogram(df, x=selected_col, title=f"Distribution of {selected_col}")
            st.plotly_chart(fig2, use_container_width=True)

    # Tab 3: Model Training
    with tab3:
        st.header("Model Training")

        # Select target
        target = st.selectbox("Select target variable", df.columns.tolist())

        # Select features
        feature_options = [col for col in df.columns if col != target]
        features = st.multiselect("Select features", feature_options, default=feature_options[:5])

        if features and target:
            # Model parameters
            st.subheader("Model Parameters")
            col1, col2 = st.columns(2)

            with col1:
                n_estimators = st.slider("Number of trees", 10, 200, 100)
            with col2:
                test_size = st.slider("Test size", 0.1, 0.5, 0.2)

            if st.button("Train Model"):
                with st.spinner("Training model..."):
                    # Prepare data
                    X = df[features]
                    y = df[target]

                    X_train, X_test, y_train, y_test = train_test_split(
                        X, y, test_size=test_size, random_state=42
                    )

                    # Train model
                    model = RandomForestClassifier(n_estimators=n_estimators, random_state=42)
                    model.fit(X_train, y_train)

                    # Predictions
                    y_pred = model.predict(X_test)

                    # Store in session state
                    st.session_state['model'] = model
                    st.session_state['X_test'] = X_test
                    st.session_state['y_test'] = y_test
                    st.session_state['y_pred'] = y_pred
                    st.session_state['features'] = features

                st.success("Model trained successfully!")

    # Tab 4: Results
    with tab4:
        st.header("Model Results")

        if 'model' in st.session_state:
            model = st.session_state['model']
            y_test = st.session_state['y_test']
            y_pred = st.session_state['y_pred']

            # Metrics
            accuracy = (y_pred == y_test).mean()
            st.metric("Accuracy", f"{accuracy:.2%}")

            # Classification report
            st.subheader("Classification Report")
            report = classification_report(y_test, y_pred, output_dict=True)
            st.dataframe(pd.DataFrame(report).transpose())

            # Confusion matrix
            st.subheader("Confusion Matrix")
            cm = confusion_matrix(y_test, y_pred)
            fig, ax = plt.subplots()
            sns.heatmap(cm, annot=True, fmt='d', ax=ax)
            st.pyplot(fig)

            # Feature importance
            st.subheader("Feature Importance")
            importance_df = pd.DataFrame({
                'Feature': st.session_state['features'],
                'Importance': model.feature_importances_
            }).sort_values('Importance', ascending=False)

            fig = px.bar(importance_df, x='Feature', y='Importance')
            st.plotly_chart(fig, use_container_width=True)
        else:
            st.info("Train a model first in the Model tab")
else:
    st.info("Upload a CSV file to get started")

This creates a complete ML dashboard with:

  • Data upload and overview
  • Interactive EDA
  • Model training with configurable parameters
  • Results visualization
  • Feature importance analysis

All interactive, all in Python, ready to share.



Session State: Remember User Interactions

Session state stores data across reruns:

python

# Initialize
if 'counter' not in st.session_state:
st.session_state.counter = 0
# Increment button
if st.button("Increment"):
st.session_state.counter += 1
st.write(f"Counter: {st.session_state.counter}")
# Store model predictions
if st.button("Make Prediction"):
prediction = model.predict(input_data)
st.session_state.prediction = prediction
if 'prediction' in st.session_state:
st.write(f"Prediction: {st.session_state.prediction}")

Session state is essential for multi-step workflows and maintaining user data.

Deployment Options

Streamlit apps are easy to deploy:

Streamlit Cloud (Free)

  1. Push code to GitHub
  2. Go to streamlit.io/cloud
  3. Connect repository
  4. Deploy

Free hosting for public apps. Takes 5 minutes.
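Streamlit Cloud installs dependencies from a requirements.txt at the repo root. A minimal one for the image-classifier demo earlier might look like this (the package list is illustrative; pin versions as your project requires):

```text
streamlit
torch
torchvision
Pillow
```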

Docker

dockerfile

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501"]

Heroku, AWS, GCP

Streamlit runs anywhere Python runs. Deploy like any Python web app.
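On platforms that assign a port at runtime (Heroku-style), the start command usually forwards it to Streamlit. This is a sketch, not a full deploy config; `$PORT` is supplied by the platform:

```shell
streamlit run app.py --server.port=$PORT --server.address=0.0.0.0
```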

Best Practices and Patterns

Pattern 1: Loading Indicators

python

with st.spinner("Loading model..."):
    model = load_large_model()
st.success("Model loaded!")

# Or progress bars
progress_bar = st.progress(0)
for i in range(100):
    # Do work
    progress_bar.progress(i + 1)

Pattern 2: Error Handling

python

try:
    result = risky_operation()
    st.success("Operation successful!")
except Exception as e:
    st.error(f"Error occurred: {str(e)}")

Pattern 3: Form Submission

python

with st.form("my_form"):
    name = st.text_input("Name")
    age = st.number_input("Age")
    submitted = st.form_submit_button("Submit")

if submitted:
    st.write(f"Hello {name}, age {age}")

Forms batch inputs — the app only reruns when the form is submitted, not on every input change.

Common Mistakes to Avoid

Learn from these common Streamlit pitfalls:

Mistake 1: Not Using Caching

python

# Bad - reloads model on every interaction
model = torch.load("model.pth")

# Good - caches model
@st.cache_resource
def load_model():
    return torch.load("model.pth")

model = load_model()

Without caching, apps are painfully slow. Always cache expensive operations.

Mistake 2: Putting Heavy Logic in Main Script

python

# Bad - runs on every rerun
for i in range(1000000):
    # Expensive computation
    pass

# Good - in function with caching
@st.cache_data
def expensive_computation():
    result = []
    for i in range(1000000):
        # Expensive computation
        pass
    return result

result = expensive_computation()

Streamlit reruns the entire script on every interaction. Cache or move expensive code into functions.

Mistake 3: Not Managing Session State

Multi-step workflows without session state lose data on every interaction. Use session state for anything that needs to persist.

Mistake 4: Ignoring Performance

python

# Bad - loads huge dataset every time
df = pd.read_csv("huge_file.csv")

# Good - caches data
@st.cache_data
def load_data():
    return pd.read_csv("huge_file.csv")

df = load_data()

Performance matters. Users abandon slow apps. IMO, caching is the most important Streamlit concept to master.

The Bottom Line

Streamlit transforms “I need to learn web development to demo my model” into “I can ship a demo in an afternoon.” It’s not about building production web applications — it’s about rapidly creating shareable, interactive interfaces for ML models without frontend expertise.

Use Streamlit when:

  • Demoing ML models
  • Building internal tools
  • Creating data dashboards
  • Prototyping ML applications
  • Sharing work with non-technical stakeholders

Consider alternatives when:

  • Building production applications (FastAPI + React)
  • Need extreme customization (full web framework)
  • Performance is critical (Streamlit has overhead)

For ML practitioners who need to share their work, Streamlit is invaluable. The alternative is either learning full-stack web development or never shipping demos. Streamlit makes “ship something” the default instead of the exception.

Installation:

bash

pip install streamlit

Stop avoiding building demos because you don’t know web development. Start using Streamlit to turn models into shareable applications in pure Python. Your manager wants to see your model work, your portfolio needs visual projects, and your clients need interfaces they can actually use. Streamlit makes all of that possible without leaving Python. :)
