Saving and Loading Trained Models: joblib, pickle, and ONNX

After a predictive model is trained, evaluated, and selected, it needs to be saved so it can be reused later. Saving a trained model allows us to make predictions without retraining the model every time.

Model saving and loading is also called model persistence. Common tools include joblib, pickle, and ONNX. Each tool has different strengths, limitations, and deployment use cases.

Why Save a Trained Model?

Training a model can take time and computing resources. Once the best model is trained, we usually want to store it and use it later for prediction, deployment, testing, reporting, or integration into business systems.

Saving a model lets the trained object be loaded again with its learned parameters and settings intact. Preprocessing steps are preserved only if they are saved together with the model, for example as part of a single pipeline.

Core Idea: Model persistence lets us save a trained model once and load it later for prediction without repeating the full training process.

Model Persistence at a Glance

[Figure: Visual Intuition. Three panels: Saved Model Files (.pkl, .joblib, .onnx); Prediction Pipeline (Raw Data → Preprocess → Model); Version Tracking (v1.0 Baseline model, v1.1 Tuned model, v2.0 New data model).]

What Should Be Saved?

In real projects, saving only the model is often not enough. The model expects the same feature format that it saw during training. Therefore, preprocessing steps must also be saved or reproduced exactly.

Object to Save	Why It Matters	Example
Trained Model	Contains learned patterns and parameters.	Random Forest, XGBoost, Logistic Regression.
Preprocessing Pipeline	Ensures new data is transformed the same way as training data.	Imputer, scaler, encoder, feature selector.
Feature List	Maintains correct column order and expected input schema.	age, income, tenure, complaint_count.
Label Mapping	Converts model outputs back to business labels.	0 = No Churn, 1 = Churn.
Metadata	Supports reproducibility and governance.	Model version, training date, metrics, data version.
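The non-model items in the table can be written to a small JSON file stored next to the model file. The sketch below is one possible layout, not a standard: the field names, the file name, and every value (version, date, metrics) are illustrative placeholders, and only the feature names and label mapping are taken from the table.

```python
import json
import tempfile
from pathlib import Path

# Illustrative metadata; every value here is a placeholder, not a real result.
metadata = {
    "model_version": "1.0",
    "training_date": "2026-01-15",
    "data_version": "customer_data_2026_Q1",
    "feature_names": ["age", "income", "tenure", "complaint_count"],
    "label_mapping": {"0": "No Churn", "1": "Churn"},
    "metrics": {"f1": 0.71, "roc_auc": 0.86},
}


def save_metadata(meta: dict, path: Path) -> None:
    """Write the metadata as human-readable JSON."""
    path.write_text(json.dumps(meta, indent=2), encoding="utf-8")


def load_metadata(path: Path) -> dict:
    """Read the metadata back as a dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    sidecar = Path(tmp) / "churn_model_metadata.json"  # hypothetical file name
    save_metadata(metadata, sidecar)
    restored = load_metadata(sidecar)

# Map a raw model output (0 or 1) back to its business label.
# JSON object keys are always strings, hence the str() conversion.
predicted_class = 1
label = restored["label_mapping"][str(predicted_class)]
```

Keeping this information in plain JSON rather than inside the serialized model means it can be inspected without loading the model object itself.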

Saving Models with joblib

joblib is commonly used for saving scikit-learn models and pipelines. It is especially useful when the model contains large numerical arrays, such as tree ensembles or preprocessing objects.

In many Python machine learning workflows, joblib is preferred for saving fitted scikit-learn estimators and pipelines.

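# Saving a trained model with joblib
# `model` is assumed to be an already fitted estimator.
import joblib

joblib.dump(model, "churn_model.joblib")

# Loading the model later
loaded_model = joblib.load("churn_model.joblib")

# `new_data` must have the same feature layout as the training data.
predictions = loaded_model.predict(new_data)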

When to Use joblib

Use joblib When

You are working with scikit-learn models.
The model contains large NumPy arrays.
You want simple model persistence in Python.
You are saving full preprocessing pipelines.

Be Careful When

The model must run outside Python.
The deployment environment has different library versions.
You need long-term model portability across systems.
The file comes from an unknown source.

Saving Models with pickle

pickle is Python’s built-in object serialization tool. It can save many Python objects, including trained models, dictionaries, preprocessing objects, and custom classes.

pickle is flexible, but it should be used carefully. Pickle files are Python-specific and can be unsafe if loaded from untrusted sources.

# Saving a trained model with pickle
# `model` is assumed to be an already fitted estimator.
import pickle

with open("model.pkl", "wb") as file:
    pickle.dump(model, file)

# Loading the model later (only from a trusted source)
with open("model.pkl", "rb") as file:
    loaded_model = pickle.load(file)

predictions = loaded_model.predict(new_data)

When to Use pickle

Use pickle When

You need to save general Python objects.
The model will be loaded in a controlled Python environment.
You are saving small or moderate objects.
You understand the security implications.

Be Careful When

The file source is unknown or untrusted.
The model must work outside Python.
Library versions may change significantly.
You need strong production portability.

Security Risk with pickle and joblib

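Pickle and joblib can execute code during loading; joblib uses pickle internally, so the same risk applies to both formats. This means loading files from unknown or untrusted sources can be dangerous. A malicious file can run arbitrary code on the system as soon as it is loaded, before any prediction is made.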
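A harmless demonstration may make the risk concrete. In the sketch below, the `Payload` class is a hypothetical stand-in for a malicious object: its `__reduce__` method tells pickle which callable to invoke when the data is loaded. Here the callable only evaluates simple arithmetic, but an attacker could substitute any function.

```python
import pickle


class Payload:
    """Harmless stand-in for a malicious object (hypothetical example)."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading runs the callable chosen by whoever created the file.
# The result is not a Payload instance; it is whatever the callable returned.
result = pickle.loads(blob)
```

Nothing in the loading code asked for `eval` to run; the file's contents decided that. This is why inspecting a pickle file by loading it is not a safe way to check it.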

High-Risk Warning: Never load pickle or joblib files from unknown, untrusted, or unverifiable sources. Only load model files created by your own trusted training pipeline or trusted team.
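One way to enforce this in practice is to record a checksum when the training pipeline writes the file and to refuse to load a file whose checksum differs. The sketch below assumes the expected digest is stored somewhere trusted (for example, in the model metadata); the helper names `sha256_of` and `load_if_trusted` are hypothetical, and a plain dictionary stands in for a trained model. A checksum only shows that the file is unchanged since the digest was recorded. It does not make a file from an unknown source safe.

```python
import hashlib
import hmac
import pickle
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_if_trusted(path: Path, expected_sha256: str):
    """Unpickle the file only if its digest matches the recorded one."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_sha256):
        raise ValueError(f"Checksum mismatch for {path.name}; refusing to load.")
    with open(path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    stand_in_model = {"coef": [0.4, -1.2], "intercept": 0.1}  # placeholder object
    with open(model_path, "wb") as f:
        pickle.dump(stand_in_model, f)
    recorded_digest = sha256_of(model_path)  # stored at training time

    loaded = load_if_trusted(model_path, recorded_digest)

    # Simulate tampering by appending bytes to the file.
    with open(model_path, "ab") as f:
        f.write(b"tampered")
    try:
        load_if_trusted(model_path, recorded_digest)
        tamper_rejected = False
    except ValueError:
        tamper_rejected = True
```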

What is ONNX?

ONNX stands for Open Neural Network Exchange. It is a model format designed to improve portability across different frameworks and deployment environments.

Instead of saving a Python object, ONNX stores the model in a standardized format. This allows the model to be used in environments that may not depend on the original training framework.

Simple Explanation: joblib and pickle save Python model objects. ONNX saves a portable model representation that can be used across different systems more easily.

When to Use ONNX

Use ONNX When

You need model portability across platforms.
The model may run outside the Python training environment.
Deployment performance matters.
You want a standardized model representation.

Be Careful When

The model or preprocessing step is not supported.
Custom Python logic is part of the prediction pipeline.
Conversion changes model behaviour slightly.
Input schema and data types are not clearly defined.

joblib vs pickle vs ONNX

Format	Best For	Strength	Limitation
joblib .joblib	Scikit-learn models and pipelines.	Efficient for large numerical Python objects.	Python-specific and version-sensitive.
pickle .pkl	General Python object serialization.	Flexible and built into Python.	Security risk and Python-specific.
ONNX .onnx	Portable model deployment.	Standardized format for cross-platform use.	Conversion support may vary by model and preprocessing logic.

Saving the Full Pipeline

In production, it is usually safer to save the full pipeline instead of only the model. A pipeline may include imputation, scaling, encoding, feature selection, and the final estimator.

This ensures that new data goes through exactly the same transformations as the training data before prediction.

# Example: saving a complete scikit-learn pipeline
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import joblib

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression()),
])

# X_train and y_train are assumed to be the prepared training features and labels.
pipeline.fit(X_train, y_train)
joblib.dump(pipeline, "churn_pipeline.joblib")

Important: If preprocessing is not saved with the model, predictions on new data may be wrong because the input features may not match the training format.
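The sketch below shows the full round trip on synthetic data: the pipeline is fitted, saved, loaded, and then given raw, unscaled rows. Because the scaler travels with the model, the loaded pipeline reproduces the original probabilities. The data, the threshold used to create labels, and the file location are all illustrative assumptions.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic training data with two features on very different scales.
rng = np.random.default_rng(42)
X_train = np.column_stack([
    rng.normal(40, 10, 200),          # e.g. an age-like feature
    rng.normal(50_000, 15_000, 200),  # e.g. an income-like feature
])
y_train = (X_train[:, 0] + X_train[:, 1] / 1_000 > 90).astype(int)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression()),
])
pipeline.fit(X_train, y_train)

X_new = X_train[:5]  # raw rows; no manual scaling is applied
expected = pipeline.predict_proba(X_new)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "churn_pipeline.joblib"
    joblib.dump(pipeline, path)
    loaded = joblib.load(path)

# The loaded pipeline applies the saved scaler before the saved model.
reproduced = loaded.predict_proba(X_new)
round_trip_ok = bool(np.allclose(expected, reproduced))
```

Had only the `LogisticRegression` step been saved, the caller would need to rebuild a scaler with exactly the same fitted mean and variance before predicting.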

Loading a Model for Prediction

When a saved model is loaded, it should receive data in the same structure used during training. The column names, column order, data types, missing value handling, and encoding assumptions must match.
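A lightweight check before prediction can catch most of these mismatches. The sketch below validates a single input record against an expected schema using only the standard library; the schema, the allowed categories, and the helper names are hypothetical, and a real service would more likely validate a whole table at once.

```python
# Hypothetical schema: column name -> expected Python type, in training order.
EXPECTED_SCHEMA = {
    "tenure": int,
    "monthly_charges": float,
    "contract_type": str,
}
# Hypothetical set of categories seen during training.
ALLOWED_CATEGORIES = {"contract_type": {"monthly", "annual"}}


def validate_row(row: dict) -> list[str]:
    """Return a list of schema problems; an empty list means the row is valid."""
    problems = []
    missing = [col for col in EXPECTED_SCHEMA if col not in row]
    extra = [col for col in row if col not in EXPECTED_SCHEMA]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    for col, expected_type in EXPECTED_SCHEMA.items():
        if col in row and not isinstance(row[col], expected_type):
            problems.append(f"{col}: expected {expected_type.__name__}")
    for col, allowed in ALLOWED_CATEGORIES.items():
        if col in row and row[col] not in allowed:
            problems.append(f"{col}: unknown category {row[col]!r}")
    return problems


def to_ordered_values(row: dict) -> list:
    """Arrange values in the column order used during training."""
    return [row[col] for col in EXPECTED_SCHEMA]


# Keys arrive in a different order than the schema; that is fine.
good_row = {"monthly_charges": 70.5, "contract_type": "annual", "tenure": 12}
bad_row = {"tenure": "12", "contract_type": "weekly"}

good_problems = validate_row(good_row)
ordered = to_ordered_values(good_row)
bad_problems = validate_row(bad_row)
```

Returning a list of problems instead of raising on the first one lets the service report every issue with a request in a single response.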

Prediction Workflow with Saved Model

Load Model or Pipeline → Validate Input Schema → Transform New Data → Generate Prediction → Return Result

Model Versioning

Model versioning means tracking different versions of trained models. This is important because models are updated over time as data changes, features improve, or algorithms are tuned.

Versioning Item	Why It Matters	Example
Model Version	Identifies which model file is used.	churn_model_v1.2.joblib.
Training Data Version	Shows which data trained the model.	customer_data_2026_Q1.
Feature Schema Version	Tracks expected input columns and types.	schema_v3.json.
Metric Record	Stores performance at training time.	F1 = 0.71, ROC-AUC = 0.86.
Environment Version	Helps reproduce loading and prediction.	Python version, scikit-learn version, package list.
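If versions are encoded in file names as in the table, they should be compared numerically rather than as text, otherwise v1.10 sorts before v1.2. The sketch below assumes a `<name>_v<major>.<minor>.joblib` naming convention, which is an illustrative choice and not something the table prescribes. The file names are hypothetical.

```python
import re

# Assumed naming convention: <name>_v<major>.<minor>.joblib
VERSION_PATTERN = re.compile(
    r"^(?P<name>.+)_v(?P<major>\d+)\.(?P<minor>\d+)\.joblib$"
)


def parse_version(filename: str) -> tuple[int, int]:
    """Extract (major, minor) as integers so versions compare numerically."""
    match = VERSION_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unrecognised model file name: {filename}")
    return int(match["major"]), int(match["minor"])


def latest_model(filenames: list[str]) -> str:
    """Return the file name with the highest version."""
    return max(filenames, key=parse_version)


files = [
    "churn_model_v1.2.joblib",
    "churn_model_v1.10.joblib",
    "churn_model_v2.0.joblib",
]

newest = latest_model(files)
newest_v1 = latest_model(files[:2])  # v1.10 is newer than v1.2
```

In larger projects a model registry usually replaces file-name conventions, but the same numeric comparison still applies.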

Environment Compatibility

Models saved with joblib or pickle may depend on the Python version and library versions used during training. If the deployment environment uses different versions, loading may fail or predictions may behave differently.

Deployment Risk: Always record the training environment. A model saved in one library version may not load correctly in another version months later.
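Recording the environment can be automated at training time and checked again at load time. The sketch below uses only the standard library; the package list and the helper names are assumptions, and a package that is not installed is recorded as `None` rather than causing an error.

```python
import platform
from importlib import metadata


def capture_environment(packages: list[str]) -> dict:
    """Record the Python version and the installed version of each package."""
    env = {"python": platform.python_version()}
    for name in packages:
        try:
            env[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            env[name] = None  # package is not installed in this environment
    return env


def environment_mismatches(recorded: dict, current: dict) -> dict:
    """Return {name: (recorded_version, current_version)} for every difference."""
    return {
        name: (recorded[name], current.get(name))
        for name in recorded
        if recorded[name] != current.get(name)
    }


PACKAGES = ["scikit-learn", "joblib", "numpy"]  # illustrative package list

# At training time: capture and store alongside the model file.
training_env = capture_environment(PACKAGES)

# At load time: capture again and compare before loading the model.
serving_env = capture_environment(PACKAGES)
mismatches = environment_mismatches(training_env, serving_env)

# Simulated drift, to show what a mismatch report looks like.
drifted_env = dict(training_env, python="0.0.0")
drift_report = environment_mismatches(training_env, drifted_env)
```

Whether a mismatch should block loading or only log a warning is a policy decision; a strict check is the safer default for pickle and joblib files.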

Example: Churn Model Deployment

Business Scenario

A telecom company trains a churn prediction model. The model uses customer tenure, monthly charges, support complaints, payment delay, contract type, and usage change.

Step	Action	Why It Matters
1	Train preprocessing and model as one pipeline.	Ensures consistent transformation of new customer data.
2	Save pipeline using joblib.	Allows the same trained pipeline to be reused later.
3	Store metadata and feature schema.	Prevents incorrect input format during deployment.
4	Load model in prediction service.	Generates churn probabilities for new customers.
5	Monitor prediction quality over time.	Detects drift and performance degradation.

Example: ONNX for Portable Prediction

Deployment Scenario

A data science team trains a model in Python, but the production application runs in a different environment. Instead of depending on the full Python training stack, the team converts the model to ONNX for portable inference.

Training environment: Python model training and evaluation.
Conversion: Trained model is converted to ONNX format.
Deployment: ONNX Runtime is used for prediction in production.
Validation: Predictions from the original model and ONNX model are compared before release.
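The validation step can be as simple as comparing the two sets of predicted probabilities on a fixed sample and requiring the largest difference to stay under a tolerance. In the sketch below the probability values are made-up placeholders and the tolerance of 1e-5 is an assumption; the right threshold depends on the model and on how the predictions are used.

```python
def max_abs_difference(reference: list[float], candidate: list[float]) -> float:
    """Largest absolute difference between two equally long prediction lists."""
    if len(reference) != len(candidate):
        raise ValueError("Prediction lists have different lengths.")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def predictions_agree(
    reference: list[float], candidate: list[float], tolerance: float = 1e-5
) -> bool:
    """True if every pair of predictions differs by at most `tolerance`."""
    return max_abs_difference(reference, candidate) <= tolerance


# Placeholder values standing in for real model outputs.
original_probs = [0.12, 0.87, 0.455]            # from the original Python model
converted_probs = [0.1200001, 0.8699998, 0.4550003]  # from the converted model

within_tolerance = predictions_agree(original_probs, converted_probs)
too_strict = predictions_agree(original_probs, converted_probs, tolerance=1e-8)
```

Small differences are expected when the converted model computes in 32-bit floats, so an exact equality check would usually be too strict.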

Common Mistakes in Saving and Loading Models

Mistake	Why It Is Harmful	Better Approach
Saving only the model	New data may not be preprocessed correctly.	Save the full preprocessing pipeline with the model.
Not recording feature order	Model may receive wrong values in wrong columns.	Store feature names, schema, and expected data types.
Loading untrusted pickle files	Can execute malicious code.	Only load files from trusted sources.
Ignoring environment versions	Model may fail to load or behave differently.	Record Python and package versions.
No model versioning	Teams may not know which model is in production.	Use clear version names and metadata tracking.
No prediction validation after loading	Loaded model may not reproduce expected results.	Test loaded model on known sample inputs before deployment.

Best Practices for Model Persistence

Saving and Loading Checklist

Save the full pipeline: Include preprocessing and the final model together whenever possible.
Store feature schema: Record column names, order, data types, and expected categories.
Use joblib for scikit-learn pipelines: It is commonly used for Python-based model persistence.
Use pickle carefully: Only load pickle files from trusted sources.
Use ONNX for portability: Consider ONNX when the model must run across different systems.
Record environment details: Save Python version, package versions, and dependency files.
Version every model: Track model file, training data, metrics, and feature schema version.
Validate after loading: Check that loaded model predictions match expected outputs.
Protect model files: Treat production models as important business assets.
Monitor after deployment: Saving a model is not the end; performance must be checked over time.

Why Saving and Loading Models Matters

Saving and loading models is the bridge between experimentation and deployment. A model that cannot be reliably saved, loaded, and reproduced is not ready for real-world use.

Proper model persistence ensures consistency, reproducibility, deployment readiness, and long-term maintainability. It also helps teams track which model is being used and whether predictions are generated under the correct assumptions.

Practical Insight: A trained model is useful only when it can be reliably loaded with the same preprocessing, feature schema, environment, and business assumptions used during training.

Key Takeaways

Model persistence means saving trained models so they can be reused later.
joblib is commonly used for saving scikit-learn models and pipelines.
pickle can save general Python objects but must be used carefully due to security risks.
ONNX is useful for portable model deployment across different systems.
Saving only the model is often not enough; preprocessing and schema should also be saved.
Pickle and joblib files should never be loaded from untrusted sources.
Feature order, data types, and categories must match training expectations.
Model versioning helps track updates and production usage.
Environment compatibility matters for reliable loading and prediction.
A saved model should always be tested after loading before deployment.
