
ML Model Tracking in Fabric Notebooks

  • mandarp0
  • May 8
  • 2 min read

Updated: Sep 4

Modern machine learning pipelines demand not only high performance but also traceability, reproducibility, and governance. Microsoft Fabric provides a powerful integrated environment where data scientists can develop, train, evaluate, and track machine learning models using MLflow — all within a collaborative notebook-based interface.  


The Setup: ML Project in a Fabric Notebook  

Consider a basic regression problem: you have numerical features such as size, count, or quantity, and you want to predict a continuous target value (e.g., price or cost).

Start with the essential imports.

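A minimal version of those imports might look like the following (scikit-learn is assumed as the modeling library, and the rest of the walkthrough follows that assumption):

```python
# Core data handling, modeling, and MLflow tracking libraries used throughout.
import pandas as pd
import mlflow
import mlflow.sklearn
from mlflow.models import infer_signature

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error
```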

Data Loading and Preprocessing 

Start by loading a structured dataset containing numeric features and a continuous target. This demo assumes you have already uploaded a CSV to your Fabric Lakehouse. 


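One simple way to do this in a Fabric notebook is to read the file with the built-in Spark session and convert it to pandas; the file name below is a placeholder for your own upload:

```python
# Read the CSV from the attached Lakehouse ("Files/housing.csv" is a
# placeholder path -- point it at the file you uploaded) and convert to pandas.
df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("Files/housing.csv")
    .toPandas()
)

# Drop rows with missing values so the numeric features are clean.
df = df.dropna()
df.head()
```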

Feature Selection and Train-Test Split 

Separate your input variables (X) and the target column (y), and perform a standard 80-20 split for evaluation. 

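A sketch of that step, using illustrative column names that you would replace with the columns in your own data:

```python
# Illustrative feature and target names -- substitute your own columns.
feature_cols = ["size", "count", "quantity"]
target_col = "price"

X = df[feature_cols]
y = df[target_col]

# 80-20 split; a fixed random_state keeps the split reproducible across runs.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```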

Defining the Experiment 

MLflow experiments in Fabric must follow a naming convention: names must be no longer than 256 characters, must begin with a letter or number, and may contain only letters, numbers, underscores (_), and dashes (-). 

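Setting the experiment is a single call; mlflow.set_experiment creates the experiment if it does not already exist and makes it the target for subsequent runs:

```python
# The name matches the experiment referenced later in the Fabric UI.
mlflow.set_experiment("house_price_experiment")
```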

Training and Tracking the Model 

This is the core section where model training and tracking happen within a single MLflow run. 


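A minimal sketch of such a run, assuming the linear regression baseline from the imports above; it logs the parameters, metrics, signature, and model listed below:

```python
with mlflow.start_run(run_name="linear_regression_baseline"):
    # Train a simple baseline model.
    model = LinearRegression()
    model.fit(X_train, y_train)

    # Evaluate on the held-out 20% split.
    predictions = model.predict(X_test)
    r2 = r2_score(y_test, predictions)
    rmse = mean_squared_error(y_test, predictions) ** 0.5

    # Log parameters and metrics to the active run.
    mlflow.log_param("model_type", "LinearRegression")
    mlflow.log_metric("r2_score", r2)
    mlflow.log_metric("rmse", rmse)

    # Log the serialized model with an input/output signature so the
    # expected schema can be validated at inference time.
    signature = infer_signature(X_train, predictions)
    mlflow.sklearn.log_model(model, "model", signature=signature)
```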

MLflow captures: 


  • All parameters used (e.g., model type) 

  • Evaluation metrics (r2_score, rmse) 

  • The serialized model itself 

  • A signature schema for future input validation 


Pro Tips 


  • Use mlflow.autolog() if you want Fabric to log parameters and metrics automatically (a minimal sketch follows this list). 

  • Use infer_signature() to avoid shape mismatch errors when deploying models. 

  • Clean your feature types early — Fabric will enforce schema at inference time. 
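
As an example of the first tip, an autologging variant of the training cell could look like this (autolog must be enabled before the model is fitted):

```python
# Instrument scikit-learn so parameters, metrics, and the model itself are
# logged automatically for every fit performed inside a run.
mlflow.autolog()

with mlflow.start_run(run_name="autologged_run"):
    LinearRegression().fit(X_train, y_train)
```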


Reviewing Results in Microsoft Fabric 

After the notebook completes: 


  • Navigate to the MLflow experiments pane in Fabric. 

  • Select your experiment (house_price_experiment). 

  • Review each run, view metrics like R² and RMSE, and inspect logged artifacts. 

  • Use the Compare Runs tool to analyze multiple models and select the best one. 


Conclusion 

This walkthrough has illustrated how Microsoft Fabric, in combination with MLflow, creates a robust and auditable machine learning environment. With minimal setup, data scientists can: 


  • Load and prepare data directly from Fabric’s Lakehouse 

  • Train models using familiar Python libraries 

  • Automatically log all relevant metadata, metrics, and models 

  • Visualize experiment results and compare multiple runs 

  • Register and version models for deployment and reuse 


The integration promotes not only technical productivity, but also collaboration, transparency, and governance across the ML lifecycle. Whether you're iterating on models or deploying into production, this framework ensures every step is documented, reproducible, and ready for scale.

 
 
 