Description
I am working on a notebook for this. Pull request to be opened soon.
Describe the feature you'd like
Add a new example notebook demonstrating the complete end-to-end workflow from training a PyTorch model to deploying it for inference on SageMaker, with MLflow 3.x tracking and model registry integration.
How would this feature be used? Please describe.
Users who want to use MLflow with the SageMaker V3 SDK currently lack a comprehensive example of the full workflow. This notebook would demonstrate:
- Connecting to a SageMaker MLflow tracking server
- Training a PyTorch model with ModelTrainer while logging metrics and parameters to MLflow
- Registering the trained model in the MLflow Model Registry
- Deploying directly from the MLflow registry using ModelBuilder
- Testing the deployed endpoint
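The tracking steps in the list above could look roughly like the following. This is a minimal sketch, not the notebook's final code: the helper name `log_training_run`, its arguments, and the metric name `train_loss` are hypothetical, and it assumes the `sagemaker-mlflow` plugin is installed so that the tracking server ARN can be passed to `mlflow.set_tracking_uri`. Model registration is left out here.

```python
from typing import Mapping, Sequence


def log_training_run(
    tracking_arn: str,
    experiment_name: str,
    params: Mapping[str, object],
    epoch_losses: Sequence[float],
) -> str:
    """Log hyperparameters and per-epoch losses to MLflow; return the run ID."""
    # Imported lazily so this helper can be defined where MLflow is not installed.
    import mlflow

    # Assumption: the sagemaker-mlflow plugin is installed, which lets the
    # tracking server ARN be used directly as the tracking URI.
    mlflow.set_tracking_uri(tracking_arn)
    mlflow.set_experiment(experiment_name)

    with mlflow.start_run() as run:
        # Record all hyperparameters in one call.
        mlflow.log_params(dict(params))
        # Record one loss value per epoch, using the epoch index as the step.
        for epoch, loss in enumerate(epoch_losses):
            mlflow.log_metric("train_loss", loss, step=epoch)
        return run.info.run_id
```

Inside a training script this might be called as `log_training_run(tracking_server_arn, "pytorch-e2e", {"lr": 1e-3}, losses)`, where `tracking_server_arn` and the experiment name are placeholders.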
Example workflow:
# Train with MLflow logging
model_trainer = ModelTrainer(training_image=..., source_code=...)
model_trainer.train()
# Deploy from MLflow registry
model_builder = ModelBuilder(
    model_metadata={"MLFLOW_MODEL_PATH": "models:/my-model/1", ...}
)
model_builder.build()
model_builder.deploy()
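The `MLFLOW_MODEL_PATH` value in the example workflow uses MLflow's `models:/<name>/<version>` registry URI form. A small helper like the one below could build the metadata dict so the notebook does not hard-code the string. The function names and the `extra` parameter are hypothetical; only the `MLFLOW_MODEL_PATH` key comes from the example above, and any other keys that ModelBuilder needs would have to be supplied by the caller.

```python
from typing import Dict, Optional


def registry_model_uri(name: str, version: int) -> str:
    """Build an MLflow Model Registry URI of the form models:/<name>/<version>."""
    if not name:
        raise ValueError("model name must be non-empty")
    if version < 1:
        raise ValueError("model version must be a positive integer")
    return f"models:/{name}/{version}"


def build_model_metadata(
    name: str,
    version: int,
    extra: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a model_metadata dict pointing at a registered model version."""
    # Caller-supplied keys go in first, so MLFLOW_MODEL_PATH always wins.
    metadata = dict(extra or {})
    metadata["MLFLOW_MODEL_PATH"] = registry_model_uri(name, version)
    return metadata
```

The result would then be passed as `ModelBuilder(model_metadata=build_model_metadata("my-model", 1, extra=...))`.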
Describe alternatives you've considered
Existing notebooks cover training or inference separately, but none show the integrated MLflow workflow end-to-end.
Additional context
Target location: v3-examples/ml-ops-examples/
Notebook name: v3-mlflow-train-inference-e2e-example.ipynb
Prerequisites: SageMaker MLflow App (tracking server ARN)