Releases: aws/sagemaker-python-sdk
v2.257.0
v3.4.0
Features
- feat: add emr-serverless step for SageMaker Pipelines
Bug fixes and Other Changes
- Add Nova recipe training support in ModelTrainer
- Add Partner-app Auth provider
- Add sagemaker dependency for remote function by default V3
v2.256.1
Bug fixes and Other Changes
- Bug fixes for remote function
v3.3.1
Bug fixes and Other Changes
- ProcessingJob fix - Remove tags from Processor during job creation
- Telemetry Updates
- sagemaker-mlops bug fix - Correct source code 'dependencies' parameter to 'requirements'
- aws_batch bug fix - Remove experiment config parameter as Estimator is deprecated
v2.256.0
Features
- Image for NumPy 2.0 support with XGBoost
Bug fixes and Other Changes
- Bug fix for Triton model server inference
- Remove HMAC key parameter for remote function
- Bug fixes for local mode input validation and iterator resource management
v3.3.0
Features
- AWS_Batch: queueing of training jobs with ModelTrainer
Bug fixes and Other Changes
- Fixes for model registry with ModelBuilder
v3.2.0
Features
- Evaluator handshake with trainer
- Dataset format validation
Bug fixes and Other Changes
- Add xgboost 3.0-5 to release
- Fix get_child_process_ids parsing issue
v3.1.1
Bug fixes and Other Changes
- Fine-tuning SDK:
  - Add validation for Bedrock reward models
  - Hyperparameter issue fixes; add validation for S3 output path
  - Fix recipe selection for the multiple-recipe scenario
  - Handle timeout exceptions in train() wait()
- Update example notebooks to reflect recent code changes
v3.1.0
Model Fine-Tuning Support in SageMaker Python SDK V3
We’re excited to introduce comprehensive model fine-tuning capabilities in the SageMaker Python SDK V3, bringing state-of-the-art fine-tuning techniques to production ML workflows. Fine-tune foundation models with enterprise features including automated experiment tracking, serverless infrastructure, and integrated evaluation—all with just a few lines of code.
What's New
The SageMaker Python SDK V3 now includes four specialized Fine-Tuning Trainers for different fine-tuning techniques. Each trainer is optimized for specific use cases, following established research and industry best practices:
SFTTrainer - Supervised Fine-Tuning
Fine-tune models with labeled instruction-response pairs for task-specific adaptation.
from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType
trainer = SFTTrainer(
model="meta-llama/Llama-2-7b-hf",
training_type=TrainingType.LORA,
model_package_group_name="my-fine-tuned-models",
training_dataset="s3://bucket/train.jsonl"
)
training_job = trainer.train()
DPOTrainer - Direct Preference Optimization
Align models with human preferences using the DPO algorithm. Unlike traditional RLHF, DPO eliminates the need for a separate reward model, simplifying the alignment pipeline while achieving comparable results. Use cases: preference alignment, safety tuning, style adaptation.
from sagemaker.train import DPOTrainer
from sagemaker.train.common import TrainingType
trainer = DPOTrainer(
model="meta-llama/Llama-2-7b-hf",
training_type=TrainingType.LORA,
model_package_group_name="my-dpo-models",
training_dataset="s3://bucket/preference_data.jsonl"
)
training_job = trainer.train()
RLAIFTrainer - Reinforcement Learning from AI Feedback
Leverage AI-generated feedback as reward signals using Amazon Bedrock models. RLAIF offers a scalable alternative to human feedback while maintaining quality.
from sagemaker.train import RLAIFTrainer
from sagemaker.train.common import TrainingType
trainer = RLAIFTrainer(
model="meta-llama/Llama-2-7b-hf",
training_type=TrainingType.LORA,
model_package_group_name="my-rlaif-models",
reward_model_id="anthropic.claude-3-5-haiku-20241022-v1:0",
reward_prompt="Builtin.Helpfulness",
training_dataset="s3://bucket/rlaif_data.jsonl"
)
training_job = trainer.train()
RLVRTrainer - Reinforcement Learning from Verifiable Rewards
Train with custom, programmatic reward functions for domain-specific optimization.
from sagemaker.train import RLVRTrainer
from sagemaker.train.common import TrainingType
trainer = RLVRTrainer(
model="meta-llama/Llama-2-7b-hf",
training_type=TrainingType.LORA,
model_package_group_name="my-rlvr-models",
custom_reward_function="arn:aws:sagemaker:region:account:hub-content/.../evaluator/1.0",
training_dataset="s3://bucket/rlvr_data.jsonl"
)
training_job = trainer.train()
Key Features
Parameter-Efficient Fine-Tuning
- LoRA (Low-Rank Adaptation): Default, memory-efficient approach
- Full Fine-Tuning: Train all model parameters for maximum performance (see the sketch below)
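To make the choice concrete, here is a minimal sketch of selecting the training type on SFTTrainer. The LoRA path mirrors the examples above; the full fine-tuning enum member name (TrainingType.FULL) is an assumption for illustration and should be checked against the TrainingType enum in your SDK version.
from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType
# Parameter-efficient LoRA run (the default shown throughout these examples)
lora_trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl"
)
lora_trainer.hyperparameters.lora_alpha = 32  # LoRA-specific knob (see hyperparameter section below)
# Full fine-tuning updates every model parameter; the enum member name below is an assumption.
full_trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.FULL,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl"
)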
Built-in MLflow Integration
Automatic experiment tracking with intelligent defaults:
- Auto-resolves MLflow tracking servers
- Domain-aware server selection
- Automatic experiment and run management
- Provides ongoing visibility into performance and loss metrics during training (see the sketch below for inspecting logged runs)
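As a minimal sketch, the runs the trainer logs can also be inspected directly with the standard MLflow client. This assumes the sagemaker-mlflow plugin is installed so a tracking server ARN can be used as the tracking URI; the server ARN and experiment name below are placeholders.
import mlflow
# Point the MLflow client at the managed tracking server the trainer resolved (placeholder ARN).
mlflow.set_tracking_uri("arn:aws:sagemaker:us-east-1:111122223333:mlflow-tracking-server/my-server")
# Experiment name is hypothetical; the trainer creates and manages it automatically.
runs = mlflow.search_runs(experiment_names=["my-fine-tuning-experiment"])
print(runs[["run_id", "status", "start_time"]].head())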
Dynamic Hyperparameter Management
Discover and customize training hyperparameters with built-in validation:
# View available hyperparameters
trainer.hyperparameters.get_info()
# Customize training
trainer.hyperparameters.learning_rate = 0.0001
trainer.hyperparameters.max_epochs = 3
trainer.hyperparameters.lora_alpha = 32
Continued Fine-Tuning
Build on previously fine-tuned models for iterative improvement:
from sagemaker.core.resources import ModelPackage
from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType
# Use a previously fine-tuned model
base_model = ModelPackage.get(
model_package_name="arn:aws:sagemaker:region:account:model-package/..."
)
trainer = SFTTrainer(
model=base_model, # Continue from fine-tuned model
training_type=TrainingType.LORA,
model_package_group_name="my-models-v2"
)
Flexible Dataset Support
Multiple input formats with automatic validation:
- S3 URIs: s3://bucket/path/data.jsonl
- SageMaker AI Registry Dataset ARNs
- DataSet objects with validation (see the sketch below)
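For illustration, a sketch of passing each form through the same training_dataset parameter; the Registry ARN format and the DataSet import path are assumptions.
from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType
s3_uri = "s3://bucket/path/data.jsonl"                 # plain S3 URI to a JSONL file
registry_arn = "arn:aws:sagemaker:region:account:..."  # Registry dataset ARN (placeholder format)
# A DataSet object also works; its import path here is an assumption:
# from sagemaker.train.common import DataSet
# dataset_obj = DataSet(...)
trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset=s3_uri  # or registry_arn / dataset_obj; inputs are validated before training
)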
Serverless Training
No infrastructure management required—just specify your model and data:
- Automatic compute provisioning
- Managed training infrastructure
- Pay only for training time
Enterprise-Ready
Production-ready security features:
- VPC support for secure training
- KMS encryption for outputs
- IAM role management
- EULA acceptance for gated models (see the sketch below)
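The sketch below shows where these settings would plug into a trainer. The keyword names are illustrative assumptions only, not confirmed API; check the SFTTrainer signature in your SDK version for the exact argument names.
from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType
trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl",
    # role="arn:aws:iam::111122223333:role/MyTrainingRole",                    # IAM role (assumed kwarg)
    # vpc_config={"Subnets": ["subnet-..."], "SecurityGroupIds": ["sg-..."]},  # VPC isolation (assumed kwarg)
    # kms_key_id="arn:aws:kms:region:account:key/...",                         # KMS output encryption (assumed kwarg)
    # accept_eula=True                                                         # EULA for gated models (assumed kwarg)
)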
Model Evaluation
Comprehensive evaluation framework with three evaluator types:
- BenchMarkEvaluator: Standard benchmarks (MMLU, BBH, GPQA, MATH, STRONG_REJECT, IFEVAL, GEN_QA, MMMU, LLM_JUDGE, INFERENCE_ONLY)
- CustomScorerEvaluator: Built-in metrics (PRIME_MATH, PRIME_CODE) or custom evaluators
- LLMAsJudgeEvaluator: LLM-based evaluation with explanations
See the "Evaluating Fine-Tuned Models" section below for detailed examples.
Evaluating Fine-Tuned Models
Evaluate your fine-tuned models using standard benchmarks, custom metrics, or LLM-based evaluation.
Benchmark Evaluation
Evaluate against 11 standard benchmarks including MMLU, BBH, GPQA, MATH, and more.
Discover available benchmarks
from sagemaker.train.evaluate import BenchMarkEvaluator, get_benchmarks, get_benchmark_properties
Benchmark = get_benchmarks()
print(list(Benchmark))
# [MMLU, MMLU_PRO, BBH, GPQA, MATH, STRONG_REJECT, IFEVAL, GEN_QA, MMMU, LLM_JUDGE, INFERENCE_ONLY]
Get benchmark details
props = get_benchmark_properties(Benchmark.MMLU)
print(props['description'])
print(props['subtasks'])
Run evaluation
evaluator = BenchMarkEvaluator(
benchmark=Benchmark.MMLU,
model_package_arn="arn:aws:sagemaker:...",
base_model="meta-llama/Llama-2-7b-hf",
output_s3_location="s3://bucket/eval-results/",
mlflow_resource_arn="arn:aws:sagemaker:..."
)
execution = evaluator.evaluate(subtask="college_mathematics")
execution.wait()
execution.show_results()
Custom Scorer Evaluation
Use built-in metrics or custom evaluators
from sagemaker.train.evaluate import CustomScorerEvaluator, get_builtin_metrics
# Discover built-in metrics
BuiltInMetric = get_builtin_metrics()
# [PRIME_MATH, PRIME_CODE]
# Using built-in metric
evaluator = CustomScorerEvaluator(
evaluator=BuiltInMetric.PRIME_MATH,
dataset="s3://bucket/eval-data.jsonl",
model_package_arn="arn:aws:sagemaker:...",
mlflow_resource_arn="arn:aws:sagemaker:..."
)
execution = evaluator.evaluate()
execution.wait()
execution.show_results()
LLM-as-Judge Evaluation
Leverage large language models for nuanced evaluation with explanations:
from sagemaker.train.evaluate import LLMAsJudgeEvaluator
evaluator = LLMAsJudgeEvaluator(
judge_model_id="anthropic.claude-3-5-haiku-20241022-v1:0",
evaluation_prompt="Rate the helpfulness of this response",
dataset="s3://bucket/eval-data.jsonl",
model_package_arn="arn:aws:sagemaker:...",
mlflow_resource_arn="arn:aws:sagemaker:..."
)
execution = evaluator.evaluate()
execution.wait()
# Show first 5 results
execution.show_results()
# Show next 5 with explanations
execution.show_results(limit=5, offset=5, show_explanations=True)
Deploying Fine-Tuned Models
Flexible deployment options for production inference. Deploy your fine-tuned models to SageMaker endpoints or Amazon Bedrock.
Deploy to SageMaker Endpoint
Deploy from training job
from sagemaker.core.resources import TrainingJob
from sagemaker.serve import ModelBuilder
training_job = TrainingJob.get(training_job_name="my-training-job")
model_builder = ModelBuilder(model=training_job)
model = model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")
Deploy from model package
from sagemaker.core.resources import ModelPackage
model_package = ModelPackage.get(model_package_name="arn:aws:sagemaker:...")
model_builder = ModelBuilder(model=model_package)
model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")
Deploy from trainer
trainer = SFTTrainer(...)
training_job = trainer.train()
model_builder = ModelBuilder(model=trainer)
endpoint = model_builder.deploy()
Deploy Multiple Adapters to Same Endpoint
Deploy multiple fine-tuned adapters to a single endpoint for cost-efficient serving
# Deploy base model
model_builder = ModelBuilder(model=training_job)
model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")
# Deploy adapter to same endpoint
model_builder2 = ModelBuilder(model=training_job2)
model_builder2.build()
endpoint2 = model_builder2.deploy(
endpoint_name="my-endpoint", # Same endpoint
inference_component_name="my-adapter" # New adapter
)
Deploy to Amazon Bedrock
from sagemaker.serve.bedrock_model_builder import BedrockModelBuilder
from sagemaker.core.resources import TrainingJob
training_job = TrainingJob.get(training_job_name="my-training-job")
bedrock_builder = BedrockModelBuilder(model=training_job)
deployment_result = bedrock_builder.deploy(
job_name="my-bedrock-job",
imported_model_name="my-bedrock-model",
role_arn="arn:aws:iam::..."
)
Find Endpoints Using a Base Model
model_builder = ModelBuilder(model=training_job)
endpoint_names = model_builder.fetch_endpoint_names_for_base_model()
# Returns: set of endpoint name...
v3.0.1
What's Changed
- Update pyproject.toml and prepare for v3.0.1 release by @zhaoqizqwang in #5329
Full Changelog: v3.0.0...v3.0.1
Note
This release is created retroactively for code deployed on Thu Nov 20 2025
All changes listed below are already live in production.