
Releases: aws/sagemaker-python-sdk

v2.257.0

03 Feb 22:26
5b3b127


v2.257.0 (2026-02-03)

Features

  • Update image URIs for DJL 0.36.0 release

v3.4.0

23 Jan 00:33
6abd5b6


Features

  • Add emr-serverless step for SageMaker Pipelines

Bug fixes and Other Changes

  • Add Nova recipe training support in ModelTrainer
  • Add Partner-app Auth provider
  • Add sagemaker dependency for remote function by default in V3

v2.256.1

21 Jan 23:53
511ff6a


Bug fixes and Other Changes

  • Bug fixes for remote function

v3.3.1

13 Jan 04:33
4ed9d91


Bug fixes and Other Changes

  • ProcessingJob fix: remove tags from the Processor during job creation
  • Telemetry updates
  • sagemaker-mlops bug fix: correct the source code 'dependencies' parameter to 'requirements'
  • aws_batch bug fix: remove the experiment config parameter, as Estimator is deprecated

v2.256.0

09 Jan 00:25
a140cfc


Features

  • Image with NumPy 2.0 support for XGBoost

Bug fixes and Other Changes

  • Bug fix for the Triton model server for inference
  • Removal of hmac key parameter for remote function
  • Bug fixes for input validation for local mode and resource management for iterators

v3.3.0

20 Dec 05:40
36fd82f


Features

  • AWS_Batch: queueing of training jobs with ModelTrainer

Bug fixes and Other Changes

  • Fixes for model registry with ModelBuilder

v3.2.0

20 Dec 05:38
36fd82f


Features

  • Evaluator handshake with trainer
  • Datasets Format validation

Bug fixes and Other Changes

  • Add xgboost 3.0-5 to release
  • Fix get_child_process_ids parsing issue

v3.1.1

11 Dec 20:13
9da502a


Bug fixes and Other Changes

  • Fine-tuning SDK:
    • Add validation for Bedrock reward models
    • Fix hyperparameter issues; add validation for the S3 output path
    • Fix recipe selection when multiple recipes are available
    • Handle wait() timeout exceptions during training
    • Update example notebooks to reflect recent code changes

v3.1.0

03 Dec 23:50
0443307


Model Fine-Tuning Support in SageMaker Python SDK V3

We’re excited to introduce comprehensive model fine-tuning capabilities in the SageMaker Python SDK V3, bringing state-of-the-art fine-tuning techniques to production ML workflows. Fine-tune foundation models with enterprise features including automated experiment tracking, serverless infrastructure, and integrated evaluation—all with just a few lines of code.

What's New

The SageMaker Python SDK V3 now includes four specialized Fine-Tuning Trainers for different fine-tuning techniques. Each trainer is optimized for specific use cases, following established research and industry best practices:

SFTTrainer - Supervised Fine-Tuning

Fine-tune models with labeled instruction-response pairs for task-specific adaptation.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl"
)

training_job = trainer.train()

DPOTrainer - Direct Preference Optimization

Align models with human preferences using the DPO algorithm. Unlike traditional RLHF, DPO eliminates the need for a separate reward model, simplifying the alignment pipeline while achieving comparable results. Use cases: preference alignment, safety tuning, style adaptation.

from sagemaker.train import DPOTrainer

trainer = DPOTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-dpo-models",
    training_dataset="s3://bucket/preference_data.jsonl"
)

training_job = trainer.train()

RLAIFTrainer - Reinforcement Learning from AI Feedback

Leverage AI-generated feedback as reward signals using Amazon Bedrock models. RLAIF offers a scalable alternative to human feedback while maintaining quality.

from sagemaker.train import RLAIFTrainer

trainer = RLAIFTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-rlaif-models",
    reward_model_id="anthropic.claude-3-5-haiku-20241022-v1:0",
    reward_prompt="Builtin.Helpfulness",
    training_dataset="s3://bucket/rlaif_data.jsonl"
)

training_job = trainer.train()

RLVRTrainer - Reinforcement Learning from Verifiable Rewards

Train with custom, programmatic reward functions for domain-specific optimization.

from sagemaker.train import RLVRTrainer

trainer = RLVRTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-rlvr-models",
    custom_reward_function="arn:aws:sagemaker:region:account:hub-content/.../evaluator/1.0",
    training_dataset="s3://bucket/rlvr_data.jsonl"
)

training_job = trainer.train()

Key Features

Parameter-Efficient Fine-Tuning

  • LoRA (Low-Rank Adaptation): Default, memory-efficient approach
  • Full Fine-Tuning: Train all model parameters for maximum performance (see the sketch below)
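
As a quick sketch of switching between the two modes: TrainingType.LORA is confirmed by the examples above, while TrainingType.FULL is an assumed enum member for full fine-tuning, not confirmed API.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

# Default: LoRA, the memory-efficient approach
lora_trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl"
)

# Full fine-tuning: train all model parameters
full_trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.FULL,  # assumed enum value, not confirmed
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl"
)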

Built-in MLflow Integration

Automatic experiment tracking with intelligent defaults (see the sketch after this list):

  • Auto-resolves MLflow tracking servers
  • Domain-aware server selection
  • Automatic experiment and run management
  • Provides ongoing visibility into performance and loss metrics during training
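
If you need to pin a specific tracking server instead of relying on auto-resolution, a minimal sketch; mlflow_resource_arn is an assumed trainer parameter, mirrored from the evaluator APIs shown later, not confirmed for trainers.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl",
    mlflow_resource_arn="arn:aws:sagemaker:..."  # assumed parameter name, copied from the evaluators below
)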

Dynamic Hyperparameter Management

Discover and customize training hyperparameters with built-in validation:

# View available hyperparameters
trainer.hyperparameters.get_info()

# Customize training
trainer.hyperparameters.learning_rate = 0.0001
trainer.hyperparameters.max_epochs = 3
trainer.hyperparameters.lora_alpha = 32

Continued Fine-Tuning

Build on previously fine-tuned models for iterative improvement:

from sagemaker.core.resources import ModelPackage

# Use a previously fine-tuned model
base_model = ModelPackage.get(
    model_package_name="arn:aws:sagemaker:region:account:model-package/..."
)

trainer = SFTTrainer(
    model=base_model,  # Continue from fine-tuned model
    training_type=TrainingType.LORA,
    model_package_group_name="my-models-v2"
)

Flexible Dataset Support

Multiple input formats with automatic validation (sketched after the list):

  • S3 URIs: s3://bucket/path/data.jsonl
  • SageMaker AI Registry Dataset ARNs
  • DataSet objects with validation
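
All three forms feed the same training_dataset argument. A sketch of the string forms; the S3 URI matches the trainer examples above, while the Registry ARN shape and the DataSet import path are assumptions and left elided.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

# Option 1: plain S3 URI, as in the trainer examples above
dataset = "s3://bucket/path/data.jsonl"

# Option 2: SageMaker AI Registry Dataset ARN (assumed string form)
# dataset = "arn:aws:sagemaker:region:account:..."

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset=dataset
)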

Serverless Training

No infrastructure management required—just specify your model and data:

  • Automatic compute provisioning
  • Managed training infrastructure
  • Pay only for training time

Enterprise-Ready

Production-ready security features (illustrated below):

  • VPC support for secure training
  • KMS encryption for outputs
  • IAM role management
  • EULA acceptance for gated models
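
A sketch of what wiring these in could look like; every keyword below (role, vpc_config, kms_key_id, accept_eula) is hypothetical, shown only to illustrate the shape of an enterprise configuration, not confirmed trainer API.

from sagemaker.train import SFTTrainer
from sagemaker.train.common import TrainingType

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    training_type=TrainingType.LORA,
    model_package_group_name="my-fine-tuned-models",
    training_dataset="s3://bucket/train.jsonl",
    role="arn:aws:iam::...",  # hypothetical: IAM execution role
    vpc_config={"Subnets": ["subnet-..."], "SecurityGroupIds": ["sg-..."]},  # hypothetical: VPC training
    kms_key_id="arn:aws:kms:...",  # hypothetical: KMS key for output encryption
    accept_eula=True  # hypothetical: EULA acceptance for gated models
)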

Model Evaluation

Comprehensive evaluation framework with three evaluator types:

  • BenchMarkEvaluator: Standard benchmarks (MMLU, MMLU_PRO, BBH, GPQA, MATH, STRONG_REJECT, IFEVAL, GEN_QA, MMMU, LLM_JUDGE, INFERENCE_ONLY)
  • CustomScorerEvaluator: Built-in metrics (PRIME_MATH, PRIME_CODE) or custom evaluators
  • LLMAsJudgeEvaluator: LLM-based evaluation with explanations

See the "Evaluating Fine-Tuned Models" section below for detailed examples.

Evaluating Fine-Tuned Models

Evaluate your fine-tuned models using standard benchmarks, custom metrics, or LLM-based evaluation.

Benchmark Evaluation

Evaluate against 11 standard benchmarks including MMLU, BBH, GPQA, MATH, and more.

Discover available benchmarks

from sagemaker.train.evaluate import BenchMarkEvaluator, get_benchmarks, get_benchmark_properties

Benchmark = get_benchmarks()
print(list(Benchmark))
# [MMLU, MMLU_PRO, BBH, GPQA, MATH, STRONG_REJECT, IFEVAL, GEN_QA, MMMU, LLM_JUDGE, INFERENCE_ONLY]

Get benchmark details

props = get_benchmark_properties(Benchmark.MMLU)
print(props['description'])
print(props['subtasks'])

Run evaluation

evaluator = BenchMarkEvaluator(
    benchmark=Benchmark.MMLU,
    model_package_arn="arn:aws:sagemaker:...",
    base_model="meta-llama/Llama-2-7b-hf",
    output_s3_location="s3://bucket/eval-results/",
    mlflow_resource_arn="arn:aws:sagemaker:..."
)

execution = evaluator.evaluate(subtask="college_mathematics")
execution.wait()
execution.show_results()

Custom Scorer Evaluation

Use built-in metrics or custom evaluators:

from sagemaker.train.evaluate import CustomScorerEvaluator, get_builtin_metrics

# Discover built-in metrics
BuiltInMetric = get_builtin_metrics()
# [PRIME_MATH, PRIME_CODE]

# Using built-in metric
evaluator = CustomScorerEvaluator(
    evaluator=BuiltInMetric.PRIME_MATH,
    dataset="s3://bucket/eval-data.jsonl",
    model_package_arn="arn:aws:sagemaker:...",
    mlflow_resource_arn="arn:aws:sagemaker:..."
)

execution = evaluator.evaluate()
execution.wait()
execution.show_results()

LLM-as-Judge Evaluation

Leverage large language models for nuanced evaluation with explanations:

from sagemaker.train.evaluate import LLMAsJudgeEvaluator

evaluator = LLMAsJudgeEvaluator(
    judge_model_id="anthropic.claude-3-5-haiku-20241022-v1:0",
    evaluation_prompt="Rate the helpfulness of this response",
    dataset="s3://bucket/eval-data.jsonl",
    model_package_arn="arn:aws:sagemaker:...",
    mlflow_resource_arn="arn:aws:sagemaker:..."
)

execution = evaluator.evaluate()
execution.wait()

# Show first 5 results
execution.show_results()

# Show next 5 with explanations
execution.show_results(limit=5, offset=5, show_explanations=True)

Deploying Fine-Tuned Models

Flexible deployment options for production inference. Deploy your fine-tuned models to SageMaker endpoints or Amazon Bedrock.

Deploy to SageMaker Endpoint

Deploy from training job

from sagemaker.core.resources import TrainingJob
from sagemaker.serve import ModelBuilder

training_job = TrainingJob.get(training_job_name="my-training-job")
model_builder = ModelBuilder(model=training_job)
model = model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")

Deploy from model package

from sagemaker.core.resources import ModelPackage

model_package = ModelPackage.get(model_package_name="arn:aws:sagemaker:...")
model_builder = ModelBuilder(model=model_package)
model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")

Deploy from trainer

trainer = SFTTrainer(...)
training_job = trainer.train()
model_builder = ModelBuilder(model=trainer)
endpoint = model_builder.deploy()
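
Once deployed, the endpoint can be invoked like any SageMaker endpoint. A minimal sketch using the boto3 SageMaker Runtime client; the payload shape depends on the serving container:

import json

import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",
    ContentType="application/json",
    Body=json.dumps({"inputs": "Tell me about Amazon SageMaker."})
)
print(response["Body"].read().decode("utf-8"))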

Deploy Multiple Adapters to Same Endpoint

Deploy multiple fine-tuned adapters to a single endpoint for cost-efficient serving:

# Deploy base model
model_builder = ModelBuilder(model=training_job)
model_builder.build()
endpoint = model_builder.deploy(endpoint_name="my-endpoint")

# Deploy adapter to same endpoint
model_builder2 = ModelBuilder(model=training_job2)
model_builder2.build()
endpoint2 = model_builder2.deploy(
    endpoint_name="my-endpoint", # Same endpoint
    inference_component_name="my-adapter" # New adapter
)
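
With several adapters on one endpoint, requests are routed per inference component; boto3's invoke_endpoint accepts an InferenceComponentName parameter for this:

import json

import boto3

runtime = boto3.client("sagemaker-runtime")

# Route the request to a specific adapter on the shared endpoint
response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",
    InferenceComponentName="my-adapter",
    ContentType="application/json",
    Body=json.dumps({"inputs": "Summarize this release."})
)
print(response["Body"].read().decode("utf-8"))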

Deploy to Amazon Bedrock

from sagemaker.serve.bedrock_model_builder import BedrockModelBuilder

training_job = TrainingJob.get(training_job_name="my-training-job")
bedrock_builder = BedrockModelBuilder(model=training_job)

deployment_result = bedrock_builder.deploy(
    job_name="my-bedrock-job",
    imported_model_name="my-bedrock-model",
    role_arn="arn:aws:iam::..."
)

Find Endpoints Using a Base Model

model_builder = ModelBuilder(model=training_job)
endpoint_names = model_builder.fetch_endpoint_names_for_base_model()
# Returns: set of endpoint name...

v3.0.1

03 Dec 18:37
869f3ee


What's Changed

Full Changelog: v3.0.0...v3.0.1

Note

This release was created retroactively for code deployed on Thu Nov 20 2025.
All changes in this release are already live in production.