AWS Adds Multi-Turn RL to SageMaker Agentic AI

Amazon Web Services (AWS) has launched multi-turn reinforcement learning (RL) in Amazon SageMaker AI, a fully managed, serverless model customization technique for complex, multi-step agentic tasks. The capability removes the need to build and maintain custom training infrastructure and lets developers fine-tune foundation models, including Qwen 3.6 27B, Nova Lite 2.0, GPT-OSS-20B, and Gemma 31B, directly against their own agent environments. Because multi-turn RL scores and rewards the full sequence of decisions an agent makes over a task, rather than a single response, it is designed to specialize smaller, more cost-effective models so that they match or outperform much larger general-purpose models on targeted enterprise workloads.
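To make the idea of rewarding a whole sequence of decisions concrete, here is a minimal sketch in Python of one common way to spread a single end-of-task score back across every turn of an agent episode. It is an illustration only: the function name `assign_turn_returns`, the `discount` parameter, and the assumption of one reward at the end of the task are hypothetical and are not taken from the SageMaker AI announcement or SDK.

```python
from typing import List


def assign_turn_returns(num_turns: int, final_reward: float,
                        discount: float = 1.0) -> List[float]:
    """Credit each turn of an episode with the (discounted) task outcome.

    Assumes a single reward arrives only when the task ends. Turn t
    (0-indexed) receives final_reward * discount ** (turns remaining after t),
    so earlier turns get less credit when discount < 1.
    """
    if num_turns < 1:
        raise ValueError("an episode needs at least one turn")
    return [
        final_reward * discount ** (num_turns - 1 - t)
        for t in range(num_turns)
    ]


# Hypothetical three-turn episode (search, read, answer) that ended in success.
returns = assign_turn_returns(num_turns=3, final_reward=1.0, discount=0.5)
print(returns)
```

With `discount=1.0`, every turn receives the same credit as the final outcome, which is the simplest form of trajectory-level scoring.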

The service integrates natively with execution environments such as Amazon Bedrock AgentCore Runtime, Amazon EKS, and Amazon EC2, and it manages the full training loop from trajectory collection through checkpointing. Built-in MLflow tracking and evaluation metrics such as pass@k give teams visibility into agent traces and performance before deployment. Multi-turn RL is available through SageMaker Studio and the SageMaker Python SDK in the US West (Oregon) and US East (N. Virginia) Regions. The serverless offering scales automatically, and customers pay only for the tokens processed, with the aim of shortening time to market for production-grade AI agents.
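For readers unfamiliar with pass@k, the metric estimates the chance that at least one of k sampled attempts at a task succeeds. The sketch below shows the widely used unbiased estimator, 1 - C(n-c, k) / C(n, k), computed from n attempts of which c passed. The announcement does not say how SageMaker AI computes the metric, so treat this as an assumption about the standard definition; the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled attempts, c of which succeeded.

    Returns the probability that a random size-k subset of the n attempts
    contains at least one success: 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failures exist, so every size-k subset has a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: an agent solved a task in 3 of 10 sampled runs.
print(pass_at_k(n=10, c=3, k=1))
print(pass_at_k(n=10, c=3, k=5))
```

With k=1, the estimate reduces to the plain success rate c / n; larger k rewards a model that succeeds at least occasionally across repeated attempts.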

Read More: Amazon SageMaker AI launches multi-turn reinforcement learning for AI agent model customization
