Automatic scaling of Amazon SageMaker AI models - Amazon SageMaker AI
This documentation is a draft for private preview for regions in the AWS European Sovereign Cloud. Documentation content will continue to evolve. Published: January 1, 2026.

Automatic scaling of Amazon SageMaker AI models

Amazon SageMaker AI supports automatic scaling (auto scaling) for your hosted models. Auto scaling dynamically adjusts the number of instances provisioned for a model in response to changes in your workload. When the workload increases, auto scaling brings more instances online. When the workload decreases, auto scaling removes unnecessary instances so that you don't pay for provisioned instances that you aren't using.