Resume training from a checkpoint - Amazon SageMaker AI
This documentation is a draft for private preview for regions in the AWS European Sovereign Cloud. Documentation content will continue to evolve. Published: January 2, 2026.

Resume training from a checkpoint

To resume a training job from a checkpoint, create and run a new estimator that uses the same checkpoint_s3_uri you specified in the Enable checkpointing section. When the new training job starts, the checkpoints in that S3 location are restored to checkpoint_local_path on each instance of the job. Make sure that the S3 bucket is in the same Region as the current SageMaker AI session.
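The following is a minimal sketch of how the new estimator might be configured, assuming the SageMaker Python SDK v2 `Estimator` class. The helper names (`resume_estimator_kwargs`, `make_resume_estimator`), the instance type, and the instance count are hypothetical and chosen only for illustration. `/opt/ml/checkpoints` is used as the local path on the assumption that it matches the value set when checkpointing was enabled.

```python
def resume_estimator_kwargs(checkpoint_s3_uri,
                            checkpoint_local_path="/opt/ml/checkpoints"):
    """Build the checkpoint-related arguments for the new estimator.

    checkpoint_s3_uri must be the same URI that the original training
    job wrote its checkpoints to.
    """
    if not checkpoint_s3_uri.startswith("s3://"):
        raise ValueError("checkpoint_s3_uri must start with 's3://'")
    return {
        "checkpoint_s3_uri": checkpoint_s3_uri,
        "checkpoint_local_path": checkpoint_local_path,
    }


def make_resume_estimator(image_uri, role, checkpoint_s3_uri):
    """Create an estimator that reuses an existing checkpoint location.

    Hypothetical helper: requires the SageMaker Python SDK (v2 assumed),
    so the import is deferred until the function is called.
    """
    from sagemaker.estimator import Estimator

    return Estimator(
        image_uri=image_uri,            # same training image as before
        role=role,                      # IAM role with access to the bucket
        instance_count=1,               # illustrative value
        instance_type="ml.m5.xlarge",   # illustrative value
        **resume_estimator_kwargs(checkpoint_s3_uri),
    )
```

With placeholder values for the image URI, role, and bucket, calling `make_resume_estimator(image_uri, role, "s3://amzn-s3-demo-bucket/checkpoints/my-job")` and then `fit()` on the result would start a new training job that points at the existing checkpoints. Whether training actually continues from the restored state still depends on the training script loading the files it finds in the local checkpoint path.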

Architecture diagram of syncing checkpoints to resume training.