Automatic remediation with `jobStateTimeLimitActions`

Optionally, you can configure the jobStateTimeLimitActions parameter through the CreateJobQueue and UpdateJobQueue API actions.

Note

Currently, for job queues connected to Amazon ECS, Amazon EKS, or Fargate compute environments, the only action you can use with jobStateTimeLimitActions.action is to cancel a job.

The jobStateTimeLimitActions parameter specifies a set of actions that AWS Batch performs on jobs in a specific state. You set the time threshold, in seconds, through the maxTimeSeconds field.

When a job has remained in the RUNNABLE state with the defined statusReason for maxTimeSeconds, AWS Batch performs the specified action.

For example, you can configure the jobStateTimeLimitActions parameter to wait up to 4 hours for any job in the RUNNABLE state that is waiting for sufficient capacity to become available. To do this, set statusReason to CAPACITY:INSUFFICIENT_INSTANCE_CAPACITY and maxTimeSeconds to 14400. When the 4 hours have elapsed, AWS Batch cancels the job, which allows the next job to advance to the head of the job queue.
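The 4-hour example can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the helper names build_time_limit_action and apply_to_queue are hypothetical, and the request field names (reason, state, maxTimeSeconds, action) reflect the UpdateJobQueue request shape as we understand it, so verify them against the API reference before use. The boto3 call is wrapped in a function so that building the payload needs no AWS credentials.

```python
def build_time_limit_action(status_reason, max_time_seconds, action="CANCEL"):
    """Build one entry for the jobStateTimeLimitActions list.

    Assumption: the request uses the key "reason" for the statusReason value.
    """
    if max_time_seconds <= 0:
        raise ValueError("max_time_seconds must be a positive number of seconds")
    return {
        "reason": status_reason,        # statusReason to match
        "state": "RUNNABLE",            # state the job must be stuck in
        "maxTimeSeconds": max_time_seconds,
        "action": action,               # CANCEL is the action described above
    }


FOUR_HOURS_IN_SECONDS = 4 * 60 * 60  # 14400

actions = [
    build_time_limit_action(
        "CAPACITY:INSUFFICIENT_INSTANCE_CAPACITY",
        FOUR_HOURS_IN_SECONDS,
    )
]


def apply_to_queue(job_queue_name, time_limit_actions):
    """Send the actions to an existing job queue (requires boto3 and credentials)."""
    import boto3  # imported lazily so the payload helper works without boto3

    client = boto3.client("batch")
    return client.update_job_queue(
        jobQueue=job_queue_name,
        jobStateTimeLimitActions=time_limit_actions,
    )
```

To apply the configuration, call apply_to_queue("my-job-queue", actions), where my-job-queue is a placeholder for the name of your own job queue.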

The statusReason values returned by the ListJobs and DescribeJobs API actions are the same values you can define for the jobStateTimeLimitActions.statusReason parameter. However, not all statusReason values support automatic remediation.

The following statusReason values support jobStateTimeLimitActions:

CAPACITY:INSUFFICIENT_INSTANCE_CAPACITY
MISCONFIGURATION:COMPUTE_ENVIRONMENT_MAX_RESOURCE
MISCONFIGURATION:JOB_RESOURCE_REQUIREMENT
MISCONFIGURATION:EC2_INSTANCE_CONFIGURATION_UNSUPPORTED

The following statusReason values do not support jobStateTimeLimitActions and require manual investigation:

MISCONFIGURATION:SERVICE_ROLE_PERMISSIONS
ACTION_REQUIRED
UNDETERMINED
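The two lists above can be turned into a small triage helper. The following sketch assumes job summaries shaped like the entries returned by ListJobs, that is, dictionaries that may carry a statusReason key. The function name split_by_remediation and the sample job IDs are hypothetical. Treating jobs with no statusReason as needing manual investigation is an assumption of this sketch.

```python
# statusReason values that the lists above mark as supporting
# jobStateTimeLimitActions.
SUPPORTED_STATUS_REASONS = frozenset({
    "CAPACITY:INSUFFICIENT_INSTANCE_CAPACITY",
    "MISCONFIGURATION:COMPUTE_ENVIRONMENT_MAX_RESOURCE",
    "MISCONFIGURATION:JOB_RESOURCE_REQUIREMENT",
    "MISCONFIGURATION:EC2_INSTANCE_CONFIGURATION_UNSUPPORTED",
})


def split_by_remediation(job_summaries):
    """Split jobs into (automatic, manual) lists by their statusReason.

    Jobs whose statusReason is missing or not in the supported set are
    placed in the manual list.
    """
    automatic, manual = [], []
    for job in job_summaries:
        reason = job.get("statusReason")
        if reason in SUPPORTED_STATUS_REASONS:
            automatic.append(job)
        else:
            manual.append(job)
    return automatic, manual


# Hypothetical sample data for illustration only.
sample_jobs = [
    {"jobId": "job-1", "statusReason": "CAPACITY:INSUFFICIENT_INSTANCE_CAPACITY"},
    {"jobId": "job-2", "statusReason": "MISCONFIGURATION:SERVICE_ROLE_PERMISSIONS"},
    {"jobId": "job-3"},
]

automatic_jobs, manual_jobs = split_by_remediation(sample_jobs)
```

With the sample data, job-1 lands in the automatic list, and job-2 and job-3 land in the manual list.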
