Manage compute for AI/ML workloads with EKS Auto Mode and Karpenter - Amazon EKS

This section covers how to manage accelerated compute (AWS Trainium, NVIDIA GPUs) for AI training and inference workloads using Amazon EKS Auto Mode or self-managed Karpenter.

EKS Auto Mode and Karpenter support two provisioning modes: dynamic provisioning and static provisioning. With dynamic provisioning, EKS Auto Mode and Karpenter provision and scale accelerated compute instances as workloads are scheduled on the cluster. With static provisioning, EKS Auto Mode and Karpenter provision and maintain a fixed number of nodes. Dynamic and static provisioning can be used in the same cluster to maintain a constant baseline capacity pool while scaling with workload demands.

EKS Auto Mode and Karpenter support all four capacity purchase options: On-Demand, Spot, Capacity Blocks for ML, and On-Demand Capacity Reservations (ODCRs). They always provision reserved capacity first, followed by Spot or On-Demand.

EKS Auto Mode vs Karpenter

Both approaches share the NodePool API, but they differ in operational ownership, resource APIs, operating system support, Spot interruption handling, and configuration flexibility.

| Feature | EKS Auto Mode | Self-managed Karpenter |
| --- | --- | --- |
| Best for | Teams who prefer managed infrastructure with minimal operational overhead. | Teams who prefer full control over node lifecycle, AMIs, OS tuning, and patching. |
| Operational model | AWS provisions and manages the Karpenter controller, GPU/Trainium drivers, device plugins, OS patching, and Spot interruption handling. | You install and operate the Karpenter controller in your cluster and own GPU/Trainium drivers, device plugins, AMI lifecycle, patching, and Spot interruption handling. |
| Compute options | On-Demand, Spot, ODCRs, Capacity Blocks for ML | On-Demand, Spot, ODCRs, Capacity Blocks for ML |
| Resource APIs | NodePool (karpenter.sh/v1), NodeClass (eks.amazonaws.com/v1) | NodePool (karpenter.sh/v1), EC2NodeClass (karpenter.k8s.aws/v1) |
| Node operating system | Bottlerocket only. NVIDIA GPU, AWS Trainium, and EFA dependencies included. | AL2023, Bottlerocket, Windows, or your own AMI. |
| Node lifetime | 21-day maximum node lifetime for security patching. Workloads must tolerate node rotation. | You define the node lifecycle through NodePool expireAfter and disruption budgets. |
| Spot interruption handling | Native. No SQS queue or Node Termination Handler required. | Your responsibility to configure and enable. |
| Fast container pulls | SOCI parallel pull included in all G, P, and Trn family instances. | Your responsibility to configure and enable. |
| EC2 placement groups | Cluster, partition, spread | Cluster, partition, spread |
| Network interface config | Not supported | Per-interface configuration for type interface or EFA-only |
| Node repair | Enabled by default, EKS node monitoring agent included | Optionally enabled, EKS node monitoring agent self-managed |
| Pricing | EKS Auto Mode management fee in addition to underlying EC2 instance cost. | Open source. You pay for the underlying EC2 instances. |
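
As a rough illustration of the Node lifetime row for self-managed Karpenter, the following NodePool fragment sets expireAfter together with a disruption budget. The NodePool name, the 14-day lifetime, the 10% budget, and the EC2NodeClass name are assumptions chosen for illustration, not recommendations from this guide.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-lifecycle-example      # hypothetical name
spec:
  disruption:
    budgets:
      - nodes: "10%"               # assumed: disrupt at most 10% of this pool's nodes at once
  template:
    spec:
      expireAfter: 336h            # assumed: replace nodes after 14 days
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # hypothetical EC2NodeClass
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["on-demand"]
```

A shorter expireAfter rotates nodes onto newer AMIs sooner, at the cost of more frequent restarts for long-running training Pods.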

Common AI/ML well-known labels

EKS Auto Mode and Karpenter expose instance labels that you can use in NodePool requirements and Pod nodeSelector or nodeAffinity to target workloads without hardcoding instance types. The label prefix differs between the two: EKS Auto Mode uses eks.amazonaws.com/ while self-managed Karpenter uses karpenter.k8s.aws/.

The tables below show the labels most relevant to AI/ML workloads that you can use in NodePools. During provisioning, EKS Auto Mode and Karpenter also apply the labels listed in the Karpenter documentation to nodes, and you can use those labels for further workload targeting.

EKS Auto Mode

For the full list, see EKS Auto Mode Supported Labels.

| Label | Example value | Description |
| --- | --- | --- |
| eks.amazonaws.com/instance-family | p5 | Instance types of similar properties but different resource quantities. |
| eks.amazonaws.com/instance-category | p | Instance category, usually the letter before the generation number. |
| eks.amazonaws.com/instance-generation | 5 | Instance type generation number within a category. |
| eks.amazonaws.com/instance-gpu-name | h100 | Name of the GPU on the instance. |
| eks.amazonaws.com/instance-gpu-manufacturer | nvidia | Name of the GPU manufacturer. |
| eks.amazonaws.com/instance-gpu-count | 8 | Number of GPUs on the instance. |
| eks.amazonaws.com/instance-gpu-memory | 81920 | Mebibytes of memory per GPU. |
| karpenter.sh/capacity-type | reserved | Capacity type: spot, on-demand, or reserved. |
| topology.kubernetes.io/zone | us-east-1a | Availability Zone. |

Self-managed Karpenter

For the full list, see Karpenter Well-Known Labels.

| Label | Example value | Description |
| --- | --- | --- |
| karpenter.k8s.aws/instance-family | p5 | Instance types of similar properties but different resource quantities. |
| karpenter.k8s.aws/instance-category | p | Instance category, usually the letter before the generation number. |
| karpenter.k8s.aws/instance-generation | 5 | Instance type generation number within a category. |
| karpenter.k8s.aws/instance-gpu-name | h100 | Name of the GPU on the instance. |
| karpenter.k8s.aws/instance-gpu-manufacturer | nvidia | Name of the GPU manufacturer. |
| karpenter.k8s.aws/instance-gpu-count | 8 | Number of GPUs on the instance. |
| karpenter.sh/capacity-type | reserved | Capacity type: spot, on-demand, or reserved. |
| topology.kubernetes.io/zone | us-east-1a | Availability Zone. |
| kubernetes.io/arch | amd64 | CPU architecture. |
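
To make the label usage concrete, here is a minimal sketch of a Pod that targets H100 nodes by GPU name instead of by instance type. It assumes self-managed Karpenter (swap the prefix to eks.amazonaws.com/ for EKS Auto Mode) and a NodePool that taints GPU nodes with nvidia.com/gpu, as the NodePool examples in this section do. The Pod name and container image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: h100-inference                             # hypothetical name
spec:
  nodeSelector:
    karpenter.k8s.aws/instance-gpu-name: h100      # use eks.amazonaws.com/instance-gpu-name on EKS Auto Mode
  tolerations:
    - key: nvidia.com/gpu                          # assumes the NodePool applies this taint
      operator: Exists
      effect: NoSchedule
  containers:
    - name: model-server
      image: example.com/my-model-server:latest    # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1                        # request one GPU from the device plugin
```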

Scheduling labels for reserved capacity

When EKS Auto Mode or Karpenter launches a node into a reservation, it adds the following labels. Use them in nodeSelector, node affinity, or NodePool requirements to route workloads.

  • karpenter.sh/capacity-type: reserved, on-demand, or spot. Indicates the capacity backing the node.

  • karpenter.k8s.aws/capacity-reservation-id: The specific reservation ID the node was launched into.

  • karpenter.k8s.aws/capacity-reservation-type: default for ODCRs, capacity-block for Capacity Blocks.

The following examples show common scheduling patterns:

Pin a Pod to one specific reservation (no fallback):

    spec:
      nodeSelector:
        karpenter.sh/capacity-type: reserved
        karpenter.k8s.aws/capacity-reservation-id: "cr-0123456789abcdef0"

Target ODCR nodes only (any ODCR, not Capacity Blocks):

    spec:
      nodeSelector:
        karpenter.sh/capacity-type: reserved
        karpenter.k8s.aws/capacity-reservation-type: default

Target any reserved capacity (ODCR or Capacity Block):

    spec:
      nodeSelector:
        karpenter.sh/capacity-type: reserved

Prefer reserved but fall back to Spot or On-Demand if unavailable:

    spec:
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: karpenter.sh/capacity-type
                    operator: In
                    values: ["reserved"]
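
One further variation, offered as a sketch and not as a pattern taken from this guide: a workload that should never land on Spot but still prefers reserved capacity can combine a required term with a preferred term. This uses only standard Kubernetes node affinity fields and assumes a NodePool exists that can provide both reserved and On-Demand capacity.

```yaml
spec:
  affinity:
    nodeAffinity:
      # Hard constraint: only reserved or On-Demand nodes are acceptable (no Spot)
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: karpenter.sh/capacity-type
                operator: In
                values: ["reserved", "on-demand"]
      # Soft preference: choose reserved capacity when it is available
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: karpenter.sh/capacity-type
                operator: In
                values: ["reserved"]
```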

Reservation expiration behavior

ODCRs and Capacity Blocks behave differently when the reservation ends. Make sure your scheduling and checkpointing strategy matches the type of reservation backing your workload.

ODCRs

An instance launched into an ODCR does not stay in that ODCR indefinitely. The ODCR can expire or be cancelled, or the instance can be manually removed from it. When EKS Auto Mode or Karpenter detects that the instance no longer belongs to an ODCR, it updates the node’s karpenter.sh/capacity-type label from reserved to on-demand. The instance keeps running as standard On-Demand capacity, and existing Pods continue running uninterrupted.

Note

Any Pod scheduled with a strict nodeSelector: karpenter.sh/capacity-type: reserved will not schedule onto the node if it has been relabeled. For workloads to survive an ODCR expiry or cancellation, use the preferredDuringSchedulingIgnoredDuringExecution pattern shown above instead of a nodeSelector.

Capacity Blocks

Unlike ODCRs, Capacity Blocks always have an end time, and EC2 terminates Capacity Block instances 30 minutes ahead of the end time (60 minutes for UltraServer instance types). Plan training and inference jobs to complete or save state before the reservation window closes. Pods that use a strict nodeSelector for a specific capacity-reservation-id go Pending once the block expires and will not reschedule elsewhere. Combine checkpointing with the flexible affinity pattern above if you need workloads to move to other capacity during Capacity Block expiry.

  • You can use reserved instances until 30 minutes before the Capacity Block end time for most instance types, or 60 minutes before the end time for UltraServer instance types.

  • EKS Auto Mode and Karpenter preemptively begin draining nodes in a Capacity Block 10 minutes before EC2 starts termination, so workloads have time to checkpoint and shut down gracefully.
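
The two bullets above combine into a simple timeline. As a worked sketch, let t_end be the Capacity Block end time. The 12:00 UTC value used afterward is a hypothetical end time chosen only to make the arithmetic concrete.

```latex
\begin{aligned}
t_{\text{terminate}} &= t_{\text{end}} - 30\ \text{min}
  && \text{(EC2 begins terminating instances)} \\
t_{\text{drain}} &= t_{\text{terminate}} - 10\ \text{min} = t_{\text{end}} - 40\ \text{min}
  && \text{(node draining begins)} \\[4pt]
\text{UltraServer:}\quad t_{\text{terminate}} &= t_{\text{end}} - 60\ \text{min}, \qquad
t_{\text{drain}} = t_{\text{end}} - 70\ \text{min}
\end{aligned}
```

With a hypothetical end time of 12:00 UTC, draining would begin at 11:20 UTC and EC2 termination at 11:30 UTC. For UltraServer instance types the same block would begin draining at 10:50 UTC and terminating at 11:00 UTC. Schedule the final checkpoint before the drain time, not before the reservation end time.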

Static capacity NodePools

EKS Auto Mode and Karpenter support static capacity NodePools, which maintain a fixed number of nodes regardless of workload demand. Static pools eliminate cold-start delays for latency-sensitive inference, and let you reserve a minimum infrastructure footprint for your cluster.

Static capacity is configured by setting the replicas field on the NodePool.

Considerations

  • Once replicas is set on a NodePool, you cannot remove it. A single NodePool cannot switch between static and dynamic capacity provisioning.

  • Static capacity NodePools are not considered for consolidation. Set limits.nodes above replicas to allow temporary scaling during AMI drift or expiration.

  • For predictable Availability Zone (AZ) distribution, create one static capacity NodePool per AZ rather than spanning multiple zones in a single pool.
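
The per-AZ consideration could look like the following sketch: two static NodePools, each pinned to a single zone. It is written for EKS Auto Mode with the default NodeClass. The NodePool names, replica counts, instance family, and zones are assumptions for illustration only.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-static-us-east-1a     # hypothetical name
spec:
  replicas: 2                     # assumed baseline for this zone
  template:
    spec:
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: "eks.amazonaws.com/instance-family"
          operator: In
          values: ["g6e"]
        - key: "topology.kubernetes.io/zone"
          operator: In
          values: ["us-east-1a"]  # one zone per pool
  limits:
    nodes: 3                      # headroom above replicas for node replacement
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-static-us-east-1b     # hypothetical name
spec:
  replicas: 2
  template:
    spec:
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: "eks.amazonaws.com/instance-family"
          operator: In
          values: ["g6e"]
        - key: "topology.kubernetes.io/zone"
          operator: In
          values: ["us-east-1b"]
  limits:
    nodes: 3
```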

EKS Auto Mode

The following static capacity NodePool uses the default EKS Auto Mode NodeClass. It maintains 4 nodes (replicas) and can grow to at most 6 nodes (limits.nodes) during node replacement. Because this is an EKS Auto Mode NodePool, the instance family requirement uses the eks.amazonaws.com/ label prefix.

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-static-inference
    spec:
      replicas: 4
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: default
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["on-demand"]
            - key: "eks.amazonaws.com/instance-family"
              operator: In
              values: ["g6e"]
            - key: "topology.kubernetes.io/zone"
              operator: In
              values: ["us-east-1a"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule
      limits:
        nodes: 6  # Allow temporary headroom during node replacement

Self-managed Karpenter

With self-managed Karpenter, static capacity is gated by the alpha StaticCapacity feature gate (introduced in Karpenter v1.8), which must be enabled in the Helm values:

    settings:
      featureGates:
        staticCapacity: true

The following NodePool references a custom EC2NodeClass named my-nodeclass. It maintains 4 nodes (replicas) and can grow to at most 6 nodes (limits.nodes) during node replacement.

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-static-inference
    spec:
      replicas: 4
      template:
        spec:
          nodeClassRef:
            group: karpenter.k8s.aws
            kind: EC2NodeClass
            name: my-nodeclass
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["on-demand"]
            - key: "karpenter.k8s.aws/instance-family"
              operator: In
              values: ["g6e"]
            - key: "topology.kubernetes.io/zone"
              operator: In
              values: ["us-east-1a"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule
      limits:
        nodes: 6  # Allow temporary headroom during node replacement

Capacity Blocks for ML

Capacity Blocks for ML allow you to reserve P-family and Trainium instances for a defined future window. They are pre-paid, so EKS Auto Mode and Karpenter model them as free and prioritize them over On-Demand and Spot. A Capacity Block for ML can have a reservation duration of 1 to 14 days, or a multiple of 7 days up to 182 days (6 months).

To use Capacity Blocks for ML with EKS Auto Mode or Karpenter, configure capacityReservationSelectorTerms with your capacity reservation ID in your NodeClass. You cannot use open reservation matching with Capacity Blocks for ML. A term can specify an ID, a set of tags, or instance match criteria. A tag term selects all capacity reservations accessible from the account that carry the matching tags, and you can restrict it further by specifying an owner account ID.

For more examples, see the Karpenter documentation.

EKS Auto Mode

Create a NodeClass that references your Capacity Block reservation, then create a NodePool that uses it.

With consolidateAfter: Never set, Karpenter will not attempt to replace, merge, or terminate nodes to reduce cost or pack workloads more efficiently. This is recommended for Capacity Blocks because the capacity is already pre-paid.

    apiVersion: eks.amazonaws.com/v1
    kind: NodeClass
    metadata:
      name: capacity-block-gpu
    spec:
      capacityReservationSelectorTerms:
        - id: "cr-0123456789abcdef0"  # Your Capacity Block reservation ID
        # Alternative: select by tags
        # - tags:
        #     role: "production-inference"
        #   owner: "012345678901"
    ---
    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-capacity-block
    spec:
      disruption:
        consolidationPolicy: WhenEmpty
        consolidateAfter: Never
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: capacity-block-gpu
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["reserved"]
            # EKS Auto Mode uses the eks.amazonaws.com/ label prefix
            - key: "eks.amazonaws.com/instance-family"
              operator: In
              values: ["p5", "p5e", "p5en", "p4d"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule

Self-managed Karpenter

Create an EC2NodeClass that includes AMI, subnet, and security group selectors in addition to capacityReservationSelectorTerms, then create a NodePool that uses it.

With consolidateAfter: Never set, Karpenter will not attempt to replace, merge, or terminate nodes to reduce cost or pack workloads more efficiently. This is recommended for Capacity Blocks because the capacity is already pre-paid.

    apiVersion: karpenter.k8s.aws/v1
    kind: EC2NodeClass
    metadata:
      name: capacity-block-gpu
    spec:
      # Assumed node IAM role name; an EC2NodeClass needs role or instanceProfile
      role: "KarpenterNodeRole-ml-cluster"
      amiSelectorTerms:
        - alias: al2023@latest
      subnetSelectorTerms:
        - tags:
            karpenter.sh/discovery: ml-cluster
      securityGroupSelectorTerms:
        - tags:
            karpenter.sh/discovery: ml-cluster
      capacityReservationSelectorTerms:
        - id: "cr-0123456789abcdef0"  # Your Capacity Block reservation ID
        # Alternative: select by tags
        # - tags:
        #     role: "production-inference"
        #   owner: "012345678901"
    ---
    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-capacity-block
    spec:
      disruption:
        consolidationPolicy: WhenEmpty
        consolidateAfter: Never
      template:
        spec:
          nodeClassRef:
            group: karpenter.k8s.aws
            kind: EC2NodeClass
            name: capacity-block-gpu
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["reserved"]
            - key: "karpenter.k8s.aws/instance-family"
              operator: In
              values: ["p5", "p5e", "p5en", "p4d"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule

On-Demand Capacity Reservations (ODCRs)

ODCRs guarantee capacity in a specific Availability Zone (AZ) without a long-term commitment. You’re billed at standard On-Demand rates whether the capacity is used or not. ODCRs support all NVIDIA GPU families, including G-family instances that aren’t supported by Capacity Blocks for ML. Because you pay for an ODCR whether or not it is used, EKS Auto Mode and Karpenter model ODCR capacity as free and prioritize it over On-Demand and Spot.

ODCRs behave differently from Capacity Blocks for ML at the end of the reservation. When an ODCR expires or is cancelled, the instance keeps running as standard On-Demand. See Reservation expiration behavior for details.

To use ODCRs with EKS Auto Mode or Karpenter, configure capacityReservationSelectorTerms in your NodeClass. Each term can select by reservation ID, by a set of tags, or by instance match criteria. A tag term selects all capacity reservations accessible from the account that carry the matching tags. An instance match criteria term selects reservations by their matching behavior: open (matches all compatible instances) or targeted (matches only explicitly targeted instances). You can restrict a term further by specifying an owner account ID.
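
The three kinds of selector term might look like the following fragment. This is a sketch only: the tag key and value are hypothetical, and the field name instanceMatchCriteria is an assumption that you should verify against the NodeClass or EC2NodeClass CRD version installed in your cluster.

```yaml
# Fragment of a NodeClass or EC2NodeClass spec (not a complete manifest)
spec:
  capacityReservationSelectorTerms:
    # Term 1: select one reservation by ID
    - id: "cr-0987654321fedcba0"
    # Term 2: select every accessible reservation that carries this tag
    - tags:
        team: "ml-platform"          # hypothetical tag
    # Term 3: select reservations by matching behavior
    # ASSUMPTION: field name; confirm in your CRD before use
    - instanceMatchCriteria: open
```

Listing several terms widens the set of reservations a node can launch into, so a single ID term is the narrowest choice.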

For more examples, see the Karpenter documentation.

EKS Auto Mode

Create a NodeClass with capacityReservationSelectorTerms and a NodePool that prioritizes reserved with on-demand fallback. Pin topology.kubernetes.io/zone to the ODCR’s AZ:

    apiVersion: eks.amazonaws.com/v1
    kind: NodeClass
    metadata:
      name: odcr-gpu-production
    spec:
      capacityReservationSelectorTerms:
        - id: "cr-0987654321fedcba0"
        # Alternative: select by tags
        # - tags:
        #     Purpose: "production-inference"
        #   owner: "012345678901"
    ---
    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-reserved-production
    spec:
      disruption:
        consolidationPolicy: WhenEmpty
        consolidateAfter: Never
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: odcr-gpu-production
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["reserved", "on-demand"]
            # EKS Auto Mode uses the eks.amazonaws.com/ label prefix
            - key: "eks.amazonaws.com/instance-family"
              operator: In
              values: ["p5", "g6e"]
            - key: "topology.kubernetes.io/zone"
              operator: In
              values: ["us-east-1a"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule

Self-managed Karpenter

Create an EC2NodeClass with AMI, subnet, and security group selectors in addition to capacityReservationSelectorTerms, then create the NodePool:

    apiVersion: karpenter.k8s.aws/v1
    kind: EC2NodeClass
    metadata:
      name: odcr-gpu-production
    spec:
      # Assumed node IAM role name; an EC2NodeClass needs role or instanceProfile
      role: "KarpenterNodeRole-ml-cluster"
      amiSelectorTerms:
        - alias: al2023@latest
      subnetSelectorTerms:
        - tags:
            karpenter.sh/discovery: ml-cluster
      securityGroupSelectorTerms:
        - tags:
            karpenter.sh/discovery: ml-cluster
      capacityReservationSelectorTerms:
        - id: "cr-0987654321fedcba0"
    ---
    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-reserved-production
    spec:
      disruption:
        consolidationPolicy: WhenEmpty
        consolidateAfter: Never
      template:
        spec:
          nodeClassRef:
            group: karpenter.k8s.aws
            kind: EC2NodeClass
            name: odcr-gpu-production
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["reserved", "on-demand"]
            - key: "karpenter.k8s.aws/instance-family"
              operator: In
              values: ["p5", "g6e"]
            - key: "topology.kubernetes.io/zone"
              operator: In
              values: ["us-east-1a"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule

On-Demand

On-Demand is the default capacity type and can be used with static or dynamic provisioning in EKS Auto Mode and Karpenter. You can explicitly request On-Demand instances by setting karpenter.sh/capacity-type: on-demand in your NodePool. EKS Auto Mode and Karpenter select the lowest-priced instance that satisfies the Pod’s resource requests. Use On-Demand for development, prototyping, unpredictable inference scaling, and any workload that needs immediate availability without interruption risk.

EKS Auto Mode

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-ondemand
    spec:
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: default
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["on-demand"]
            # EKS Auto Mode uses the eks.amazonaws.com/ label prefix
            - key: "eks.amazonaws.com/instance-family"
              operator: In
              values: ["g6", "g6e", "g7e"]
            - key: "eks.amazonaws.com/instance-gpu-manufacturer"
              operator: In
              values: ["nvidia"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule

Self-managed Karpenter

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-ondemand
    spec:
      template:
        spec:
          nodeClassRef:
            group: karpenter.k8s.aws
            kind: EC2NodeClass
            name: default
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["on-demand"]
            - key: "karpenter.k8s.aws/instance-family"
              operator: In
              values: ["g6", "g6e", "g7e"]
            - key: "karpenter.k8s.aws/instance-gpu-manufacturer"
              operator: In
              values: ["nvidia"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule
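
For completeness, a workload that lands on the gpu-ondemand NodePool could look like the sketch below. It assumes that nodes carry the karpenter.sh/nodepool label with the NodePool name, and it tolerates the nvidia.com/gpu taint set in the NodePools above. The Deployment name, app label, replica count, and image are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-inference                  # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: gpu-inference
  template:
    metadata:
      labels:
        app: gpu-inference
    spec:
      nodeSelector:
        karpenter.sh/nodepool: gpu-ondemand   # assumes this label is present on the pool's nodes
      tolerations:
        - key: nvidia.com/gpu                 # matches the NodePool taint
          operator: Equal
          value: "true"
          effect: NoSchedule
      containers:
        - name: server
          image: example.com/my-inference-server:latest   # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1               # sizes the node to at least one GPU per Pod
```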

Spot

Spot offers up to 90% savings versus On-Demand by using spare EC2 capacity. AWS can reclaim Spot instances with a 2-minute interruption notice. Maximize availability by listing multiple instance families on the NodePool. Pair Spot workloads with a PodDisruptionBudget and checkpoint to durable storage (Amazon S3 or Amazon EFS) at regular intervals so Pods can save state during the drain window.
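
A minimal sketch of that pairing follows, assuming a training workload labeled app: trainer. The names, the maxUnavailable value, the grace period, and the image are hypothetical. The PodDisruptionBudget limits how many Pods are evicted at once during voluntary disruption such as a node drain. It does not prevent the Spot reclaim itself.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: trainer-pdb                    # hypothetical name
spec:
  maxUnavailable: 1                    # assumed: evict at most one trainer Pod at a time
  selector:
    matchLabels:
      app: trainer
---
apiVersion: v1
kind: Pod
metadata:
  name: trainer-0                      # hypothetical name
  labels:
    app: trainer
spec:
  # Assumed value: keep shutdown shorter than the 2-minute Spot notice
  terminationGracePeriodSeconds: 90
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: trainer
      image: example.com/my-trainer:latest   # placeholder; should write a checkpoint on SIGTERM
      resources:
        limits:
          nvidia.com/gpu: 1
```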

Spot is a good fit for fault-tolerant, resumable training and inference workloads where occasional interruption is acceptable in exchange for significant cost savings.

Common candidates include:

  • Hyperparameter tuning and sweeps: many short, parallel trials that can be retried if interrupted.

  • Distributed training with checkpointing: long-running jobs that periodically save state to S3 or FSx and can resume from the last checkpoint after node loss.

  • Batch and offline inference: large-scale scoring jobs against datasets where end-to-end latency is measured in hours, not seconds.

  • Data preprocessing and feature engineering pipelines: parallel transformations over large datasets.

  • Model evaluation and benchmarking: repeatable jobs that produce idempotent results.

  • Development, prototyping, and notebooks: interactive experimentation where users can tolerate occasional restarts.

Avoid Spot for latency-sensitive real-time inference, SLA-bound production endpoints, and workloads that don’t checkpoint or can’t tolerate restarts.

You can explicitly request Spot instances by setting karpenter.sh/capacity-type: spot in your NodePool.

EKS Auto Mode

EKS Auto Mode handles Spot interruptions natively. No SQS queue or Node Termination Handler is required.

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-spot
    spec:
      disruption:
        budgets:
          - nodes: "10%"
        consolidationPolicy: WhenEmpty
        consolidateAfter: 1h
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: default
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["spot"]
            # EKS Auto Mode uses the eks.amazonaws.com/ label prefix
            - key: "eks.amazonaws.com/instance-family"
              operator: In
              values: ["g6", "g6e", "g7e"]
            - key: "eks.amazonaws.com/instance-gpu-manufacturer"
              operator: In
              values: ["nvidia"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule
      # karpenter.sh/v1 limits is a flat map of resource name to quantity
      limits:
        nvidia.com/gpu: "64"

Self-managed Karpenter

Self-managed Karpenter requires you to enable native interruption handling on the Karpenter controller (not on the NodePool) by configuring an interruption queue: an SQS queue that receives EC2 Spot interruption and Rebalance Recommendation events. You configure this once at install time.

If you install Karpenter directly with Helm, set settings.interruptionQueue in your values.yaml:

    # karpenter values.yaml (Helm)
    settings:
      clusterName: my-cluster
      interruptionQueue: my-queue  # Name of the SQS queue receiving Spot events

If you bootstrap Karpenter with eksctl, set withSpotInterruptionQueue: true in your cluster config file. eksctl creates the SQS queue and EventBridge rules and configures the Karpenter controller to use them.

    # eksctl ClusterConfig
    karpenter:
      version: "${KARPENTER_VERSION}"
      withSpotInterruptionQueue: true

Once the controller is set up to use your queue, no extra configuration is needed on individual NodePool resources. The interruption handling applies cluster-wide.

    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: gpu-spot
    spec:
      disruption:
        budgets:
          - nodes: "10%"
        consolidationPolicy: WhenEmpty
        consolidateAfter: 1h
      template:
        spec:
          nodeClassRef:
            group: karpenter.k8s.aws
            kind: EC2NodeClass
            name: default
          requirements:
            - key: "karpenter.sh/capacity-type"
              operator: In
              values: ["spot"]
            - key: "karpenter.k8s.aws/instance-family"
              operator: In
              values: ["g6", "g6e", "g7e"]
            - key: "karpenter.k8s.aws/instance-gpu-manufacturer"
              operator: In
              values: ["nvidia"]
          taints:
            - key: nvidia.com/gpu
              value: "true"
              effect: NoSchedule
      # karpenter.sh/v1 limits is a flat map of resource name to quantity
      limits:
        nvidia.com/gpu: "64"