Manage EFA devices on Amazon EKS
Elastic Fabric Adapter (EFA) is a network device for Amazon EC2 instances that enables high-performance inter-node communication and RDMA (Remote Direct Memory Access) for artificial intelligence, machine learning, and High Performance Computing (HPC) workloads. Amazon EKS supports two mechanisms for managing EFA devices in EKS clusters: the EFA Dynamic Resource Allocation (DRA) driver (DRANET) and the EFA device plugin.
We recommend the EFA DRA driver (DRANET) for new deployments on EKS clusters running Kubernetes version 1.34 or later with EKS managed node groups or self-managed node groups. The EFA DRA driver lets you configure topology-aware allocation that pairs EFA interfaces with their topologically local GPUs or Neuron devices, and it supports device sharing between Pods.
The EFA DRA driver is not supported with Karpenter or EKS Auto Mode. Use the EFA device plugin with Karpenter and EKS Auto Mode. The EFA device plugin also remains supported for EKS managed node groups and self-managed nodes.
EFA DRA driver vs. EFA device plugin
| Feature | EFA DRA driver | EFA device plugin |
|---|---|---|
| Minimum Kubernetes version | 1.34 | All EKS-supported Kubernetes versions |
| EKS compute | Managed node groups, self-managed nodes | EKS Auto Mode, Karpenter, managed node groups, self-managed nodes |
| EKS-optimized AMIs | AL2023 (NVIDIA, Neuron), Bottlerocket | AL2023 (NVIDIA, Neuron), Bottlerocket |
| Device advertisement | Rich attributes via `ResourceSlice` objects | Integer count via the `vpc.amazonaws.com/efa` extended resource |
| GPU-EFA affinity | DRA-native topology awareness | Automatic topology awareness (EKS-optimized AL2023 AMIs only) |
| Neuron-EFA affinity | DRA-native topology awareness | Automatic topology awareness (EKS-optimized AL2023 AMIs only) |
| Device sharing | Multiple Pods can share the same EFA device through a shared `ResourceClaim` | Not supported; each EFA device is exclusively allocated to one Pod |
Creating EKS nodes with EFA interfaces
When you create EKS nodes with EFA interfaces, the EFA interfaces are attached to the instance during instance provisioning. You can customize the per-device EFA configuration and use placement groups with Karpenter, EKS managed node groups, or EKS self-managed node groups. With Karpenter, you pass configuration for each network interface via the NodeClass. With EKS managed node groups or self-managed nodes, you pass configuration for each network interface with launch templates. EKS Auto Mode support for per-device EFA configuration and placement groups is coming soon.
When you use eksctl to provision EKS nodes with the efaEnabled setting, all interfaces are configured with interface type EFA, an EFA-specific security group is created, and the EFA device plugin is installed on the cluster. If you need to customize the per-device EFA configuration when using eksctl, it is recommended to use eksctl's support for launch templates.
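The following is a minimal sketch of an eksctl ClusterConfig that uses the efaEnabled setting on a managed node group. The cluster name, region, instance type, and sizing are placeholders for illustration; consult the eksctl documentation for the full schema.

```yaml
# Minimal eksctl ClusterConfig sketch; names, region, and sizing are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-efa-cluster      # hypothetical cluster name
  region: us-west-2
managedNodeGroups:
  - name: efa-ng
    instanceType: p5.48xlarge
    minSize: 2
    maxSize: 2
    availabilityZones: ["us-west-2a"]  # EFA node groups are placed in a single Availability Zone
    efaEnabled: true  # configures all interfaces as EFA, creates an EFA security group, installs the EFA device plugin
```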
The following examples show how to configure a NodeClass and a launch template with EFA interfaces. This is useful for customizing which interfaces are used for EFA versus standard IP-based traffic. For information on the number of EFA interfaces supported by each instance type and how to configure them for maximum network bandwidth, see Maximize network bandwidth for EFA-enabled instance types in the Amazon EC2 User Guide.
Karpenter
Each entry in networkInterfaces specifies a networkCardIndex, deviceIndex, and interfaceType. The interfaceType can be interface for standard network interfaces or efa-only for EFA interfaces that are dedicated to RDMA traffic and do not have IP addresses assigned. When networkInterfaces is configured, instances launched by the NodePool referencing the NodeClass use this configuration regardless of whether Pods request vpc.amazonaws.com/efa resources.
When using Karpenter without specifying networkInterfaces in your NodeClass, instances created for Pods requesting vpc.amazonaws.com/efa have all interfaces configured with interface type EFA.
The networkInterfaces configuration for EC2NodeClass was added in Karpenter v1.11. The following example shows an EC2NodeClass configured for a P6-B200 instance with 1 ENA interface and 8 EFA-only interfaces.
```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: efa-node-class
spec:
  networkInterfaces:
  - networkCardIndex: 0
    deviceIndex: 0
    interfaceType: interface
  - networkCardIndex: 0
    deviceIndex: 1
    interfaceType: efa-only
  - networkCardIndex: 1
    deviceIndex: 0
    interfaceType: efa-only
  - networkCardIndex: 2
    deviceIndex: 0
    interfaceType: efa-only
  - networkCardIndex: 3
    deviceIndex: 0
    interfaceType: efa-only
  - networkCardIndex: 4
    deviceIndex: 0
    interfaceType: efa-only
  - networkCardIndex: 5
    deviceIndex: 0
    interfaceType: efa-only
  - networkCardIndex: 6
    deviceIndex: 0
    interfaceType: efa-only
  - networkCardIndex: 7
    deviceIndex: 0
    interfaceType: efa-only
```
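A NodePool then selects this NodeClass through its nodeClassRef. The following is a minimal sketch; the NodePool name and the instance-type requirement are illustrative assumptions, not part of the example above.

```yaml
# Minimal NodePool sketch referencing the EC2NodeClass above; names are illustrative.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: efa-node-pool      # hypothetical name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: efa-node-class
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["p6-b200.48xlarge"]
```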
EKS managed node groups and self-managed nodes
With EKS managed node groups or self-managed nodes, you pass configuration for each network interface with launch templates.
The following example shows a launch template configured for a P6-B200 instance with 1 ENA interface and 8 EFA-only interfaces. The primary network interface (network card 0, device index 0) uses a standard interface type for IP traffic, while additional interfaces use efa-only for dedicated RDMA traffic. Adjust the number of efa-only interfaces based on your instance type. For the number of EFA interfaces supported by each instance type, see Maximize network bandwidth for EFA-enabled instance types in the Amazon EC2 User Guide.
Replace `security-group-id` with your value. The security group must allow all inbound and outbound traffic to and from itself to enable EFA OS-bypass functionality. For more information, see Step 1: Prepare an EFA-enabled security group in the Amazon EC2 User Guide.
Important
Do not specify SubnetId in the launch template when using EKS managed node groups. EKS requires that all subnets are specified through the CreateNodegroup API and rejects launch templates that include subnet configuration.
{ "LaunchTemplateName": "efa-launch-template", "LaunchTemplateData": { "InstanceType": "p6-b200.48xlarge", "NetworkInterfaces": [ { "NetworkCardIndex": 0, "DeviceIndex": 0, "InterfaceType": "interface", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 0, "DeviceIndex": 1, "InterfaceType": "efa-only", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 1, "DeviceIndex": 0, "InterfaceType": "efa-only", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 2, "DeviceIndex": 0, "InterfaceType": "efa-only", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 3, "DeviceIndex": 0, "InterfaceType": "efa-only", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 4, "DeviceIndex": 0, "InterfaceType": "efa-only", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 5, "DeviceIndex": 0, "InterfaceType": "efa-only", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 6, "DeviceIndex": 0, "InterfaceType": "efa-only", "Groups": ["security-group-id"] }, { "NetworkCardIndex": 7, "DeviceIndex": 0, "InterfaceType": "efa-only", "Groups": ["security-group-id"] } ] } }
Using EKS-optimized AMIs with EFA
The EKS-optimized AL2023 accelerated AMIs (NVIDIA and Neuron) and all Bottlerocket AMIs include the host-level components required to use EFA, specifically the components installed by the aws-efa-installer. The EKS AL2023 and Bottlerocket AMIs do not include the EFA DRA driver or EFA device plugin, and these must be installed separately on your cluster before deploying workloads.
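To confirm that the host-level EFA components are present on a node, you can run the fi_info utility provided by the aws-efa-installer, for example from a shell on the node. This is an optional spot check, not a required step.

```bash
# Lists libfabric providers; EFA-capable interfaces appear with "provider: efa".
fi_info -p efa
```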
Conserving IP address allocation
EFA-enabled instances such as p5.48xlarge and p6-b200.48xlarge support many network interfaces. By default, the Amazon VPC CNI allocates IP addresses across all IP-enabled attached ENIs, which can consume a large number of IP addresses from your subnet even when those addresses are not actively used by Pods. On instances with dozens of network interfaces, this can quickly exhaust your subnet’s available IP space.
To reduce IP address consumption on EFA-enabled nodes, configure your network interfaces to use efa-only for all interfaces except the primary. EFA-only interfaces are dedicated to RDMA traffic and do not have IP addresses assigned, so they do not consume addresses from your subnet. For example configurations, see Karpenter and EKS managed node groups and self-managed nodes. For the recommended interface layout for each instance type, see Maximize network bandwidth for EFA-enabled instance types in the Amazon EC2 User Guide.
In addition to using efa-only interfaces, you can configure the Amazon VPC CNI to limit the number of warm (pre-allocated) IP addresses and ENIs. By default, the VPC CNI pre-allocates a warm pool of ENIs and IP addresses for faster Pod startup, but on large instances this can reserve hundreds of unused IP addresses. Set the WARM_IP_TARGET and WARM_ENI_TARGET environment variables on the aws-node DaemonSet to control how many spare IP addresses and ENIs the CNI maintains. For more information on these settings, see Amazon VPC CNI best practices.
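As a sketch, you can set these variables on the aws-node DaemonSet with kubectl. The values shown are illustrative only and should be tuned to the Pod density you expect per node.

```bash
# Illustrative values: keep 5 spare IP addresses warm and no spare warm ENIs.
kubectl set env daemonset aws-node -n kube-system WARM_IP_TARGET=5 WARM_ENI_TARGET=0
```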
Note
The WARM_ENI_TARGET and WARM_IP_TARGET settings are cluster-wide and apply to all nodes managed by the VPC CNI. There is currently no way to set different values per node group or instance type. If you need more granular control of these settings, provide feedback on containers-roadmap issue #1834.
Install the EFA DRA driver (DRANET)
The EFA DRA driver is built on the upstream DRANET project.
The EFA DRA driver advertises EFA devices as ResourceSlice objects with the driver name dra.net and the DeviceClass name efa.networking.k8s.aws. The EFA DRA driver runs as a DaemonSet on each node and automatically discovers EFA devices.
Prerequisites
- An Amazon EKS cluster running Kubernetes version 1.34 or later with EKS managed node groups or self-managed node groups.
- Nodes with EFA-enabled Amazon EC2 instance types. For a list of supported instance types, see Supported instance types in the Amazon EC2 User Guide.
- Nodes with the host-level EFA components installed. For more information, see Install the EFA software. The EKS-optimized AL2023 NVIDIA and Neuron AMIs and the Bottlerocket AMIs include the EFA host-level components.
- Helm installed in your command-line environment. For more information, see the Setup Helm instructions.
- `kubectl` configured to communicate with your cluster. For more information, see Install or update kubectl.
Procedure
Important
Do not install the EFA DRA driver on nodes where the EFA device plugin is running. The two mechanisms cannot coexist on the same node. See the upstream Kubernetes KEP-5004 for more information.
1. Add the EKS Helm chart repository.

   ```bash
   helm repo add eks https://aws.github.io/eks-charts
   ```

2. Update your local Helm repository.

   ```bash
   helm repo update
   ```

3. Install the EFA DRA driver on your cluster using Helm. The EFA DRA driver automatically detects that it is running on EC2 instances via the Instance Metadata Service (IMDS) and enables EFA device discovery. The EFA DRA driver is deployed as a DaemonSet in the `kube-system` namespace by default. See the Helm values.yaml in the EKS Helm chart GitHub repository for the configurable parameters.

   ```bash
   helm install aws-dranet eks/aws-dranet --namespace kube-system
   ```

4. Verify that the DRANET DaemonSet is running.

   ```bash
   kubectl get daemonset -n kube-system aws-dranet
   ```

   An example output is as follows.

   ```
   NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
   aws-dranet   2         2         2       2            2           <none>          60s
   ```

5. Verify that the `DeviceClass` was created.

   ```bash
   kubectl get deviceclass
   ```

   ```
   NAME                     AGE
   efa.networking.k8s.aws   60s
   ```

6. Verify that `ResourceSlice` objects are advertised for your nodes.

   ```bash
   kubectl get resourceslices --field-selector spec.driver=dra.net
   ```

   If you experience errors with the steps above, you can check the DRANET logs with the following command.

   ```bash
   kubectl logs -n kube-system -l app=aws-dranet
   ```

7. To request EFA devices using the DRA driver, create a `ResourceClaim` or `ResourceClaimTemplate` that references the EFA `DeviceClass`, and reference it in your Pod specification. The following example requests a single EFA device.

   ```yaml
   apiVersion: resource.k8s.io/v1
   kind: ResourceClaimTemplate
   metadata:
     name: single-efa-claim
   spec:
     spec:
       devices:
         requests:
         - name: efa
           exactly:
             deviceClassName: efa.networking.k8s.aws
             count: 1
   ---
   apiVersion: v1
   kind: Pod
   metadata:
     name: efa-workload
   spec:
     containers:
     - name: app
       ...
       resources:
         claims:
         - name: efa-device
     resourceClaims:
     - name: efa-device
       resourceClaimTemplateName: single-efa-claim
   ```
Topology-aware EFA and GPU/Neuron device allocation
The EFA DRA driver supports topology-aware allocation that pairs EFA interfaces with GPUs or Neuron devices on the same PCIe root. Use the matchAttribute constraint to align EFA and GPU or Neuron device allocations. To use this capability, you must also use the NVIDIA or Neuron DRA drivers. For more information, see Manage NVIDIA GPU devices on Amazon EKS and Manage Neuron devices on Amazon EKS.
The following example requests 1 EFA interface aligned with 1 NVIDIA GPU:
```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: aligned-efa-nvidia
spec:
  spec:
    devices:
      requests:
      - name: 1-efa
        exactly:
          deviceClassName: efa.networking.k8s.aws
          count: 1
      - name: 1-gpu
        exactly:
          deviceClassName: gpu.nvidia.com
          count: 1
      constraints:
      - requests: ["1-gpu", "1-efa"]
        matchAttribute: "resource.kubernetes.io/pcieRoot"
```
The following example requests 4 EFA interfaces aligned with 4 Neuron devices:
```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: aligned-efa-neuron
spec:
  spec:
    devices:
      requests:
      - name: 4-neurons
        exactly:
          deviceClassName: neuron.aws.com
          count: 4
      - name: 4-efas
        exactly:
          deviceClassName: efa.networking.k8s.aws
          count: 4
      constraints:
      - requests: ["4-neurons", "4-efas"]
        matchAttribute: "resource.aws.com/devicegroup4_id"
```
The number in the devicegroup attribute name corresponds to the number of Neuron devices in the connected topology group. For example, resource.aws.com/devicegroup1_id identifies a single Neuron device, resource.aws.com/devicegroup4_id identifies a group of 4 connected devices, and resource.aws.com/devicegroup8_id and resource.aws.com/devicegroup16_id identify groups of 8 and 16 connected devices, respectively. Choose the matchAttribute that matches the device count in your request so that the allocated Neuron devices and EFA interfaces belong to the same connected topology group. For more information on these attributes, see the Neuron DRA driver documentation.
You can use allocationMode to simplify how EFA devices are allocated to aligned GPU or Neuron accelerators. The allocationMode field supports two values: ExactCount (the default) requests a specific number of devices specified by count, and All requests all matching devices in a pool. For example, on p5.48xlarge instances there are four EFA devices that share the same PCIe root with one GPU. To allocate these groups of EFA devices with aligned GPUs, even if you do not know the exact EFA-GPU device mapping and count of aligned EFA devices, you can configure your ResourceClaimTemplate with allocationMode: All for the EFA devices.
```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: aligned-all-efa-one-nvidia
spec:
  spec:
    devices:
      requests:
      - name: all-efas
        exactly:
          deviceClassName: efa.networking.k8s.aws
          allocationMode: All
      - name: one-gpu
        exactly:
          deviceClassName: gpu.nvidia.com
          allocationMode: ExactCount
          count: 1
      constraints:
      - requests: ["all-efas", "one-gpu"]
        matchAttribute: "resource.kubernetes.io/pcieRoot"
```
Share EFA devices between multiple Pods
The EFA DRA driver supports sharing EFA devices between multiple Pods by using a ResourceClaim. Unlike a ResourceClaimTemplate, which generates a separate claim for each Pod, a ResourceClaim is a named object that you create independently and reference from multiple Pods. All Pods that reference the same ResourceClaim share access to the same allocated EFA devices and are scheduled to the same node where those devices are available.
To share EFA devices between Pods, create a ResourceClaim that requests the EFA devices, then reference that claim by name in each Pod’s resourceClaims field using resourceClaimName. The ResourceClaim must exist in the cluster before the Pods that reference it are created. If a referenced ResourceClaim does not exist, the Pods remain in a pending state until the claim is created.
The following example creates a ResourceClaim that requests 4 EFA devices, and two Pods that share access to those devices.
1. Create the `ResourceClaim`.

   ```yaml
   apiVersion: resource.k8s.io/v1
   kind: ResourceClaim
   metadata:
     name: shared-efa
   spec:
     devices:
       requests:
       - name: efa
         exactly:
           deviceClassName: efa.networking.k8s.aws
           count: 4
   ```

2. Reference the `ResourceClaim` by name in each Pod that needs access to the EFA devices. Each Pod uses `resourceClaimName` to reference the existing claim instead of `resourceClaimTemplateName`.

   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     name: training-worker
   spec:
     containers:
     - name: worker
       image: my-training-image
       resources:
         claims:
         - name: efa-devices
     resourceClaims:
     - name: efa-devices
       resourceClaimName: shared-efa
   ---
   apiVersion: v1
   kind: Pod
   metadata:
     name: training-monitor
   spec:
     containers:
     - name: monitor
       image: my-monitor-image
       resources:
         claims:
         - name: efa-devices
     resourceClaims:
     - name: efa-devices
       resourceClaimName: shared-efa
   ```
Both Pods reference the same shared-efa `ResourceClaim` and are scheduled to the node where those EFA devices are allocated. The ResourceClaim lifecycle is independent of the Pods; it persists until you delete it, even if all Pods referencing it are removed.
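To see how the shared claim was allocated and which Pods are currently reserving it, you can inspect the claim object directly. This is an optional check, not a required step.

```bash
# The status section shows the allocated devices and the Pods reserving the claim.
kubectl get resourceclaim shared-efa -o yaml
```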
Install the EFA Kubernetes device plugin
The EFA Kubernetes device plugin advertises EFA devices as vpc.amazonaws.com/efa extended resources. You request EFA devices in container resource requests and limits. For a complete walkthrough of setting up EFA with training workloads, see Run machine learning training on Amazon EKS with Elastic Fabric Adapter.
Important
Topology-aligned allocation of NVIDIA GPUs or Neuron devices with EFA interfaces happens automatically when using the EKS-optimized AL2023 accelerated AMIs. This automatic alignment does not occur when using Bottlerocket EKS-optimized AMIs or custom AMIs. If you need topology-aligned accelerator and EFA device allocation with Bottlerocket or custom AMIs, use the EFA DRA driver and the corresponding Neuron DRA driver. The NVIDIA DRA driver is not supported on Bottlerocket. For more information, see Topology-aware EFA and GPU/Neuron device allocation.
Important
Starting with NVIDIA k8s-device-plugin v0.19.0, the --mofed-enabled flag defaults to true, which causes the NVIDIA device plugin to mount all /dev/infiniband/uverbs* devices into containers requesting GPUs. This conflicts with the EFA device plugin, which should be the component managing EFA device allocation at /dev/infiniband. If you are using EKS managed node groups or self-managed nodes with the NVIDIA device plugin, you must explicitly disable MOFED. For instructions, see Install the NVIDIA Kubernetes device plugin.
EKS Auto Mode does not enable MOFED by default and is not affected by this issue.
Prerequisites
- An Amazon EKS cluster.
- Nodes with EFA-enabled Amazon EC2 instance types. For a list of supported instance types, see Supported instance types in the Amazon EC2 User Guide.
- Nodes with the host-level EFA components installed. For more information, see Install the EFA software. The EKS-optimized AL2023 NVIDIA and Neuron AMIs and the Bottlerocket AMIs include the EFA host-level components.
- Helm installed in your command-line environment. For more information, see the Setup Helm instructions.
- `kubectl` configured to communicate with your cluster. For more information, see Install or update kubectl.
Procedure
1. Add the EKS Helm chart repository.

   ```bash
   helm repo add eks https://aws.github.io/eks-charts
   ```

2. Update your local Helm repository.

   ```bash
   helm repo update
   ```

3. Install the EFA device plugin.

   ```bash
   helm install efa eks/aws-efa-k8s-device-plugin -n kube-system
   ```

4. Verify that the EFA device plugin DaemonSet is running.

   ```bash
   kubectl get daemonset -n kube-system efa-aws-efa-k8s-device-plugin
   ```

   An example output is as follows.

   ```
   NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
   efa-aws-efa-k8s-device-plugin   2         2         2       2            2           <none>          60s
   ```

5. Verify that your nodes have allocatable EFA resources.

   ```bash
   kubectl get nodes "-o=custom-columns=NAME:.metadata.name,EFA:.status.allocatable.vpc\.amazonaws\.com/efa"
   ```

   An example output is as follows.

   ```
   NAME                                            EFA
   ip-192-168-11-225.us-west-2.compute.internal    4
   ip-192-168-24-96.us-west-2.compute.internal     4
   ```

6. To request EFA devices using the device plugin, specify the `vpc.amazonaws.com/efa` resource in your container resource requests and limits.

   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     name: efa-workload
   spec:
     containers:
     - name: app
       ...
       resources:
         limits:
           vpc.amazonaws.com/efa: 4
           hugepages-2Mi: ...
         requests:
           vpc.amazonaws.com/efa: 4
           hugepages-2Mi: ...
   ```