Amazon ECS task definitions for AWS Neuron machine learning workloads
You can use Amazon EC2 Trn1, Trn2, Inf1, and Inf2 instances for machine learning workloads on Amazon ECS.
Amazon EC2 Trn1 and Trn2 instances are powered by AWS Trainium.
Amazon EC2 Inf1 and Inf2 instances are powered by AWS Inferentia.
Machine learning models are deployed to containers using AWS Neuron.
Considerations
Before you begin deploying Neuron on Amazon ECS, consider the following:
- Depending on the launch type, your clusters can contain a mix of Trn1, Trn2, Inf1, Inf2, and other instances.
- You need a Linux application in a container that uses a machine learning framework that supports AWS Neuron.
  Important
  Applications that use other frameworks might not have improved performance on Trn1, Trn2, Inf1, and Inf2 instances.
- Amazon ECS supports two approaches for configuring Neuron device access:
  - Managed Neuron device allocation – Use the resourceRequirements parameter with type NeuronDevice in your container definition. Amazon ECS automatically discovers and assigns Neuron devices to your containers. Available on Managed Instances only. For more information, see Managed Neuron device allocation.
  - Manual Neuron device specification – Use the linuxParameters.devices parameter to explicitly specify Neuron device paths. Available on both EC2 launch type and Managed Instances. For more information, see Manual Neuron device specification.
  Important
  Use only one approach consistently to avoid conflicts.
Managed Neuron device allocation
With Managed Instances, you can use the resourceRequirements
parameter in your container definition to request Neuron devices. Amazon ECS automatically
discovers Neuron devices on the instance, assigns them to your task, and configures
the container with access to all Neuron devices on the instance. Because the task
requires exclusive access to all devices, only one Neuron task runs per
instance.
Note
Inf1 instances are only supported on the EC2 launch type.
To use Inf1 instances, see Manual Neuron device specification.
Neuron instance selection
To select Neuron-enabled instance types for your Managed Instances workloads,
use the instanceRequirements object in the launch template of the
capacity provider. You can use the following attributes to select Neuron-enabled
instances:
- acceleratorManufacturers – Use amazon-web-services to select instances with AWS accelerators (includes Inferentia and Trainium).
- acceleratorNames – Use inferentia2, trainium, or trainium2 to select specific accelerator chips.
- allowedInstanceTypes – Use inf* and trn* to select Neuron instance types by name.
The following example uses allowedInstanceTypes:
{ "instanceRequirements": { "allowedInstanceTypes": ["inf*", "trn*"] } }
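Selection by accelerator attribute instead of by instance name could look like the following minimal Python sketch, which builds the same kind of fragment and prints it as JSON. It uses only the attribute names and values listed above. The rest of the launch template is omitted, and whether other fields are also required is not covered here.

```python
import json

# Attribute names and values come from the list above. The choice of
# inferentia2 and trainium2 is only an illustration.
instance_requirements = {
    "instanceRequirements": {
        "acceleratorManufacturers": ["amazon-web-services"],
        "acceleratorNames": ["inferentia2", "trainium2"],
    }
}

print(json.dumps(instance_requirements, indent=2))
```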
Task definition
To request Neuron devices in your task definition, add a
resourceRequirements entry with type NeuronDevice
and value ALL. This gives the container exclusive access to all
Neuron devices on the instance.
The following constraints apply:
- At most one container definition can specify NeuronDevice in resourceRequirements.
- You can't combine resourceRequirements with type NeuronDevice and linuxParameters.devices for Neuron devices in the same task definition.
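As a rough illustration of the request and the two constraints above, the following Python sketch builds a container definition with a NeuronDevice entry and checks a task definition locally before registration. The family name, container name, and image are hypothetical, and the helper functions are not part of any AWS SDK. Other task definition fields are omitted.

```python
import json


def neuron_container(name, image):
    """Container definition that requests managed Neuron device allocation."""
    return {
        "name": name,
        "image": image,
        "essential": True,
        "resourceRequirements": [{"type": "NeuronDevice", "value": "ALL"}],
    }


def uses_managed_neuron(container):
    """True if the container requests NeuronDevice in resourceRequirements."""
    return any(
        r.get("type") == "NeuronDevice"
        for r in container.get("resourceRequirements", [])
    )


def uses_manual_neuron(container):
    """True if the container maps a /dev/neuron* path in linuxParameters.devices."""
    devices = container.get("linuxParameters", {}).get("devices", [])
    return any(d.get("hostPath", "").startswith("/dev/neuron") for d in devices)


def check_neuron_constraints(task_definition):
    """Return a list of constraint violations (empty when none are found)."""
    containers = task_definition.get("containerDefinitions", [])
    managed = [c.get("name") for c in containers if uses_managed_neuron(c)]
    manual = [c.get("name") for c in containers if uses_manual_neuron(c)]
    problems = []
    if len(managed) > 1:
        problems.append(f"more than one container requests NeuronDevice: {managed}")
    if managed and manual:
        problems.append("NeuronDevice and linuxParameters.devices are combined")
    return problems


# Hypothetical names and image, for illustration only.
task_definition = {
    "family": "neuron-managed-example",
    "containerDefinitions": [
        neuron_container("inference", "example.com/my-neuron-app:latest")
    ],
}

print(json.dumps(task_definition, indent=2))
print(check_neuron_constraints(task_definition))
```

A local check like this only mirrors the constraints as stated in this section. Amazon ECS remains the authority on what it accepts.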
After your task starts, you can verify the Neuron device assignment by calling
the DescribeTasks API operation. The response includes a
neuronDeviceIds field on each container that shows the IDs of the
assigned Neuron devices. You can also call the
DescribeContainerInstances API operation to view
NEURON_DEVICES in the registeredResources and
remainingResources fields for the container instance.
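The verification step can be sketched in Python as a small helper that reads neuronDeviceIds from a DescribeTasks response. The helper name is hypothetical, the sample response is hand-written, and the device ID values in it are placeholders, since the ID format is not described here. In practice the response would come from an SDK or CLI call to DescribeTasks.

```python
def neuron_devices_by_container(describe_tasks_response):
    """Map (taskArn, container name) to the container's assigned Neuron device IDs."""
    assignments = {}
    for task in describe_tasks_response.get("tasks", []):
        for container in task.get("containers", []):
            key = (task.get("taskArn"), container.get("name"))
            # Containers without Neuron devices are reported as an empty list.
            assignments[key] = container.get("neuronDeviceIds", [])
    return assignments


# Hand-written sample. The ARN and the ID values are placeholders.
sample_response = {
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:111122223333:task/example/abc",
            "containers": [
                {"name": "inference", "neuronDeviceIds": ["0", "1"]},
                {"name": "sidecar"},
            ],
        }
    ]
}

for (task_arn, name), device_ids in neuron_devices_by_container(sample_response).items():
    print(task_arn, name, device_ids)
```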
For an example task definition, see Example Neuron task definitions.
Manual Neuron device specification
With this approach, you manually specify AWS Trainium or AWS Inferentia
device paths in your task definition using the linuxParameters.devices
parameter. This approach works on both the EC2 launch type and Managed
Instances.
Only one inference or inference-training task can run on each AWS Trainium or AWS Inferentia instance.
For the EC2 launch type, you can use instance type attributes when you configure task placement constraints to ensure that the task is launched on the instance type you specify. For more information, see How Amazon ECS places tasks on container instances.
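A placement constraint of this kind might be built as in the following Python sketch, which uses the ecs.instance-type attribute in a memberOf expression. The helper name is hypothetical, and trn1.2xlarge is only an example value.

```python
import json


def instance_type_constraint(instance_type):
    """Placement constraint that restricts a task to one instance type."""
    return {
        "type": "memberOf",
        "expression": f"attribute:ecs.instance-type == {instance_type}",
    }


# Fragment that could be merged into a task definition for the EC2 launch type.
placement = {"placementConstraints": [instance_type_constraint("trn1.2xlarge")]}
print(json.dumps(placement, indent=2))
```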
Task definition requirements
The task definition must be specific to a single instance type. You must configure
a container to use specific AWS Trainium or AWS Inferentia devices that are
available on the host container instance. You can do so using the
linuxParameters parameter. The following table details the chips
that are specific to each instance type.
| Instance Type | vCPUs | RAM (GiB) | AWS ML accelerator chips | Device Paths |
|---|---|---|---|---|
| trn1.2xlarge | 8 | 32 | 1 | /dev/neuron0 |
| trn1.32xlarge | 128 | 512 | 16 | /dev/neuron0, /dev/neuron1, /dev/neuron2, /dev/neuron3, /dev/neuron4, /dev/neuron5, /dev/neuron6, /dev/neuron7, /dev/neuron8, /dev/neuron9, /dev/neuron10, /dev/neuron11, /dev/neuron12, /dev/neuron13, /dev/neuron14, /dev/neuron15 |
| trn2.48xlarge | 192 | 1536 | 16 | /dev/neuron0, /dev/neuron1, /dev/neuron2, /dev/neuron3, /dev/neuron4, /dev/neuron5, /dev/neuron6, /dev/neuron7, /dev/neuron8, /dev/neuron9, /dev/neuron10, /dev/neuron11, /dev/neuron12, /dev/neuron13, /dev/neuron14, /dev/neuron15 |
| inf1.xlarge | 4 | 8 | 1 | /dev/neuron0 |
| inf1.2xlarge | 8 | 16 | 1 | /dev/neuron0 |
| inf1.6xlarge | 24 | 48 | 4 | /dev/neuron0, /dev/neuron1, /dev/neuron2, /dev/neuron3 |
| inf1.24xlarge | 96 | 192 | 16 | /dev/neuron0, /dev/neuron1, /dev/neuron2, /dev/neuron3, /dev/neuron4, /dev/neuron5, /dev/neuron6, /dev/neuron7, /dev/neuron8, /dev/neuron9, /dev/neuron10, /dev/neuron11, /dev/neuron12, /dev/neuron13, /dev/neuron14, /dev/neuron15 |
| inf2.xlarge | 8 | 16 | 1 | /dev/neuron0 |
| inf2.8xlarge | 32 | 64 | 1 | /dev/neuron0 |
| inf2.24xlarge | 96 | 384 | 6 | /dev/neuron0, /dev/neuron1, /dev/neuron2, /dev/neuron3, /dev/neuron4, /dev/neuron5 |
| inf2.48xlarge | 192 | 768 | 12 | /dev/neuron0, /dev/neuron1, /dev/neuron2, /dev/neuron3, /dev/neuron4, /dev/neuron5, /dev/neuron6, /dev/neuron7, /dev/neuron8, /dev/neuron9, /dev/neuron10, /dev/neuron11 |
For an example task definition, see Example Neuron task definitions.
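Because the device paths in the table follow a regular /dev/neuronN pattern, the linuxParameters.devices entries can be generated rather than typed by hand. The following Python sketch assumes the chip counts from the table and maps each host path to the same path in the container with read and write permissions. The helper name and the choice of permissions are assumptions for illustration.

```python
import json

# Chip counts copied from the table above.
NEURON_DEVICE_COUNT = {
    "trn1.2xlarge": 1, "trn1.32xlarge": 16, "trn2.48xlarge": 16,
    "inf1.xlarge": 1, "inf1.2xlarge": 1, "inf1.6xlarge": 4, "inf1.24xlarge": 16,
    "inf2.xlarge": 1, "inf2.8xlarge": 1, "inf2.24xlarge": 6, "inf2.48xlarge": 12,
}


def neuron_device_mappings(instance_type):
    """Build linuxParameters.devices entries for every device on an instance type."""
    count = NEURON_DEVICE_COUNT[instance_type]  # KeyError for unknown types
    return [
        {
            "hostPath": f"/dev/neuron{i}",
            "containerPath": f"/dev/neuron{i}",
            "permissions": ["read", "write"],  # assumed; adjust as needed
        }
        for i in range(count)
    ]


linux_parameters = {"linuxParameters": {"devices": neuron_device_mappings("inf1.6xlarge")}}
print(json.dumps(linux_parameters, indent=2))
```

Generating the list per instance type also makes it visible that the resulting task definition is tied to that one instance type.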
Managed Instances
Managed Instances automatically use an AMI that includes the Neuron driver. No additional AMI configuration is required.
EC2 launch type
Amazon ECS provides an Amazon ECS-optimized AMI that's based on Amazon Linux 2023 for AWS Trainium and AWS Inferentia workloads. It comes with the AWS Neuron drivers and runtime for Docker. This AMI makes running machine learning inference workloads easier on Amazon ECS.
We recommend using the Amazon ECS-optimized Amazon Linux 2023 (Neuron) AMI when launching your Amazon EC2 Trn1, Inf1, and Inf2 instances.
You can retrieve the current Amazon ECS-optimized Amazon Linux 2023 (Neuron) AMI using the AWS CLI with the following command.
aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2023/neuron/recommended
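The command returns a JSON document with a Parameters list. The following Python sketch extracts the AMI ID from that output, assuming the parameter value is itself a JSON string with an image_id field. The sample output is hand-written, and its AMI ID and image name are placeholders.

```python
import json


def image_id_from_output(cli_output):
    """Extract the AMI ID from the JSON text printed by the command above."""
    response = json.loads(cli_output)
    value = response["Parameters"][0]["Value"]
    # Assumption: the value is a JSON string that carries an image_id field.
    return json.loads(value)["image_id"]


# Hand-written sample with placeholder values.
sample_output = json.dumps({
    "Parameters": [
        {
            "Name": "/aws/service/ecs/optimized-ami/amazon-linux-2023/neuron/recommended",
            "Type": "String",
            "Value": json.dumps({
                "image_id": "ami-0123456789abcdef0",
                "image_name": "placeholder-image-name",
            }),
        }
    ],
    "InvalidParameters": [],
})

print(image_id_from_output(sample_output))
```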