
Amazon ECS task definitions for AWS Neuron machine learning workloads

You can use Amazon EC2 Trn1, Trn2, Inf1, and Inf2 instances with your clusters for machine learning workloads. Inf1 instances are supported only on the EC2 launch type.

Amazon EC2 Trn1 and Trn2 instances are powered by AWS Trainium chips. These instances provide high-performance, low-cost training for machine learning in the cloud. You can train a machine learning model using a machine learning framework with AWS Neuron on a Trn1 or Trn2 instance. Then, you can run the model on an Inf1 instance (EC2 launch type only) or an Inf2 instance to use the acceleration of the AWS Inferentia chips.

Amazon EC2 Inf1 and Inf2 instances are powered by AWS Inferentia chips. They provide high performance and the lowest cost inference in the cloud.

Machine learning models are deployed to containers using AWS Neuron, which is a specialized software development kit (SDK). The SDK consists of a compiler, runtime, and profiling tools that optimize the machine learning performance of AWS machine learning chips. AWS Neuron supports popular machine learning frameworks such as TensorFlow, PyTorch, and Apache MXNet.

Considerations

Before you begin deploying Neuron on Amazon ECS, consider the following:

- Depending on the launch type, your clusters can contain a mix of Trn1, Trn2, Inf1, Inf2, and other instances.
- You need a Linux application in a container that uses a machine learning framework that supports AWS Neuron.

Important

Applications that use other frameworks might not have improved performance on Trn1, Trn2, Inf1, and Inf2 instances.

Amazon ECS supports two approaches for configuring Neuron device access:

- Managed Neuron device allocation – Use the resourceRequirements parameter with type NeuronDevice in your container definition. Amazon ECS automatically discovers and assigns Neuron devices to your containers. Available on Managed Instances only. For more information, see Managed Neuron device allocation.
- Manual Neuron device specification – Use the linuxParameters.devices parameter to explicitly specify Neuron device paths. Available on both EC2 launch type and Managed Instances. For more information, see Manual Neuron device specification.

Important

Use only one approach consistently to avoid conflicts.

Managed Neuron device allocation

With Managed Instances, you can use the resourceRequirements parameter in your container definition to request Neuron devices. Amazon ECS automatically discovers Neuron devices on the instance, assigns them to your task, and configures the container with access to all Neuron devices on the instance. Because the task requires exclusive access to all devices, only one Neuron task runs per instance.

Note

Inf1 instances are only supported on the EC2 launch type. To use Inf1 instances, see Manual Neuron device specification.

Neuron instance selection

To select Neuron-enabled instance types for your Managed Instances workloads, use the instanceRequirements object in the launch template of the capacity provider. You can use the following attributes to select Neuron-enabled instances:

- acceleratorManufacturers – Use amazon-web-services to select instances with AWS accelerators (includes Inferentia and Trainium).
- acceleratorNames – Use inferentia2, trainium, or trainium2 to select specific accelerator chips.
- allowedInstanceTypes – Use inf* and trn* to select Neuron instance types by name.

The following example uses allowedInstanceTypes:


{
    "instanceRequirements": {
        "allowedInstanceTypes": ["inf*", "trn*"]
    }
}
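
The accelerator attributes listed earlier can be used instead of matching on instance type names. The following is a minimal sketch in the same shape as the preceding example. The particular combination of values is an illustrative assumption, and any other fields that your launch template requires are omitted.

```json
{
    "instanceRequirements": {
        "acceleratorManufacturers": ["amazon-web-services"],
        "acceleratorNames": ["inferentia2", "trainium2"]
    }
}
```

Selecting by accelerator name rather than by instance type pattern can be useful when you want a specific chip generation regardless of instance size.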

Task definition

To request Neuron devices in your task definition, add a resourceRequirements entry with type NeuronDevice and value ALL. This gives the container exclusive access to all Neuron devices on the instance.

The following constraints apply:

- At most one container definition can specify NeuronDevice in resourceRequirements.
- You can't combine resourceRequirements with type NeuronDevice and linuxParameters.devices for Neuron devices in the same task definition.
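
As a rough illustration, a container definition that requests Neuron devices this way might look like the following fragment. The container name and image are hypothetical placeholders, and the other task definition parameters that your workload needs are left out.

```json
{
    "containerDefinitions": [
        {
            "name": "neuron-app",
            "image": "<your-neuron-image>",
            "essential": true,
            "resourceRequirements": [
                {
                    "type": "NeuronDevice",
                    "value": "ALL"
                }
            ]
        }
    ]
}
```

Note that the fragment contains no linuxParameters.devices entries for Neuron devices, in keeping with the constraints above.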

After your task starts, you can verify the Neuron device assignment by calling the DescribeTasks API operation. The response includes a neuronDeviceIds field on each container that shows the IDs of the assigned Neuron devices. You can also call the DescribeContainerInstances API operation to view NEURON_DEVICES in the registeredResources and remainingResources fields for the container instance.

For an example task definition, see Example Neuron task definitions.

Manual Neuron device specification

With this approach, you manually specify AWS Trainium or AWS Inferentia device paths in your task definition using the linuxParameters.devices parameter. This approach works on both the EC2 launch type and Managed Instances.

Only one inference or training task can run on each AWS Trainium or AWS Inferentia chip. You can run as many tasks as there are chips on the instance by assigning different devices to each task.

For the EC2 launch type, you can use instance type attributes when you configure task placement constraints to ensure that the task is launched on the instance type you specify. For more information, see How Amazon ECS places tasks on container instances.
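
For instance, a task definition could pin its tasks to one instance type with a memberOf placement constraint on the ecs.instance-type attribute. The following fragment is a sketch only, and the choice of inf2.xlarge is an assumption made for illustration.

```json
{
    "placementConstraints": [
        {
            "type": "memberOf",
            "expression": "attribute:ecs.instance-type == inf2.xlarge"
        }
    ]
}
```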

Task definition requirements

The task definition must be specific to a single instance type. You must configure a container to use specific AWS Trainium or AWS Inferentia devices that are available on the host container instance. You can do so using the linuxParameters parameter. The following table details the chips that are specific to each instance type.

Instance Type	vCPUs	RAM (GiB)	AWS ML accelerator chips	Device Paths
trn1.2xlarge	8	32	1	`/dev/neuron0`
trn1.32xlarge	128	512	16	`/dev/neuron0`, `/dev/neuron1`, `/dev/neuron2`, `/dev/neuron3`, `/dev/neuron4`, `/dev/neuron5`, `/dev/neuron6`, `/dev/neuron7`, `/dev/neuron8`, `/dev/neuron9`, `/dev/neuron10`, `/dev/neuron11`, `/dev/neuron12`, `/dev/neuron13`, `/dev/neuron14`, `/dev/neuron15`
trn2.48xlarge	192	2048	16	`/dev/neuron0`, `/dev/neuron1`, `/dev/neuron2`, `/dev/neuron3`, `/dev/neuron4`, `/dev/neuron5`, `/dev/neuron6`, `/dev/neuron7`, `/dev/neuron8`, `/dev/neuron9`, `/dev/neuron10`, `/dev/neuron11`, `/dev/neuron12`, `/dev/neuron13`, `/dev/neuron14`, `/dev/neuron15`
inf1.xlarge	4	8	1	`/dev/neuron0`
inf1.2xlarge	8	16	1	`/dev/neuron0`
inf1.6xlarge	24	48	4	`/dev/neuron0`, `/dev/neuron1`, `/dev/neuron2`, `/dev/neuron3`
inf1.24xlarge	96	192	16	`/dev/neuron0`, `/dev/neuron1`, `/dev/neuron2`, `/dev/neuron3`, `/dev/neuron4`, `/dev/neuron5`, `/dev/neuron6`, `/dev/neuron7`, `/dev/neuron8`, `/dev/neuron9`, `/dev/neuron10`, `/dev/neuron11`, `/dev/neuron12`, `/dev/neuron13`, `/dev/neuron14`, `/dev/neuron15`
inf2.xlarge	4	16	1	`/dev/neuron0`
inf2.8xlarge	32	128	1	`/dev/neuron0`
inf2.24xlarge	96	384	6	`/dev/neuron0`, `/dev/neuron1`, `/dev/neuron2`, `/dev/neuron3`, `/dev/neuron4`, `/dev/neuron5`
inf2.48xlarge	192	768	12	`/dev/neuron0`, `/dev/neuron1`, `/dev/neuron2`, `/dev/neuron3`, `/dev/neuron4`, `/dev/neuron5`, `/dev/neuron6`, `/dev/neuron7`, `/dev/neuron8`, `/dev/neuron9`, `/dev/neuron10`, `/dev/neuron11`
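
To make the mapping concrete, the following sketch shows a container definition fragment that exposes the single device on an instance type with one chip, such as inf2.xlarge. The container name and image are hypothetical placeholders. On instance types with more chips, you would add one devices entry for each device path that the task should use.

```json
{
    "name": "neuron-app",
    "image": "<your-neuron-image>",
    "linuxParameters": {
        "devices": [
            {
                "containerPath": "/dev/neuron0",
                "hostPath": "/dev/neuron0",
                "permissions": ["read", "write"]
            }
        ]
    }
}
```

Keeping containerPath identical to hostPath is a design choice in this sketch, so that the device appears at the same path inside the container as on the host.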

For an example task definition, see Example Neuron task definitions.

Managed Instances

Managed Instances automatically use an AMI that includes the Neuron driver. No additional AMI configuration is required.

EC2 launch type

Amazon ECS provides an Amazon ECS-optimized AMI that's based on Amazon Linux 2023 for AWS Trainium and AWS Inferentia workloads. It comes with the AWS Neuron drivers and runtime for Docker. This AMI makes running machine learning inference workloads easier on Amazon ECS.

We recommend using the Amazon ECS-optimized Amazon Linux 2023 (Neuron) AMI when launching your Amazon EC2 Trn1, Inf1, and Inf2 instances.

You can retrieve the current Amazon ECS-optimized Amazon Linux 2023 (Neuron) AMI using the AWS CLI with the following command.


aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2023/neuron/recommended
