Choosing an approach Attribution behind an LLM gateway

Track usage and costs in Amazon Bedrock

Amazon Bedrock provides multiple ways to attribute model inference usage and costs to specific users, teams, applications, environments, or experiments. You can use a single mechanism or combine several. For example, use IAM principal attribution for per-user visibility alongside projects for per-application tagging, and request metadata for per-call experiment tracking.

Tip

If you're not sure which mechanism fits your use case, start with the Frequently asked questions at the end of this chapter. It answers common decision questions like "I want per-user, per-prompt attribution — what are my choices?" and "What's the difference between classic CUR and CUR 2.0?".

Choosing an approach

The cost attribution method you choose depends on which dimension you want to track, which Amazon Bedrock APIs you use, and what level of granularity you need. The following two tables present complementary views. Use the first to look up mechanisms by your goal, and the second to compare mechanisms side by side.

Choose by goal

If you know what you want out of cost tracking, start here.

If your goal is…	Use
Per-user or per-team dollars on your bill	IAM principal attribution
Per-application or per-workload dollars	Application inference profiles (`bedrock-runtime`), or Projects and Workspaces (`bedrock-mantle`)
Per-prompt token usage and cost, sliced by any dimension	Per-request metadata tagging, with model invocation logs
Per-user and per-prompt detail	Model invocation logs, with the user taken from the `identity` ARN or a request-metadata tag
Both invoice-accurate dollars and per-prompt detail	Combine a native method (for example, IAM principal attribution) with Per-request metadata tagging

Compare mechanisms

The following table compares the available mechanisms by what they let you attribute by, what they output, the granularity of that output, where the data is delivered, and which endpoints they support.

Mechanism	Attribute by	Output	Granularity	Data destination	Supported APIs
IAM principal attribution	IAM identity	Billed dollars	Aggregated, per usage type per day	AWS Cost Explorer / CUR 2.0	InvokeModel, Converse, Chat Completions
Application inference profiles	Profile resource tags	Billed dollars	Aggregated, per usage type per day	AWS Cost Explorer / CUR 2.0	InvokeModel, Converse, Chat Completions
Projects	Project resource tags	Billed dollars	Aggregated, per usage type per day	AWS Cost Explorer / CUR 2.0	Responses, Chat Completions
Workspaces	Project resource tags via workspace header	Billed dollars	Aggregated, per usage type per day	AWS Cost Explorer / CUR 2.0	Anthropic Messages
Per-request metadata tagging	Per-request key-value tags	Token counts (you convert to cost)	Per request	Invocation logs only	InvokeModel, InvokeModelWithResponseStream, Converse, ConverseStream

Note

The native methods (IAM principal attribution, Application inference profiles, Projects, and Workspaces) deliver aggregated billed dollars to AWS Cost Explorer and CUR 2.0. The finest grain is per usage type per day, attributed by identity or tag; they do not produce a per-request row. For per-prompt detail, use model invocation logs, where each call is a separate record carrying its own token counts.

Attribution behind an LLM gateway

When a gateway or proxy calls Amazon Bedrock on behalf of many users, Amazon Bedrock records the gateway's IAM role as the caller's identity. To preserve user-level attribution, choose based on the output you need.

For per-user dollars in your billing tools, have the gateway assume its Amazon Bedrock role per user or tenant, using a per-user RoleSessionName or session tags. Cache the resulting credentials for the session lifetime to avoid an AWS STS call on every request. For more information, see IAM principal attribution.
For per-prompt detail, set the user in request metadata on each call. Request metadata varies per request without additional AWS STS calls, which session tags cannot do on a shared session.

Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Best practices

IAM principal attribution