Track usage and costs in Amazon Bedrock
Amazon Bedrock provides multiple ways to attribute model inference usage and costs to specific users, teams, applications, environments, or experiments. You can use a single mechanism or combine several. For example, use IAM principal attribution for per-user visibility alongside projects for per-application tagging, and request metadata for per-call experiment tracking.
Tip
If you're not sure which mechanism fits your use case, start with the Frequently asked questions at the end of this chapter. It answers common decision questions like "I want per-user, per-prompt attribution — what are my choices?" and "What's the difference between classic CUR and CUR 2.0?".
Choosing an approach
The cost attribution method you choose depends on which dimension you want to track, which Amazon Bedrock APIs you use, and what level of granularity you need. The following two tables present complementary views. Use the first to look up mechanisms by your goal, and the second to compare mechanisms side by side.
Choose by goal
If you know what you want out of cost tracking, start here.
| If your goal is… | Use |
|---|---|
| Per-user or per-team dollars on your bill | IAM principal attribution |
| Per-application or per-workload dollars | Application inference profiles (bedrock-runtime), or Projects and Workspaces (bedrock-mantle) |
| Per-prompt token usage and cost, sliced by any dimension | Per-request metadata tagging, with model invocation logs |
| Per-user and per-prompt detail | Model invocation logs, with the user taken from the identity ARN or a request-metadata tag |
| Both invoice-accurate dollars and per-prompt detail | Combine a native method (for example, IAM principal attribution) with Per-request metadata tagging |
Compare mechanisms
The following table compares the available mechanisms by what they let you attribute by, what they output, the granularity of that output, where the data is delivered, and which endpoints they support.
| Mechanism | Attribute by | Output | Granularity | Data destination | Supported APIs | bedrock-runtime |
bedrock-mantle |
|---|---|---|---|---|---|---|---|
| IAM principal attribution | IAM identity | Billed dollars | Aggregated, per usage type per day | AWS Cost Explorer / CUR 2.0 | InvokeModel, Converse, Chat Completions | ||
| Application inference profiles | Profile resource tags | Billed dollars | Aggregated, per usage type per day | AWS Cost Explorer / CUR 2.0 | InvokeModel, Converse, Chat Completions | ||
| Projects | Project resource tags | Billed dollars | Aggregated, per usage type per day | AWS Cost Explorer / CUR 2.0 | Responses, Chat Completions | ||
| Workspaces | Project resource tags via workspace header | Billed dollars | Aggregated, per usage type per day | AWS Cost Explorer / CUR 2.0 | Anthropic Messages | ||
| Per-request metadata tagging | Per-request key-value tags | Token counts (you convert to cost) | Per request | Invocation logs only | InvokeModel, InvokeModelWithResponseStream, Converse, ConverseStream |
Note
The native methods (IAM principal attribution, Application inference profiles, Projects, and Workspaces) deliver aggregated billed dollars to AWS Cost Explorer and CUR 2.0. The finest grain is per usage type per day, attributed by identity or tag; they do not produce a per-request row. For per-prompt detail, use model invocation logs, where each call is a separate record carrying its own token counts.
Attribution behind an LLM gateway
When a gateway or proxy calls Amazon Bedrock on behalf of many users, Amazon Bedrock records the gateway's IAM role as the caller's identity. To preserve user-level attribution, choose based on the output you need.
-
For per-user dollars in your billing tools, have the gateway assume its Amazon Bedrock role per user or tenant, using a per-user
RoleSessionNameor session tags. Cache the resulting credentials for the session lifetime to avoid an AWS STS call on every request. For more information, see IAM principal attribution. -
For per-prompt detail, set the user in request metadata on each call. Request metadata varies per request without additional AWS STS calls, which session tags cannot do on a shared session.