API compatibility - Amazon Bedrock
Services or capabilities described in AWS documentation might vary by Region. To see the differences applicable to the AWS European Sovereign Cloud Region, see the AWS European Sovereign Cloud User Guide.

API compatibility

Amazon Bedrock supports four families of runtime APIs, each designed for different integration patterns and use cases.

Invoke family: InvokeModel handles synchronous, single-response calls. InvokeModelWithResponseStream returns responses as a real-time stream. InvokeModelWithBidirectionalStream enables full-duplex streaming for interactive applications. AsyncInvoke submits long-running requests asynchronously, storing output to Amazon S3.

Converse family: Converse provides a unified, model-agnostic interface for synchronous multi-turn conversations. ConverseStream delivers the same experience with streaming output.

OpenAI-compatible family: ChatCompletions implements the OpenAI Chat Completions interface, enabling existing OpenAI-based integrations to run on Bedrock with minimal changes. Responses API implements the OpenAI Responses interface, supporting stateful, agentic interactions with built-in tool use and conversation history management.

Messages family: Messages implements the Anthropic Messages interface on the bedrock-mantle endpoint, enabling existing Anthropic SDK-based integrations to run on Bedrock with minimal changes.

We will now look at the list of APIs supported by each model.

AI21

Model name Invoke Converse Chat Completions Responses Messages
Jamba 1.5 Large*
Jamba 1.5 Mini*

Amazon

Anthropic

Cohere

Model name Invoke Converse Chat Completions Responses Messages
Command R*
Command R+*
Embed English
Embed Multilingual
Embed v4
Rerank 3.5

DeepSeek

Model name Invoke Converse Chat Completions Responses Messages
DeepSeek V3.2*
DeepSeek-R1*
DeepSeek-V3.1*

Google

Model name Invoke Converse Chat Completions Responses Messages
Gemma 3 12B IT*
Gemma 3 27B PT*
Gemma 3 4B IT*

Meta

MiniMax

Model name Invoke Converse Chat Completions Responses Messages
MiniMax M2*
MiniMax M2.1*
MiniMax M2.5*

Mistral

Moonshot

Model name Invoke Converse Chat Completions Responses Messages
Kimi K2 Thinking*
Kimi K2.5*

NVIDIA

Model name Invoke Converse Chat Completions Responses Messages
NVIDIA Nemotron Nano 9B v2*
NVIDIA Nemotron Nano 12B v2 VL BF16*
Nemotron Nano 3 30B*
NVIDIA Nemotron 3 Super 120B*

OpenAI

Model name Invoke Converse Chat Completions Responses Messages
GPT OSS Safeguard 120B*
GPT OSS Safeguard 20B*
gpt-oss-120b*
gpt-oss-20b*

Qwen

Stability

TwelveLabs

Model name Invoke Converse Chat Completions Responses Messages
Marengo Embed 3.0
Marengo Embed v2.7
Pegasus v1.2

Writer

Model name Invoke Converse Chat Completions Responses Messages
Palmyra Vision 7B
Palmyra X4*
Palmyra X5*

Z.AI

Model name Invoke Converse Chat Completions Responses Messages
GLM 4.7*
GLM 4.7 Flash*
GLM 5*
Note

* Streaming Support: Models marked with an asterisk (*) also support InvokeModelWithResponseStream, which returns responses as a real-time stream.

Models supporting StartAsyncInvoke

StartAsyncInvoke is an Amazon Bedrock Runtime API that allows callers to submit a model invocation request and immediately receive back an invocationArn without waiting for the model to finish processing. The job runs in the background, and the output is written to a caller-specified S3 bucket once complete. Callers can then poll job status using the companion GetAsyncInvoke and ListAsyncInvokes APIs. The pattern is purpose-built for workloads involving large or latency-insensitive inputs, particularly video, audio, and bulk embedding generation, where holding an open synchronous connection would be impractical.

In terms of which models support it, the following models support StartAsyncInvoke:

  • TwelveLabs Marengo Embed 2.7 (twelvelabs.marengo-embed-2-7-v1:0) — required for video and audio input; InvokeModel only handles text and image

  • TwelveLabs Marengo Embed 3.0 (twelvelabs.marengo-embed-3-0-v1:0) — same pattern; async required for video/audio at scale

  • Amazon Nova Reel (amazon.nova-reel-v1:0 and v1:1) — video generation is exclusively async; output lands in S3

  • Amazon Nova Multimodal Embeddings (amazon.nova-2-multimodal-embeddings-v1:0) — async is required for video inputs larger than 25MB base64-encoded; sync is available for text, image, and document inputs

InvokeModelWithBidirectionalStream

InvokeModelWithBidirectionalStream is an Amazon Bedrock Runtime API that establishes a persistent, full-duplex channel between the caller and the model, allowing audio data to flow in both directions simultaneously and continuously. Unlike the standard InvokeModel or even InvokeModelWithResponseStream APIs, which follow a request-then-response pattern, this API keeps the connection open for the duration of a session so that the model can process incoming audio as it arrives and stream generated speech back in near real-time, without waiting for a complete utterance to finish. The interaction is structured around three phases: session initialization (where the client sends configuration events to set up the stream), audio streaming (where captured audio is encoded and sent as a continuous event stream), and response streaming (where the model simultaneously returns text transcriptions of user speech and synthesized audio output). InvokeModelWithBidirectionalStream cannot be used with Amazon Bedrock API keys and requires standard AWS credential-based authentication, reflecting its more complex session lifecycle compared to other Bedrock Runtime operations.

The following models support this API:

  • Amazon Nova Sonic family: Both amazon.nova-sonic-v1:0 and amazon.nova-2-sonic-v1:0 use it as their sole invocation path, since the speech-to-speech architecture fundamentally requires a live bidirectional channel that neither InvokeModel nor Converse can provide.