Use a deployment for on-demand inference

After you deploy your custom model, you use the deployment's Amazon Resource Name (ARN) as the modelId parameter when you submit prompts and generate responses with model inference.

For information about making inference requests, see the following topics:

Submit prompts and generate responses with model inference
Prerequisites for running model inference
Submit prompts and generate responses using the API

Javascript is disabled or is unavailable in your browser.

To use the Amazon Web Services Documentation, Javascript must be enabled. Please refer to your browser's Help pages for instructions.

Document Conventions

Deploy a custom model

Delete a custom model deployment