Adaptive thinking
Adaptive thinking is the recommended way to use Extended thinking with Claude Opus 4.6. Instead of manually setting a thinking token budget, adaptive thinking lets Claude dynamically decide when and how much to think based on the complexity of each request. Adaptive thinking reliably drives better performance than extended thinking with a fixed budget_tokens, and we recommend moving to adaptive thinking to get the most intelligent responses from Claude Opus 4.6. No beta header is required.
The supported models are as follows:
| Model | Model ID |
|---|---|
Claude Opus 4.6 |
|
Note
thinking.type: "enabled" and budget_tokens are deprecated on Claude Opus 4.6 and will be removed in a future model release. Use thinking.type: "adaptive" with the effort parameter instead.
Older models (Claude Sonnet 4.5, Claude Opus 4.5, etc.) do not support adaptive thinking and require thinking.type: "enabled" with budget_tokens.
How adaptive thinking works
In adaptive mode, Claude evaluates the complexity of each request and decides whether and how much to think. At the default effort level (high), Claude will almost always think. At lower effort levels, Claude may skip thinking for simpler problems.
Adaptive thinking also automatically enables Interleaved thinking (beta). This means Claude can think between tool calls, making it especially effective for agentic workflows.
Set thinking.type to "adaptive" in your API request:
Adaptive thinking with the effort parameter
You can combine adaptive thinking with the effort parameter to guide how much thinking Claude does. The effort level acts as soft guidance for Claude's thinking allocation:
| Effort level | Thinking behavior |
|---|---|
max |
Claude always thinks with no constraints on thinking depth. Claude Opus 4.6 only — requests using max on other models will return an error. |
high (default) |
Claude always thinks. Provides deep reasoning on complex tasks. |
medium |
Claude uses moderate thinking. May skip thinking for very simple queries. |
low |
Claude minimizes thinking. Skips thinking for simple tasks where speed matters most. |
Prompt caching
Consecutive requests using adaptive thinking preserve prompt cache breakpoints. However, switching between adaptive and enabled/disabled thinking modes breaks cache breakpoints for messages. System prompts and tool definitions remain cached regardless of mode changes.
Tuning thinking behavior
If Claude is thinking more or less often than you'd like, you can add guidance to your system prompt:
Extended thinking adds latency and should only be used when it will meaningfully improve answer quality — typically for problems that require multi-step reasoning. When in doubt, respond directly.
Warning
Steering Claude to think less often may reduce quality on tasks that benefit from reasoning. Measure the impact on your specific workloads before deploying prompt-based tuning to production. Consider testing with lower effort levels first.