Documentation
Inline vs Hosted
How to choose the right SDK delivery mode without hidden magic.
The SDK supports additive `mode` selection where the provider surface allows it. The goal is to make the best path obvious, but still keep the final decision inspectable and under application control.
Mode model
- inline: your app registers tool schemas, executes tool calls, and owns the bounded loop.
- hosted: the SDK returns provider-native hosted MCP config and keeps the path thinner when the provider supports it.
- auto: the SDK resolves to the best supported path for that exact provider/API variant.
When to use inline
- Your application already owns transcript state and loop control.
- You need custom stop conditions, retries, or convergence rules.
- The provider path is inline-only, such as Gemini `generateContent` or AI SDK.
TS
const tools = milkey.openai.chat.tools({ client: milkeyClient, mode: "inline",})When to use hosted
- You want the cleanest provider-native MCP delivery path.
- The provider surface has a strong hosted story, such as OpenAI Responses, Anthropic Messages, or Gemini Interactions.
- You want to reduce the amount of application-side tool loop wiring.
TS
const tools = milkey.openai.responses.tools({ client: milkeyClient, mode: "hosted", approvalMode: "never",})Auto mode
`auto` is not a fuzzy heuristic. It resolves using the SDK capability table for the exact provider/API variant.
TS
const resolution = milkey.openai.responses.resolveMode({ mode: "auto",}) console.log(resolution.selectedMode) // "hosted"Design intent
`auto` exists to remove avoidable friction, not to hide behavior. If a path does not support a mode, the SDK throws early instead of silently guessing.