Documentation

Inline vs Hosted

How to choose the right SDK delivery mode without hidden magic.

The SDK supports additive `mode` selection where the provider surface allows it. The goal is to make the best path obvious, but still keep the final decision inspectable and under application control.

Mode model

  • inline: your app registers tool schemas, executes tool calls, and owns the bounded loop.
  • hosted: the SDK returns provider-native hosted MCP config and keeps the path thinner when the provider supports it.
  • auto: the SDK resolves to the best supported path for that exact provider/API variant.

When to use inline

  • Your application already owns transcript state and loop control.
  • You need custom stop conditions, retries, or convergence rules.
  • The provider path is inline-only, such as Gemini `generateContent` or AI SDK.
TS
const tools = milkey.openai.chat.tools({  client: milkeyClient,  mode: "inline",})

When to use hosted

  • You want the cleanest provider-native MCP delivery path.
  • The provider surface has a strong hosted story, such as OpenAI Responses, Anthropic Messages, or Gemini Interactions.
  • You want to reduce the amount of application-side tool loop wiring.
TS
const tools = milkey.openai.responses.tools({  client: milkeyClient,  mode: "hosted",  approvalMode: "never",})

Auto mode

`auto` is not a fuzzy heuristic. It resolves using the SDK capability table for the exact provider/API variant.

TS
const resolution = milkey.openai.responses.resolveMode({  mode: "auto",}) console.log(resolution.selectedMode) // "hosted"
Design intent
`auto` exists to remove avoidable friction, not to hide behavior. If a path does not support a mode, the SDK throws early instead of silently guessing.

© 2026 Milkey MCP