Milkey Skills SDK

Use @milkeyskills/sdk to connect Milkey skills infrastructure to existing AI products without giving up your current provider client.

@milkeyskills/sdk is the official TypeScript SDK for plugging Milkey skills into real AI products. It helps teams add Milkey-managed skills to OpenAI-compatible, Anthropic, Gemini, and AI SDK-based applications while preserving the current model provider client, prompts, and orchestration stack.

What It Is

The SDK is intentionally thin. It does not replace your model provider, own your prompts, or run your entire agent loop. It wraps Milkey's skill contracts, exposes provider-specific adapters, and gives your app typed access to the resolve → get → execute flow.

  • The SDK owns: Milkey client requests, canonical tool metadata, provider-safe aliases, and adapter formatting.
  • Your app owns: model client creation, prompting, message history, retries, observability, security policy, and business workflows.
  • The key mental model: start with hosted delivery for the simplest connection story; use inline mode when your application needs full control of the tool loop.

Bring Your Own Model Client

The SDK is best framed as Milkey skills infrastructure for multi-provider AI products. You keep your current LLM client. Milkey adds the managed skill layer.

Who Should Use It

  • App teams that want Milkey skills inside an existing OpenAI, Anthropic, Gemini, or AI SDK stack.
  • Internal platform teams that need one Milkey integration across multiple model vendors.
  • SaaS and enterprise products that want provider choice plus application-owned orchestration, logging, and policy.
  • Copilot-style products that need to resolve and fetch the right skill at runtime rather than hardcode prompt packs.

Install

npm install @milkeyskills/sdk

The package is public, versioned, and currently ships at 0.1.3. It requires Node >=20.

For most teams, hosted is the best default because it is the simplest way to connect Milkey skills into an existing AI product. It reduces application-side orchestration work and keeps the setup easier to explain to both developers and business stakeholders.

Use inline only when your application needs to own every step of the tool loop for custom retries, convergence rules, or advanced policy controls.

Recommended hosted setup
const tools = milkey.openai.responses.tools({
  client: milkeyClient,
  delivery: "hosted",
  approvalMode: "never",
})

Default Decision

Start with hosted delivery unless you have a concrete reason to own the tool loop yourself. Inline mode is the secondary path for advanced products that need custom execution control.

Core Model

Most implementations follow the same structure:

  1. Create your provider client.
  2. Create the Milkey client with baseUrl and apiKey.
  3. Register Milkey tools with the provider adapter.
  4. Send the first model request.
  5. If the model calls a tool, execute it through Milkey and append the tool result.
  6. Repeat until the model stops requesting tools or you hit a bounded turn count.
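
Steps 4 through 6 can be sketched as a bounded loop. The `ToolCall` and `ModelTurn` shapes and the `runToolLoop` helper below are illustrative, not SDK exports; the provider call and Milkey execution are injected as callbacks so the sketch stays provider-neutral.

```typescript
interface ToolCall {
  name: string
  args: string
}

interface ModelTurn {
  text: string
  toolCalls: ToolCall[]
}

// Illustrative bounded tool loop: send a request, execute any tool calls,
// append results, and repeat until the model stops or the turn cap is hit.
async function runToolLoop(
  callModel: (transcript: string[]) => Promise<ModelTurn>,
  executeTool: (call: ToolCall) => Promise<string>,
  maxTurns = 4,
): Promise<string> {
  const transcript: string[] = []
  for (let turn = 1; turn <= maxTurns; turn += 1) {
    const reply = await callModel(transcript)
    transcript.push(reply.text)
    if (reply.toolCalls.length === 0) return reply.text // model is done
    for (const call of reply.toolCalls) {
      transcript.push(await executeTool(call)) // append the tool result
    }
  }
  return transcript[transcript.length - 1] ?? ""
}
```

The turn cap is what keeps a misbehaving model from looping on tool calls forever; the production checklist below recommends 3 or 4 turns.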

Canonical Tools and Aliases

Milkey tool model
resolve-skill            -> milkey_resolve_skill
get-skill                -> milkey_get_skill
get-skill-reference      -> milkey_get_skill_reference

Providers receive the alias names. Milkey still executes the canonical tool names internally.
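
The mapping above is mechanical: prefix the canonical name with milkey_ and replace hyphens with underscores. The helper below is illustrative (it mirrors the table, but is not an SDK export):

```typescript
// Illustrative: derive the provider-safe alias from a canonical tool name.
// Mirrors the alias table above; not part of @milkeyskills/sdk.
function toProviderAlias(canonical: string): string {
  return "milkey_" + canonical.replace(/-/g, "_")
}
```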

Client Creation

createClient
import { milkey } from "@milkeyskills/sdk"

const milkeyClient = milkey.createClient({
  baseUrl: process.env.MILKEY_BASE_URL!,
  apiKey: process.env.MILKEY_API_KEY!,
})
  • Required: baseUrl, apiKey.
  • Optional: timeoutMs, headers, userAgent, fetch.
  • Business meaning: these are the knobs teams use for corporate proxies, tracing, edge runtimes, and internal networking controls.
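
As one example of the fetch knob, a wrapper can stamp a tracing header onto every Milkey request before delegating to the underlying fetch. This is a sketch: the x-trace-id header name and the simplified FetchLike shape are assumptions, and the real option expects a standard fetch-compatible function.

```typescript
type HeaderMap = Record<string, string>

// Simplified stand-in for the fetch signature, for illustration only.
type FetchLike = (
  url: string,
  init?: { headers?: HeaderMap },
) => Promise<unknown>

// Illustrative wrapper for the optional `fetch` knob: merge a tracing
// header into each request, then delegate to the wrapped fetch.
function withTracingHeader(baseFetch: FetchLike, traceId: string): FetchLike {
  return (url, init = {}) =>
    baseFetch(url, {
      ...init,
      headers: { ...(init.headers ?? {}), "x-trace-id": traceId },
    })
}
```

The same pattern covers gateway headers or proxy routing: wrap once, then pass the wrapped function as the client's fetch option.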

OpenAI Chat Completions

Use Chat Completions when your product already relies on that API or when you intentionally want to own the full inline transcript loop. For new OpenAI integrations, prefer the Responses adapter with hosted delivery first. In chat mode, register tools with milkey.openai.chat.tools(), send the model request, then convert tool calls into tool messages with milkey.openai.chat.messages().

OpenAI Chat inline example
import OpenAI from "openai"
import { milkey } from "@milkeyskills/sdk"

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
})

const milkeyClient = milkey.createClient({
  baseUrl: process.env.MILKEY_BASE_URL!,
  apiKey: process.env.MILKEY_API_KEY!,
})

const tools = milkey.openai.chat.tools({
  client: milkeyClient,
})

const transcript = [
  {
    role: "user",
    content: "Find the best Milkey skill for PostgreSQL query optimization.",
  },
]

for (let turn = 1; turn <= 4; turn += 1) {
  const completion = await openai.chat.completions.create({
    model: process.env.OPENAI_MODEL ?? "your-model",
    messages: transcript,
    tools,
    tool_choice: "auto",
  })

  const assistant = completion.choices[0]?.message
  if (!assistant) throw new Error("No assistant message returned.")

  transcript.push({
    role: "assistant",
    content: assistant.content ?? "",
    tool_calls: assistant.tool_calls?.map((toolCall) => ({
      id: toolCall.id,
      type: "function",
      function: {
        name: toolCall.function.name,
        arguments: toolCall.function.arguments,
      },
    })),
  })

  if (!assistant.tool_calls?.length) break

  const toolMessages = await milkey.openai.chat.messages(assistant, milkeyClient)
  transcript.push(...toolMessages)
}

What Your App Must Own in Inline Mode

The app owns the transcript, the maximum tool-turn count, repeated call detection, and the decision to continue or stop. Assistant messages with no text but with tool calls are normal.
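
Repeated call detection, which the callout above assigns to the app, can be as simple as keying each executed call on its tool name plus serialized arguments. The helper below is illustrative, not an SDK export:

```typescript
// Illustrative duplicate-call guard for inline loops: a call is "repeated"
// when the same tool name and argument payload were already executed in
// this conversation. Callers break the loop when this returns true.
function isRepeatedCall(
  seen: Set<string>,
  name: string,
  argsJson: string,
): boolean {
  const key = `${name}:${argsJson}`
  if (seen.has(key)) return true
  seen.add(key)
  return false
}
```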

OpenAI Responses API

This is the recommended OpenAI path for most teams. Start with hosted delivery for the cleanest setup, and fall back to inline mode only if your product must own the function-call loop; in inline mode, use milkey.openai.responses.outputs() to convert provider output into function_call_output items.

const tools = milkey.openai.responses.tools({
  client: milkeyClient,
  delivery: "hosted",
  approvalMode: "never",
})

Hosted delivery should be the default here. Your product still owns model keys, business logic, tenant controls, and auditing, while Milkey keeps the tool delivery path simpler. Inline remains available when you explicitly need custom loop control.

Anthropic

Anthropic integrations use milkey.anthropic.tools() for tool registration and milkey.anthropic.messages() to convert tool-use blocks into tool_result items. Prefer hosted delivery first and move to inline only when your application needs to own the loop directly.

const anthropicConfig = milkey.anthropic.tools({
  client: milkeyClient,
  delivery: "hosted",
})

Gemini

Gemini integrations register function declarations through milkey.gemini.tools(). When Gemini returns function calls, use milkey.gemini.parts() to convert the results back into Gemini-compatible functionResponse parts. Keep the same rule here: choose the simplest hosted path available in your product architecture first, and use inline orchestration only when you need deeper control.

Gemini example
const tools = milkey.gemini.tools({
  client: milkeyClient,
})

const result = await model.generateContent({
  contents,
  tools,
})

const calls = extractGeminiFunctionCalls(result)
const followUp = await milkey.gemini.parts(calls, milkeyClient)

AI SDK

AI SDK integrations are the most direct. milkey.aiSdk.tools() returns tool objects with description, inputSchema, and execute, so Milkey slots naturally into AI SDK-native tool flows.

AI SDK example
const tools = milkey.aiSdk.tools({
  client: milkeyClient,
  allowedTools: ["resolve-skill", "get-skill"],
})

const response = await generateText({
  model,
  prompt: "Find the best Milkey skill for PostgreSQL query optimization.",
  tools,
})

Business Controls

The SDK is designed for production apps, not only demos. The following controls matter for business and enterprise teams:

  • allowedTools: limit tool scope for cost, predictability, and security.
  • timeoutMs: bound Milkey latency and surface timeouts clearly.
  • headers and userAgent: add request tracing, internal gateway headers, and compliance metadata.
  • fetch: route through custom proxies, edge runtimes, or enterprise networking layers.

Good Fits by Team Type

  • Small teams: start with hosted delivery, usually through OpenAI Responses, to minimize orchestration work.
  • Mid-size SaaS products: keep hosted as the default, then use bounded inline loops only for flows that truly need custom control.
  • Enterprise teams: often want custom fetch, tracing headers, approval controls, tenant separation, and stronger audit logging.

Errors

The SDK exports a structured error model so teams can distinguish retryable operational issues from fix-required integration issues.

  • MilkeyConfigError: missing or invalid config values like blank base URL or API key.
  • MilkeyTimeoutError: Milkey request exceeded the configured timeout.
  • MilkeyProblemError: RFC 9457 problem response from Milkey.
  • MilkeyResponseError: non-200 response with raw body content.
  • MilkeyToolCallError: malformed provider tool arguments, such as invalid JSON or non-object inputs.

Retry vs Fix

Timeouts and transient backend failures are often retry candidates. Malformed tool arguments, blank env vars, or invalid reference slug formats usually require code or configuration fixes instead.
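
One way to encode that split is an instanceof-based classifier. The classes below are stand-ins with the same names as the SDK exports (the real classes ship in @milkeyskills/sdk), and the 5xx-means-retryable rule is an assumption about typical backend behavior, not an SDK guarantee:

```typescript
// Stand-in error classes for illustration; the real ones come from
// @milkeyskills/sdk.
class MilkeyConfigError extends Error {}
class MilkeyTimeoutError extends Error {}
class MilkeyToolCallError extends Error {}
class MilkeyResponseError extends Error {
  status: number
  constructor(message: string, status: number) {
    super(message)
    this.status = status
  }
}

// Split retryable operational failures from fix-required integration
// failures: timeouts and 5xx responses may be transient; config and
// tool-argument errors need a code or configuration change.
function isRetryable(err: unknown): boolean {
  if (err instanceof MilkeyTimeoutError) return true
  if (err instanceof MilkeyResponseError) return err.status >= 500
  return false
}
```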

Production Checklist

  1. Set explicit timeouts for both the model client and the Milkey client.
  2. Bound the tool loop to 3 or 4 turns.
  3. Detect repeated identical tool calls and break recursion.
  4. Log tool name, arguments, result summary, and error class.
  5. Restrict allowedTools for narrow workflows where possible.
  6. Use environment variables or secret managers. Never hardcode secrets.
  7. Test from a clean external install, not only from the local SDK repo.

For the shipped source, examples, and release context, review the Milkey Skills SDK repository.

© 2026 Milkey MCP