Building agents with Anthropic SDK
Learn how to assemble an agent using the Anthropic SDK
What you’ll build
A resilient Claude agent that streams responses, invokes tools, and ships with production-grade observability. Use this blueprint to bootstrap internal copilots or automate operational workflows with Anthropic’s TypeScript SDK.
Highlights
- Streaming UX
- Tool orchestration
Safety nets
- Typed error paths
- Usage telemetry
Launch-ready
- Deployment checklist
- Troubleshooting
Building Agents with Anthropic SDK
Agents are showing up everywhere—answering support tickets, triaging alerts, even running internal workflows. If you want to harness that momentum with strong safety guarantees, Anthropic’s official TypeScript SDK is the toolbox to reach for. In this guide, we’ll build an end-to-end agent that streams replies, invokes tools, and stays resilient in production.
Quick Start
Want to see the agent in action before we dive deep? Drop this file into a fresh project (npm install @anthropic-ai/sdk dotenv) and run it with tsx quickstart.ts or node --loader ts-node/esm quickstart.ts.
// quickstart.ts
import "dotenv/config";
import { Anthropic } from "@anthropic-ai/sdk";

async function main() {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });
  const stream = await client.messages.stream({
    model: "claude-sonnet-4-5",
    max_tokens: 400,
    system: "You summarise insights crisply.",
    messages: [{ role: "user", content: "List three focused ways agents improve internal ops." }],
  });
  for await (const event of stream) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
  const finalMessage = await stream.finalMessage();
  console.log("\nDone. Usage:", finalMessage.usage);
}

main().catch((error) => {
  console.error("Quick start failed:", error);
  process.exitCode = 1;
});
If that prints streaming text and a usage summary, you’re ready for the end-to-end build.
What You’ll Need Before You Start
- Runtime: Node.js 20+ (or Bun/Deno with global `fetch` + streams).
- Anthropic API key: keep it in `.env` locally, a secrets manager in prod.
- Beta access (optional): needed for server tools, MCP, and Files API uploads.
- Toolchain: TypeScript compiler plus dotenv and whichever test runner you trust.
- Experience: Comfortable with async/await and JSON Schema validation.
Think of the final architecture as four collaborating layers:
- SDK Client – one shared instance with disciplined timeouts, retries, and headers.
- Prompt & Goal Manager – crafts system instructions, tracks objectives, enforces safety.
- Tool Registry – maps model-issued `tool_use` calls to your own functions or remote services.
- Persistence & Observability – stores transcripts, usage metadata, traces, and error logs.
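The tool-registry layer above can be sketched as a small contract plus an in-memory implementation. These interface and class names are this guide's own, not the SDK's:

```typescript
// Illustrative layer contract; names are this guide's, not the SDK's.
interface ToolHandler {
  (input: unknown): Promise<unknown>;
}

interface ToolRegistry {
  register(name: string, handler: ToolHandler): void;
  dispatch(name: string, input: unknown): Promise<unknown>;
}

// A minimal in-memory registry: dispatch looks up the handler by the
// tool name the model emitted and returns an error object when unknown.
class InMemoryToolRegistry implements ToolRegistry {
  private handlers = new Map<string, ToolHandler>();

  register(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  async dispatch(name: string, input: unknown): Promise<unknown> {
    const handler = this.handlers.get(name);
    if (!handler) return { error: `Unknown tool: ${name}` };
    return handler(input);
  }
}
```

Returning an error object instead of throwing keeps unknown-tool failures inside the conversation, where the model can recover.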
Step 1. Install and Sanity Check the SDK
npm install @anthropic-ai/sdk typescript @types/node dotenv
Set up TypeScript with a modern tsconfig.json (module ESNext, target ES2022, moduleResolution bundler or node16, skipLibCheck: false).
Create a .env file with ANTHROPIC_API_KEY=.... Load it via dotenv/config during local development, and ensure production pulls the key from an encrypted secret store.
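To fail fast when the key is missing instead of sending unauthenticated requests, you might add a small guard like this (the helper is this guide's, not part of the SDK):

```typescript
// Fail fast when ANTHROPIC_API_KEY is absent or blank; a helper for this
// guide, not part of the SDK.
function requireApiKey(
  env: Record<string, string | undefined> = process.env,
): string {
  const key = env.ANTHROPIC_API_KEY;
  if (!key || key.trim() === "") {
    throw new Error(
      "ANTHROPIC_API_KEY is not set; check .env or your secret store.",
    );
  }
  return key;
}
```

Call it once at startup so a misconfigured deployment crashes immediately rather than on the first model call.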
Confirm the key works by listing available models:
// scripts/listModels.ts
import "dotenv/config";
import { Anthropic } from "@anthropic-ai/sdk";

async function listModels() {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });
  const res = await client.models.list();
  // Each entry exposes `id`, `display_name`, and `created_at`.
  console.table(res.data.map(({ id, display_name }) => ({
    model: id,
    name: display_name,
  })));
}

listModels().catch(console.error);
Step 2. Centralise Client Configuration
Keep SDK setup in one place so every part of your agent respects the same policies—timeouts, headers, base URL, and retry limits. Reusing a singleton client also keeps connection pooling hot and ensures every worker shares the same retry/backoff logic.
// src/anthropicClient.ts
import { Anthropic, type ClientOptions } from "@anthropic-ai/sdk";

const defaults: ClientOptions = {
  apiKey: process.env.ANTHROPIC_API_KEY!,
  timeout: 600_000, // scales with max_tokens for non-streaming calls
  maxRetries: 3, // honours Retry-After and Retry-After-Ms headers
  defaultHeaders: {
    "User-Agent": "demo-agent/1.0",
  },
};

let singleton: Anthropic | null = null;

export function getAnthropicClient(overrides: Partial<ClientOptions> = {}) {
  if (!singleton) {
    singleton = new Anthropic({ ...defaults, ...overrides });
  }
  return singleton;
}
Need to route traffic through a proxy or set a custom fetch? Pass overrides when you instantiate the client in tests, staging, or multi-tenant environments.
Step 3. Shape an Agent Persona and Prompt Loop
Start with a straightforward messages.create() call. We’ll evolve it into a streaming loop later.
// src/agent.ts
import type { MessageParam } from "@anthropic-ai/sdk/resources/messages.mjs";
import { getAnthropicClient } from "./anthropicClient";

export async function runSingleTurn(goal: string) {
  const client = getAnthropicClient();
  const history: MessageParam[] = [{ role: "user", content: goal }];
  const response = await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 2048,
    system: "You are a pragmatic assistant that documents each reasoning step briefly.",
    messages: history,
  });
  return { answer: response.content, usage: response.usage };
}
Keep every turn’s message payload small and purposeful. Use the system prompt to enforce boundaries (“ask for confirmation before executing irreversible actions”), and only add metadata keys that Anthropic documents—custom fields are rejected today.
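One way to keep those boundaries consistent across turns is to compose the system prompt from a persona plus an explicit rule list. The helper below is hypothetical, this guide's own convention rather than an SDK feature:

```typescript
// Hypothetical prompt builder: composes the persona with numbered safety
// boundaries so every turn carries the same guardrails.
function buildSystemPrompt(persona: string, boundaries: string[]): string {
  const rules = boundaries.map((rule, i) => `${i + 1}. ${rule}`).join("\n");
  return `${persona}\n\nOperating boundaries:\n${rules}`;
}

const systemPrompt = buildSystemPrompt(
  "You are a pragmatic assistant that documents each reasoning step briefly.",
  [
    "Ask for confirmation before executing irreversible actions.",
    "Never reveal internal tool names or credentials.",
  ],
);
```

Pass the result as the `system` parameter; keeping the rules in code makes them diffable and testable alongside the rest of the agent.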
Step 4. Stream Responses for Real-Time UX
- Switch to `messages.stream()` when you want partial completions. Text arrives through `content_block_delta` events with `delta.type === "text_delta"`; pipe those into your UI in real time.
- Tool invocations come through `content_block_*` events whose `content_block.type === "tool_use"`. The SDK streams tool inputs as JSON fragments via `input_json_delta`; accumulate them until the matching `content_block_stop`, parse the JSON, execute your tool, push a `tool_result` message into history, and start the next turn.
- Welding all of that together is verbose, which is why many teams lean on the helper shown in the next step.
const pendingTools = new Map<
  number,
  { id: string; name: string; buffer: string[] }
>();
const toolCalls: Array<{ id: string; name: string; input: unknown }> = [];

for await (const event of stream) {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    handlers.onText(event.delta.text);
  }
  // Delta and stop events carry only `index`, not the block itself,
  // so track pending tool calls by block index.
  if (event.type === "content_block_start" && event.content_block.type === "tool_use") {
    pendingTools.set(event.index, {
      id: event.content_block.id,
      name: event.content_block.name,
      buffer: [],
    });
  }
  if (event.type === "content_block_delta" && event.delta.type === "input_json_delta") {
    pendingTools.get(event.index)?.buffer.push(event.delta.partial_json);
  }
  if (event.type === "content_block_stop") {
    const entry = pendingTools.get(event.index);
    if (entry) {
      toolCalls.push({
        id: entry.id,
        name: entry.name,
        input: JSON.parse(entry.buffer.join("") || "{}"),
      });
      pendingTools.delete(event.index);
    }
  }
}
Feed toolCalls into your execution layer, append the tool_result message, and re-enter the loop.
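Concretely, each executed call travels back to the model as a `tool_result` block inside a user-role message, after an assistant turn that echoes the original `tool_use` blocks. A sketch of that history update (the function name is this guide's):

```typescript
// Shapes follow the Messages API: tool results are `tool_result` blocks
// in a user-role message, keyed by `tool_use_id`.
type HistoryMessage = { role: "user" | "assistant"; content: unknown };

function appendToolResults(
  history: HistoryMessage[],
  assistantContent: unknown, // the content blocks the model just produced
  results: Array<{ id: string; output: unknown }>,
): HistoryMessage[] {
  return [
    ...history,
    // Echo the assistant turn that issued the tool_use blocks.
    { role: "assistant", content: assistantContent },
    // Then answer each call with a matching tool_result block.
    {
      role: "user",
      content: results.map((r) => ({
        type: "tool_result",
        tool_use_id: r.id,
        content: JSON.stringify(r.output),
      })),
    },
  ];
}
```

With the updated history in hand, start the next `messages.stream()` turn.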
Before we head there, run a quick smoke test to make sure streaming works end-to-end:
const stream = await client.messages.stream({
  model: "claude-sonnet-4-5",
  max_tokens: 512,
  system: "You narrate your reasoning in one short paragraph.",
  messages: [{ role: "user", content: "Give me three steps to prep cold brew at home." }],
});

for await (const event of stream) {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}

console.log("\n---\n", await stream.finalMessage());
If you see incremental text followed by the final message payload, you’re ready to layer in tools.
Step 5. Define and Execute Tools
Tools extend your agent beyond text. Each tool requires a JSON Schema definition so the model understands valid inputs, and your code must supply an execution handler.
// `db` stands in for your own data-access layer (e.g. a Prisma client).
const tools = {
  lookupCustomer: async ({ id }: { id: string }) => {
    const record = await db.customers.findUnique({ where: { id } });
    if (!record) return { error: `No customer found for ${id}` };
    return { name: record.name, tier: record.tier, openTickets: record.tickets.length };
  },
};
In the streaming loop, map tool names to handlers. Always guard side effects—rate-limit sensitive operations, verify auth, and log every invocation for audit trails.
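The handler map above still needs a matching declaration in the request's `tools` array, where `input_schema` follows JSON Schema. Below is one such declaration plus a naive per-tool invocation guard; the rate limiter is this guide's illustration, not an SDK feature:

```typescript
// Tool declaration for the request payload: `input_schema` is JSON Schema.
const lookupCustomerTool = {
  name: "lookupCustomer",
  description: "Fetch a customer profile by ID.",
  input_schema: {
    type: "object" as const,
    properties: { id: { type: "string", description: "Customer UUID" } },
    required: ["id"],
  },
};

// Naive sliding-window rate limiter (illustrative): allow at most `limit`
// invocations per rolling window of `windowMs` milliseconds.
function makeRateLimiter(limit: number, windowMs: number) {
  const calls: number[] = [];
  return function allow(now: number = Date.now()): boolean {
    while (calls.length > 0 && now - calls[0] > windowMs) calls.shift();
    if (calls.length >= limit) return false;
    calls.push(now);
    return true;
  };
}
```

Check `allow()` before executing a sensitive handler, and log the denial as part of your audit trail.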
Prefer to skip the manual plumbing altogether? Reach for the SDK’s helpers:
import Anthropic from "@anthropic-ai/sdk";
import { betaZodTool } from "@anthropic-ai/sdk/helpers/zod";
import { z } from "zod";

const lookupCustomer = betaZodTool({
  name: "lookupCustomer",
  description: "Fetch a customer profile by ID.",
  inputSchema: z.object({ id: z.string().uuid("Customer IDs are UUIDs.") }),
  run: async ({ id }) => {
    const record = await db.customers.findUnique({ where: { id } });
    if (!record) {
      return { error: `No customer found for ${id}` };
    }
    return { name: record.name, tier: record.tier, openTickets: record.tickets.length };
  },
});

const finalMessage = await client.beta.messages.toolRunner({
  model: "claude-sonnet-4-5",
  max_tokens: 2048,
  system: "You are a pragmatic assistant...",
  messages: [{ role: "user", content: "Look up customer 8d047c1e-..." }],
  tools: [lookupCustomer],
});
toolRunner() continuously streams events, executes tools, and stitches tool results back into the conversation for you—exactly the workflow we described manually.
Using Anthropic Beta Tools and MCP
- Enable beta features by passing version strings via the `betas` array, for example `betas: ["code-execution-2025-08-25", "mcp-client-2025-04-04"]`.
- Call `client.beta.messages.create()` or `.stream()` to access extra block types (containers, computer control, MCP negotiation).
- Configure `mcp_servers` so the agent can request new tools dynamically; just ensure your server enforces permissions and logs approvals.
- Working with the Files API? Include `betas: ["files-api-2025-04-14"]` and request access first, otherwise uploads will be rejected.
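Putting the flags together, a beta request payload might look like the sketch below. The beta version strings come from the list above, and the MCP server URL and name are hypothetical; confirm exact versions against current docs before shipping:

```typescript
// Example beta request parameters. The MCP server URL/name are hypothetical
// placeholders; beta strings should be verified against current docs.
const betaRequestParams = {
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  betas: ["code-execution-2025-08-25", "mcp-client-2025-04-04"],
  mcp_servers: [
    {
      type: "url" as const,
      url: "https://mcp.example.com/sse", // hypothetical endpoint
      name: "internal-tools",
    },
  ],
  messages: [{ role: "user" as const, content: "Run the nightly report." }],
};

// Then: await client.beta.messages.create(betaRequestParams);
```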
Step 6. Manage State and Memory
Decide whether your agent is stateless or session-aware:
- Stateless: rebuild the minimal history every turn, relying on upstream context management.
- Session-based: persist transcripts in Redis, Postgres, or a vector store; use token estimators (the SDK publishes usage stats) to keep prompts under context limits.
- Hybrid: store structured summaries (`assistant_summary`, `action_log`) and rehydrate full history only when needed.
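For the session-based approach, a rough trimming pass can keep prompts under the context limit. The chars/4 estimate below is a crude heuristic of this guide's, not the SDK's tokenizer; for precise numbers, use the usage stats the API returns:

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Crude heuristic: ~4 characters per token. Fine for budgeting,
// not for billing.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the most recent turns that fit under the budget, preserving order.
function trimHistory(history: Turn[], maxTokens: number): Turn[] {
  const kept: Turn[] = [];
  let total = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (total + cost > maxTokens) break;
    kept.unshift(history[i]);
    total += cost;
  }
  return kept;
}
```

Walking backwards from the newest turn means the trim always sacrifices the oldest context first.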
Persist the usage object (input_tokens, output_tokens, cache hits) for billing dashboards and analytics.
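A billing dashboard only needs simple arithmetic on that usage object. The per-million-token prices below are placeholders, not Anthropic's actual rates; substitute the current numbers from the pricing page:

```typescript
// PLACEHOLDER prices per million tokens; substitute your model's
// current rates from Anthropic's pricing page.
const PRICES = { inputPerMTok: 3.0, outputPerMTok: 15.0 };

type Usage = { input_tokens: number; output_tokens: number };

// Convert a usage payload into an approximate dollar cost.
function estimateCostUSD(usage: Usage, prices = PRICES): number {
  return (
    (usage.input_tokens / 1_000_000) * prices.inputPerMTok +
    (usage.output_tokens / 1_000_000) * prices.outputPerMTok
  );
}
```

Aggregate these per tenant or per feature to feed the cost-monitoring alerts discussed later.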
Step 7. Harden Error Handling & Observability
Wrap agent loops with robust error handling. The SDK exposes typed errors (RateLimitError, APIConnectionError, APIResponseValidationError) so you can branch intelligently:
import { Anthropic } from "@anthropic-ai/sdk";

try {
  await runStreamingAgent(goal, handlers);
} catch (error) {
  if (error instanceof Anthropic.RateLimitError) {
    logger.warn("Rate limit hit", { retryAfter: error.headers["retry-after"] });
  } else if (error instanceof Anthropic.APIError) {
    logger.error("Anthropic API error", {
      status: error.status,
      detail: error.error,
    });
  } else {
    throw error;
  }
}
- Retries: the SDK automatically retries 408/409/429/5xx with exponential backoff; tune `maxRetries` per environment.
- Observability: instrument `await stream.finalMessage()` usage metrics, `anthropic-request-id` headers, and `X-Stainless-Retry-Count` metrics.
- Tracing: use `.withResponse()` to capture low-level metadata for OpenTelemetry or Honeycomb spans.
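For calls the SDK does not retry for you (for example, your own tool backends), a backoff schedule in the same spirit can be sketched as follows. The base delay and cap are illustrative defaults, not values from the SDK:

```typescript
// Exponential backoff with full jitter: the window doubles per attempt,
// is capped, then a random delay inside the window is picked to avoid
// thundering herds. Defaults are illustrative.
function backoffDelayMs(
  attempt: number, // 0-based retry attempt
  baseMs = 500,
  capMs = 8_000,
  random: () => number = Math.random, // injectable for tests
): number {
  const windowMs = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * windowMs);
}
```

Injecting the random source keeps the schedule deterministic in unit tests.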
Step 8. Test and Evaluate
- Unit tests: mock the client and assert prompt builders, tool handlers, and branching logic.
- Regression harness: record canonical dialogues (fixtures) and replay them on CI before shipping prompt or version changes.
- Evaluation loop: human spot-checking plus automated grading (e.g., pass/fail assertions, LLM judges) to monitor behavioural drift.
- Beta gating: keep beta headers behind feature flags; log beta versions to spot when an API revision lands.
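A unit-test sketch for the mocking approach: both the `buildMessages` prompt builder and the fake client below are illustrative stand-ins for your own code, not SDK utilities.

```typescript
// Illustrative prompt builder under test (stand-in for your real one).
function buildMessages(goal: string, summary?: string) {
  const messages: Array<{ role: "user"; content: string }> = [];
  if (summary) messages.push({ role: "user", content: `Context: ${summary}` });
  messages.push({ role: "user", content: goal });
  return messages;
}

// A fake client that captures request params instead of calling the API,
// returning a canned response shaped like a Messages API result.
function makeFakeClient() {
  const captured: unknown[] = [];
  return {
    captured,
    messages: {
      create: async (params: unknown) => {
        captured.push(params);
        return {
          content: [{ type: "text", text: "stubbed" }],
          usage: { input_tokens: 1, output_tokens: 1 },
        };
      },
    },
  };
}
```

Because the fake matches only the slice of the client surface your agent touches, tests stay fast and run without network access or API keys.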
Step 9. Prepare for Deployment
- Containerisation: build minimal Node images (distroless or Alpine) with trusted CAs; ensure your bundle includes fetch polyfills if needed.
- Secrets: rotate API keys regularly; separate read/write keys per environment.
- CI/CD: lint, type-check, run tests, and optionally run synthetic prompts against staging.
- Edge runtimes: if deploying to Cloudflare Workers or similar, bundle the SDK as pure ESM and set `dangerouslyAllowBrowser: true` only when you accept the security trade-off.
- Cost monitoring: surface token usage per tenant, per feature, and alert on spikes.
Troubleshooting Cheat Sheet
- 401 or 403 errors: double-check `ANTHROPIC_API_KEY` is loaded (print `process.env.ANTHROPIC_API_KEY?.slice(0, 4)` in dev) and that the key has the right workspace permissions.
- `APIConnectionTimeoutError`: confirm outbound HTTPS access, increase `timeout`, and retry with smaller `max_tokens`. Persistent failures often trace back to corporate proxies.
- Streams never finish: make sure you drain the async iterator; breaking early requires calling `stream.controller.abort()`, otherwise the promise never resolves.
- Tool rejections: validate tool schemas; malformed JSON or missing required fields will produce schema errors in the response payload.
- Beta feature denied: the error body usually states the missing beta string; add it to `betas` or request access through Anthropic support before retrying.
- Unexpected browser warning: set `dangerouslyAllowBrowser: true` only in trusted environments and rotate keys immediately if you accidentally shipped one client side.
Where to Go Next
- Add document or image ingestion via the Files API for multimodal context.
- Extend orchestration so one agent delegates tasks to specialised sub-agents or background jobs.
- Swap in the Vertex or Bedrock SDK variants to run the same agent logic inside existing cloud governance.
- Explore MCP servers to grant the agent on-demand capabilities under human supervision.
Wrap-Up
You now have a roadmap for turning Anthropic’s TypeScript SDK into a production-grade agent. We covered environment setup, configuration patterns, streaming workflows, tool execution, memory, resilience, testing, and deployment. Before you ship:
- cross-check model strings, betas, and helpers against the latest SDK docs;
- wire up `toolRunner()` in a staging branch to validate your schemas;
- spin up monitoring dashboards for token usage and retry counts;
- plan a rollback path for your first live deployment.
From here, tailor the blueprint to your use case—maybe add a UI, weave in analytics, or run evaluations on real user goals. Happy building!