14 changes: 13 additions & 1 deletion content/Agents/creating-agents.mdx
@@ -197,7 +197,15 @@ All implement [StandardSchema](https://github.com/standard-schema/standard-schem

### Type Inference

TypeScript automatically infers types from your schemas:
TypeScript automatically infers types from your schemas. Don't add explicit type annotations to handler parameters:

```typescript
// Good: types inferred from schema
handler: async (ctx, input) => { ... }

// Bad: explicit annotations can conflict with the schema-inferred types
handler: async (ctx: AgentContext, input: MyInput) => { ... }
```

```typescript
const agent = createAgent('Search', {
@@ -364,6 +372,10 @@ handler: async (ctx, input) => {

## Next Steps

<Callout type="tip" title="AI-Assisted Development">
The [OpenCode plugin](/Reference/CLI/opencode-plugin) provides AI-assisted development for full-stack Agentuity projects, including agents, routes, frontend, and deployment.
</Callout>

- [Using the AI SDK](/Agents/ai-sdk-integration): Add LLM capabilities with generateText and streamText
- [Managing State](/Agents/state-management): Persist data across requests with thread and session state
- [Calling Other Agents](/Agents/calling-other-agents): Build multi-agent workflows
46 changes: 46 additions & 0 deletions content/Agents/evaluations.mdx
@@ -5,6 +5,21 @@ description: Automatically test and validate agent outputs for quality and compl

Evaluations (evals) are automated tests that run after your agent completes. They validate output quality, check compliance, and monitor performance without blocking agent responses.

## Why Evals?

Most evaluation tools test the LLM: did the model respond appropriately? That's fine for chatbots, but agents aren't single LLM calls. They're entire runs with multiple model calls, tool executions, and orchestration working together.

Agent failures can happen anywhere in the run: a tool call that returns bad data, a state bug that corrupts context, an orchestration step that takes the wrong path. Testing only the LLM response misses most of these failures.

Agentuity evals test the whole run—every tool call, state change, and orchestration step. They run on every session in production, so you catch issues with real traffic.

**The result:**

- **Full-run evaluation**: Test the entire agent execution, not just LLM responses
- **Production monitoring**: Once configured, evals run automatically on every session
- **Async by default**: Evals don't block responses, so users aren't waiting
- **Preset library**: Common checks (PII, safety, hallucination) available out of the box

Evals come in two types: **binary** (pass/fail) for yes/no criteria, and **score** (0-1) for quality gradients.

<Callout type="info" title="Where Scores Appear">
@@ -426,6 +441,37 @@ export const politenessCheck = agent.createEval(politeness({

All preset evals use a default model optimized for cost and speed. Override `model` when you need specific capabilities.
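
For instance, overriding the default model might look like this. This is a hedged sketch: whether `model` takes a model id string or an AI SDK model instance depends on the `@agentuity/evals` package, so treat the value below as illustrative.

```typescript
import { politeness } from '@agentuity/evals';

export const politenessCheck = agent.createEval(politeness({
  // Assumption: `model` accepts a model identifier; it may instead expect
  // an AI SDK model instance. Check the preset reference for the exact type.
  model: 'gpt-4o',
}));
```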

### Lifecycle Hooks

Preset evals support `onStart` and `onComplete` hooks for custom logic around eval execution:

```typescript
import { politeness } from '@agentuity/evals';

export const politenessCheck = agent.createEval(politeness({
onStart: async (ctx, input, output) => {
ctx.logger.info('Starting politeness eval', {
inputLength: input.request?.length,
});
},
onComplete: async (ctx, result) => {
// Track results in external monitoring
if (!result.passed) {
ctx.logger.warn('Politeness check failed', {
score: result.score,
reason: result.reason,
});
}
},
}));
```

**Use cases for lifecycle hooks:**
- Log eval execution for debugging
- Send results to external monitoring systems
- Track eval performance metrics
- Trigger alerts on failures
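
As a concrete example of the monitoring case, an `onComplete` hook can forward results to an external service. This is a hedged sketch: the endpoint URL and payload shape are placeholders, while the `result` fields mirror those shown in the example above.

```typescript
import { politeness } from '@agentuity/evals';

// Hypothetical monitoring endpoint; replace with your own service
const MONITORING_URL = 'https://monitoring.example.com/api/eval-results';

export const politenessCheck = agent.createEval(politeness({
  onComplete: async (ctx, result) => {
    // Forward every eval result so failures can be tracked and alerted on
    await fetch(MONITORING_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        eval: 'politeness',
        passed: result.passed,
        score: result.score,
        reason: result.reason,
      }),
    });
  },
}));
```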

### Schema Middleware

Preset evals expect a standard input/output format:
23 changes: 23 additions & 0 deletions content/Agents/schema-libraries.mdx
@@ -103,6 +103,29 @@ type User = s.infer<typeof User>;
// { name: string; age: number; role: 'admin' | 'user' }
```

### JSON Schema Generation

Convert schemas to JSON Schema for use with LLM structured output:

```typescript
import { s } from '@agentuity/schema';

const ResponseSchema = s.object({
answer: s.string(),
confidence: s.number(),
});

// Generate JSON Schema
const jsonSchema = s.toJSONSchema(ResponseSchema);

// Generate strict JSON Schema for LLM structured output
const strictSchema = s.toJSONSchema(ResponseSchema, { strict: true });
```

<Callout type="tip" title="Strict Mode for LLMs">
Use `{ strict: true }` when generating schemas for LLM structured output (e.g., OpenAI's `response_format`). Strict mode ensures the schema is compatible with model constraints and produces more reliable outputs.
</Callout>
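
For instance, the strict schema can be plugged into the AI SDK's structured output helpers. This is a hedged sketch: the `openai('gpt-4o')` model, the prompt, and the provider import are illustrative and assume the AI SDK is already set up in your project.

```typescript
import { generateObject, jsonSchema } from 'ai';
import { openai } from '@ai-sdk/openai';

const { object } = await generateObject({
  model: openai('gpt-4o'),
  // Wrap the generated JSON Schema so the AI SDK can use it for structured output
  schema: jsonSchema<{ answer: string; confidence: number }>(strictSchema),
  prompt: 'How many moons does Mars have?',
});

console.log(object.answer, object.confidence);
```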

<Callout type="info" title="When to Use">
Use `@agentuity/schema` for simple validation needs. For advanced features like email validation, string length constraints, or complex transformations, consider Zod or Valibot.
</Callout>
65 changes: 46 additions & 19 deletions content/Agents/standalone-execution.mdx
@@ -16,10 +16,20 @@ import { createAgentContext } from '@agentuity/runtime';
import chatAgent from '@agent/chat';

const ctx = createAgentContext();
const result = await ctx.invoke(() => chatAgent.run({ message: 'Hello' }));
const result = await ctx.run(chatAgent, { message: 'Hello' });
```

The `invoke()` method executes your agent with full infrastructure support: tracing, session management, and access to all storage services.
The `run()` method executes your agent with full infrastructure support: tracing, session management, and access to all storage services.

For agents that don't require input:

```typescript
const result = await ctx.run(statusAgent);
```

<Callout type="info" title="Legacy invoke() Method">
The older `ctx.invoke(() => agent.run(input))` pattern still works, but `ctx.run(agent, input)` is preferred for its cleaner syntax.
</Callout>

## Options

@@ -45,10 +55,7 @@ await createApp();
// Run cleanup every hour
cron.schedule('0 * * * *', async () => {
const ctx = createAgentContext({ trigger: 'cron' });

await ctx.invoke(async () => {
await cleanupAgent.run({ task: 'expired-sessions' });
});
await ctx.run(cleanupAgent, { task: 'expired-sessions' });
});
```

@@ -58,35 +65,33 @@ For most scheduled tasks, use the [`cron()` middleware](/Routes/cron) instead. I

## Multiple Agents in Sequence

Run multiple agents within a single `invoke()` call to share the same session and tracing context:
Run multiple agents in sequence with the same context:

```typescript
const ctx = createAgentContext();

const result = await ctx.invoke(async () => {
// First agent analyzes the input
const analysis = await analyzeAgent.run({ text: userInput });

// Second agent generates response based on analysis
const response = await respondAgent.run({
analysis: analysis.summary,
sentiment: analysis.sentiment,
});
// First agent analyzes the input
const analysis = await ctx.run(analyzeAgent, { text: userInput });

return response;
// Second agent generates response based on analysis
const response = await ctx.run(respondAgent, {
analysis: analysis.summary,
sentiment: analysis.sentiment,
});
```

Each `ctx.run()` call shares the same session and tracing context.

## Reusing Contexts

Create a context once and reuse it for multiple invocations:

```typescript
const ctx = createAgentContext({ trigger: 'websocket' });

// Each invoke() gets its own session and tracing span
// Each run() gets its own session and tracing span
websocket.on('message', async (data) => {
const result = await ctx.invoke(() => messageAgent.run(data));
const result = await ctx.run(messageAgent, data);
websocket.send(result);
});
```
@@ -104,6 +109,28 @@ Standalone contexts provide the same infrastructure as HTTP request handlers:
- **Session events**: Start/complete events for observability
</Callout>

## Detecting Runtime Context

Use `isInsideAgentRuntime()` to check if code is running within the Agentuity runtime:

```typescript
import { isInsideAgentRuntime, createAgentContext } from '@agentuity/runtime';
import myAgent from '@agent/my-agent';

async function processRequest(data: unknown) {
if (isInsideAgentRuntime()) {
// Already in runtime context, call agent directly
return myAgent.run(data);
}

// Outside runtime, create context first
const ctx = createAgentContext();
return ctx.run(myAgent, data);
}
```

This is useful for writing utility functions that work both inside agent handlers and in standalone scripts.

## Next Steps

- [Calling Other Agents](/Agents/calling-other-agents): Agent-to-agent communication patterns
4 changes: 4 additions & 0 deletions content/Agents/streaming-responses.mdx
@@ -50,6 +50,10 @@ Streaming requires both: `schema.stream: true` in your agent (so the handler ret

Enable streaming by setting `stream: true` in your schema and returning a `textStream`:

<Callout type="info" title="AI SDK Integration">
The `textStream` from AI SDK's `streamText()` works directly with Agentuity's streaming middleware. Return it from your handler without additional processing.
</Callout>

```typescript
import { createAgent } from '@agentuity/runtime';
import { streamText } from 'ai';
13 changes: 13 additions & 0 deletions content/Agents/workbench.mdx
@@ -5,6 +5,19 @@ description: Use the built-in development UI to test agents, validate schemas, a

Workbench is a built-in UI for testing your agents during development. It automatically discovers your agents, displays their input/output schemas, and lets you execute them with real inputs.

## Why Workbench?

Testing agents isn't like testing traditional APIs. You need to validate input schemas, see how responses are formatted, test multi-turn conversations, and understand execution timing. Using `curl` or Postman means manually constructing JSON payloads and parsing responses.

Workbench understands your agents. It reads your schemas, generates test forms, maintains conversation threads, and shows execution metrics. When something goes wrong, you see exactly what the agent received and returned.

**Key capabilities:**

- **Schema-aware testing**: Input forms generated from your actual schemas
- **Thread persistence**: Test multi-turn conversations without manual state tracking
- **Execution metrics**: See token usage and response times for every request
- **Quick iteration**: Test prompts appear in the UI for one-click execution

## Enabling Workbench

Add a `workbench` section to your `agentuity.config.ts`:
8 changes: 8 additions & 0 deletions content/Get-Started/quickstart.mdx
@@ -212,6 +212,14 @@ After your first deployment, the App populates with:

## What's Next?

<Callout type="tip" title="AI-Assisted Development">
Install the [OpenCode plugin](/Reference/CLI/opencode-plugin) for AI-assisted agent development. Get help writing agents, debugging, and deploying directly from your editor:

```bash
agentuity ai opencode install
```
</Callout>

**Learn the concepts:**

- [Understanding How Agents Work](/Learn/Cookbook/Tutorials/understanding-agents): Tools, loops, and autonomous behavior