# AI Generate

## Overview

The AI Generate node simplifies AI implementation in your Flows by providing a universal interface to multiple AI service providers. When paired with [AI Routes](/documentation-and-resources/components-and-data/ai-routes.md) and [AI Connections](/documentation-and-resources/components-and-data/connections/types-of-connections/ai-connections.md), it handles the complexity of provider-specific APIs, authentication, and failover scenarios automatically.

### What It Does

* Calls AI providers using a single, standardized input format regardless of provider
* Returns responses in a consistent, structured format with rich details
* Automatically handles provider failover through [AI Routes](/documentation-and-resources/components-and-data/ai-routes.md)
* Eliminates the need to refactor flows when switching providers or models

### Benefits

* **Provider-agnostic** - Switch providers or models without refactoring your flows
* **Automatic failover** - If one provider fails, [AI Routes](/documentation-and-resources/components-and-data/ai-routes.md) automatically try the next configured provider
* **Consistent interface** - Same input/output format across all providers
* **Rich response details** - Access token usage, model information, and execution details
* **Production-ready** - Works seamlessly with versioned [AI Routes](/documentation-and-resources/components-and-data/ai-routes.md) packaged in Services

### Prerequisites

Before using the AI Generate node, you'll need:

1. **AI Connection(s)** - [AI Connections](/documentation-and-resources/components-and-data/connections/types-of-connections/ai-connections.md) to one or more AI providers (OpenAI, Anthropic, Azure OpenAI, Google AI, Vertex AI, Vertex AI Anthropic)
2. **AI Route** - An [AI Route](/documentation-and-resources/components-and-data/ai-routes.md) that references your AI Connection(s) and specifies which model(s) to use

### Basic Setup

1. Drag the AI Generate node from the **AI Gateway** category in the Flow Editor palette onto your canvas
2. Double-click the node to open the configuration panel
3. **Select an AI Route**:
   * Choose from the dropdown of AI Routes created in your tenant
   * Or dynamically specify using a msg property, environment variable, or static string
4. **Specify the Input** - Designate the msg object property that contains your input data (prompt, temperature, file data, expected schema, etc.)
5. **Optional**: Configure tools (selected tool nodes, max steps, dynamic tool filtering), the timeout, and the Output and Response property names

## Tool Calling with AI Generate

AI Generate can invoke tools defined in the same flow. Tools let the model call specific functions (API calls, lookups, transformations) and then continue generation using the tool output.

### Quick Setup

1. From the Flow Editor left palette, add **Tool** and **Tool Response** nodes from the **AI Gateway** category.
2. Configure each Tool node with a unique **Tool ID** and an **Input Schema**.
3. Build your tool logic downstream of each Tool node, then end the path with **Tool Response** to return output.
4. In **AI Generate**, open **Tools** and choose **Selected Tool Nodes**, then select the Tool nodes you want available to the model.

### Configuration Details

* **Tools -> Selected Tool Nodes** - Pick the Tool nodes to expose to the model. Tool IDs must be unique.
* **Max steps** - Caps the number of tool calls in a single request, so a tool chain does not run indefinitely.
* **Enable dynamic tool filtering** - Provide a msg property that contains a list of Tool IDs and choose **Include only** or **Exclude** to control which tools are available per request.
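
As a rough illustration of dynamic tool filtering, a Function node placed before AI Generate could build the list of Tool IDs for each request. The property name `msg.allowedTools`, the `userRole` field, and the Tool IDs below are all hypothetical, and the sketch assumes the list is a plain array of strings; use whichever msg property you configured in the node.

```javascript
// In a Function node, `msg` is supplied by the runtime. It is declared here
// only so the sketch runs on its own.
const msg = { userRole: "support" };

// Hypothetical Tool IDs; they must match the Tool ID set on each Tool node.
const toolsByRole = {
  support: ["lookup_order", "search_kb"],
  admin: ["lookup_order", "search_kb", "issue_refund"]
};

// Hypothetical property name. Point "Enable dynamic tool filtering" at it and
// choose "Include only" so that just these tools are offered to the model.
msg.allowedTools = toolsByRole[msg.userRole] || [];
```

With **Exclude** selected instead, the same property would list the tools to withhold for that request.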

For detailed Tool configuration, see [AI Tool](/documentation-and-resources/components-and-data/flows/node-reference/ai-gateway/ai-tool.md) and [AI Tool Response](/documentation-and-resources/components-and-data/flows/node-reference/ai-gateway/ai-tool-response.md).

### Execution Behavior

When a provider returns multiple tool calls in one model turn, AI Generate executes them in parallel by default.

For providers that support overrides, you can pass provider-specific options in the request (for example, `providerOptions.openai.parallelToolCalls: false`) to force one tool call per turn.

* Tool call events are dispatched in the same order returned by the provider, without waiting for earlier tool calls to finish.
* AI Generate waits until all pending tool calls are resolved before replying to the model for the next turn.
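
The provider override mentioned above might look like the following in the input object. This is a sketch that assumes `providerOptions` sits at the top level of the input, alongside `prompt`; the prompt text is illustrative.

```javascript
// In a Function node, `msg` is supplied by the runtime. It is declared here
// only so the sketch runs on its own.
const msg = {};

msg.payload = {
  prompt: "Look up the order, then check its shipping status.",
  // Assumed placement: a top-level key of the input object.
  providerOptions: {
    openai: {
      // Ask the OpenAI provider for at most one tool call per model turn.
      parallelToolCalls: false
    }
  }
};
```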

## Input Format

The AI Generate node uses **one universal, standardized input format** regardless of which AI provider your AI Route calls. This standardization means changing providers or models requires **little to no refactoring** of your surrounding flow logic.

Your input msg object might contain:

* One of the following is required:
  * `prompt` - Simple text prompt (use this OR the `messages` array)
  * `messages` - Array of messages (use this OR `prompt`). Required when passing `base64` file content.
* `schema` - JSON Schema for structured output. Providing `schema` selects structured object generation; if `output` is omitted, it defaults to `"object"`. When `output` is `"object"` or `"array"`, `schema` is required.
* `system` - (optional) System instructions that guide the model's behavior
* `temperature` - (optional) Controls randomness in responses
* `maxTokens` - Maximum number of tokens to generate
* `topP` - Nucleus sampling parameter
* `topK` - Top-K sampling parameter
* `frequencyPenalty` - Penalty for repeated tokens
* `presencePenalty` - Penalty for repeated topics
* `stopSequences` - Array of sequences that stop generation
* `seed` - Random seed for deterministic output
* Other provider parameters as needed
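
The rules in the list above can be checked before the message reaches AI Generate. The helper below is a hypothetical pre-flight check, not part of the node. It reads "use this OR" as "exactly one of `prompt` or `messages`", which is an interpretation of the list rather than documented validation behavior.

```javascript
// Hypothetical helper: returns a list of problems found in an input object.
function checkInput(input) {
  const problems = [];
  const hasPrompt = typeof input.prompt === "string" && input.prompt.length > 0;
  const hasMessages = Array.isArray(input.messages) && input.messages.length > 0;

  // Assumed rule: exactly one of `prompt` or `messages` is present.
  if (hasPrompt === hasMessages) {
    problems.push("Provide either `prompt` or `messages`.");
  }
  // `schema` is required when `output` is "object" or "array".
  if ((input.output === "object" || input.output === "array") && !input.schema) {
    problems.push("`schema` is required when `output` is set.");
  }
  return problems;
}

// In a Function node, `msg` is supplied by the runtime.
const msg = {};
msg.payload = {
  prompt: "List three primary colors, one per line.",
  temperature: 0.2,
  maxTokens: 60,
  stopSequences: ["\n\n"], // stop at the first blank line
  seed: 42
};

const inputProblems = checkInput(msg.payload); // empty array for this input
```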

### Retry Behavior

The following retry and failover behavior applies to both text and object generation:

* Malformed responses, internal server errors (`500`), too many requests (`429`), and similar transient failures are treated as retryable errors.
* Unknown errors, bad requests (`400`), and auth issues (`401`/`403`) skip straight to the next provider without same-provider retries.
* Retries happen on the same provider first, based on the `maxRetries` setting on your AI Route.
* If retries are exhausted and fallback providers are configured, the AI Route cascades to the next provider.
* Same-provider retries use exponential backoff with random jitter. Falling back to a new provider happens with no delay.
* Failed attempts are still captured in `providerErrors` and `steps` in the final output.
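
The retry and failover rules above can be pictured with a small simulation. This is only a sketch of the described behavior, not the AI Route implementation: the function names, the 250 ms base delay, the provider objects, and the shape of each `providerErrors` entry are assumptions. Only status codes are classified here, although malformed responses are also retryable.

```javascript
// Status codes retried on the same provider, per the list above.
const RETRYABLE = new Set([429, 500]);

// Exponential backoff with random jitter. The base delay is an assumed value.
function backoffDelay(attempt, baseMs = 250) {
  return baseMs * 2 ** attempt + Math.random() * baseMs;
}

// providers: array of { id, call }, where call() resolves with a result or
// rejects with an Error carrying a `status`. `sleep` is injectable so the
// sketch can run without real delays.
async function generateWithFailover(providers, maxRetries, sleep) {
  const providerErrors = [];
  for (const provider of providers) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        const result = await provider.call();
        return { result, providerId: provider.id, providerErrors };
      } catch (err) {
        providerErrors.push({ providerId: provider.id, status: err.status });
        // 400, 401, 403, and unknown errors skip to the next provider.
        if (!RETRYABLE.has(err.status)) break;
        // Back off only when another same-provider attempt remains.
        if (attempt < maxRetries) await sleep(backoffDelay(attempt));
      }
    }
    // Falling through moves on to the next provider with no delay.
  }
  const failure = new Error("All providers failed");
  failure.providerErrors = providerErrors;
  throw failure;
}

// Usage: the primary always returns 429, so the fallback answers after
// three failed attempts (one initial call plus two retries).
const demoProviders = [
  {
    id: "primary",
    call: async () => {
      throw Object.assign(new Error("rate limited"), { status: 429 });
    }
  },
  { id: "fallback", call: async () => "ok" }
];
const noSleep = async () => {};

generateWithFailover(demoProviders, 2, noSleep).then((out) => {
  console.log(out.providerId, out.providerErrors.length); // logs: fallback 3
});
```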

#### Object Generation Special Case

In object generation mode, the model response must be valid JSON and match your schema. Invalid JSON (or output that fails schema validation) is treated as a retryable generation error.
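
To make the two failure cases concrete, the sketch below parses a raw response and then checks it against a schema. It is a deliberately minimal stand-in for schema validation: the helper name is hypothetical, and it looks only at `required` keys and at `string`, `number`, and `boolean` property types. A full JSON Schema validator checks far more.

```javascript
// Types this sketch can compare directly with `typeof`.
const SIMPLE_TYPES = new Set(["string", "number", "boolean"]);

// Hypothetical helper: returns { ok: true, value } or { ok: false, reason }.
function checkObjectResponse(rawText, schema) {
  let value;
  try {
    value = JSON.parse(rawText);
  } catch (err) {
    return { ok: false, reason: "invalid JSON" };
  }
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return { ok: false, reason: "not an object" };
  }
  for (const key of schema.required || []) {
    if (!(key in value)) return { ok: false, reason: `missing ${key}` };
  }
  for (const [key, rule] of Object.entries(schema.properties || {})) {
    if (key in value && SIMPLE_TYPES.has(rule.type) && typeof value[key] !== rule.type) {
      return { ok: false, reason: `wrong type for ${key}` };
    }
  }
  return { ok: true, value };
}

const personSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age: { type: "number" },
    company: { type: "string" }
  },
  required: ["name", "age", "company"]
};

// A truncated response fails to parse, so it would count as retryable.
const truncated = checkObjectResponse('{"name": "Sarah"', personSchema);
// A parseable response with a string where a number is expected also fails.
const wrongType = checkObjectResponse(
  '{"name": "Sarah", "age": "32", "company": "TechCorp"}',
  personSchema
);
```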

### Input Examples - Text Generation

**Example: Simple Prompt**

```javascript
msg.payload = {
  prompt: "What is 2+2? Answer in one sentence.",
  temperature: 0.3,
  maxTokens: 50
};
```

**Example: Messages Array**

```javascript
msg.payload = {
  messages: [
    {
      role: "user",
      content: "Explain photosynthesis in two sentences."
    }
  ],
  temperature: 0.5,
  maxTokens: 100
};
```

**Example: With System Message**

```javascript
msg.payload = {
  system: "You are a concise math tutor. Always show your work.",
  prompt: "What is 15 * 23?",
  temperature: 0.3,
  maxTokens: 100
};
```

**Example: Multi-Turn Conversation**

```javascript
msg.payload = {
  messages: [
    {
      role: "user",
      content: "What's the capital of France?"
    },
    {
      role: "assistant",
      content: "The capital of France is Paris."
    },
    {
      role: "user",
      content: "What's its population?"
    }
  ],
  temperature: 0.3,
  maxTokens: 100
};
```

### Input Examples - Object Generation

**Schema Configuration**:

* `output` - Type of output: `"object"` (default) or `"array"`. If omitted while `schema` is present, output defaults to `"object"`.
* `mode` - Generation mode: `"auto"` (default), `"json"`, or `"tool"`
* `schemaName` - Optional name for the schema
* `schemaDescription` - Optional description of what the schema represents
* `schema` - Required JSON Schema describing the object shape to generate

**Optional Configuration**: Common fields include `system`, `temperature`, `maxTokens`, `topP`, `topK`, `frequencyPenalty`, `presencePenalty`, and `seed`.

**Example: Simple Object Extraction**

```javascript
msg.payload = {
  prompt: "Extract: Sarah Martinez, age 32, works as Senior Software Engineer at TechCorp.",
  schema: {
    type: "object",
    properties: {
      name: { type: "string", description: "Full name" },
      age: { type: "number", description: "Age in years" },
      jobTitle: { type: "string", description: "Job title" },
      company: { type: "string", description: "Employer" }
    },
    required: ["name", "age", "company"]
  },
  temperature: 0.3
};
```

**Example: Array Generation**

```javascript
msg.payload = {
  prompt: "Generate 5 product ideas for smart home devices",
  output: "array",
  schema: {
    type: "object",
    properties: {
      name: { type: "string", description: "Product name" },
      description: { type: "string" },
      targetPrice: { type: "number", description: "Price in USD" }
    },
    required: ["name", "description", "targetPrice"]
  },
  temperature: 0.8
};
```

**Example: File/PDF Input**

```javascript
msg.payload = {
  system: "Extract key information from this invoice PDF",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "file",
          filename: "invoice-2025-001.pdf",
          mediaType: "application/pdf",
          data: msg.pdfData  // base64 encoded PDF
        }
      ]
    }
  ],
  schema: {
    type: "object",
    properties: {
      invoiceNumber: { type: "string" },
      invoiceDate: { type: "string", description: "YYYY-MM-DD" },
      vendorName: { type: "string" },
      total: { type: "number" }
    },
    required: ["invoiceNumber", "vendorName", "total"]
  }
};
```
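
`msg.pdfData` in the example above is expected to hold base64 text. One way to produce it in a Function node is sketched below. It assumes an upstream node left the raw file bytes as a Node.js `Buffer` in `msg.payload`, which is an assumption about your flow rather than something AI Generate requires.

```javascript
// Stand-in for a message whose payload holds raw PDF bytes as a Buffer.
// In a Function node, `msg` is supplied by the runtime.
const msg = { payload: Buffer.from("%PDF-1.4 sample bytes") };

// Buffer#toString("base64") encodes the bytes as base64 text.
msg.pdfData = Buffer.isBuffer(msg.payload)
  ? msg.payload.toString("base64")
  : String(msg.payload); // assume the payload is already base64 text
```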

## Output Format

The node writes the generation result, a standardized response with supporting details, to `msg[outputProperty]` (default: `msg.payload`). As with the input format, this standardization means changing providers or models requires **little to no refactoring** of your surrounding flow logic.

### Text Generation Output

When generating text, the output includes the generated text along with metadata about the generation process.

**Main Fields**:

* `generationType` - Always `"text"` for text generation
* `text` - The generated text response
* `reasoning` - Internal reasoning content (if available from reasoning models)
* `messages` - Array containing the response message

**Generation Metadata**:

* `model` - The model that generated the response (e.g., `"gemini-2.5-flash"`)
* `providerType` - The AI provider (`"openai"`, `"anthropic"`, `"vertex"`, etc.)
* `providerId` - ID of the provider connection used
* `apiId` - API connection ID used for the successful generation
* `finishReason` - Why generation stopped (`"stop"`, `"length"`, etc.)
* `steps` - Detailed generation steps (including request/response metadata when available)
* `providerErrors` - Array of errors from failed provider attempts

Step-level warnings are available in `steps[n].warnings` when provided by the model provider.

**Token Usage**:

* `usage.inputTokens` - Tokens in the input
* `usage.outputTokens` - Tokens in the output
* `usage.totalTokens` - Total tokens used
* `usage.reasoningTokens` - Reasoning tokens (for models like GPT-5, Gemini 2.5)
* `usage.cachedInputTokens` - Cached prompt tokens (if applicable)

**Example Output**

```javascript
// msg.payload after node execution
{
  generationType: "text",
  text: "Two plus two equals four.",
  messages: [
    {
      role: "assistant",
      content: "Two plus two equals four."
    }
  ],
  usage: {
    inputTokens: 12,
    outputTokens: 6,
    totalTokens: 18
  },
  model: "gemini-2.5-flash",
  providerType: "vertex",
  finishReason: "stop",
  providerId: "google-vertex-ai",
  apiId: "google-vertex-ai",
  providerErrors: [],
  steps: [...]
}
```
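
A downstream Function node might read these fields as follows. This is a sketch: `msg.summary` is a hypothetical property, and it assumes a `finishReason` of `"length"` means the output hit the token limit.

```javascript
// Stand-in for msg after AI Generate ran; values are taken from the example
// output above. In a Function node, `msg` is supplied by the runtime.
const msg = {
  payload: {
    generationType: "text",
    text: "Two plus two equals four.",
    usage: { inputTokens: 12, outputTokens: 6, totalTokens: 18 },
    finishReason: "stop",
    providerId: "google-vertex-ai",
    providerErrors: []
  }
};

const out = msg.payload;

// Hypothetical summary object for later nodes in the flow.
msg.summary = {
  answer: out.text,
  // Assumption: "length" indicates the response was cut off at maxTokens.
  truncated: out.finishReason === "length",
  // True when any attempt failed before the successful one.
  hadFailedAttempts: out.providerErrors.length > 0,
  totalTokens: out.usage.totalTokens
};
```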

### Object Generation Output

When generating structured objects, the output includes the parsed object or array along with the same metadata as text generation.

If an earlier attempt failed (for example, invalid JSON) and a later retry succeeded, the final output still returns the successful `object`, while failure details remain available in `providerErrors` and `steps`.

**Main Fields**:

* `generationType` - Always `"object"` for object generation
* `object` - The generated structured data (object or array)
* `reasoning` - Internal reasoning content (if available from reasoning models)

**Generation Metadata**: Same fields as text generation (`model`, `providerType`, `providerId`, `apiId`, `finishReason`, `steps`, `providerErrors`)

**Token Usage**: Same fields as text generation (`inputTokens`, `outputTokens`, `totalTokens`, `reasoningTokens`, `cachedInputTokens`)

**Example Output**

```javascript
// msg.payload after node execution
{
  generationType: "object",
  object: {
    name: "Sarah Martinez",
    age: 32,
    jobTitle: "Senior Software Engineer",
    company: "TechCorp"
  },
  usage: {
    inputTokens: 81,
    outputTokens: 50,
    totalTokens: 131
  },
  model: "gemini-2.5-flash",
  providerType: "vertex",
  finishReason: "stop",
  providerId: "google-vertex-ai",
  apiId: "google-vertex-ai",
  providerErrors: [],
  steps: [...]
}
```
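
Because text and object results share the same envelope, a flow that handles both can branch on `generationType`. The helper and the `msg.greeting` property below are hypothetical; the payload values come from the example output above.

```javascript
// Hypothetical helper: return the main result for either generation type.
function extractResult(payload) {
  if (payload.generationType === "object") return payload.object;
  if (payload.generationType === "text") return payload.text;
  throw new Error(`Unexpected generationType: ${payload.generationType}`);
}

// Stand-in for msg after AI Generate ran in object mode.
// In a Function node, `msg` is supplied by the runtime.
const msg = {
  payload: {
    generationType: "object",
    object: {
      name: "Sarah Martinez",
      age: 32,
      jobTitle: "Senior Software Engineer",
      company: "TechCorp"
    },
    providerErrors: []
  }
};

const person = extractResult(msg.payload);
msg.greeting = `${person.name} works at ${person.company}.`;
```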

#### Response Metadata

The node writes response metadata to `msg[responseProperty]` (default: `msg._response`).

AI Generate uses a WebSocket connection to the AI Gateway, so this metadata is a normalized success marker rather than the full upstream HTTP response.

**Fields**:

* `statusCode` - `200` on successful generation
* `headers` - A minimal header map (currently includes `content-type`)

**Example**

```javascript
// msg._response after node execution
{
  statusCode: 200,
  headers: {
    "content-type": "application/json"
  }
}
```

