# Scale and Performance

Contextual offers five Agent sizes, each with its own CPU and memory allocation, so you can match resources to the needs of your AI solution logic.

A low-traffic HTTP endpoint may run fine on a small Agent, while a machine learning model might require an XL.

On paid plans, you can also configure scaling so that an Agent runs multiple instances to handle more load. Scaling can be triggered by CPU utilization (for HTTP or Event-Based Agents) or by Message Lag (for Event-Based Agents only).

<figure><img src="/files/fYHTjvAQoLNbsnT8D7Gr" alt=""><figcaption></figcaption></figure>
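To build intuition for how metric-based scaling typically behaves, the sketch below applies a generic proportional rule: the instance count grows or shrinks by the ratio of the observed metric to its target, clamped to configured limits. This is an illustration only, not Contextual's actual scaling algorithm. The function name, parameters, and default limits are hypothetical, and the same rule is applied to both CPU utilization and Message Lag purely for demonstration.

```python
import math


def desired_instances(current, metric, target, min_instances=1, max_instances=5):
    """Illustrative proportional scaling rule (hypothetical, not Contextual's API).

    current: number of instances running now
    metric:  observed average value per instance (e.g. CPU % or message lag)
    target:  value per instance that the scaler tries to maintain
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    # Scale the instance count by how far the metric is from its target.
    raw = math.ceil(current * metric / target)
    # Clamp to the configured limits (assumed defaults, for illustration).
    return max(min_instances, min(max_instances, raw))


# CPU-based: 2 instances averaging 90% CPU against a 60% target.
print(desired_instances(2, 90, 60))      # 3

# Lag-based: 1 instance with a lag of 5000 messages against a target of 1000.
print(desired_instances(1, 5000, 1000))  # 5
```

Under this rule, a metric above its target adds instances and a metric below it removes them, which is why a well-chosen target matters as much as the Agent size.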

Per-hour pricing and detailed specifications for each Agent size are available on our [pricing page](https://www.contextual.io/pricing).

![](/files/vYieCy99xLLXnhRg9UHi)


