ARES is an RL-first framework for training and evaluating LLM agents, especially coding agents.
It is a modern gym: the environment layer powering RL research.
ARES treats LLMRequests as observations and LLMResponses as actions within the environment, so you can focus on training just the LLM, not the code agent surrounding it. The interface is fully async and scales easily to hundreds or thousands of parallel environments; check out example 3 to run this yourself.
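The async request/response loop can be pictured with a small self-contained sketch. `StubEnv`, `TimeStep`, and `echo_agent` below are hypothetical stand-ins, not the ARES API; they only illustrate how a gym-style async interface lets many episodes run concurrently via `asyncio.gather`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TimeStep:
    observation: str  # stands in for an LLM request
    done: bool

    def last(self) -> bool:
        return self.done


class StubEnv:
    """Hypothetical stand-in for an environment; not the ARES API."""

    def __init__(self, steps: int):
        self.steps = steps
        self.t = 0

    async def reset(self) -> TimeStep:
        self.t = 0
        return TimeStep(observation="request-0", done=False)

    async def step(self, action: str) -> TimeStep:
        self.t += 1
        return TimeStep(observation=f"request-{self.t}", done=self.t >= self.steps)


async def echo_agent(observation: str) -> str:
    # A real agent would call an LLM here; this just echoes the request.
    return f"response-to-{observation}"


async def run_episode(env: StubEnv) -> int:
    ts = await env.reset()
    n = 0
    while not ts.last():
        action = await echo_agent(ts.observation)
        ts = await env.step(action)
        n += 1
    return n


async def main() -> list[int]:
    # Because every call is async, many environments can be driven
    # concurrently from a single event loop.
    envs = [StubEnv(steps=3) for _ in range(100)]
    return await asyncio.gather(*(run_episode(e) for e in envs))


lengths = asyncio.run(main())
print(lengths[:5])  # each stub episode runs for 3 steps
```

The design choice this sketch mirrors is that the environment, not the agent, owns the episode state: the agent is reduced to a pure observation-to-action function, which is what makes the LLM itself the trainable component.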
- Python >= 3.12
Install with uv:
```bash
uv add martian-ares
```
ARES comes packaged with useful presets for different code agent & environment configurations. List them with:
```bash
uv run python -c "import ares; print(ares.list_presets())"
```
You can get started with this minimal loop, which runs mini-swe-agent on SWE-bench Verified sequentially:
Note: to run this particular example you will need:
- Docker (with the daemon running)
- A Martian API key (see below)
```python
import asyncio

import ares
from ares import llms


async def main():
    # This requires `CHAT_COMPLETION_API_KEY` to be set with a Martian API key -- see below.
    agent = llms.ChatCompletionCompatibleLLMClient(model="openai/gpt-5-mini")
    async with ares.make("sbv-mswea") as env:
        ts = await env.reset()
        while not ts.last():
            action = await agent(ts.observation)  # observation = LLM request
            ts = await env.step(action)  # action = LLM response
            print(f"{action}\n{ts}")


if __name__ == "__main__":
    asyncio.run(main())
```

To run the example above you'll need a Martian API key set in your `.env` file. To get a key:
- Go to https://app.withmartian.com
- On the Billing tab, add a payment method and top up some credits.
- On the API Keys tab, create an API key.
- Write `CHAT_COMPLETION_API_KEY={your-key}` in your `.env`.
Alternatively, you can use another chat completions-compatible endpoint by setting both:
- `CHAT_COMPLETION_API_BASE_URL`
- `CHAT_COMPLETION_API_KEY`
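For example, a `.env` pointing at an OpenAI-compatible endpoint might look like this (the URL and key are placeholders; substitute your provider's values):

```
CHAT_COMPLETION_API_BASE_URL=https://example.com/v1
CHAT_COMPLETION_API_KEY=your-key-here
```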
