ARES is an RL-first framework for training and evaluating LLM agents, especially coding agents.
It is a modern gym: the environment layer powering RL research.
ARES treats LLMRequests as observations and LLMResponses as actions within the environment, so you can focus on training just the LLM, not the code agent surrounding it. The interface is fully async and scales easily to hundreds or thousands of parallel environments; check out example 3 to run this yourself.
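The async request/response loop can be pictured with a small self-contained sketch. `StubEnv`, `TimeStep`, and `echo_agent` below are hypothetical stand-ins, not the ARES API; they only illustrate how a gym-style async interface lets many episodes run concurrently via `asyncio.gather`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TimeStep:
    observation: str  # stands in for an LLM request
    done: bool

    def last(self) -> bool:
        return self.done


class StubEnv:
    """Hypothetical stand-in for an environment; not the ARES API."""

    def __init__(self, steps: int):
        self.steps = steps
        self.t = 0

    async def reset(self) -> TimeStep:
        self.t = 0
        return TimeStep(observation="request-0", done=False)

    async def step(self, action: str) -> TimeStep:
        self.t += 1
        return TimeStep(observation=f"request-{self.t}", done=self.t >= self.steps)


async def echo_agent(observation: str) -> str:
    # A real agent would call an LLM here; this just echoes the request.
    return f"response-to-{observation}"


async def run_episode(env: StubEnv) -> int:
    ts = await env.reset()
    n = 0
    while not ts.last():
        action = await echo_agent(ts.observation)
        ts = await env.step(action)
        n += 1
    return n


async def main() -> list[int]:
    # Because every call is async, many environments can be driven
    # concurrently from a single event loop.
    envs = [StubEnv(steps=3) for _ in range(100)]
    return await asyncio.gather(*(run_episode(e) for e in envs))


lengths = asyncio.run(main())
print(lengths[:5])  # each stub episode runs for 3 steps
```

The design choice this sketch mirrors is that the environment, not the agent, owns the episode state: the agent is reduced to a pure observation-to-action function, which is what makes the LLM itself the trainable component.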
- Python >= 3.12
Install with uv:
```bash
uv add martian-ares
```
ARES comes packaged with useful presets for different code agent & environment configurations. List them with:
```bash
uv run python -c "import ares; print(ares.list_presets())"
```
You can get started with this minimal loop, which runs mini-swe-agent on SWE-bench Verified sequentially:
Note: to run this particular example you will need:
- Docker (with the daemon running)
- A Martian API key (see below)
```python
import asyncio

import ares
from ares import llms


async def main():
    # This requires `CHAT_COMPLETION_API_KEY` to be set with a Martian API key -- see below.
    agent = llms.ChatCompletionCompatibleLLMClient(model="openai/gpt-5-mini")
    async with ares.make("sbv-mswea") as env:
        ts = await env.reset()
        while not ts.last():
            action = await agent(ts.observation)  # observation = LLM request
            ts = await env.step(action)  # action = LLM response
            print(f"{action}\n{ts}")


if __name__ == "__main__":
    asyncio.run(main())
```

To run the example above you'll need a Martian API key set in your `.env` file. To get a key:
- Go to https://app.withmartian.com
- On the Billing tab, add a payment method and top up some credits.
- On the API Keys tab, create an API key.
- Write `CHAT_COMPLETION_API_KEY={your-key}` in your `.env`.
Alternatively, you can use another chat completions-compatible endpoint by setting both:
- `CHAT_COMPLETION_API_BASE_URL`
- `CHAT_COMPLETION_API_KEY`
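For example, a `.env` pointing at an OpenAI-compatible endpoint might look like this (the URL and key are placeholders; substitute your provider's values):

```
CHAT_COMPLETION_API_BASE_URL=https://example.com/v1
CHAT_COMPLETION_API_KEY=your-key-here
```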
