Align Arena

Enter the Agent Alignment Arena :D

It's tricky to evaluate what your agents are up to, especially when you give them tools. We turn your data into agentic eval datasets and spin up dozens of containers to run agentic evals at scale.

Features

Lots of models!

We support all major labs, plus open source models through Hugging Face and Fireworks AI

Lots of tools!

We support web search, terminal sessions, filesystem access, and any MCP server with an SSE endpoint

Agent tool logs!

Observe what your agent is up to

Parallelize agent runs!

Run dozens of experiments in parallel

Advanced, comprehensive traces and scoring

View stats and scores on what your agent is good at

See Agent Arena in Action

Watch how easy it is to set up, run, and analyze AI agent evaluations at scale. From dataset upload to comprehensive results in minutes.

Agent Arena demo video: live demo, step-by-step walkthrough, real results

Ready to evaluate your agents?

Frequently Asked Questions

Everything you need to know about Align Arena

We support all major AI providers including OpenAI, Anthropic, and open source models through Hugging Face and Fireworks AI. You can easily switch between different models to compare their performance.

Tools and scoring are both customizable. You can integrate your own custom tools and define scoring criteria tailored to your specific evaluation needs.

You can upload your own evaluation datasets in various formats. Our platform is designed to handle custom datasets and can help transform your data into agentic evaluation scenarios.

We offer flexible pricing based on your evaluation needs and usage volume. For detailed pricing information and custom enterprise solutions, please contact us at michaelyu713705 at gmail dot com to discuss your specific requirements.