Methodology

Data Collection, Ranking Rules & Model Classification

Data Sources

All benchmark results are collected from published papers and official repositories. We do not re-run experiments.

Ranking Rules

Models are ranked by their primary metric on each benchmark. LIBERO, Meta-World, RoboCasa365, and RoboCasa-GR1-Tabletop use Average Success Rate. CALVIN uses Average Length (Avg. Len.) on the ABC→D setting, RoboChallenge uses Score, and RoboTwin 2.0 uses Hard Success Rate.
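The ranking rule above can be sketched in a few lines of Python. This is only an illustration, not the leaderboard's actual code: the `rank_models` function and the `model` / `metrics` record keys are hypothetical names, and the sketch assumes that a higher value is better for every primary metric and that ties keep their input order.

```python
# Primary metric per benchmark, as stated in the ranking rules.
PRIMARY_METRIC = {
    "LIBERO": "Average Success Rate",
    "Meta-World": "Average Success Rate",
    "RoboCasa365": "Average Success Rate",
    "RoboCasa-GR1-Tabletop": "Average Success Rate",
    "CALVIN": "Average Length",  # reported on the ABC→D setting
    "RoboChallenge": "Score",
    "RoboTwin 2.0": "Hard Success Rate",
}


def rank_models(benchmark, results):
    """Rank reported results on one benchmark by its primary metric.

    `results` is a list of dicts with the hypothetical keys
    'model' (str) and 'metrics' (dict mapping metric name to value).
    Entries that do not report the primary metric are skipped.
    """
    metric = PRIMARY_METRIC[benchmark]
    scored = [r for r in results if metric in r["metrics"]]
    # Assumption: higher is better for all primary metrics listed above.
    ordered = sorted(scored, key=lambda r: r["metrics"][metric], reverse=True)
    return [(rank, r["model"], r["metrics"][metric])
            for rank, r in enumerate(ordered, start=1)]
```

Each benchmark is ranked separately, so a model's rank on one benchmark carries no meaning on another.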

Known Limitations

Results on different benchmarks are not directly comparable. Because all numbers are taken as reported, papers may also differ slightly in their evaluation protocols.

Model Classification

We classify models into two categories based on their open-source status:

Open-Source Models (default)

Models with publicly available code, marked with an "Open Source" badge. These models provide the highest level of reproducibility and transparency.

Other Models (optional)

Models without the "Open Source" badge include: (1) models whose code repository we could not find, and (2) models whose code was still marked "Coming Soon" at the data collection deadline. These models are hidden by default but can be shown using the "Include All Models" toggle.
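The default visibility rule can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the site's implementation: `visible_models`, the `open_source` field, and the `include_all` flag are hypothetical names standing in for the "Open Source" badge and the "Include All Models" toggle.

```python
def visible_models(models, include_all=False):
    """Return the models shown on the leaderboard.

    `models` is a list of dicts with a hypothetical boolean key
    'open_source' (True when the model carries the "Open Source" badge).
    `include_all` mirrors the "Include All Models" toggle.
    """
    if include_all:
        return list(models)
    # Default view: only models with publicly available code.
    return [m for m in models if m.get("open_source", False)]
```

For example, `visible_models(models)` returns only badged models, while `visible_models(models, include_all=True)` returns every entry.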

Data Notice

• If you find any errors or omissions, please let us know by creating an issue on GitHub, emailing business@evomind-tech.com, or joining our WeChat group.

Disclaimer

Cross-benchmark comparisons should be avoided. Each benchmark has its own evaluation protocol and metrics.

⭐ Support This Project

If you find this leaderboard helpful for your research, please consider giving us a star on GitHub!


Contact Us

Found an error or want to submit your model? Reach out via a GitHub issue, by email at business@evomind-tech.com, or through our WeChat group.

Supported Benchmarks

LIBERO

Meta-World

CALVIN

RoboChallenge

RoboTwin 2.0

LIBERO Plus

RoboCasa-GR1-Tabletop

RoboCasa365
