DexArt Benchmark Leaderboard
DexArt benchmarks emphasize articulated object manipulation and tool use.
15 models
| Rank | Model | Paper | Source | Mean Success | Date |
|---|---|---|---|---|---|
| 1 | DP4 | Spatial-Temporal Aware Visuomotor Diffusion Policy Learning | Original | 82.50 | 2025.07 |
| 2 | SAT | Structural Action Transformer for 3D Dexterous Manipulation | Original | 73.00 | 2026.03 |
| 3 | DP3 | 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations | Original | 68.50 | 2024.03 |
| 4 | ManiFlow | ManiFlow: A General Robot Manipulation Policy via Consistency Flow Training | Original | 63.20 | 2025.09 |
| 5 | BridgePolicy | Sample from What You See: Visuomotor Policy Learning via Diffusion Bridge with Observation-Embedded Stochastic Differential Equation | Original | 60.00 | 2025.12 |
| 6 | ManiCM | ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation | Reproduced by SDM | 56.80 | 2024.06 |
| 7 | SDM Policy | Score and Distribution Matching Policy: Advanced Accelerated Visuomotor Policies via Matched Distillation | Original | 56.00 | 2024.12 |
| 8 | UniAct | Universal Actions for Enhanced Embodied Foundation Models | Reproduced by SAT | 55.00 | 2025.01 |
| 9 | FreqPolicy | FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens | Original | 54.30 | 2025.06 |
| 10 | FlowPolicy | FlowPolicy: Enabling Fast and Robust 3D Flow-based Policy via Consistency Flow Matching for Robot Manipulation | Reproduced by BridgePolicy | 54.00 | 2024.12 |
| 11 | MambaPolicy | MambaPolicy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models | Original | 54.00 | 2024.09 |
| 12 | Flow Matching | Affordance-based Robot Manipulation with Flow Matching | Reproduced by ManiFlow | 53.30 | 2024.09 |
| 13 | HPT | Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers | Reproduced by SAT | 53.00 | 2024.09 |
| 14 | Diffusion Policy | Diffusion Policy: Visuomotor Policy Learning via Action Diffusion | Reproduced by DP3 | 49.00 | 2023.03 |
| 15 | VLA-OS(-A/I/H) | VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models | Original | 49.00 | 2025.06 |