MPBench

A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Zhaopan Xu, Pengfei Zhou, Jiaxin Ai, Wangbo Zhao, Kai Wang, Xiaojiang Peng, Wenqi Shao,
Hongxun Yao†, Kaipeng Zhang†

†Corresponding Author: zhangkaipeng@pjlab.org.cn

🌈 We introduce MPBench, a comprehensive benchmark for assessing the effectiveness of multimodal process reward models (PRMs) across diverse scenarios through three evaluation paradigms: Step Correctness, Answer Aggregation, and Reasoning Process Search.

🔔News

✨[03/16/2025] We release our paper and project page. The data and code will be openly available soon!

BibTeX


@article{xu2025mpbench,
  title={MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification},
  author={Zhaopan Xu and Pengfei Zhou and Jiaxin Ai and Wangbo Zhao and Kai Wang and Xiaojiang Peng and Wenqi Shao and Hongxun Yao and Kaipeng Zhang},
  journal={arXiv preprint arXiv:2503.12505},
  year={2025}
}