MentalBench: A Benchmark for Evaluating Psychiatric Diagnostic Capability of Large Language Models • Paper • 2602.12871 • Published Feb 13 • 18
MolmoAct2: Action Reasoning Models for Real-world Deployment • Paper • 2605.02881 • Published May 4 • 348