TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation Paper • 2605.22355 • Published 3 days ago • 167
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos Paper • 2605.18233 • Published 6 days ago • 87
MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation Paper • 2512.18181 • Published 17 days ago • 85
Visually-Guided Policy Optimization for Multimodal Reasoning Paper • 2604.09349 • Published Apr 10 • 2
SpatialGenEval Collection [ICLR 2026] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models • 1 item • Updated 17 days ago
VGPO-RL Collection [ACL 2026] Visually-Guided Policy Optimization for Multimodal Reasoning • 3 items • Updated 17 days ago
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 187
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics Paper • 2604.17295 • Published Apr 19 • 85
Elucidating the SNR-t Bias of Diffusion Probabilistic Models Paper • 2604.16044 • Published Apr 17 • 74
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published Apr 9 • 291