Efficient Training on Multiple Consumer GPUs with RoundPipe Paper • 2604.27085 • Published 6 days ago • 35
PhyCo: Learning Controllable Physical Priors for Generative Motion Paper • 2604.28169 • Published 5 days ago • 12
Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling Paper • 2604.27039 • Published 6 days ago • 20
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows Paper • 2604.28139 • Published 5 days ago • 33
FAMA: Failure-Aware Meta-Agentic Framework for Open-Source LLMs in Interactive Tool Use Environments Paper • 2604.25135 • Published 7 days ago • 8
RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments Paper • 2604.26067 • Published 7 days ago • 70
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 5 days ago • 84
DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios Paper • 2604.25914 • Published 7 days ago • 40
ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published 6 days ago • 47
Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation Paper • 2604.25819 • Published 7 days ago • 16
Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora Paper • 2604.24819 • Published 8 days ago • 85
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing Paper • 2604.22782 • Published Apr 3 • 7
ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents Paper • 2604.23781 • Published 9 days ago • 32
Stabilizing Efficient Reasoning with Step-Level Advantage Selection Paper • 2604.24003 • Published 8 days ago • 7
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis Paper • 2604.24198 • Published 8 days ago • 20
SketchVLM: Vision language models can annotate images to explain thoughts and guide users Paper • 2604.22875 • Published 12 days ago • 33
ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning Paper • 2604.24300 • Published 8 days ago • 64
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation Paper • 2604.24763 • Published 8 days ago • 68