Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games Paper • 2606.19338 • Published 3 days ago • 42
EfficientRollout: System-Aware Self-Speculative Decoding for RL Rollouts Paper • 2606.18967 • Published 3 days ago • 20
The Reward Was in Your Data All Along: Correcting Flow Matching with Discriminator-Guided RL Paper • 2606.19162 • Published 3 days ago • 17
Learning from the Self-future: On-policy Self-distillation for dLLMs Paper • 2606.18195 • Published 4 days ago • 70
TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs Paper • 2606.09030 • Published 12 days ago • 28
OPD-Evolver: Cultivating Holistic Agent Evolver via On-Policy Distillation Paper • 2606.17628 • Published 4 days ago • 26
GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine? Paper • 2606.17861 • Published 4 days ago • 44
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 4 days ago • 138
Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 4 days ago • 49
DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF Image-Text-to-Text • 39B • Updated 8 days ago • 589k • 401
Article Party is over: regularizing ColBERT models to fix efficient ANN methods lightonai • 3 days ago • 19
FastContext: Training Efficient Repository Explorer for Coding Agents Paper • 2606.14066 • Published 8 days ago • 83
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models Paper • 2606.16140 • Published 5 days ago • 97
Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories Paper • 2606.11176 • Published 11 days ago • 118
Memory is Reconstructed, Not Retrieved: Graph Memory for LLM Agents Paper • 2606.06036 • Published 16 days ago • 67