The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity Paper • 2506.06941 • Published Jun 7, 2025 • 17
The Smol Training Playbook 📚: The secrets to building world-class LLMs • 3.2k
Mirage Persistent Kernel: A Compiler and Runtime for Mega-Kernelizing Tensor Programs Paper • 2512.22219 • Published Dec 22, 2025 • 1
Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence Paper • 2604.24954 • Published Apr 27 • 25
FlashDecoding++: Faster Large Language Model Inference on GPUs Paper • 2311.01282 • Published Nov 2, 2023 • 38
Liger Kernel: Efficient Triton Kernels for LLM Training Paper • 2410.10989 • Published Oct 14, 2024 • 3
FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference Paper • 2505.22758 • Published May 28, 2025 • 1