- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  Paper • 2502.02737 • Published • 261
- A Survey of Context Engineering for Large Language Models
  Paper • 2507.13334 • Published • 264
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 452
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 305
Subhendu
subhrm
AI & ML interests
LLM, Computer Vision
Recent Activity
upvoted a paper about 1 month ago
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs
upvoted a paper about 1 month ago
Qwen-Image-2.0 Technical Report
liked a dataset about 2 months ago
open-index/hacker-news
Organizations
None yet