SGI-Bench Leaderboard
Scientific General Intelligence of LLMs/vLLMs
ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research
Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision
Open, science-focused leaderboards benchmarking LLMs and VLMs
Submit and validate a ResearchClawBench task ZIP
Lightweight harness for tool-using LLM agents.