All HF Hub posts

danielhanchen posted an update about 22 hours ago

We’re excited to announce that Unsloth has joined the PyTorch Ecosystem! 🔥🦥

Unsloth is an open-source project that makes training and running models faster and more accurate while using less compute. Our mission is to make local AI accessible to everyone. Thanks to all of you for making this possible! 💕

Blog: https://unsloth.ai/blog/pytorch
GitHub: https://github.com/unslothai/unsloth

spillai posted an update 1 day ago

mm-ctx – fast, multimodal context for agents.

LLM-based agents handle text incredibly well, but images, videos, and PDFs with visual content are hard for them to interpret. mm-ctx gives your CLI agent multimodal skills.

Try it interactively in Spaces: vlm-run/mm-ctx

Readme: https://vlm-run.github.io/mm/
PyPI: https://pypi.org/project/mm-ctx
SKILL.md: https://github.com/vlm-run/skills/blob/main/skills/mm-cli-skill/SKILL.md

mm-ctx is meant to feel familiar: the UNIX tools we already love (find/cat/grep/wc), rebuilt for file types LLMs can't read natively and designed to work with agents via the CLI.
- mm grep "invoice #1234" ~/Downloads searches across PDFs and returns line-numbered matches
- mm cat <document>.pdf returns a metadata description of the file
- mm cat <photo>.jpg returns a caption of the photo
- mm cat <video>.mp4 returns a caption of the video
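
If your agent or script would rather drive the CLI from Python, here's a minimal sketch (my own illustration, assuming `mm` is on PATH after `pip install mm-ctx`; the file paths are placeholders):

```python
# Illustrative sketch: drive the mm CLI shown above from Python.
# Assumes `mm` is on PATH after `pip install mm-ctx`; paths are placeholders.
import os
import subprocess

def mm(*args: str) -> str:
    """Run an mm subcommand and return its stdout."""
    result = subprocess.run(
        ["mm", *args], capture_output=True, text=True, check=True
    )
    return result.stdout

# Search PDFs for a string; returns line-numbered matches.
print(mm("grep", "invoice #1234", os.path.expanduser("~/Downloads")))

# Caption an image or video, or describe a document.
print(mm("cat", "photo.jpg"))
```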

A few things we obsessed over:
⚡ Speed: Rust core for the hot paths
🏠 Local-first, BYO model: uses any OpenAI-compatible endpoint (Ollama, vLLM/SGLang, LM Studio) with any multimodal LLM (Gemma4, Qwen3.5, GLM-4.6V).
🔗 Composable: stdin + structured outputs
🤖 Drops into any agent via mm-cli-skills: Claude Code, Codex, Gemini CLI, OpenClaw.

We’d love to hear your feedback, especially on the CLI and on which file types and workflows you’d like to see next.

qgallouedec posted an update 4 days ago

Shipped hf-sandbox! 🥡

🧪 Running an eval that executes model-generated C on a few thousand prompts? You probably don't want any of that on your laptop.
Just shipped hf-sandbox, a Modal-style sandbox API on top of Hugging Face Jobs. Spin up an isolated, ephemeral container, run untrusted code, get the result back. No Docker on your laptop, no infra to manage.

Just pip install hf-sandbox.
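
For a feel of the Modal-style shape, here's a hypothetical sketch; `Sandbox` and `run` are illustrative names I'm assuming, not necessarily hf-sandbox's documented API (see the repo below for the real one):

```python
# Hypothetical sketch only: `Sandbox` and `run` are illustrative names,
# not necessarily hf-sandbox's documented API.
from hf_sandbox import Sandbox  # assumed import path

# Spin up an isolated, ephemeral container on Hugging Face Jobs,
# run untrusted code inside it, and get the result back locally.
with Sandbox() as sandbox:
    result = sandbox.run("python -c 'print(2 + 2)'")
    print(result.stdout)
```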

Early days (v0.1); feedback and issues very welcome:
👉 https://github.com/huggingface/hf-sandbox

Imosu posted an update 1 day ago

# ZeroGPU Hardware Mismatch: Why Am I Getting RTX PRO 6000 Blackwell MIG Instead of the Documented H200?

I recently ran into a surprising issue while debugging a Hugging Face ZeroGPU Space.

According to the Hugging Face ZeroGPU documentation, ZeroGPU is described as using NVIDIA H200-based resources, with configurations such as “large” and “xlarge” offering H200-class memory. However, when I printed the actual GPU information inside my Space, I got something different:

```txt
GPU: NVIDIA RTX PRO 6000 Blackwell Server Edition MIG 2g.48gb
Capability: (12, 0)
Torch: 2.8.0+cu128
CUDA: 12.8
```

This is not an H200. It appears to be a MIG slice of an RTX PRO 6000 Blackwell Server Edition GPU, with 48GB VRAM.
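
For anyone who wants to check what their own Space actually gets, a minimal probe with standard PyTorch calls reproduces the output above (on ZeroGPU it needs to run inside a `@spaces.GPU`-decorated function so CUDA is actually attached):

```python
# Minimal probe using standard PyTorch APIs; on ZeroGPU, run this inside
# a @spaces.GPU-decorated function so a CUDA device is attached.
import torch

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Capability: {torch.cuda.get_device_capability(0)}")
print(f"Torch: {torch.__version__}")
print(f"CUDA: {torch.version.cuda}")
```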

This difference matters. It is not just a cosmetic hardware-name issue.

In my case, the Space was running Qwen3-TTS and failed with:

```txt
CUDA error: no kernel image is available for execution on the device
```

The issue appears related to GPU architecture compatibility. The app was using kernels-community/flash-attn3, which is generally aligned with Hopper-class GPUs such as H100/H200, but the actual device exposed to the Space was Blackwell with compute capability 12.0. As a result, CUDA kernels that might work on the expected H200 environment failed on the actual assigned GPU.
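
One defensive pattern, as a sketch of my own rather than anything from the post: detect the compute capability at startup and only select Hopper-tuned kernels when they actually match, falling back to PyTorch's built-in SDPA otherwise.

```python
# Sketch (my own illustration): gate Hopper-specific kernels on the
# compute capability the runtime actually reports.
import torch

major, minor = torch.cuda.get_device_capability(0)

if (major, minor) == (9, 0):
    # Hopper (H100/H200): flash-attn3-style kernels target this architecture.
    attn_implementation = "flash_attention_3"  # illustrative label
else:
    # Blackwell (12, 0) or anything else: fall back to PyTorch's built-in
    # scaled_dot_product_attention, which ships kernels for the running GPU.
    attn_implementation = "sdpa"

print(f"capability {major}.{minor} -> using {attn_implementation}")
```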

To be clear, I am not saying the RTX PRO 6000 Blackwell is a bad GPU. It is a newer architecture and may be powerful in many workloads. But it is not the same as H200, and the software ecosystem compatibility is different. For ML workloads, especially those relying on custom CUDA kernels, the exact GPU architecture matters a lot.

This raises a few questions:

Is Hugging Face ZeroGPU now assigning RTX PRO 6000 Blackwell MIG instances instead of H200 instances?
If yes, why is this not clearly documented?

kanaria007 posted an update about 22 hours ago

✅ Article highlight: *Determinism Profiles, Scheduler Consistency, and Replay Honesty* (art-60-234, v0.1)

TL;DR:
This article argues that determinism is not a binary badge.

A serious system should not just say “this run was deterministic.” It should say *what kind* of determinism claim is being made: exact reproducibility, epsilon-bounded replay, scheduler-stable replay, or a degraded posture due to platform drift. In other words, replay honesty needs profiles, not slogans.

Read:
kanaria007/agi-structural-intelligence-protocols

Why it matters:
• turns “deterministic enough” into an explicit, auditable claim
• separates exact replay, epsilon-bounded replay, and scheduler stability instead of blurring them
• makes platform drift and topology changes visible instead of silently laundering weaker replay results
• prevents teams from confusing bundle validity with strong DET validity

What’s inside:
• a practical determinism ladder: *EXACT_REPRODUCIBLE*, *EPSILON_BOUNDED*, *SCHEDULER_STABLE*, *PLATFORM_DRIFT_DEGRADED*
• *determinism profiles* that define what replay truth is being claimed
• *epsilon-bound policies* for declared approximate replay
• *scheduler consistency reports* for ordering and partial-order stability
• *DET run comparisons* with explicit replay honesty statements about what matched exactly, approximately, or not at all

Key idea:
Do not ask only:

*“was it deterministic?”*

Ask:

*“under what determinism profile, under what epsilon policy, under what scheduler consistency report, and with what replay honesty statement did this scope remain exact, approximate, scheduler-stable, or degraded?”*
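
As a toy illustration of the ladder (my own sketch, not code from the article), the point is to return a named profile instead of a boolean:

```python
# Toy sketch (mine, not from the article): replace a True/False
# "deterministic" flag with an explicit, auditable claim.
from enum import Enum

class DeterminismProfile(Enum):
    EXACT_REPRODUCIBLE = "exact_reproducible"
    EPSILON_BOUNDED = "epsilon_bounded"
    SCHEDULER_STABLE = "scheduler_stable"  # would need ordering traces, omitted here
    PLATFORM_DRIFT_DEGRADED = "platform_drift_degraded"

def classify_replay(original: list[float], replay: list[float],
                    epsilon: float, platform_changed: bool) -> DeterminismProfile:
    """Classify a numeric replay under a declared epsilon policy."""
    if platform_changed:
        return DeterminismProfile.PLATFORM_DRIFT_DEGRADED
    if original == replay:
        return DeterminismProfile.EXACT_REPRODUCIBLE
    if len(original) == len(replay) and all(
        abs(a - b) <= epsilon for a, b in zip(original, replay)
    ):
        return DeterminismProfile.EPSILON_BOUNDED
    # Outside the declared bound: report the degraded posture, not a pass.
    return DeterminismProfile.PLATFORM_DRIFT_DEGRADED
```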

rajkumarrawal posted an update 1 day ago

LLMs aren’t just answering questions anymore; they’re learning to evolve. Self-evolving AI is the true endgame.

AI has shifted from short tasks to long missions. The breakthrough isn’t just automation; it’s machines learning human methods and applying them at machine speed. From cybersecurity to finance, from OPCs to NPCs, the wave is irreversible.

Read the full article: Self Evolving is the Endgame or final destiny

https://huggingface.co/blog/rajkumarrawal/self-evolving-is-the-endgame-or-final-destiny

What’s your definition of true AGI? Comment below.

CuarzoAI posted an update 3 days ago

We just pushed Cuarzo-100K v2.

The headline addition is Mandarin Chinese 🇨🇳: every one of the 99,683 records now has a verified Python ↔ natural language pair across English, Spanish, French, and Mandarin, each generated directly from the AST and independently verified.

v2 also expands the verification schema significantly. Where v1 tracked a single roundtrip check, v2 carries per-language AST equality, compilation, and exact-match checks for all four surfaces.
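
For a sense of what a per-record AST-equality check involves, here is a minimal sketch in plain Python (my own illustration, not the dataset's actual pipeline):

```python
# Illustrative only: compare two Python snippets structurally, not textually.
import ast

def ast_equal(source_a: str, source_b: str) -> bool:
    """True if both snippets parse to structurally identical ASTs."""
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))

def compiles(source: str) -> bool:
    """True if the snippet at least byte-compiles."""
    try:
        compile(source, "<record>", "exec")
        return True
    except SyntaxError:
        return False

assert ast_equal("x = 1 + 2", "x = (1 + 2)")  # same AST despite different text
assert compiles("def f():\n    return 42")
```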

Cuarzo-AI/cuarzo-100k-v2

If you're working on multilingual code models or evaluation and want to talk about custom volumes or language coverage, reach out at hello@cuarzoai.com.

PhysiQuanty posted an update about 12 hours ago

Hello 🤗 Are you looking for a female ML engineer who is looking for a male ML engineer, and you can't find her on the apps?
Personally, I'm looking for a physicist, and I'm encountering the same problem.

❗ Dating apps do not allow us to control the profiles suggested to us based on our mutual search criteria ❗
✅ Resolved by the patent: ⚡ WO2026082672 ⚡
🧬 To see what is actually possible to find, I published a dataset of 59k anonymized public profiles from OkCupid.
SpiceeChat/Dating-App-59k-Anonymized-Profiles