openai/whisper-large-v3 Automatic Speech Recognition ⢠2B ⢠Updated Aug 12, 2024 ⢠5.35M ⢠⢠5.78k
google/pix2struct-widget-captioning-large Visual Question Answering ⢠1B ⢠Updated Apr 10, 2024 ⢠38 ⢠20
Salesforce/blip-image-captioning-large Image-to-Text ⢠0.5B ⢠Updated Feb 3, 2025 ⢠758k ⢠1.48k
mistralai/Mistral-7B-Instruct-v0.1 Text Generation ⢠7B ⢠Updated Jul 24, 2025 ⢠294k ⢠⢠1.83k
Runtime error Agents Featured 272 Edit Video By Editing Text ā 272 Audio-based video editing using AI-generated transcription
Runtime error Agents Featured 307 AudioLDM2 Text2Audio Text2Music Generation š 307 Generate audio and waveform video from text