Audio
Qwen/Qwen2-Audio-7B-Instruct Audio-Text-to-Text • 8B • Updated Jan 12, 2025 • 713k • 540
Video/CV
LanguageBind/MoE-LLaVA-Phi2-2.7B-4e Text Generation • 6B • Updated Feb 1, 2024 • 135 • 40
LanguageBind/LanguageBind_Video_FT Zero-Shot Image Classification • Updated Feb 1, 2024 • 4.66k • 7
stabilityai/stable-video-diffusion-img2vid-xt Image-to-Video • Updated Jul 10, 2024 • 229k • 3.33k
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Paper • 2406.04325 • Published Jun 6, 2024 • 75