Audio-Visual Active Speaker Detection
A Light Weight Model for Active Speaker Detection • Paper • 2303.04439 • Published Mar 8, 2023
Superxixixi/LoCoNet_ASD • Image Feature Extraction • Updated Oct 10, 2023 • 22 • 2
Multimodal Text-Image LLMs
HuggingFaceM4/idefics-80b-instruct • Text Generation • 80B • Updated Oct 12, 2023 • 4k • 188
Vision-CAIR/MiniGPT-4 • Updated Apr 19, 2023 • 433
xtuner/llava-llama-3-8b-v1_1-gguf • Image-to-Text • 8B • Updated Apr 30, 2024 • 3.49k • 225
ChocoWu/nextgpt_7b_tiva_v0 • Text Generation • Updated Oct 15, 2023 • 18 • 30