SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer
Efficient-Large-Model
Efficient Diffusion LLM
- Efficient-Large-Model/Fast_dLLM_v2_1.5B
  2B • Updated • 5.51k • 13
- Efficient-Large-Model/Fast_dLLM_v2_7B
  333k • Updated • 12.7k • 29
- Fast-dLLM v2: Efficient Block-Diffusion LLM
  Paper • 2509.26328 • Published • 59
- Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
  Paper • 2505.22618 • Published • 46
Models in this collection can be loaded with Hugging Face Transformers.
SANA-1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
- SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
  Paper • 2501.18427 • Published • 26
- Efficient-Large-Model/SANA1.5_4.8B_1024px
  Text-to-Image • Updated • 38 • 25
- Efficient-Large-Model/SANA1.5_4.8B_1024px_diffusers
  Text-to-Image • Updated • 19
- Efficient-Large-Model/SANA1.5_1.6B_1024px
  Text-to-Image • Updated • 527 • 4
⚡️ Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
- Efficient-Large-Model/Sana_1600M_1024px
  Text-to-Image • Updated • 405 • 221
- Efficient-Large-Model/Sana_1600M_1024px_BF16
  Text-to-Image • Updated • 124 • 14
- Efficient-Large-Model/Sana_1600M_1024px_BF16_ControlNet_HED
  Text-to-Image • Updated • 45 • 1
- Efficient-Large-Model/Sana_600M_1024px_ControlNet_HED
  Text-to-Image • Updated • 12 • 1
🎬 SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer
- SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer
  Paper • 2509.24695 • Published • 53
- Efficient-Large-Model/SANA-Video_2B_720p
  Text-to-Video • Updated • 70 • 24
- Efficient-Large-Model/SANA-Video_2B_720p_diffusers
  Text-to-Video • Updated • 3
- Efficient-Large-Model/SANA-Video_2B_480p
  Text-to-Video • Updated • 726 • 15
NVILA: Efficient Frontier Visual Language Models
- NVILA: Efficient Frontier Visual Language Models
  Paper • 2412.04468 • Published • 61
- Efficient-Large-Model/NVILA-8B
  Text Generation • Updated • 652 • 7
- Efficient-Large-Model/NVILA-15B
  Text Generation • Updated • 164 • 26
- Efficient-Large-Model/NVILA-8B-Video
  Text Generation • Updated • 314 • 9
Boosting AI's long-sequence capability while keeping it efficient. Models in this collection include LongVILA, LongVILA-R1, and LongLive.
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
- SanaSprint
  417 • Ultra-fast, high-quality image generation
- SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
  Paper • 2503.09641 • Published • 43
- Efficient-Large-Model/Sana_Sprint_1.6B_1024px
  Text-to-Image • Updated • 39 • 19
- Efficient-Large-Model/Sana_Sprint_0.6B_1024px
  Text-to-Image • Updated • 73 • 8
- Efficient-Large-Model/Llama-3-VILA1.5-8B
  Text Generation • Updated • 720 • 37
- Efficient-Large-Model/VILA1.5-40b
  Text Generation • Updated • 143 • 17
- Efficient-Large-Model/VILA1.5-3b
  Text Generation • Updated • 1.33k • 34
- Efficient-Large-Model/VILA1.5-3b-AWQ
  Text Generation • Updated • 35 • 7