
FastWan-QAD-1.3B-SA2

Introduction

FastWan-QAD-1.3B-SA2 is a variant of FastWan-QAD-1.3B that replaces the SageAttention3 FP4 attention backend with SageAttention2++, trading a small amount of speed (roughly 2.0 s versus 1.78 s per clip on an RTX 5090) for improved visual quality. It generates a 5-second 480p video in approximately 2 seconds on an RTX 5090.

Like all FastWan-QAD models, it is built on Wan-AI/Wan2.1-T2V-1.3B-Diffusers and trained with quantization-aware distillation (QAD) for 3-step inference with NVFP4 linear layers.

Hardware requirement: RTX 5090 (sm100+). NVFP4 linear layers require Blackwell-native support. See FastWan-QAD-FP8-1.3B for RTX 4090 compatibility.
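
A quick preflight check can confirm whether the local GPU meets this requirement. The sketch below is illustrative only: it assumes PyTorch is installed for the live query, reads "sm100+" as compute capability 10.0 or higher, and uses a hypothetical helper name (`meets_nvfp4_requirement`) that is not part of FastVideo.

```python
from typing import Optional, Tuple

# Assumed threshold: "sm100+" interpreted as compute capability >= 10.0.
MIN_CAPABILITY: Tuple[int, int] = (10, 0)


def meets_nvfp4_requirement(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability meets the assumed threshold."""
    return tuple(capability) >= MIN_CAPABILITY


def local_capability() -> Optional[Tuple[int, int]]:
    """Query the first CUDA device via PyTorch; return None if that is not possible."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = local_capability()
    if cap is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    else:
        ok = meets_nvfp4_requirement(cap)
        print(f"Compute capability {cap[0]}.{cap[1]}: meets NVFP4 requirement = {ok}")
```

An RTX 5090 reports compute capability 12.0 and passes this check, while an RTX 4090 reports 8.9 and does not, which matches the recommendation to use the FP8 variant on that card.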


Model Overview

  • 3-step inference via quantization-aware distillation
  • NVFP4 linear layers for Blackwell GPU throughput
  • SageAttention2++ backend for attention computation
  • Trained at 480p (832×480) resolution, 81 frames (5 seconds at 16 fps)
  • No classifier-free guidance at inference time
  • Fast decoding via TAEHV tiny autoencoder
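
The resolution and frame-count figures above can be sanity-checked with a little arithmetic. The sketch below computes the clip duration and the latent tensor shape; the VAE strides (4× temporal, 8× spatial) and the 16 latent channels are assumptions about the Wan2.1 base model rather than values stated in this card, and both function names are hypothetical.

```python
from typing import Tuple


def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Duration of a clip with the given frame count and frame rate."""
    return num_frames / fps


def latent_shape(
    num_frames: int,
    height: int,
    width: int,
    temporal_stride: int = 4,   # assumed Wan2.1 VAE temporal compression
    spatial_stride: int = 8,    # assumed Wan2.1 VAE spatial compression
    latent_channels: int = 16,  # assumed Wan2.1 latent channel count
) -> Tuple[int, int, int, int]:
    """Return (channels, latent_frames, latent_height, latent_width)."""
    if (num_frames - 1) % temporal_stride != 0:
        raise ValueError("num_frames must have the form temporal_stride * k + 1")
    if height % spatial_stride or width % spatial_stride:
        raise ValueError("height and width must be multiples of spatial_stride")
    latent_frames = (num_frames - 1) // temporal_stride + 1
    return (
        latent_channels,
        latent_frames,
        height // spatial_stride,
        width // spatial_stride,
    )


print(clip_duration_seconds(81, 16))  # 5.0625
print(latent_shape(81, 480, 832))     # (16, 21, 60, 104)
```

Under these assumptions, 81 frames at 16 fps is just over 5 seconds, and 81 fits the 4k + 1 frame-count pattern with k = 20.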

Performance

| Model | Hardware | Generation time (5 s, 480p) |
|---|---|---|
| FastWan-QAD-1.3B | RTX 5090 | ~1.78 s |
| FastWan-QAD-1.3B-SA2 | RTX 5090 | ~2.0 s |
| FastWan-QAD-FP8-1.3B | RTX 4090 | ~3.4 s |
| TurboDiffusion | RTX 5090 | 6.10 s |
| LightX2V | RTX 5090 | 6.91 s |
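
To put these timings in relative terms, the short sketch below computes speedup ratios directly from the figures reported above. The dictionary and the `speedup` helper are illustrative names, and the approximate ("~") timings are treated as exact, so the ratios are only indicative.

```python
# Generation times in seconds for a 5-second 480p clip, copied from the table above.
TIMINGS = {
    "FastWan-QAD-1.3B": 1.78,       # RTX 5090
    "FastWan-QAD-1.3B-SA2": 2.0,    # RTX 5090
    "FastWan-QAD-FP8-1.3B": 3.4,    # RTX 4090 (different hardware)
    "TurboDiffusion": 6.10,         # RTX 5090
    "LightX2V": 6.91,               # RTX 5090
}


def speedup(baseline: str, candidate: str) -> float:
    """How many times faster `candidate` is than `baseline`."""
    return TIMINGS[baseline] / TIMINGS[candidate]


for name in ("TurboDiffusion", "LightX2V", "FastWan-QAD-1.3B"):
    ratio = speedup(name, "FastWan-QAD-1.3B-SA2")
    print(f"FastWan-QAD-1.3B-SA2 vs {name}: {ratio:.2f}x")
```

A ratio below 1 means the SA2 variant is slower than the baseline, as is the case against FastWan-QAD-1.3B. The FP8 row was measured on an RTX 4090, so comparing it with the RTX 5090 rows is not like-for-like.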

Inference

# Install the TAEHV tiny autoencoder
git clone https://github.com/madebyollin/taehv.git
uv pip install -e taehv/

# Install FastVideo from source
git clone https://github.com/hao-ai-lab/FastVideo.git
cd FastVideo
uv pip install -e .

# Run the NVFP4 + SageAttention2++ example, pointing it at the TAEHV checkpoint
cd examples/inference/optimizations
python nvfp4_sa2_wan_2_1_3b.py --taehv-checkpoint /path/to/taehv/taew2_1.pth
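
If the example is launched from another Python program, it can help to verify the checkpoint path first. The following is a minimal sketch, not part of FastVideo: the helper names are hypothetical, and only the script name and the `--taehv-checkpoint` flag are taken from the command above.

```python
import sys
from pathlib import Path
from typing import List

# Script name taken from the command above; run from examples/inference/optimizations.
SCRIPT = "nvfp4_sa2_wan_2_1_3b.py"


def checkpoint_looks_valid(path: str) -> bool:
    """Return True if the path points to an existing .pth file."""
    p = Path(path).expanduser()
    return p.is_file() and p.suffix == ".pth"


def build_command(taehv_checkpoint: str, python_exe: str = sys.executable) -> List[str]:
    """Assemble the argument list for the inference script."""
    checkpoint = str(Path(taehv_checkpoint).expanduser())
    return [python_exe, SCRIPT, "--taehv-checkpoint", checkpoint]


if __name__ == "__main__":
    ckpt = "/path/to/taehv/taew2_1.pth"  # placeholder path, as in the command above
    if checkpoint_looks_valid(ckpt):
        print("Would run:", " ".join(build_command(ckpt)))
    else:
        print(f"TAEHV checkpoint not found: {ckpt}")
```

The resulting list can be passed to `subprocess.run(..., check=True)` with the working directory set to `examples/inference/optimizations`.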

Training

More details coming soon.


If you find this model useful, please cite our paper:

@article{Zhang2026AttnQAT,
  title={Attn-QAT: 4-Bit Attention With Quantization-Aware Training},
  author={Zhang, Peiyuan and Noto, Matthew and Tan, Wenxuan and Jiang, Chengquan and Lin, Will and Zhou, Wei and Zhang, Hao},
  journal={arXiv preprint arXiv:2603.00040},
  year={2026}
}