2026-04-02 12:21:00+08

| Feature | Description |
|---|---|
| Developer | ByteDance (Seed/Doubao Team) |
| Architecture | MMDiT (Multi-Modal Diffusion Transformer) |
| Input Modes | Text-to-Video, Image-to-Video, Video-to-Video |
| Resolution | Up to 1080p / 2K |
| Duration | Coherent clips of roughly 15-20 seconds |