Alibaba Cloud, the cloud computing arm of Alibaba Group, has made its artificial intelligence (AI) models for video generation freely available, aiming to support researchers, developers, and businesses in creating high-quality visuals.

The company has open-sourced four models from its Wan2.1 series, the latest version of its video foundation model Tongyi Wanxiang. These models — T2V-14B, T2V-1.3B, I2V-14B-720P, and I2V-14B-480P — generate videos from text or image inputs. They are available on Alibaba Cloud’s ModelScope and on the AI platform Hugging Face, where downloads exceeded one million within a week of release.

The Wan2.1 series is the first video generation model to support text effects in both Chinese and English. It aims to produce realistic visuals by handling complex movements, improving pixel quality, and following instructions with high accuracy. The model ranked first on the VBench leaderboard, which evaluates video generation models, with a score of 86.22%.

Each model in the series serves a different purpose. The T2V-14B model focuses on high-quality visuals with dynamic motion, while the smaller T2V-1.3B model balances output quality and computing requirements, making it suitable for research and development; on a standard laptop, it can generate a five-second 480p video in about four minutes. The I2V-14B-720P and I2V-14B-480P models generate video from image inputs at 720p and 480p resolution, respectively.

Alibaba Cloud has been releasing open-source AI models since 2023, starting with its Qwen series. More than 100,000 derivative models have been built on Hugging Face, making Qwen one of the most widely used AI model families.
