Alibaba Cloud, the cloud computing arm of Alibaba Group, has launched Qwen2.5-Omni-7B, an AI model capable of handling text, image, audio, and video inputs. The model can generate both written text and natural speech responses, making it useful for applications like voice assistants and customer service bots.
The model has 7 billion parameters, a size intended to balance efficiency with performance. It is open-sourced on Hugging Face and GitHub, with additional access through Qwen Chat and Alibaba Cloud’s ModelScope. Alibaba Cloud has made more than 200 generative AI (GenAI) models open-source in recent years.
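Because the weights are public on Hugging Face, loading the checkpoint can follow the familiar transformers pattern. The snippet below is a minimal sketch only: the repository ID matches the public release, but the generic Auto classes are an assumption here, and the model card documents the dedicated classes recommended for full multimodal inference.

```python
# Minimal loading sketch (assumption: the generic Auto classes resolve
# this architecture; the model card on Hugging Face lists dedicated
# Qwen2.5-Omni classes, which should be preferred in practice).
from transformers import AutoModel, AutoProcessor

model_id = "Qwen/Qwen2.5-Omni-7B"  # public repository ID

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype="auto",      # use the checkpoint's native precision
    trust_remote_code=True,  # allow the repo's custom model code to load
)
```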
Qwen2.5-Omni-7B’s architecture includes several features aimed at improving multimodal processing. Its Thinker-Talker architecture separates text generation from speech synthesis so the two output modalities do not interfere with each other. It also uses TMRoPE (Time-aligned Multimodal RoPE), a position-embedding technique that synchronizes video and audio along a shared timeline for coherent content creation. Block-wise Streaming Processing, meanwhile, speeds up audio responses, enhancing real-time voice interaction.
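To make the time-alignment idea concrete, here is a toy Python sketch of how tokens from two modalities might be indexed against one shared timeline. Every name in it is hypothetical: the actual TMRoPE applies this alignment inside rotary position embeddings within the model, not as a standalone function.

```python
# Toy illustration of time-aligned multimodal positions. Hypothetical
# names throughout; this only demonstrates the indexing idea.
from dataclasses import dataclass

@dataclass
class Token:
    modality: str     # "audio" or "video"
    timestamp: float  # seconds from the start of the clip

def time_aligned_positions(tokens: list[Token], resolution: float = 0.04) -> list[int]:
    """Map each token's timestamp to a shared temporal index, so audio
    and video tokens covering the same instant receive the same
    position: the alignment TMRoPE is described as providing."""
    return [int(tok.timestamp / resolution) for tok in tokens]

# Two streams sampled at different rates still land on one timeline:
stream = [Token("video", 0.00), Token("audio", 0.00),
          Token("audio", 0.02), Token("video", 0.04)]
print(time_aligned_positions(stream))  # -> [0, 0, 0, 1]
```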
Training datasets to improve processing
Alibaba Cloud trained the model on a diverse dataset spanning image-text, video-text, and audio-text pairs as well as mixed multimodal data. This breadth improves its ability to process and understand multiple input types simultaneously.
The model was tested on OmniBench, a benchmark designed to evaluate AI models’ ability to interpret and reason across visual, acoustic, and textual inputs. Alibaba Cloud reported that Qwen2.5-Omni-7B performed well in these assessments.
The company has continued expanding its Qwen2.5 series. It introduced Qwen2.5 in September and followed with Qwen2.5-Max in January, which ranked seventh on Chatbot Arena, performing at a level comparable to other leading AI models. Alibaba Cloud has also open-sourced Qwen2.5-VL and Qwen2.5-1M, designed for visual understanding and long-context processing, respectively.