Back End News

Nvidia’s new compact AI model delivers enhanced efficiency

Nvidia has teamed up with Mistral AI to unveil a new artificial intelligence (AI) model, the Mistral-NeMo-Minitron 8B. The model promises to deliver efficiency and strong performance despite its compact size.

The Minitron 8B follows the launch of the Mistral NeMo 12B model and was developed using width-pruning, a method that reduces a model’s size while maintaining high accuracy.
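To make the idea of width-pruning concrete, here is a minimal sketch in plain Python. It is an illustration only, not Nvidia’s implementation: the function name `prune_hidden_width` is hypothetical, and ranking hidden units by the L2 norm of their incoming weights is an assumption made for simplicity, since the article does not say how Nvidia scores which dimensions to keep.

```python
import math


def prune_hidden_width(w_in, w_out, keep):
    """Shrink the hidden width of a two-layer block to `keep` units.

    w_in  : list of rows, one per hidden unit (hidden x input)
    w_out : list of rows, one per output unit (output x hidden)

    Assumption: a hidden unit's importance is the L2 norm of its
    incoming weights. Real systems may use other importance scores.
    """
    # Score every hidden unit.
    scores = [math.sqrt(sum(v * v for v in row)) for row in w_in]
    # Indices of the highest-scoring units, restored to original order.
    ranked = sorted(range(len(w_in)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:keep])
    # Drop the pruned units' rows in w_in and matching columns in w_out.
    pruned_in = [w_in[i] for i in kept]
    pruned_out = [[row[i] for i in kept] for row in w_out]
    return pruned_in, pruned_out, kept


w_in = [[1.0, 0.0], [0.1, 0.1], [0.0, 2.0], [0.2, 0.0]]
w_out = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
pruned_in, pruned_out, kept = prune_hidden_width(w_in, w_out, keep=2)
```

The layer keeps the same inputs and outputs but has a narrower hidden dimension, which is why the approach is called width-pruning rather than removing whole layers.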

Nvidia width-pruned the Mistral NeMo 12B base model and then lightly retrained it using knowledge distillation. Knowledge distillation is a technique in which a smaller model, known as the “student,” learns from a larger, more complex “teacher” model. This process allows the smaller model to retain much of the predictive power of the larger model while being faster and more resource-efficient.
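The teacher-student idea can be sketched as a loss that measures how far the student’s output distribution is from the teacher’s. The example below uses a common textbook formulation, the KL divergence between temperature-softened distributions; this is an assumption for illustration, as the article does not state which loss Nvidia used. The function names and the `temperature` parameter are likewise hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions.

    The loss is zero when the student matches the teacher exactly and
    grows as the student's predictions drift away from the teacher's.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [2.0, 1.0, 0.1]
matching_loss = distillation_loss(teacher, [2.0, 1.0, 0.1])
mismatched_loss = distillation_loss(teacher, [0.1, 1.0, 2.0])
```

During retraining, the student’s weights would be adjusted to push this loss toward zero, so the student imitates the teacher’s full output distribution rather than only its top answer.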

Nvidia’s method, detailed in its paper “Compact Language Models via Pruning and Knowledge Distillation,” has shown that pruned and distilled models can outperform those trained from scratch.

The Minitron 8B was crafted by fine-tuning the Mistral NeMo 12B model with 127 billion tokens, followed by selective pruning of specific dimensions within the model. The result is a compact, efficient model that demonstrates superior accuracy compared to its predecessors. Nvidia’s iterative pruning and distillation strategy also promises substantial compute cost savings, making it a cost-effective solution for developing a family of models.

By Back End News
