Fairseq Dense 6.7B
This is the Hugging Face transformers adaptation of the original dense 6.7B-parameter model from the paper 'Efficient Large Scale Language Modeling with Mixtures of Experts' by Artetxe et al. (2021).
Release date: March 2, 2022
Model Overview
A dense Transformer language model with 6.7 billion parameters. It is the dense baseline released alongside the Mixture-of-Experts models studied in the paper; this checkpoint itself does not use expert routing.
Model Features
Large-scale Parameters
Its 6.7 billion parameters allow it to handle complex language modeling tasks.
Dense Baseline Architecture
A standard dense Transformer decoder, trained as the baseline against which the paper's Mixture-of-Experts variants are compared for efficient large-scale language modeling.
Hugging Face Adaptation
Adapted to the Hugging Face transformers framework for ease of use.
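A minimal sketch of loading the model and generating text with transformers. The repo id KoboldAI/fairseq-dense-6.7B is an assumption based on the common community upload of this checkpoint; substitute the actual path if it differs:

```python
# Minimal sketch: load the dense 6.7B checkpoint with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KoboldAI/fairseq-dense-6.7B"  # assumed repo id, not confirmed by this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Efficient large-scale language modeling requires"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```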
Model Capabilities
Text Generation
Language Understanding
In-context Learning (see the sketch after this list)
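As a hedged illustration of in-context (few-shot) learning: a handful of demonstrations in the prompt steer the model without any weight updates. The repo id and the translation examples below are illustrative assumptions, not part of the original card:

```python
# Sketch of few-shot prompting for in-context learning.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KoboldAI/fairseq-dense-6.7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A few demonstrations in the prompt; the model continues the pattern.
few_shot_prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "hello =>"
)
inputs = tokenizer(few_shot_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```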
Use Cases
Natural Language Processing
Open LLM Leaderboard Evaluation
Comprehensive evaluation on the Open LLM Leaderboard, with an overall average score of 36.09.