Fairseq Dense 6.7B
This is the Hugging Face transformers adaptation of the original dense 6.7B-parameter model from the paper 'Efficient Large Scale Language Modeling with Mixtures of Experts' by Artetxe et al. (2021).
Release date: March 2, 2022
Model Overview
A dense Transformer language model with 6.7 billion parameters. It is the dense baseline released alongside the Mixture-of-Experts models studied in the paper; this checkpoint itself does not use expert routing.
Model Features
Large-scale Parameters
Its 6.7 billion parameters allow it to handle complex language modeling tasks.
Dense Baseline Architecture
A standard dense Transformer decoder, trained as the baseline against which the paper's Mixture-of-Experts variants are compared for efficient large-scale language modeling.
Hugging Face Adaptation
Adapted to the Hugging Face transformers framework for ease of use.
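A minimal sketch of loading the model and generating text with transformers. The repo id KoboldAI/fairseq-dense-6.7B is an assumption based on the common community upload of this checkpoint; substitute the actual path if it differs:

```python
# Minimal sketch: load the dense 6.7B checkpoint with Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KoboldAI/fairseq-dense-6.7B"  # assumed repo id, not confirmed by this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Efficient large-scale language modeling requires"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```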
Model Capabilities
Text Generation
Language Understanding
In-context Learning (see the sketch after this list)
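As a hedged illustration of in-context (few-shot) learning: a handful of demonstrations in the prompt steer the model without any weight updates. The repo id and the translation examples below are illustrative assumptions, not part of the original card:

```python
# Sketch of few-shot prompting for in-context learning.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KoboldAI/fairseq-dense-6.7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A few demonstrations in the prompt; the model continues the pattern.
few_shot_prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "hello =>"
)
inputs = tokenizer(few_shot_prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```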
Use Cases
Natural Language Processing
Open LLM Leaderboard Evaluation
Comprehensive evaluation on the Open LLM Leaderboard, with an overall average score of 36.09.