Bert L12 H384 A6
Developed by eli4s
A lightweight BERT model pre-trained on the BookCorpus dataset via knowledge distillation, with the hidden dimension reduced to 384 and 6 attention heads.
Release Time: 3/2/2022
Model Overview
This model is a lightweight BERT variant pre-trained with knowledge distillation and suited to masked language modeling tasks.
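A minimal sketch of using the model for masked language modeling with the transformers fill-mask pipeline. The repo id "eli4s/Bert-L12-h384-A6" is assumed from the model name and author above, and the example sentence is illustrative:

```python
# Minimal fill-mask sketch; assumes the Hugging Face repo id
# "eli4s/Bert-L12-h384-A6" (inferred from the model name and author).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="eli4s/Bert-L12-h384-A6")

# Predict the token hidden behind [MASK]; the pipeline returns the
# top candidates with their scores.
for prediction in fill_mask("The book was left on the [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 4))
```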
Model Features
Lightweight Design
The hidden dimension is reduced to 384 (half of BERT-base's 768), with 6 attention heads so that the per-head dimension (64) matches BERT's; see the config check after this list.
Knowledge Distillation
Pre-trained via knowledge distillation, optimized with multiple loss functions.
Random Initialization
The model weights were randomly initialized before pre-training.
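The architecture claims above can be checked directly from the model's configuration. This sketch assumes the same repo id as before; the expected values follow from the stated design (12 layers, hidden size 384, 6 heads):

```python
# Inspect the model config to verify the lightweight design;
# repo id is assumed as above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("eli4s/Bert-L12-h384-A6")
print(config.num_hidden_layers)    # expected: 12
print(config.hidden_size)          # expected: 384 (half of BERT-base's 768)
print(config.num_attention_heads)  # expected: 6
# Per-head dimension matches BERT-base: 384 / 6 == 768 / 12 == 64
print(config.hidden_size // config.num_attention_heads)
```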
Model Capabilities
Masked Language Prediction
Text Understanding
Use Cases
Natural Language Processing
Text Completion
Predicts the masked word in a sentence.
Can generate multiple candidate words to choose from, as shown in the sketch below.
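A sketch of producing several candidate words for one masked slot by ranking the model's output logits directly, rather than through the pipeline. The repo id and example sentence are again assumptions:

```python
# Generate multiple candidates for a masked position; repo id assumed.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

repo_id = "eli4s/Bert-L12-h384-A6"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForMaskedLM.from_pretrained(repo_id)

inputs = tokenizer("She poured herself a cup of [MASK].", return_tensors="pt")
# Locate the masked position in the tokenized input.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits

# Take the five highest-scoring vocabulary entries for the masked slot.
top_ids = logits[0, mask_pos].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top_ids))
```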