Prunedbert L12 H256 A4 Finetuned
Developed by eli4s
A lightweight BERT-based model pre-trained with knowledge distillation, with a hidden dimension of 256 and 4 attention heads.
Downloads 16
Release Time: 3/2/2022
Model Overview
This model is a lightweight version of bert-base-uncased, obtained by pruning the original weights and then fine-tuned with knowledge distillation. It is suited to masked language modeling tasks.
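As a minimal sketch of masked language modeling with this model, assuming the Hugging Face model id is eli4s/prunedBert-L12-h256-A4-finetuned (the exact identifier is an assumption; check the model hub page):

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Model id assumed from the card title; verify against the hub page.
model_name = "eli4s/prunedBert-L12-h256-A4-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring token.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))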
Model Features
Lightweight Design
The hidden dimension is 256, one-third of bert-base-uncased's 768, which substantially reduces model size.
Knowledge Distillation
Knowledge is transferred from bert-base-uncased via knowledge distillation, shrinking the model while preserving most of its performance.
Pruning Initialization
Rather than starting from random initialization, the model's weights are initialized by pruning the weights of bert-base-uncased.
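A minimal sketch of the pruning-initialization idea follows. The selection criterion shown (naively keeping the first 256 of the teacher's 768 hidden units, which corresponds to 4 of bert-base's 12 attention heads of size 64) is a simplifying assumption; the author's actual pruning criterion is not documented here.

import torch
from transformers import BertModel

teacher = BertModel.from_pretrained("bert-base-uncased")

hidden = 256  # student hidden dimension, one-third of the teacher's 768
keep = torch.arange(hidden)  # placeholder criterion: keep the first 256 units

# Example: prune one attention projection matrix from (768, 768) to (256, 256).
teacher_q = teacher.encoder.layer[0].attention.self.query.weight  # (768, 768)
student_q = teacher_q[keep][:, keep].clone()                      # (256, 256)
print(student_q.shape)  # torch.Size([256, 256])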
Multi-Loss Fine-Tuning
Knowledge distillation fine-tuning combines several loss functions to improve the student model's performance.
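As an illustration of a multi-loss distillation objective: the exact losses and weights used by the author are not specified here, so this mix of a soft-target KL term and a hard-label MLM cross-entropy (the standard Hinton-style recipe) is an assumption.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Combine soft-target distillation (KL) with hard-label MLM cross-entropy.

    labels uses -100 for unmasked positions, as in Hugging Face MLM training.
    The temperature T and mixing weight alpha are illustrative values.
    """
    # Soft targets: match the teacher's temperature-softened distribution.
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy on the masked positions only.
    ce = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    return alpha * kl + (1.0 - alpha) * ce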
Model Capabilities
Masked Language Prediction
Text Completion
Semantic Understanding
Use Cases
Natural Language Processing
Text Completion
Predicts masked tokens in sentences for automatic text completion.
Fills in masked tokens accurately, improving text-processing efficiency.
Semantic Analysis
Captures sentence semantics through the masked language modeling objective.
Encodes semantic information effectively, making it suitable for downstream NLP tasks.
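One plausible pattern for reusing the encoder in downstream semantic tasks (an assumption, not documented usage) is to mean-pool its hidden states into sentence features; the model id is assumed as above.

import torch
from transformers import AutoModel, AutoTokenizer

model_name = "eli4s/prunedBert-L12-h256-A4-finetuned"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

inputs = tokenizer("A lightweight BERT variant.", return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 256)

# Mean-pool over non-padding tokens to get one 256-dim sentence vector.
mask = inputs["attention_mask"].unsqueeze(-1)
sentence_vec = (hidden * mask).sum(1) / mask.sum(1)
print(sentence_vec.shape)  # torch.Size([1, 256])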