
DistilBERT word2vec 256k MLM 250k

Developed by vocab-transformers
This model combines a word2vec token-embedding matrix with the DistilBERT architecture for natural language processing tasks. The embedding layer was trained on large-scale corpora and is kept frozen, while the rest of the model is trained with masked language modeling (MLM).
Downloads: 21
Release time: 4/7/2022
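
As a quick illustration of how the checkpoint can be used, the sketch below loads it with the Hugging Face transformers library for masked-token prediction; the Hub repository ID is an assumption based on the developer name and model title.

```python
# Minimal sketch: masked-token prediction with this checkpoint.
# The repository ID below is assumed from the title and developer name.
from transformers import pipeline

model_id = "vocab-transformers/distilbert-word2vec_256k-MLM_250k"  # assumed repo ID
fill_mask = pipeline("fill-mask", model=model_id)

for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```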

Model Overview

A lightweight DistilBERT-based model incorporating word2vec embeddings, designed for efficient text representation and language-understanding tasks.

Model Features

Efficient embeddings: uses a word2vec embedding matrix with 256k vocabulary entries, trained on 100GB of corpus data.
Lightweight architecture: builds on DistilBERT's compact design, retaining performance while reducing computational requirements.
Frozen embeddings: keeps the embedding-layer parameters frozen during MLM training so that learning is concentrated in the transformer layers (see the sketch after this list).
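
The frozen-embedding setup can be reproduced in a training script along the following lines; this is a minimal sketch using the standard transformers API, not the developers' original training code.

```python
# Sketch: freezing the word2vec-initialized embedding matrix before MLM training
# (illustrative only; not the original training script).
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained(
    "vocab-transformers/distilbert-word2vec_256k-MLM_250k"  # assumed repo ID
)

# With the embeddings frozen, gradient updates only reach the transformer layers.
for param in model.get_input_embeddings().parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters after freezing embeddings: {trainable:,}")
```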

Model Capabilities

Text representation learning (see the pooling sketch after this list)
Language model fine-tuning
Contextual understanding
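
For text representation, one common approach (an illustrative assumption, not specified by the model card) is to mean-pool the encoder's final hidden states, as sketched below.

```python
# Sketch: extracting mean-pooled sentence representations from the encoder.
# The repository ID is assumed; mean pooling is an illustrative default choice.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "vocab-transformers/distilbert-word2vec_256k-MLM_250k"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer(["search query understanding", "document ranking"],
                   padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state            # (batch, seq_len, dim)

mask = inputs["attention_mask"].unsqueeze(-1).float()     # zero out padding tokens
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1) # mean pooling
print(embeddings.shape)
```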

Use Cases

Natural Language Processing
Text classification: suitable for document classification, sentiment analysis, and similar tasks (see the sketch after this list).
Information retrieval: improves search relevance ranking and query understanding.
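
For the text-classification use case, one hedged way to adapt the checkpoint is to attach a sequence-classification head and fine-tune it; the label count below is a placeholder.

```python
# Sketch: reusing the MLM checkpoint as a text-classification backbone.
# num_labels is a placeholder; the classification head is untrained until fine-tuned.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "vocab-transformers/distilbert-word2vec_256k-MLM_250k"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

inputs = tokenizer("This product works exactly as advertised.", return_tensors="pt")
logits = model(**inputs).logits   # fine-tune on labeled data before relying on these
print(logits.shape)               # torch.Size([1, 2])
```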