bert-base-uncased: open-source English base model, free to use, with case-insensitive text processing

Bert Base Uncased

Developed by OWG

An uncased (case-insensitive) BERT base model for English, pre-trained with the masked language modeling (MLM) objective.

Large Language Model

Transformers

Language: English
Open Source License: Apache-2.0
Tags: #English text encoding #Case insensitive #Masked language modeling

Downloads 15

Release Time: 3/28/2022

Model Overview

This English-language model was pre-trained with the masked language modeling (MLM) objective. The approach was first described in the original BERT paper and first released in the accompanying code repository. This is the uncased version: it does not distinguish between 'english' and 'English'.

Model Features

Case insensitive

The model does not distinguish between cases, enabling uniform processing of inputs like 'english' and 'English'.
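To make the case-insensitive behavior concrete, here is a minimal sketch of the kind of normalization an uncased BERT tokenizer applies before splitting text into tokens: lowercasing and accent stripping. The helper name `uncased_normalize` is hypothetical, and the sketch only approximates the real tokenizer, which also handles whitespace, punctuation, and WordPiece splitting.

```python
import unicodedata


def uncased_normalize(text: str) -> str:
    """Lowercase text and strip accent marks (hypothetical helper, approximation only)."""
    lowered = text.lower()
    # NFD decomposition splits accented letters into a base letter plus
    # combining marks (Unicode category "Mn"), which are then dropped.
    decomposed = unicodedata.normalize("NFD", lowered)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Both spellings collapse to the same string, so the model sees identical input.
print(uncased_normalize("English") == uncased_normalize("english"))
print(uncased_normalize("Café"))
```

A consequence of this design is that information carried only by capitalization (for example, "Apple" the company versus "apple" the fruit) is not available to the model.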

Pre-trained with MLM

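Pre-trained with the masked language modeling (MLM) objective: some input tokens are hidden and the model learns to predict them from the surrounding context on both sides, so its representations capture contextual information.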
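The MLM objective can be illustrated with a small, simplified sketch of the input corruption step. The 15% selection rate and the 80/10/10 split (mask, random token, unchanged) follow the procedure described in the original BERT paper rather than anything stated on this page, and the function name `mask_tokens` and the toy vocabulary are hypothetical. Unlike a real pipeline, the sketch works on word strings and does not exempt special tokens such as [CLS] and [SEP].

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (corrupted_tokens, labels); labels are None where nothing is predicted."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN  # most selected tokens become [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # some become a random vocabulary token
        # otherwise the token is left unchanged but is still predicted
    return corrupted, labels


tokens = "the quick brown fox jumps over the lazy dog".split()
toy_vocab = ["cat", "tree", "runs", "blue"]  # hypothetical vocabulary
corrupted, labels = mask_tokens(tokens, toy_vocab, mask_prob=0.3)
print(corrupted)
print(labels)
```

The training loss is computed only at positions where `labels` is not None, which is why the model has to rely on the unmasked context on both sides.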

Model Capabilities

Text encoding

Language understanding

Context capturing

Use Cases

Natural language processing

Text classification

Used for classification tasks on English texts.

Question answering systems

Serves as a foundational model for building English question answering systems.

🚀 BERT base model (uncased)

A pre-trained English language model trained with the masked language modeling (MLM) objective.

🚀 Quick Start

To start using the model, first download it by cloning the repository:

git clone https://huggingface.co/OWG/bert-base-uncased

Then run the model with ONNX Runtime using the following Python code:

from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions
from transformers import BertTokenizer

# Load the matching uncased tokenizer.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Enable all ONNX Runtime graph optimizations.
options = SessionOptions()
options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL

# Point this at the .onnx file inside the cloned repository.
session = InferenceSession("path/to/model.onnx", sess_options=options)
session.disable_fallback()

text = "Replace me by any text you want to encode."
encoded = tokenizer(text, return_tensors="pt", return_attention_mask=True)

# ONNX Runtime expects NumPy arrays, so convert the PyTorch tensors.
inputs = {name: tensor.cpu().detach().numpy() for name, tensor in encoded.items()}
output_name = session.get_outputs()[0].name

outputs = session.run(output_names=[output_name], input_feed=inputs)
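The snippet above stops at the raw session output. As a hedged follow-up, the sketch below shows two common steps around it: keeping only the tokenizer outputs that the ONNX graph actually declares as inputs, and mean-pooling token vectors into one vector per sentence. Both helper names (`select_inputs`, `mean_pool`) are hypothetical, and the sketch assumes that the first model output is the per-token hidden state with shape (batch, sequence length, hidden size), which this page does not confirm. Synthetic arrays stand in for real model output so the block runs without the model file.

```python
import numpy as np


def select_inputs(encoded, expected_names):
    """Keep only the arrays whose names the ONNX graph declares as inputs."""
    return {name: np.asarray(encoded[name]) for name in expected_names if name in encoded}


def mean_pool(hidden_states, attention_mask):
    """Average token vectors over real (non-padding) tokens.

    hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len) of 0/1.
    """
    mask = attention_mask[..., None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts


# Synthetic stand-ins for a model output and its attention mask.
hidden = np.arange(12, dtype=np.float32).reshape(1, 3, 4)
mask = np.array([[1, 1, 0]])  # the third token is padding and is ignored
print(mean_pool(hidden, mask))  # [[2. 3. 4. 5.]]

# With a real session (assumption: outputs[0] holds the token-level hidden states):
#   expected = [node.name for node in session.get_inputs()]
#   feed = select_inputs(inputs, expected)
#   sentence_vector = mean_pool(outputs[0], inputs["attention_mask"])
```

Mean pooling is only one option; taking the vector at the first ([CLS]) position is another common choice, and which works better depends on the downstream task.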

✨ Features

Pretrained on English: The model is pretrained on English text using a masked language modeling (MLM) objective.
Uncased: It does not distinguish between uppercase and lowercase letters; for example, it treats "english" and "English" the same.

📦 Installation

You can download the model by cloning the repository:

git clone https://huggingface.co/OWG/bert-base-uncased
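One thing worth checking after cloning is whether the model file was actually downloaded. This sketch assumes the repository stores its ONNX file with Git LFS, which this page does not state; if so, cloning without Git LFS installed leaves a small text pointer in place of the real file. The helper name `is_lfs_pointer` is hypothetical, and temporary files stand in for the cloned files so the block runs anywhere.

```python
import os
import tempfile
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is a Git LFS pointer stub rather than the real payload."""
    with Path(path).open("rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Demonstration on temporary stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    stub = os.path.join(tmp, "stub.onnx")
    real = os.path.join(tmp, "real.onnx")
    Path(stub).write_bytes(LFS_POINTER_PREFIX + b"\noid sha256:...\nsize 123\n")
    Path(real).write_bytes(b"\x08\x07binary model bytes")
    print(is_lfs_pointer(stub))  # True
    print(is_lfs_pointer(real))  # False
```

If the check returns True for the cloned model file, installing Git LFS and running `git lfs pull` inside the repository should fetch the actual weights.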


📚 Documentation

Model description

This model was pretrained on English text using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository. The model is uncased, meaning it does not distinguish between "english" and "English".

Original implementation

Follow this link to see the original implementation.

📄 License

This model is licensed under the Apache 2.0 license.

Property	Details
Tags	exbert
License	apache-2.0
Training Datasets	bookcorpus, wikipedia
