BERT base model (uncased)
A pretrained English language model trained with a masked language modeling (MLM) objective.
Quick Start
To start using the model, first download it by cloning the repository:
git clone https://huggingface.co/OWG/bert-base-uncased
Then run the model with the following Python code:
from onnxruntime import InferenceSession, SessionOptions, GraphOptimizationLevel
from transformers import BertTokenizer

# Load the tokenizer that matches the model's vocabulary.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Create an ONNX Runtime session with all graph optimizations enabled.
options = SessionOptions()
options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL
session = InferenceSession("path/to/model.onnx", sess_options=options)
session.disable_fallback()

# Tokenize the text (return_tensors="pt" requires PyTorch), then convert to NumPy arrays.
text = "Replace me by any text you want to encode."
encoded = tokenizer(text, return_tensors="pt", return_attention_mask=True)
inputs = {k: v.cpu().detach().numpy() for k, v in encoded.items()}

# Run inference and fetch the model's first output.
output_name = session.get_outputs()[0].name
outputs = session.run(output_names=[output_name], input_feed=inputs)
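The code above returns raw model outputs. If a single vector per sentence is needed, one common option is mean pooling over the token vectors. The sketch below is a minimal illustration, not part of the original model card: it assumes the first output is the last hidden state with shape (batch, sequence length, hidden size), which should be checked with `session.get_outputs()` for the exported file. The helper name `mean_pool` and the stand-in arrays are hypothetical.

```python
import numpy as np


def mean_pool(hidden_states, attention_mask):
    """Average token vectors per sequence, ignoring padded positions.

    Assumption: hidden_states has shape (batch, seq_len, hidden_size)
    and attention_mask has shape (batch, seq_len) with 1 for real tokens.
    """
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over hidden_size.
    mask = attention_mask[..., np.newaxis].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    # Guard against division by zero for fully padded rows.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts


# Stand-in data: one sequence of three tokens, the last one being padding.
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]], dtype=np.float32)
mask = np.array([[1, 1, 0]], dtype=np.int64)

sentence_vec = mean_pool(hidden, mask)
print(sentence_vec)  # values 2.0 and 3.0: the padded third token is ignored
```

With the Quick Start variables, and only if the shape assumption holds, the call would be `mean_pool(outputs[0], inputs["attention_mask"])`.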
Features
- Pretrained on English: The model is pretrained on English text with a masked language modeling (MLM) objective.
- Uncased: It does not distinguish between uppercase and lowercase letters, e.g., it treats "english" and "English" the same.
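To make the "uncased" behavior concrete, the sketch below approximates the text normalization an uncased BERT tokenizer applies before WordPiece splitting: lowercasing and accent stripping. It is a standalone illustration using only the standard library; the function name `uncased_normalize` is hypothetical, and the real preprocessing is done by `BertTokenizer`.

```python
import unicodedata


def uncased_normalize(text: str) -> str:
    """Lowercase the text and strip accents (an approximation of uncased preprocessing)."""
    text = text.lower()
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Both spellings map to the same string, so the model sees identical input.
same = uncased_normalize("English") == uncased_normalize("english")
print(same)
```

Because case is discarded before the model sees the text, an uncased model cannot use capitalization as a signal, for example to tell a proper noun from a common word.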
Installation
You can download the model by cloning the repository:
git clone https://huggingface.co/OWG/bert-base-uncased
Documentation
Model description
This model is pretrained on English text with a masked language modeling (MLM) objective. It was introduced in
the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Devlin et al., 2018) and first released in
the google-research/bert repository. The model is uncased, meaning it does not distinguish
between "english" and "English".
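As a rough illustration of the MLM objective, the sketch below corrupts a token sequence the way the BERT paper describes: about 15% of positions are selected for prediction, and of those, 80% become `[MASK]`, 10% become a random token, and 10% are left unchanged. These ratios come from the paper, not from this model card, and the function name `mask_tokens`, its parameters, and the toy vocabulary are hypothetical.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (corrupted_tokens, labels) for masked language modeling.

    labels[i] holds the original token at positions the model must predict,
    and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN          # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: keep the original token
    return corrupted, labels


tokens = "the quick brown fox jumps over the lazy dog".split()
# A high mask_prob is used here only so the short example selects some positions.
corrupted, labels = mask_tokens(tokens, vocab=tokens, mask_prob=0.5)
print(corrupted)
print(labels)
```

During pretraining, the model is trained to recover the tokens stored in `labels` from the corrupted sequence, which forces it to use context on both sides of each selected position.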
Original implementation
Follow this link to see the original implementation.
License
This model is licensed under the Apache 2.0 license.
| Property | Details |
|---|---|
| Tags | exbert |
| License | apache-2.0 |
| Training Datasets | bookcorpus, wikipedia |