🚀 legal_t5_small_trans_de_sv Model
A model for translating legal text from German to Swedish, providing high-precision translation in the legal domain.
🚀 Quick Start
The legal_t5_small_trans_de_sv model is tailored for translating legal text from German to Swedish. It was initially released in this repository.
✨ Features
- Based on the t5-small model and trained on a large parallel text corpus.
- A smaller-scale model with about 60 million parameters, using d_model = 512, d_ff = 2,048, 8-headed attention, and 6 layers each in the encoder and decoder (see the configuration sketch after this list).
- Trained on three parallel corpora: JRC-ACQUIS, EUROPARL, and DCEP.
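The architecture figures above can be checked against the published configuration. The following is a minimal sketch using the standard transformers T5Config field names (these field names are standard transformers APIs, not taken from the original card):

```python
# Minimal sketch: inspect the published configuration to confirm the
# architecture figures listed above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("SEBIS/legal_t5_small_trans_de_sv")
print(config.d_model)     # hidden size (d_model)
print(config.d_ff)        # feed-forward size (d_ff)
print(config.num_heads)   # attention heads
print(config.num_layers)  # encoder/decoder layers
```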
📦 Installation
No model-specific installation is required; the model can be loaded directly with the Hugging Face transformers library (e.g. `pip install transformers sentencepiece`).
💻 Usage Examples
Basic Usage
Here is how to use this model to translate legal text from German to Swedish in PyTorch:
```python
from transformers import AutoTokenizer, AutoModelWithLMHead, TranslationPipeline

# Build a translation pipeline from the pretrained model and tokenizer.
# device=0 runs on the first GPU; use device=-1 for CPU-only inference.
pipeline = TranslationPipeline(
    model=AutoModelWithLMHead.from_pretrained("SEBIS/legal_t5_small_trans_de_sv"),
    tokenizer=AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path="SEBIS/legal_t5_small_trans_de_sv",
        do_lower_case=False,
        skip_special_tokens=True,
    ),
    device=0,
)

# German source text: "Subject: Leader programme"
de_text = "Betrifft: Leader-Programm"

pipeline([de_text], max_length=512)
```
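`AutoModelWithLMHead` is deprecated in recent transformers releases. As an alternative sketch (the class names below are standard transformers APIs, not taken from the original card), the model can also be used with the seq2seq classes and an explicit `generate` call:

```python
# Alternative sketch using the non-deprecated seq2seq classes.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "SEBIS/legal_t5_small_trans_de_sv"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

de_text = "Betrifft: Leader-Programm"
inputs = tokenizer(de_text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=512)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```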
📚 Documentation
Model Description
The legal_t5_small_trans_de_sv model is based on the t5-small model and trained on a large parallel text corpus. It is a smaller model that scales down the baseline t5 model: it uses d_model = 512, d_ff = 2,048, 8-headed attention, and only 6 layers each in the encoder and decoder, giving approximately 60 million parameters.
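The "approximately 60 million parameters" figure can be sanity-checked by counting parameters after loading the model. This is a minimal sketch; the exact count may differ slightly depending on which weights (e.g. tied embeddings) are included:

```python
# Minimal sketch: count parameters to sanity-check the ~60M figure.
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("SEBIS/legal_t5_small_trans_de_sv")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```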
Intended Uses & Limitations
This model can be used for translating legal texts from German to Swedish.
Training Data
The legal_t5_small_trans_de_sv model was trained on the [JRC-ACQUIS](https://wt-public.emm4u.eu/Acquis/index_2.2.html), EUROPARL, and [DCEP](https://ec.europa.eu/jrc/en/language-technologies/dcep) datasets, which together comprise 5 million parallel texts.
Training Procedure
- Overall training: The model was trained on a single TPU Pod V3-8 for 250K steps in total, using a sequence length of 512 (batch size 4,096). It has approximately 220M parameters in total and was trained with an encoder-decoder architecture.
- Preprocessing: A unigram model was trained on 88M lines of text from the parallel corpus (covering all possible language pairs) to build the vocabulary (with byte-pair encoding) used with this model; a rough sketch of this step is given after this list.
- Pretraining: No further pretraining details are provided.
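The preprocessing step is only described at a high level. As a rough sketch, a unigram subword vocabulary of the kind mentioned could be trained with the sentencepiece library; the input file name and vocabulary size below are illustrative assumptions, not values from the original document:

```python
# Illustrative sketch of training a unigram subword model with SentencePiece.
# "parallel_corpus.txt" and vocab_size=32000 are assumptions for illustration.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="parallel_corpus.txt",    # hypothetical path: one sentence per line
    model_prefix="legal_t5_vocab",  # writes legal_t5_vocab.model / .vocab
    model_type="unigram",
    vocab_size=32000,
)

sp = spm.SentencePieceProcessor(model_file="legal_t5_vocab.model")
print(sp.encode("Betrifft: Leader-Programm", out_type=str))
```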
Evaluation Results
When used on the translation test dataset, the model achieves the following results:
| Property | Details |
|----------|---------|
| Model Type | legal_t5_small_trans_de_sv |
| BLEU score | 41.69 |
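The evaluation protocol is not detailed beyond the score above. As a minimal sketch (assuming lists of model translations and Swedish reference translations, and the sacrebleu package), corpus-level BLEU could be computed as follows; the sentences shown are placeholders, not the actual test set behind the 41.69 score:

```python
# Minimal BLEU sketch with sacrebleu; sentences are placeholder examples.
import sacrebleu

hypotheses = ["Angående: Leader-programmet"]    # model outputs (illustrative)
references = [["Angående: Leader-programmet"]]  # one list per reference set

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)
```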
BibTeX entry and citation info
Created by Ahmed Elnaggar/@Elnaggar_AI | [LinkedIn](https://www.linkedin.com/in/prof-ahmed-elnaggar/)