🚀 legal_t5_small_multitask_cs_it Model
A model for efficient and accurate translation of legal text from Czech to Italian.
🚀 Quick Start
The legal_t5_small_multitask_cs_it model is dedicated to translating legal text from Czech to Italian. It was first released in this repository. The model is trained jointly on three parallel corpora spanning 42 language pairs (JRC-ACQUIS, Europarl, and DCEP), along with an unsupervised task that follows the prediction objective of a masked language model.
✨ Features
- Multitask Learning: Combines unsupervised tasks with translation tasks to achieve multitask learning without pretraining.
- Parallel Training: Trained on multiple parallel corpora to improve translation performance.
💻 Usage Examples
Basic Usage
Here is how to use this model to translate legal text from Czech to Italian in PyTorch:
```python
from transformers import AutoTokenizer, AutoModelWithLMHead, TranslationPipeline

# Build a translation pipeline from the pretrained model and tokenizer.
pipeline = TranslationPipeline(
    model=AutoModelWithLMHead.from_pretrained("SEBIS/legal_t5_small_multitask_cs_it"),
    tokenizer=AutoTokenizer.from_pretrained(
        pretrained_model_name_or_path="SEBIS/legal_t5_small_multitask_cs_it",
        do_lower_case=False,
        skip_special_tokens=True,
    ),
    device=0,  # GPU 0; use device=-1 to run on CPU
)

cs_text = "Příprava Evropské rady (29.-30. října 2009)"

pipeline([cs_text], max_length=512)
```
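The pipeline returns one dictionary per input sentence; the Italian output is stored under the `translation_text` key. A brief illustration, assuming the pipeline built above:

```python
results = pipeline([cs_text], max_length=512)
print(results[0]["translation_text"])  # the Italian translation of cs_text
```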
📚 Documentation
Model Description
No pretraining is involved in the legal_t5_small_multitask_cs_it model. Instead, an unsupervised task is added to all translation tasks to realize the multitask learning scenario.
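For illustration only, the sketch below shows one way such a multitask step could combine a supervised translation loss with an unsupervised masked-prediction loss. The sentinel-token corruption, the example sentences, and the use of AutoModelForSeq2SeqLM are assumptions for the sketch, not the authors' released training code.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("SEBIS/legal_t5_small_multitask_cs_it")
model = AutoModelForSeq2SeqLM.from_pretrained("SEBIS/legal_t5_small_multitask_cs_it")

# Supervised task: a Czech source sentence paired with its Italian reference.
src = tokenizer("Příprava Evropské rady", return_tensors="pt")
tgt = tokenizer("Preparazione del Consiglio europeo", return_tensors="pt")
translation_loss = model(
    input_ids=src.input_ids, attention_mask=src.attention_mask, labels=tgt.input_ids
).loss

# Unsupervised task: predict the original text from a corrupted copy, in the
# spirit of masked language modelling. The exact corruption scheme used for
# training is not documented here; this sentinel replacement is an assumption.
corrupted = tokenizer("Příprava <extra_id_0> rady", return_tensors="pt")
original = tokenizer("Příprava Evropské rady", return_tensors="pt")
unsupervised_loss = model(
    input_ids=corrupted.input_ids,
    attention_mask=corrupted.attention_mask,
    labels=original.input_ids,
).loss

# Multitask objective: combine both losses for a single backward pass.
(translation_loss + unsupervised_loss).backward()
```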
Intended Uses & Limitations
The model can be used for translating legal texts from Czech to Italian.
🔧 Technical Details
Training Data
The legal_t5_small_multitask_cs_it model (the supervised task using only the corresponding language pair, and the unsupervised task using data from all language pairs) was trained on the JRC-ACQUIS, Europarl, and DCEP datasets, which consist of 5 million parallel texts.
Training Procedure
- Hardware: The model was trained on a single TPU Pod V3-8.
- Steps: A total of 250K steps.
- Sequence Length: 512 (batch size 4096).
- Parameters: Approximately 220M parameters.
- Architecture: Encoder-decoder architecture.
- Optimizer: AdaFactor with an inverse square root learning rate schedule (see the configuration sketch after this list).
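As a rough illustration of that optimizer setting, the following sketch configures the AdaFactor implementation shipped with the transformers library using its built-in inverse-square-root relative-step schedule. The exact training hyperparameters are not published here, so treat these values as assumptions.

```python
from transformers import Adafactor, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("SEBIS/legal_t5_small_multitask_cs_it")

optimizer = Adafactor(
    model.parameters(),
    lr=None,             # let AdaFactor derive the step size internally
    relative_step=True,  # step size decays roughly as 1/sqrt(step)
    warmup_init=True,    # short warm-up before the inverse-sqrt decay
    scale_parameter=True,
)
```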
Preprocessing
A unigram model was trained with 88M lines of text from the parallel corpus (of all possible language pairs) to obtain the vocabulary (using byte-pair encoding), which is used with this model.
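For orientation, here is a minimal sketch of training a SentencePiece unigram model over a combined corpus file. The input file name and the vocabulary size are hypothetical placeholders, not the values used for this model.

```python
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="all_language_pairs.txt",  # hypothetical: one sentence per line
    model_prefix="legal_t5_vocab",   # writes legal_t5_vocab.model / .vocab
    model_type="unigram",            # unigram language model segmentation
    vocab_size=32000,                # assumption; the actual size may differ
    character_coverage=1.0,          # keep all characters (diacritics etc.)
)
```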
Evaluation Results
On the corresponding translation test dataset, the model achieves the following result:
| Model | BLEU score |
|-------|------------|
| legal_t5_small_multitask_cs_it | 45.297 |
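BLEU scores of this kind are commonly computed with sacreBLEU. The sketch below shows the general pattern, using placeholder hypothesis and reference strings rather than real model outputs.

```python
import sacrebleu

hypotheses = ["La traduzione italiana prodotta dal modello"]     # placeholder model outputs
references = ["La traduzione italiana di riferimento"]           # placeholder gold translations

bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.3f}")
```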
BibTeX entry and citation info
Created by Ahmed Elnaggar/@Elnaggar_AI | LinkedIn
📄 License
No license information is provided in the original document.