🚀 Model Card for TowerInstruct-7B-v0.1
TowerInstruct-7B-v0.1 is a fine-tuned language model designed to handle various translation-related tasks. It offers a wide range of capabilities in multiple languages, providing solutions for translation, automatic post-edition, and more.
📚 Documentation
✨ Features
- TowerInstruct-7B is a 7B parameter model fine-tuned on TowerBase using the TowerBlocks supervised fine-tuning dataset.
- It can handle multiple translation-related tasks, including general machine translation, automatic post-edition, named-entity recognition, and more.
- Supports 10 languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian.
📦 Installation
The installation steps are shown in the code example for running the model. You may need to install transformers from source (for versions <= v4.34) and accelerate.
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import pipeline

# Load the model in bfloat16; device_map="auto" places it on the available GPU(s).
pipe = pipeline("text-generation", model="Unbabel/TowerInstruct-7B-v0.1", torch_dtype=torch.bfloat16, device_map="auto")

# TowerInstruct expects ChatML-formatted prompts without a system prompt (see "Prompt Format" below).
messages = [
    {"role": "user", "content": "Translate the following text from Portuguese into English.\nPortuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução.\nEnglish:"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])
```
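
The same pipeline can be reused for any of the ten supported languages. The sketch below is illustrative only: the exact prompt wording is an assumption, and the prompt templates actually used for each task are available in TowerBlocks.

```python
# Illustrative follow-up query reusing the `pipe` object defined above
# (English -> German; the prompt wording here is an assumption, not an official template).
messages = [
    {"role": "user", "content": "Translate the following text from English into German.\nEnglish: A group of researchers released a new model for translation-related tasks.\nGerman:"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
print(outputs[0]["generated_text"])
```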
🔧 Technical Details
Model Description
TowerInstruct-7B is a language model that results from fine-tuning TowerBase on the TowerBlocks supervised fine-tuning dataset. TowerInstruct-7B-v0.1 is the first model in the series. The model is trained to handle several translation-related tasks, such as general machine translation (e.g., sentence- and paragraph-level translation, terminology-aware translation, context-aware translation), automatic post-edition, named-entity recognition, grammatical error correction, and paraphrase generation. More details will be released in the upcoming technical report.
Intended uses & limitations
The model was initially fine-tuned on a filtered and preprocessed supervised fine-tuning dataset (TowerBlocks), which contains a diverse range of data sources:
- Translation (sentence and paragraph-level)
- Automatic Post Edition
- Machine Translation Evaluation
- Context-aware Translation
- Terminology-aware Translation
- Multi-reference Translation
- Named-entity Recognition
- Paraphrase Generation
- Synthetic Chat data
- Code instructions
The model is not guaranteed to perform well for languages other than the 10 languages it supports. It is not intended to be used as a conversational chatbot or code assistant, nor as a document-level translator.
Prompt Format
TowerInstruct-v0.1 was trained using the ChatML prompt templates without any system prompts. An example follows:
```
<|im_start|>user
{USER PROMPT}<|im_end|>
<|im_start|>assistant
{MODEL RESPONSE}<|im_end|>
<|im_start|>user
[...]
```
The prompts for all supervised tasks can be found in TowerBlocks. Multiple prompt templates were used for each task, and the difference in downstream performance should be minimal.
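
For illustration, the format above is what the model's chat template produces from a list of messages. A minimal sketch, assuming the tokenizer shipped with the model used in the usage example:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Unbabel/TowerInstruct-7B-v0.1")

messages = [
    {"role": "user", "content": "Translate the following text from Portuguese into English.\nPortuguese: Olá, mundo.\nEnglish:"},
]

# add_generation_prompt=True appends the opening <|im_start|>assistant tag,
# signalling the model to generate the assistant turn; no system prompt is added.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```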
Training Details
Training Data
The training data is available in the TowerBlocks dataset.
Training Hyperparameters
The following hyperparameters were used during training:
| Hyperparameter | Value |
|---|---|
| total_train_batch_size | 256 |
| learning_rate | 7e-06 |
| lr_scheduler_type | cosine |
| lr_scheduler_warmup_steps | 500 |
| weight_decay | 0.01 |
| optimizer | Adam with betas=(0.9, 0.999) and epsilon=1e-08 |
| num_epochs | 4 |
| max_seq_length | 2048 |
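
For reference, these hyperparameters map onto a standard Hugging Face TrainingArguments configuration roughly as sketched below. This is a sketch of the correspondence, not the actual training script: the per-device batch size and gradient accumulation steps are placeholders whose product (times the number of GPUs) would need to equal the total train batch size of 256, and the precision setting is an assumption.

```python
from transformers import TrainingArguments

# Sketch only: how the reported hyperparameters would translate to TrainingArguments.
training_args = TrainingArguments(
    output_dir="towerinstruct-7b-sft",   # hypothetical output directory
    per_device_train_batch_size=4,       # placeholder; combine with accumulation/GPU count to reach 256
    gradient_accumulation_steps=8,       # placeholder
    learning_rate=7e-6,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    num_train_epochs=4,
    bf16=True,                           # assumption; the card does not state the training precision
)
# max_seq_length (2048) is not a TrainingArguments field; it is typically passed to the
# tokenizer/data pipeline or to an SFT trainer instead.
```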
📄 License
The model is licensed under CC-BY-NC-4.0. Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
📖 Citation
```bibtex
@misc{tower_llm_2024,
      title={Tower: An Open Multilingual Large Language Model for Translation-Related Tasks},
      author={Duarte M. Alves and José Pombal and Nuno M. Guerreiro and Pedro H. Martins and João Alves and Amin Farajian and Ben Peters and Ricardo Rei and Patrick Fernandes and Sweta Agrawal and Pierre Colombo and José G. C. de Souza and André F. T. Martins},
      year={2024},
      eprint={2402.17733},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
