🚀 Tango-70B-Instruct
Tango-70B-Instruct is a large language model trained to enhance regional Spanish speech performance.
🚀 Quick Start
Tango-70B-Instruct can be used via the HuggingFace Transformers library. You'll need 2 or more 80GB GPUs (NVIDIA Ampere or newer) and at least 150GB of free disk space for the download.
This code has been tested on Transformers v4.44.0, torch v2.4.0, and 2 A100 80GB GPUs. Any setup that supports meta-llama/Llama-3.1-70B-Instruct should also support this model. If you encounter issues, try upgrading with `pip install -U transformers`.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Base model and PEFT adapter identifiers
base_model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"
adapter_model_id = "sandbox-ai/Tango-70b"

# Create quantization config for 4-bit precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load tokenizer from base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the base model with 4-bit quantization
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",  # This will automatically handle model sharding
    trust_remote_code=True,
)

# Load the PEFT adapter on top of the quantized base model
model = PeftModel.from_pretrained(
    base_model,
    adapter_model_id,
    device_map="auto",  # This will automatically handle model sharding
)

# Spanish test prompt: introduces the model as "Tango" and asks why building
# AI natively in Latin America matters compared with relying on AIs from
# the USA, France or China
hola_mundo = """
Bienvenido.
Tu nombre es "Tango", sos la primer IA hecha en LatinoAmérica, basada en un Large Language Model de 70 billones de parámetros y creada en Argentina.
Cuál es la importancia de hacer IA nativa en LatinoAmérica? qué beneficios trae haberte creado, en comparación a depender de las IAs creadas en USA, Francia o China?
"""

# Test prompt
messages = [
    {"role": "user", "content": hola_mundo}
]

# Format the input using the chat template
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generate response with memory-efficient settings
with torch.inference_mode():
    outputs = model.generate(
        inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # Set padding token
        attention_mask=torch.ones_like(inputs),  # Add attention mask
    )

# Decode and print the response (the decoded text also contains the prompt)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
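The hardware figures quoted in this Quick Start can be sanity-checked with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the model has exactly 70 billion parameters and counts weight storage alone, ignoring activations, the KV cache, and the small overhead of quantization constants. The helper name `weight_memory_gb` is hypothetical and not part of any library.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: "70B" is taken as exactly 70 billion parameters.
NUM_PARAMS = 70e9

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 16-bit checkpoint as downloaded
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # after 4-bit quantization

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{nf4_gb:.0f} GB")
```

Under these assumptions the 16-bit weights come to roughly 140 GB, which is in line with the 150GB of free disk space requested for the download, while the 4-bit weights take roughly 35 GB of GPU memory before any runtime overhead is added.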
✨ Features
- Regional Spanish Enhancement: Tango-70B-Instruct is trained on a modified version of spanish-ir/messirve to improve regional Spanish speech performance.
- Multi-language Support: Supports both English and Spanish.
📦 Model Information
Property | Details |
---|---|
Model Type | Tango-70B-Instruct |
License | llama3.1 |
Supported Languages | English, Spanish |
Inference | false |
Fine-tuning | true |
Tags | nvidia, llama3.1, spanish, tango |
Datasets | spanish-ir/messirve |
Base Model | nvidia/Llama-3.1-Nemotron-70B-Instruct-HF |
Pipeline Tag | text-generation |
Library Name | transformers |
📚 Documentation
Model Overview
Tango-70B-Instruct is a large language model trained by sandbox-ai on a modified version of spanish-ir/messirve to improve regional Spanish speech performance.
See details on the GitHub repo.
Terms of use
By accessing this model, you agree to the Llama 3.1 license terms and conditions, the acceptable use policy, and Meta’s privacy policy.
Evaluation Metrics
Task | Name | Description | Language | Metric | Task type |
---|---|---|---|---|---|
AQuAS | AQuAS | Abstractive Question-Answering in Spanish | ES | sas_encoder | Abstractive QA |
ARC_ca | ARC_ca | Grade-school level science questions in Catalan | CA | acc | Multi choice QA |
BEC2016eu | BEC2016eu | Basque Election Campaign 2016 Opinion Dataset | EU | f1 | Sentiment Analysis |
Belebele Glg | Belebele Glg | Reading Comprehension in Galician | GL | acc | Reading Comprehension |
BertaQA | BertaQA | Trivia dataset with global and local questions about the Basque Country | EU | acc | Multi choice QA |
BHTCv2 | BHTCv2 | Topic Classification of News Headlines in Basque | EU | f1 | Classification, Topic Classification |
caBREU | caBREU | Article Summarization in Catalan | CA | bleu | Summarization |
CatalanQA | CatalanQA | Extractive QA in Catalan | CA | f1 | Extractive QA |
CatCoLA | CatCoLA | Linguistic Acceptability in Catalan | CA | mcc | Linguistic Acceptability |
ClinDiagnosES | ClinDiagnosES | Diagnosis of clinical cases in Spanish | ES | sas_encoder | Open QA |
ClinTreatES | ClinTreatES | Treatment for clinical cases in Spanish | ES | sas_encoder | Open QA |
COPA_ca | COPA_ca | Choice Of Plausible Alternatives in Catalan | CA | acc | Reasoning |
CoQCat | CoQCat | Conversational Question Answering in Catalan | CA | f1 | Extractive QA |
Crows Pairs Spanish | Crows Pairs Spanish | Bias evaluation using stereotypes | ES | pct_stereotype | Bias Detection |
EpecKorrefBin | EpecKorrefBin | Coreference resolution in Basque | EU | acc | Coreference Resolution, Textual Entailment |
EsCoLA | EsCoLA | Spanish Corpus of Linguistic Acceptability | ES | mcc | Linguistic Acceptability |
EusExams | EusExams | Public Service examinations questions in Basque | EU | acc | Multi choice QA |
EusProficiency | EusProficiency | C1-level proficiency questions in Basque | EU | acc | Multi choice QA |
EusReading | EusReading | EGA exams reading comprehension in Basque | EU | acc | Multi choice QA |
EusTrivia | EusTrivia | Trivia questions in Basque | EU | acc | Multi choice QA |
Fake News ES | Fake News ES | Fake News Detection in Spanish | ES | acc | Classification |
GalCoLA | GalCoLA | Galician Corpus of Linguistic Acceptability | GL | mcc | Linguistic Acceptability |
HumorQA | HumorQA | White humour joke classification | ES | acc | Classification |
MGSM_ca | MGSM_ca | Grade-school math problems in Catalan | CA | exact_match | Math Reasoning |
MGSM_es | MGSM_es | Grade-school math problems in Spanish | ES | exact_match | Math Reasoning |
MGSM_eu | MGSM_eu | Grade-school math problems in Basque | EU | exact_match | Math Reasoning |
MGSM_gl | MGSM_gl | Grade-school math problems in Galician | GL | exact_match | Math Reasoning |
NoticIA | NoticIA | A Clickbait Article Summarization Dataset in Spanish | ES | rouge1 | Summarization |
OffendES | OffendES | Classification of offensive comments in Spanish | ES | acc | Classification |
OpenBookQA_ca | OpenBookQA_ca | Multi-step reasoning QA in Catalan | CA | acc | Reasoning |
OpenBookQA_gl | OpenBookQA_gl | Multi-step reasoning QA in Galician | GL | acc | Reasoning |
Parafraseja | Parafraseja | Paraphrase identification in Catalan | CA | acc | Paraphrasing |
ParafrasesGL | ParafrasesGL | Paraphrase identification in Galician | GL | acc | Paraphrasing |
PAWS_ca | PAWS_ca | Paraphrase Adversaries from Word Scrambling in Catalan | CA | acc | Paraphrasing |
PAWS-X_es | PAWS-X_es | Paraphrase Adversaries from Word Scrambling in Spanish | ES | acc | Paraphrasing |
PAWS_gl | PAWS_gl | Paraphrase Adversaries from Word Scrambling in Galician | GL | acc | Paraphrasing |
PIQA_ca | PIQA_ca | Physical Interaction QA in Catalan | CA | acc | Reasoning |
QNLIeu | QNLIeu | Textual Entailment in Basque | EU | acc | NLI, Textual Entailment |
RagQuAS | RagQuAS | Retrieval-Augmented-Generation and Question-Answering in Spanish | ES | sas_encoder | Abstractive QA |
SIQA_ca | SIQA_ca | Social Interaction QA in Catalan | CA | acc | Reasoning |
SpaLawEx | SpaLawEx | Spanish Law School Access Exams | ES | acc | Multi choice QA |
SummarizationGL | SummarizationGL | Abstractive Summarization in Galician | GL | bleu | Summarization |
TE-ca | TE-ca | Textual Entailment in Catalan | CA | acc | Textual Entailment |
TELEIA | TELEIA | Test de Español como Lengua Extranjera para Inteligencia Artificial (Test of Spanish as a Foreign Language for Artificial Intelligence) | ES | acc | Multi choice QA |
VaxxStance | VaxxStance | Stance detection on the Antivaxxers movement | EU | f1 | Sentiment Analysis, Stance Detection |
WiCeu | WiCeu | Word sense disambiguation in Basque | EU | acc | Textual Entailment |
WNLI_ca | WNLI_ca | Winograd-schema-type dataset in Catalan | CA | acc | NLI, Textual Entailment |
WNLI ES | WNLI ES | Winograd-schema-type dataset in Spanish | ES | acc | NLI, Textual Entailment |
XCOPA_eu | XCOPA_eu | Choice Of Plausible Alternatives in Basque | EU | acc | Reasoning |
XNLI_ca | XNLI_ca | Cross-lingual Natural Language Inference in Catalan | CA | acc | NLI, Textual Entailment |
XNLI_es | XNLI_es | Cross-lingual Natural Language Inference in Spanish | ES | acc | NLI |
XNLI_eu | XNLI_eu | Cross-lingual Natural Language Inference in Basque | EU | acc | NLI, Textual Entailment |
XQuAD_ca | XQuAD_ca | Cross-lingual Question Answering Dataset in Catalan | CA | f1 | Extractive QA |
XQuAD_es | XQuAD_es | Cross-lingual Question Answering Dataset in Spanish | ES | f1 | Extractive QA |
xStoryCloze_ca | xStoryCloze_ca | Narrative completion in Catalan | CA | acc | Reasoning |
xStoryCloze_es | xStoryCloze_es | Narrative completion in Spanish | ES | acc | Reasoning |
xStoryCloze_eu | xStoryCloze_eu | Narrative completion in Basque | EU | acc | Reasoning |
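
For readers unfamiliar with the metric names in the table, the sketch below shows how `acc`, `f1`, `mcc`, and `exact_match` can be computed on toy data. It is a minimal illustration in plain Python, not the evaluation code used for this model: the function names and the toy labels are hypothetical, `f1` and `mcc` are shown for the binary case only, and an actual evaluation harness may use other variants (for example, token-level F1 for extractive QA).

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient, ranging from -1 to 1."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def exact_match(references, predictions):
    """Fraction of predictions equal to the reference after trimming spaces."""
    hits = sum(r.strip() == p.strip() for r, p in zip(references, predictions))
    return hits / len(references)


# Hypothetical toy labels, purely for illustration
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]

print("acc:", round(accuracy(y_true, y_pred), 3))
print("f1: ", round(f1_binary(y_true, y_pred), 3))
print("mcc:", round(mcc(y_true, y_pred), 3))
print("exact_match:", exact_match(["42", "7"], [" 42 ", "8"]))
```

Note that `mcc` accounts for all four cells of the confusion matrix, which is why it is commonly preferred over accuracy for linguistic acceptability tasks with imbalanced classes.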
📄 License
This model is released under the llama3.1 license. By accessing this model, you agree to the Llama 3.1 license terms and conditions, the acceptable use policy, and Meta’s privacy policy.

