FastwebMIIA 7B
🚀 FastwebMIIA
FastwebMIIA is a large language model developed by Fastweb, specifically designed for the Italian language and cultural context, offering support for various text-generation tasks.
🚀 Quick Start
This model card offers an overview of FastwebMIIA, the Italian Artificial Intelligence Model developed by Fastweb.
✨ Features
- Multilingual Support: Trained in both Italian and English, catering to a broader user base.
- Optimized Tokenizer: A custom tokenizer optimized for Italian, English, and main programming languages, with a vocabulary of 50,000 tokens.
- Extended Context Window: Supports an extended context window of 16k tokens, enabling it to handle long-form content effectively.
- Multiple Access Platforms: Accessible through on-premise low-code platforms and Hugging Face, meeting different deployment needs.
📦 Installation
The model was trained and tested with `transformers==4.45.2` (for example, install it with `pip install transformers==4.45.2`). The following code sets up the model and runs a short chat:
```python
import transformers
import torch

model_id = "Fastweb/FastwebMIIA-7B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="cuda",
)

messages = [
    {"role": "system", "content": "Sei FastwebMIIA, il chatbot italiano sviluppato da Fastweb."},
    {"role": "user", "content": "Ciao!"},
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    repetition_penalty=1.1,
    top_p=0.9,
    temperature=0.1,
)

print(outputs[0]["generated_text"][-1])
# output: {'role': 'assistant', 'content': 'Ciao! Come posso aiutarti oggi?'}
```
💻 Usage Examples
Basic Usage
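Basic usage follows the Quick Start example above: build a list of chat messages and pass it to the text-generation pipeline. For finer control over tokenization and decoding, the model can also be loaded directly with AutoTokenizer and AutoModelForCausalLM and prompted through its chat template. The following is a minimal sketch assuming the standard transformers chat-template API; the generation settings mirror the Quick Start and are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Fastweb/FastwebMIIA-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

messages = [
    {"role": "system", "content": "Sei FastwebMIIA, il chatbot italiano sviluppato da Fastweb."},
    {"role": "user", "content": "Ciao!"},
]

# Build the prompt with the model's chat template and move it to the model device.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Generation settings mirror the Quick Start example above (illustrative, not prescriptive).
output_ids = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```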
📚 Documentation
Model Overview
FastwebMIIA is a large language model with 7 billion parameters, based on an autoregressive transformer architecture. It is specifically designed and trained for the Italian language and cultural context, using a carefully curated, predominantly Italian corpus, and is fully compliant with the EU AI Act and national regulations.
Property | Details |
---|---|
Model Developer | Fastweb |
Model Type/Architecture | Based on autoregressive (causal, decoder-only) transformer architectures, incorporating rotary position embeddings and trained using the next-token prediction objective. |
Languages Available | Trained in Italian and English |
Model Release Date | May 29, 2025 |
License | Accessible under a Non-Commercial License for non-commercial research, educational and internal use, and a custom Commercial License for commercial use. |
Model Access
FastwebMIIA can be accessed through multiple platforms:
- On-Premise (Low-Code Tooling): Deployable within enterprise environments for commercial purposes via a low-code platform. Contact Attivazione.FastwebMIIA@fastweb.it for a commercial demo or more information about enterprise deployment.
- Hugging Face: The model weights and configuration files are publicly available on Hugging Face for personal non-professional research and non-commercial professional activities. Users can download, fine-tune, or deploy the model using Hugging Face's tools and hosted infrastructure under the Non-Commercial License.
Hardware and Software
FastwebMIIA was trained on a proprietary NVIDIA H100 GPU cluster, with the training workflow managed by MLDE (Machine Learning Development Environment) and LLMFoundry. There is no guarantee of compatibility with the Licensee's specific environments, operating systems, hardware, or software.
Training Details
Architecture details
Hyperparameter | Value |
---|---|
Number of layers | 32 |
Number of attention heads | 32 |
Head size | 128 |
Number of Key-Value heads | 8
Hidden dimension size | 4096 |
Intermediate (MLP) size | 14,336 |
MLP activation function | SiLU |
MLP type | Standard |
Attention dropout | 0.0 |
MLP/Attention bias | No |
Normalization type | RMSNorm |
RMSNorm epsilon | 1e-5
Vocabulary size | 50,270 |
Sequence length (context window) | 16,384 |
Rotary position embedding type | LLaMA v3-style
Rotary base (rope theta) | 500,000 |
Rotary scaling factor | 8.0 |
High/Low frequency rope factors | 4.0 / 1.0 |
Weight initialization range | ±0.02 |
Tied word embeddings | No |
Data type | bfloat16 |
Total parameter count | 7.39 billion |
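As a sanity check, the 7.39 billion figure can be reproduced from the hyperparameters in the table above. The sketch below assumes a Llama-style layout with untied embeddings and a gated (SwiGLU-style) feed-forward block with three projection matrices; the gated-MLP layout is typical for this architecture family but is an assumption, not stated explicitly in the table.

```python
# Hyperparameters taken from the architecture table above.
vocab_size = 50_270
hidden = 4_096
layers = 32
kv_heads = 8
head_dim = 128
intermediate = 14_336

# Untied input and output embeddings.
embeddings = 2 * vocab_size * hidden

# Attention: Q and O projections are hidden x hidden; K and V are hidden x (kv_heads * head_dim).
attention = 2 * hidden * hidden + 2 * hidden * (kv_heads * head_dim)

# Assumed gated (SwiGLU-style) MLP: gate, up, and down projections.
mlp = 3 * hidden * intermediate

# Two RMSNorm weight vectors per layer, plus one final norm after the last layer.
norms_per_layer = 2 * hidden

total = embeddings + layers * (attention + mlp + norms_per_layer) + hidden
print(f"{total / 1e9:.2f}B parameters")  # ~7.39B, matching the table
```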
Tokenizer
The tokenizer has a vocabulary size of 50,260 and was trained via the Byte-Pair Encoding (BPE) algorithm. It includes 50,000 BPE tokens, 256 tokens representing all byte values, and 4 special tokens (BOS, EOS, PAD, UNK). The training set was a subset of high-quality data in Italian, English, and programming languages.
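As an illustration, the vocabulary composition can be inspected directly from the published tokenizer. This is a minimal sketch assuming the tokenizer exposes the standard Hugging Face attributes:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Fastweb/FastwebMIIA-7B")

# Total vocabulary (BPE merges + byte-fallback tokens + special tokens).
print(len(tokenizer))

# Special tokens registered on the tokenizer (BOS, EOS, PAD, UNK).
print(tokenizer.special_tokens_map)

# Example: tokenize a short Italian sentence.
print(tokenizer.tokenize("Buongiorno, come stai?"))
```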
Fertility
Tokenizer fertility is calculated as the ratio between the number of tokens produced and the number of words in the original text. The following table shows the fertility values calculated on a subset (1%) of the Italian Wikipedia dataset from March 2022:
model | vocabulary size | fertility |
---|---|---|
Almawave/Velvet-14B | 126976 | 1.537129
Fastweb/FastwebMIIA-7B | 50270 | 1.569404
iGeniusAI/Italia-9B-Instruct-v0.1 | 50003 | 1.589896
sapienzanlp/Minerva-7B-instruct-v1.0 | 51203 | 1.620168
google/gemma-2-9b-it | 256000 | 1.708481
utter-project/EuroLLM-9B-Instruct | 128000 | 1.723624
mistralai/Ministral-8B-Instruct-2410 | 131072 | 1.771119
meta-llama/Llama-3.1-8B-Instruct | 128256 | 1.970075
microsoft/Phi-3-small-8k-instruct | 100352 | 1.974537
Qwen/Qwen2.5-7B-Instruct | 151665 | 2.020880
ibm-granite/granite-3.1-8b-instruct | 49155 | 2.386821
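For reference, the fertility metric defined above can be reproduced approximately as follows. This is a minimal sketch that uses a plain whitespace word count on a small sample of Italian text; the exact sampling and word-counting procedure behind the table is not specified here, so the result will not match the table exactly.

```python
from transformers import AutoTokenizer

def fertility(tokenizer, texts):
    """Ratio of produced tokens to whitespace-separated words."""
    n_tokens = sum(len(tokenizer.tokenize(t)) for t in texts)
    n_words = sum(len(t.split()) for t in texts)
    return n_tokens / n_words

tokenizer = AutoTokenizer.from_pretrained("Fastweb/FastwebMIIA-7B")
sample = [
    "La Divina Commedia è un poema allegorico di Dante Alighieri.",
    "Roma è la capitale d'Italia e conta quasi tre milioni di abitanti.",
]
print(f"fertility: {fertility(tokenizer, sample):.3f}")
```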
Training Data
FastwebMIIA was pretrained on approximately 1.5–2 × 10^12 textual tokens from a combination of publicly available and proprietary sources. The corpus mainly consists of Italian and English content, with a smaller proportion of other languages. Fine-tuning used a mix of open instruction-tuning datasets and synthetic examples generated by Phi family models. No prompt data is stored, and no user data is collected for training. The pretraining data has a cutoff of August 2024, with data collection continuing until February 2025.
Limitations and Biases
FastwebMIIA may generate factually inaccurate, misleading, or incomplete responses. It may also reflect social, cultural, or historical biases present in its training data. It should not be considered an authoritative source of information or a replacement for professional judgment.
Intended Use
FastwebMIIA is designed for tasks such as chat-based assistance, content generation, summarization, and information extraction. It is intended for research, development, and integration into AI applications with proper safeguards.
Out-of-Scope or Prohibited Use
FastwebMIIA must not be used for illegal or fraudulent activities, to generate harmful or deceptive content, or in high-risk domains without human oversight. The Licensee is responsible for the model's use and its outcomes.
Report issues
Report any misuse, unexpected behavior, or concerns about the model's outputs to assistenza.FastwebMIIA@fastweb.it.
Evaluation
The model was evaluated using EleutherAI's [lm-eval framework](https://github.com/EleutherAI/lm-evaluation-harness). The evaluation focused on Italian-specific benchmarks:
- HellaSwag IT: A multiple-choice task for reasoning and text completion in Italian.
- ARC IT (AI2 Reasoning Challenge): A multiple-choice benchmark for science questions in Italian.
- ARC Challenge MT IT: A multilingual adaptation of the ARC Challenge for Italian.
- MMLU IT: The Massive Multitask Language Understanding dataset in Italian.
- Global MMLU IT: An extended version of MMLU in Italian.
- XCOPA IT: A multilingual benchmark for causal reasoning in Italian.
General Knowledge Benchmark Scores
Tasks | Metric | Score 5-shot | Score 0-shot |
---|---|---|---|
arc_challenge_mt_it | acc_norm | 0.5 | 0.4317 |
arc_it | acc_norm | 0.5158 | 0.4559 |
global_mmlu_it | acc | 0.615 | 0.5525 |
hellaswag_it | acc_norm | 0.6453 | 0.6453 |
m_mmlu_it | acc | 0.5707 | 0.5293 |
xcopa_it | acc | 0.784 | 0.774 |
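The scores above can in principle be reproduced with the lm-evaluation-harness. The following is a minimal sketch assuming a recent harness release (v0.4+) that exposes the Python simple_evaluate entry point and ships the Italian task names used in the table; actual task availability and arguments may differ by version.

```python
import lm_eval

# Task names as reported in the table above; availability depends on the installed harness version.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Fastweb/FastwebMIIA-7B,dtype=bfloat16",
    tasks=["arc_it", "hellaswag_it", "m_mmlu_it", "xcopa_it"],
    num_fewshot=5,
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```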
Model Updates
New versions of the model will be published on this page. The Licensee is responsible for using the latest version to avoid potential issues.
📄 License
FastwebMIIA is accessible under a Non-Commercial License explicitly allowing for non-commercial research, educational and internal use, and under a custom Commercial License for any commercial use of the Model. To access the model, you must accept [FastwebMIIA's Non-Commercial License](https://www.fastweb.it/grandi-aziende/artificial-intelligence/Non-Commercial-License.pdf), the [Acceptable Use Policy](https://www.fastweb.it/grandi-aziende/artificial-intelligence/fastweb-miia/documentazione-trasparenza-ai/Acceptable%20Use%20Policy.pdf) (AUP), and the other attached documents.
⚠️ Important Note
This repository is publicly accessible, but you must accept the conditions to access its files and content. By downloading, accessing, and using the model as specified, you fully accept [FastwebMIIA's Non-Commercial License](https://www.fastweb.it/grandi-aziende/artificial-intelligence/Non-Commercial-License.pdf), the [Acceptable Use Policy](https://www.fastweb.it/grandi-aziende/artificial-intelligence/fastweb-miia/documentazione-trasparenza-ai/Acceptable%20Use%20Policy.pdf) (AUP), and the other attached documents. If you do not agree to the terms and conditions in the license and the related documents, you must not download or use the model and should delete any copies you may already have.
💡 Usage Tip
When using FastwebMIIA, no prompt data is stored and user inputs are not recorded: no personally identifiable information (PII) is collected, and user data is not used for training purposes.

