
🚀 Llama 2 13b Chat Norwegian
Llama-2-13b-chat-norwegian is a variant of Meta's Llama 2 13b Chat model. It's finetuned on a mix of Norwegian datasets, enabling it to understand and generate text in Norwegian, which is valuable for NLP tasks in the Norwegian language.
🚀 Quick Start
Llama-2-13b-chat-norwegian can be used for Norwegian text generation as soon as a Python environment with the Hugging Face libraries is set up. See the usage examples below.
✨ Features
- Norwegian Language Support: This model is specifically tuned to understand and generate text in Norwegian, making it suitable for various Norwegian NLP applications.
- Finetuned on Diverse Datasets: It's finetuned on a combination of Norwegian datasets, including norwegian-alpaca and machine-translated data from OpenOrca, along with a small subset of custom-made instructional data.
💻 Usage Examples
Basic Usage
The model can be used for text generation through the Hugging Face transformers library. For example, in Python:
# Load the tokenizer and the model weights from the Hugging Face Hub
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "RuterNorway/Llama-2-13b-chat-norwegian"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Tokenize the input and generate a continuation
input_text = "This is a test input"
input_ids = tokenizer(input_text, return_tensors='pt').input_ids
# max_new_tokens bounds the length of the reply; 128 is an arbitrary example value
output = model.generate(input_ids, max_new_tokens=128)
# The output contains the prompt tokens followed by the generated tokens
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
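For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `output[0]` repeats the input text. A minimal sketch of separating out the continuation, assuming the sequence supports `len` and slicing (Python lists and 1-D PyTorch tensors both do). `continuation_ids` is a hypothetical helper name, not part of transformers:

```python
def continuation_ids(output_ids, prompt_length):
    """Return only the token ids that were generated after the prompt."""
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[prompt_length:]


# Illustration with made-up token ids (not real vocabulary entries).
prompt_ids = [1, 910, 338]
full_output = prompt_ids + [263, 1243, 2]
new_tokens = continuation_ids(full_output, len(prompt_ids))
print(new_tokens)
```

With the example above, `tokenizer.decode(continuation_ids(output[0], input_ids.shape[1]), skip_special_tokens=True)` would decode only the model's reply.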
Advanced Usage
For more complex tasks such as chat-based interactions, use one of the prompt templates listed under Prompt Template:
# Using the Llama2 Chat prompt template
prompt = "<s>[INST] <<SYS>> You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information. Please answer in the same language as the user. <</SYS>> This is a test question[/INST]"
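# The prompt already starts with <s>, so do not let the tokenizer add a second BOS token
input_ids = tokenizer(prompt, return_tensors='pt', add_special_tokens=False).input_ids
# max_new_tokens bounds the length of the reply; 256 is an arbitrary example value
output = model.generate(input_ids, max_new_tokens=256)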
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
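Writing the chat prompt by hand is error-prone, especially across several turns. The sketch below assembles a prompt in the Llama 2 chat layout from a system prompt and a list of turns. `build_llama2_prompt` and its arguments are hypothetical names chosen for illustration, and the exact whitespace follows the commonly used Llama 2 convention, which is an assumption rather than something this model card specifies:

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
BOS, EOS = "<s>", "</s>"


def build_llama2_prompt(system_prompt, turns):
    """Build a Llama 2 style chat prompt (hypothetical helper).

    `turns` is a list of (user_message, assistant_reply) pairs. Every pair
    except the last needs a reply; the last reply must be None because the
    model is expected to produce it.
    """
    if not turns or turns[-1][1] is not None:
        raise ValueError("the last turn must have a reply of None")
    if any(reply is None for _, reply in turns[:-1]):
        raise ValueError("only the last turn may have a reply of None")

    parts = []
    for index, (user, reply) in enumerate(turns):
        user = user.strip()
        # The system prompt is wrapped in <<SYS>> tags inside the first user turn only.
        if index == 0 and system_prompt:
            user = f"{B_SYS}{system_prompt.strip()}{E_SYS}{user}"
        segment = f"{BOS}{B_INST} {user} {E_INST}"
        if reply is not None:
            # A completed turn is closed with the end-of-sequence marker.
            segment += f" {reply.strip()} {EOS}"
        parts.append(segment)
    return "".join(parts)


chat_prompt = build_llama2_prompt(
    "Please answer in the same language as the user.",
    [("Hva er hovedstaden i Norge?", "Oslo."), ("Hvor mange bor der?", None)],
)
print(chat_prompt)
```

Because the resulting string already contains the literal `<s>` marker, it would typically be tokenized with `add_special_tokens=False`, assuming the tokenizer otherwise prepends its own BOS token.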
📚 Documentation
Data
- Norwegian alpaca: A key dataset used for finetuning.
- 15k Norwegian OpenOrca (to be released): Another dataset that contributes to the model's training.
- Small subset of custom-made instructional data: Adds specific knowledge and patterns to the model.
Intended Use
This model is intended for commercial and research use in Norwegian and can be used for assistant-like chat.
Prompt Template
- Llama2 Chat Prompt Format:
<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information. Please answer in the same language as the user.
<</SYS>>
This is a test question[/INST] This is an answer </s><s>
See the original implementation here.
- Alpaca Prompt Format:
### Instruction:
Summarize the following text.
### Input:
Text to be summarized
### Response:
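The Alpaca format can be filled in programmatically. This is a minimal sketch that follows the section layout shown above. `build_alpaca_prompt` is a hypothetical helper, and leaving out the `### Input:` section when there is no input is an assumption borrowed from common Alpaca usage, not something this model card states:

```python
def build_alpaca_prompt(instruction, input_text=""):
    """Fill the Alpaca-style template with an instruction and optional input."""
    sections = ["### Instruction:", instruction.strip()]
    if input_text.strip():
        # Only include the Input section when there is text to attach.
        sections += ["### Input:", input_text.strip()]
    # The prompt ends with the Response header so the model writes the answer.
    sections.append("### Response:")
    return "\n".join(sections) + "\n"


alpaca_prompt = build_alpaca_prompt(
    "Summarize the following text.",
    "Text to be summarized",
)
print(alpaca_prompt)
```

The returned string can be passed to the tokenizer in the same way as the plain input in the basic usage example.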
Why this model?
As a Norwegian company, we recognize the urgent need for powerful language models tailored to specific languages. Our goal is to democratize information, promote innovation, and create a more inclusive digital ecosystem by providing this open-source Norwegian model. We hope it will serve as a foundational resource for future specialized Norwegian models and strengthen the Norwegian NLP community.
Limitations
- Knowledge Limitation: It's an LLM, not a knowledge model, and can't be expected to have more information about Norway than the base model.
- Task-Specific Performance: Generally performs better on summarization, question-answering, and chat tasks than on tasks requiring in-depth knowledge of Norway, specific domains, or free-form answering.
- Data Quality: The training data is machine-translated and may contain grammatical and other errors.
- Prompt Tuning: The model is released as is and usually requires prompt tuning for optimal results.
License
Llama 2 is licensed under the LLAMA 2 [Community License](https://ai.meta.com/resources/models-and-libraries/llama-downloads/), Copyright © Meta Platforms, Inc. All Rights Reserved. See the original [model card](https://huggingface.co/meta-llama/Llama-2-13b) for more information. Also, from [norwegian-alpaca](https://huggingface.co/NbAiLab/norwegian-alpaca), note that "the current version uses OpenAI's gpt-3.5-turbo; hence, this dataset cannot be used to create models that compete in any way against OpenAI."
Disclaimer
- As-Is Availability: The model is available "as is", and Ruter AS takes no responsibility for further use.
- Ethical Considerations: Although the safeguards implemented by Meta seem to work as expected during testing, developers should refer to the Ethical Considerations and Limitations from the original model card:
Llama 2 is a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios.
For these reasons, as with all LLMs, Llama 2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts.
Therefore, before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to their specific applications of the model.
Please see the Responsible Use Guide available at https://ai.meta.com/llama/responsible-use-guide/
Credits
This model was developed at Ruter's AI Lab, which is part of Ruter's Data & AI division.

