🚀 GPT-SW3 - A Collection of Large Language Models
GPT-SW3 is a collection of large decoder-only pretrained transformer language models. It can generate coherent text in multiple languages and programming languages, and can be instructed to perform various text tasks.
🚀 Quick Start
To access the model from Python, since this is a private repository, you need to log in with your access token using `huggingface-cli login`. Refer to the HuggingFace Quick Start Guide for more details.
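If you prefer to authenticate from Python rather than the CLI, a minimal sketch using the huggingface_hub library is shown below; the token string is a placeholder for your own read-access token:
from huggingface_hub import login
# Log in programmatically; replace the placeholder with your own access token.
login(token="hf_...")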
The following code snippet demonstrates how to load the tokenizer and model, and use the GPU if available:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
model_name = "AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct"
device = "cuda:0" if torch.cuda.is_available() else "cpu"
prompt = "Träd är fina för att"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
model.to(device)
Generating text using the `generate` method:
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
generated_token_ids = model.generate(
inputs=input_ids,
max_new_tokens=100,
do_sample=True,
temperature=0.6,
top_p=1,
)[0]
generated_text = tokenizer.decode(generated_token_ids)
Using the HuggingFace pipeline is a convenient alternative:
generator = pipeline('text-generation', tokenizer=tokenizer, model=model, device=device)
generated = generator(prompt, max_new_tokens=100, do_sample=True, temperature=0.6, top_p=1)[0]["generated_text"]
✨ Features
- Multilingual Capability: GPT-SW3 can generate coherent text in 5 different languages (Swedish, Norwegian, Danish, Icelandic, English) and 4 programming languages.
- Instruction Following: It can be instructed to perform text tasks that it has not been explicitly trained for by casting them as text generation tasks.
📚 Documentation
Model Description
GPT-SW3 is a collection of large decoder-only pretrained transformer language models developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. It has been trained on a dataset containing 320B tokens in multiple languages and programming code, using the NeMo Megatron GPT implementation with a causal language modeling (CLM) objective. The instruct models were fine-tuned on instruction data in both chat and raw text formats.
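As an illustration of how an instruction task can be cast as plain text generation, the sketch below builds a chat-style prompt for an instruct checkpoint, reusing the tokenizer, model, and device from the Quick Start. The exact turn format (the <|endoftext|> and <s> markers and the User:/Bot: labels) is an assumption here, not a guarantee; verify it against the model card of the instruct checkpoint you use.
# A minimal sketch: casting an instruction as a text-generation prompt.
# NOTE: the turn format below is an assumed chat template; confirm the exact
# format for your instruct checkpoint before relying on it.
chat_prompt = (
    "<|endoftext|><s>\n"
    "User:\n"
    "Summarize why trees are good for cities.\n"
    "<s>\n"
    "Bot:\n"
)
input_ids = tokenizer(chat_prompt, return_tensors="pt")["input_ids"].to(device)
output_ids = model.generate(inputs=input_ids, max_new_tokens=100, do_sample=True, temperature=0.6, top_p=1)[0]
print(tokenizer.decode(output_ids))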
Intended Use
GPT-SW3 is an autoregressive large language model capable of generating text in multiple languages and programming languages. It can be used for text generation tasks and can be instructed to perform other text-related tasks.
Limitations
Like other large language models, GPT-SW3 has limitations. It may have issues with bias, safety, generation diversity, and hallucination. The model may overrepresent some viewpoints, contain stereotypes, generate inappropriate language, make errors, produce incorrect information, generate irrelevant or repetitive outputs, and create content that may not be suitable for all settings.
Compliance
The release of GPT-SW3 includes model weights, a configuration file, a tokenizer file, and a vocabulary file. None of these files contain personally identifiable information (PII) or copyrighted material.
Model Details
| Property | Details |
|----------|---------|
| Developer | AI Sweden in collaboration with RISE and the WASP WARA for Media and Language |
| Release Date | 2022-12-20 |
| Model Version | Second generation of GPT-SW3 |
| Model Type | Large decoder-only transformer language model |
| Training Algorithm | Trained with the NeMo Megatron GPT implementation |
| License | LICENSE |
| Contact | nlu@ai.se |
Intended Use Details
- Primary Uses: Pre-release for research and evaluation of large language models for Nordic languages.
- Intended Users: Organizations and individuals in the Nordic NLP ecosystem who can contribute to model validation and testing and provide feedback.
- Out-of-scope Use Cases: See the modified RAIL license.
Data, Limitations, and Recommendations
- Training Data Selection: Based on a combination of breadth and availability. See the datasheet for more details.
- Limitations: Similar to other large language models, it has issues with bias, safety, generation diversity, and hallucination.
- Recommendations: Indirect users should be aware of LLM-generated content. Users should be aware of risks and limitations and include appropriate disclaimers or blocking interfaces. Models pretrained with the LLM should have an updated model card. Users should provide feedback mechanisms.
Datasheet
- Motivation: To train Swedish large language models, a high-quality large-scale Swedish dataset was needed. Since no such dataset existed, data in Nordic and English languages were collected.
- Creator: The NLU research group at AI Sweden, which consists of researchers and developers from AI Sweden and RISE.
- Funding: Funded by the Swedish Innovation Agency (Vinnova) through several grants, including 2019-02996 and 2022-00949.
Composition
The dataset is a filtered and deduplicated collection of textual documents categorized by language and document type, including sources from books, articles, code, conversational data, math, miscellaneous sources, and web data.
📄 License
The model is released under a modified RAIL license; see the LICENSE file for the full terms.
⚠️ Important Note
GPT-SW3 has limitations in terms of bias, safety, generation diversity, and hallucination. The model may generate inappropriate or incorrect content.
💡 Usage Tip
Indirect users should be made aware when the content they are working with is created by the LLM. Users should be aware of the risks and limitations, and include an appropriate age disclaimer or blocking interface as necessary.