🚀 Loquace-7B-Mistral v0.1
Loquace is an Italian-speaking, instruction-finetuned Large Language Model. It aims to democratize AI and LLMs in the Italian landscape by showing that users can fine-tune a model on their own datasets with minimal resources.
✨ Features
- Italian Instruction Following: It excels at following instructions in Italian.
- Prompt Engineering Responsiveness: Responds well to prompt-engineering techniques.
- RAG Compatibility: Performs effectively in a RAG (Retrieval Augmented Generation) setup.
- Cost-Effective Training: Trained on the Loquace-102K dataset using QLoRA with Mistral-7B-Instruct as the base model. Training took only 4 hours on a single 3090 GPU on Genesis Cloud, costing a little over 1 euro.
- Truly Open Source: The model, dataset, and code to replicate the results are fully open-sourced.
- Garage Creation: Developed in a garage in southern Italy.
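The low training cost above is plausible given QLoRA's memory footprint. A back-of-the-envelope estimate (all numbers below are illustrative assumptions, not the actual Loquace training configuration):

```python
# Rough VRAM estimate for QLoRA fine-tuning of a 7B-parameter model.
# Every number here is an illustrative assumption, not the Loquace config.

params = 7e9                 # base model parameters
bytes_4bit = 0.5             # NF4 quantization: ~4 bits per weight
base_weights_gb = params * bytes_4bit / 1e9

# LoRA adapters: assume low-rank updates totalling ~0.1% of the base params.
lora_params = params * 0.001
# Adapter weights + gradients + Adam optimizer states (two fp32 moments):
adapter_gb = lora_params * (2 + 2 + 8) / 1e9

total_gb = base_weights_gb + adapter_gb
print(f"base weights: {base_weights_gb:.1f} GB, adapters+optimizer: {adapter_gb:.2f} GB")
print(f"~{total_gb:.1f} GB before activations -- comfortably inside a 24 GB 3090")
```

The frozen base model dominates, which is why 4-bit quantization is what makes single-consumer-GPU fine-tuning feasible; activations and batch size add on top of this, but still fit in 24 GB.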
📦 Installation
The related code for fine-tuning can be found at:
https://github.com/cosimoiaia/Loquace
The 8-bit quantized GGUF version for CPU inference of Loquace can be found here.
Here is a list of clients and libraries known to support GGUF:
- llama.cpp: The source project for GGUF, offering a CLI and a server option.
- text-generation-webui: A widely used web UI with numerous features and powerful extensions, supporting GPU acceleration.
- KoboldCpp: A fully-featured web UI with GPU acceleration across all platforms and GPU architectures, especially suitable for storytelling.
- LM Studio: An easy-to-use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration.
- LoLLMS Web UI: A great web UI with many interesting and unique features, including a full model library for easy model selection.
- Faraday.dev: An attractive and user-friendly character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
- ctransformers: A Python library with GPU acceleration, LangChain support, and an OpenAI-compatible API server.
- llama-cpp-python: A Python library with GPU acceleration, LangChain support, and an OpenAI-compatible API server.
- candle: A Rust ML framework with a focus on performance (including GPU support) and ease of use.
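Several of the clients above (llama-cpp-python, ctransformers) can expose an OpenAI-compatible server for the GGUF model. A minimal sketch of building the request body such a server expects; the endpoint path, port, and sampling parameters are assumptions, not part of this model card:

```python
import json

def build_completion_request(instruction: str) -> str:
    """Build a JSON body for an OpenAI-compatible /v1/completions endpoint
    (e.g. the server shipped with llama-cpp-python; port/path are assumptions)."""
    # Loquace uses an Alpaca-style "### Instruction / ### Response" prompt.
    prompt = f"### Instruction: {instruction}\n### Response:\n"
    body = {
        "prompt": prompt,
        "max_tokens": 512,
        "temperature": 0.7,
        "stop": ["### Instruction:"],  # stop before the model starts a new turn
    }
    return json.dumps(body)

# POST the returned string with e.g. urllib.request to
# http://localhost:8000/v1/completions (address is an assumption).
```

Matching the fine-tuning prompt template exactly matters: GGUF runtimes do not apply it for you, so the client is responsible for wrapping the user's instruction.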
Previous releases of the Loquace family:
The Loquace family began in early 2023 to demonstrate the feasibility of fine-tuning a Large Language Model in a different language. You can find other family members here:
- https://huggingface.co/cosimoiaia/Loquace-70m - Based on pythia-70m
- https://huggingface.co/cosimoiaia/Loquace-410m - Based on pythia-410m
- https://huggingface.co/cosimoiaia/Loquace-7B - Based on Falcon-7B
- https://huggingface.co/cosimoiaia/Loquace-12B - Based on pythia-12B
- https://huggingface.co/cosimoiaia/Loquace-20B - Based on gpt-neox-20B
Although these earlier models do not achieve strong results, they demonstrate that cost-effective fine-tuning is possible in a short time, with a relatively raw dataset and extremely limited resources.
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import AutoTokenizer, MistralForCausalLM


def generate_prompt(instruction):
    prompt = f"""### Instruction: {instruction}
### Response:
"""
    return prompt


model_name = "cosimoiaia/Loquace-7B-Mistral"

model = MistralForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model.config.use_cache = True

tokenizer = AutoTokenizer.from_pretrained(model_name, add_eos_token=False)

prompt = generate_prompt("Chi era Dante Alighieri?")
# Use the device the model was placed on by device_map="auto".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    do_sample=True,
    num_beams=2,
    top_k=50,
    top_p=0.95,
    max_new_tokens=2046,
    early_stopping=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("Response:")[1].strip())
```
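The `split("Response:")` in the last line assumes the marker appears in the decoded text. A slightly more defensive helper (hypothetical, not part of the original script) falls back to the full string when the marker is missing and takes the text after the *last* occurrence:

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    """Return the text after the last response marker, or the whole string
    stripped if the marker is absent (e.g. an unexpectedly truncated output)."""
    _, sep, tail = decoded.rpartition(marker)
    return tail.strip() if sep else decoded.strip()
```

For example, `extract_response("### Instruction: Chi era Dante?\n### Response:\nUn poeta.")` returns `"Un poeta."`.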
Advanced Usage
> Pretendi di essere Dante Alighieri e raccontami la tua vita.
Sono Dante Alighieri, un poeta e scrittore italiano del XIV secolo. Nacqui a Firenze nel 1265 e ho trascorso gran parte della mia vita viaggiando in Europa e scrivendo poesie. Ho anche scritto la Divina Commedia, una delle opere letterarie più famose di tutti i tempi.
> Quanto fa 7 x 14 - 98 ?
7 x 14 = 98, quindi la risposta è 0.
> La radice quadrata di 144 ?
La radice quadrata di 144 è 12.
> Inventa una ricetta pugliese con la polenta.
La ricetta è chiamata "Polenta al burro di arachidi e formaggio". Per prepararla, inizia bollendo la polenta secondo le istruzioni sulla confezione. Mentre la polenta si cucina, mescola insieme 1/2 tazza di burro di arachidi, 1/4 di tazza di formaggio grattugiato e 1/4 di tazza di parmigiano grattugiato. Una volta che la polenta è cotta, versala su una teglia da forno e distribuire il composto di burro di arachidi e formaggio sopra la polenta. Metti in forno a 350 gradi per 15 minuti o fino a quando la parte superiore è dorata. Servi caldo con un'insalata di pomodoro e basilico fresco.
📚 Documentation
Limitations
- Loquace-7B may not handle complex or nuanced queries well and may struggle with ambiguous or poorly formatted inputs.
- The model may generate factually incorrect or nonsensical responses. It should be used with caution, and outputs should be carefully verified.
📄 License
This project is licensed under the Apache-2.0 license.
Model Author
Cosimo Iaia cosimo.iaia@gmail.com
Model Card for Loquace-7B-Mistral (Italian Version translated from Loquace)