# TinyLlama/TinyLlama-1.1B-Chat-v0.6-GGUF
Quantized GGUF model files for TinyLlama-1.1B-Chat-v0.6 from TinyLlama. This project provides quantized models to optimize resource usage and enhance performance.
## Quick Start

### Model Information
| Property | Details |
|---|---|
| Base Model | TinyLlama/TinyLlama-1.1B-Chat-v0.6 |
| Model Creator | TinyLlama |
| Model Name | TinyLlama-1.1B-Chat-v0.6 |
| Pipeline Tag | text-generation |
| Quantized By | afrideva |
| Tags | gguf, ggml, quantized, q2_k, q3_k_m, q4_k_m, q5_k_m, q6_k, q8_0 |
| License | apache-2.0 |
| Datasets | cerebras/SlimPajama-627B, bigcode/starcoderdata, OpenAssistant/oasst_top1_2023-08-25 |
| Inference | false |
| Language | en |
### Quantized Model Files
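To fetch a single quantized file rather than the whole repository, you can build the filename for the quantization you want and download just that file. The lowercase `<model>.<quant>.gguf` naming pattern below is an assumption; check the repo's file listing for the exact names.

```python
def quant_filename(model_name: str, quant: str) -> str:
    # Assumed naming convention for the GGUF files in this repo:
    # lowercase model name, then the quant tag, then the .gguf extension.
    return f"{model_name.lower()}.{quant}.gguf"

fname = quant_filename("TinyLlama-1.1B-Chat-v0.6", "q4_k_m")

# To fetch the file (requires `huggingface_hub` and network access):
# from huggingface_hub import hf_hub_download
# path = hf_hub_download(
#     repo_id="afrideva/TinyLlama-1.1B-Chat-v0.6-GGUF",
#     filename=fname,
# )
```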
## Features

### Original Model Goals
The TinyLlama project aims to pretrain a 1.1B-parameter Llama model on 3 trillion tokens. With proper optimization, this can be achieved within a span of "just" 90 days using 16 A100-40G GPUs. Training started on 2023-09-01.
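A quick back-of-the-envelope check of the throughput implied by those numbers (a sketch, not a figure from this card):

```python
# Stated budget: 3 trillion tokens in 90 days on 16 A100-40G GPUs.
tokens = 3_000_000_000_000
seconds = 90 * 24 * 3600
gpus = 16

# Sustained per-GPU throughput required to hit the target.
per_gpu = tokens / seconds / gpus
print(f"~{per_gpu:,.0f} tokens/sec per GPU")
```

That works out to roughly 24k tokens per second per GPU, sustained for the full 90 days.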
### Model Advantages
- Compatibility: TinyLlama adopts exactly the same architecture and tokenizer as Llama 2, so it can be plugged into many open-source projects built on Llama.
- Compactness: With only 1.1B parameters, TinyLlama caters to a multitude of applications that demand a restricted computation and memory footprint.
### This Model's Training
## Usage Examples

### Basic Usage
```python
import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="TinyLlama/TinyLlama-1.1B-Chat-v0.6", torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
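For intuition, `apply_chat_template` above turns the message list into a single tagged prompt string. A rough, dependency-free re-implementation is sketched below, assuming the model uses Zephyr-style `<|system|>` / `<|user|>` / `<|assistant|>` tags; in practice always prefer the tokenizer's own `apply_chat_template`, which is the source of truth.

```python
def zephyr_prompt(messages):
    # Each message becomes "<|role|>\n<content></s>"; a trailing
    # "<|assistant|>" mirrors add_generation_prompt=True.
    # The tag format is an assumption about this model's template.
    parts = [f"<|{m['role']}|>\n{m['content']}</s>" for m in messages]
    parts.append("<|assistant|>")
    return "\n".join(parts) + "\n"

prompt = zephyr_prompt([
    {"role": "system", "content": "You are a pirate."},
    {"role": "user", "content": "Ahoy?"},
])
print(prompt)
```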
## Documentation

### Prerequisites
You will need `transformers>=4.34`. Check the TinyLlama GitHub page for more information.
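If you want to verify the requirement at runtime rather than rely on pip constraints, a minimal version check can be sketched as follows (the tuple comparison below is a simplification of full PEP 440 version semantics):

```python
def version_tuple(v: str):
    # Parse "4.35.2" (or "4.35.0.dev0") into a comparable tuple of ints,
    # stopping at the first non-numeric component.
    parts = []
    for p in v.split("."):
        digits = "".join(ch for ch in p if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

# Usage against the installed library (requires transformers):
# import transformers
# assert version_tuple(transformers.__version__) >= version_tuple("4.34")
```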
### Original Model Card
You can find more details about the original model at TinyLlama-1.1B.
### Model Training Details
This model is a chat-finetuned version of TinyLlama/TinyLlama-1.1B-intermediate-step-955k-2T. The training process involves multiple steps and datasets, as described above.
## License

This project is licensed under the apache-2.0 license.