SummLlama3.2-3B GGUF
SummLlama3.2-3B is a 3B-parameter summarization model built on the Llama 3.2 architecture, offered here in multiple GGUF quantization versions to accommodate different hardware requirements.
Downloads: 95
Release date: November 20, 2024
Model Overview
A lightweight language model focused on text summarization tasks, providing quantization options from Q2_K to Q8_0 to balance performance and resource consumption.
Model Features
Multi-level Quantization Options
Offers 12 quantization levels from Q2_K (1.36GB) to Q8_0 (3.42GB) to meet deployment needs under different hardware conditions
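A simple way to use this range is to pick the highest-fidelity quantization whose file fits in memory with some headroom for the KV cache and runtime buffers. The sketch below illustrates that selection; only the Q2_K and Q8_0 sizes come from this card, while the intermediate levels and the headroom factor are illustrative assumptions.

```python
# Sketch: choosing a quantization level by available memory.
# Q2_K and Q8_0 sizes are from the model card; the mid-range
# entries and the 1.3x headroom factor are assumptions.

QUANT_SIZES_GB = {
    "Q2_K": 1.36,   # smallest file, lowest fidelity (from the card)
    "Q4_K_M": 2.0,  # assumed mid-range size
    "Q5_K_M": 2.3,  # assumed mid-range size
    "Q8_0": 3.42,   # largest file, highest fidelity (from the card)
}

def pick_quant(available_gb: float, headroom: float = 1.3) -> str:
    """Return the highest-fidelity quant whose file, scaled by a
    runtime headroom factor, fits in the available memory budget."""
    for name in sorted(QUANT_SIZES_GB, key=QUANT_SIZES_GB.get, reverse=True):
        if QUANT_SIZES_GB[name] * headroom <= available_gb:
            return name
    raise MemoryError("No quantization level fits in the available memory")

print(pick_quant(4.0))  # a 4 GB budget selects an assumed mid-range quant
```

On an 8 GB budget the same logic selects Q8_0, while a 2 GB budget falls back to Q2_K, which matches the card's framing of quantization as a quality-versus-resources trade-off.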
Optimized Prompt Template
Uses structured prompt templates to clearly distinguish system instructions from user input, improving summary generation accuracy
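The card does not reproduce the template itself. As a sketch, a Llama 3 style chat prompt separates the system instruction from the user's document with special header tokens; the token names below follow the Llama 3 convention, and the system instruction wording is an assumption, not the model's documented template.

```python
def build_prompt(document: str,
                 system: str = "Please summarize the input document.") -> str:
    """Assemble a Llama 3 style chat prompt that keeps the system
    instruction and the user-supplied document in separate turns.
    The default system string is an assumed placeholder."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{document}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt("The quarterly report shows revenue grew 12%.")
```

Keeping the instruction and document in distinct turns is what lets the model treat the document purely as content to compress, rather than as further instructions.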
Lightweight and Efficient
With 3.2B parameters, it reduces computational resource requirements while maintaining quality, making it suitable for edge device deployment
Model Capabilities
Text Summary Generation
Long Text Compression
Key Information Extraction
Use Cases
Content Processing
News Summarization
Automatically generates core content summaries of news articles
Reported to retain over 90% of key information from the original text
Meeting Minutes
Extracts decision points and action items from meeting records
Research Assistance
Paper Summarization
Automatically generates concise summaries of academic papers
© 2025 AIbase