# 🚀 Llamacpp imatrix Quantizations of Jedi-3B-1080p by xlangai

This project provides quantized versions of xlangai's Jedi-3B-1080p model. The quants were produced with llama.cpp and an imatrix calibration dataset, run in a variety of environments, and give users a wide range of options for different hardware and quality needs.
## 🚀 Quick Start
Quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp/) release [b5524](https://github.com/ggerganov/llama.cpp/releases/tag/b5524). Original model: https://huggingface.co/xlangai/Jedi-3B-1080p. All quants were made using the imatrix option with a dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8). You can run these quants in [LM Studio](https://lmstudio.ai/), or directly with [llama.cpp](https://github.com/ggerganov/llama.cpp) or any other llama.cpp-based project.
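For example, a quant can be run locally with the `llama-cli` binary that ships with llama.cpp. The snippet below is only a minimal sketch: it assumes you have built llama.cpp from the release above and downloaded the Q4_K_M file into the current directory, and the prompt, token count, and GPU layer count are placeholders to adjust for your setup.

```bash
# Run a downloaded quant with llama.cpp (sketch).
# -ngl 99 offloads as many layers as possible to the GPU; drop it for CPU-only use.
./llama-cli \
  -m ./xlangai_Jedi-3B-1080p-Q4_K_M.gguf \
  -ngl 99 \
  -n 128 \
  -p "Hello, who are you?"
```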
## ✨ Main Features
- **Multiple quantization types**: A wide range of quant types (bf16, Q8_0, Q6_K_L, and more) to suit different performance and quality requirements.
- **Embedding/output weight handling**: Some quants (such as Q3_K_XL and Q4_K_L) quantize the embedding and output weights to Q8_0 rather than the usual default.
- **Online repacking**: Some quants support online repacking, automatically optimizing the weights for your hardware.
## 📦 Installation Guide
### Downloading using huggingface-cli

First, make sure you have huggingface-cli installed:
pip install -U "huggingface_hub[cli]"
Then, you can target the specific file you want to download:
huggingface-cli download bartowski/xlangai_Jedi-3B-1080p-GGUF --include "xlangai_Jedi-3B-1080p-Q4_K_M.gguf" --local-dir ./
If the model size exceeds 50GB, it will be split into multiple files. To download them all to your local folder, please run:
huggingface-cli download bartowski/xlangai_Jedi-3B-1080p-GGUF --include "xlangai_Jedi-3B-1080p-Q8_0/*" --local-dir ./
You can either specify a new local directory (such as xlangai_Jedi-3B-1080p-Q8_0) or download them all to the current directory (./).
## 💻 Usage Examples
### Prompt format
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
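As a quick illustration (not part of the original card), the template above can be filled in and passed directly to `llama-cli`; the model path, system prompt, and user message below are placeholders:

```bash
# Pass the filled-in ChatML-style template as the raw prompt.
./llama-cli \
  -m ./xlangai_Jedi-3B-1080p-Q4_K_M.gguf \
  -p "<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello.<|im_end|>
<|im_start|>assistant
"
```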
## 📚 Detailed Documentation
### Download file selection
File Name | Quant Type | File Size | Split | Description |
---|---|---|---|---|
[Jedi-3B-1080p-bf16.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-bf16.gguf) | BF16 | 6.18GB | false | Full BF16 weights. |
[Jedi-3B-1080p-Q8_0.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q8_0.gguf) | Q8_0 | 3.29GB | false | Extremely high quality, generally unneeded but max available quant. |
[Jedi-3B-1080p-Q6_K_L.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q6_K_L.gguf) | Q6_K_L | 2.61GB | false | Embedding and output weights quantized to Q8_0. Very high quality, near perfect, *recommended*. |
[Jedi-3B-1080p-Q6_K.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q6_K.gguf) | Q6_K | 2.54GB | false | Very high quality, near perfect, *recommended*. |
[Jedi-3B-1080p-Q5_K_L.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q5_K_L.gguf) | Q5_K_L | 2.30GB | false | Embedding and output weights quantized to Q8_0. High quality, *recommended*. |
[Jedi-3B-1080p-Q5_K_M.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q5_K_M.gguf) | Q5_K_M | 2.22GB | false | High quality, *recommended*. |
[Jedi-3B-1080p-Q5_K_S.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q5_K_S.gguf) | Q5_K_S | 2.17GB | false | High quality, *recommended*. |
[Jedi-3B-1080p-Q4_K_L.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q4_K_L.gguf) | Q4_K_L | 2.01GB | false | Embedding and output weights quantized to Q8_0. Good quality, *recommended*. |
[Jedi-3B-1080p-Q4_1.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q4_1.gguf) | Q4_1 | 2.00GB | false | Legacy format, similar performance to Q4_K_S but with improved tokens per watt on Apple silicon. |
[Jedi-3B-1080p-Q4_K_M.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q4_K_M.gguf) | Q4_K_M | 1.93GB | false | Good quality, default size for most use cases, *recommended*. |
[Jedi-3B-1080p-Q4_K_S.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q4_K_S.gguf) | Q4_K_S | 1.83GB | false | Slightly lower quality with greater space savings, *recommended*. |
[Jedi-3B-1080p-Q4_0.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q4_0.gguf) | Q4_0 | 1.83GB | false | Legacy format, supports online repacking for ARM and AVX CPU inference. |
[Jedi-3B-1080p-IQ4_NL.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-IQ4_NL.gguf) | IQ4_NL | 1.83GB | false | Similar to IQ4_XS, but slightly larger. Supports online repacking for ARM CPU inference. |
[Jedi-3B-1080p-Q3_K_XL.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q3_K_XL.gguf) | Q3_K_XL | 1.78GB | false | Embedding and output weights quantized to Q8_0. Lower quality but usable, good for low-memory setups. |
[Jedi-3B-1080p-IQ4_XS.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-IQ4_XS.gguf) | IQ4_XS | 1.74GB | false | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
[Jedi-3B-1080p-Q3_K_L.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q3_K_L.gguf) | Q3_K_L | 1.71GB | false | Lower quality but usable, good for low-memory setups. |
[Jedi-3B-1080p-Q3_K_M.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q3_K_M.gguf) | Q3_K_M | 1.59GB | false | Low quality. |
[Jedi-3B-1080p-IQ3_M.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-IQ3_M.gguf) | IQ3_M | 1.49GB | false | Medium-low quality, newer method with performance comparable to Q3_K_M. |
[Jedi-3B-1080p-Q3_K_S.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q3_K_S.gguf) | Q3_K_S | 1.45GB | false | Low quality, not recommended. |
[Jedi-3B-1080p-IQ3_XS.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-IQ3_XS.gguf) | IQ3_XS | 1.39GB | false | Lower quality, newer method with decent performance, slightly better than Q3_K_S. |
[Jedi-3B-1080p-Q2_K_L.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q2_K_L.gguf) | Q2_K_L | 1.35GB | false | Embedding and output weights quantized to Q8_0. Very low quality but surprisingly usable. |
[Jedi-3B-1080p-IQ3_XXS.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-IQ3_XXS.gguf) | IQ3_XXS | 1.28GB | false | Lower quality, newer method with decent performance, comparable to Q3 quants. |
[Jedi-3B-1080p-Q2_K.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-Q2_K.gguf) | Q2_K | 1.27GB | false | Very low quality but surprisingly usable. |
[Jedi-3B-1080p-IQ2_M.gguf](https://huggingface.co/bartowski/xlangai_Jedi-3B-1080p-GGUF/blob/main/xlangai_Jedi-3B-1080p-IQ2_M.gguf) | IQ2_M | 1.14GB | false | Relatively low quality, uses state-of-the-art techniques to be surprisingly usable. |
### Embedding/output weights
Some of these quants (Q3_K_XL, Q4_K_L, etc.) use the standard quantization method, but with the embedding and output weights quantized to Q8_0 instead of the usual default.
### ARM/AVX information
Previously, you would download Q4_0_4_4/4_8/8_8 files, whose weights were interleaved in memory to improve performance on ARM and AVX machines by loading more data in a single pass.
Now, however, there is what is called "online repacking" of the weights; see [this PR](https://github.com/ggerganov/llama.cpp/pull/9921) for details. If you use Q4_0 and your hardware would benefit from repacking the weights, it is done automatically on the fly.
As of llama.cpp build [b4282](https://github.com/ggerganov/llama.cpp/releases/tag/b4282), you will not be able to run the Q4_0_X_X files and will need to use Q4_0 instead.
Additionally, if you want slightly better quality, you can use IQ4_NL thanks to [this PR](https://github.com/ggerganov/llama.cpp/pull/10541), which also repacks the weights for ARM (though currently only the 4_4 variant). Loading may take longer, but overall speed will increase.
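As a rough sketch (the model path and thread count below are placeholders), a CPU-only run of the Q4_0 quant needs no extra flags; if your ARM or AVX hardware benefits from repacking, it happens automatically when the weights are loaded:

```bash
# CPU-only run of the Q4_0 quant; online repacking is applied at load time
# on hardware that benefits from it, with no additional options required.
./llama-cli \
  -m ./xlangai_Jedi-3B-1080p-Q4_0.gguf \
  -t 8 \
  -p "Hello, who are you?"
```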
### Which file should I choose?
First, figure out how big a model you can run. To do this, you need to know how much RAM and/or VRAM you have.
- If you want the model to run as fast as possible, fit the whole model into your GPU's VRAM: choose a quant with a file size 1-2GB smaller than your GPU's total memory (a quick way to check this is sketched right after this list).
- If you want the absolute maximum quality, add your system RAM and your GPU's VRAM together, then choose a quant with a file size 1-2GB smaller than that total.
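As a hypothetical sketch (assuming an NVIDIA GPU and the `nvidia-smi` tool), you can check total VRAM and then apply the 1-2GB headroom rule against the table above:

```bash
# Print total VRAM in MiB, e.g. 8192 for an 8GB card.
nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
# 8192 MiB minus ~1-2GB of headroom leaves roughly 6-7GB, so for this 3B model
# even the largest files here (Q8_0 at 3.29GB or BF16 at 6.18GB) would fit.
```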
Next, decide whether to use an "I-quant" or a "K-quant".
- If you don't want to think too much, grab one of the K-quants. These are in the format QX_K_X, such as Q5_K_M.
- If you want to dig deeper, check out this very useful feature chart: [llama.cpp feature matrix](https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix).

Generally speaking, if you are aiming below Q4 and running cuBLAS (Nvidia) or rocBLAS (AMD), you should look at the I-quants. These are in the format IQX_X, such as IQ3_S; they are newer and offer better performance for their size.

The I-quants can also be used on CPU, but they are slower than their K-quant equivalents, so you'll need to weigh speed against performance.
## 🔧 Technical Details
### Quantization tool

Quantized using [llama.cpp](https://github.com/ggerganov/llama.cpp/) release [b5524](https://github.com/ggerganov/llama.cpp/releases/tag/b5524).

### Calibration dataset

All quants were made using the imatrix option with a dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8).

### Online repacking

Some quants support online repacking; see [this PR](https://github.com/ggerganov/llama.cpp/pull/9921) for details.
## 📄 License
This project is released under the Apache-2.0 license.

## Thanks

Thanks to Kalomaze and Dampf for their assistance in creating the imatrix calibration dataset. Thanks to ZeroWw for the inspiration behind the embedding/output weight experiments. Thanks to LM Studio for sponsoring this work.

If you want to support my work, visit my ko-fi page: https://ko-fi.com/bartowski

