Llama-2-7b-hf-4bit-64rank
License: MIT
The LoftQ (LoRA-Fine-Tuning-aware Quantization) model provides a quantized backbone network together with LoRA adapters whose initialization is aware of the quantization, so that subsequent LoRA fine-tuning of the quantized large language model achieves better performance and efficiency.
Tags: Large Language Model, Transformers, English
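
Below is a minimal loading sketch using Transformers and PEFT. It assumes the Hugging Face repo id `LoftQ/Llama-2-7b-hf-4bit-64rank` and that the LoftQ-initialized adapters are stored in a `loftq_init` subfolder of the repo; both names follow the common LoftQ release layout and are assumptions, not details stated above.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Assumed repo id; adjust to the actual model repository if it differs.
MODEL_ID = "LoftQ/Llama-2-7b-hf-4bit-64rank"

# 4-bit NF4 quantization config for the backbone network.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized backbone.
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
)

# Attach the LoftQ-initialized LoRA adapters (assumed to live in the
# "loftq_init" subfolder) and mark them trainable for fine-tuning.
model = PeftModel.from_pretrained(
    base_model,
    MODEL_ID,
    subfolder="loftq_init",
    is_trainable=True,
)
```

The resulting `model` can then be fine-tuned with a standard PEFT training loop (for example via `transformers.Trainer`), with only the LoRA adapter weights updated while the 4-bit backbone stays frozen.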