Meta-Llama-3.1-405B-Instruct-GGUF open-source large model: multilingual instruction following

Meta Llama 3.1 405B Instruct GGUF

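GGUF quantization published by MaziyarPanahi

Meta-Llama-3.1-405B-Instruct is a 405-billion-parameter large language model built on the Llama 3.1 architecture. It is optimized for instruction-following tasks and supports multiple languages.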

Tags: large language model, multilingual, very large parameter count, multilingual text generation, low-resource quantization

Downloads: 189.43k

Release date: 7/24/2024

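Model overview

This is a quantized version of the model in GGUF format. It is intended for text generation and is particularly suited to producing text that follows instructions.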

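Model features

Quantization support

Provided as quantized GGUF files, including 2-bit and 3-bit quantization, which lowers memory requirements and makes the model easier to run on hardware with limited resources.

Multilingual support

Supports multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

Instruction following

Optimized for instruction-following tasks: it generates text according to the user's instructions.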
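
To put the 2-bit and 3-bit quantization listed under the model features in perspective, the sketch below estimates the weight storage for 405 billion parameters at a given number of bits per weight. It is a naive lower bound, not a file-size prediction: real GGUF quantization types such as Q2_K store additional scale data and may keep some tensors at higher precision, so the actual files are larger. The function name `weight_size_gb` is hypothetical and used only for illustration.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Naive weight storage in gigabytes (10**9 bytes), ignoring all metadata."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 405e9  # 405 billion parameters, as stated in the model description

# Compare the quantized bit widths with 16-bit weights.
for bits in (2, 3, 16):
    print(f"{bits:>2} bits/weight: ~{weight_size_gb(N_PARAMS, bits):.2f} GB")
```

Even under this optimistic estimate, the 2-bit weights alone exceed 100 GB, so "limited resources" here still means a machine with a large amount of RAM or VRAM.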

Model capabilities

Text generation

Instruction following

Multilingual support

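Use cases

Education

Generating teaching materials

Generates learning materials for students based on a teacher's instructions.

Outcome: teaching materials that are accurate and clearly structured.

Content creation

Creative writing

Generates creative text from a topic or instruction supplied by the user.

Outcome: creative text that matches the user's request.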

🚀 [MaziyarPanahi/Meta-Llama-3.1-405B-Instruct-GGUF]

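This project provides GGUF format files for the meta-llama/Meta-Llama-3.1-405B-Instruct model, which can be used for text generation.

🚀 Quick start

Model information

Model creator: meta-llama
Original model: meta-llama/Meta-Llama-3.1-405B-Instruct

Example run

The following command runs the model with llama.cpp/llama-cli: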

llama.cpp/llama-cli -m Meta-Llama-3.1-405B-Instruct.Q2_K.gguf-00001-of-00009.gguf -p "write 10 sentences ending with the word apple." -n 1024 -t 40

The output is as follows:

system_info: n_threads = 40 / 80 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 |
sampling:
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temperature
generate: n_ctx = 131072, n_batch = 2048, n_predict = 1024, n_keep = 1


write 10 sentences ending with the word apple.
1. I love to eat a crunchy, juicy apple.
2. The teacher gave the student a shiny, red apple.
3. The farmer plucked a ripe, delicious apple.
4. My favorite snack is a sweet, tasty apple.
5. The child picked a fresh, green apple.
6. The cafeteria served a healthy, sliced apple.
7. The vendor sold a crisp, autumn apple.
8. The artist painted a still life with a golden apple.
9. The baby took a big bite of a soft, mealy apple.
10. The family enjoyed a basket of fresh, orchard apple. [end of text]

llama_print_timings:        load time = 1068588.13 ms
llama_print_timings:      sample time =    2262.60 ms /   136 runs   (   16.64 ms per token,    60.11 tokens per second)
llama_print_timings: prompt eval time =  339484.02 ms /    11 tokens (30862.18 ms per token,     0.03 tokens per second)
llama_print_timings:        eval time = 33458013.45 ms /   135 runs   (247837.14 ms per token,     0.00 tokens per second)
llama_print_timings:       total time = 33800561.08 ms /   146 tokens
Log end
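
The log rounds the generation speed to 0.00 tokens per second, which hides the actual figure. As a minimal sketch, the snippet below recomputes the throughput from the raw millisecond and count values in two `llama_print_timings` lines copied from the log above. The regular expression and the helper name `parse_timings` are illustrative assumptions, not part of llama.cpp.

```python
import re

# Two lines copied verbatim from the log above.
LOG = """\
llama_print_timings: prompt eval time =  339484.02 ms /    11 tokens (30862.18 ms per token,     0.03 tokens per second)
llama_print_timings:        eval time = 33458013.45 ms /   135 runs   (247837.14 ms per token,     0.00 tokens per second)
"""

# Matches "<label> time = <ms> ms / <count> tokens|runs".
PATTERN = re.compile(
    r"llama_print_timings:\s+(.+?) time =\s*([\d.]+) ms /\s*(\d+) (?:tokens|runs)"
)


def parse_timings(log: str) -> dict:
    """Return {label: (milliseconds, count)} for each timing line that has a count."""
    return {
        m.group(1): (float(m.group(2)), int(m.group(3)))
        for m in PATTERN.finditer(log)
    }


timings = parse_timings(LOG)
for label, (ms, count) in timings.items():
    seconds = ms / 1000
    print(f"{label}: {seconds / count:.1f} s/token, {count / seconds:.4f} tokens/s")
```

For this run, generation works out to roughly 248 seconds per token, and the reported total time of 33800561.08 ms is about 9.4 hours.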

💻 Usage examples

Basic usage

llama.cpp/llama-cli -m Meta-Llama-3.1-405B-Instruct.Q2_K.gguf-00001-of-00009.gguf -p "write 10 sentences ending with the word apple." -n 1024 -t 40
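
The `-00001-of-00009` suffix in the file name indicates that the Q2_K model is split into nine parts. The sketch below lists the expected part names and reports any that are missing from a directory before a run is started. It assumes the remaining parts follow the same numbering pattern as the first file, which this page does not state explicitly; `shard_names` and `missing_shards` are hypothetical helper names.

```python
from pathlib import Path


def shard_names(prefix: str, total: int) -> list[str]:
    """Build '<prefix>-00001-of-0000N.gguf' style names for every part."""
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def missing_shards(directory: Path, prefix: str, total: int) -> list[str]:
    """Return the expected part names that are not present in `directory`."""
    return [
        name for name in shard_names(prefix, total)
        if not (directory / name).is_file()
    ]


# Prefix and part count taken from the file name used in the command above.
PREFIX = "Meta-Llama-3.1-405B-Instruct.Q2_K.gguf"
names = shard_names(PREFIX, 9)
print(names[0])
print(missing_shards(Path("."), PREFIX, 9))
```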

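Advanced usage

Adjust the command's parameters to suit the application, for example -n (n_predict, the number of tokens to generate) and -t (n_threads, the number of CPU threads).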
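
As one way to vary these parameters from a script, the sketch below assembles the same llama-cli invocation as an argument list with adjustable values for `-n` and `-t`. It uses only the four flags that appear in the example command (`-m`, `-p`, `-n`, `-t`); the function name `build_llama_cli_command` and the chosen values 256 and 16 are illustrative assumptions.

```python
import shlex


def build_llama_cli_command(
    model_path: str,
    prompt: str,
    n_predict: int = 1024,
    n_threads: int = 40,
) -> list[str]:
    """Assemble the argument list using only -m, -p, -n and -t."""
    return [
        "llama.cpp/llama-cli",
        "-m", model_path,
        "-p", prompt,
        "-n", str(n_predict),
        "-t", str(n_threads),
    ]


cmd = build_llama_cli_command(
    "Meta-Llama-3.1-405B-Instruct.Q2_K.gguf-00001-of-00009.gguf",
    "write 10 sentences ending with the word apple.",
    n_predict=256,  # shorter generation than the example's 1024
    n_threads=16,   # fewer CPU threads than the example's 40
)
print(shlex.join(cmd))
# To actually launch it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.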

📚 Detailed documentation

About GGUF

GGUF is a new format introduced by the llama.cpp team on August 21, 2023. It replaces GGML, which llama.cpp no longer supports.
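
A GGUF file begins with the four ASCII bytes `GGUF` followed by a little-endian 32-bit version number, so a quick header check can tell whether a download is a GGUF file at all. The sketch below reads only those first eight bytes; the helper name `read_gguf_version` is hypothetical, and the demo uses a tiny stand-in file rather than a real model.

```python
import struct
import tempfile
from pathlib import Path


def read_gguf_version(path: Path):
    """Return the GGUF version if the file starts with the GGUF magic, else None."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return None
    (version,) = struct.unpack("<I", header[4:8])
    return version


# Demo on a stand-in file that contains only an 8-byte header, not a real model.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "fake.gguf"
    fake.write_bytes(b"GGUF" + struct.pack("<I", 3))
    print(read_gguf_version(fake))  # prints 3
```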

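The following clients and libraries are known to support GGUF:

llama.cpp: the source project for GGUF. Provides a CLI and a server option.
llama-cpp-python: a Python library with GPU acceleration, LangChain support, and an OpenAI-compatible API server.
LM Studio: an easy-to-use and powerful local GUI for Windows and macOS (Apple Silicon), with GPU acceleration. As of November 27, 2023, the Linux version was in beta.
text-generation-webui: the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
KoboldCpp: a fully featured web UI with GPU acceleration across all platforms and GPU architectures. Especially good for storytelling.
GPT4All: a free and open-source GUI that runs locally on Windows, Linux, and macOS, with full GPU acceleration.
LoLLMS Web UI: a web UI with many interesting and unique features, including a full model library for easy model selection.
Faraday.dev: an attractive and easy-to-use character-based chat GUI for Windows and macOS (both Apple Silicon and Intel), with GPU acceleration.
candle: a Rust ML framework focused on performance, with GPU support and ease of use.
ctransformers: a Python library with GPU acceleration, LangChain support, and an OpenAI-compatible AI server. As of November 27, 2023, ctransformers had not been updated for a long time and did not support many recent models.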