🚀 The fastest text embedding model: tabularisai/all-MiniLM-L2-v2
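This model is distilled from sentence-transformers/all-MiniLM-L12-v2. Compared with all-MiniLM-L6-v2, the smallest model in the all-MiniLM family, it runs inference almost 2x faster while maintaining high accuracy on both CPU and GPU.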
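To check the speed claim on your own hardware, a small timing helper along the following lines may be useful. This is a minimal sketch and not part of the model's own tooling: the function name `sentences_per_second` and its `warmup` and `repeats` parameters are hypothetical, and it accepts any callable that encodes a batch of sentences.

```python
import time
from typing import Callable, Sequence


def sentences_per_second(
    encode_fn: Callable[[Sequence[str]], object],
    sentences: Sequence[str],
    warmup: int = 1,
    repeats: int = 3,
) -> float:
    """Return the best observed throughput of encode_fn in sentences per second.

    Hypothetical helper for illustration; encode_fn is any batch-encoding callable.
    """
    for _ in range(warmup):
        encode_fn(sentences)  # warm-up runs are not timed

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode_fn(sentences)
        best = min(best, time.perf_counter() - start)

    # Guard against a zero reading on coarse timers.
    return len(sentences) / max(best, 1e-12)
```

Passing `model.encode` for each model with the same list of sentences gives two throughput figures to compare. Results depend on hardware, batch size, and sentence length.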
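🚀 Quick start
This model can be used for tasks such as text embedding and retrieval-augmented generation (RAG). The sections below describe how to use it.
💻 Usage examples
Basic usage
Retrieval-augmented generation (RAG) example
The model can serve as the retriever in a RAG pipeline: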
from sentence_transformers import SentenceTransformer
import faiss

# Load the embedding model that serves as the retriever.
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

# Toy corpus to search over.
documents = [
    "Renewable energy comes from natural sources.",
    "Solar panels convert sunlight into electricity.",
    "Wind turbines harness wind power.",
    "Fossil fuels are non-renewable sources of energy.",
    "Hydropower uses water to generate electricity."
]

# Embed the corpus and build an exact (brute-force) L2 index.
doc_embeddings = model.encode(documents, convert_to_numpy=True)
dim = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(doc_embeddings)

# Embed the query and retrieve the 3 nearest documents.
query = "What are the benefits of renewable energy?"
query_embedding = model.encode([query], convert_to_numpy=True)
D, I = index.search(query_embedding, k=3)

# IndexFlatL2 returns squared L2 distances, so smaller values mean closer matches.
print("Query:", query)
print("\nTop 3 similar documents:")
for rank, idx in enumerate(I[0]):
    print(f"{rank+1}. {documents[idx]} (distance: {D[0][rank]:.4f})")
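The retrieval example stops at ranking documents. In a full RAG pipeline the retrieved passages are then handed to a generator. The sketch below shows one possible way to assemble them into a prompt. The function `build_prompt` and the prompt wording are assumptions made for illustration; they are not part of this model or of any library.

```python
from typing import Sequence


def build_prompt(query: str, retrieved_docs: Sequence[str]) -> str:
    """Assemble a plain-text prompt from a query and retrieved passages.

    Hypothetical helper; the template wording is an arbitrary choice.
    """
    # Number each passage so the generator can refer to it.
    context = "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


prompt = build_prompt(
    "What are the benefits of renewable energy?",
    [
        "Renewable energy comes from natural sources.",
        "Solar panels convert sunlight into electricity.",
    ],
)
print(prompt)
```

With the retrieval code above, the passages could be collected as `[documents[idx] for idx in I[0]]` and passed in as `retrieved_docs`.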
Advanced usage
Sentence embedding example
First, install the library:
pip install -U sentence-transformers
Then load the model and encode some sentences:
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub.
model = SentenceTransformer("tabularisai/all-MiniLM-L2-v2")

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]

# Encode the sentences: one embedding vector per sentence.
embeddings = model.encode(sentences)
print(embeddings.shape)

# Pairwise similarity between all sentences: a 3 x 3 matrix.
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
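For intuition about the similarity matrix: to my understanding, `model.similarity` in Sentence Transformers uses cosine similarity unless the model is configured otherwise. The sketch below computes the same kind of pairwise matrix by hand on toy 2-D vectors. The vectors and the helper names `cosine_similarity` and `similarity_matrix` are made up for illustration and are not real embeddings or library functions.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for embeddings (assumption: real embeddings have many more dimensions).
toy_vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
matrix = similarity_matrix(toy_vectors)

for row in matrix:
    print([round(value, 3) for value in row])
```

The diagonal is 1 because every vector is identical to itself, and orthogonal vectors score 0. Sentences with related meaning should land closer to 1 than unrelated ones.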
📄 License
This project is licensed under the Apache-2.0 license.