Dewey En Beta
Dewey is a new long-context embedding model built on the ModernBERT architecture. It supports a 128k context window and performs strongly on long-document retrieval tasks.
Model Overview
Dewey focuses on improving retrieval performance in long-document scenarios. It is trained with instruction-style data so that embeddings align with the task at hand, supports both single-vector and multi-vector representations, and offers a flexible text-chunking mechanism.
Model Features
Very long context
Handles contexts of up to 128k tokens
Multi-vector representation
Supports ColBERT-like multi-vector representations, but with far fewer vectors (only about 0.5% of the token count)
Efficient encoding
Benefits from the ModernBERT architecture, so encoding stays fast even for long texts
Flexible chunking
Supports fully customizable text-chunking strategies to fit different application scenarios
Model Capabilities
Long-document retrieval
Semantic similarity computation
Text classification
Text clustering
Use Cases
Information retrieval
Long-document retrieval
Efficient retrieval over databases containing very long documents
Scores 0.86 on the LongEmbed benchmark, surpassing several commercial models
Semantic analysis
Semantic similarity computation
Computing semantic similarity between texts
Performs strongly on the short-text evaluation (MTEB-eng-v2), surpassing several 7B-scale models
🚀 Dewey Long-Context Embedding Model: A Technical Report
This technical report introduces Dewey, a novel long-context embedding model designed to improve retrieval performance in long-document scenarios. The model builds on the ModernBERT architecture and uses an instruction-based training approach so that embeddings match specific task requirements. Dewey features a 128k context window, multi-vector representations, and a flexible chunking mechanism, and achieves excellent results on the LongEmbed benchmark.
🚀 Quick Start
This model is released in collaboration with Richinfo and was trained with a novel method. We do not yet fully understand the underlying principles, but the results are promising, so we have decided to open-source the model and hope someone will test it and give us feedback!
Technical report: https://arxiv.org/abs/2503.20376
The core training method of this model will be implemented in the NovaSearch team's open-source RAG-Retrieval repository. Stars are welcome!
This model is based on answerdotai/ModernBERT-large. Thanks to them for sharing!
✨ Key Features
- Long-text support: maximum length of 128k tokens, 395M parameters, English only.
- Multi-vector representation: supports both single-vector and multi-vector output (similar to ColBERT, but with far fewer vectors, only about 0.5% of the token count).
- Strong short-text performance: achieves impressive results on the short-text evaluation (MTEB-eng-v2) without using the MTEB training set, even surpassing several 7B-sized models.
- Strong long-text performance: on the long-text evaluation LongEmbed, the single-vector setting already surpasses many large and commercial models; with multi-vectors, the average score ranks first. Our current score is 0.86, while the previous first place scores 0.79.
- Fast encoding: thanks to the ModernBERT architecture, encoding remains very fast even for long texts.
- Flexible multi-vector combination: the multi-vectors are span- or chunk-level rather than token-level, so the chunking scheme can be fully customized for your own scenario (see the sketch right after this list).
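To make the span-level multi-vector idea concrete, here is a minimal, self-contained sketch of the max-over-chunks scoring rule used in the usage examples further below. The vectors are random stand-ins for real Dewey embeddings, and the shapes (2048-dimensional, a handful of chunk vectors per passage) only mirror those examples; nothing in this snippet is part of the model's API.
import numpy as np

rng = np.random.default_rng(0)
query_vector = rng.normal(size=(1, 2048))           # one single-vector query (illustrative random data)
passage_chunk_vectors = rng.normal(size=(7, 2048))   # a few span-level chunk vectors for one passage

# Normalize, mirroring normalize_embeddings=True in the examples below.
query_vector /= np.linalg.norm(query_vector, axis=1, keepdims=True)
passage_chunk_vectors /= np.linalg.norm(passage_chunk_vectors, axis=1, keepdims=True)

# The passage score is the similarity of its best-matching chunk.
score = float((query_vector @ passage_chunk_vectors.T).max())
print(score)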
📦 Installation
The project does not ship a dedicated installation command; the following steps can be used as a reference:
- Clone the repository:
git clone https://github.com/NovaSearch-Team/RAG-Retrieval.git
cd RAG-Retrieval
- Install the dependencies:
pip install -r requirements.txt
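The usage examples below also rely on a few packages that are not covered by the repository's requirements file. A minimal sketch of the extra installs (the package list is inferred from the imports in the examples; flash-attn is only needed if you keep attn_implementation="flash_attention_2"):
pip install -U sentence-transformers transformers pydantic numpy
pip install flash-attn --no-build-isolation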
💻 Usage Examples
Basic Usage
We recommend reading the following together with the model architecture diagram. Please read modeling_dewey_v1.py and custom_st.py carefully; the code is easy to follow and will be of great help to you!
# Single-vector usage example
import os
# os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
import torch
from sentence_transformers import SentenceTransformer
RETRIEVE_Q_PROMPT = "<|START_INSTRUCTION|>Answer the question<|END_INSTRUCTION|>"
RETRIEVE_P_PROMPT = "<|START_INSTRUCTION|>Candidate document<|END_INSTRUCTION|>"
model = SentenceTransformer(
    "infgrad/dewey_en_beta",
    trust_remote_code=True,
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        "attn_implementation": "flash_attention_2"
    },
    config_kwargs={"single_vector_type": "mean"}
).cuda().bfloat16().eval()
# the choice of single_vector_type:
## for short text (<1k): cls_add_mean
## for long text (>1k): mean
# the max length of model is 128*1024
model.max_seq_length = 32 * 1024
query_vectors = model.encode(
    sentences=[f"{RETRIEVE_Q_PROMPT}What is a computer composed of?", f"{RETRIEVE_Q_PROMPT}why the sky is blue"]
)
passage_vectors = model.encode(
    sentences=[
        f"{RETRIEVE_P_PROMPT}Central processing unit (CPU), memory (RAM), storage (hard drive or SSD), input/output devices (keyboard, mouse, monitor), and a motherboard",
        f"{RETRIEVE_P_PROMPT}Shorter wavelengths of light, such as blue and violet, are scattered more by gases and particles in Earth's atmosphere.",
    ]
)
print(query_vectors @ passage_vectors.T)
# the output is:
# [[0.52512825 0.19771025]
# [0.17617573 0.5918883 ]]
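As a quick follow-up to the scores printed above, ranking the candidate passages for a query is just a sort over one row of that score matrix. A small continuation reusing query_vectors and passage_vectors from the snippet above:
import numpy as np

scores = query_vectors @ passage_vectors.T  # shape: (num_queries, num_passages)
ranking = np.argsort(-scores[0])            # passage indices for the first query, best match first
for rank, idx in enumerate(ranking, start=1):
    print(rank, idx, float(scores[0, idx]))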
Advanced Usage
Obtaining multi-vectors with automatic chunking
import os
import numpy as np
# os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
from pydantic import BaseModel
from typing import Optional, List
from transformers import AutoTokenizer, AutoModel
class TextSpan(BaseModel):
    s: int
    e: int
    text: Optional[str] = None
    module_name: str
RETRIEVE_Q_PROMPT = "<|START_INSTRUCTION|>Answer the question<|END_INSTRUCTION|>"
RETRIEVE_P_PROMPT = "<|START_INSTRUCTION|>Candidate document<|END_INSTRUCTION|>"
model = AutoModel.from_pretrained(
    "infgrad/dewey_en_beta",
    trust_remote_code=True,
    attn_implementation="flash_attention_2"
).cuda().bfloat16()
model.tokenizer = AutoTokenizer.from_pretrained("infgrad/dewey_en_beta")
max_seq_length = 32 * 1024
q_list = ["why the sky is blue"]
p_list = [
"""
I’ve been trying to understand why the sky changes colors, and I think I understand most of it, but something in the online explanations doesn’t make it clear for me:
I’ve read:
sky is blue because blue light gets scattered the most during the day.
in the evening it turns red because now even more of the blue light gets scattered
So a few questions:
The scattering of light during the day: does it mean that blue light gets reflected off air particles and reaches our eyes, while the rest of the frequencies pass through and reach the ground?
Surely some of the other frequencies also get scattered during the day, just in much smaller amounts?
So during the evening blue light gets scattered even more, to the point where even less of it reaches the eyes?
And so it gets red because now we can see the lower frequencies being scattered without blue overshadowing them?\
Trying to word it myself: during the day only the highest frequencies get filtered, but during the evening also lower frequencies get filtered, because now the “light strainer” (air) is just catching more of it?\
It gets darker in the evening without a good ability to see colors because there’s is no blue and so on light to reflect off of objects?\
Is it ok to speak about light as a frequency? Or it’s only correct to say “wave length”?
Blue light is scattered in all directions by the tiny molecules of air in Earth's atmosphere. Blue is scattered more than other colors because it travels as shorter, smaller waves.
This is why we see a blue sky most of the time. Closer to the horizon, the sky fades to a lighter blue or white.
"""
]
# The query should be a single vector, so we set chunk_size to -1 to avoid chunking.
# If chunk_size is -1, the model returns an array of shape (2, 2048) consisting of the cls vector and the mean vector (the mean of all token embeddings).
query_vectors = model.encode(
    sentences=q_list,
    use_cuda=True,
    show_progress_bar=True,
    chunk_size=-1,
    chunk_overlap=32,
    convert_to_tensor=False,
    max_seq_length=max_seq_length,
    batch_size=8,
    normalize_embeddings=True,
    prompt=RETRIEVE_Q_PROMPT,
    fast_chunk=False
)[0]
# Queries do not need multi-vectors; we only use the mean vector as the final single vector
pred = [vecs[1:2, :] for vecs in query_vectors]
# spans_list contains each chunk's span; you can use a span to recover the chunk's text
spans_list: List[List[TextSpan]]
passage_vectors_list: List[np.ndarray]
passage_vectors_list, spans_list = model.encode(
    sentences=p_list,
    use_cuda=True,
    show_progress_bar=True,
    chunk_size=64,
    chunk_overlap=8,
    convert_to_tensor=False,
    max_seq_length=max_seq_length,
    batch_size=8,
    normalize_embeddings=True,
    prompt=RETRIEVE_P_PROMPT,
    fast_chunk=True,  # if fast_chunk is True, chunk directly on input ids; otherwise use RecursiveCharacterTextSplitter
)
# spans_list stores each passage's spans and passage_vectors_list stores each passage's vectors, so len(spans_list) == len(p_list) == len(passage_vectors_list)
# Within a passage, each span corresponds to one (1, 2048) vector, so len(spans_list[idx]) == len(passage_vectors_list[idx])
print((query_vectors[0] @ passage_vectors_list[0].T).max())
# output 0.7331543
# get each chunk's content
for spans, passage in zip(spans_list, p_list):
    text_ids = model.tokenizer.encode(RETRIEVE_P_PROMPT + passage)
    for span in spans:
        s, e = span.s, span.e
        chunk_text = model.tokenizer.decode(
            text_ids[s:e],
            skip_special_tokens=True,
            clean_up_tokenization_spaces=True
        ).strip()
        print(chunk_text)
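The .max() call above already expresses the multi-vector scoring rule: a passage's score is the similarity of its best-matching chunk. A small continuation that turns this into a ranking over the passages in p_list, reusing query_vectors and passage_vectors_list from above; the helper name is ours, not part of the model's API:
def max_chunk_score(query_vec: np.ndarray, chunk_vecs: np.ndarray) -> float:
    # query_vec: (2048,) single query vector; chunk_vecs: (num_chunks, 2048) chunk vectors of one passage
    return float((chunk_vecs @ query_vec).max())

query_mean_vec = query_vectors[0][1]  # row 0 is the cls vector, row 1 the mean vector
scores = [max_chunk_score(query_mean_vec, chunk_vecs) for chunk_vecs in passage_vectors_list]
best_idx = int(np.argmax(scores))
print(best_idx, scores[best_idx])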
Obtaining multi-vectors with manual chunking
import os
import numpy as np
# os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
from pydantic import BaseModel
from typing import Optional, List
from transformers import AutoTokenizer, AutoModel
class TextSpan(BaseModel):
    s: int
    e: int
    text: Optional[str] = None
    module_name: str
prompt = "<|START_INSTRUCTION|>Candidate document<|END_INSTRUCTION|>"
# load model
model = AutoModel.from_pretrained(
    "infgrad/dewey_en_beta",
    trust_remote_code=True,
    attn_implementation="flash_attention_2"
)
model.tokenizer = AutoTokenizer.from_pretrained("infgrad/dewey_en_beta")
max_seq_length = 32 * 1024
# chunk text
passage = "this sentence 1. this sentence 2. this sentence 3"
chunks = ["this sentence 1. this sentence 2.", "this sentence 2. this sentence 3"]
prompt_length = len(model.tokenizer.tokenize(prompt))
text_spans = [
    # s=0, e=1 marks the cls vector, so its module_name is cls_linear; chunk vectors use chunk_linear
    TextSpan(s=0, e=1, module_name="cls_linear")
]
for chunk in chunks:
    s = passage.find(chunk)
    e = s + len(chunk)
    text_spans.append(
        TextSpan(
            # add 1, as there is a [CLS] token at the beginning of the text
            s=1 + prompt_length + len(model.tokenizer.tokenize(passage[:s])),
            e=1 + prompt_length + len(model.tokenizer.tokenize(passage[:e])),
            module_name="chunk_linear"
        )
    )
spans_list: List[List[TextSpan]]
passage_vectors_list: List[np.ndarray]
passage_vectors_list, _ = model.encode(
    sentences=[passage],
    use_cuda=False,
    show_progress_bar=True,
    chunk_size=64,
    chunk_overlap=12,
    convert_to_tensor=False,
    max_seq_length=max_seq_length,
    batch_size=8,
    normalize_embeddings=True,
    prompt=prompt,
    fast_chunk=True,
    batch_text_spans=[text_spans]
)
print(passage_vectors_list[0].shape, passage_vectors_list[0][:, 2])
# the output is (3, 2048) [0.01461297 0.02085092 0.0022509 ]
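Assuming the rows of passage_vectors_list[0] follow the order of the TextSpan list that was passed in (the cls span first, then one row per manual chunk), each vector can be mapped back to the text it represents. A small continuation under that assumption:
token_ids = model.tokenizer.encode(prompt + passage)
for span, vector in zip(text_spans, passage_vectors_list[0]):
    label = "cls" if span.module_name == "cls_linear" else "chunk"
    chunk_text = model.tokenizer.decode(token_ids[span.s:span.e], skip_special_tokens=True).strip()
    print(label, vector.shape, repr(chunk_text))  # the cls span decodes to an empty string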
📚 Detailed Documentation
MTEB(eng, v2)
- Evaluation link: http://mteb-leaderboard.hf.space/?benchmark_name=MTEB%28eng%2C+v2%29
- Reproduction script: https://huggingface.co/infgrad/dewey_en_beta/blob/main/scripts/evaluate/run_evaluate_mteb_dewey_en_beta.py
Model | Zero-shot | Number of Parameters | Dimensions | Max Tokens | Mean (Task) | Mean (TaskType) | Classification | Clustering | Pair Classification | Reranking | Retrieval | STS | Summarization |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
gemini-embedding-exp-03-07 | 95% | Unknown | 3072 | 8192 | 73.3 | 67.67 | 90.05 | 59.39 | 87.7 | 48.59 | 64.35 | 85.29 | 38.28 |
jasper_en_vision_language_v1 | 56% | 1B | 8960 | 131072 | 71.41 | 66.65 | 90.27 | 60.52 | 88.14 | 50 | 56.05 | 84.37 | 37.19 |
gte-Qwen2-7B-instruct | NA | 7B | 3584 | 32768 | 70.72 | 65.77 | 88.52 | 58.97 | 85.9 | 50.47 | 58.09 | 82.69 | 35.74 |
stella_en_1.5B_v5 | 56% | 1B | 8960 | 131072 | 69.43 | 65.32 | 89.38 | 57.06 | 88.02 | 50.19 | 52.42 | 83.27 | 36.91 |
SFR-Embedding-2_R | 85% | 7B | 4096 | 32768 | 69.82 | 65.31 | 90.54 | 59.39 | 88.09 | 48.99 | 53.75 | 80.86 | 35.54 |
Linq-Embed-Mistral | 95% | 7B | 4096 | 32768 | 69.8 | 65.29 | 83 | 54.07 | 88.44 | 49.44 | 60.14 | 84.69 | 37.26 |
NV-Embed-v2 | 56% | 7B | 4096 | 32768 | 69.81 | 65 | 87.19 | 47.66 | 88.69 | 49.61 | 62.84 | 83.82 | 35.21 |
SFR-Embedding-Mistral | 85% | 7B | 4096 | 32768 | 69.31 | 64.94 | 80.47 | 54.93 | 88.59 | 50.15 | 59.33 | 84.77 | 36.32 |
stella_en_400M_v5 | 56% | 435M | 4096 | 8192 | 69.39 | 64.84 | 88.25 | 57.65 | 87.17 | 49.6 | 52.73 | 83.93 | 34.53 |
text-embedding-004 | 95% | Unknown | 768 | 2048 | 69.53 | 64.82 | 86.03 | 51.52 | 87.65 | 48.48 | 59.06 | 84.84 | 36.12 |
text-embedding-005 | 95% | Unknown | 768 | 2048 | 69.6 | 64.77 | 86.03 | 51.91 | 87.62 | 48.84 | 58.77 | 85.18 | 35.05 |
e5-mistral-7b-instruct | 95% | 7B | 4096 | 32768 | 67.97 | 64 | 79.85 | 51.44 | 88.42 | 49.78 | 57.62 | 84.32 | 36.57 |
text-multilingual-embedding-002 | 95% | Unknown | 768 | 2048 | 67.67 | 63.52 | 84.65 | 50.41 | 86.6 | 47.48 | 54.7 | 83.94 | 36.84 |
NV-Embed-v1 | 56% | 7B | 4096 | 32768 | 68.32 | 63.37 | 84.11 | 49.5 | 87.05 | 49.16 | 60.13 | 82.2 | 31.4 |
infgrad/dewey_en_beta | 95% | 395M | 2048 | 131072 | 0.68 | 63.30 | 81.83 | 51.75 | 86.82 | 46.35 | 56.32 | 84.21 | 35.79 |
gte-Qwen2-1.5B-instruct | NA | 1B | 8960 | 32768 | 67.2 | 63.26 | 85.84 | 53.54 | 87.52 | 49.25 | 50.25 | 82.51 | 33.94 |
GritLM-7B | 95% | 7B | 4096 | 4096 | 67.07 | 63.22 | 81.25 | 50.82 | 87.29 | 49.59 | 54.95 | 83.03 | 35.65 |
GritLM-8x7B | 95% | 57B | 4096 | 4096 | 66.16 | 62.42 | 79.98 | 51.48 | 85.23 | 49.22 | 52.46 | 82.93 | 35.65 |
text-embedding-3-large | NA | Unknown | 3072 | 8191 | 66.43 | 62.15 | 79.15 | 48.9 | 85.81 | 47.45 | 57.98 | 81.44 | 34.31 |
mxbai-embed-large-v1 | 100% | 335M | 1024 | 512 | 66.26 | 62.04 | 79.1 | 47.48 | 87.2 | 48.05 | 55.4 | 84.42 | 32.63 |
GIST-large-Embedding-v0 | 80% | 335M | 1024 | 512 | 66.25 | 61.96 | 78.91 | 48.84 | 86.7 | 48.76 | 54.52 | 84.44 | 31.52 |
bge-large-en-v1.5 | 100% | 335M | 1024 | 512 | 65.89 | 61.87 | 78.34 | 48.01 | 87.13 | 48.26 | 55.44 | 82.79 | 33.13 |
UAE-Large-V1 | 100% | 335M | 1024 | 512 | 66.4 | 61.85 | 79.08 | 47.86 | 87.25 | 48.35 | 55.91 | 84.37 | 30.13 |
LongEmbed
- Evaluation link: http://mteb-leaderboard.hf.space/?benchmark_name=LongEmbed
- Reproduction script: https://huggingface.co/infgrad/dewey_en_beta/blob/main/scripts/evaluate/run_evaluate_long_embed.py
Model | Zero-shot | Number of Parameters | Embedding Dimensions | Max Tokens | Mean (Task) | Mean (TaskType) | Retrieval |
---|---|---|---|---|---|---|---|
infgrad/dewey_en_beta-MultiVectors | 100% | 395M | 2048 | 131072 | 86.59 | 86.59 | 86.59 |
voyage-multilingual-2 | 100% | Unknown | 1024 | 32000 | 79.17 | 79.17 | 79.17 |
voyage-law-2 | 100% | Unknown | 1024 | 16000 | 78.85 | 78.85 | 78.85 |
infgrad/dewey_en_beta-SingleVector | 100% | 395M | 2048 | 131072 | 77.98 | 77.98 | 77.98 |
voyage-3 | 100% | Unknown | 1024 | 32000 | 74.06 | 74.06 | 74.06 |
inf-retriever-v1 | 100% | 7B | 3584 | 32768 | 73.19 | 73.19 | 73.19 |
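Beyond the full reproduction script linked above, a quick local sanity check on a single LongEmbed task can be run with the mteb package. This is a minimal sketch, assuming a recent mteb release; the task name LEMBWikimQARetrieval and the single-vector settings are our assumptions and do not reproduce the exact configuration behind the table above.
import mteb
import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "infgrad/dewey_en_beta",
    trust_remote_code=True,
    model_kwargs={"torch_dtype": torch.bfloat16},
    config_kwargs={"single_vector_type": "mean"},
)
model.max_seq_length = 32 * 1024

tasks = mteb.get_tasks(tasks=["LEMBWikimQARetrieval"])  # one LongEmbed retrieval task
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/dewey_en_beta")
print(results)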
LoCoV1
- Evaluation links:
- https://huggingface.co/datasets/hazyresearch/LoCoV1-Queries
- https://huggingface.co/datasets/hazyresearch/LoCoV1-Documents
- Reproduction script: https://huggingface.co/infgrad/dewey_en_beta/blob/main/scripts/evaluate/run_evaluate_loco.py
- Evaluation metric: NDCG@10
Dataset | bge-m3-8k | gte-modernbert-base-8k | Linq-Embed-Mistral-4k | Linq-Embed-Mistral-8k | SFR-Embedding-Mistral-8k | e5-mistral-7b-instruct-8k | dewey_en_beta-8k | dewey_en_beta_64k | dewey_en_beta_64k-multi-vectors |
---|---|---|---|---|---|---|---|---|---|
2wikimqa_test | 0.9271 | 0.8658 | 0.8884 | 0.9067 | 0.8965 | 0.8901 | 0.8953 | 0.9051 | 0.9775 |
courtlistener_HTML_test | 0.1933 | 0.2349 | 0.3551 | 0.3670 | 0.3647 | 0.3543 | 0.3415 | 0.3616 | 0.4775 |
courtlistener_Plain_Text_test | 0.1888 | 0.2478 | 0.3675 | 0.3761 | 0.3679 | 0.3579 | 0.3377 | 0.3485 | 0.4426 |
gov_report_test | 0.9869 | 0.9750 | 0.9832 | 0.9837 | 0.9816 | 0.9823 | 0.9855 | 0.9883 | 0.9853 |
legal_case_reports_test | 0.3702 | 0.4476 | 0.5398 | 0.5432 | 0.5319 | 0.4850 | 0.5474 | 0.5875 | 0.6534 |
multifieldqa_test | 0.9373 | 0.9341 | 0.9345 | 0.9327 | 0.9450 | 0.9321 | 0.9687 | 0.9564 | 0.9754 |
passage_retrieval_test | 0.4493 | 0.5271 | 0.3470 | 0.3407 | 0.2902 | 0.3248 | 0.7562 | 0.7389 | 0.8550 |
qasper_abstract_test | 1.0000 | 0.9806 | 0.9982 | 0.9982 | 0.9973 | 0.9965 | 0.9973 | 0.9982 | 0.9982 |
qasper_title_test | 0.9860 | 0.8892 | 0.9838 | 0.9833 | 0.9861 | 0.9812 | 0.9742 | 0.9742 | 0.9840 |
qmsum_test | 0.6668 | 0.6307 | 0.6816 | 0.7237 | 0.7169 | 0.7148 | 0.7438 | 0.7613 | 0.8154 |
stackoverflow_test | 0.9634 | 0.9087 | 0.9760 | 0.9760 | 0.9766 | 0.9690 | 0.9362 | 0.9369 | 0.9443 |
summ_screen_fd_test | 0.9320 | 0.9379 | 0.9747 | 0.9635 | 0.9656 | 0.9580 | 0.9796 | 0.9821 | 0.9788 |
Average | 0.7168 | 0.7150 | 0.7525 | 0.7579 | 0.7517 | 0.7455 | 0.7886 | 0.7949 | 0.8406
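For reference, the LoCoV1 numbers above are NDCG@10. A minimal, self-contained implementation of the metric follows (independent of the model and of the reproduction script; the ideal ranking here is simplified to the supplied list):
import numpy as np

def ndcg_at_k(relevances_in_ranked_order, k=10):
    # relevances_in_ranked_order: graded relevance of retrieved documents, best-ranked first
    rel = np.asarray(relevances_in_ranked_order, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    ideal = np.sort(np.asarray(relevances_in_ranked_order, dtype=float))[::-1][:k]
    idcg = float((ideal * discounts[:ideal.size]).sum())
    return dcg / idcg if idcg > 0 else 0.0

# Example: the single relevant document is retrieved at rank 3 -> NDCG@10 = 0.5
print(ndcg_at_k([0, 0, 1, 0, 0], k=10))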
🔧 Technical Details
This model is based on answerdotai/ModernBERT-large and uses a novel training method. Its main technical characteristics are:
- Long-context processing: supports texts of up to 128k tokens and can handle long documents.
- Multi-vector representation: supports both single-vector and multi-vector representations; the multi-vector form can improve retrieval accuracy.
- Flexible chunking mechanism: the chunking strategy can be customized as needed to fit different application scenarios.
- Fast encoding: thanks to the ModernBERT architecture, encoding is very fast.
📄 License
This model is released under the MIT License.
🔧 Limitations
- Language: supports English text only.
- Short-text performance: on short-text tasks, performance may fall behind dedicated short-text embedding models.
- Model stage: the model is still at an alpha/beta stage and may show some unexpected behavior.
📖 Citation
@misc{zhang2025deweylongcontextembedding,
title={Dewey Long Context Embedding Model: A Technical Report},
author={Dun Zhang and Panxiang Zou and Yudong Zhou},
year={2025},
eprint={2503.20376},
archivePrefix={arXiv},
primaryClass={cs.IR},
url={https://arxiv.org/abs/2503.20376},
}