Engessay Grading ML
Model Introduction
This model performs automated scoring of English essays, particularly essays written by second-language (L2) learners. It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus.
Key Features
Multi-dimensional scoring
Scores essays on six dimensions: cohesion, syntax, vocabulary, phraseology, grammar, and conventions
High accuracy
Strong test-set performance: mean accuracy = 0.91, mean F1 score = 0.90, quadratic weighted kappa (QWK) = 0.85
Professionally rated training data
Trained on about 6,500 English-learner writing samples scored by professional English teachers
Model Capabilities
English essay scoring
Multi-dimensional text analysis
Second-language learning assessment
Use Cases
Education
Grading assistance for EFL teachers
Helps teachers of English as a foreign language assess student essays quickly
Provides detailed scores on six dimensions, saving grading time
Learner self-assessment
English learners can evaluate their own writing
and see where they stand on each dimension
🚀 Automated English Essay Scoring Model
This model performs automated scoring of English essays and is especially suited to essays written by learners of English as a second language (L2). It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus, a freely available resource of roughly 6,500 English-learner writing samples. Each sample carries a holistic score for overall language proficiency as well as analytic scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions, all assigned by multiple professional English teachers following strict procedures. This training data gives the model high practical value and accuracy, closely aligned with professional rating standards.
On a test set of about 980 English essays, the model's performance can be summarized by the following metrics: mean accuracy = 0.91, mean F1 score = 0.90, and mean quadratic weighted kappa (QWK) = 0.85.
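For reference, quadratic weighted kappa measures agreement between model and human scores while penalizing larger disagreements more heavily. Below is a minimal sketch of how such a figure can be computed with scikit-learn; the score arrays are made-up placeholders, not the actual test data:
from sklearn.metrics import cohen_kappa_score

# Placeholder integer scores; the reported QWK comes from the ~980-essay test set
human_scores = [3, 4, 2, 5, 3, 4, 3, 2]
model_scores = [3, 4, 3, 5, 2, 4, 3, 3]

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK: {qwk:.2f}")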
Given an essay as input, the model outputs six scores, one each for cohesion, syntax, vocabulary, phraseology, grammar, and conventions. Each score ranges from 1 to 5, with higher values indicating stronger proficiency in that aspect. Taken together, these dimensions assess the quality of the input essay from multiple angles. The model is a valuable tool for teachers and researchers in English as a foreign language (EFL), and it also helps English L2 learners and parents self-assess writing skills.
You can enter an essay in this application to obtain its scores.
🚀 Quick Start
Testing the model
You can test the model by running the code below, or by pasting an essay into the API endpoint (a sketch of an HTTP call follows the two examples):
- If you want output values between 1 and 5, use the following Python code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import numpy as np  # needed for the min/max scaling below

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

# To score an essay stored in a file instead, uncomment the following:
#file_path = 'path/to/yourfile.txt'
#with open(file_path, 'r', encoding='utf-8') as file:
#    new_text = file.read()

# Tokenize the essay; text beyond max_length tokens is truncated
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

model.eval()
# Perform the prediction
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Min-max scale the six raw outputs so the lowest maps to 1 and the highest to 5
scaled_scores = 1 + 4 * (predicted_scores - np.min(predicted_scores)) / (np.max(predicted_scores) - np.min(predicted_scores))

# Round scores to the nearest 0.5
rounded_scores = np.round(scaled_scores * 2) / 2

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")
# Example output:
# cohesion: 3.5
# syntax: 3.5
# vocabulary: 4.0
# phraseology: 4.0
# grammar: 4.0
# conventions: 3.5
- If you want output values between 1 and 10, use the following code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

model.eval()
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()  # Convert to numpy array
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Linearly map the 1-5 scale onto 1-10: 2.25 * 1 - 1.25 = 1 and 2.25 * 5 - 1.25 = 10
scaled_scores = 2.25 * predicted_scores - 1.25
rounded_scores = [round(score * 2) / 2 for score in scaled_scores]  # Round to nearest 0.5

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")
# Example output:
# cohesion: 6.5
# syntax: 7.0
# vocabulary: 7.5
# phraseology: 7.5
# grammar: 7.5
# conventions: 7.0
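For quick tests without a local installation, the model can also be queried over HTTP. The following is a minimal sketch, assuming the model is reachable through the Hugging Face Inference API; HF_TOKEN is a placeholder for your own access token, and the exact shape of the returned JSON depends on how the endpoint wraps the model's six regression outputs:
import requests

API_URL = "https://api-inference.huggingface.co/models/KevSun/Engessay_grading_ML"
headers = {"Authorization": "Bearer HF_TOKEN"}  # HF_TOKEN: placeholder for your token

essay = "Your essay text goes here."

# Send the essay and inspect the raw response
response = requests.post(API_URL, headers=headers, json={"inputs": essay})
print(response.json())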
✨ Key Features
- Accurate scoring: trained on data rated by professional English teachers, the model offers high practical value and accuracy, closely mirroring professional rating standards.
- Multi-dimensional assessment: for each input essay, the model outputs scores on six dimensions (cohesion, syntax, vocabulary, phraseology, grammar, and conventions), assessing quality from multiple angles.
- Broad applicability: the model is valuable to EFL teachers, researchers, English L2 learners, and their parents.
📦 Model Information
| Property | Details |
|---|---|
| Model type | Automated English essay scoring model |
| Training data | English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus: about 6,500 English-learner writing samples scored by professional English teachers following strict procedures |
💻 Usage Examples
Basic usage
# Example 1 (A1 level)
new_text ="Dear Mauro, Thank you for agreeing to take a care of my house and my pets in my absence. This is my daily routine. Every day I water the plants, I walk the my dog in the morning and in the evening. I feed food it twice a day, I check water's dog twice a week. I take out trash every Friday. I sweep the floor and clean house on Monday and on Wednesday. In your free time you can watch TV and play video games. In the fridge I left coca cola and ice-cream for you Have a nice week. "
## Output
cohesion: 5.0
syntax: 5.0
vocabulary: 5.5
phraseology: 5.0
grammar: 5.0
conventions: 6.0
# Example 2 (C1 level)
new_text = " Dear Mr. Tromps It was so good to hear from you and your group of international buyers are visiting our company next month. And in response to your question, I would like to recommend some suggestions about business etiquette in my country. Firstly, you'll need to make hotel's reservations with anticipation, especially when the group is numerous. There are several five starts hotels in the commercial center of the Guayaquil city, very close to our offices. Business appointments well in advance and don't be late. Usually, at those meetings the persons exchange presentation cards. Some places include tipping by services in restaurant bills, but if any not the tip is 10% of the bill. The people is very kind here, surely you'll be invited to a meal at a house, you can take a small gift as flowers, candy or wine. Finally, remember it's a beautiful summer here, especially in our city is always warm, then you might include appropriate clothes for this weather. If you have any questions, please just let me know. Have you a nice and safe trip. Sincerely, JG Marketing Dpt. LP Representations."
## Output
cohesion: 8.0
syntax: 8.0
vocabulary: 8.0
phraseology: 8.5
grammar: 8.5
conventions: 8.5
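Both examples above were produced with the same pipeline as the quick-start code. The helper below is a minimal sketch that wraps those steps into a single function; grade_essay is a hypothetical name, and it assumes the KevSun/Engessay_grading_ML checkpoint and the 1-10 linear scaling shown earlier:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

MODEL_ID = "KevSun/Engessay_grading_ML"
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model.eval()

ITEMS = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

def grade_essay(text, max_length=64):
    """Hypothetical helper: return the six dimension scores on the 1-10 scale."""
    encoded = tokenizer(text, return_tensors="pt", padding=True,
                        truncation=True, max_length=max_length)
    with torch.no_grad():
        logits = model(**encoded).logits.squeeze()
    scaled = 2.25 * logits.numpy() - 1.25  # map the 1-5 scale onto 1-10
    return {item: round(float(s) * 2) / 2  # round to the nearest 0.5
            for item, s in zip(ITEMS, scaled)}

# Usage: grade_essay(new_text) should reproduce the scores listed above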
📚 Documentation
Citation
If you use this model, please cite the following paper:
@article{sun2024automatic,
  title={Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression},
  author={Kun Sun and Rong Wang},
  year={2024},
  journal={ArXiv},
  url={https://arxiv.org/abs/2406.01198}
}
📄 License
This model is released under the MIT License.