Engessay Grading ML
Model Introduction
This model performs automated scoring of English essays, particularly essays written by second-language (L2) learners. It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus.
Key Features
Multi-dimensional scoring
Scores essays on six dimensions: cohesion, syntax, vocabulary, phraseology, grammar, and conventions
High accuracy
Strong test-set performance: mean accuracy = 0.91, mean F1 score = 0.90, quadratic weighted kappa (QWK) = 0.85
Professionally rated training data
Trained on about 6,500 English-learner writing samples scored by professional English teachers
Model Capabilities
English essay scoring
Multi-dimensional text analysis
Second-language learning assessment
Use Cases
Education
Grading assistance for EFL teachers
Helps teachers of English as a foreign language assess student essays quickly
Provides detailed scores on six dimensions, saving grading time
Learner self-assessment
English learners can evaluate their own writing
and see where they stand on each dimension
🚀 Automated English Essay Scoring Model
This model performs automated scoring of English essays and is especially suited to essays written by learners of English as a second language (L2). It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus, a freely available resource of roughly 6,500 English-learner writing samples. Each sample carries a holistic score for overall language proficiency as well as analytic scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions, all assigned by multiple professional English teachers following strict procedures. This training data gives the model high practical value and accuracy, closely aligned with professional rating standards.
On a test set of about 980 English essays, the model's performance can be summarized by the following metrics: mean accuracy = 0.91, mean F1 score = 0.90, and mean quadratic weighted kappa (QWK) = 0.85.
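For reference, quadratic weighted kappa measures agreement between model and human scores while penalizing larger disagreements more heavily. Below is a minimal sketch of how such a figure can be computed with scikit-learn; the score arrays are made-up placeholders, not the actual test data:
from sklearn.metrics import cohen_kappa_score

# Placeholder integer scores; the reported QWK comes from the ~980-essay test set
human_scores = [3, 4, 2, 5, 3, 4, 3, 2]
model_scores = [3, 4, 3, 5, 2, 4, 3, 3]

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK: {qwk:.2f}")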
Given an essay as input, the model outputs six scores, one each for cohesion, syntax, vocabulary, phraseology, grammar, and conventions. Each score ranges from 1 to 5, with higher values indicating stronger proficiency in that aspect. Taken together, these dimensions assess the quality of the input essay from multiple angles. The model is a valuable tool for teachers and researchers in English as a foreign language (EFL), and it also helps English L2 learners and parents self-assess writing skills.
You can enter an essay in this application to obtain its scores.
🚀 Quick Start
Testing the model
You can test the model by running the code below, or by pasting an essay into the API endpoint (a sketch of an HTTP call follows the two examples):
- If you want output values between 1 and 5, use the following Python code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import numpy as np  # needed for the min/max scaling below

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

# To score an essay stored in a file instead, uncomment the following:
#file_path = 'path/to/yourfile.txt'
#with open(file_path, 'r', encoding='utf-8') as file:
#    new_text = file.read()

# Tokenize the essay; text beyond max_length tokens is truncated
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

model.eval()
# Perform the prediction
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Min-max scale the six raw outputs so the lowest maps to 1 and the highest to 5
scaled_scores = 1 + 4 * (predicted_scores - np.min(predicted_scores)) / (np.max(predicted_scores) - np.min(predicted_scores))

# Round scores to the nearest 0.5
rounded_scores = np.round(scaled_scores * 2) / 2

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")
# Example output:
# cohesion: 3.5
# syntax: 3.5
# vocabulary: 4.0
# phraseology: 4.0
# grammar: 4.0
# conventions: 3.5
- If you want output values between 1 and 10, use the following code:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

model.eval()
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()  # Convert to numpy array
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Linearly map the 1-5 scale onto 1-10: 2.25 * 1 - 1.25 = 1 and 2.25 * 5 - 1.25 = 10
scaled_scores = 2.25 * predicted_scores - 1.25
rounded_scores = [round(score * 2) / 2 for score in scaled_scores]  # Round to nearest 0.5

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")
# Example output:
# cohesion: 6.5
# syntax: 7.0
# vocabulary: 7.5
# phraseology: 7.5
# grammar: 7.5
# conventions: 7.0
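For quick tests without a local installation, the model can also be queried over HTTP. The following is a minimal sketch, assuming the model is reachable through the Hugging Face Inference API; HF_TOKEN is a placeholder for your own access token, and the exact shape of the returned JSON depends on how the endpoint wraps the model's six regression outputs:
import requests

API_URL = "https://api-inference.huggingface.co/models/KevSun/Engessay_grading_ML"
headers = {"Authorization": "Bearer HF_TOKEN"}  # HF_TOKEN: placeholder for your token

essay = "Your essay text goes here."

# Send the essay and inspect the raw response
response = requests.post(API_URL, headers=headers, json={"inputs": essay})
print(response.json())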
✨ Key Features
- Accurate scoring: trained on data rated by professional English teachers, the model offers high practical value and accuracy, closely mirroring professional rating standards.
- Multi-dimensional assessment: for each input essay, the model outputs scores on six dimensions (cohesion, syntax, vocabulary, phraseology, grammar, and conventions), assessing quality from multiple angles.
- Broad applicability: the model is valuable to EFL teachers, researchers, English L2 learners, and their parents.
📦 Model Information
| Property | Details |
|---|---|
| Model type | Automated English essay scoring model |
| Training data | English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus: about 6,500 English-learner writing samples scored by professional English teachers following strict procedures |
💻 Usage Examples
Basic usage
# Example 1 (A1 level)
new_text ="Dear Mauro, Thank you for agreeing to take a care of my house and my pets in my absence. This is my daily routine. Every day I water the plants, I walk the my dog in the morning and in the evening. I feed food it twice a day, I check water's dog twice a week. I take out trash every Friday. I sweep the floor and clean house on Monday and on Wednesday. In your free time you can watch TV and play video games. In the fridge I left coca cola and ice-cream for you Have a nice week. "
## Output
cohesion: 5.0
syntax: 5.0
vocabulary: 5.5
phraseology: 5.0
grammar: 5.0
conventions: 6.0
# Example 2 (C1 level)
new_text = " Dear Mr. Tromps It was so good to hear from you and your group of international buyers are visiting our company next month. And in response to your question, I would like to recommend some suggestions about business etiquette in my country. Firstly, you'll need to make hotel's reservations with anticipation, especially when the group is numerous. There are several five starts hotels in the commercial center of the Guayaquil city, very close to our offices. Business appointments well in advance and don't be late. Usually, at those meetings the persons exchange presentation cards. Some places include tipping by services in restaurant bills, but if any not the tip is 10% of the bill. The people is very kind here, surely you'll be invited to a meal at a house, you can take a small gift as flowers, candy or wine. Finally, remember it's a beautiful summer here, especially in our city is always warm, then you might include appropriate clothes for this weather. If you have any questions, please just let me know. Have you a nice and safe trip. Sincerely, JG Marketing Dpt. LP Representations."
## Output
cohesion: 8.0
syntax: 8.0
vocabulary: 8.0
phraseology: 8.5
grammar: 8.5
conventions: 8.5
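Both examples above were produced with the same pipeline as the quick-start code. The helper below is a minimal sketch that wraps those steps into a single function; grade_essay is a hypothetical name, and it assumes the KevSun/Engessay_grading_ML checkpoint and the 1-10 linear scaling shown earlier:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

MODEL_ID = "KevSun/Engessay_grading_ML"
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model.eval()

ITEMS = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

def grade_essay(text, max_length=64):
    """Hypothetical helper: return the six dimension scores on the 1-10 scale."""
    encoded = tokenizer(text, return_tensors="pt", padding=True,
                        truncation=True, max_length=max_length)
    with torch.no_grad():
        logits = model(**encoded).logits.squeeze()
    scaled = 2.25 * logits.numpy() - 1.25  # map the 1-5 scale onto 1-10
    return {item: round(float(s) * 2) / 2  # round to the nearest 0.5
            for item, s in zip(ITEMS, scaled)}

# Usage: grade_essay(new_text) should reproduce the scores listed above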
📚 Documentation
Citation
If you use this model, please cite the following paper:
@article{sun2024automatic,
  title={Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression},
  author={Kun Sun and Rong Wang},
  year={2024},
  journal={ArXiv},
  url={https://arxiv.org/abs/2406.01198}
}
📄 License
This model is released under the MIT License.