Engessay Grading ML
Model Overview
This model is designed for the automatic scoring of English essays, particularly those written by second-language (L2) learners. It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus.
Model Highlights
Multi-dimensional scoring
Provides scores along six dimensions: cohesion, syntax, vocabulary, phraseology, grammar, and conventions
High accuracy
Strong performance on the test set: mean accuracy = 0.91, mean F1 score = 0.9, quadratic weighted kappa (QWK) = 0.85
Professionally rated training data
Trained on roughly 6,500 English-learner writing samples scored by professional English teachers
Model Capabilities
English essay scoring
Multi-dimensional text analysis
Second-language (L2) learning assessment
Use Cases
Education
Grading assistance for EFL teachers
Helps teachers of English as a foreign language evaluate student essays quickly
Provides detailed scores across six dimensions, saving grading time
Learner self-assessment
English learners can evaluate their own writing ability
See their writing level across the different dimensions
🚀 Automatic English Essay Scoring Model
This model performs automatic scoring of English essays and is especially suited to essays written by learners of English as a second language (L2). It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus. This freely available resource contains roughly 6,500 writing samples from English learners, each carrying a holistic score for overall language proficiency together with analytic scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions. The scores were assigned by multiple professional English teachers following rigorous procedures. This training data ensures that the model is highly practical and accurate, and that it aligns closely with professional scoring standards.
On a test set of about 980 English essays, the model's performance can be summarized by the following metrics: a mean accuracy of 0.91, a mean F1 score of 0.9, and a mean quadratic weighted kappa (QWK) of 0.85.
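For reference, QWK measures agreement between two raters while penalizing larger disagreements more heavily. It can be computed with scikit-learn; the snippet below is an illustrative sketch with made-up scores, not the authors' evaluation script:
```python
# Illustrative QWK computation with scikit-learn (hypothetical scores,
# not the authors' evaluation code)
from sklearn.metrics import cohen_kappa_score

human_scores = [3, 4, 2, 5, 3, 4]  # hypothetical teacher-assigned scores
model_scores = [3, 4, 3, 5, 3, 3]  # hypothetical model predictions

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK: {qwk:.2f}")
```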
Given an essay as input, the model outputs six scores, one each for cohesion, syntax, vocabulary, phraseology, grammar, and conventions. Each score ranges from 1 to 5, with higher values indicating stronger proficiency in that aspect of the essay. Together, these dimensions assess the quality of the input essay from multiple angles. The model is a valuable tool for teachers and researchers of English as a foreign language (EFL), and it can also help English L2 learners and parents self-assess writing skills.
You can enter an essay in this application to obtain its scores.
🚀 Quick Start
Testing the model
You can test the model by running the code below, or by pasting an essay into the API interface:
- If you want output values between 1 and 5, use the following Python code:
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import numpy as np

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

# Alternatively, read the essay from a text file:
#file_path = 'path/to/yourfile.txt'
#with open(file_path, 'r', encoding='utf-8') as file:
#    new_text = file.read()

# Note: with max_length=64, longer essays are truncated before scoring
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)
model.eval()

# Perform the prediction
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Scale predictions from the raw output to the range [1, 5]
scaled_scores = 1 + 4 * (predicted_scores - np.min(predicted_scores)) / (np.max(predicted_scores) - np.min(predicted_scores))

# Round scores to the nearest 0.5
rounded_scores = np.round(scaled_scores * 2) / 2

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")

# Example output:
# cohesion: 3.5
# syntax: 3.5
# vocabulary: 4.0
# phraseology: 4.0
# grammar: 4.0
# conventions: 3.5
```
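To score several essays at once, you can pass a list of texts to the tokenizer; the logits then contain one row of six scores per essay. A minimal sketch, assuming the same model and tokenizer as loaded above:
```python
# Minimal batch-scoring sketch; assumes `model`, `tokenizer`, and `torch`
# are already available as in the snippet above.
essays = ["First essay text...", "Second essay text..."]  # hypothetical inputs
batch = tokenizer(essays, return_tensors='pt', padding=True, truncation=True, max_length=64)

with torch.no_grad():
    logits = model(**batch).logits  # shape: (num_essays, 6)

for text, scores in zip(essays, logits):
    print(text[:30], [round(float(s), 2) for s in scores])
```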
- If you want output values between 1 and 10, use the following code:
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)
model.eval()

with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()  # Convert to numpy array
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Linearly map the raw 1-5 scale to 1-10 (2.25*1 - 1.25 = 1, 2.25*5 - 1.25 = 10),
# then round to the nearest 0.5
scaled_scores = 2.25 * predicted_scores - 1.25
rounded_scores = [round(score * 2) / 2 for score in scaled_scores]

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")

# Example output:
# cohesion: 6.5
# syntax: 7.0
# vocabulary: 7.5
# phraseology: 7.5
# grammar: 7.5
# conventions: 7.0
```
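For convenience, the two recipes above can be wrapped in a small helper. This is a hedged sketch; the `score_essay` function name and its interface are our own, not part of the released model:
```python
# Hypothetical convenience wrapper around the two scoring recipes above;
# assumes `model`, `tokenizer`, and `torch` are loaded as in the snippets.
def score_essay(text, scale="1-10"):
    encoded = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=64)
    with torch.no_grad():
        raw = model(**encoded).logits.squeeze().numpy()
    if scale == "1-10":
        scaled = 2.25 * raw - 1.25  # linear map, as in the second recipe
    else:
        scaled = 1 + 4 * (raw - raw.min()) / (raw.max() - raw.min())  # min-max, as in the first recipe
    items = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]
    return {item: round(float(s) * 2) / 2 for item, s in zip(items, scaled)}

print(score_essay("Your essay text here...", scale="1-10"))
```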
✨ Key Features
- Accurate scoring: trained on data rated by professional English teachers, the model is highly practical and accurate, closely mirroring professional scoring standards.
- Multi-dimensional assessment: for each input essay, the model outputs scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions, assessing essay quality from multiple angles.
- Broad applicability: the model is valuable to EFL teachers and researchers, as well as to English L2 learners and their parents.
📦 Model Information
Property | Details |
---|---|
Model type | Automatic English essay scoring model |
Training data | The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus: roughly 6,500 English-learner writing samples scored by professional English teachers following rigorous procedures |
💻 Usage Examples
Basic usage
```python
# Example 1 (A1 level)
new_text = "Dear Mauro, Thank you for agreeing to take a care of my house and my pets in my absence. This is my daily routine. Every day I water the plants, I walk the my dog in the morning and in the evening. I feed food it twice a day, I check water's dog twice a week. I take out trash every Friday. I sweep the floor and clean house on Monday and on Wednesday. In your free time you can watch TV and play video games. In the fridge I left coca cola and ice-cream for you Have a nice week. "
```
## Output
cohesion: 5.0
syntax: 5.0
vocabulary: 5.5
phraseology: 5.0
grammar: 5.0
conventions: 6.0
```python
# Example 2 (C1 level)
new_text = " Dear Mr. Tromps It was so good to hear from you and your group of international buyers are visiting our company next month. And in response to your question, I would like to recommend some suggestions about business etiquette in my country. Firstly, you'll need to make hotel's reservations with anticipation, especially when the group is numerous. There are several five starts hotels in the commercial center of the Guayaquil city, very close to our offices. Business appointments well in advance and don't be late. Usually, at those meetings the persons exchange presentation cards. Some places include tipping by services in restaurant bills, but if any not the tip is 10% of the bill. The people is very kind here, surely you'll be invited to a meal at a house, you can take a small gift as flowers, candy or wine. Finally, remember it's a beautiful summer here, especially in our city is always warm, then you might include appropriate clothes for this weather. If you have any questions, please just let me know. Have you a nice and safe trip. Sincerely, JG Marketing Dpt. LP Representations."
```
## Output
cohesion: 8.0
syntax: 8.0
vocabulary: 8.0
phraseology: 8.5
grammar: 8.5
conventions: 8.5
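These example outputs are on the 1-10 scale. Assuming the hypothetical `score_essay` helper sketched earlier, with the two sample essays stored in variables of your choosing, they can be reproduced like this:
```python
# Reproduce the example outputs above; `essay_a1` and `essay_c1` are
# hypothetical variables holding the two sample essays shown earlier.
for name, essay in [("Example 1 (A1)", essay_a1), ("Example 2 (C1)", essay_c1)]:
    print(name)
    for item, score in score_essay(essay, scale="1-10").items():
        print(f"  {item}: {score:.1f}")
```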
📚 Documentation
Citation
If you use this model, please cite the following paper:
```bibtex
@article{sun2024automatic,
  title={Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression},
  author={Kun Sun and Rong Wang},
  year={2024},
  journal={ArXiv},
  url={https://arxiv.org/abs/2406.01198}
}
```
📄 License
This model is released under the MIT license.