Engessay Grading ML
Model Overview
This model is designed for the automatic scoring of English essays, particularly those written by second-language (L2) learners. It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus.
Model Highlights
Multi-dimensional scoring
Provides scores along six dimensions: cohesion, syntax, vocabulary, phraseology, grammar, and conventions
High accuracy
Strong performance on the test set: mean accuracy = 0.91, mean F1 score = 0.9, quadratic weighted kappa (QWK) = 0.85
Professionally rated training data
Trained on roughly 6,500 English-learner writing samples scored by professional English teachers
Model Capabilities
English essay scoring
Multi-dimensional text analysis
Second-language (L2) learning assessment
Use Cases
Education
Grading assistance for EFL teachers
Helps teachers of English as a foreign language evaluate student essays quickly
Provides detailed scores across six dimensions, saving grading time
Learner self-assessment
English learners can evaluate their own writing ability
See their writing level across the different dimensions
🚀 Automatic English Essay Scoring Model
This model performs automatic scoring of English essays and is especially suited to essays written by learners of English as a second language (L2). It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus. This freely available resource contains roughly 6,500 writing samples from English learners, each carrying a holistic score for overall language proficiency together with analytic scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions. The scores were assigned by multiple professional English teachers following rigorous procedures. This training data ensures that the model is highly practical and accurate, and that it aligns closely with professional scoring standards.
On a test set of about 980 English essays, the model's performance can be summarized by the following metrics: a mean accuracy of 0.91, a mean F1 score of 0.9, and a mean quadratic weighted kappa (QWK) of 0.85.
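For reference, QWK measures agreement between two raters while penalizing larger disagreements more heavily. It can be computed with scikit-learn; the snippet below is an illustrative sketch with made-up scores, not the authors' evaluation script:
```python
# Illustrative QWK computation with scikit-learn (hypothetical scores,
# not the authors' evaluation code)
from sklearn.metrics import cohen_kappa_score

human_scores = [3, 4, 2, 5, 3, 4]  # hypothetical teacher-assigned scores
model_scores = [3, 4, 3, 5, 3, 3]  # hypothetical model predictions

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"QWK: {qwk:.2f}")
```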
Given an essay as input, the model outputs six scores, one each for cohesion, syntax, vocabulary, phraseology, grammar, and conventions. Each score ranges from 1 to 5, with higher values indicating stronger proficiency in that aspect of the essay. Together, these dimensions assess the quality of the input essay from multiple angles. The model is a valuable tool for teachers and researchers of English as a foreign language (EFL), and it can also help English L2 learners and parents self-assess writing skills.
You can enter an essay in this application to obtain its scores.
🚀 Quick Start
Testing the model
You can test the model by running the code below, or by pasting an essay into the API interface:
- If you want output values between 1 and 5, use the following Python code:
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import numpy as np

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

# Alternatively, read the essay from a text file:
#file_path = 'path/to/yourfile.txt'
#with open(file_path, 'r', encoding='utf-8') as file:
#    new_text = file.read()

# Note: with max_length=64, longer essays are truncated before scoring
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)
model.eval()

# Perform the prediction
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Scale predictions from the raw output to the range [1, 5]
scaled_scores = 1 + 4 * (predicted_scores - np.min(predicted_scores)) / (np.max(predicted_scores) - np.min(predicted_scores))

# Round scores to the nearest 0.5
rounded_scores = np.round(scaled_scores * 2) / 2

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")

# Example output:
# cohesion: 3.5
# syntax: 3.5
# vocabulary: 4.0
# phraseology: 4.0
# grammar: 4.0
# conventions: 3.5
```
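To score several essays at once, you can pass a list of texts to the tokenizer; the logits then contain one row of six scores per essay. A minimal sketch, assuming the same model and tokenizer as loaded above:
```python
# Minimal batch-scoring sketch; assumes `model`, `tokenizer`, and `torch`
# are already available as in the snippet above.
essays = ["First essay text...", "Second essay text..."]  # hypothetical inputs
batch = tokenizer(essays, return_tensors='pt', padding=True, truncation=True, max_length=64)

with torch.no_grad():
    logits = model(**batch).logits  # shape: (num_essays, 6)

for text, scores in zip(essays, logits):
    print(text[:30], [round(float(s), 2) for s in scores])
```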
- If you want output values between 1 and 10, use the following code:
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)
model.eval()

with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()  # Convert to numpy array
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Linearly map the raw 1-5 scale to 1-10 (2.25*1 - 1.25 = 1, 2.25*5 - 1.25 = 10),
# then round to the nearest 0.5
scaled_scores = 2.25 * predicted_scores - 1.25
rounded_scores = [round(score * 2) / 2 for score in scaled_scores]

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")

# Example output:
# cohesion: 6.5
# syntax: 7.0
# vocabulary: 7.5
# phraseology: 7.5
# grammar: 7.5
# conventions: 7.0
```
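For convenience, the two recipes above can be wrapped in a small helper. This is a hedged sketch; the `score_essay` function name and its interface are our own, not part of the released model:
```python
# Hypothetical convenience wrapper around the two scoring recipes above;
# assumes `model`, `tokenizer`, and `torch` are loaded as in the snippets.
def score_essay(text, scale="1-10"):
    encoded = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=64)
    with torch.no_grad():
        raw = model(**encoded).logits.squeeze().numpy()
    if scale == "1-10":
        scaled = 2.25 * raw - 1.25  # linear map, as in the second recipe
    else:
        scaled = 1 + 4 * (raw - raw.min()) / (raw.max() - raw.min())  # min-max, as in the first recipe
    items = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]
    return {item: round(float(s) * 2) / 2 for item, s in zip(items, scaled)}

print(score_essay("Your essay text here...", scale="1-10"))
```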
✨ Key Features
- Accurate scoring: trained on data rated by professional English teachers, the model is highly practical and accurate, closely mirroring professional scoring standards.
- Multi-dimensional assessment: for each input essay, the model outputs scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions, assessing essay quality from multiple angles.
- Broad applicability: the model is valuable to EFL teachers and researchers, as well as to English L2 learners and their parents.
📦 Model Information
Property | Details |
---|---|
Model type | Automatic English essay scoring model |
Training data | The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus: roughly 6,500 English-learner writing samples scored by professional English teachers following rigorous procedures |
💻 Usage Examples
Basic usage
```python
# Example 1 (A1 level)
new_text = "Dear Mauro, Thank you for agreeing to take a care of my house and my pets in my absence. This is my daily routine. Every day I water the plants, I walk the my dog in the morning and in the evening. I feed food it twice a day, I check water's dog twice a week. I take out trash every Friday. I sweep the floor and clean house on Monday and on Wednesday. In your free time you can watch TV and play video games. In the fridge I left coca cola and ice-cream for you Have a nice week. "
```
## Output
cohesion: 5.0
syntax: 5.0
vocabulary: 5.5
phraseology: 5.0
grammar: 5.0
conventions: 6.0
```python
# Example 2 (C1 level)
new_text = " Dear Mr. Tromps It was so good to hear from you and your group of international buyers are visiting our company next month. And in response to your question, I would like to recommend some suggestions about business etiquette in my country. Firstly, you'll need to make hotel's reservations with anticipation, especially when the group is numerous. There are several five starts hotels in the commercial center of the Guayaquil city, very close to our offices. Business appointments well in advance and don't be late. Usually, at those meetings the persons exchange presentation cards. Some places include tipping by services in restaurant bills, but if any not the tip is 10% of the bill. The people is very kind here, surely you'll be invited to a meal at a house, you can take a small gift as flowers, candy or wine. Finally, remember it's a beautiful summer here, especially in our city is always warm, then you might include appropriate clothes for this weather. If you have any questions, please just let me know. Have you a nice and safe trip. Sincerely, JG Marketing Dpt. LP Representations."
```
## Output
cohesion: 8.0
syntax: 8.0
vocabulary: 8.0
phraseology: 8.5
grammar: 8.5
conventions: 8.5
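These example outputs are on the 1-10 scale. Assuming the hypothetical `score_essay` helper sketched earlier, with the two sample essays stored in variables of your choosing, they can be reproduced like this:
```python
# Reproduce the example outputs above; `essay_a1` and `essay_c1` are
# hypothetical variables holding the two sample essays shown earlier.
for name, essay in [("Example 1 (A1)", essay_a1), ("Example 2 (C1)", essay_c1)]:
    print(name)
    for item, score in score_essay(essay, scale="1-10").items():
        print(f"  {item}: {score:.1f}")
```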
📚 Documentation
Citation
If you use this model, please cite the following paper:
```bibtex
@article{sun2024automatic,
  title={Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression},
  author={Kun Sun and Rong Wang},
  year={2024},
  journal={ArXiv},
  url={https://arxiv.org/abs/2406.01198}
}
```
📄 License
This model is released under the MIT license.