Engessay_grading_ML: an open-source English essay scoring model that gives second-language learners automated scores on six dimensions

Engessay Grading ML

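Developed by KevSun

An automated English essay scoring model that rates writing on six dimensions, aimed at second-language learners.

Text classification

Transformers

Open-source license: MIT #EnglishEssayScoring #MultidimensionalAssessment #L2Learners

Downloads: 1,498

Release date: 5/8/2024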

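Model overview

This model is intended mainly for automated scoring of English essays, particularly texts written by second-language (L2) learners. It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus.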

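Model features

Multidimensional scoring

Scores six dimensions: cohesion, syntax, vocabulary, phraseology, grammar, and conventions

High accuracy

Test-set results: average accuracy = 0.91, average F1 score = 0.9, quadratic weighted kappa (QWK) = 0.85

Expert-scored training data

Trained on about 6,500 English-learner writing samples scored by professional English teachers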

Model capabilities

English essay scoring

Multidimensional text analysis

Second-language learning assessment

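Use cases

Education

Scoring support for EFL teachers

Helps teachers of English as a foreign language evaluate student essays quickly

Provides detailed scores on six dimensions, reducing grading time

Learner self-assessment

Lets English learners assess their own writing ability

Shows their writing level on each dimension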

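🚀 Automated English essay scoring model

This model is designed mainly for automated scoring of English essays and is particularly suited to essays written by second-language (L2) learners.

🚀 Quick start

The model scores English essays automatically. Given an essay, it returns a score for each of six dimensions (cohesion, syntax, vocabulary, phraseology, grammar, and conventions), each in the range 1 to 5.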

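✨ Key features

Scores English essays automatically on six dimensions: cohesion, syntax, vocabulary, phraseology, grammar, and conventions.
Trained on the ELLIPSE corpus, whose expert-assigned scores underpin the model's practical usefulness and accuracy.
Performs well on the test set: average accuracy 0.91, average F1 score 0.9, and average quadratic weighted kappa (QWK) 0.85.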

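📦 Installation

To test the model, run the code below or paste an essay into the API interface. The code imports the transformers, torch, and numpy packages, so they must be installed first.

To get scores from 1 to 5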

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import numpy as np
import torch

# Load the model and its tokenizer from the same repository
model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."

# Define the path to your text file
#file_path = 'path/to/yourfile.txt'

# Read the content of the file
#with open(file_path, 'r', encoding='utf-8') as file:
#    new_text = file.read()

# Tokenize the essay; with truncation=True and max_length=64, only the first 64 tokens are kept
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)
model.eval()

# Perform the prediction
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()

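predicted_scores = predictions.numpy()  # Convert to numpy array
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Scale predictions from the raw output to the range [1, 5].
# This min-max scaling is relative to the six outputs for this essay, so the
# lowest-scoring dimension always maps to 1 and the highest to 5.
scaled_scores = 1 + 4 * (predicted_scores - np.min(predicted_scores)) / (np.max(predicted_scores) - np.min(predicted_scores))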

# Round scores to the nearest 0.5
rounded_scores = np.round(scaled_scores * 2) / 2

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")

# Example output:
# cohesion: 3.5
# syntax: 3.5
# vocabulary: 4.0
# phraseology: 4.0
# grammar: 4.0
# conventions: 3.5
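
The scaling step above is relative to the six outputs of a single essay. The sketch below uses made-up raw values (not real model outputs) to show how that min-max scaling behaves, next to a simpler alternative that clamps each raw value to [1, 5] and rounds it. The alternative assumes the raw outputs are already on the 1 to 5 rubric scale, which the model card does not state explicitly; `min_max_to_1_5`, `clamp_to_1_5`, and `round_to_half` are illustrative names, not part of the model's code.

```python
def round_to_half(value):
    """Round a number to the nearest 0.5."""
    return round(value * 2) / 2


def min_max_to_1_5(raw):
    """Rescale so the smallest raw value becomes 1 and the largest 5."""
    lo, hi = min(raw), max(raw)
    if hi == lo:
        raise ValueError("min-max scaling is undefined when all values are equal")
    return [round_to_half(1 + 4 * (x - lo) / (hi - lo)) for x in raw]


def clamp_to_1_5(raw):
    """Alternative (assumption): treat raw values as 1-5 scores and only clamp them."""
    return [round_to_half(min(5.0, max(1.0, x))) for x in raw]


# Hypothetical raw outputs for the six dimensions, for illustration only
hypothetical_raw = [3.4, 3.5, 3.9, 4.0, 4.1, 3.6]

relative_scores = min_max_to_1_5(hypothetical_raw)
clamped_scores = clamp_to_1_5(hypothetical_raw)
print(relative_scores)
print(clamped_scores)
```

With min-max scaling the spread between dimensions is always stretched to the full 1 to 5 range, so scores from different essays are not directly comparable; clamping keeps them on a common scale if the assumption about the raw outputs holds.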

To get scores from 1 to 10

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the model and its tokenizer from the same repository
model = AutoModelForSequenceClassification.from_pretrained("KevSun/Engessay_grading_ML")
tokenizer = AutoTokenizer.from_pretrained("KevSun/Engessay_grading_ML")

new_text = "The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus is a freely available corpus of ~6,500 ELL writing samples that have been scored for overall holistic language proficiency as well as analytic proficiency scores related to cohesion, syntax, vocabulary, phraseology, grammar, and conventions. In addition, the ELLIPSE corpus provides individual and demographic information for the ELL writers in the corpus including economic status, gender, grade level (8-12), and race/ethnicity. The corpus provides language proficiency scores for individual writers and was developed to advance research in corpus and NLP approaches to assess overall and more fine-grained features of proficiency."
# Tokenize the essay; with truncation=True and max_length=64, only the first 64 tokens are kept
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

model.eval()
with torch.no_grad():
    outputs = model(**encoded_input)

predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()  # Convert to numpy array
item_names = ["cohesion", "syntax", "vocabulary", "phraseology", "grammar", "conventions"]

# Scale predictions from 1 to 10 and round to the nearest 0.5
scaled_scores = 2.25 * predicted_scores - 1.25
rounded_scores = [round(score * 2) / 2 for score in scaled_scores]  # Round to nearest 0.5

for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")

# Example output:
# cohesion: 6.5
# syntax: 7.0
# vocabulary: 7.5
# phraseology: 7.5
# grammar: 7.5
# conventions: 7.0
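
The constants 2.25 and -1.25 in the scaling line follow from requiring a straight-line map that sends 1 to 1 and 5 to 10. The sketch below derives them and checks the endpoints; it is a worked illustration of the arithmetic, and the helper names (`linear_map`, `to_ten_point`, `round_to_half`) are hypothetical.

```python
def linear_map(src_lo, src_hi, dst_lo, dst_hi):
    """Return (slope, intercept) of the line taking [src_lo, src_hi] to [dst_lo, dst_hi]."""
    slope = (dst_hi - dst_lo) / (src_hi - src_lo)
    intercept = dst_lo - slope * src_lo
    return slope, intercept


def round_to_half(value):
    """Round a number to the nearest 0.5."""
    return round(value * 2) / 2


# Map the 1-5 scale onto the 1-10 scale
slope, intercept = linear_map(1, 5, 1, 10)
print(slope, intercept)


def to_ten_point(score):
    """Convert a 1-5 score to the 1-10 scale and round to the nearest 0.5."""
    return round_to_half(slope * score + intercept)


for score in (1, 3, 5):
    print(score, "->", to_ten_point(score))
```

Because the map is linear, the midpoint of the 1 to 5 scale (3) lands on the midpoint of the 1 to 10 scale (5.5).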

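💻 Usage examples

Basic usage

The following is a basic example of submitting an essay and getting scores.

# The code is the same as in the Installation section above

Advanced usage

The following examples show the scores for essays at different proficiency levels.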

# the first example (A1 level)

new_text ="Dear Mauro, Thank you for agreeing to take a care of my house and my pets in my absence. This is my daily routine. Every day I water the plants, I walk the my dog in the morning and in the evening. I feed food it twice a day, I check water's dog twice a week. I take out trash every Friday. I sweep the floor and clean house on Monday and on Wednesday. In your free time you can watch TV and play video games.  In the fridge I left coca cola and ice-cream for you  Have a nice week. "
# Output:
# cohesion: 5.0
# syntax: 5.0
# vocabulary: 5.5
# phraseology: 5.0
# grammar: 5.0
# conventions: 6.0


# the second example (C1 level)

new_text = " Dear Mr. Tromps It was so good to hear from you and your group of international buyers are visiting our company next month. And in response to your question, I would like to recommend some suggestions about business etiquette in my country. Firstly, you'll need to make hotel's reservations with anticipation, especially when the group is numerous. There are several five starts hotels in the commercial center of the Guayaquil city, very close to our offices. Business appointments well in advance and don't be late. Usually, at those meetings the persons exchange presentation cards. Some places include tipping by services in restaurant bills, but if any not the tip is 10% of the bill. The people is very kind here, surely you'll be invited to a meal at a house, you can take a small gift as flowers, candy or wine. Finally, remember it's a beautiful summer here, especially in our city is always warm, then you might include appropriate clothes for this weather. If you have any questions, please just let me know. Have you a nice and safe trip.  Sincerely,  JG Marketing Dpt. LP Representations."
# Output:
# cohesion: 8.0
# syntax: 8.0
# vocabulary: 8.0
# phraseology: 8.5
# grammar: 8.5
# conventions: 8.5
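
To compare the two example essays, the reported scores can be put side by side. The sketch below copies the scores listed above into dictionaries and computes the gap per dimension and the mean of the six dimensions. The mean is only an illustrative summary; the model itself does not output an overall score.

```python
# Scores reported above for the A1-level and C1-level example essays (1-10 scale)
a1_scores = {"cohesion": 5.0, "syntax": 5.0, "vocabulary": 5.5,
             "phraseology": 5.0, "grammar": 5.0, "conventions": 6.0}
c1_scores = {"cohesion": 8.0, "syntax": 8.0, "vocabulary": 8.0,
             "phraseology": 8.5, "grammar": 8.5, "conventions": 8.5}


def mean_score(scores):
    """Average the six dimension scores (illustrative summary only)."""
    return sum(scores.values()) / len(scores)


# Difference between the C1 and A1 essays on each dimension
gaps = {name: c1_scores[name] - a1_scores[name] for name in a1_scores}

for name, gap in gaps.items():
    print(f"{name}: A1={a1_scores[name]:.1f}  C1={c1_scores[name]:.1f}  gap={gap:.1f}")

print(f"mean: A1={mean_score(a1_scores):.2f}  C1={mean_score(c1_scores):.2f}")
```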

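📚 Documentation

Model overview

This model was developed for automated scoring of English essays. It was trained on the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus, which contains about 6,500 writing samples from English learners. Each sample is rated for overall language proficiency and given analytic scores for cohesion, syntax, vocabulary, phraseology, grammar, and conventions.

Performance metrics

The model's performance on the test set (about 980 English essays) is summarized by the following metrics:

Metric	Value
Average accuracy	0.91
Average F1 score	0.9
Average quadratic weighted kappa (QWK)	0.85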
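
QWK in the table above measures agreement between two sets of ordinal ratings, penalizing each disagreement by the squared distance between the two ratings. The sketch below is a minimal pure-Python version of the standard formula, not the authors' evaluation code. The sample scores are hypothetical, and converting 0.5-step scores to integer categories by doubling them is an assumption made for this example. scikit-learn's `cohen_kappa_score` with `weights="quadratic"` computes the same quantity.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Cohen's kappa with quadratic weights for two lists of integer ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n_cat = max_rating - min_rating + 1
    if n_cat < 2:
        raise ValueError("need at least two rating categories")
    n = len(rater_a)

    # Observed counts: observed[i][j] = items rated i by A and j by B
    observed = [[0] * n_cat for _ in range(n_cat)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal histograms of each rater
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_cat)) for j in range(n_cat)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            weight = (i - j) ** 2 / (n_cat - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        return 1.0  # both raters used a single identical category
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical human and model scores in 0.5 steps on the 1-5 scale
human = [3.0, 3.5, 4.0, 2.5, 3.0]
model = [3.0, 3.5, 3.5, 2.5, 3.5]

# Doubling turns 1.0..5.0 in 0.5 steps into integer categories 2..10
human_cat = [round(s * 2) for s in human]
model_cat = [round(s * 2) for s in model]
print(quadratic_weighted_kappa(human_cat, model_cat, 2, 10))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.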

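Output format

Given an essay, the model outputs a score for each of six dimensions: cohesion, syntax, vocabulary, phraseology, grammar, and conventions. Each score is on a scale of 1 to 5 or 1 to 10, and a higher score indicates stronger proficiency on that dimension.

How to use

Enter an essay into this app to get its scores. You can also test the model locally with the Python code above.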

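🔧 Technical details

Training dataset

The model was trained on the ELLIPSE corpus. The corpus contains about 6,500 writing samples from English learners, each scored by several professional English teachers following a rigorous procedure. Training on these expert scores gives the model practical usefulness and accuracy, with performance close to professional scoring standards.

Model architecture

The model is built with AutoModelForSequenceClassification, which loads the pretrained model from KevSun/Engessay_grading_ML.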

📄 License

This model is released under the MIT license.

Citation

If you use this model, please cite the following paper.

@article{sun2024automatic,
  title={Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression},
  author={Kun Sun and Rong Wang},
  year={2024},
  journal={ArXiv},
  url={https://arxiv.org/abs/2406.01198}
}