IELTS Essay Scoring Model
We've developed a language model that can automatically score IELTS essays, leveraging a large dataset rated by human raters to offer accurate and efficient evaluations.
Quick Start
The model scores IELTS (International English Language Testing System) essays automatically. It was trained on 18,000 real IELTS exam essays, each paired with the official score assigned by human raters.
Following the official IELTS standards, the model reports scores along five dimensions: task achievement, coherence and cohesion, vocabulary, grammar, and overall. The overall score is the composite score for the essay.
On the test dataset, the model achieved an accuracy of 0.82 and an F1 score of 0.81. Based on these results, the model can serve as a rough substitute for human raters of IELTS essays, though only to some extent. We will continue to optimize it to improve its accuracy and effectiveness.
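The source does not say how accuracy and F1 were computed. As a rough illustration only, the sketch below treats each half-band score as a class label and computes accuracy and a macro-averaged F1 with NumPy. The function name `accuracy_and_macro_f1`, the choice of macro averaging, and the sample scores are all assumptions made for this example, not the authors' evaluation procedure.

```python
import numpy as np

def accuracy_and_macro_f1(y_true, y_pred):
    """Accuracy and macro-averaged F1, treating each band score as a class.

    Hypothetical helper: the averaging scheme is an assumption, not taken
    from the model's actual evaluation.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # Fraction of essays whose predicted band equals the human band
    accuracy = float(np.mean(y_true == y_pred))

    f1_per_class = []
    for label in np.unique(np.concatenate([y_true, y_pred])):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN)
        f1_per_class.append(2 * tp / denom if denom else 0.0)

    return accuracy, float(np.mean(f1_per_class))

# Made-up band scores, for illustration only
human = [6.0, 6.5, 7.0, 7.0, 8.0]
model_scores = [6.0, 6.5, 7.0, 6.5, 8.0]
acc, f1 = accuracy_and_macro_f1(human, model_scores)
print(f"Accuracy = {acc:.2f}, F1 = {f1:.2f}")
```

A weighted or micro-averaged F1 would give different numbers on the same predictions, so the averaging scheme matters when comparing against the reported figures.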
Usage Examples
Basic Usage
The code below loads the model and scores a new IELTS essay. The example uses an essay from the test dataset that a human rater scored 8.0 overall. The model grades it 8.5, within half a band of the human score.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import numpy as np
model_path = "KevSun/IELTS_essay_scoring"
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
new_text = (
"It is important for all towns and cities to have large public spaces such as squares and parks. "
"Do you agree or disagree with this statement? It is crucial for all metropolitan cities and towns to "
"have some recreational facilities like parks and squares because of their numerous benefits. A number of "
"arguments surround my opinion, and I will discuss it in upcoming paragraphs. To commence with, the first "
"and the foremost merit is that it is beneficial for the health of people because in morning time they can "
"go for walking as well as in the evenings, also older people can spend their free time with their loved ones, "
"and they can discuss about their daily happenings. In addition, young people do lot of exercise in parks and "
"gardens to keep their health fit and healthy, otherwise if there is no park they glue with electronic gadgets "
"like mobile phones and computers and many more. Furthermore, little children get best place to play, they play "
"with their friends in parks if any garden or square is not available for kids then they use roads and streets "
"for playing it can lead to serious incidents. Moreover, parks have some educational value too, in schools, "
"students learn about environment protection in their studies and teachers can take their pupils to parks because "
"students can see those pictures so lively which they see in their school books and they know about importance "
"and protection of trees and flowers. In recapitulate, parks holds immense importance regarding education, health "
"for people of every society, so government should build parks in every city and town."
)
# Tokenize the essay; inputs longer than 512 tokens are truncated
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=512)

model.eval()
with torch.no_grad():
    outputs = model(**encoded_input)

# One raw score per scoring dimension
predictions = outputs.logits.squeeze()
predicted_scores = predictions.numpy()

# Rescale so the largest raw score maps to 9, then round to the nearest half band
normalized_scores = (predicted_scores / predicted_scores.max()) * 9
rounded_scores = np.round(normalized_scores * 2) / 2

item_names = ["Task Achievement", "Coherence and Cohesion", "Vocabulary", "Grammar", "Overall"]
for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")
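The post-processing at the end of the example (rescaling by the maximum, then rounding to half bands) can be tried in isolation without loading the model. The sketch below wraps those two lines in a helper; the name `to_band_scores` and the raw input values are hypothetical and are not real model outputs.

```python
import numpy as np

def to_band_scores(raw_scores):
    """Rescale raw scores so the maximum maps to 9, then round to half bands.

    Hypothetical helper that mirrors the two post-processing lines of the
    usage example; it is not part of the released model code.
    """
    raw = np.asarray(raw_scores, dtype=float)
    scaled = raw / raw.max() * 9
    # Multiplying by 2, rounding, and halving snaps values to steps of 0.5
    return np.round(scaled * 2) / 2

# Made-up raw values standing in for the five model outputs
example = to_band_scores([6.0, 6.5, 7.0, 6.2, 8.0])
print(example.tolist())  # [7.0, 7.5, 8.0, 7.0, 9.0]
```

Because the scores are divided by their own maximum, the highest-scoring dimension always maps to 9.0 whenever that maximum is positive, and the other dimensions are expressed relative to it. Note also that `np.round` rounds exact halves to the nearest even integer, which is why 6.75 becomes 7.0 here.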
Documentation
If you use this model, please cite this paper:
@article{sun2024automatic,
  title={Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression},
  author={Kun Sun and Rong Wang},
  year={2024},
  journal={ArXiv},
  url={https://arxiv.org/abs/2406.01198}
}
License
This project is licensed under the MIT license.