IELTS_essay_scoring: Open-source IELTS Essay Scoring Model (multi-dimensional automatic scoring, accuracy 0.82)

IELTS Essay Scoring

Developed by KevSun

This model is trained on a large dataset of manually scored IELTS essays and can automatically score IELTS essays across multiple dimensions with an accuracy of 0.82.

Large Language Model

Transformers

Language: English

License: MIT (open source)

Tags: IELTS essay scoring, multi-dimensional scoring, high accuracy


Release Time : 5/17/2024

Model Overview

A Transformer-based sequence classification model designed to automatically evaluate IELTS essays across five dimensions: Task Achievement, Coherence and Cohesion, Lexical Resource, Grammatical Range and Accuracy, and Overall Score.

Model Features

Multi-dimensional Scoring

Scores essays on five dimensions: the four official IELTS writing criteria (Task Achievement, Coherence and Cohesion, Lexical Resource, Grammatical Range and Accuracy) plus an Overall Score.

High Accuracy

Achieves an accuracy of 0.82 and an F1 score of 0.81 on the test dataset, measured against human-assigned scores.

Large-scale Training Data

Trained on 18,000 authentic IELTS essays with official scores, covering diverse writing styles and topics.

Model Capabilities

IELTS essay scoring

Multi-dimensional text evaluation

English writing quality analysis

Use Cases

Educational Assessment

IELTS Essay Automated Scoring

Used to replace or assist human scorers in evaluating IELTS essays, improving scoring efficiency and consistency.

Model scores closely match human scores. For example, an essay scored 8.5 by the model received 8.0 from human scorers.

Language Learning

Writing Proficiency Assessment

Helps English learners understand their writing level and identify areas for improvement.
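As a rough illustration of how per-dimension scores could point a learner toward areas for improvement, the sketch below picks the lowest-scoring criteria from a score dictionary. The function name `weakest_dimensions` is hypothetical and not part of the model; the sample values are the ones printed in the usage example in this card.

```python
def weakest_dimensions(scores, exclude=("Overall",)):
    """Return the criteria with the lowest band, ignoring composite entries."""
    criteria = {name: band for name, band in scores.items() if name not in exclude}
    lowest = min(criteria.values())
    # Keep every criterion tied for the lowest band, in input order
    return [name for name, band in criteria.items() if band == lowest]


# Scores taken from the sample output shown in this card
sample_scores = {
    "Task Achievement": 9.0,
    "Coherence and Cohesion": 7.5,
    "Vocabulary": 8.0,
    "Grammar": 7.5,
    "Overall": 8.5,
}

print(weakest_dimensions(sample_scores))
# ['Coherence and Cohesion', 'Grammar']
```

The overall score is excluded by default because it is a composite rather than a separate skill.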

🚀 IELTS Essay Scoring Model

We've developed a language model that can automatically score IELTS essays, leveraging a large dataset rated by human raters to offer accurate and efficient evaluations.

🚀 Quick Start

We've trained a language model to automatically score IELTS (International English Language Testing System) essays. The training dataset contains 18,000 real IELTS exam essays together with their official, human-assigned scores.

The model's scores are reported in five dimensions, following the official IELTS standards: task achievement, coherence and cohesion, vocabulary (Lexical Resource), grammar (Grammatical Range and Accuracy), and overall. The overall score is the composite score of the essay.

On the test dataset, the model achieved Accuracy = 0.82 and F1 Score = 0.81. Based on these results, the model can approximate human raters for IELTS essays to some extent. We will continue to optimize it to improve its accuracy and effectiveness.
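The card does not say how accuracy and F1 were computed, for example whether a prediction counts as correct only on an exact band match, or which F1 averaging was used. As a minimal sketch, assuming half-band scores are treated as class labels and F1 is macro-averaged, the metrics could be computed as below. The function names and the sample scores are hypothetical and are not taken from the model's test set.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the human band."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def within_tolerance(y_true, y_pred, tol=0.5):
    """Fraction of predictions at most `tol` bands away from the human band."""
    return sum(abs(t - p) <= tol for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-band F1, treating each band as a class."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_values = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_values.append(2 * tp / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values)


# Hypothetical overall scores, for illustration only
human_scores = [8.0, 6.5, 7.0, 6.0, 7.0, 5.5]
model_scores = [8.5, 6.5, 7.0, 6.0, 6.5, 5.5]

print(f"exact-match accuracy: {accuracy(human_scores, model_scores):.2f}")
print(f"within half a band:   {within_tolerance(human_scores, model_scores):.2f}")
print(f"macro F1:             {macro_f1(human_scores, model_scores):.2f}")
```

Exact-match accuracy is strict for band scores: a prediction of 8.5 against a human score of 8.0 counts as an error, even though it is only half a band away. Reporting a tolerance-based figure alongside it makes that distinction visible.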

💻 Usage Examples

Basic Usage

The following code uses the model to score a new IELTS essay. The example essay comes from the test dataset, where the human rater gave it an overall score of 8.0. The model grades it as 8.5, which is within half a band of the human score.

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
import numpy as np

# Load the pre-trained model and tokenizer
model_path = "KevSun/IELTS_essay_scoring"
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Example essay from the test dataset; the human rater gave it an overall score of 8.0.

new_text = (
    "It is important for all towns and cities to have large public spaces such as squares and parks. "
    "Do you agree or disagree with this statement? It is crucial for all metropolitan cities and towns to "
    "have some recreational facilities like parks and squares because of their numerous benefits. A number of "
    "arguments surround my opinion, and I will discuss it in upcoming paragraphs. To commence with, the first "
    "and the foremost merit is that it is beneficial for the health of people because in morning time they can "
    "go for walking as well as in the evenings, also older people can spend their free time with their loved ones, "
    "and they can discuss about their daily happenings. In addition, young people do lot of exercise in parks and "
    "gardens to keep their health fit and healthy, otherwise if there is no park they glue with electronic gadgets "
    "like mobile phones and computers and many more. Furthermore, little children get best place to play, they play "
    "with their friends in parks if any garden or square is not available for kids then they use roads and streets "
    "for playing it can lead to serious incidents. Moreover, parks have some educational value too, in schools, "
    "students learn about environment protection in their studies and teachers can take their pupils to parks because "
    "students can see those pictures so lively which they see in their school books and they know about importance "
    "and protection of trees and flowers. In recapitulate, parks holds immense importance regarding education, health "
    "for people of every society, so government should build parks in every city and town."
)


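# Tokenize the essay; input longer than 512 tokens is truncated
encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=512)

# Switch to inference mode (disables dropout)
model.eval()

# Perform the prediction
with torch.no_grad():
    outputs = model(**encoded_input)

# Drop the batch dimension to get one raw value per scoring dimension
predictions = outputs.logits.squeeze()

# Convert the tensor to a NumPy array
predicted_scores = predictions.numpy()

# Normalize the scores: rescale so the largest raw value maps to band 9
normalized_scores = (predicted_scores / predicted_scores.max()) * 9  # Scale to 9

# Round to the nearest half band (0.5 steps)
rounded_scores = np.round(normalized_scores * 2) / 2

# Dimension labels, in the same order as the model outputs
item_names = ["Task Achievement", "Coherence and Cohesion", "Vocabulary", "Grammar", "Overall"]

# Print one line per dimension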
for item, score in zip(item_names, rounded_scores):
    print(f"{item}: {score:.1f}")

# Output:
# Task Achievement: 9.0
# Coherence and Cohesion: 7.5
# Vocabulary: 8.0
# Grammar: 7.5
# Overall: 8.5
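The post-processing in the example (divide by the maximum, scale to 9, round to half bands) can be isolated in a small helper to see how it behaves. The sketch below reimplements it in plain Python; the name `to_half_bands` and the raw input values are hypothetical, chosen for illustration, and are not the model's actual outputs.

```python
def to_half_bands(raw_scores, top_band=9.0):
    """Rescale so the largest raw value equals top_band, then round to 0.5 steps."""
    peak = max(raw_scores)
    if peak <= 0:
        # Dividing by a non-positive maximum would flip or break the scale
        raise ValueError("rescaling by the maximum assumes a positive peak value")
    scaled = [value / peak * top_band for value in raw_scores]
    # Doubling, rounding, and halving snaps each value to the nearest half band
    return [round(value * 2) / 2 for value in scaled]


# Hypothetical raw values, one per scoring dimension
raw = [7.2, 6.0, 6.4, 6.1, 6.8]
print(to_half_bands(raw))
# [9.0, 7.5, 8.0, 7.5, 8.5]
```

Because every value is divided by the maximum, the highest-scoring dimension always comes out as 9.0 and the other scores are relative to it. Both Python's `round` and `np.round` round exact ties to the nearest even number, so a doubled value of exactly 16.5 becomes 16 rather than 17.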

📚 Documentation

If you use this model, please cite this paper:

@article{sun2024automatic,
  title={Automatic Essay Multi-dimensional Scoring with Fine-tuning and Multiple Regression},
  author={Kun Sun and Rong Wang},
  year={2024},
  journal={ArXiv},
  url={https://arxiv.org/abs/2406.01198}
}

📄 License

This project is licensed under the MIT license.
