modernBERT-content-regression Open-source Text Regression Model - Accurately Predict the Click-through Rate of Email Texts

Modernbert Content Regression

Developed by Forecast-ing

A text regression model based on ModernBERT for predicting the click-through rate (CTR) of email text content.

Large Language Model

Transformers

Open Source License: Apache-2.0 #Email CTR Prediction #Text Regression #Few-shot Optimization

Downloads 150

Release Time : 1/9/2025

Model Overview

This project explores the use of ModernBERT for text regression tasks to predict content engagement metrics (such as email click-through rates). After hyperparameter tuning on a dataset of 548 emails, the model slightly outperforms a CatBoost baseline on RMSE (about 1.569 versus 1.598).

Model Features

Text Regression Capability

Capable of predicting engagement metrics (e.g., click-through rate) for text content such as emails.

Few-shot Adaptation

Fine-tuned on only 548 samples, the model slightly beats a CatBoost baseline on MSE, RMSE, and R², which suggests that ModernBERT adapts reasonably well to small datasets.

Hyperparameter Tuning

Specially tuned to optimize performance in regression tasks.

Model Capabilities

Text Feature Extraction

Regression Prediction

Content Engagement Scoring

Use Cases

Marketing

Email Click-through Rate Prediction

Predicts the click-through rate of marketing email content to help optimize email copywriting.

Compared to the CatBoost baseline model, RMSE improved from about 1.598 to 1.569.

Content Optimization

Content Engagement Scoring

Provides engagement scores for generated content to predict its potential effectiveness.

🚀 ModernBERT Engagement Content Regression

This project explores using ModernBERT for the text regression task of predicting engagement metrics for text content, specifically the click-through rate (CTR) of email text content.

🚀 Quick Start

Install dependencies and activate venv

uv sync
source .venv/bin/activate

The following values need to be defined in the .env file:

HUGGINGFACE_TOKEN
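As a rough illustration of how a script or notebook might pick up this value, here is a minimal sketch using only the standard library. The helper names `load_env_file` and `get_hf_token` are hypothetical and not part of the project, and the project's notebook may load the token differently (for example through a dotenv package).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_hf_token(path=".env"):
    """Return HUGGINGFACE_TOKEN from the environment, falling back to the .env file."""
    token = os.environ.get("HUGGINGFACE_TOKEN") or load_env_file(path).get("HUGGINGFACE_TOKEN")
    if not token:
        raise RuntimeError("HUGGINGFACE_TOKEN is not set in the environment or in .env")
    return token
```

Failing early with a clear error is a deliberate choice here: a missing token otherwise tends to surface later as a less obvious authentication failure.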

Run notebook for model fitting

uv run --with jupyter jupyter lab

✨ Features

This project explores using ModernBERT for text regression to predict the click-through rate (CTR) of email text content. It includes hyperparameter tuning of ModernBERT and comparison with a benchmark model.

📚 Documentation

What is this?

This project explores using ModernBERT for the text regression task of predicting engagement metrics for text content. In this case, we predict the click-through rate (CTR) of email text content.

We will explore ModernBERT's hyperparameter tuning and how to use it for regression. We will also compare the results to a benchmark model.
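For readers who have not set up a regression head before, the sketch below shows one common way to do it with the Transformers library: load the model for sequence classification with a single output and `problem_type="regression"`. The checkpoint name `answerdotai/ModernBERT-base` and the helper names are assumptions made for illustration; the training notebook may use a different checkpoint or setup.

```python
import math


def build_regression_model(checkpoint="answerdotai/ModernBERT-base"):
    """Load a tokenizer and a ModernBERT model with a one-output regression head.

    The default checkpoint is an assumption, not necessarily the one used in this project.
    """
    # Imported lazily so the metric helper below works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=1,               # one continuous output: the predicted CTR
        problem_type="regression",  # trains with a mean-squared-error loss
    )
    return tokenizer, model


def compute_metrics(eval_pred):
    """Return RMSE in the dict shape expected by Trainer's compute_metrics hook."""
    predictions, labels = eval_pred
    # With num_labels=1, predictions arrive as rows of length one; flatten them.
    flat = [float(row[0]) if hasattr(row, "__len__") else float(row) for row in predictions]
    squared = [(p - float(y)) ** 2 for p, y in zip(flat, labels)]
    return {"rmse": math.sqrt(sum(squared) / len(squared))}
```

The `compute_metrics` function can be passed to `Trainer(..., compute_metrics=compute_metrics)` so that RMSE is reported at each evaluation step.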

This type of prediction is hard, as a well-known quote reminds us:

"Half my advertising is wasted; the trouble is, I don't know which half." - John Wanamaker

In this experiment, we exclude other relevant factors, such as the time the email is sent, the day of the week, the recipient, etc.

Links for the project:

Model - ModernBERT-Engagement-Content-Regression
Training notebook - Training Notebook
Demo - Demo Space

This work is indebted to the work of many community members and blog posts:

ModernBERT Announcement
Fine-tune classifier with ModernBERT in 2025
How to set up Trainer for a regression
Additional thanks to the creators of ModernBERT!

Our dataset

We use a dataset of 548 emails, each consisting of the email text and the CTR we are trying to predict as the label.

We hope ModernBERT's improvements will allow us to fine-tune a model on each potential user's email dataset. The variability of email data and the small dataset size pose interesting regression challenges.

Benchmarking

We start by using the CatBoost library as a simple benchmark for text regression. For both the benchmark and the ModernBERT run, we use RMSE as the optimization metric. We obtain the following results:

Metric	Value
MSE	2.552100633998035
RMSE	1.5975295408843102
MAE	1.1439370629666958
R²	0.30127932054387174
SMAPE	37.63064694052479
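The metrics in the table above can be computed along the following lines. This is a sketch in plain Python rather than the project's evaluation code, and the SMAPE convention used here (percent scale, mean of absolute true and predicted values in the denominator) is an assumption, since several definitions are in circulation.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, R2, and SMAPE for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    # R2 compares the model's squared error with that of always predicting the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot

    # SMAPE in percent (assumed convention); pairs where both values are 0 contribute 0.
    smape_terms = [
        abs(p - t) / ((abs(t) + abs(p)) / 2)
        for t, p in zip(y_true, y_pred)
        if abs(t) + abs(p) > 0
    ]
    smape = 100.0 * sum(smape_terms) / n

    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2, "SMAPE": smape}
```

Note that RMSE is simply the square root of MSE, which is consistent with the reported values (the square root of 2.5521 is about 1.5975).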

ModernBERT Model Performance

After running hyperparameter tuning for ModernBERT, we get the following results:

Metric	Value
MSE	2.4624056816101074
RMSE	1.5692054300218654
MAE	1.182181715965271
R²	0.325836181640625
SMAPE	56.61447048187256

ModernBERT improves on the benchmark in MSE, RMSE, and R², while MAE and SMAPE are worse (SMAPE markedly so, rising from about 37.6 to 56.6). With only 548 examples, which is very few for fine-tuning, we are happy with this result, and we believe ModernBERT would scale even better with a larger dataset.
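Taking the two tables above at face value, the per-metric change can be checked with a few lines of Python. The values are copied from the tables; the `compare` helper and the `HIGHER_IS_BETTER` set are illustrative names introduced here, not part of the project.

```python
# Values copied from the CatBoost and ModernBERT tables above.
catboost = {
    "MSE": 2.552100633998035,
    "RMSE": 1.5975295408843102,
    "MAE": 1.1439370629666958,
    "R2": 0.30127932054387174,
    "SMAPE": 37.63064694052479,
}
modernbert = {
    "MSE": 2.4624056816101074,
    "RMSE": 1.5692054300218654,
    "MAE": 1.182181715965271,
    "R2": 0.325836181640625,
    "SMAPE": 56.61447048187256,
}

# R2 is the only metric here where a larger value is better.
HIGHER_IS_BETTER = {"R2"}


def compare(baseline, candidate):
    """Return {metric: (percent change vs. baseline, improved?)}."""
    rows = {}
    for name, base in baseline.items():
        new = candidate[name]
        change_pct = 100.0 * (new - base) / abs(base)
        improved = new > base if name in HIGHER_IS_BETTER else new < base
        rows[name] = (round(change_pct, 2), improved)
    return rows


if __name__ == "__main__":
    for metric, (change, improved) in compare(catboost, modernbert).items():
        print(f"{metric:6s} {change:+7.2f}%  {'better' if improved else 'worse'}")
```

Running the comparison marks MSE, RMSE, and R² as better for ModernBERT, and MAE and SMAPE as worse.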

Who are we?

At Forecast.ing, we are building a platform to help users create more enriching content by automatically researching trends and generating campaign ideas with AgenticAI. We generate the content and then use fine-tuned models to score how likely we think that content is to succeed.

🔧 Technical Details

This project uses ModernBERT for text regression. We first use the CatBoost library as a benchmark for text regression. For both the benchmark and the ModernBERT run, we use RMSE as the metric. We also perform hyperparameter tuning on ModernBERT.
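The tuning procedure itself is not described here, so the following is only a generic sketch of one simple approach, random search that minimizes validation RMSE. The search ranges, the `sample_config` and `random_search` names, and the `train_and_eval` callback (which would fine-tune a model with the given configuration and return its RMSE) are all hypothetical.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration; the ranges are illustrative, not the project's."""
    return {
        "learning_rate": 10 ** rng.uniform(-5.5, -4.0),  # log-uniform sampling
        "weight_decay": rng.choice([0.0, 0.01, 0.1]),
        "num_train_epochs": rng.randint(2, 8),
    }


def random_search(train_and_eval, n_trials=20, seed=0):
    """Try n_trials random configurations and keep the one with the lowest RMSE.

    train_and_eval is a user-supplied callable: config dict -> validation RMSE.
    """
    rng = random.Random(seed)  # seeded so that a search can be reproduced
    best_config, best_rmse = None, math.inf
    for _ in range(n_trials):
        config = sample_config(rng)
        rmse = train_and_eval(config)
        if rmse < best_rmse:
            best_config, best_rmse = config, rmse
    return best_config, best_rmse
```

With a dataset of only a few hundred emails, each trial is cheap, so a modest number of random trials is a reasonable starting point before moving to a dedicated tuning library.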

📄 License

This project is under the Apache-2.0 license.

Conclusion

ModernBERT is a promising model for text regression: with only 548 emails, it slightly beat the CatBoost benchmark on MSE, RMSE, and R². We believe that with a larger dataset, we would see even better results. We are excited to see the future of ModernBERT and how it will be used for text regression. If interested, you can contact us at robin@forecast.ing.
