Model Card for 11128093-11066053-nli
A binary Natural Language Inference classifier fine-tuned on the provided COMP34812 dataset using the Mamba state space model.
Quick Start
This model is a binary Natural Language Inference classifier, fine-tuned on the COMP34812 dataset on top of the Mamba state space model to solve binary NLI tasks.
⨠Features
- Extends the state-spaces/mamba-130m architecture for binary NLI tasks.
- Uses a custom classification head.
- Fine-tuned on the COMP34812 NLI dataset.
Installation
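The card does not publish installation steps. A plausible environment setup, inferred from the Software section below (versions, and the optional causal-conv1d build, are assumptions):

```bash
# Inferred from the Software list in this card; pin versions as needed.
pip install torch transformers datasets evaluate accelerate
pip install causal-conv1d mamba-ssm  # mamba-ssm needs a CUDA toolchain to build its kernels
```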
Usage Examples
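The card does not include loading code either, so the snippet below is a minimal sketch rather than the authors' implementation. Only the base checkpoint (state-spaces/mamba-130m), the 128-token limit, and the checkpoint/tokenizer location (patrickmlml/mamba_nli_ensemble) come from the card; the class name, hidden size, last-token pooling, and label mapping are assumptions.

```python
# Minimal sketch, not the authors' published code. The head design, pooling
# strategy, and label mapping are assumptions; mamba-ssm needs a CUDA GPU.
import torch
import torch.nn as nn
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

class MambaNLIClassifier(nn.Module):
    """mamba-130m backbone plus a linear binary-classification head (assumed design)."""

    def __init__(self, base="state-spaces/mamba-130m", num_labels=2):
        super().__init__()
        # Reuse the pretrained Mamba mixer stack and drop the LM head.
        self.backbone = MambaLMHeadModel.from_pretrained(base).backbone
        self.head = nn.Linear(768, num_labels)  # mamba-130m uses d_model = 768

    def forward(self, input_ids):
        hidden = self.backbone(input_ids)  # (batch, seq_len, d_model)
        pooled = hidden[:, -1, :]          # last-token pooling (assumption)
        return self.head(pooled)

# Tokenizer location taken from the "Additional Information" section below.
tokenizer = AutoTokenizer.from_pretrained("patrickmlml/mamba_nli_ensemble")
model = MambaNLIClassifier().to("cuda").eval()

premise = "A man is playing a guitar on stage."
hypothesis = "A person is performing music."
# Pair encoding simply concatenates the two sentences here; the authors'
# exact input format is not documented in the card.
enc = tokenizer(premise, hypothesis, truncation=True, max_length=128,
                return_tensors="pt").to("cuda")

with torch.no_grad():
    logits = model(enc["input_ids"])
print(logits.argmax(-1).item())  # 1 = entailment, 0 = non-entailment (assumed mapping)
```

Note that this sketch initializes a fresh classification head; loading the fine-tuned weights from the ensemble repository would additionally require a `load_state_dict` call on the published checkpoint, whose file format the card does not specify.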
Documentation
Model Details
Model Description
This model extends the state-spaces/mamba-130m architecture for binary NLI tasks (entailment vs. non-entailment). It uses a custom classification head and was fine-tuned on the COMP34812 NLI dataset.
- Developed by: Patrick Mermelstein Lyons and Dev Soneji
- Language(s): English
- Model type: Supervised
- Model architecture: Non-Transformer (Selective State Spaces)
- Finetuned from model: state-spaces/mamba-130m
Model Resources
- Repository: https://huggingface.co/state-spaces/mamba-130m
- Paper or documentation: https://arxiv.org/pdf/2312.00752.pdf
Training Details
Training Data
The COMP34812 NLI train dataset (a closed-source, task-specific dataset): 24.4K premise-hypothesis pairs, each with a binary entailment label.
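The dataset files themselves are closed-source, so loading code can only be illustrative. Assuming CSV files with premise, hypothesis, and label columns, the datasets library listed under Software could read them as follows:

```python
# Illustrative only: COMP34812 is closed-source, and the file names and
# column schema below are assumptions, not the published format.
from datasets import load_dataset

data = load_dataset("csv", data_files={"train": "train.csv", "dev": "dev.csv"})
print(data["train"][0])  # e.g. {"premise": ..., "hypothesis": ..., "label": 0}
```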
Training Procedure
Training Hyperparameters
- learning_rate: 5e-5
- train_batch_size: 4
- eval_batch_size: 16
- num_train_epochs: 5
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
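Mapped onto transformers.TrainingArguments, these values would look like the sketch below; the card lists the hyperparameters but not the training script, so everything else here is an assumption.

```python
# Sketch: the card's hyperparameters expressed as transformers.TrainingArguments.
# output_dir and any unlisted settings are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mamba-nli",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
)
```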
Speeds, Sizes, Times
- total training time: 1 hour 17 minutes
- number of epochs: 5
- model size: ~500MB
Evaluation
Testing Data & Metrics
Testing Data
The COMP34812 NLI dev dataset (a closed-source, task-specific dataset): 6.7K premise-hypothesis pairs, each with a binary entailment label.
Metrics
- Accuracy
- Matthews Correlation Coefficient (MCC)
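Both metrics are available through the evaluate package listed under Software; a minimal sketch of computing them on toy predictions:

```python
# Minimal sketch using the `evaluate` package from the Software list.
import evaluate

accuracy = evaluate.load("accuracy")
mcc = evaluate.load("matthews_correlation")

preds = [1, 0, 1, 1]   # toy predictions
labels = [1, 0, 0, 1]  # toy gold labels
print(accuracy.compute(predictions=preds, references=labels))  # {'accuracy': 0.75}
print(mcc.compute(predictions=preds, references=labels))       # {'matthews_correlation': ~0.577}
```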
Results
The model achieved an accuracy of 82.4% and an MCC of 0.649.
Technical Specifications
Hardware
- GPU: NVIDIA T4 (Google Colab)
- VRAM: 15.0 GB
- RAM: 12.7 GB
- Disk: 2 GB for model and data
Software
- Python 3.10+
- PyTorch
- HuggingFace Transformers
- mamba-ssm
- datasets, evaluate, accelerate
Bias, Risks, and Limitations
The model is limited to binary entailment detection and was trained exclusively on the COMP34812 dataset; generalization beyond this dataset is untested. Sentence pairs longer than 128 tokens are truncated.
Additional Information
Model checkpoints and the tokenizer are available at https://huggingface.co/patrickmlml/mamba_nli_ensemble. Hyperparameters were chosen by closely following the referenced literature.
License
The model is licensed under cc-by-4.0.
Information Table
| Property | Details |
|----------|---------|
| Model Type | Supervised |
| Training Data | The COMP34812 NLI train dataset (a closed-source, task-specific dataset): 24.4K premise-hypothesis pairs, each with a binary entailment label. |
| Metrics | Matthews Correlation, Accuracy |
| Base Model | state-spaces/mamba-130m |
| Tags | text-classification, nli, mamba |