# 🚀 Hebrew Cross-Encoder Model
This Hebrew cross-encoder model (HeCross) scores sentence pairs and supports zero-shot text classification in Hebrew.
## 🚀 Quick Start

Install the dependencies listed under Installation, then run any of the usage examples below.
⨠Features
- **Zero-Shot Classification**: classifies text without task-specific training data for each label.
- **Multi-Library Support**: works with both the SentenceTransformers library and the Transformers library.
## 📦 Installation
The original card gives no installation steps and pins no versions; assuming a standard Python environment, installing the two libraries used below from PyPI is sufficient:
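```bash
# sentence-transformers pulls in torch as a dependency
pip install -U sentence-transformers transformers
```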
## 💻 Usage Examples

### Basic Usage
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('HeTree/HeCross')

# Score Hebrew (question, passage) pairs. Question: "How many people live in Berlin?"
# Passage 1: "Berlin has 3,520,031 registered inhabitants in an area of 891.82 km²."
# Passage 2: "The city of New York is famous for the Metropolitan Museum of Art."
scores = model.predict([('כמה אנשים חיים בברלין?', 'ברלין מונה 3,520,031 תושבים רשומים בשטח של 891.82 קמ"ר.'),
                        ('כמה אנשים חיים בברלין?', 'העיר ניו יורק מפורסמת בזכות מוזיאון המטרופוליטן לאמנות.')])
print(scores)
```
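Each score estimates how relevant the passage is to the question, so the Berlin passage should receive a noticeably higher score than the New York one.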
### Advanced Usage

You can also use the model directly with the Transformers library (without SentenceTransformers):
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

model = AutoModelForSequenceClassification.from_pretrained('HeTree/HeCross')
tokenizer = AutoTokenizer.from_pretrained('HeTree/HeCross')

# Tokenize the same Hebrew (question, passage) pairs as in the basic example.
features = tokenizer(['כמה אנשים חיים בברלין?', 'כמה אנשים חיים בברלין?'],
                     ['ברלין מונה 3,520,031 תושבים רשומים בשטח של 891.82 קמ"ר.', 'העיר ניו יורק מפורסמת בזכות מוזיאון המטרופוליטן לאמנות.'],
                     padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    # Map the raw logits through a sigmoid to get scores in (0, 1).
    scores = sigmoid(model(**features).logits)
print(scores)
```
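Assuming HeCross exposes a single relevance logit (which the `sigmoid` above presumes), this reproduces the output of `CrossEncoder.predict`, since SentenceTransformers applies a sigmoid activation to single-logit cross-encoders by default.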
### Zero-Shot Classification Usage

This model can also be used for zero-shot classification:
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model='HeTree/HeCross')

# "Last week I upgraded my phone's software version."
sent = "השבוע שעבר שדרגתי את גרסת הטלפון שלי."
# Labels: account portability, website, account billing, bank account access
candidate_labels = ["ניוד חשבונות", "אתר", "חיוב חשבון", "גישה לחשבון בנק"]
res = classifier(sent, candidate_labels)
print(res)
```
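Note that the pipeline's default hypothesis template is the English sentence "This example is {}.". For Hebrew input it may help to pass a Hebrew template via the pipeline's documented `hypothesis_template` argument; the wording below is an illustrative assumption, not taken from the original card:

```python
# Hypothetical Hebrew template meaning roughly "This example is {}.".
res = classifier(sent, candidate_labels,
                 hypothesis_template="דוגמה זו היא {}.")
print(res)
```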
## 📚 Documentation

No further documentation is provided; for details on the model and its training resources, see the Mevaker paper cited below.
## 🔧 Technical Details

No technical details are provided beyond the property table at the end of this card.
## 📄 License

This project is licensed under the Apache-2.0 license.
## Citing
If you use HeCross in your research, please cite Mevaker: Conclusion Extraction and Allocation Resources for the Hebrew Language.
```bibtex
@article{shalumov2024mevaker,
      title={Mevaker: Conclusion Extraction and Allocation Resources for the Hebrew Language},
      author={Vitaly Shalumov and Harel Haskey and Yuval Solaz},
      year={2024},
      eprint={2403.09719},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
| Property | Details |
|----------|---------|
| Model Type | Hebrew Cross-Encoder Model |
| Training Data | HeTree/MevakerConcTree |
| Pipeline Tag | zero-shot-classification |
| License | apache-2.0 |