# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a Sentence Transformer model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space, which can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Features
- Semantic Understanding: Effectively captures the semantic meaning of sentences and paragraphs, enabling accurate similarity comparisons.
- Versatile Applications: Can be used in various natural language processing tasks such as semantic search, text classification, and clustering.
- Fine-tuned Model: Built on the sentence-transformers/all-MiniLM-L6-v2 base model and fine-tuned for matching resumes to job descriptions.
## Installation
First, install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
## Usage Examples

### Basic Usage

```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("anass1209/resume-job-matcher-all-MiniLM-L6-v2")
sentences = [
'Developed and maintained core backend services using Python and Django, focusing on scalability and efficiency. Implemented RESTful APIs for data retrieval and manipulation. Worked extensively with PostgreSQL for data storage and retrieval. Responsible for optimizing database queries and improving API response times. Experience with model fine-tuning for semantic search and document retrieval using pre-trained embedding models like Sentence Transformers or similar libraries, specifically for improving the relevance of search results and document matching within the web application. Experience using vector databases (e.g., ChromaDB, Weaviate) preferred.',
'## Senior Backend Engineer\n\n* **ABC Corp** | 2020 - Present\n* Led development of a new REST API for user authentication and profile management using Python and Django.\n* Managed a PostgreSQL database, optimizing queries and schema design for improved performance, resulting in a 20% reduction in average API response time.\n* Improved system scalability through efficient code design and load balancing techniques.\n* Experience using pre-trained embedding models (BERT) for natural language processing tasks to improve search accuracy, with focus on keyphrase extraction and content similarity comparison for the recommendations engine. Proficient in Flask.',
"PhD in Computer Science, University of California, Berkeley (2018-2023). Dissertation: 'Adversarial Robustness in NLP for Cybersecurity Applications.' Focused on fine-tuning BERT for malware detection and social engineering attacks. Proficient in Python, TensorFlow, and AWS. Published in top-tier NLP and security conferences. Experienced with large datasets and model evaluation metrics.\n\nMaster of Science in Cybersecurity, Johns Hopkins University (2016-2018). Relevant coursework included Machine Learning, Data Mining, and Network Security. Developed a system for anomaly detection using a recurrent neural network (RNN). Familiar with Python and cloud computing platforms. Good understanding of NLP concepts, but limited experience fine-tuning transformer models. Strong understanding of Information Security Principles.\n\nBachelor of Science in Computer Engineering, Carnegie Mellon University (2012-2016). Relevant coursework: Artificial Intelligence, Database Management, and Software Engineering. Project experience: Developed a web application using Python. No direct experience with fine-tuning NLP models, but a strong foundation in programming and data structures. Familiar with cloud infrastructure concepts. Possess CISSP certification.",
]
# Encode the example texts
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 384)

# Pairwise cosine similarities between all encoded texts
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3)
```
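For the resume/job-matching use case this model is named after, the similarity scores can be used directly to rank candidate resumes against a job description. The sketch below is illustrative: the variable names and example strings are not part of the model card.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("anass1209/resume-job-matcher-all-MiniLM-L6-v2")

# Illustrative inputs: one job description and two candidate resumes.
job_description = "Backend engineer with Python, Django, and PostgreSQL experience."
resumes = [
    "Senior Backend Engineer, 5 years of Python/Django and PostgreSQL.",
    "Frontend developer focused on React and TypeScript.",
]

# Encode everything in one batch.
embeddings = model.encode([job_description] + resumes)

# Cosine similarity between the job description and each resume: shape (1, len(resumes)).
scores = model.similarity(embeddings[:1], embeddings[1:])

# Rank resumes from best to worst match.
ranking = sorted(zip(resumes, scores[0].tolist()), key=lambda x: x[1], reverse=True)
for resume, score in ranking:
    print(f"{score:.3f}  {resume}")
```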
## Documentation

### Model Details

#### Model Description
| Property | Details |
|---|---|
| Model Type | Sentence Transformer |
| Base model | sentence-transformers/all-MiniLM-L6-v2 |
| Maximum Sequence Length | 256 tokens |
| Output Dimensionality | 384 dimensions |
| Similarity Function | Cosine Similarity |
#### Model Sources

#### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
## Technical Details
The model is built on the Sentence Transformers framework, which uses a pre-trained Transformer (here, a BertModel) to encode text. A mean-pooling layer aggregates the token embeddings into a single sentence embedding, and a normalization layer scales each embedding to unit length, so cosine similarity between sentences can be computed efficiently as a simple dot product.
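The effect of the pooling and normalization stages can be reproduced by hand. The sketch below uses plain PyTorch with random tensors standing in for the BertModel output; it mirrors the mean-pooling and normalization steps listed in the architecture above, but is not the library's internal implementation.

```python
import torch
import torch.nn.functional as F

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average the token embeddings, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).float()       # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)     # (batch, 384)
    counts = mask.sum(dim=1).clamp(min=1e-9)          # (batch, 1)
    return summed / counts

# Illustrative tensors standing in for the Transformer output (batch=2, seq_len=4, dim=384).
token_embeddings = torch.randn(2, 4, 384)
attention_mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])

sentence_embeddings = mean_pool(token_embeddings, attention_mask)
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)  # unit length

# With unit-length vectors, cosine similarity reduces to a dot product.
cosine = sentence_embeddings @ sentence_embeddings.T
print(cosine.shape)  # (2, 2)
```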
## License
No license information provided in the original document.
## Evaluation

### Metrics

#### Semantic Similarity

- Datasets: `dev_evaluation` and `test_evaluation`
- Metrics:
  - Pearson Cosine: 0.5378933775375572
  - Spearman Cosine: 0.6213226022358173
These correlations measure how well the model's cosine similarities track the gold similarity labels on the held-out data; higher values indicate better performance on semantic similarity tasks.
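Metrics of this kind can be computed with the library's `EmbeddingSimilarityEvaluator`. The sketch below uses placeholder sentence pairs and gold scores, since the original `dev_evaluation` and `test_evaluation` datasets are not included in this card.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("anass1209/resume-job-matcher-all-MiniLM-L6-v2")

# Placeholder pairs and gold similarity scores in [0, 1]; the actual evaluation data is not published here.
sentences1 = ["Python backend engineer with Django experience.", "Data scientist with an NLP focus."]
sentences2 = ["Senior Django developer, PostgreSQL, REST APIs.", "Frontend developer, React and CSS."]
gold_scores = [0.9, 0.1]

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores, name="dev_evaluation")
results = evaluator(model)
# On recent library versions this returns a dict including Pearson and Spearman cosine correlations.
print(results)
```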