BERT-Base Turkish Cased NLI Mean FAQ MNR Open-Source Model - Powering Answer Retrieval in Turkish Q&A Systems

BERT Base Turkish Cased NLI Mean FAQ MNR

Developed by mys

This is a BERT model fine-tuned for Turkish FAQ retrieval tasks, capable of mapping questions and answers into 768-dimensional vectors for answer retrieval in Q&A systems.

Text Embedding

Transformers

#Turkish FAQ Retrieval #Q&A Matching Optimization #Enterprise Car Rental Scenario


Release Time: 3/2/2022

Model Overview

The model is based on dbmdz/bert-base-turkish-cased and has undergone two-stage fine-tuning for natural language inference and FAQ retrieval tasks. Special tokens <Q> and <A> are added to distinguish between question and answer inputs.

Model Features

Special Token Handling

Added <Q> and <A> special tokens to distinguish between question and answer inputs, improving matching accuracy.

Dual Fine-tuning

Fine-tuned first on natural language inference (NLI), then fine-tuned again for FAQ retrieval.

Efficient Retrieval

Achieves fast matching by calculating cosine similarity between question and answer vectors.
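As a rough illustration of the cosine-similarity matching described above, here is a minimal pure-Python sketch. The helper names (l2_normalize, cosine_similarity, rank_answers) and the toy 3-dimensional vectors are assumptions made for this example; they stand in for the model's 768-dimensional embeddings and are not part of the model or its repository.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine_similarity(u, v):
    """Cosine similarity is the dot product of the two unit-length vectors."""
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))

def rank_answers(question_vec, answer_vecs):
    """Return (answer_index, similarity) pairs, best match first."""
    scored = [(i, cosine_similarity(question_vec, a)) for i, a in enumerate(answer_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors (hypothetical); real embeddings would come from the model.
question = [1.0, 0.0, 1.0]
answers = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_answers(question, answers)
```

Because the vectors are normalized first, the answer pointing in the same direction as the question ranks highest regardless of its length.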

Model Capabilities

Turkish Text Understanding

Semantic Similarity Calculation

FAQ Answer Retrieval

Q&A System Support

Use Cases

Customer Service

Automated FAQ Response System

Used in enterprise customer service systems to automatically answer frequently asked questions.

The usage example below matches car rental questions to their correct answers.

Education

Learning Q&A Bot

Helps students quickly find answers to course-related questions.

🚀 BERT Base Turkish Cased NLI Mean FAQ MNR

This project is a fine-tuned model for FAQ retrieval. It maps questions and answers to 768-dimensional vectors, which can be used in FAQ-style chatbots and for answer retrieval in question-answering pipelines. Google supported this work by providing Google Cloud credit. Thank you, Google, for supporting open source! 🎉

🚀 Quick Start

This model is a finetuned version of mys/bert-base-turkish-cased-nli-mean for FAQ retrieval. The base model mys/bert-base-turkish-cased-nli-mean is itself a finetuned version of dbmdz/bert-base-turkish-cased for NLI.

✨ Features

Maps questions and answers to 768-dimensional vectors.
Suitable for FAQ-style chatbots and answer retrieval in question-answering pipelines.
Trained with a Multiple Negatives Symmetric Ranking loss on the Turkish subset of the clips/mqa dataset, after cleaning and filtering.
Two special tokens (<Q> for questions and <A> for answers) were added to the tokenizer before finetuning, and the model embeddings were resized accordingly.
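To give a feel for the training objective named above, the following is a minimal pure-Python sketch of a multiple negatives symmetric ranking loss as it is commonly defined: within a batch, each question should score its own answer above every other answer, and each answer should score its own question above every other question. This is not the author's training code; the function names and the scale value of 20.0 are assumptions chosen for illustration.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]

def cross_entropy_diagonal(scores):
    """Mean cross-entropy where the correct column for row i is column i."""
    return -sum(log_softmax(row)[i] for i, row in enumerate(scores)) / len(scores)

def symmetric_ranking_loss(sim, scale=20.0):
    """sim[i][j] is the cosine similarity between question i and answer j.

    The scale factor is an assumed illustrative value, not taken from the source.
    """
    scaled = [[scale * s for s in row] for row in sim]
    transposed = [list(col) for col in zip(*scaled)]
    question_to_answer = cross_entropy_diagonal(scaled)
    answer_to_question = cross_entropy_diagonal(transposed)
    return (question_to_answer + answer_to_question) / 2.0

# Hypothetical similarity matrices for a batch of two question-answer pairs.
well_matched = [[0.9, 0.1], [0.2, 0.8]]   # matching pairs on the diagonal score highest
mismatched = [[0.1, 0.9], [0.8, 0.2]]     # matching pairs score lowest
loss_good = symmetric_ranking_loss(well_matched)
loss_bad = symmetric_ranking_loss(mismatched)
```

The loss is small when the matching pairs dominate their rows and columns, and large when other items in the batch outscore them.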

💻 Usage Examples

Basic Usage

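import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

# Model id assumed from the model name on this page; adjust it if the Hub id differs.
model_name = "mys/bert-base-turkish-cased-nli-mean-faq-mnr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModel.from_pretrained(model_name)

questions = [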
    "Merhaba",
    "Nasılsın?",
    "Bireysel araç kiralama yapıyor musunuz?",
    "Kurumsal araç kiralama yapıyor musunuz?"
]

answers = [
    "Merhaba, size nasıl yardımcı olabilirim?",
    "İyiyim, teşekkür ederim. Size nasıl yardımcı olabilirim?",
    "Hayır, sadece Kurumsal Araç Kiralama operasyonları gerçekleştiriyoruz. Size başka nasıl yardımcı olabilirim?",
    "Evet, kurumsal araç kiralama hizmetleri sağlıyoruz. Size nasıl yardımcı olabilirim?"
]


# Prefix each input with the special token the model was fine-tuned with.
questions = ["<Q>" + q for q in questions]
answers = ["<A>" + a for a in answers]


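def answer_faq(model, tokenizer, questions, answers, return_similarities=False):
    """Rank the answers for a single question (pass it as a one-element list).

    Scores are a softmax over cosine similarities by default; pass
    return_similarities=True to report the raw cosine similarities instead.
    """
    q_len = len(questions)
    # Encode questions and answers together in one padded batch.
    tokens = tokenizer(questions + answers, padding=True, return_tensors='tf')
    embs = model(**tokens)[0]  # token-level embeddings: (batch, seq_len, 768)

    # Mean-pool over real tokens only, using the attention mask to skip padding.
    attention_masks = tf.cast(tokens['attention_mask'], tf.float32)
    sample_length = tf.reduce_sum(attention_masks, axis=-1, keepdims=True)
    masked_embs = embs * tf.expand_dims(attention_masks, axis=-1)
    masked_embs = tf.reduce_sum(masked_embs, axis=1) / tf.cast(sample_length, tf.float32)

    # L2-normalize so the matrix product below yields cosine similarities.
    a = tf.math.l2_normalize(masked_embs[:q_len, :], axis=1)
    b = tf.math.l2_normalize(masked_embs[q_len:, :], axis=1)
    similarities = tf.matmul(a, b, transpose_b=True)

    scores = similarities if return_similarities else tf.nn.softmax(similarities)
    results = list(zip(answers, scores.numpy().squeeze().tolist()))
    sorted_results = sorted(results, key=lambda x: x[1], reverse=True)
    sorted_results = [{"answer": answer.replace("<A>", ""), "score": f"{score:.4f}"} for answer, score in sorted_results]
    return sorted_results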


for question in questions:
    results = answer_faq(model, tokenizer, [question], answers)
    print(question.replace("<Q>", ""))
    print(results)
    print("---------------------")
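The pooling step inside answer_faq averages token embeddings while ignoring padding positions. The following pure-Python sketch shows the same idea on toy data; the function name masked_mean_pool and the 2-dimensional vectors are assumptions for illustration, and the sketch assumes every input has at least one real token.

```python
def masked_mean_pool(token_embeddings, attention_mask):
    """Average the embeddings of real tokens; a mask value of 0 marks padding."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for emb, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for d in range(dim):
                total[d] += emb[d]
    # Assumes count > 0, i.e. the input contains at least one real token.
    return [t / count for t in total]

# Two real tokens followed by one padding token (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean_pool(tokens, mask)
```

The padding vector does not affect the result, which is why sentences of different lengths can share one padded batch.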

Advanced Usage

The above code shows a basic way to use the model for FAQ retrieval. You can further customize it according to your specific requirements, such as integrating it into a more complex chatbot system.

The basic usage code above prints the following output:

Merhaba
[{'answer': 'Merhaba, size nasıl yardımcı olabilirim?', 'score': '0.2931'}, {'answer': 'İyiyim, teşekkür ederim. Size nasıl yardımcı olabilirim?', 'score': '0.2751'}, {'answer': 'Hayır, sadece Kurumsal Araç Kiralama operasyonları gerçekleştiriyoruz. Size başka nasıl yardımcı olabilirim?', 'score': '0.2200'}, {'answer': 'Evet, kurumsal araç kiralama hizmetleri sağlıyoruz. Size nasıl yardımcı olabilirim?', 'score': '0.2118'}]
---------------------
Nasılsın?
[{'answer': 'İyiyim, teşekkür ederim. Size nasıl yardımcı olabilirim?', 'score': '0.2808'}, {'answer': 'Merhaba, size nasıl yardımcı olabilirim?', 'score': '0.2623'}, {'answer': 'Hayır, sadece Kurumsal Araç Kiralama operasyonları gerçekleştiriyoruz. Size başka nasıl yardımcı olabilirim?', 'score': '0.2320'}, {'answer': 'Evet, kurumsal araç kiralama hizmetleri sağlıyoruz. Size nasıl yardımcı olabilirim?', 'score': '0.2249'}]
---------------------
Bireysel araç kiralama yapıyor musunuz?
[{'answer': 'Hayır, sadece Kurumsal Araç Kiralama operasyonları gerçekleştiriyoruz. Size başka nasıl yardımcı olabilirim?', 'score': '0.2861'}, {'answer': 'Evet, kurumsal araç kiralama hizmetleri sağlıyoruz. Size nasıl yardımcı olabilirim?', 'score': '0.2768'}, {'answer': 'İyiyim, teşekkür ederim. Size nasıl yardımcı olabilirim?', 'score': '0.2215'}, {'answer': 'Merhaba, size nasıl yardımcı olabilirim?', 'score': '0.2156'}]
---------------------
Kurumsal araç kiralama yapıyor musunuz?
[{'answer': 'Evet, kurumsal araç kiralama hizmetleri sağlıyoruz. Size nasıl yardımcı olabilirim?', 'score': '0.3060'}, {'answer': 'Hayır, sadece Kurumsal Araç Kiralama operasyonları gerçekleştiriyoruz. Size başka nasıl yardımcı olabilirim?', 'score': '0.2929'}, {'answer': 'İyiyim, teşekkür ederim. Size nasıl yardımcı olabilirim?', 'score': '0.2066'}, {'answer': 'Merhaba, size nasıl yardımcı olabilirim?', 'score': '0.1945'}]
---------------------
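The scores above sit close to one another because they are a softmax over cosine similarities, which lie between -1 and 1, so no score can exceed another by more than a factor of e². The sketch below illustrates this with made-up cosine values (not actual model outputs) and shows how an optional scale factor, a hypothetical parameter that is not part of the original code, would spread the scores without changing their order.

```python
import math

def softmax(values, scale=1.0):
    """Softmax with an optional scale (inverse temperature) applied first."""
    m = max(values)
    exps = [math.exp(scale * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up cosine similarities for four candidate answers.
cosines = [0.80, 0.75, 0.50, 0.45]
plain = softmax(cosines)              # compressed scores, all near 0.25
sharp = softmax(cosines, scale=20.0)  # same ranking, more separated scores
```

For ranking purposes only the order matters, so the compressed scores are not a problem; the scale is relevant only if the scores are shown to users or compared against a threshold.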

Please have a look at my accompanying repo to see how it was finetuned and how it can be used in inference.
