🚀 Document Question Answering Model - Kaleidoscope_large_v1
This model is a fine-tuned version of sberbank-ai/ruBert-large, specialized for document question answering. It extracts answers from a given document context.
🚀 Quick Start
Kaleidoscope_large_v1 is sberbank-ai/ruBert-large fine-tuned on a custom JSON dataset of context, question, and answer triples. Given a question and a document context, it extracts the answer span from that context.
✨ Features
- Objective: Extract answers from documents according to user questions.
- Base Model: sberbank-ai/ruBert-large.
- Dataset: A custom JSON file with fields: context, question, and answer.
- Preprocessing: Concatenate the question and the document context as input to guide the model to focus on relevant segments.
- Training Settings:
  - Number of epochs: 20.
  - Batch size: 4 per device.
  - Warmup steps: 10% of total training steps.
  - FP16 training: Enabled if CUDA is available.
- Hardware: Trained on a single RTX 3070.
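To make the training settings above concrete, here is a minimal sketch of how the step counts follow from them. The dataset size (1000 examples) is hypothetical, the function names are illustrative, and the sketch assumes a single device, no gradient accumulation, and a linear warmup shape; none of these are stated in the model card.

```python
import math


def training_steps(num_examples, epochs=20, batch_size=4, warmup_ratio=0.1):
    """Return (total_steps, warmup_steps).

    Assumes one device and no gradient accumulation, so one optimizer
    step is taken per batch.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = round(total_steps * warmup_ratio)
    return total_steps, warmup_steps


def linear_warmup_factor(step, warmup_steps):
    """Learning-rate multiplier that ramps linearly from 0 to 1.

    The linear shape is an assumption; the card only gives the ratio.
    """
    if warmup_steps == 0:
        return 1.0
    return min(1.0, step / warmup_steps)


# Hypothetical dataset of 1000 (context, question, answer) triples.
total, warmup = training_steps(num_examples=1000)
print(f"total steps: {total}, warmup steps: {warmup}")
```

With a batch size of 4, the number of optimizer steps grows quickly with dataset size, which is why the warmup is expressed as a fraction of the total rather than as a fixed count.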
📚 Documentation
The model was fine-tuned using the Transformers library with a custom training pipeline. Key aspects of the training process are as follows:
- Custom Dataset: A loader reads a JSON file containing context, question, and answer triples.
- Feature Preparation: The script tokenizes the document and question with a sliding window approach to handle long texts.
- Training Process: The script trains with mixed precision and the AdamW optimizer.
- Evaluation and Checkpointing: The training script evaluates model performance on a validation set, saves checkpoints, and uses early stopping based on validation loss.
This model is suited to interactive document question answering in applications such as customer support, document search, and automated Q&A systems.
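The sliding-window feature preparation mentioned above can be illustrated with a small, library-free sketch. It is not the author's training script: the function name `sliding_windows` and the overlap of 128 tokens are assumptions, and only the 384-token limit comes from the usage example in this card. In Transformers, comparable behavior is usually obtained by calling a fast tokenizer with `truncation="only_second"`, `stride=...`, and `return_overflowing_tokens=True`.

```python
def sliding_windows(context_tokens, question_len, max_length=384, stride=128):
    """Split a tokenized context into overlapping windows.

    Each window leaves room for the question and three special tokens
    ([CLS] question [SEP] context [SEP]). `stride` is the number of
    tokens shared by consecutive windows (an assumed value).
    Returns a list of (start_offset, window_tokens) pairs.
    """
    budget = max_length - question_len - 3
    if budget <= 0:
        raise ValueError("question leaves no room for context tokens")
    if stride >= budget:
        raise ValueError("stride must be smaller than the context budget")

    windows = []
    start = 0
    while True:
        end = min(start + budget, len(context_tokens))
        windows.append((start, context_tokens[start:end]))
        if end == len(context_tokens):
            break
        # Advance so that the next window overlaps this one by `stride`.
        start += budget - stride
    return windows


# Toy example: integer ids stand in for real token ids.
tokens = list(range(1000))
for offset, window in sliding_windows(tokens, question_len=10):
    print(offset, len(window))
```

The overlap means an answer that straddles a window boundary still appears whole in at least one window, provided it is shorter than the overlap.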
⚠️ Important Note
The model is intended primarily for Russian texts. It also accepts English input, but its English performance has not been tested.
💻 Usage Examples
Basic Usage
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("LaciaStudio/Kaleidoscope_large_v1")
model = AutoModelForQuestionAnswering.from_pretrained("LaciaStudio/Kaleidoscope_large_v1")
model.to(device)
model.eval()

# Load the document that questions will be asked about.
file_path = input("Enter document path: ")
with open(file_path, "r", encoding="utf-8") as f:
    context = f.read()

while True:
    question = input("Enter question (or 'exit' to quit): ")
    if question.lower() == "exit":
        break
    # Encode the question and context as one pair; anything beyond 384 tokens is truncated.
    inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=384)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    # Inference only, so gradients are not needed.
    with torch.no_grad():
        outputs = model(**inputs)
    # Take the most likely start and end token positions independently.
    start_index = torch.argmax(outputs.start_logits)
    end_index = torch.argmax(outputs.end_logits)
    answer_tokens = inputs["input_ids"][0][start_index:end_index + 1]
    answer = tokenizer.decode(answer_tokens, skip_special_tokens=True)
    print("Answer:", answer)
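Taking the two argmax positions independently, as the loop above does, can produce an end index that precedes the start index, in which case the decoded answer is empty. One common remedy, offered here as a sketch and not as the author's method, is to score candidate (start, end) pairs jointly and keep the best valid one. The names `best_span`, `max_answer_len`, and `top_k` are illustrative, and the default values are assumptions.

```python
def best_span(start_logits, end_logits, max_answer_len=30, top_k=20):
    """Return (start, end, score) for the best valid answer span, or None.

    A span is valid when start <= end and it covers at most
    `max_answer_len` tokens. Only the `top_k` highest-scoring start and
    end positions are considered, which keeps the search small.
    """
    def top_indices(logits):
        order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        return order[:top_k]

    best = None
    for s in top_indices(start_logits):
        for e in top_indices(end_logits):
            if e < s or e - s + 1 > max_answer_len:
                continue
            score = start_logits[s] + end_logits[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    return best


# Toy logits where the independent argmax picks start=1 and end=0,
# which is not a valid span.
start = [0.1, 5.0, 0.2, 0.3]
end = [4.0, 0.1, 3.0, 0.2]
print(best_span(start, end))
```

To use this with the loop above, the logits would first be converted to plain lists, for example with `outputs.start_logits[0].tolist()`.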
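Example answers
RU
Context: Альберт Эйнштейн разработал теорию относительности. (Albert Einstein developed the theory of relativity.)
Question: Кто разработал теорию относительности? (Who developed the theory of relativity?)
Answer: альберт эинштеин
EN
Context: I had a red car.
Question: What kind of car did I have?
Answer: a red car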
📄 License
This model is licensed under cc-by-nc-4.0.
Finetuned by LaciaStudio | LaciaAI
| Property | Details |
|---|---|
| Pipeline Tag | document-question-answering |
| Tags | DocumentQA, QuestionAnswering, NLP, DeepLearning, Transformers, Multimodal, HuggingFace, ruBert, MachineLearning, DeepQA, AIForDocs, Docs, NeuralNetworks, torch, pytorch, large, text-generation-inference |
| Library Name | transformers |
| Metrics | accuracy, f1, recall, exact_match, precision |
| Base Model | ai-forever/ruBert-large |
