Self-BioRAG 7B OLAPH

Developed by dmis-lab
A fine-tuned version of Minbyul/selfbiorag-7b-wo-kqa_golden-iter-dpo-step3-filtered, trained with Direct Preference Optimization (DPO) on the Hugging Face MedLFQA dataset (excluding the kqa_golden subset)
Downloads: 20
Release Date: 5/22/2024
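
The description above mentions training on MedLFQA with the kqa_golden subset held out. Below is a minimal sketch of assembling such data with the `datasets` library; the Hub id "dmis-lab/MedLFQA" and the subset names are assumptions, so check the actual dataset card before running.

```python
# Sketch of loading MedLFQA subsets; dataset id and subset names are assumed.
from datasets import concatenate_datasets, load_dataset

# kqa_golden is deliberately left out, matching the description above.
subset_names = ["live_qa", "medication_qa", "healthsearch_qa", "kqa_silver"]
parts = [load_dataset("dmis-lab/MedLFQA", name=n, split="train") for n in subset_names]
train_data = concatenate_datasets(parts)
print(train_data)
```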

Model Overview

This model is a 7B-parameter language model fine-tuned with Direct Preference Optimization (DPO) and specialized for medical question-answering tasks. DPO optimizes response quality directly on preference pairs, serving as a lighter-weight alternative to reinforcement learning from human feedback.
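
A minimal inference sketch using Hugging Face transformers is shown below. The repository id "dmis-lab/selfbiorag_7b_olaph" is an assumption; substitute the model's actual Hub id.

```python
# Minimal inference sketch; the repository id is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dmis-lab/selfbiorag_7b_olaph"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

question = "What are the common side effects of metformin?"
inputs = tokenizer(question, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```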

Model Features

Direct Preference Optimization
Fine-tuned with the DPO algorithm so the model learns to prefer high-quality responses over lower-quality alternatives (see the training sketch after this list)
Medical Domain Specialization
Trained on medical QA datasets, making it well suited to professional medical questions
Multi-GPU Training
Trained in a distributed setup across 4 GPUs to improve training efficiency (a matching launch command follows the sketch below)
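
A minimal sketch of the DPO fine-tuning step using the TRL library. The hyperparameters, the preference file medlfqa_preferences.json, and its "prompt"/"chosen"/"rejected" fields are illustrative assumptions, not the authors' exact recipe.

```python
# Sketch of DPO fine-tuning with TRL; all hyperparameters are assumed.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "Minbyul/selfbiorag-7b-wo-kqa_golden-iter-dpo-step3-filtered"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Preference pairs: each record holds "prompt", "chosen", and "rejected"
# (hypothetical local file, shown for illustration only).
train_dataset = load_dataset("json", data_files="medlfqa_preferences.json", split="train")

config = DPOConfig(
    output_dir="selfbiorag-7b-olaph-dpo",
    beta=0.1,                       # weight of the implicit KL penalty (assumed)
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # older TRL versions take tokenizer= instead
)
trainer.train()
```

To match the 4-GPU setup mentioned above, the same script (here assumed to be saved as train_dpo.py) can be launched with `accelerate launch --num_processes 4 train_dpo.py` or `torchrun --nproc_per_node 4 train_dpo.py`.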

Model Capabilities

Medical question answering
Domain-specific text generation
Preference learning

Use Cases

Healthcare
Medical Knowledge QA System
Building an intelligent assistant capable of answering professional medical questions
Performs well on the MedLFQA benchmark
Medical Education Tool
QA system for medical student education and training