VARCO VISION 14B HF
For Academic Research Only
VARCO-VISION-14B is a powerful English-Korean vision-language model that takes images and text as input and generates text output, with grounding, referring, and OCR capabilities.
Image-to-Text
Transformers · Supports Multiple Languages

NCSOFT · 449 downloads · 24 likes
Llava V1.6 34b
Apache-2.0
LLaVA is an open-source multimodal chatbot fine-tuned from a large language model, supporting conversations that involve both images and text.
Image-to-Text
liuhaotian · 9,033 downloads · 351 likes
Flan T5 Base Arxiv Math Question Answering
Apache-2.0
This FLAN-T5 model was trained on an arXiv math question-answering dataset and specializes in mathematical problem solving; its output is unreliable and it is intended for research purposes only.
Large Language Model
Transformers · English

AlgorithmicResearchGroup · 14 downloads · 1 like
Llava 13b Delta V0
Apache-2.0
LLaVA is an open-source chatbot based on LLaMA/Vicuna and fine-tuned on GPT-generated multimodal instruction-following data; it is a Transformer-based autoregressive language model.
Image-to-Text
Transformers

liuhaotian · 352 downloads · 221 likes
Llama 13b
Other
LLaMA-13B is a 13-billion-parameter large language model released by Meta under a non-commercial license agreement.
Large Language Model
Transformers

huggyllama · 7,426 downloads · 140 likes