Jedi-7B-1080p-GGUF Open Source Model - Used in computer scenarios, capable of text generation from image-text

Jedi 7B 1080p GGUF

Developed by lmstudio-community

An image-text to text generation model based on the Transformer architecture, designed specifically for computer/GUI-related scenarios, with intelligent agent capabilities.

Text-to-Image EnglishOpen Source License:Apache-2.0 #GUI Intelligent Agent #Computer Scene Text Generation #Synthetic Data Training

Downloads 113

Release Time : 6/1/2025

Model Overview

This model is a multimodal generation model that can process image and text inputs and generate text outputs, especially suitable for tasks related to computer operations and graphical user interfaces.

Model Features

Multimodal Capability

Can process both image and text inputs simultaneously and generate relevant text outputs

Computer/GUI Optimization

Specifically optimized for computer usage scenarios and graphical user interface operations

Intelligent Agent Capability

Has a certain degree of autonomous decision-making and task execution capabilities

GGUF Quantization Support

Provides a quantized model version for easy operation on devices with limited resources

Model Capabilities

Image Understanding

Text Generation

GUI Operation Guidance

Computer Task Automation

Use Cases

Computer-assisted Operation

GUI Operation Guidance

Generate operation step instructions based on screenshots

Help users complete complex computer operations

Automated Script Generation

Automatically generate computer operation scripts according to user needs

Improve work efficiency and reduce repetitive labor

Education and Training

Computer Skills Teaching

Teach computer usage skills through image and text interaction

Lower the learning threshold and improve teaching efficiency

🚀 Community Model: Jedi 7B 1080p by Xlangai

Part of the LM Studio Community models highlights program, which showcases new and remarkable models from the community. Join the discussion on Discord.

Model Information

Property	Details
Quantized By	bartowski
Pipeline Tag	image-text-to-text
Base Model	xlangai/Jedi-7B-1080p
Base Model Relation	quantized
License	apache-2.0
Language	en

Model creator: xlangai
Original model: Jedi-7B-1080p
GGUF quantization: provided by bartowski based on llama.cpp release b5524

🔧 Technical Details

Designed for computer/GUI use.
Tuned for agentic capabilities.
Trained from Qwen 2.5 VL on their 4 million synthesized computer use examples.
Reference link: https://osworld-grounding.github.io/

🙏 Special thanks

Special thanks to Georgi Gerganov and the whole team working on llama.cpp for making all of this possible.

🚫 Disclaimers

LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error - free, viruses - free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご