Fine-Tuned LLM for Text-to-SQL Conversion
This model is a version of Qwen/Qwen2.5-3B-Instruct fine-tuned on the gretelai/synthetic_text_to_sql dataset to convert natural language queries into SQL statements.
Quick Start
To use the model for text-to-SQL conversion, load it with the transformers library as shown below:
from transformers import AutoTokenizer, AutoModelForCausalLM
# Download the tokenizer and the fine-tuned weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Ellbendls/Qwen-2.5-3b-Text_to_SQL")
model = AutoModelForCausalLM.from_pretrained("Ellbendls/Qwen-2.5-3b-Text_to_SQL")
# Tokenize the natural language question as PyTorch tensors
query = "What is the total number of hospital beds in each state?"
inputs = tokenizer(query, return_tensors="pt")
# Generate until prompt plus output reach 512 tokens, then decode to text
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Features
- Text-to-SQL Conversion: Converts natural language queries into accurate SQL statements.
- Schema Generation: Generates table schema context when none is provided.
- Optimized for Analytics and Reporting: Handles SQL queries with aggregation, grouping, and filtering.
Installation
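This model uses the transformers library. The examples also request PyTorch tensors (return_tensors="pt"), so PyTorch must be installed as well. You can install both with the following command:
pip install transformers torch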
Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
# Download the tokenizer and the fine-tuned weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Ellbendls/Qwen-2.5-3b-Text_to_SQL")
model = AutoModelForCausalLM.from_pretrained("Ellbendls/Qwen-2.5-3b-Text_to_SQL")
# Tokenize the natural language question as PyTorch tensors
query = "What is the total number of hospital beds in each state?"
inputs = tokenizer(query, return_tensors="pt")
# Generate until prompt plus output reach 512 tokens, then decode to text
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Advanced Usage
# Reuses the tokenizer and model loaded in Basic Usage
complex_query = "Find the top 5 states with the highest average number of hospital beds per county, excluding states with less than 10 counties."
inputs = tokenizer(complex_query, return_tensors="pt")
# A larger max_length leaves room for a longer schema and query
outputs = model.generate(**inputs, max_length=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
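With a causal language model, generate returns the prompt tokens followed by the newly generated ones, so the decoded text in the examples above starts with the question itself. The sketch below shows one way to keep only the continuation and to bound the output with max_new_tokens instead of max_length. The helper name generate_sql and its default of 256 new tokens are illustrative assumptions, not part of this model's documented interface.

```python
def generate_sql(model, tokenizer, question, max_new_tokens=256):
    """Return only the text generated after the prompt.

    Hypothetical helper: `model` and `tokenizer` are the objects loaded
    with from_pretrained in the examples above.
    """
    inputs = tokenizer(question, return_tensors="pt")
    # max_new_tokens limits generated tokens only, regardless of prompt length
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The first prompt_len positions of the output are the prompt itself
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
```

It would be called as generate_sql(model, tokenizer, complex_query) once the model and tokenizer are loaded.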
Example Output
Input:
What is the total number of hospital beds in each state?
Output:
Context:
CREATE TABLE Beds (State VARCHAR(50), Beds INT);
INSERT INTO Beds (State, Beds) VALUES ('California', 100000), ('Texas', 85000), ('New York', 70000);
SQL Query:
SELECT State, SUM(Beds) FROM Beds GROUP BY State;
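If the generated text follows the layout shown above, the schema context and the SQL query can be separated by splitting on the "SQL Query:" label. This is a minimal sketch that assumes those two labels appear exactly as in the example; split_model_output is a hypothetical helper name, and real outputs may deviate from this layout.

```python
def split_model_output(text):
    """Split generated text into (context, sql) using the labels from the example output."""
    head, sep, tail = text.partition("SQL Query:")
    if not sep:
        # No label found: treat the whole text as SQL with no schema context
        return "", text.strip()
    context = head.replace("Context:", "", 1).strip()
    return context, tail.strip()


sample = """Context:
CREATE TABLE Beds (State VARCHAR(50), Beds INT);
INSERT INTO Beds (State, Beds) VALUES ('California', 100000), ('Texas', 85000), ('New York', 70000);
SQL Query:
SELECT State, SUM(Beds) FROM Beds GROUP BY State;"""

context, sql = split_model_output(sample)
print(sql)
```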
Documentation
Model Details
Model Description
This model has been fine-tuned to generate SQL queries from natural language prompts. When a prompt does not include a table schema, the model is trained to generate schema definitions (for example, a CREATE TABLE statement with sample INSERT rows) along with the SQL query, so it can be used both with and without schema context.
Training Details
Dataset
The model was fine-tuned on the gretelai/synthetic_text_to_sql dataset, which pairs diverse natural language queries with SQL queries and optional schema contexts.
Technical Details
This model is a fine-tuned version of Qwen/Qwen2.5-3B-Instruct. Fine-tuning was performed on the gretelai/synthetic_text_to_sql dataset to enhance its Text-to-SQL conversion capabilities.
License
This project is licensed under the MIT License.
Important Note
- Complex Queries: The model may struggle with highly nested or advanced SQL tasks.
- Non-English Prompts: It is optimized for English only.
- Context Dependence: The model may generate incorrect schemas without explicit instructions.
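Given these limitations, it can help to execute generated SQL in a throwaway database before relying on it. The sketch below uses Python's built-in sqlite3 module with an in-memory database. It assumes the generated statements are valid in SQLite's dialect, which may not hold for every query the model produces; check_sql is a hypothetical helper name.

```python
import sqlite3


def check_sql(context_sql, query_sql):
    """Run schema statements and a query in an in-memory SQLite database.

    Returns (rows, None) on success or (None, error_message) on failure.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(context_sql)  # CREATE TABLE / INSERT statements
        rows = conn.execute(query_sql).fetchall()
        return rows, None
    except sqlite3.Error as exc:
        return None, str(exc)
    finally:
        conn.close()


context_sql = (
    "CREATE TABLE Beds (State VARCHAR(50), Beds INT);"
    "INSERT INTO Beds (State, Beds) VALUES "
    "('California', 100000), ('Texas', 85000), ('New York', 70000);"
)
rows, error = check_sql(context_sql, "SELECT State, SUM(Beds) FROM Beds GROUP BY State;")
print(rows if error is None else error)
```

A query that executes without error can still be semantically wrong, so this catches syntax and schema mismatches only.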
Usage Tip
When using the model, try to provide clear and specific natural language queries to get more accurate SQL statements. If possible, provide relevant table schema context to improve the quality of the generated results.
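One way to supply schema context is to prepend the table definitions to the question. The prompt layout below is an assumption for illustration: the "Context:" label mirrors the example output, while the "Question:" label and the helper name build_prompt are hypothetical, and the exact format used during fine-tuning is not documented here.

```python
def build_prompt(question, schema=None):
    """Combine an optional schema with the question (hypothetical prompt layout)."""
    question = question.strip()
    if not schema:
        # No schema supplied: send the question on its own
        return question
    return f"Context:\n{schema.strip()}\n\nQuestion:\n{question}"


schema = "CREATE TABLE Beds (State VARCHAR(50), Beds INT);"
prompt = build_prompt("What is the total number of hospital beds in each state?", schema)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of query in the usage examples.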