🚀 CropSeek-LLM
CropSeek-LLM is a fine-tuned language model. It offers insights and recommendations for crop optimization, addressing various agricultural practices such as crop planting, soil conditions, pest control, and irrigation.
🚀 Quick Start
Use the code below to start using the model:
from transformers import AutoModelForCausalLM, AutoTokenizer
# device_map="auto" lets the library place the model on the available device(s).
model = AutoModelForCausalLM.from_pretrained("persadian/CropSeek-LLM", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("persadian/CropSeek-LLM")
input_text = "What is the best planting season for cabbages in South Coast, Durban?"
# Send the inputs to the model's device rather than hard-coding "cuda".
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# max_length counts the prompt tokens as well as the generated tokens.
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
✨ Features
Model Details
CropSeek-LLM is a fine-tuned version of the deepseek-ai/DeepSeek-R1-Distill-Qwen-7B model. It uses LoRA (Low-Rank Adaptation) to fine-tune the base model on a crop-related Q&A dataset. It aims to assist farmers, agronomists, and researchers in crop management.
| Property | Details |
|---|---|
| Developed by | persadian, DARJYO |
| Model Type | Causal Language Model (Fine-tuned with LoRA) |
| Language(s) (NLP) | English |
| License | DARJYO License v1.0 |
| Finetuned from model | deepseek-ai/DeepSeek-R1-Distill-Qwen-7B |
| Hardware used for training | Tesla T4 GPU |
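As a rough illustration of what LoRA changes, the sketch below compares the number of trainable parameters when a single weight matrix is updated directly with the number in a rank-r LoRA update `B @ A`. The matrix dimensions and the rank are illustrative assumptions; this card does not state the LoRA rank or the target modules that were used.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when a d_out x d_in weight matrix is fine-tuned directly."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a LoRA update B @ A, with B: d_out x rank and A: rank x d_in."""
    return rank * (d_out + d_in)


# Hypothetical values for illustration only (not taken from this model card).
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Because the LoRA count grows with `rank * (d_out + d_in)` rather than `d_out * d_in`, the trainable share stays small for low ranks.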
Uses
Direct Use
It can directly answer crop-optimization questions, including optimal planting seasons, ideal soil conditions, pest control methods, irrigation practices, and crop rotation strategies.
Downstream Use
It can be integrated into agricultural advisory systems, mobile apps, or chatbots to offer real-time recommendations.
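A minimal sketch of how such an integration might be structured is shown here. The names `build_prompt`, `advise`, `ADVISORY_NOTE`, and the prompt layout are all hypothetical, and the generator is injected as a plain callable so the sketch runs without downloading the model; in a real service that callable would wrap the tokenizer and `model.generate` call from the Quick Start.

```python
from typing import Callable, Optional

# Hypothetical note attached to every reply; wording is an assumption.
ADVISORY_NOTE = "Verify this recommendation with a local agricultural expert."


def build_prompt(question: str, region: Optional[str] = None) -> str:
    """Combine the user's question with optional regional context."""
    question = question.strip()
    if region:
        return f"Region: {region.strip()}\nQuestion: {question}"
    return f"Question: {question}"


def advise(question: str, generate_fn: Callable[[str], str],
           region: Optional[str] = None) -> dict:
    """Run one advisory request and attach the verification note."""
    if not question.strip():
        raise ValueError("question must not be empty")
    prompt = build_prompt(question, region)
    return {"prompt": prompt, "answer": generate_fn(prompt).strip(), "note": ADVISORY_NOTE}


# Stand-in generator so the sketch runs without the model.
def fake_generate(prompt: str) -> str:
    return f"(model output for: {prompt!r})"


reply = advise("When should I plant cabbages?", fake_generate, region="Durban")
print(reply["answer"])
print(reply["note"])
```

Keeping the generator behind a callable makes it straightforward to swap in a local model, a hosted endpoint, or a test double.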
Out-of-Scope Use
- Medical Advice: Not designed for medical or health-related advice.
- Financial Decisions: Should not be used for financial or investment decisions.
- Non-Agricultural Use: Specifically fine-tuned for crop optimization and may perform poorly in unrelated domains.
Bias, Risks, and Limitations
- Data Bias: The training dataset focuses on specific crops and regions, so the model may not generalize well beyond them.
- Limited Scope: The model is designed for crop optimization and may not provide accurate answers on unrelated topics.
- Ethical Concerns: The model should not replace professional advice from agronomists or other experts.
⚠️ Important Note
Users should verify the model's recommendations with local agricultural experts, keep its limitations in mind, and report any biases or inaccuracies to the developers.
💡 Usage Tip
Use the model as a supplementary tool, not a replacement for professional advice.
Training Details
Training Data
The model was fine-tuned on a curated agricultural text dataset that includes crop descriptions, disease symptoms and treatments, farming techniques, and regional guidelines. The specific dataset used is DARJYO/sawotiQ29_crop_optimization.
Training Procedure
Preprocessing
- Cleaned and preprocessed the dataset to remove irrelevant information and ensure consistency.
- Tokenized text data using the base model's tokenizer.
- Applied data augmentation techniques like synonym replacement and paraphrasing.
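The synonym-replacement step mentioned above could look roughly like the following sketch. The synonym table, the `synonym_replace` function, and its parameters are hypothetical; the card does not describe the actual augmentation code or word lists.

```python
import random
import re

# Hypothetical synonym table; the card does not list the one used in training.
SYNONYMS = {
    "plant": ["sow"],
    "pests": ["insects"],
    "watering": ["irrigation"],
}


def synonym_replace(text: str, prob: float = 0.5, seed: int = 0) -> str:
    """Replace known words with a synonym, each with probability `prob`."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible

    def swap(match):
        word = match.group(0)
        options = SYNONYMS.get(word.lower())
        if options and rng.random() < prob:
            return rng.choice(options)
        return word

    # Only alphabetic runs are candidates; punctuation and spacing are preserved.
    return re.sub(r"[A-Za-z]+", swap, text)


print(synonym_replace("When should I plant cabbages to avoid pests?", prob=1.0))
```

This simple version does not preserve capitalization of replaced words or handle inflected forms, which a production pipeline would likely need.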
Training Hyperparameters
- Training regime: Mixed precision (fp16)
- Batch size: 16
- Learning rate: 2e-5
- Epochs: 3
- Optimizer: AdamW
- Weight decay: 0.01
- Warmup steps: 500
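For readers who want to reuse these settings, the sketch below collects the listed values in a small config object and shows how the learning rate would ramp up over the 500 warmup steps. The `TrainConfig` class and `warmup_lr` function are hypothetical, and the linear warmup followed by a constant rate is an assumption: the card does not state the schedule shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the hyperparameter list above.
    batch_size: int = 16
    learning_rate: float = 2e-5
    epochs: int = 3
    optimizer: str = "adamw"
    weight_decay: float = 0.01
    warmup_steps: int = 500
    fp16: bool = True


def warmup_lr(step: int, cfg: TrainConfig) -> float:
    """Learning rate at `step`, assuming linear warmup then a constant rate."""
    if step < cfg.warmup_steps:
        return cfg.learning_rate * step / cfg.warmup_steps
    return cfg.learning_rate


cfg = TrainConfig()
for step in (0, 250, 500, 1000):
    print(step, warmup_lr(step, cfg))
```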
Speeds, Sizes, Times
- Training time: Approximately 10 hours on a T4 GPU.
- Checkpoint size: 1.5 GB
- Throughput: 120 samples/second
Evaluation
Testing Data, Factors & Metrics
Testing Data
Evaluated on a held-out test set of agricultural queries covering crop identification, disease diagnosis, and farming recommendations. Dataset: https://huggingface.co/datasets/DARJYO/sawotiQ29_crop_optimization
Factors
Evaluation was disaggregated by crop type, disease type, and geographic region.
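Disaggregated evaluation of this kind amounts to grouping per-example results by a factor and computing the metric within each group. The sketch below shows one way to do that; the `accuracy_by_factor` helper, the record fields, and the sample records are made up for illustration and are not the model's evaluation results.

```python
from collections import defaultdict


def accuracy_by_factor(records, factor):
    """Return {factor value: accuracy} for records with a boolean `correct` field."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        key = rec[factor]
        totals[key] += 1
        hits[key] += int(rec["correct"])
    return {key: hits[key] / totals[key] for key in totals}


# Made-up records for illustration only.
records = [
    {"crop": "cabbage", "region": "region_a", "correct": True},
    {"crop": "cabbage", "region": "region_b", "correct": False},
    {"crop": "maize", "region": "region_a", "correct": True},
    {"crop": "maize", "region": "region_a", "correct": True},
]
print(accuracy_by_factor(records, "crop"))
print(accuracy_by_factor(records, "region"))
```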
Metrics
- Accuracy: 92% on crop identification tasks.
- Precision/Recall/F1-score: Precision 0.89, Recall 0.91, F1-score 0.90
- Latency: Average response time of 0.5 seconds on a T4 GPU.
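The F1-score is the harmonic mean of precision and recall, so the three reported figures can be checked against each other. The short sketch below does that, assuming the reported precision and recall form a single aggregated pair (the card does not say how they were averaged).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the metrics list above.
f1 = f1_score(0.89, 0.91)
print(round(f1, 2))
```

Rounded to two decimals, the result agrees with the reported F1-score of 0.90.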
Results
The model achieved high accuracy on crop identification and disease diagnosis tasks. Performance was slightly lower for region-specific recommendations due to limited training data.
Model Examination
The model was examined using interpretability tools such as attention visualization and feature importance analysis. Key findings show that it relies on symptom descriptions for disease diagnosis and crop-specific keywords for crop identification.
Environmental Impact
- Hardware Type: T4 GPU
- Hours used: 10 hours
- Cloud Provider: Google Colab
- Compute Region: us-central1
- Carbon Emitted: Approximately 0.5 kg CO2eq
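Estimates like this are usually obtained by multiplying energy use by the carbon intensity of the grid. The sketch below shows that arithmetic. The card does not say how its figure was derived, so the `estimate_co2_kg` helper and its inputs are assumptions: the 10 hours come from the list above, 70 W is the T4's rated board power, and the carbon intensity is a placeholder value rather than a measured one for the listed region.

```python
def estimate_co2_kg(power_watts: float, hours: float, kg_co2_per_kwh: float) -> float:
    """Energy in kWh times grid carbon intensity gives emissions in kg CO2eq."""
    if min(power_watts, hours, kg_co2_per_kwh) < 0:
        raise ValueError("inputs must be non-negative")
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


# 70 W and 0.5 kg/kWh are illustrative assumptions, not values from the card.
estimate = estimate_co2_kg(power_watts=70, hours=10, kg_co2_per_kwh=0.5)
print(f"{estimate:.2f} kg CO2eq")
```

This sketch is not expected to reproduce the reported figure, which depends on the power draw, overheads, and carbon intensity assumed by whatever calculator was used.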
Technical Specifications
Model Architecture and Objective
- Base model architecture: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Objective: Fine-tuned for text generation and classification tasks in the agricultural domain.
Compute Infrastructure
Hardware
- Training hardware: Google Colab with T4 GPU.
Software
- Frameworks: PyTorch, Hugging Face Transformers.
- Libraries: Datasets, Tokenizers, Accelerate.
Citation
BibTeX:
@misc{cropseek-llm,
author = {persadian~Darshani Persadh, DARJYO},
title = {CropSeek-LLM: A Fine-Tuned Language Model for Agricultural Applications},
year = {2023},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/persadian/CropSeek-LLM}},
}
APA:
persadian. Darshani Persadh (2023). CropSeek-LLM: A Fine-Tuned Language Model for Agricultural Applications. Hugging Face. https://huggingface.co/persadian/CropSeek-LLM
Glossary
- Mixed precision: Training using both 16-bit and 32-bit floating-point numbers to improve efficiency.
More Information
For more details, visit the CropSeek-LLM space on Hugging Face.
Model Card Authors
persadian ~Darshani Persadh
Model Card Contact
info@darjyo.com