🚀 ClimateGPT-7B
ClimateGPT is a family of AI models crafted to synthesize interdisciplinary research on climate change. ClimateGPT-7B, a 7-billion-parameter transformer decoder model, is adapted from Llama-2 for the climate science domain. It undergoes continued pre-training on 4.2B tokens from climate documents curated by Erasmus AI, followed by instruction fine-tuning on instruction-completion pairs collected by AppTek in collaboration with climate scientists. Notably, it outperforms Llama-2-70B Chat on climate-specific benchmarks. The model is designed to work with retrieval augmentation and cascaded machine translation to enhance knowledge, factuality, and language coverage.
🚀 Quick Start
The model can be used out of the box. For the full system, including cascaded machine translation and retrieval-augmented generation, visit our demo website: eci.io
✨ Features
- Specialized for Climate: Tailored to the climate science domain, outperforming Llama-2-70B Chat on climate-specific benchmarks.
- Enhanced with Augmentation: Designed to work with retrieval augmentation to extend knowledge and cascaded machine translation to increase language coverage.
📦 Installation
No model-specific installation steps are required. In practice, a standard PyTorch plus Hugging Face `transformers` environment should be sufficient, as sketched below.
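The snippet below is a minimal loading sketch, assuming the weights are published on the Hugging Face Hub; the repository ID `eci-io/climategpt-7b` is a placeholder and should be replaced with the actual Hub ID or a local path to the checkpoint.

```python
# Minimal loading sketch (assumption: standard Hugging Face format weights).
# Typical dependencies: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "eci-io/climategpt-7b"  # placeholder ID, adjust to the actual repository or local path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the 7B model within a single modern GPU
    device_map="auto",
)
```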
💻 Usage Examples
Basic Usage
The model can be used as a question-answering model specialized in the climate domain. When prompting, follow the ChatML format:
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>context
[[0]] "{reference1_title}", {reference1_year}
{reference1_text}
[[1]] "{reference2_title}", {reference2_year}
{reference2_text}
[...]<|im_end|>
<|im_start|>assistant
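The template above can be filled in with ordinary string formatting. The following is one possible sketch, assuming a `model` and `tokenizer` loaded as in the installation snippet; the system message, question, and reference are placeholder values.

```python
# Build a ChatML-style prompt following the template above and generate a completion.
# Assumes `model` and `tokenizer` are already loaded as in the installation sketch.
system_message = "You are ClimateGPT, a helpful assistant for climate science questions."
prompt = "What are the main drivers of sea level rise?"
references = [
    ("IPCC AR6 WG1 Summary for Policymakers", 2021, "Global mean sea level rose by ..."),
]

context = "\n".join(
    f'[[{i}]] "{title}", {year}\n{text}' for i, (title, year, text) in enumerate(references)
)

chat = (
    f"<|im_start|>system\n{system_message}<|im_end|>\n"
    f"<|im_start|>user\n{prompt}<|im_end|>\n"
    f"<|im_start|>context\n{context}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

inputs = tokenizer(chat, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens after the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```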
Advanced Usage
For developers, the model can serve as a starting point for further fine-tuning.
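One common way to do this on modest hardware is parameter-efficient fine-tuning with LoRA via the `peft` library. The sketch below is not the authors' training recipe, only an illustration of wrapping the base model; the actual training loop (e.g. with `transformers.Trainer` or `trl`'s `SFTTrainer`) is left out.

```python
# Hypothetical LoRA fine-tuning setup (one possible approach, not the official recipe).
# Assumes `model` is a loaded ClimateGPT-7B causal LM as in the installation sketch.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Llama-style attention projections
    task_type="CAUSAL_LM",
)

peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction of weights is trained

# Train `peft_model` on instruction-completion pairs formatted with the ChatML template above.
```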
📚 Documentation
Model Details
Explore the model lineage here.
| Property | Details |
|---|---|
| Powered by | Erasmus AI |
| Trained with | AppTek |
| Authenticated by | EQTYLab |
| Model type | Decoder-only Transformer |
| Language(s) (NLP) | English |
| License | ClimateGPT Community License |
| Continued pre-trained from | Llama-2-7B |
| Context length | 4K tokens |
| Input | Text-only data |
| Output | Model generates text only |
| Paper | arXiv:2401.09646 |
| Website | eci.io |
Uses
- Question Answering: Intended to be directly used as a question-answering model specialized in the climate domain.
- Feedback for Stakeholders: Aimed at providing useful feedback for decision-makers, scientists, and journalists in climate discussions.
- Fine-Tuning: Can be used as a starting point for developers for further fine-tuning.
Downstream Use
ClimateGPT-7B is an instruction-tuned model for climate-specific question-answering applications. It works well with retrieval augmentation and supports up to 5 references in context.
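Since at most 5 references fit in the context block, retrieved passages need to be trimmed and formatted before prompting. The helper below is illustrative only; the function name, input schema, and truncation policy are assumptions, not part of any official tooling.

```python
# Illustrative helper that turns retrieved documents into the ChatML context block.
# The model card states that up to 5 references are supported, so extras are dropped.
MAX_REFERENCES = 5

def build_context_block(references: list[dict]) -> str:
    """references: list of {"title": str, "year": int, "text": str} dicts."""
    lines = []
    for i, ref in enumerate(references[:MAX_REFERENCES]):
        lines.append(f'[[{i}]] "{ref["title"]}", {ref["year"]}')
        lines.append(ref["text"])
    return "<|im_start|>context\n" + "\n".join(lines) + "<|im_end|>\n"

# Example:
# block = build_context_block([{"title": "IPCC AR6 WG1", "year": 2021, "text": "..."}])
```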
Training
- Llama-2 Training Data: Refer to https://huggingface.co/meta-llama/Llama-2-7b-hf.
- Continued Pre-training: 4.2B climate-specific tokens (tokenized by the Llama tokenizer) are used.
- Instruction Fine-Tuning: About 272K instruction-completion pairs (both in the climate and general domains) are used.
Evaluation
Detailed evaluation results are presented in our paper and on our model card website: [eci.io/model-card](https://eci.io/model-card)
Environmental Impact
| Property | Details |
|---|---|
| Hardware type | 8x NVIDIA H100 HBM |
| Power consumption per GPU | 775 W |
| Hours used | 157 hrs |
| Cloud provider | MLFoundry |
| Compute region | Washington, USA |
| Energy mix | 100% hydro power (24 g CO2eq/kWh according to IPCC 2014) |
| Carbon emitted | 2.9 kg CO2eq |
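As a back-of-the-envelope consistency check, the reported emissions figure follows from the table if the 157 hours are read as total GPU-hours (roughly 19.6 wall-clock hours across the 8 GPUs); this reading is an assumption, not stated explicitly above:

$$
157\ \text{GPU-h} \times 0.775\ \text{kW} \approx 122\ \text{kWh},
\qquad
122\ \text{kWh} \times 24\ \text{g CO}_2\text{eq/kWh} \approx 2.9\ \text{kg CO}_2\text{eq}
$$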
Citation
If you find ClimateGPT useful in your work, please cite it with:
@misc{thulke2024climategpt,
title={ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change},
author={David Thulke and Yingbo Gao and Petrus Pelser and Rein Brune and Rricha Jalota and Floris Fok and Michael Ramos and Ian van Wyk and Abdallah Nasir and Hayden Goldstein and Taylor Tragemann and Katie Nguyen and Ariana Fowler and Andrew Stanco and Jon Gabriel and Jordan Taylor and Dean Moro and Evgenii Tsymbalov and Juliette de Waal and Evgeny Matusov and Mudar Yaghi and Mohammad Shihadah and Hermann Ney and Christian Dugast and Jonathan Dotan and Daniel Erasmus},
year={2024},
eprint={2401.09646},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
⚠️ Important Note
Despite the development team's efforts to eliminate them, this model, like every other chat-capable LLM, may generate biased, offensive, or inaccurate responses.