Firefunction V2 GGUF

Developed by MaziyarPanahi

FireFunction V2 is a state-of-the-art function calling model developed by Fireworks AI with a commercially viable license. It is trained on Llama 3 and supports parallel function calls with strong instruction-following capabilities.

Large Language Model #Function Call Optimization #Parallel Function Processing #Low-Cost High-Efficiency Inference

Downloads 1.6M

Release Time : 6/19/2024

Model Overview

FireFunction V2 is a high-performance function calling model that retains Llama 3's conversational and instruction-following abilities, competing fiercely with GPT-4o in function calling.

Model Features

High-Performance Function Calling

Competes fiercely with GPT-4o in function calling, scoring 0.81 vs. 0.80 in multiple public evaluations.

Based on Llama 3

Retains Llama 3's conversational and instruction-following abilities, scoring 0.84 on MT bench compared to Llama 3's 0.89.

Parallel Function Calling

Supports parallel function calls with significant quality improvements over FireFunction v1.

Low Cost and High Efficiency

Hosted on the Fireworks platform, costing less than 10% of GPT-4o while being twice as fast.

Model Capabilities

Text Generation

Function Calling

Conversation

Instruction Following

Use Cases

Conversation Systems

Smart Customer Service

Used to build efficient smart customer service systems.

Delivers high-quality conversational experiences.

Function Calling

API Integration

Used to build complex API integration systems.

Efficiently and accurately calls external functions.

🚀 [MaziyarPanahi/firefunction-v2-GGUF]

This repository contains GGUF format model files for the [fireworks-ai/firefunction-v2] model, enabling efficient text generation and function calling.

🚀 Quick Start

This section provides a high - level overview of the model and its usage. For detailed information, please refer to the subsequent sections.

✨ Features

Quantized Formats: Supports multiple quantization levels including 2 - bit, 3 - bit, 4 - bit, 5 - bit, 6 - bit, and 8 - bit, which can significantly reduce memory usage and improve inference speed.
GGUF Compatibility: Uses the GGUF format, a new and efficient format introduced by the llama.cpp team, replacing the deprecated GGML format.
Function Calling: Enables function calling, a powerful feature for interacting with external APIs and performing complex tasks.
Conversational Capability: Retains good conversation and instruction - following capabilities, making it suitable for chat - based applications.
Cost - Effective: Hosted on the Fireworks platform at < 10% of the cost of GPT 4o and 2x the speed.

📦 Installation

No specific installation steps are provided in the original README. If you want to use the model, you need to refer to the supported clients and libraries mentioned below.

💻 Usage Examples

No code examples are provided in the original README. You can use the model through the following supported clients and libraries:

Supported Clients and Libraries

llama.cpp. The source project for GGUF. Offers a CLI and a server option.
[llama - cpp - python](https://github.com/abetlen/llama - cpp - python), a Python library with GPU accel, LangChain support, and OpenAI - compatible API server.
LM Studio, an easy - to - use and powerful local GUI for Windows and macOS (Silicon), with GPU acceleration. Linux available, in beta as of 27/11/2023.
[text - generation - webui](https://github.com/oobabooga/text - generation - webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for story telling.
GPT4All, a free and open source local running GUI, supporting Windows, Linux and macOS with full GPU accel.
[LoLLMS Web UI](https://github.com/ParisNeo/lollms - webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
Faraday.dev, an attractive and easy to use character - based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.
ctransformers, a Python library with GPU accel, LangChain support, and OpenAI - compatible AI server. Note, as of time of writing (November 27th 2023), ctransformers has not been updated in a long time and does not support many recent models.

📚 Documentation

Model Information

Property	Details
Model Name	MaziyarPanahi/firefunction - v2 - GGUF
Base Model	fireworks - ai/firefunction - v2
Model Creator	[fireworks - ai](https://huggingface.co/fireworks - ai)
Pipeline Tag	text - generation
Quantized By	MaziyarPanahi
License	llama3

About GGUF

GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.

Comparison with Other Models

Function - Calling Performance: Competitive with GPT - 4o at function - calling, scoring 0.81 vs 0.80 on a medley of public evaluations.
Conversation and Instruction - Following: Trained on Llama 3 and retains Llama 3’s conversation and instruction - following capabilities, scoring 0.84 vs Llama 3’s 0.89 on MT bench.
Quality Improvement: Significant quality improvements over FireFunction v1 across the broad range of metrics.

General Info

Successor Model: It is the successor of the [FireFunction](https://fireworks.ai/models/fireworks/firefunction - v1) model.
Function Calling Support: Supports parallel function calling (unlike FireFunction v1) and has good instruction following.
Hosting Platform: Hosted on the [Fireworks](https://fireworks.ai/models/fireworks/firefunction - v2) platform at < 10% of the cost of GPT 4o and 2x the speed.

🔧 Technical Details

No specific technical details are provided in the original README.

📄 License

The model is licensed under the llama3 license.

Special thanks

🙏 Special thanks to Georgi Gerganov and the whole team working on llama.cpp for making all of this possible.

Original README

FireFunction V2: Fireworks Function Calling Model

[Try on Fireworks](https://fireworks.ai/models/fireworks/firefunction - v2) | [API Docs](https://readme.fireworks.ai/docs/function - calling) | [Demo App](https://functional - chat.vercel.app/) | Discord

FireFunction is a state - of - the - art function calling model with a commercially viable license. View detailed info in our [announcement blog](https://fireworks.ai/blog/firefunction - v2 - launch - post).

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご