Nvidia Llama 3.1 Nemotron 70B Instruct HF AWQ INT4

Developed by ibnzterrell
This is an AWQ 4-bit quantized version of NVIDIA's Llama-3.1-Nemotron-70B-Instruct model, which NVIDIA customized from Meta's Llama-3.1-70B-Instruct to improve the helpfulness of generated responses.
Downloads 206
Release Time: 10/24/2024

Model Overview

This is a large language model tuned to give helpful, high-quality answers. It supports multiple languages and is suited to text generation tasks.

Model Features

High-performance quantization
Quantized from FP16 to INT4 using AutoAWQ, employing GEMM kernels, zero-point quantization, and a group size of 128 to optimize inference efficiency.
Multilingual support
Supports multiple languages, including English, German, French, and Spanish, making it suitable for international applications.
Reinforcement learning alignment
The base model was aligned with RLHF using HelpSteer2-Preference prompts to make generated responses more helpful.
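
To make the quantization settings listed under High-performance quantization concrete, the following is a minimal Python sketch of group-wise zero-point quantization to 4 bits with a group size of 128. It is not AutoAWQ's implementation: it omits AWQ's activation-aware scaling and the packed GEMM kernel layout, and the function names (quantize_group, dequantize_group, quantize_row, effective_bits_per_weight) are hypothetical. It also assumes each group contains both negative and positive values and that per-group scales are stored in 16 bits.

```python
import math


def quantize_group(weights, n_bits=4):
    """Zero-point (asymmetric) quantization of one group of floats.

    Assumes the group spans zero, which is typical for weight matrices.
    """
    q_max = 2 ** n_bits - 1                      # 15 for INT4
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / q_max
    if scale == 0.0:                             # guard against a constant group
        scale = 1.0
    zero_point = max(0, min(q_max, round(-w_min / scale)))
    q = [max(0, min(q_max, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize_group(q, scale, zero_point):
    """Map integer codes back to approximate float weights."""
    return [(v - zero_point) * scale for v in q]


def quantize_row(row, group_size=128, n_bits=4):
    """Quantize a weight row in independent groups of `group_size` values."""
    return [
        quantize_group(row[start:start + group_size], n_bits)
        for start in range(0, len(row), group_size)
    ]


def effective_bits_per_weight(group_size=128, n_bits=4, scale_bits=16):
    """Storage cost per weight, counting one scale and one zero point per group.

    scale_bits=16 is an assumption (FP16 scales), not taken from the model card.
    """
    return n_bits + (scale_bits + n_bits) / group_size


# Illustrative data only: a synthetic 256-value row, i.e. two groups of 128.
row = [math.sin(i) * 0.1 for i in range(256)]
groups = quantize_row(row)
bits = effective_bits_per_weight()
```

Under these assumptions, each group carries its own scale and zero point, which adds 20 bits per 128 weights, so storage works out to about 4.16 bits per weight rather than exactly 4.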

Model Capabilities

Text generation
Multilingual support
Dialogue systems

Use Cases

Dialogue systems
Intelligent customer service
Used to build multilingual intelligent customer service systems, providing high-quality responses.
NVIDIA reports 85.0 on Arena Hard and 57.6 on AlpacaEval 2 LC for the unquantized base model.
Content generation
Multilingual content creation
Generates high-quality multilingual text content suitable for news, blogs, etc.