Llama 3 70B Special Tokens Adjusted

Developed by astronomer
A version of Meta-Llama-3-70B with adjusted special tokens, fixing the fine-tuning issues caused by untrained special tokens in the original model
Downloads 33
Release Time: 4/25/2024

Model Overview

This model is an adjusted version of Meta-Llama-3-70B. Some special tokens in the original model were never trained; this version fixes them, which makes the model better suited to fine-tuning on downstream tasks.

Model Features

Special Token Optimization
Fixes the untrained special tokens in the original model by setting their weights to the mean of the trained tokens' weights
Fine-tuning Stability Enhancement
Resolves the exploding or NaN gradients that untrained special tokens can cause during fine-tuning
Compatibility Preservation
Remains fully compatible with the original Meta-Llama-3-70B model; only the special token handling changes
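
The mean-fill idea described under Special Token Optimization can be sketched on a toy embedding table. This is a minimal, dependency-free illustration and not the author's actual script: the function name `fill_untrained_rows` is hypothetical, and it assumes that an untrained token can be recognized by an all-zero embedding row, which the description above does not state.

```python
def fill_untrained_rows(embeddings, eps=0.0):
    """Return a copy of `embeddings` in which every untrained row is
    replaced by the element-wise mean of the trained rows.

    Assumption (not from the source): a row counts as untrained when
    all of its values have magnitude <= eps.
    """
    def is_untrained(row):
        return all(abs(x) <= eps for x in row)

    trained = [row for row in embeddings if not is_untrained(row)]
    if not trained:
        raise ValueError("no trained rows to average")

    dim = len(trained[0])
    # Element-wise mean over the trained rows only.
    mean = [sum(row[j] for row in trained) / len(trained) for j in range(dim)]

    # Copy every row so the input table is left unmodified.
    return [list(mean) if is_untrained(row) else list(row) for row in embeddings]


# Toy table: row 1 stands in for an untrained special token.
toy = [[1.0, 2.0], [0.0, 0.0], [3.0, 4.0]]
patched = fill_untrained_rows(toy)
print(patched)
```

In a real checkpoint the same operation would presumably have to be applied to every weight matrix that holds one row per token, so that the input and output sides stay consistent.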

Model Capabilities

Text Generation
Instruction Following
Downstream Task Fine-tuning

Use Cases

Natural Language Processing
Instruction Fine-tuning
Used as a base model for instruction fine-tuning
Avoids training instability caused by special token issues
Adding New Tokens
Supports adding new tokens to the vocabulary during fine-tuning
New tokens can be given reasonable initial embedding values
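
One common way to give new tokens a reasonable starting point is to initialize each new embedding row to the mean of the existing rows. The sketch below shows that idea on plain Python lists; the helper name `add_tokens`, the toy vocabulary, and the `<|tool_call|>` token are all hypothetical, and the choice of mean initialization is an assumption rather than something this page specifies.

```python
def add_tokens(vocab, embeddings, new_tokens):
    """Append `new_tokens` to `vocab` and give each one an embedding row
    equal to the mean of the rows that existed before the call.

    Both `vocab` (token -> row index) and `embeddings` (list of rows)
    are modified in place. Tokens already in `vocab` are skipped.
    """
    dim = len(embeddings[0])
    # Compute the mean once, before any new rows are appended, so that
    # every new token starts from the same value.
    mean = [sum(row[j] for row in embeddings) / len(embeddings) for j in range(dim)]

    for token in new_tokens:
        if token in vocab:
            continue
        vocab[token] = len(embeddings)  # next free row index
        embeddings.append(list(mean))
    return vocab, embeddings


vocab = {"hello": 0, "world": 1}
table = [[0.5, 1.5], [1.5, 2.5]]
add_tokens(vocab, table, ["<|tool_call|>", "hello"])
print(vocab["<|tool_call|>"], table[-1])
```

The averaging only gives a sensible value if the existing rows are themselves trained, which is why untrained special tokens in the base model matter for this use case.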