
LayerSkip Llama2 7B

Developed by Facebook (Meta)
A continued-pretrained variant of Llama2 7B that supports layer skipping (early exit) and self-speculative decoding to improve inference efficiency
Downloads: 1,674
Release date: 6/13/2024

Model Overview

Continued pretraining enables layer-skip (early-exit) inference: a shallow sub-model made of the first few layers drafts tokens, and the full-depth model then verifies them, providing self-speculative decoding for faster generation
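The checkpoint can be tried with Hugging Face transformers. The snippet below is a minimal sketch, assuming the facebook/layerskip-llama2-7B repository id and a recent transformers release whose generate() accepts the assistant_early_exit argument (which drafts with the first N layers and verifies with the full model); it is an illustration, not the card's official usage example.

```python
# Minimal sketch: early-exit self-speculative decoding with Hugging Face transformers.
# Assumes the facebook/layerskip-llama2-7B checkpoint id and a transformers version
# whose generate() supports the assistant_early_exit argument.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "facebook/layerskip-llama2-7B"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16
).to(device)

prompt = "Self-speculative decoding works by"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# assistant_early_exit=4: the first 4 layers draft tokens, the full model verifies them.
outputs = model.generate(**inputs, max_new_tokens=64, assistant_early_exit=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```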

Model Features

Layer Skipping (Early Exit)
Supports early-exit inference, allowing predictions to be made from intermediate layers rather than only the final layer
Self-Speculative Decoding
A shallow sub-model drafts tokens first, and the full-depth model then verifies them, significantly improving decoding speed (see the sketch after this list)
Efficient Inference
Self-speculative decoding delivers a 60% speed improvement compared to standard Llama2 inference
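Conceptually, each decoding step drafts a few tokens with the shallow sub-model and then checks them with a single full-depth forward pass. The sketch below illustrates the greedy variant of this draft-then-verify loop; forward_early_exit and forward_full are hypothetical helpers used only for illustration and are not part of the released model's API.

```python
def self_speculative_step(model, tokens, exit_layer, draft_len):
    """One draft-then-verify step of greedy self-speculative decoding (conceptual)."""
    n = len(tokens)

    # 1. Draft: autoregressively extend the sequence using only the first
    #    `exit_layer` transformer layers plus the shared LM head.
    draft = list(tokens)
    for _ in range(draft_len):
        logits = model.forward_early_exit(draft, exit_layer)   # hypothetical helper
        draft.append(int(logits[-1].argmax()))

    # 2. Verify: one full-depth forward pass over prompt + draft yields the
    #    full model's greedy next token at every drafted position.
    full_logits = model.forward_full(draft)                    # hypothetical helper
    # full_pred[i] = full model's next token after the prefix draft[: n + i]
    full_pred = [int(l.argmax()) for l in full_logits[n - 1 : n + draft_len]]

    # 3. Accept drafted tokens while they match the full model, then append the
    #    full model's own token at the first mismatch (guarantees progress).
    accepted = list(tokens)
    for i in range(draft_len):
        if draft[n + i] != full_pred[i]:
            break
        accepted.append(draft[n + i])
    else:
        i = draft_len
    accepted.append(full_pred[i])
    return accepted
```

Because all drafted tokens are verified in one batched pass, every step emits at least one token that matches ordinary full-model decoding, so the output is unchanged while the average cost per token drops.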

Model Capabilities

Text Generation
Question Answering
Content Creation

Use Cases

Research & Development
Efficient Inference Research
Used to study efficient inference methods for large language models
Achieves a 60% speed improvement with self-speculative decoding
Educational Applications
Teaching Demonstration
Demonstrates the working principles and inference-optimization techniques of large language models