🚀 LWM 1.1
LWM 1.1 is an updated pre-trained model for feature extraction in wireless channels, offering enhanced scalability, generalization, and efficiency.
🎥 LWM Tutorial Series
Explore LWM concepts and applications in this compact video series.
How is LWM 1.1 built?
LWM 1.1 is a transformer-based architecture designed to model spatial and frequency dependencies in wireless channel data. It uses an enhanced Masked Channel Modeling (MCM) pretraining approach with an increased masking ratio to improve feature learning and generalization. The introduction of 2D patch segmentation allows the model to jointly process spatial (antenna) and frequency (subcarrier) relationships, providing a more structured representation of the channel. Additionally, bucket-based batching is employed to handle variable-sized inputs without excessive padding, ensuring memory-efficient training and inference. These modifications enable LWM 1.1 to extract meaningful embeddings from a wide range of wireless scenarios, improving its applicability across different system configurations.
What does LWM 1.1 offer?
LWM 1.1 serves as a general-purpose feature extractor for wireless communication and sensing tasks. Pretrained on an expanded and more diverse dataset, it effectively captures channel characteristics across various environments, including dense urban areas, simulated settings, and real-world deployments. The model's increased capacity and optimized pretraining strategy improve the quality of extracted representations, enhancing its applicability for downstream tasks.
How is LWM 1.1 used?
LWM 1.1 is designed for seamless integration into wireless communication pipelines as a pre-trained embedding extractor. By processing raw channel data, the model generates structured representations that encode spatial, frequency, and propagation characteristics. These embeddings can be directly used for downstream tasks, reducing the need for extensive labeled data while improving model efficiency and generalization across different system configurations.
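To make this workflow concrete, here is a minimal, self-contained sketch of the raw-channel-in, embeddings-out pattern. The encoder below is a randomly initialized stand-in, not the actual pre-trained LWM 1.1 network, and the patch size and projection layer are illustrative assumptions:

```python
import torch
import torch.nn as nn

EMBED_DIM = 128  # LWM 1.1 embedding size

# Stand-in encoder: randomly initialized here only to keep the sketch
# self-contained; in practice, this is where the pre-trained LWM weights go.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=EMBED_DIM, nhead=8, batch_first=True),
    num_layers=4,
).eval()
patch_proj = nn.Linear(32, EMBED_DIM)     # 32 real values per patch (assumed)
cls_token = torch.zeros(1, 1, EMBED_DIM)  # learned in the real model

@torch.no_grad()
def extract_embeddings(channel: torch.Tensor):
    """Map a raw complex channel (N antennas x SC subcarriers) to embeddings."""
    n, sc = channel.shape
    # 2D patchify: 4x4 antenna-subcarrier tiles, real/imag stacked -> 32 values.
    real = torch.view_as_real(channel)                   # (N, SC, 2)
    tiles = real.reshape(n // 4, 4, sc // 4, 4, 2)
    patches = tiles.permute(0, 2, 1, 3, 4).reshape(1, -1, 32)
    tokens = torch.cat([cls_token, patch_proj(patches)], dim=1)
    out = encoder(tokens)                                # (1, 1 + P, EMBED_DIM)
    return out[:, 0], out[:, 1:]    # compact CLS embedding, per-patch embeddings

H = torch.randn(32, 32, dtype=torch.cfloat)  # raw channel: 32 antennas, 32 subcarriers
cls_emb, patch_embs = extract_embeddings(H)
print(cls_emb.shape, patch_embs.shape)       # torch.Size([1, 128]) torch.Size([1, 64, 128])
```

The CLS output serves as a compact channel summary for sample-level tasks, while the per-patch outputs form the high-dimensional channel embeddings referenced below.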
Advantages of Using LWM 1.1
- Enhanced Flexibility: Handles diverse channel configurations with no size limitations.
- Refined Embeddings: Improved feature extraction through advanced pretraining and increased model capacity.
- Efficient Processing: Memory-optimized with bucket-based batching for variable-sized inputs.
- Broad Generalization: Trained on a larger, more diverse dataset for reliable performance across environments.
- Task Adaptability: Fine-tuning options enable seamless integration into a wide range of applications.
For example, the following figure demonstrates the advantages of using LWM-based highly compact CLS embeddings and high-dimensional channel embeddings over raw channels for the LoS/NLoS classification task. The raw dataset is derived from channels of size (32, 32) between BS 3 and 8,299 users in the densified Denver scenario of the DeepMIMO dataset.
Figure: F1-score comparison of models trained on raw wireless channels and on their LWM embeddings for LoS/NLoS classification.
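As a sketch of how such a comparison can be set up (using random stand-in features and labels rather than the DeepMIMO data, so the printed scores are meaningless; only the evaluation protocol is illustrated):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data: in the actual experiment, these would be the flattened raw
# DeepMIMO channels and their LWM CLS embeddings, with LoS/NLoS labels.
n_users = 8299
raw_channels = rng.standard_normal((n_users, 32 * 32 * 2))  # real+imag, flattened
cls_embeddings = rng.standard_normal((n_users, 128))        # LWM 1.1 CLS size
labels = rng.integers(0, 2, size=n_users)                   # 1 = LoS, 0 = NLoS

for name, X in [("raw channels", raw_channels), ("CLS embeddings", cls_embeddings)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"{name}: F1 = {f1_score(y_te, clf.predict(X_te)):.3f}")
```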
🔧 Technical Details
Key Improvements in LWM 1.1
1️⃣ Expanded Input Flexibility
- Removed Fixed Channel Size Constraints: Supports multiple (N, SC) configurations instead of being restricted to (32, 32).
- Increased Sequence Length: Extended from 128 to 512, allowing the model to process larger input dimensions efficiently.
2️⃣ Enhanced Dataset and Pretraining
- Broader Dataset Coverage: Increased the number of training scenarios from 15 to 140, improving generalization across environments.
- Higher Masking Ratio in MCM: Increased from 15% to 40%, making the Masked Channel Modeling (MCM) task more challenging and more effective for feature extraction (a masking sketch follows this list).
- Larger Pretraining Dataset: Expanded from 820K to 1.05M samples for more robust representation learning.
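The core of the MCM objective is simple to sketch: hide a fraction of the patch tokens and train the model to reconstruct them. The shapes and the zero mask token below are illustrative assumptions, not the exact LWM implementation:

```python
import torch

MASK_RATIO = 0.40  # LWM 1.1 (up from 0.15 in LWM 1.0)

def mask_patches(patches: torch.Tensor):
    """Randomly mask a fraction of patch tokens for Masked Channel Modeling.

    patches: (batch, num_patches, dim). Returns the masked sequence and the
    boolean mask marking which positions the model must reconstruct.
    """
    b, p, _ = patches.shape
    num_masked = int(MASK_RATIO * p)
    mask_idx = torch.rand(b, p).topk(num_masked, dim=1).indices  # positions to hide
    mask = torch.zeros(b, p, dtype=torch.bool)
    mask.scatter_(1, mask_idx, True)
    masked = patches.clone()
    masked[mask] = 0.0  # simple zero mask token (illustrative)
    return masked, mask

patches = torch.randn(4, 64, 128)  # 4 channels, 64 patches each
masked, mask = mask_patches(patches)
# The pretraining loss is reconstruction error on the hidden positions only,
# e.g. mse_loss(model(masked)[mask], patches[mask]).
```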
3️⃣ Improved Model Architecture
- Increased Model Capacity: Parameter count expanded from 600K to 2.5M, enhancing representational power.
- 2D Patch Segmentation: Instead of segmenting channels along a single dimension (antennas or subcarriers), patches now span both antennas and subcarriers, improving spatial-frequency feature learning (see the sketch below).
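The distinction between 1D and 2D segmentation, as a minimal sketch (patch shapes are illustrative, not the actual LWM 1.1 patch dimensions):

```python
import torch

H = torch.randn(32, 64)  # channel grid: 32 antennas x 64 subcarriers

# 1D segmentation (LWM 1.0 style): slice along one axis only,
# e.g. groups of 4 subcarriers spanning all antennas per patch.
patches_1d = H.reshape(32, 16, 4).permute(1, 0, 2).reshape(16, -1)  # 16 patches of 128 values

# 2D segmentation (LWM 1.1 style): tiles spanning both axes,
# e.g. 4 antennas x 4 subcarriers per patch.
patches_2d = (
    H.reshape(8, 4, 16, 4)    # (antenna blocks, 4, subcarrier blocks, 4)
     .permute(0, 2, 1, 3)     # group the two block axes together
     .reshape(8 * 16, 4 * 4)  # 128 patches of 16 values each
)
print(patches_1d.shape, patches_2d.shape)  # torch.Size([16, 128]) torch.Size([128, 16])
```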
4️⃣ Optimized Training and Efficiency
- Adaptive Learning Rate Schedule: Implemented AdamW with Cosine Decay, improving convergence stability (see the optimizer sketch after this list).
- Computational Efficiency: Reduced the number of attention heads per layer from 12 to 8, balancing computational cost with feature extraction capability.
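This optimizer setup maps directly onto standard PyTorch components; here is a sketch with illustrative hyperparameters (the actual learning rates and step counts used for LWM 1.1 are not stated here):

```python
import torch

model = torch.nn.Linear(128, 128)  # placeholder for the LWM encoder

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

for step in range(1000):
    loss = model(torch.randn(32, 128)).pow(2).mean()  # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # cosine-decayed learning rate
```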
Comparison of LWM Versions
| Property | LWM 1.0 | LWM 1.1 |
|---|---|---|
| Channel Size Limitation | Fixed at (32, 32) | Supports multiple (N, SC) pairs |
| Sequence Length Support | 128 (16-dimensional) | 512 (32-dimensional) |
| Pre-training Samples | 820K | 1.05M |
| Pre-training Scenarios | 15 | 140 |
| Masking Ratio | 15% | 40% |
| Embedding Size | 64 | 128 |
| Number of Parameters | 600K | 2.5M |
| Segmentation | 1D | 2D |
📚 Documentation
Detailed Changes in LWM 1.1
No Channel Size Limitation
In LWM 1.0, the model was pre-trained on a single (N, SC) = (32, 32) pair, which limited its generalization to other channel configurations. Wireless communication systems in the real world exhibit vast variability in the number of antennas (N) at base stations and subcarriers (SC). To address this limitation, LWM 1.1 was pre-trained on 20 distinct (N, SC) pairs, ranging from smaller setups like (8, 32) to more complex setups like (128, 64). This variety enables the model to effectively handle diverse channel configurations and ensures robust generalization without overfitting to specific configurations.
To handle variable-sized inputs efficiently, we implemented bucket-based batching, where inputs of similar sizes are grouped together. For example, channels with sizes (32, 64) and (16, 128) are placed in the same bucket, avoiding the excessive padding common in traditional batching approaches. This not only saves memory but also ensures computational efficiency during training. Furthermore, validation samples were drawn as 20% of each bucket, maintaining a balanced evaluation process across all input sizes.
This approach eliminates the rigidity of fixed channel sizes and positions LWM 1.1 as a versatile model capable of adapting to real-world wireless systems with varying configurations.
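A minimal sketch of the grouping logic, assuming total element count (and hence patch-sequence length) as the bucketing key; the actual LWM 1.1 training code may differ:

```python
from collections import defaultdict
import torch

def bucket_batches(channels, batch_size=4):
    """Group variable-sized channels so each batch shares one sequence length.

    Channels like (32, 64) and (16, 128) contain the same number of elements,
    so they yield the same patch-sequence length and land in the same bucket.
    """
    buckets = defaultdict(list)
    for h in channels:
        buckets[h.numel()].append(h)  # key: total elements ~ sequence length
    for same_size in buckets.values():
        for i in range(0, len(same_size), batch_size):
            batch = same_size[i:i + batch_size]
            # All entries in a batch share a length, so no padding is needed.
            yield torch.stack([h.reshape(-1) for h in batch])

channels = [torch.randn(32, 64), torch.randn(16, 128), torch.randn(8, 32),
            torch.randn(32, 64), torch.randn(8, 32)]
for batch in bucket_batches(channels, batch_size=2):
    print(batch.shape)  # (2, 2048), (1, 2048), (2, 256)
```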
Larger and More Diverse Pretraining Dataset
Generalization is a critical aspect of any foundation model. In LWM 1.1, we significantly expanded the training dataset to cover more diverse scenarios and environments. We added seven new city scenarios—Charlotte, Denver, Oklahoma, Indianapolis, Fort Worth, Santa Clara, and San Diego—to enrich the model's exposure to a variety of real-world conditions.