Gemma 2 2b Crosscoder L13 Mu4.1e 02 Lr1e 04
Developed by science-of-finetuning
Crosscoder trained on parallel activations from layer 13 of the Gemma 2 2B and Gemma 2 2B IT models
Downloads: 51
Release Time: 11/22/2024
Model Overview
This crosscoder was trained on subsets of the FineWeb and LMSYS-Chat-1M datasets, primarily for feature extraction tasks.
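The released checkpoint itself comes from the science-of-finetuning training run; the snippet below is only a minimal sketch of how parallel layer-13 activations of the kind described above could be collected with Hugging Face transformers. It assumes the public google/gemma-2-2b and google/gemma-2-2b-it checkpoints and treats hidden_states[13] as the layer-13 residual stream; the exact hook point, preprocessing, and batching of the actual training pipeline are not reproduced here.

```python
# Sketch: gather parallel activations from layer 13 of the base and
# instruction-tuned Gemma 2 2B models for the same input texts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

LAYER = 13  # layer referenced in the model card

tok = AutoTokenizer.from_pretrained("google/gemma-2-2b")
base = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b", torch_dtype=torch.bfloat16)
chat = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it", torch_dtype=torch.bfloat16)

@torch.no_grad()
def layer_acts(model, texts):
    batch = tok(texts, return_tensors="pt", padding=True, truncation=True, max_length=512)
    out = model(**batch, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so index LAYER is the
    # residual stream after transformer block LAYER.
    h = out.hidden_states[LAYER]                       # (batch, seq, d_model)
    keep = batch["attention_mask"].reshape(-1).bool()  # drop padding positions
    return h.reshape(-1, h.shape[-1])[keep]            # (tokens, d_model)

texts = ["The quick brown fox jumps over the lazy dog."]
acts_base = layer_acts(base, texts)                    # from Gemma 2 2B
acts_chat = layer_acts(chat, texts)                    # from Gemma 2 2B IT
pairs = torch.stack([acts_base, acts_chat], dim=1)     # (tokens, 2, d_model)
```

Each row of `pairs` holds the two models' activations for the same token position, which is the parallel format a crosscoder consumes.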
Model Features
Parallel activation training
Trained on parallel activations from layer 13 of Gemma 2 2B and Gemma 2 2B IT models
Efficient feature extraction
Focuses on extracting meaningful feature representations from intermediate model layers
Sparse feature learning
Trained with an L1 sparsity penalty and tracked with the L0 metric (average number of active features) to produce sparse feature representations; see the sketch after this list
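The exact architecture and hyperparameters of the released checkpoint are defined in the science-of-finetuning training code; the class below is only an illustrative PyTorch sketch of the crosscoder idea behind the features listed above. The dictionary size is made up, d_model=2304 matches Gemma 2 2B, and the 4.1e-02 coefficient is read off the model name on the assumption that "mu" denotes the sparsity penalty.

```python
# Toy crosscoder: a shared ReLU feature dictionary with per-model encoder
# and decoder weights, trained with an L1 sparsity penalty and monitored
# with the L0 metric (average number of active features per token).
import torch
import torch.nn as nn

class ToyCrossCoder(nn.Module):
    def __init__(self, d_model=2304, dict_size=16384, n_models=2):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(n_models, d_model, dict_size) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(dict_size))
        self.W_dec = nn.Parameter(torch.randn(n_models, dict_size, d_model) * 0.01)
        self.b_dec = nn.Parameter(torch.zeros(n_models, d_model))

    def encode(self, x):                    # x: (batch, n_models, d_model)
        pre = torch.einsum("bmd,mdf->bf", x, self.W_enc) + self.b_enc
        return torch.relu(pre)              # shared sparse feature activations

    def decode(self, f):                    # f: (batch, dict_size)
        return torch.einsum("bf,mfd->bmd", f, self.W_dec) + self.b_dec

    def loss(self, x, sparsity_coeff=4.1e-2):
        f = self.encode(x)
        recon = self.decode(f)
        mse = (recon - x).pow(2).sum(dim=(-2, -1)).mean()
        # L1 penalty weighted by summed decoder norms, as is common for crosscoders
        dec_norms = self.W_dec.norm(dim=-1).sum(dim=0)   # (dict_size,)
        l1 = (f * dec_norms).sum(dim=-1).mean()
        l0 = (f > 0).float().sum(dim=-1).mean()          # L0 sparsity metric
        return mse + sparsity_coeff * l1, {"mse": mse.item(), "l0": l0.item()}
```

With the `pairs` tensor from the earlier sketch, `loss_value, stats = ToyCrossCoder().loss(pairs.float())` runs a single forward pass; in practice one would load the trained weights rather than random ones.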
Model Capabilities
Intermediate model layer feature extraction
Cross-model feature fusion
Sparse feature generation
Use Cases
Model analysis
Model internal representation research
Analyze how the base and instruction-tuned models represent identical inputs differently
Enables quantitative comparison of feature representations across the two models; see the sketch after this section
Feature engineering
Downstream task feature extraction
Extract intermediate layer features from pre-trained models for downstream tasks
Provides sparse feature representations as an alternative to raw intermediate-layer activations
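As a hedged illustration of the two use cases above, the snippet below continues the toy sketch: comparing each feature's decoder norm under the base versus the IT model is one common way to flag base-specific, chat-specific, and shared features, and the encoder output doubles as a sparse feature vector for downstream tasks. The thresholds and the mean-pooling choice are arbitrary assumptions, and a real analysis would load the trained checkpoint rather than random weights.

```python
# Continuing the toy sketch: classify dictionary features by their relative
# decoder norms and pool sparse features for downstream use.
coder = ToyCrossCoder()   # placeholder with random weights; load trained weights in practice

with torch.no_grad():
    norms = coder.W_dec.norm(dim=-1)                 # (n_models, dict_size)
    base_norm, chat_norm = norms[0], norms[1]
    # Relative norm in [0, 1]: near 1 -> base-specific, near 0 -> chat-specific,
    # near 0.5 -> shared between the two models.
    rel = base_norm / (base_norm + chat_norm + 1e-8)
    base_specific = (rel > 0.9).nonzero().squeeze(-1)
    chat_specific = (rel < 0.1).nonzero().squeeze(-1)
    shared = ((rel > 0.4) & (rel < 0.6)).nonzero().squeeze(-1)
    print(len(base_specific), len(chat_specific), len(shared))

    # Downstream feature extraction: encode the parallel activations gathered
    # earlier (`pairs`) and mean-pool into one sparse feature vector per text.
    feats = coder.encode(pairs.float())              # (tokens, dict_size)
    doc_vector = feats.mean(dim=0)                   # pooled representation
```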
Featured Recommended AI Models
Qwen2.5 VL 7B Abliterated Caption It I1 GGUF
Apache-2.0
Quantized version of Qwen2.5-VL-7B-Abliterated-Caption-it, supporting multilingual image description tasks.
Image-to-Text
Transformers, Supports Multiple Languages

mradermacher
167
1
Nunchaku Flux.1 Dev Colossus
Other
A Nunchaku-quantized version of the Colossus Project Flux model, designed to generate high-quality images from text prompts while minimizing quantization loss and improving inference efficiency.
Image Generation, English
nunchaku-tech
235
3
Qwen2.5 VL 7B Abliterated Caption It GGUF
Apache-2.0
A static quantized version based on the Qwen2.5-VL-7B model, focused on image caption generation tasks and supporting multiple languages.
Image-to-Text
Transformers, Supports Multiple Languages

mradermacher
133
1
Olmocr 7B 0725 FP8
Apache-2.0
olmOCR-7B-0725-FP8 is a document OCR model based on Qwen2.5-VL-7B-Instruct, fine-tuned on the olmOCR-mix-0225 dataset and then quantized to FP8.
Image-to-Text
Transformers, English

allenai
881
3
Lucy 128k GGUF
Apache-2.0
Lucy-128k is a model built on Qwen3-1.7B, focused on agentic web search and lightweight browsing, and it runs efficiently on mobile devices.
Large Language Model
Transformers, English

Mungert
263
2
© 2025 AIbase