Sparse Autoencoder Training & Analysis
Skill (Verified, Active)
Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.
This skill helps researchers and practitioners discover interpretable features within neural networks by training and analyzing Sparse Autoencoders.
Features
- Train custom Sparse Autoencoders
- Load and analyze pre-trained SAEs
- Decompose neural network activations into sparse features
- Perform feature attribution and steering
- Analyze superposition and monosemanticity
Use Cases
- Discovering interpretable concepts learned by neural networks
- Analyzing feature interactions and superposition effects
- Studying safety-relevant features like bias or deception
- Performing feature-based model steering or ablation experiments
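Steering and ablation both act on a feature's decoder direction. As a rough sketch (plain Python, not the SAELens or transformer-lens API; `steer`, `ablate`, and the example numbers are hypothetical), steering adds a scaled copy of the direction to an activation vector, and ablation subtracts the feature's current contribution. In a real experiment the edited vector would be written back into the model's forward pass, typically through a hook.

```python
# Toy illustration of feature steering and ablation on a single activation vector.
# Assumption: each SAE feature has a decoder direction with the same length as the activation.

def steer(activation, direction, strength):
    """Push the activation along a feature direction: x' = x + strength * d."""
    return [x + strength * d for x, d in zip(activation, direction)]

def ablate(activation, feature_value, direction):
    """Remove a feature's contribution: x' = x - f * d."""
    return [x - feature_value * d for x, d in zip(activation, direction)]

activation = [2.0, -1.0]   # made-up residual-stream vector
direction = [0.0, -1.0]    # made-up decoder direction of one feature
feature_value = 1.0        # made-up activation of that feature on this input

steered = steer(activation, direction, strength=3.0)
ablated = ablate(activation, feature_value, direction)
print(steered, ablated)
```

Comparing model outputs before and after such an edit is what turns a feature label into a testable causal claim.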
Non-Goals
- Directly modifying neural network architectures beyond SAE integration
- Performing causal intervention experiments without SAE features
- Production deployment of steering mechanisms (focus is on analysis)
Workflow
- Load model and pre-trained SAE
- Get model activations
- Encode activations to SAE features
- Analyze features and reconstruction
- Optionally, train a custom SAE
- Analyze feature attribution and steering
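The encode and analyze steps above reduce to a small amount of arithmetic. The following is a minimal, dependency-free sketch of that arithmetic, assuming a standard ReLU SAE. It is not the SAELens API: the function names (`encode`, `decode`, `l0_norm`, `mse`) and the tiny hand-picked weights are hypothetical, chosen only so the numbers can be checked by hand. In practice the activations would come from the model via transformer-lens and the weights from a trained SAE.

```python
# Toy ReLU sparse autoencoder on plain Python lists.
# Assumption: features = ReLU(W_enc (x - b_dec) + b_enc), x_hat = sum_j f_j * d_j + b_dec.
# All weights are made-up illustrative values, not taken from any trained SAE.

def encode(activation, w_enc, b_enc, b_dec):
    """Map a model activation to non-negative SAE feature activations."""
    centered = [x - b for x, b in zip(activation, b_dec)]
    pre_acts = [
        sum(w * c for w, c in zip(row, centered)) + bias
        for row, bias in zip(w_enc, b_enc)
    ]
    return [max(0.0, p) for p in pre_acts]

def decode(features, w_dec, b_dec):
    """Rebuild the activation as a weighted sum of decoder directions (rows of w_dec)."""
    recon = list(b_dec)
    for value, direction in zip(features, w_dec):
        for i, d in enumerate(direction):
            recon[i] += value * d
    return recon

def l0_norm(features):
    """Number of active features, a common sparsity measure."""
    return sum(1 for f in features if f > 0.0)

def mse(original, recon):
    """Mean squared reconstruction error."""
    return sum((o - r) ** 2 for o, r in zip(original, recon)) / len(original)

# 2-dimensional activations, 4 features (an overcomplete dictionary).
W_ENC = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
B_ENC = [0.0, 0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
B_DEC = [0.0, 0.0]

activation = [2.0, -1.0]
features = encode(activation, W_ENC, B_ENC, B_DEC)
reconstruction = decode(features, W_DEC, B_DEC)
print(features, l0_norm(features), mse(activation, reconstruction))
# prints: [2.0, 0.0, 0.0, 1.0] 2 0.0
```

Two of the four features fire for this input, and the reconstruction is exact only because the toy weights were chosen that way; a trained SAE trades reconstruction error against sparsity.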
Practices
- Mechanistic Interpretability
- Feature Engineering
- Model Analysis
Prerequisites
- Python 3.10+
- transformer-lens>=2.0.0
- torch>=2.0.0
- sae-lens>=6.0.0
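A quick way to confirm the prerequisites above is to compare installed versions against the listed minimums. The sketch below uses only the standard library; `MINIMUMS`, `parse_version`, and `check_requirements` are hypothetical helper names, and the version parsing is deliberately simple (it keeps the leading numeric components and ignores pre-release and local tags).

```python
import sys
from importlib import metadata

# Minimum versions copied from the prerequisites list.
MINIMUMS = {"transformer-lens": "2.0.0", "torch": "2.0.0", "sae-lens": "6.0.0"}

def parse_version(text):
    """Keep leading numeric components, padded to three: '2.1.0+cu121' -> (2, 1, 0)."""
    parts = []
    for chunk in text.split("+")[0].split("."):
        digits = ""
        for ch in chunk:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)

def check_requirements(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse_version(installed) >= parse_version(minimum))
    return report

python_ok = sys.version_info >= (3, 10)
print("Python 3.10+:", python_ok)
for package, (installed, ok) in check_requirements().items():
    print(package, installed or "not installed", "OK" if ok else "needs attention")
```

For stricter comparisons (pre-releases, epochs), a dedicated version-parsing library would be the safer choice.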
Installation
First, add the marketplace
/plugin marketplace add Orchestra-Research/AI-Research-SKILLs
Then install the plugin
/plugin install AI-Research-SKILLs@ai-research-skills