Mamba Architecture

Skill · Verified · Active

Part of: Agent Native Research Artifact (ARA) Tooling

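A state-space model whose compute scales as O(n) in sequence length, versus O(n²) for Transformer self-attention. Reported benefits: up to 5× higher inference throughput than comparable Transformers, sequences of up to a million tokens, and no KV cache. Built on a selective SSM with a hardware-aware implementation. Covers Mamba-1 (d_state=16) and Mamba-2 (d_state=128, multi-head). Pretrained models from 130M to 2.8B parameters are available on HuggingFace.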
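The selective SSM recurrence behind the O(n) claim can be illustrated with a toy scan. The following is a minimal pure-Python sketch, not the reference implementation: it handles a single scalar channel, assumes a diagonal state matrix, and uses hypothetical parameter names (w_B, w_C, w_dt, b_dt) to make B, C, and the step size depend on the input. Real kernels fuse this loop into a hardware-aware scan on the GPU.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(xs, A, w_B, w_C, w_dt, b_dt):
    """Toy single-channel selective SSM, run as a sequential scan.

    xs         : list of scalar inputs, length n
    A          : list of d_state negative reals (diagonal state matrix)
    w_B, w_C   : lists of d_state weights; B_t = w_B * x_t, C_t = w_C * x_t
    w_dt, b_dt : scalars; dt_t = softplus(w_dt * x_t + b_dt)
    All parameter names are illustrative assumptions.
    """
    d_state = len(A)
    h = [0.0] * d_state              # fixed-size state, independent of n
    ys = []
    for x in xs:                     # one pass over the sequence: O(n)
        dt = softplus(w_dt * x + b_dt)           # input-dependent step size
        for i in range(d_state):
            a_bar = math.exp(dt * A[i])          # discretized diagonal A
            b_bar = dt * (w_B[i] * x)            # simplified (Euler) discretization of B_t
            h[i] = a_bar * h[i] + b_bar * x      # state update
        ys.append(sum((w_C[i] * x) * h[i] for i in range(d_state)))  # y_t = C_t . h_t
    return ys, h


if __name__ == "__main__":
    A = [-1.0 - 0.1 * i for i in range(16)]      # d_state = 16, as in Mamba-1
    ys, h = selective_scan([1.0, 0.5, -0.3, 2.0], A, [0.1] * 16, [0.2] * 16, 0.5, 0.0)
    print(len(ys), len(h))
```

The state h holds d_state numbers no matter how many tokens have been processed, which is why generation needs no KV cache.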

Purpose

To explain and demonstrate the Mamba state-space model architecture, highlighting its advantages in speed, memory efficiency, and long-context handling for AI research and development.

Features

O(n) linear complexity for sequence modeling
Up to 5× higher inference throughput than comparable Transformers
No KV cache required, reducing memory usage
Enables million-token sequences
Hardware-aware design for performance optimization
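The memory point above can be made concrete with back-of-the-envelope arithmetic. The sketch below compares a Transformer KV cache, which grows with sequence length, to a Mamba-1 style recurrent state, which does not. The layer counts and widths are illustrative assumptions rather than measurements of any specific checkpoint, and the formulas ignore batch size, weights, and activations.

```python
def kv_cache_bytes(seq_len, n_layers, d_model, bytes_per_elem=2):
    # Keys and values: two (seq_len, d_model) tensors per layer.
    return 2 * n_layers * seq_len * d_model * bytes_per_elem


def ssm_state_bytes(n_layers, d_model, d_state=16, expand=2, d_conv=4, bytes_per_elem=2):
    # Assumed Mamba-1 style block: inner width = expand * d_model.
    d_inner = expand * d_model
    ssm = d_inner * d_state      # recurrent state h
    conv = d_inner * d_conv      # rolling buffer for the short causal convolution
    return n_layers * (ssm + conv) * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical configurations at 16-bit precision.
    for n in (1_000, 100_000, 1_000_000):
        kv = kv_cache_bytes(n, n_layers=32, d_model=2560)
        ssm = ssm_state_bytes(n_layers=64, d_model=2560)
        print(f"{n:>9} tokens: KV cache {kv / 1e9:.2f} GB, SSM state {ssm / 1e6:.2f} MB")
```

Under these assumptions the KV cache scales linearly with the number of tokens, while the SSM state stays the same size at every sequence length.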

Use cases

Implementing models for long sequences (100K+ tokens)
Building streaming applications with LLMs
Optimizing inference speed and memory footprint
Researching alternatives to Transformer architectures

Non-goals

Providing a pre-trained Mamba model for direct use
Acting as a general-purpose LLM framework
Covering Transformer architecture details beyond comparison

Installation

Add the marketplace first:

/plugin marketplace add Orchestra-Research/AI-Research-SKILLs

/plugin install AI-Research-SKILLs@ai-research-skills

Quality score

Verified

99/100

Analyzed 1 day ago

Trust signals

Last commit: 17 days ago

GitHub owner: Orchestra-Research

Stars: 8.3k

Downloads: 0

License: MIT

Website: orchestra-research.com

Similar extensions

Mamba Architecture

Skill

davila7

Rwkv Architecture

RNN+Transformer hybrid with O(n) inference. Linear time, infinite context, no KV cache. Train like GPT (parallel), infer like RNN (sequential). Linux Foundation AI project. Production at Windows, Office, NeMo. RWKV-7 (March 2025). Models up to 14B parameters.

Skill

Orchestra-Research

Rwkv Architecture

Skill

davila7

TorchTitan Distributed LLM Pretraining

Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.

Skill

Orchestra-Research

Implementing Llms Litgpt

Implements and trains LLMs using Lightning AI's LitGPT with 20+ pretrained architectures (Llama, Gemma, Phi, Qwen, Mistral). Use when you need clean model implementations, an educational understanding of architectures, or production fine-tuning with LoRA/QLoRA. Single-file implementations, no abstraction layers.

Skill

Orchestra-Research

Distributed Llm Pretraining Torchtitan

Skill

davila7