Ruflo Ruvllm
RuVLLM local inference with chat formatting (Claude/GPT/Gemini/Ollama/Cohere), model configuration, MicroLoRA fine-tuning, and SONA real-time adaptation.
To enable users to run and fine-tune large language models locally with advanced features for optimal performance and integration into RAG pipelines.
Features
- Local LLM inference with RuVLLM
- Model configuration and optimization
- MicroLoRA task-specific fine-tuning
- SONA real-time adaptation
- Multi-provider chat formatting (Claude, GPT, Gemini, Ollama, Cohere)
- HNSW routing for RAG context retrieval
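The MicroLoRA feature above builds on the low-rank adapter (LoRA) idea: rather than training a full d_out × d_in weight update, only two small factors B (d_out × r) and A (r × d_in) are trained for a small rank r. As a rough illustration of why such adapters are lightweight (the function name and shapes here are hypothetical, not the plugin's API):

```python
# Illustrative sketch of the low-rank adapter (LoRA) idea that
# MicroLoRA-style fine-tuning builds on. Names are hypothetical;
# this is not the plugin's actual API.

def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Trainable parameters for a full weight update vs. a rank-r adapter."""
    full = d_in * d_out               # dense delta-W
    adapter = rank * (d_in + d_out)   # B (d_out x r) plus A (r x d_in)
    return full, adapter

full, adapter = lora_param_counts(4096, 4096, 8)
print(full, adapter)  # 16777216 full vs 65536 adapter parameters
```

For a 4096×4096 layer at rank 8, the adapter trains roughly 0.4% of the parameters a full update would, which is what makes per-task fine-tuning cheap enough to run locally.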
Use Cases
- Configuring optimal local LLM models for specific tasks.
- Fine-tuning LLMs with lightweight adapters (MicroLoRA) for specialized domains.
- Implementing real-time adaptation (SONA) for continuous feedback loops.
- Preparing prompts for various LLM providers and integrating RAG context.
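The last use case can be sketched as two steps: fold retrieved context into a provider-neutral message list, then shape that list for a given provider's payload style. All function names below are hypothetical (not the plugin's API); the payload field names follow the providers' public chat APIs, where Anthropic takes the system prompt as a top-level field and OpenAI-style APIs carry it inside the messages array:

```python
# Hypothetical sketch: integrating RAG context into a prompt, then
# formatting it for two common provider payload shapes. Not the
# plugin's actual API.

def with_rag_context(question: str, chunks: list[str]) -> list[dict]:
    """Prepend retrieved context chunks to the user turn."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    content = f"Context:\n{context}\n\nQuestion: {question}"
    return [{"role": "user", "content": content}]

def to_openai(system: str, messages: list[dict]) -> dict:
    # OpenAI-style APIs put the system prompt inside the messages array.
    return {"messages": [{"role": "system", "content": system}, *messages]}

def to_anthropic(system: str, messages: list[dict]) -> dict:
    # Anthropic's Messages API takes the system prompt as a top-level field.
    return {"system": system, "messages": messages}

msgs = with_rag_context("What is HNSW?", ["HNSW is a graph-based ANN index."])
payload = to_anthropic("Answer only from the provided context.", msgs)
```

Keeping the message list provider-neutral and formatting at the last step is what lets one RAG pipeline target Claude, GPT, Gemini, Ollama, or Cohere without duplication.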
Non-Goals
- Cloud-based LLM inference
- Replacing core Claude Code functionality
- General-purpose system administration tools
Installation
First, add the marketplace:

/plugin marketplace add ruvnet/ruflo

Then install the plugin:

/plugin install ruflo-ruvllm@ruflo