Data Engineering

Plugin · Verified · Active

ETL pipeline construction, data warehouse design, batch processing workflows, and data-driven feature development

4 Skills · 0 MCPs

Purpose

To equip users with the knowledge and patterns required to build robust, scalable, and efficient data engineering solutions, from initial pipeline design to data-driven feature implementation.

Features

ETL pipeline construction patterns
Data warehouse and lakehouse design guidance
Batch and streaming data processing workflows
Data-driven feature development orchestration
Spark, dbt, and Airflow optimization techniques
Data quality framework implementation
API design and integration for data systems

Use Cases

Building a new data pipeline from scratch
Optimizing existing slow Spark jobs
Implementing robust data quality checks
Designing a modern data warehouse schema
Orchestrating complex batch processing workflows

Non-Goals

Providing a fully automated, one-click data pipeline generator
Directly replacing specialized data engineering tools
Offering real-time data visualization dashboards

Compliance

GDPR (info): While the skills focus on data pipeline construction and do not directly handle PII, the data quality patterns mention GDPR compliance. This implies awareness, but no explicit sanitization mechanisms are detailed.

Installation

Add the marketplace first

/plugin marketplace add wshobson/agents

/plugin install data-engineering@claude-code-workflows

Includes 4 extensions

Skill (4)

Airflow Dag Patterns Skill

Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.

Data Quality Frameworks Skill

Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.

Dbt Transformation Patterns Skill

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

Spark Optimization Skill

Optimize Apache Spark jobs with partitioning, caching, shuffle optimization, and memory tuning. Use when improving Spark performance, debugging slow jobs, or scaling data processing pipelines.

Quality Score

Verified

98/100

Analyzed 3 days ago

Trust Signals

Last commit: 5 days ago

GitHub owner: wshobson

Stars: 35.3k

License: MIT

Website: sethhobson.com

Similar Extensions

Snowflake Development

Snowflake SQL, data pipelines (Dynamic Tables, Streams+Tasks), Cortex AI functions, Snowpark Python, and dbt integration. Includes query helper script, 3 reference guides, and troubleshooting.

Plugin

alirezarezvani

Voltagent Data Ai

Data engineering, machine learning, and AI experts: data pipelines, machine learning, LLM architecture

Plugin

VoltAgent

MongoDB MCP Server + Skills

100

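Official Claude plugin for MongoDB (MCP Server + Skills). Connect to databases, explore data, manage collections, optimize queries, generate reliable code, implement best practices, develop advanced features, and more.

Plugin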

mongodb

Autoresearch Agent

100

Autonomous experiment loop that optimizes any file by a measurable metric. 5 slash commands, 8 evaluators, configurable loop intervals (10min to monthly).

Plugin

alirezarezvani

Train Sentence Transformers

Train or fine-tune sentence-transformers models across all three architectures: SentenceTransformer (bi-encoder embeddings), CrossEncoder (rerankers), and SparseEncoder (SPLADE). Covers loss selection, hard-negative mining, evaluators, distillation, LoRA, Matryoshka, and Hugging Face Hub publishing.

Plugin

huggingface

Build with Claude

Agents for data engineering, machine learning, and AI development

Plugin

davepoon