
Senior Data Engineer

Skill · Verified · Active

Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and the modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.

Purpose

To empower users with the tools and knowledge needed to design, build, and optimize robust data pipelines and infrastructure.

Capabilities

  • Build scalable data pipelines
  • Develop ETL/ELT systems
  • Orchestrate data workflows
  • Implement data quality frameworks
  • Optimize data infrastructure performance
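As a minimal illustration of the pipeline-building capability described above (a toy sketch using only the standard library; the function and table names are hypothetical and not part of the skill's actual tooling), an extract-transform-load step might look like:

```python
import csv
import io
import sqlite3

def etl(csv_text: str, db: sqlite3.Connection) -> int:
    """Extract rows from CSV text, transform them, and load into SQLite."""
    db.execute("CREATE TABLE IF NOT EXISTS events (name TEXT, value REAL)")
    rows = [
        # Transform: normalize names and coerce values to numeric types.
        (r["name"].strip().lower(), float(r["value"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # Load: parameterized bulk insert.
    db.executemany("INSERT INTO events VALUES (?, ?)", rows)
    db.commit()
    return len(rows)

db = sqlite3.connect(":memory:")
n = etl("name,value\nAlpha,1\nBeta,2.5\n", db)
```

Real pipelines would add incremental loading, idempotency, and orchestration, but the extract/transform/load split stays the same.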

Use Cases

  • Designing data architectures
  • Building robust data pipelines
  • Optimizing data workflow performance
  • Implementing data governance and quality checks
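A data quality check of the kind mentioned above can be sketched as a set of rule functions applied to a column (a toy illustration; this does not reproduce the skill's actual `data_quality_validator.py`):

```python
from typing import Callable

# Each rule maps a column's values to a list of violation messages.
Rule = Callable[[list], list]

def not_null(values: list) -> list:
    return [f"row {i}: null value" for i, v in enumerate(values) if v is None]

def non_negative(values: list) -> list:
    return [f"row {i}: negative value {v}" for i, v in enumerate(values)
            if isinstance(v, (int, float)) and v < 0]

def run_checks(column: list, rules: list) -> list:
    """Apply every rule; an empty result means the column passed."""
    return [msg for rule in rules for msg in rule(column)]

violations = run_checks([10, None, -3], [not_null, non_negative])
```

Keeping rules as plain functions makes them composable and easy to unit-test, which is the core idea behind frameworks like Great Expectations and dbt tests.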

Non-Goals

  • Real-time data analysis
  • Machine learning model deployment
  • Application development

Documentation

  • info: Configuration & parameter reference. While the SKILL.md provides example CLI commands and code snippets, explicit documentation for all configuration options, parameters, and their precedence is not detailed.

Code Execution

  • info: Validation. Input validation and sanitization are present in tools like `etl_performance_optimizer.py` and `data_quality_validator.py` through argument parsing and schema checks, but a formal validation library such as Zod or Pydantic is not explicitly demonstrated across all scripts.
  • info: Logging. The Python scripts use the `logging` module for messages, but there is no explicit mention or implementation of a persistent audit log file for tracking destructive actions or outbound calls.
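The missing persistent audit log noted above could be added with a dedicated `logging` handler that writes to a file. A minimal sketch (the file name, logger name, and record format are assumptions, not conventions from the skill itself):

```python
import logging
import os
import tempfile

# Dedicated audit logger that writes destructive actions to a persistent file,
# separate from the scripts' regular console logging.
audit_path = os.path.join(tempfile.gettempdir(), "audit.log")
audit = logging.getLogger("audit")
audit.setLevel(logging.INFO)
handler = logging.FileHandler(audit_path, mode="w")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
audit.addHandler(handler)

# Record a destructive action before executing it.
audit.info("DROP TABLE staging_events requested by user=etl_job")
handler.flush()

with open(audit_path) as f:
    entry = f.read()
```

Routing audit records through their own named logger keeps them out of ordinary debug output and gives a single file to inspect after an incident.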

Scope

  • info: Tool surface size. The repository contains multiple large Python scripts and numerous reference markdown files, suggesting a broad surface area, though individual tools appear focused.

Errors

  • info: Actionable error messages. The CLI tools provide basic error messages and the scripts include logging, but detailed remediation steps or links are not consistently provided for every error path.
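One way to make CLI errors actionable, as the note above suggests, is to attach a remediation hint to every raised error (a pattern sketch, not the skill's implementation; all names are hypothetical):

```python
import os

class PipelineError(Exception):
    """Error that carries a remediation hint alongside the message."""
    def __init__(self, message: str, remediation: str):
        super().__init__(f"{message}\n  Fix: {remediation}")
        self.remediation = remediation

def load_config(path: str) -> dict:
    if not os.path.exists(path):
        raise PipelineError(
            f"Config file not found: {path}",
            "create it from an example config or pass an explicit path",
        )
    return {}

try:
    load_config("/nonexistent/pipeline.yaml")
except PipelineError as e:
    msg = str(e)
```

Because the hint travels with the exception, every caller (CLI, scheduler, test) surfaces the same remediation text without extra plumbing.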

Practical Utility

  • info: Edge cases. While the code handles some errors and provides usage patterns, explicit documentation of failure modes (e.g., malformed input, missing dependencies) with recovery steps is limited.
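Handling the malformed-input failure mode mentioned above usually means failing per record rather than per batch. A minimal dead-letter sketch (a common pattern, not code from this skill):

```python
import json

def parse_records(lines: list) -> tuple:
    """Parse JSON lines; route malformed rows to a dead-letter list
    instead of aborting the whole batch."""
    good, dead_letter = [], []
    for line in lines:
        try:
            good.append(json.loads(line))
        except json.JSONDecodeError:
            # Recover: quarantine the bad row for later inspection.
            dead_letter.append(line)
    return good, dead_letter

good, bad = parse_records(['{"id": 1}', "not json", '{"id": 2}'])
```

The dead-letter list gives operators a concrete recovery step: inspect, fix, and replay the quarantined rows.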

Installation

Add the Marketplace first:

/plugin marketplace add alirezarezvani/claude-skills
/plugin install engineering-team@claude-code-skills

Quality Score

Verified
95 / 100
Analyzed 1 day ago

Trust Signals

Latest commit: 1 day ago
Stars: 14.6k
License: MIT

Similar Extensions

Data Engineer

94

Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms.

Skill
davila7

Spark Engineer

99

Use when writing Spark jobs, debugging performance issues, or configuring cluster settings for Apache Spark applications, distributed data processing pipelines, or big data workloads. Invoke to write DataFrame transformations, optimize Spark SQL queries, implement RDD pipelines, tune shuffle operations, configure executor memory, process .parquet files, handle data partitioning, or build structured streaming analytics.

Skill
jeffallan

Dbt Transformation Patterns

98

Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.

Skill
wshobson

Data Quality Frameworks

97

Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.

Skill
wshobson

Data Quality Auditor

97

Audit datasets for completeness, consistency, accuracy, and validity. Profile data distributions, detect anomalies and outliers, surface structural issues, and produce an actionable remediation plan.

Skill
alirezarezvani

Airflow Dag Patterns

95

Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.

Skill
wshobson