Senior Data Engineer
Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Expertise in Python, SQL, Spark, Airflow, dbt, Kafka, and the modern data stack. Includes data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, implementing data governance, or troubleshooting data issues.
To empower users with the tools and knowledge needed to design, build, and optimize robust data pipelines and infrastructure.
Features
- Build scalable data pipelines
- Develop ETL/ELT systems
- Orchestrate data workflows
- Implement data quality frameworks
- Optimize data infrastructure performance
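The pipeline-building features above can be illustrated with a minimal extract/transform/load sketch. The CSV source, column names, and in-memory sink are hypothetical stand-ins for real connectors, not code taken from the skill's scripts:

```python
import csv
import io

def extract(raw: str) -> list[dict]:
    """Parse CSV text into rows (hypothetical source)."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows: list[dict]) -> list[dict]:
    """Cast types and drop rows with missing keys."""
    out = []
    for row in rows:
        if row.get("user_id") and row.get("amount"):
            out.append({"user_id": row["user_id"], "amount": float(row["amount"])})
    return out

def load(rows: list[dict], sink: list) -> int:
    """Append validated rows to an in-memory sink; return the count loaded."""
    sink.extend(rows)
    return len(rows)

raw = "user_id,amount\n1,9.99\n,3.50\n2,12.00\n"
sink: list = []
loaded = load(transform(extract(raw)), sink)
print(loaded)  # 2 (the row with the empty user_id is dropped)
```

In a production pipeline each stage would be a separate task in an orchestrator such as Airflow, but the composable-stage shape stays the same.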
Use Cases
- Designing data architectures
- Building robust data pipelines
- Optimizing data workflow performance
- Implementing data governance and quality checks
Non-Goals
- Real-time data analysis
- Machine learning model deployment
- Application development
Documentation
- Configuration & parameter reference: While SKILL.md provides example CLI commands and code snippets, it does not document all configuration options, parameters, and their precedence.
Code Execution
- Validation: Input validation and sanitization appear in tools such as `etl_performance_optimizer.py` and `data_quality_validator.py` via argument parsing and schema checks, but a formal validation library such as Zod or Pydantic is not demonstrated consistently across the scripts.
- Logging: The Python scripts use the `logging` module for messages, but there is no persistent audit log for tracking destructive actions or outbound calls.
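As a rough illustration of the kind of schema check described above (not the actual code in `data_quality_validator.py`; the field names are made up), a stdlib-only record validator might look like:

```python
# Hypothetical required schema: field name -> expected Python type.
REQUIRED_SCHEMA = {"user_id": int, "email": str}

def validate_record(record: dict, schema: dict) -> list[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    return errors

print(validate_record({"user_id": 7, "email": "a@b.co"}, REQUIRED_SCHEMA))  # []
print(validate_record({"user_id": "7"}, REQUIRED_SCHEMA))  # two errors
```

A library like Pydantic would add type coercion and nested models on top of this, which is why the listing flags its absence.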
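One way to add the missing persistent audit trail is a dedicated file-backed logger built on the stdlib `logging` module. The log path and JSON record fields below are assumptions for the sketch, not the skill's own format:

```python
import json
import logging
import os
import tempfile
import time

def make_audit_logger(path: str) -> logging.Logger:
    """Create a logger that appends one JSON line per audited action."""
    logger = logging.getLogger("audit")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep audit records out of the console log
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger

def audit(logger: logging.Logger, action: str, target: str) -> None:
    """Record a destructive or outbound action with a timestamp."""
    logger.info(json.dumps({"ts": time.time(), "action": action, "target": target}))

path = os.path.join(tempfile.mkdtemp(), "audit.log")
log = make_audit_logger(path)
audit(log, "drop_table", "staging.events")
print(open(path).read().strip())
```

Writing one JSON object per line keeps the file greppable and easy to ship to a log aggregator later.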
Scope
- Tool surface size: The repository contains several large Python scripts and many reference markdown files, suggesting a broad surface area, though individual tools appear focused.
Errors
- Actionable error messages: The CLI tools emit basic error messages and the scripts log failures, but detailed remediation steps or documentation links are not provided consistently for every error path.
Practical Utility
- Edge cases: The code handles some errors and documents usage patterns, but explicit documentation of failure modes (e.g., malformed input, missing dependencies) with recovery steps is limited.
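A common recovery step for the transient failure modes mentioned above is retry with exponential backoff. This sketch is illustrative only and not part of the skill's scripts; the attempt count and delay policy are arbitrary:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 0.01):
    """Call fn, retrying on any exception with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == attempts:
                raise RuntimeError(f"failed after {attempts} attempts: {exc}") from exc
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulated flaky dependency: fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient upstream error")
    return "ok"

result = with_retries(flaky)
print(result, calls["n"])  # ok 3
```

In a real pipeline the retry policy would usually live in the orchestrator (e.g., Airflow task retries) rather than in ad-hoc wrappers, so failures stay visible in one place.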
Installation
First, add the marketplace, then install the plugin:
/plugin marketplace add alirezarezvani/claude-skills
/plugin install engineering-team@claude-code-skills
Similar Extensions
Data Engineer (94)
Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms.
Spark Engineer (99)
Use when writing Spark jobs, debugging performance issues, or configuring cluster settings for Apache Spark applications, distributed data processing pipelines, or big data workloads. Invoke to write DataFrame transformations, optimize Spark SQL queries, implement RDD pipelines, tune shuffle operations, configure executor memory, process .parquet files, handle data partitioning, or build structured streaming analytics.
dbt Transformation Patterns (98)
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies. Use when building data transformations, creating data models, or implementing analytics engineering best practices.
Data Quality Frameworks (97)
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.
Data Quality Auditor (97)
Audit datasets for completeness, consistency, accuracy, and validity. Profile data distributions, detect anomalies and outliers, surface structural issues, and produce an actionable remediation plan.
Airflow DAG Patterns (95)
Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.