Tuning Incremental Sync Config
Change the sync configuration of an existing data warehouse schema: switch sync_type, pick a different incremental_field, set primary_key_columns, choose cdc_table_mode, or change sync_frequency. Use when the user asks "switch my orders table from full refresh to incremental", "this table is syncing too slowly / too frequently", "I need to pick a different incremental column", "set up CDC for this Postgres table", or when diagnosis of a failing sync pointed to an incremental-field or PK misconfiguration.
Modify the synchronization configuration of existing data warehouse schemas to optimize performance, fix issues, or adapt to source changes.
Features
- Change sync type (full refresh, incremental, CDC, webhook), as sketched in the example after this list
- Select or update incremental fields and their types
- Define or modify primary key columns
- Configure CDC table modes
- Adjust sync frequency and time of day
- Pause or resume schema syncing
- Register and manage webhooks for specific sources
- Check CDC prerequisites and webhook validity
- Trigger full resyncs or data deletion when needed
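As a rough illustration of what these operations look like over the PostHog REST API, the Python sketch below switches one synced table from full refresh to incremental. The endpoint path, the field names (sync_type, incremental_field, incremental_field_type, sync_frequency), and the frequency token are assumptions inferred from the option names this skill exposes, not a documented contract; verify against the PostHog API reference before use.

```python
# Minimal sketch: switch a warehouse table from full refresh to incremental.
# ASSUMPTIONS: the endpoint path and payload field names are inferred from
# the option names this skill lists; check the PostHog API docs for the
# real schema before relying on this.
import os

import requests

POSTHOG_HOST = os.environ.get("POSTHOG_HOST", "https://us.posthog.com")
API_KEY = os.environ["POSTHOG_PERSONAL_API_KEY"]  # personal API key with write scope
PROJECT_ID = os.environ["POSTHOG_PROJECT_ID"]
SCHEMA_ID = "..."  # ID of the external data schema (the synced table)

resp = requests.patch(
    f"{POSTHOG_HOST}/api/projects/{PROJECT_ID}/external_data_schemas/{SCHEMA_ID}/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "sync_type": "incremental",          # previously "full_refresh"
        "incremental_field": "updated_at",   # monotonically increasing column
        "incremental_field_type": "timestamp",
        "sync_frequency": "6hour",           # assumed frequency token
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```

A good incremental_field is append-only or monotonically increasing (an updated-at timestamp or an auto-incrementing ID); picking a column that can move backwards is the classic cause of silently missed rows.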
Use cases
- Switching a table's sync type from full refresh to incremental.
- Resolving sync failures caused by incorrect incremental fields or primary keys.
- Adjusting sync frequency for tables that are syncing too slowly or too frequently.
- Setting up Change Data Capture (CDC) for supported sources; a prerequisite-check sketch follows this list.
- Pausing a schema sync without deleting its configuration.
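For the CDC use case, one prerequisite is certain: Postgres logical replication requires wal_level = logical. The sketch below, using psycopg2, runs a pre-flight check before CDC is enabled. The replication-slot and privilege checks are illustrative assumptions about what a CDC pipeline typically needs, not PostHog's documented checklist.

```python
# Pre-flight check before enabling CDC on a Postgres source.
# wal_level = logical is a hard requirement for logical replication;
# the slot and privilege checks below are illustrative assumptions
# about what a CDC pipeline typically needs.
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@host:5432/dbname")  # hypothetical DSN
with conn, conn.cursor() as cur:
    cur.execute("SHOW wal_level;")
    wal_level = cur.fetchone()[0]
    if wal_level != "logical":
        raise SystemExit(
            f"wal_level is {wal_level!r}; CDC needs 'logical' "
            "(set it in postgresql.conf and restart)"
        )

    # Existing replication slots: stale inactive slots pin WAL and bloat disk.
    cur.execute("SELECT slot_name, active FROM pg_replication_slots;")
    for slot_name, active in cur.fetchall():
        print(f"slot {slot_name}: {'active' if active else 'inactive'}")

    # Does the connecting role have the replication privilege?
    cur.execute("SELECT rolreplication FROM pg_roles WHERE rolname = current_user;")
    print("replication privilege:", cur.fetchone()[0])
conn.close()
```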
Non-goals
- Setting up brand new data warehouse sources (use a dedicated skill for that).
- Modifying the underlying source table schemas directly.
- Performing general data quality checks or transformations on synced data.
Installation
npx skills add PostHog/posthog

Runs the Vercel skills CLI (skills.sh) via npx; requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Similar extensions
Suggesting Data Imports (quality score 76)
Use when the user asks about revenue, payments, subscriptions, billing, CRM deals, support tickets, production database tables, or other data that PostHog does not collect natively. Also use when a query fails because a table does not exist or returns no results for expected external data. The data warehouse can import from SaaS tools (Stripe, Hubspot, etc.), production databases (Postgres, MySQL, BigQuery, Snowflake), and other arbitrary data sources. Covers checking existing sources, identifying the right source type, and guiding the setup.
Polars (quality score 99)
Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.
Spark Engineer (quality score 99)
Use when writing Spark jobs, debugging performance issues, or configuring cluster settings for Apache Spark applications, distributed data processing pipelines, or big data workloads. Invoke to write DataFrame transformations, optimize Spark SQL queries, implement RDD pipelines, tune shuffle operations, configure executor memory, process .parquet files, handle data partitioning, or build structured streaming analytics.
MongoDB Atlas Stream Processing (quality score 98)
Manages MongoDB Atlas Stream Processing (ASP) workflows. Handles workspace provisioning, connections to data sources/sinks, the processor operations lifecycle, debugging diagnostics, and tier sizing. Supports Kafka, Atlas cluster, S3, HTTPS, and Lambda integrations for streaming data workloads and event processing. NOT for general MongoDB queries or Atlas cluster administration. Requires the MongoDB MCP Server with Atlas API credentials.
Snowflake Development (quality score 98)
Use when writing Snowflake SQL, building data pipelines with Dynamic Tables or Streams/Tasks, using Cortex AI functions, creating Cortex Agents, writing Snowpark Python, configuring dbt for Snowflake, or troubleshooting Snowflake errors.
Data Warehouse Experimentation (quality score 97)
Running experiments out of the data warehouse instead of via dedicated experiment platforms. SQL-based assignment, exposure logging discipline, metric definitions in dbt models, statistical analysis in SQL or Python, variance reduction with CUPED, sequential testing, and the operational tradeoffs vs platforms like Statsig and Optimizely. Triggers on warehouse-native experimentation, run experiments in BigQuery, run experiments in Snowflake, dbt experiments, SQL t-test, CUPED variance reduction, exposure log, sample ratio mismatch, sequential testing, mSPRT, doubly robust estimation, build vs buy experimentation. Also triggers when the team is choosing between platform and warehouse, building warehouse-native experiment infrastructure, auditing one, or running an experiment with a custom metric the platform cannot handle.