
Tuning Incremental Sync Config

Skill · Verified · Active

Change the sync configuration of an existing data warehouse schema — switch sync_type, pick a different incremental_field, set primary_key_columns, choose cdc_table_mode, or change sync_frequency. Use when the user asks "switch my orders table from full refresh to incremental", "this table is syncing too slowly / too frequently", "I need to pick a different incremental column", "set up CDC for this Postgres table", or when diagnosis of a failing sync pointed to an incremental-field or PK misconfiguration.

Purpose

Modify the synchronization configuration of existing data warehouse schemas to optimize performance, fix issues, or adapt to source changes.

Capabilities

  • Change sync type (full refresh, incremental, CDC, webhook)
  • Select or update incremental fields and their types
  • Define or modify primary key columns
  • Configure CDC table modes
  • Adjust sync frequency and time of day
  • Pause or resume schema syncing
  • Register and manage webhooks for specific sources
  • Check CDC prerequisites and webhook validity
  • Trigger full resyncs or data deletion when needed
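
To make the first three capabilities concrete, here is a minimal sketch of assembling a config-update payload for switching a table to incremental sync. The helper name and payload shape are illustrative assumptions; only the field names (`sync_type`, `incremental_field`, `primary_key_columns`, `sync_frequency`) come from this skill's description, not from a documented PostHog API schema.

```python
# Hypothetical sketch: build the settings needed to move an existing
# warehouse schema from full refresh to incremental sync. Field names
# mirror this skill's description; the payload structure is an assumption.

def build_incremental_sync_update(incremental_field, field_type,
                                  primary_keys, frequency="6hour"):
    """Build a config-update payload for an existing warehouse schema."""
    # Incremental sync upserts by primary key, so at least one PK is required.
    if not primary_keys:
        raise ValueError("incremental sync needs at least one primary key column")
    return {
        "sync_type": "incremental",
        "incremental_field": incremental_field,
        "incremental_field_type": field_type,
        "primary_key_columns": primary_keys,
        "sync_frequency": frequency,
    }

payload = build_incremental_sync_update("updated_at", "timestamp", ["id"])
print(payload["sync_type"])  # incremental
```

The point of the sketch is the invariant, not the transport: whatever API the skill drives, an incremental switch is only valid once a cursor column and primary keys are chosen together.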

Use cases

  • Switching a table's sync type from full refresh to incremental.
  • Resolving sync failures caused by incorrect incremental fields or primary keys.
  • Adjusting sync frequency for tables that are syncing too slowly or too frequently.
  • Setting up Change Data Capture (CDC) for supported sources.
  • Pausing a schema sync without deleting its configuration.
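
Many of the sync failures above trace back to a bad incremental column. A usable cursor column should be non-null and never decrease across rows, otherwise incremental sync can silently skip or re-fetch data. A small sanity-check sketch (the helper name and sample values are illustrative, not part of the skill):

```python
# Quick sanity check for a candidate incremental field: reject NULLs and
# values that go backwards, both of which break cursor-based syncing.

def is_usable_incremental_field(values):
    """Return True if every value is present and never decreases."""
    if any(v is None for v in values):
        return False
    return all(a <= b for a, b in zip(values, values[1:]))

print(is_usable_incremental_field([1, 2, 2, 5]))  # True
print(is_usable_incremental_field([1, None, 3]))  # False: NULLs break cursors
print(is_usable_incremental_field([3, 1, 2]))     # False: not monotonic
```

In practice you would run the equivalent check as a query against the source table (e.g. counting NULLs and out-of-order rows) before picking the column.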

Non-goals

  • Setting up brand new data warehouse sources (use a dedicated skill for that).
  • Modifying the underlying source table schemas directly.
  • Performing general data quality checks or transformations on synced data.

Installation

npx skills add PostHog/posthog

Runs the Vercel skills CLI (skills.sh) via npx. Requires a local Node.js install and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.), and assumes the repository follows the agentskills.io format.

Quality score

Verified
99/100
Analyzed 1 day ago

Trust signals

Last commit: 1 day ago
Stars: 34.5k
License: NOASSERTION

Similar extensions

Suggesting Data Imports

76

Use when the user asks about revenue, payments, subscriptions, billing, CRM deals, support tickets, production database tables, or other data that PostHog does not collect natively. Also use when a query fails because a table does not exist or returns no results for expected external data. The data warehouse can import from SaaS tools (Stripe, Hubspot, etc.), production databases (Postgres, MySQL, BigQuery, Snowflake), and other arbitrary data sources. Covers checking existing sources, identifying the right source type, and guiding the setup.

Skill
PostHog

Polars

99

Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.

Skill
K-Dense-AI

Spark Engineer

99

Use when writing Spark jobs, debugging performance issues, or configuring cluster settings for Apache Spark applications, distributed data processing pipelines, or big data workloads. Invoke to write DataFrame transformations, optimize Spark SQL queries, implement RDD pipelines, tune shuffle operations, configure executor memory, process .parquet files, handle data partitioning, or build structured streaming analytics.

Skill
jeffallan

MongoDB Atlas Stream Processing

98

Manage MongoDB Atlas Stream Processing (ASP) workflows. Covers workspace provisioning, source/sink connections, processor lifecycle operations, debugging diagnostics, and tier sizing. Supports Kafka, Atlas cluster, S3, HTTPS, and Lambda integrations for streaming data workloads and event processing. Not for general MongoDB queries or Atlas cluster administration. Requires the MongoDB MCP Server with Atlas API credentials.

Skill
mongodb

Snowflake Development

98

Use when writing Snowflake SQL, building data pipelines with Dynamic Tables or Streams/Tasks, using Cortex AI functions, creating Cortex Agents, writing Snowpark Python, configuring dbt for Snowflake, or troubleshooting Snowflake errors.

Skill
alirezarezvani

Data Warehouse Experimentation

97

Running experiments out of the data warehouse instead of via dedicated experiment platforms. SQL-based assignment, exposure logging discipline, metric definitions in dbt models, statistical analysis in SQL or Python, variance reduction with CUPED, sequential testing, and the operational tradeoffs vs platforms like Statsig and Optimizely. Triggers on warehouse-native experimentation, run experiments in BigQuery, run experiments in Snowflake, dbt experiments, SQL t-test, CUPED variance reduction, exposure log, sample ratio mismatch, sequential testing, mSPRT, doubly robust estimation, build vs buy experimentation. Also triggers when the team is choosing between platform and warehouse, building warehouse-native experiment infrastructure, auditing one, or running an experiment with a custom metric the platform cannot handle.

Skill
rampstackco