
Dask

Skill: verified, active

Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, and integration with existing pandas code. For out-of-core analytics on a single machine, use vaex; for in-memory speed, use polars.

Purpose

Lets users scale existing pandas and NumPy workflows beyond memory limits or across clusters using the Dask library.

Features

Larger-than-RAM execution on single machines
Parallel processing across multiple cores
Distributed computation for terabyte-scale datasets
Familiar pandas/NumPy APIs for DataFrames and Arrays
Task-based parallelization with Futures
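
The "familiar NumPy API" and lazy, chunked execution listed above can be sketched roughly as follows. This is a minimal illustration that assumes `dask` and `numpy` are installed; the array size, chunk size, and variable names are hypothetical and not taken from the skill itself.

```python
import numpy as np
import dask.array as da

# Illustrative size only; real workloads would be far larger.
np_data = np.arange(1_000_000, dtype="float64")

# Split the array into 10 chunks of 100,000 elements each.
# Each chunk becomes an independent task that Dask can schedule in parallel.
x = da.from_array(np_data, chunks=100_000)

# NumPy-style operations build a lazy task graph; nothing runs yet.
lazy_mean = (x ** 2).mean()

# .compute() executes the graph and returns a concrete value.
dask_result = lazy_mean.compute()

# The same calculation in plain NumPy, for comparison.
numpy_result = (np_data ** 2).mean()
```

Here the data already fits in memory, so the example only demonstrates the API shape. For data that truly exceeds RAM, the array would normally be built from on-disk storage rather than from an in-memory NumPy array.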

Use cases

Process datasets that exceed available RAM
Scale pandas or NumPy operations to larger datasets
Parallelize computations for performance improvements
Process multiple files efficiently (CSVs, Parquet, JSON, text logs)

Non-goals

Out-of-core analytics on a single machine (use vaex)
In-memory speed optimization (use polars)

Installation

npx skills add K-Dense-AI/claude-scientific-skills

Runs the Vercel skills CLI (skills.sh) through npx. Requires a local Node.js installation and at least one skills-compatible agent (Claude Code, Cursor, Codex, etc.). Assumes the repository follows the agentskills.io format.

Quality score

Verified

98/100

Analyzed 1 day ago

Trust signals

Last commit: 3 days ago

GitHub owner: K-Dense-AI

Stars: 21k

License: MIT

Website: k-dense.ai


Similar extensions

Dask Data Science

Part of the AlterLab Academic Skills suite. Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, and integration with existing pandas code. For out-of-core analytics on a single machine, use vaex; for in-memory speed, use polars.

Skill

AlterLab-IEU

Ray Data

Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow. Scales from single machine to 100s of nodes. Use for batch inference, data preprocessing, multi-modal data loading, or distributed ETL pipelines.

Skill

Orchestra-Research

Create Spatial Visualization

100

Create interactive maps, elevation profiles, and spatial visualizations from GPX tracks, waypoints, or route data using R (sf, leaflet, tmap) or Observable (D3, deck.gl). Covers data import, coordinate system handling, map styling, and export to HTML or image formats. Use when visualizing a planned or completed tour route on an interactive map, creating elevation profiles for hiking or cycling routes, overlaying waypoints and POIs on a basemap, or building a web-based trip dashboard.

Skill

pjt222

Embedding Strategies

100

Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.

Skill

wshobson

AWS CDK Development

100

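AWS Cloud Development Kit (CDK) expert for building cloud infrastructure with TypeScript/Python. Use when creating CDK stacks, defining CDK constructs, implementing infrastructure as code, or when the user mentions CDK, CloudFormation, IaC, cdk synth, cdk deploy, or wants to define AWS infrastructure programmatically. Covers CDK app structure, construct patterns, stack composition, and deployment workflows.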

Skill

zxkane

Fit Drift Diffusion Model

100

Fit cognitive drift-diffusion models (Ratcliff DDM) to reaction time and accuracy data with parameter estimation (drift rate, boundary separation, non-decision time), model comparison, and parameter recovery validation. Use when modeling binary decision-making with reaction time data, estimating cognitive parameters from experimental data, comparing sequential sampling model variants, or decomposing speed-accuracy tradeoff effects into latent cognitive components.

Skill

pjt222