Nextflow Development

Skill Verified Active

Run nf-core bioinformatics pipelines (rnaseq, sarek, atacseq) on sequencing data. Use when analyzing RNA-seq, WGS/WES, or ATAC-seq data, whether from local FASTQs or from public datasets on GEO/SRA. Triggers on nf-core, Nextflow, FASTQ analysis, variant calling, gene expression, differential expression, GEO reanalysis, GSE/GSM/SRR accessions, or samplesheet creation.

Purpose

Simplify and automate complex omics data analysis for researchers by running nf-core pipelines through an AI agent.

Features

  • Automated GEO/SRA data acquisition
  • FASTQ, BAM, CRAM file processing
  • Sample sheet generation for multiple pipelines
  • Environment and resource validation
  • Pipeline execution orchestration via Nextflow
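
As a rough illustration of the acquisition feature, the sketch below shows one way the GSE/GSM/SRR accessions named in the description could be recognised before any download is attempted. The function name `classify_accession` and the exact patterns are assumptions made for this example, not the skill's actual code; real accessions come in more families than the three shown.

```python
import re
from typing import Optional

# Only the three accession prefixes mentioned in the skill description.
# Hypothetical, simplified patterns: a prefix followed by digits.
ACCESSION_PATTERNS = {
    "GEO series": re.compile(r"^GSE\d+$"),
    "GEO sample": re.compile(r"^GSM\d+$"),
    "SRA run": re.compile(r"^SRR\d+$"),
}


def classify_accession(accession: str) -> Optional[str]:
    """Return a label for a GEO/SRA accession, or None if unrecognised."""
    token = accession.strip().upper()
    for label, pattern in ACCESSION_PATTERNS.items():
        if pattern.match(token):
            return label
    return None


if __name__ == "__main__":
    for candidate in ["GSE12345", "srr1234567", "not-an-accession"]:
        print(candidate, "->", classify_accession(candidate))
```

Normalising case and whitespace first keeps the patterns strict while still accepting input pasted from a paper or a web page.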

Use Cases

  • Analyzing RNA-seq data for gene expression
  • Performing variant calling on WGS/WES data
  • Investigating chromatin accessibility with ATAC-seq
  • Reanalyzing public datasets from GEO/SRA
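
The use cases above line up with the three pipelines named in the description. A minimal sketch of that mapping follows; the dictionary keys and the `suggest_pipeline` helper are hypothetical names chosen for this example, and the skill itself may detect the data type quite differently.

```python
# Data type -> nf-core pipeline, following the pairing in the description
# (RNA-seq -> rnaseq, WGS/WES -> sarek, ATAC-seq -> atacseq).
PIPELINES = {
    "rna-seq": "nf-core/rnaseq",
    "wgs": "nf-core/sarek",
    "wes": "nf-core/sarek",
    "atac-seq": "nf-core/atacseq",
}


def suggest_pipeline(data_type: str) -> str:
    """Return the nf-core pipeline for a data type, or raise ValueError."""
    key = data_type.strip().lower()
    if key not in PIPELINES:
        raise ValueError(f"No pipeline known for data type: {data_type!r}")
    return PIPELINES[key]


if __name__ == "__main__":
    print(suggest_pipeline("RNA-seq"))
    print(suggest_pipeline("WES"))
```

Raising on an unknown data type, rather than guessing, leaves the decision with the researcher.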

Non-Goals

  • Performing the bioinformatics analysis itself (delegated to nf-core pipelines)
  • Managing computational infrastructure (relies on Nextflow/Docker)
  • Providing direct interpretation of analysis results

Workflow

  1. Acquire data (if from GEO/SRA)
  2. Check environment (Docker, Nextflow, Java)
  3. Detect data type and suggest pipeline
  4. Generate samplesheet
  5. Configure and run nf-core pipeline
  6. Verify outputs
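
Steps 4 and 5 can be sketched for the RNA-seq case as follows. The column names (`sample`, `fastq_1`, `fastq_2`, `strandedness`) follow the nf-core/rnaseq samplesheet format. The `_R1`/`_R2` file-naming convention, the helper names, and the example paths are assumptions made for illustration. A real run also needs reference-genome parameters, which are omitted here.

```python
import csv
import io
import re
from pathlib import PurePosixPath

# Assumed naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz
READ_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.fastq\.gz$")
COLUMNS = ["sample", "fastq_1", "fastq_2", "strandedness"]


def build_rows(fastq_paths):
    """Group FASTQ paths by sample and return one samplesheet row each."""
    reads_by_sample = {}
    for path in sorted(fastq_paths):
        match = READ_PATTERN.match(PurePosixPath(path).name)
        if match is None:
            continue  # skip files that do not follow the assumed convention
        reads_by_sample.setdefault(match["sample"], {})[match["read"]] = path
    rows = []
    for sample, reads in sorted(reads_by_sample.items()):
        if "1" not in reads:
            continue  # a row needs at least the R1 file
        rows.append({
            "sample": sample,
            "fastq_1": reads["1"],
            "fastq_2": reads.get("2", ""),  # left empty for single-end data
            "strandedness": "auto",
        })
    return rows


def to_csv(rows):
    """Serialise samplesheet rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def nextflow_command(pipeline, samplesheet, outdir):
    """Build (but do not execute) a Nextflow command using the Docker profile."""
    return ["nextflow", "run", pipeline, "-profile", "docker",
            "--input", samplesheet, "--outdir", outdir]


if __name__ == "__main__":
    rows = build_rows(["/data/ctrl_R1.fastq.gz", "/data/ctrl_R2.fastq.gz"])
    print(to_csv(rows), end="")
    print(" ".join(nextflow_command("nf-core/rnaseq", "samplesheet.csv", "results")))
```

Returning the command as a list instead of running it lets the agent show it to the user for confirmation before launching a long job.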

Practices

  • Bioinformatics workflow automation
  • Data acquisition and preparation
  • Pipeline execution management

Prerequisites

  • Docker installed and running
  • Nextflow version >= 23.04
  • Java version >= 11
  • Network access to NCBI, ENA, Docker Hub, and GitHub
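
A minimal sketch of how these prerequisites could be checked is shown below. It assumes that `nextflow -version` and `java -version` print a line containing the word `version` followed by the number, and that `docker info` exits with a non-zero status when the daemon is not running. The function names are hypothetical, and the skill's own validation may work differently.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

# Minimum versions from the prerequisites list, as (major, minor) tuples.
MINIMUMS = {"nextflow": (23, 4), "java": (11, 0)}
VERSION_RE = re.compile(r'version\s+"?(\d+)(?:\.(\d+))?')


def parse_version(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from tool output, or None if nothing matches."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2) or 0)


def meets_minimum(found: Optional[Tuple[int, int]], minimum: Tuple[int, int]) -> bool:
    """True when a version was found and is at least the minimum."""
    return found is not None and found >= minimum


def check_tool(name: str) -> bool:
    """Run `<name> -version` and compare the result against MINIMUMS."""
    if shutil.which(name) is None:
        return False
    result = subprocess.run([name, "-version"], capture_output=True,
                            text=True, check=False)
    # java writes its version banner to stderr, so inspect both streams.
    return meets_minimum(parse_version(result.stdout + result.stderr),
                         MINIMUMS[name])


def docker_running() -> bool:
    """True when the docker CLI exists and `docker info` succeeds."""
    if shutil.which("docker") is None:
        return False
    return subprocess.run(["docker", "info"], capture_output=True,
                          check=False).returncode == 0


if __name__ == "__main__":
    for tool in MINIMUMS:
        print(tool, "OK" if check_tool(tool) else "missing or too old")
    print("docker", "OK" if docker_running() else "not available")
```

Comparing tuples instead of strings avoids the common mistake of treating "23.10" as smaller than "23.4". Legacy Java banners such as `1.8.0` parse as (1, 8) and therefore fail the Java 11 check, as they should.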

Installation

First, add the marketplace, then install the plugin:

/plugin marketplace add anthropics/knowledge-work-plugins
/plugin install bio-research@knowledge-work-plugins

Quality Score

Verified
98/100
Analyzed 13 days ago

Trust Signals

Last commit: 14 days ago
Stars: 12.1k
License: Apache-2.0

