Nextflow Development


Run nf-core bioinformatics pipelines (rnaseq, sarek, atacseq) on sequencing data. Use when analyzing RNA-seq, WGS/WES, or ATAC-seq data, whether from local FASTQs or from public datasets on GEO/SRA. Triggers on nf-core, Nextflow, FASTQ analysis, variant calling, gene expression, differential expression, GEO reanalysis, GSE/GSM/SRR accessions, or samplesheet creation.
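
As an illustration of the accession triggers, a minimal sketch of how an input string could be classified as a GSE, GSM, or SRR accession. The function name, the category labels, and the strict "prefix plus digits" patterns are assumptions made for this example, not the skill's actual detection logic.

```python
import re

# Prefix patterns for the accession types named in the description.
# GSE = GEO series, GSM = GEO sample, SRR = SRA run.
# The category labels on the left are hypothetical names for this sketch.
ACCESSION_PATTERNS = {
    "geo_series": re.compile(r"^GSE\d+$"),
    "geo_sample": re.compile(r"^GSM\d+$"),
    "sra_run": re.compile(r"^SRR\d+$"),
}


def classify_accession(text):
    """Return the accession category for text, or None if it is not one."""
    token = text.strip().upper()
    for kind, pattern in ACCESSION_PATTERNS.items():
        if pattern.match(token):
            return kind
    return None
```

Anything that returns None here (for example a local file path) would be treated as local data rather than a public dataset.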

Purpose

Simplify and automate complex omics data analysis for researchers by driving nf-core pipelines through an AI agent.

Features

Automated GEO/SRA data acquisition
FASTQ, BAM, CRAM file processing
Sample sheet generation for multiple pipelines
Environment and resource validation
Pipeline execution orchestration via Nextflow
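
To make the samplesheet feature concrete, here is a rough sketch that pairs FASTQ files and emits a CSV for nf-core/rnaseq. It assumes the column layout that pipeline documents (sample, fastq_1, fastq_2, strandedness) and an `_R1`/`_R2` file naming convention; both should be checked against the pipeline version and data in use. The function name is hypothetical, and other pipelines such as sarek and atacseq expect different columns.

```python
import csv
import io
import re

# Assumed naming convention: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz,
# optionally with an Illumina-style "_001" suffix.
READ_TAG = re.compile(
    r"^(?P<sample>.+)_R(?P<read>[12])(?:_001)?\.(?:fastq|fq)\.gz$"
)


def build_rnaseq_samplesheet(fastq_names, strandedness="auto"):
    """Return samplesheet CSV text for the given FASTQ paths."""
    pairs = {}
    for name in sorted(fastq_names):
        basename = name.rsplit("/", 1)[-1]
        match = READ_TAG.match(basename)
        if match is None:
            continue  # skip files that do not look like FASTQ reads
        pairs.setdefault(match["sample"], {})[match["read"]] = name

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sample", "fastq_1", "fastq_2", "strandedness"])
    for sample, reads in sorted(pairs.items()):
        if "1" not in reads:
            continue  # a lone R2 file cannot form a valid row
        # Single-end samples leave fastq_2 empty.
        writer.writerow([sample, reads["1"], reads.get("2", ""), strandedness])
    return buffer.getvalue()
```

Returning the CSV as text keeps the sketch free of filesystem side effects; a caller would write the result to a file such as samplesheet.csv before launching the pipeline.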

Use Cases

Analyzing RNA-seq data for gene expression
Performing variant calling on WGS/WES data
Investigating chromatin accessibility with ATAC-seq
Reanalyzing public datasets from GEO/SRA
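
The use cases above map onto the three pipelines named in the description. A minimal sketch of that lookup follows; the assay labels and the function name are illustrative assumptions, and the skill's real data-type detection may work differently.

```python
import re

# Assay type -> nf-core pipeline, following the pairing in the description:
# RNA-seq -> rnaseq, WGS/WES -> sarek, ATAC-seq -> atacseq.
PIPELINE_BY_ASSAY = {
    "rnaseq": "nf-core/rnaseq",
    "wgs": "nf-core/sarek",
    "wes": "nf-core/sarek",
    "atacseq": "nf-core/atacseq",
}


def suggest_pipeline(assay):
    """Return the suggested pipeline for an assay label, or None if unknown."""
    # Normalize spellings such as "RNA-seq", "rna_seq", or "RNA seq".
    key = re.sub(r"[^a-z0-9]", "", assay.lower())
    return PIPELINE_BY_ASSAY.get(key)
```

Returning None for an unrecognized assay leaves the decision to the user instead of guessing a pipeline.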

Non-Goals

Performing the bioinformatics analysis itself (delegated to nf-core pipelines)
Managing computational infrastructure (relies on Nextflow/Docker)
Providing direct interpretation of analysis results

Workflow

Acquire data (if from GEO/SRA)
Check environment (Docker, Nextflow, Java)
Detect data type and suggest pipeline
Generate samplesheet
Configure and run nf-core pipeline
Verify outputs
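
The "configure and run" step can be sketched as assembling a Nextflow command line. The helper below is a hypothetical illustration, not the skill's own code. It uses Nextflow's `-profile` and `-resume` options and the `--input` and `--outdir` parameters common to nf-core pipelines, and it only builds the command without executing it.

```python
import shlex


def build_nextflow_command(pipeline, samplesheet, outdir,
                           profile="docker", resume=False):
    """Return the argument list for a basic nf-core pipeline run."""
    command = [
        "nextflow", "run", pipeline,
        "-profile", profile,      # single dash: Nextflow option
        "--input", samplesheet,   # double dash: pipeline parameter
        "--outdir", outdir,
    ]
    if resume:
        command.append("-resume")  # reuse cached results from a prior run
    return command


print(shlex.join(build_nextflow_command(
    "nf-core/rnaseq", "samplesheet.csv", "results")))
# prints: nextflow run nf-core/rnaseq -profile docker --input samplesheet.csv --outdir results
```

Real runs typically need more than this, such as reference genome parameters and a pinned pipeline revision, so the command here should be read as a starting point.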

Practices

Bioinformatics workflow automation
Data acquisition and preparation
Pipeline execution management

Prerequisites

Docker installed and running
Nextflow version >= 23.04
Java version >= 11
Network access to NCBI, ENA, Docker Hub, and GitHub
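
The version requirements above can be checked with a small comparison helper. This is a sketch under assumptions: the function names are hypothetical, and the sample version strings only approximate what `nextflow -version` and `java -version` print, which varies by release and vendor.

```python
import re

# Minimum versions taken from the prerequisites list.
MINIMUMS = {"nextflow": (23, 4), "java": (11,)}


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(tool, version_text):
    """Return True if version_text satisfies the minimum for tool."""
    parts = parse_version(version_text)
    if parts is None:
        return False
    # Legacy Java reports e.g. "1.8.0" for Java 8; drop the leading 1.
    if tool == "java" and len(parts) > 1 and parts[0] == 1:
        parts = parts[1:]
    return parts >= MINIMUMS[tool]
```

In practice the version text would come from running each tool, and Docker would additionally need a liveness check (for example, confirming that the daemon responds), which this sketch does not cover.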

Installation

First, add the marketplace:

/plugin marketplace add anthropics/knowledge-work-plugins

Then install the plugin:

/plugin install bio-research@knowledge-work-plugins

Quality Score

Verified, 98/100 (analyzed 13 days ago)

Trust Signals

Last commit: 14 days ago
GitHub owner: anthropics
Stars: 12.1k
License: Apache-2.0
