Midscene Browser Automation
Verified skill
Vision-driven browser automation using Midscene. Operates from screenshots; no DOM or accessibility labels needed. Runs in headless Puppeteer and does NOT take over the user's mouse or keyboard. Also supports CDP mode and Bridge mode to connect to an existing Chrome.
Use this skill when the user wants to:
- Browse, navigate, or open web pages
- Scrape, extract, or collect data from websites
- Fill out forms, click buttons, or interact with web elements
- Verify, validate, test, or QA frontend UI behavior
- Take screenshots of web pages
- Automate multi-step web workflows
- Test what was just built, see if it works in browser
- Connect to Chrome via CDP, DevTools Protocol, or remote debugging
- Connect to user's Chrome browser, control my browser, operate my Chrome
Powered by Midscene.js (https://midscenejs.com)
This skill leverages Midscene.js to control web browsers from screenshots, supporting headless Puppeteer, CDP, and Bridge modes. It allows for complex interactions like form filling, data scraping, and UI testing, with clear instructions for setup and usage.
Security
- info: Secret Management. The documentation mentions environment variables for API keys, which is standard practice for secret management, and advises against committing secrets to files. No secrets are hardcoded in the provided skill files.
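The environment-variable approach noted above can be sketched roughly as follows. This is a minimal illustration, not the skill's own code: `requireEnv`, `redact`, and the variable name `EXAMPLE_API_KEY` are hypothetical, and the names the skill actually expects should be taken from its documentation.

```typescript
// Hypothetical helpers for illustration only; not part of Midscene or the skill.
type Env = Record<string, string | undefined>;

// Returns the requested variables, or throws with a list of the missing ones.
function requireEnv(names: string[], env: Env): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const result: Record<string, string> = {};
  for (const name of names) {
    result[name] = env[name] as string;
  }
  return result;
}

// Masks a secret for logging, keeping only its last four characters.
function redact(secret: string): string {
  if (secret.length <= 4) {
    return "****";
  }
  return "****" + secret.slice(-4);
}
```

In a Node.js script one would typically pass `process.env` as the second argument, e.g. `requireEnv(["EXAMPLE_API_KEY"], process.env)`, so that keys come from the shell or an untracked env file rather than from committed source.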
Versioning
- warning: Release Management. No explicit version information (e.g., a version field in the frontmatter or a CHANGELOG) is present for this skill, and the installation instructions might point to the 'main' branch.
Compliance
- info: GDPR. The skill operates on web pages, which may contain personal data. While the skill itself doesn't submit data to third parties, the underlying LLM processing and browser interactions could expose personal data if not handled carefully by the user or the model.
Installation
npx skills add web-infra-dev/midscene-skills
Runs the Vercel skills CLI (skills.sh) via npx. Requires Node.js locally and at least one installed skills-compatible agent (Claude Code, Cursor, Codex, …). Assumes the repo follows the agentskills.io format.
Similar extensions
Browser Tools (92)
Minimal Chrome DevTools Protocol tools for browser automation and scraping. Use when you need to start Chrome, navigate pages, execute JavaScript, take screenshots, or interactively pick DOM elements. Triggers include "browse website", "scrape page", "take screenshot", "automate browser", "extract DOM", "web scraping".
Bright Data CLI (99)
Guide for using the Bright Data CLI (`brightdata` / `bdata`) to scrape websites, search the web, extract structured data from 40+ platforms, manage proxy zones, and check account budget. Use this skill whenever the user wants to scrape a URL, search Google/Bing/Yandex, extract data from Amazon/LinkedIn/Instagram/TikTok/YouTube/Reddit or any other platform, check their Bright Data balance or zones, or do anything involving web data collection from the terminal. Also trigger when the user mentions brightdata, bdata, web scraping CLI, SERP API, or wants to install Bright Data skills into their coding agent.
Browser Automation (95)
Automate web browser interactions, scraping, testing, and workflow automation with Puppeteer/Playwright.
Headless Browser (90)
Headless browser automation, CDP discovery, and cross-browser cookie extraction. Internal building block for the `react-doctor browser ...` CLI.
PPTX Generator (99)
Create and manipulate PowerPoint PPTX files programmatically. Use when the user needs to generate presentations, modify PPTX templates, extract slide content, create thumbnail previews, or automate PowerPoint workflows. Supports both template-based generation (for branding compliance) and from-scratch creation. Keywords: PowerPoint, PPTX, presentation, slides, template, deck, slideshow, corporate, branding.
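Day 2: Build Your Own Context Sync Skill (98)
AI Native Camp Day 2: building a Context Sync skill. You build your own skill that collects context from multiple external tools and combines it into a single sync document. Use for requests mentioning "2일차" (Day 2), "Day 2", "context sync", "컨텍스트 싱크" (context sync), "sync 스킬" (sync skill), "스킬 만들기" (build a skill), or "정보 수집 스킬" (information-gathering skill).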