Self Eval

Skill · Verified · Active

Honestly evaluate AI work quality using a two-axis scoring system. Use after completing a task, code review, or work session to get an unbiased assessment. Detects score inflation, forces devil's advocate reasoning, and persists scores across sessions.

Purpose

Gives users honest, calibrated assessments of AI-generated work, replacing the inflated scores an AI tends to give itself by default with a more meaningful evaluation.

Features

  • Two-axis scoring (task ambition vs. execution quality)
  • Mandatory devil's advocate reasoning before scoring
  • Score persistence to a local JSONL file
  • Anti-inflation detection based on score history
  • Matrix-locked composite score generation
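As a rough illustration of how these features could fit together, here is a minimal sketch. It is not the skill's actual implementation: the three-level scale, the values in `MATRIX`, the record fields, and the `window` and `threshold` defaults are all assumptions made for the example, since this page does not document them.

```python
import json
from pathlib import Path
from statistics import mean

# Hypothetical lookup table: (ambition, execution) -> composite score.
# The skill's real matrix is not documented on this page.
MATRIX = {
    ("low", "low"): 1, ("low", "medium"): 2, ("low", "high"): 3,
    ("medium", "low"): 2, ("medium", "medium"): 3, ("medium", "high"): 4,
    ("high", "low"): 2, ("high", "medium"): 4, ("high", "high"): 5,
}


def composite(ambition, execution):
    """Read the composite from the table so it cannot be nudged by hand."""
    return MATRIX[(ambition, execution)]


def append_score(path, ambition, execution, critique):
    """Append one evaluation as a single JSON line and return the record."""
    # Assumed rule: the devil's advocate critique must exist before scoring.
    if not critique.strip():
        raise ValueError("a devil's advocate critique is required before scoring")
    record = {
        "ambition": ambition,
        "execution": execution,
        "critique": critique,
        "composite": composite(ambition, execution),
    }
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_scores(path):
    """Load all stored evaluations; a missing file means no history yet."""
    p = Path(path)
    if not p.exists():
        return []
    with p.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def looks_inflated(history, window=5, threshold=4.0):
    """Flag a full window of recent composites whose mean reaches the threshold."""
    recent = [r["composite"] for r in history[-window:]]
    return len(recent) == window and mean(recent) >= threshold
```

Appending one line per evaluation keeps the history readable across sessions without rewriting the file, and deriving the composite from a fixed table means the two axis ratings are the only inputs that can change it.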

Use cases

  • Evaluating code reviews performed by an AI
  • Assessing AI-generated task completion quality
  • Getting unbiased feedback on AI-assisted work sessions
  • Calibrating AI self-assessment tendencies

Non-goals

  • Replacing human code review entirely
  • Providing a simple numerical score without justification
  • Evaluating work quality outside of an AI coding agent context

Installation

Add the marketplace first:

/plugin marketplace add alirezarezvani/claude-skills
/plugin install engineering@claude-code-skills

Quality score

Verified
97/100
Analyzed 1 day ago

Trust signals

Last commit: 1 day ago
Stars: 14.6k
License: MIT

Similar extensions

Wrap Up Ritual

100

End-of-session ritual that audits changes, runs quality checks, captures learnings, and produces a session summary. Use when saying "wrap up", "done for the day", "finish coding", or ending a coding session.

Skill
rohitg00

Migrate Validate

100

Validate pending migrations for foreign key consistency, rollback safety, and best practices

Skill
ruvnet

Semgrep Rule Creator

100

Creates custom Semgrep rules for detecting security vulnerabilities, bug patterns, and code patterns. Use when writing Semgrep rules or building custom static analysis detections.

Skill
trailofbits

Moyu (摸鱼)

100

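Detected over-engineering patterns: (1) modifying code or files the user did not explicitly ask to change; (2) creating unrequested abstraction layers (classes, interfaces, factories, wrappers); (3) adding unrequested comments, documentation, JSDoc, or type annotations; (4) introducing unrequested dependencies; (5) rewriting an entire file instead of making a minimal edit; (6) producing a diff whose scope clearly exceeds the user's request; (7) the user sending signals such as "too much", "don't touch that", "only change X", "keep it simple", or "stop"; (8) adding error handling, validation, or defensive code for scenarios that cannot occur; (9) generating unrequested tests, configuration scaffolding, or documentation.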

Skill
uucz

Cleanup Cycles

100

Detect and untangle circular dependencies. Runs madge/skott (TS), pycycle (Py), or compiler-only checks (Go/Rust). Auto-fixes leaf-extractable cycles; reports core cycles for human review. Use when the user asks to find circular imports, fix dependency cycles, or untangle module graph. Example queries — "find circular imports", "fix dependency cycles", "untangle our module graph", "why is madge complaining".

Skill
raintree-technology

Safe Mode

100

Prevent destructive operations using Claude Code hooks. Three modes — cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.

Skill
rohitg00