Self Eval

Skill Verified Active

Honestly evaluate AI work quality using a two-axis scoring system. Use after completing a task, code review, or work session to get an unbiased assessment. Detects score inflation, forces devil's advocate reasoning, and persists scores across sessions.

Purpose

To give users honest, calibrated assessments of AI-generated work in place of the inflated scores an AI agent tends to assign its own output by default.

Features

  • Two-axis scoring (task ambition vs. execution quality)
  • Mandatory devil's advocate reasoning before scoring
  • Score persistence to a local JSONL file
  • Anti-inflation detection based on score history
  • Matrix-locked composite score generation
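
To make the features above concrete, here is a minimal sketch of how they could fit together. It is not the skill's actual implementation: the 1-3 scale on each axis, the values in `COMPOSITE`, the record fields, and the inflation rule (every score in the recent window sits near the top of the matrix) are all assumptions chosen for illustration.

```python
import json
from pathlib import Path

# Hypothetical lookup table: rows are task ambition (1-3), columns are
# execution quality (1-3). "Matrix-locked" is read here as: the composite
# can only come from this table, never from a free-form number.
COMPOSITE = [
    [1, 2, 3],  # ambition 1
    [2, 4, 6],  # ambition 2
    [3, 6, 9],  # ambition 3
]


def composite_score(ambition: int, execution: int) -> int:
    """Return the composite for a pair of axis scores, rejecting off-scale input."""
    if ambition not in (1, 2, 3) or execution not in (1, 2, 3):
        raise ValueError("both axes must be scored 1, 2, or 3")
    return COMPOSITE[ambition - 1][execution - 1]


def append_score(path: Path, ambition: int, execution: int, critique: str) -> dict:
    """Append one evaluation to a JSONL file; a non-empty critique is mandatory."""
    if not critique.strip():
        raise ValueError("a devil's advocate critique is required before scoring")
    record = {
        "ambition": ambition,
        "execution": execution,
        "composite": composite_score(ambition, execution),
        "critique": critique,
    }
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def load_scores(path: Path) -> list[dict]:
    """Read the score history back, one JSON object per line."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def looks_inflated(history: list[dict], window: int = 5, threshold: float = 0.8) -> bool:
    """Flag the history when every one of the last `window` composites is near the maximum."""
    recent = history[-window:]
    if len(recent) < window:
        return False  # not enough history to judge
    top = max(max(row) for row in COMPOSITE)
    return all(r["composite"] >= threshold * top for r in recent)
```

Under these assumptions, a caller would record each evaluation with `append_score(Path("scores.jsonl"), 2, 3, "Edge cases were not tested.")` and check `looks_inflated(load_scores(Path("scores.jsonl")))` before accepting a new high score. The file name `scores.jsonl` is also hypothetical.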

Use Cases

  • Evaluating code reviews performed by an AI
  • Assessing AI-generated task completion quality
  • Getting unbiased feedback on AI-assisted work sessions
  • Calibrating AI self-assessment tendencies

Non-Goals

  • Replacing human code review entirely
  • Providing a simple numerical score without justification
  • Evaluating work quality outside of an AI coding agent context

Installation

First, add the marketplace, then install the plugin:

/plugin marketplace add alirezarezvani/claude-skills
/plugin install engineering@claude-code-skills

Quality Score

Verified
97/100
Analyzed about 19 hours ago

Trust Signals

Last commit: about 22 hours ago
Stars: 14.6k
License: MIT

Similar Extensions

Wrap Up Ritual

100

End-of-session ritual that audits changes, runs quality checks, captures learnings, and produces a session summary. Use when saying "wrap up", "done for the day", "finish coding", or ending a coding session.

Skill
rohitg00

Migrate Validate

100

Validate pending migrations for foreign key consistency, rollback safety, and best practices

Skill
ruvnet

Semgrep Rule Creator

100

Creates custom Semgrep rules for detecting security vulnerabilities, bug patterns, and code patterns. Use when writing Semgrep rules or building custom static analysis detections.

Skill
trailofbits

Moyu (摸鱼)

100

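Activates automatically when over-engineering patterns are detected: (1) modifying code or files the user did not explicitly ask to change (2) creating new abstraction layers (class, interface, factory, wrapper) that were not requested (3) adding comments, documentation, JSDoc, or type annotations that were not requested (4) introducing new dependencies that were not requested (5) rewriting an entire file instead of making a minimal edit (6) producing a diff whose scope clearly exceeds the user's request (7) the user sending signals such as "too much", "don't touch that", "only change X", "keep it simple", or "stop" (8) adding error handling, validation, or defensive code for scenarios that cannot occur (9) generating tests, config scaffolding, or documentation that were not requested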

Skill
uucz

Cleanup Cycles

100

Detect and untangle circular dependencies. Runs madge/skott (TS), pycycle (Py), or compiler-only checks (Go/Rust). Auto-fixes leaf-extractable cycles; reports core cycles for human review. Use when the user asks to find circular imports, fix dependency cycles, or untangle module graph. Example queries — "find circular imports", "fix dependency cycles", "untangle our module graph", "why is madge complaining".

Skill
raintree-technology

Safe Mode

100

Prevent destructive operations using Claude Code hooks. Three modes — cautious (warn on dangerous commands), lockdown (restrict edits to one directory), and clear (remove restrictions). Uses PreToolUse matchers for Bash, Edit, and Write.

Skill
rohitg00

© 2025 SkillRepo · Find the right skill, skip the noise.