From 517e08c9545d7638fd9fd7c2f3bb20b73b73e4f7 Mon Sep 17 00:00:00 2001
From: Krzysztof kuhy Rudnicki
Date: Thu, 7 May 2026 22:08:00 +0200
Subject: [PATCH] style(prettier): apply markdown/json formatting updates

---
 .github/agents/code-reviewer.md | 10 ++
 .github/agents/security-auditor.md | 22 ++-
 .github/agents/test-engineer.md | 21 ++-
 .../skills/code-review-and-quality/SKILL.md | 51 ++++---
 .../skills/spec-driven-development/SKILL.md | 28 +++-
 .../skills/test-driven-development/SKILL.md | 119 +++++++--------
 .../contracts/agent-automation-bootstrap.json | 9 +-
 .../contracts/run-sh-wrapper-smoke.json | 9 +-
 docs/superpowers/contracts/template.json | 9 +-
 .../evidence/agent-automation-bootstrap.json | 20 +--
 .../evidence/run-sh-wrapper-smoke.json | 20 +--
 docs/superpowers/evidence/template.json | 15 +-
 .../memory/verification_playbook.json | 6 +-
 .../agent-skills/.claude/commands/ship.md | 6 +
 .../agent-skills/.claude/commands/spec.md | 1 +
 .../agent-skills/.claude/commands/test.md | 2 +
 third_party/agent-skills/AGENTS.md | 25 ++-
 third_party/agent-skills/README.md | 105 +++++++------
 third_party/agent-skills/agents/README.md | 29 ++--
 .../agent-skills/agents/code-reviewer.md | 10 ++
 .../agent-skills/agents/security-auditor.md | 22 ++-
 .../agent-skills/agents/test-engineer.md | 21 ++-
 .../agent-skills/docs/copilot-setup.md | 5 +
 .../agent-skills/docs/gemini-cli-setup.md | 20 +--
 .../agent-skills/docs/getting-started.md | 39 ++---
 .../agent-skills/docs/opencode-setup.md | 6 +
 .../agent-skills/docs/skill-anatomy.md | 22 ++-
 third_party/agent-skills/hooks/SDD-CACHE.md | 21 ++-
 .../agent-skills/hooks/SIMPLIFY-IGNORE.md | 33 ++--
 .../references/accessibility-checklist.md | 42 +++---
 .../references/orchestration-patterns.md | 56 ++++---
 .../references/performance-checklist.md | 39 +++--
 .../references/security-checklist.md | 40 ++---
 .../references/testing-patterns.md | 142 +++++++++---------
 .../skills/api-and-interface-design/SKILL.md | 72 +++++----
.../browser-testing-with-devtools/SKILL.md | 43 +++--- .../skills/ci-cd-and-automation/SKILL.md | 131 ++++++++-------- .../skills/code-review-and-quality/SKILL.md | 51 ++++--- .../skills/code-simplification/SKILL.md | 91 ++++++----- .../skills/context-engineering/SKILL.md | 52 ++++--- .../debugging-and-error-recovery/SKILL.md | 24 +-- .../skills/deprecation-and-migration/SKILL.md | 25 +-- .../skills/documentation-and-adrs/SKILL.md | 55 ++++--- .../skills/frontend-ui-engineering/SKILL.md | 77 ++++++---- .../git-workflow-and-versioning/SKILL.md | 21 +-- .../agent-skills/skills/idea-refine/SKILL.md | 8 + .../skills/idea-refine/examples.md | 40 ++--- .../skills/idea-refine/frameworks.md | 4 +- .../skills/idea-refine/refinement-criteria.md | 23 ++- .../incremental-implementation/SKILL.md | 18 ++- .../skills/performance-optimization/SKILL.md | 77 +++++----- .../planning-and-task-breakdown/SKILL.md | 47 ++++-- .../skills/security-and-hardening/SKILL.md | 135 ++++++++++------- .../skills/shipping-and-launch/SKILL.md | 37 +++-- .../skills/source-driven-development/SKILL.md | 29 ++-- .../skills/spec-driven-development/SKILL.md | 28 +++- .../skills/test-driven-development/SKILL.md | 119 +++++++-------- .../skills/using-agent-skills/SKILL.md | 42 +++--- 58 files changed, 1289 insertions(+), 985 deletions(-) diff --git a/.github/agents/code-reviewer.md b/.github/agents/code-reviewer.md index 3bce85c..98e88ca 100644 --- a/.github/agents/code-reviewer.md +++ b/.github/agents/code-reviewer.md @@ -12,18 +12,21 @@ You are an experienced Staff Engineer conducting a thorough code review. Your ro Evaluate every change across these five dimensions: ### 1. Correctness + - Does the code do what the spec/task says it should? - Are edge cases handled (null, empty, boundary values, error paths)? - Do the tests actually verify the behavior? Are they testing the right things? - Are there race conditions, off-by-one errors, or state inconsistencies? ### 2. 
Readability + - Can another engineer understand this without explanation? - Are names descriptive and consistent with project conventions? - Is the control flow straightforward (no deeply nested logic)? - Is the code well-organized (related code grouped, clear boundaries)? ### 3. Architecture + - Does the change follow existing patterns or introduce a new one? - If a new pattern, is it justified and documented? - Are module boundaries maintained? Any circular dependencies? @@ -31,6 +34,7 @@ Evaluate every change across these five dimensions: - Are dependencies flowing in the right direction? ### 4. Security + - Is user input validated and sanitized at system boundaries? - Are secrets kept out of code, logs, and version control? - Is authentication/authorization checked where needed? @@ -38,6 +42,7 @@ Evaluate every change across these five dimensions: - Any new dependencies with known vulnerabilities? ### 5. Performance + - Any N+1 query patterns? - Any unbounded loops or unconstrained data fetching? - Any synchronous operations that should be async? @@ -64,18 +69,23 @@ Categorize every finding: **Overview:** [1-2 sentences summarizing the change and overall assessment] ### Critical Issues + - [File:line] [Description and recommended fix] ### Important Issues + - [File:line] [Description and recommended fix] ### Suggestions + - [File:line] [Description] ### What's Done Well + - [Positive observation — always include at least one] ### Verification Story + - Tests reviewed: [yes/no, observations] - Build verified: [yes/no] - Security checked: [yes/no, observations] diff --git a/.github/agents/security-auditor.md b/.github/agents/security-auditor.md index 07bc30b..0fef188 100644 --- a/.github/agents/security-auditor.md +++ b/.github/agents/security-auditor.md @@ -10,6 +10,7 @@ You are an experienced Security Engineer conducting a security review. Your role ## Review Scope ### 1. Input Handling + - Is all user input validated at system boundaries? 
- Are there injection vectors (SQL, NoSQL, OS command, LDAP)? - Is HTML output encoded to prevent XSS? @@ -17,6 +18,7 @@ You are an experienced Security Engineer conducting a security review. Your role - Are URL redirects validated against an allowlist? ### 2. Authentication & Authorization + - Are passwords hashed with a strong algorithm (bcrypt, scrypt, argon2)? - Are sessions managed securely (httpOnly, secure, sameSite cookies)? - Is authorization checked on every protected endpoint? @@ -25,6 +27,7 @@ You are an experienced Security Engineer conducting a security review. Your role - Is rate limiting applied to authentication endpoints? ### 3. Data Protection + - Are secrets in environment variables (not code)? - Are sensitive fields excluded from API responses and logs? - Is data encrypted in transit (HTTPS) and at rest (if required)? @@ -32,6 +35,7 @@ You are an experienced Security Engineer conducting a security review. Your role - Are database backups encrypted? ### 4. Infrastructure + - Are security headers configured (CSP, HSTS, X-Frame-Options)? - Is CORS restricted to specific origins? - Are dependencies audited for known vulnerabilities? @@ -39,6 +43,7 @@ You are an experienced Security Engineer conducting a security review. Your role - Is the principle of least privilege applied to service accounts? ### 5. Third-Party Integrations + - Are API keys and tokens stored securely? - Are webhook payloads verified (signature validation)? - Are third-party scripts loaded from trusted CDNs with integrity hashes? @@ -46,13 +51,13 @@ You are an experienced Security Engineer conducting a security review. 
Your role ## Severity Classification -| Severity | Criteria | Action | -|----------|----------|--------| +| Severity | Criteria | Action | +| ------------ | ------------------------------------------------------------- | ------------------------------ | | **Critical** | Exploitable remotely, leads to data breach or full compromise | Fix immediately, block release | -| **High** | Exploitable with some conditions, significant data exposure | Fix before release | -| **Medium** | Limited impact or requires authenticated access to exploit | Fix in current sprint | -| **Low** | Theoretical risk or defense-in-depth improvement | Schedule for next sprint | -| **Info** | Best practice recommendation, no current risk | Consider adopting | +| **High** | Exploitable with some conditions, significant data exposure | Fix before release | +| **Medium** | Limited impact or requires authenticated access to exploit | Fix in current sprint | +| **Low** | Theoretical risk or defense-in-depth improvement | Schedule for next sprint | +| **Info** | Best practice recommendation, no current risk | Consider adopting | ## Output Format @@ -60,6 +65,7 @@ You are an experienced Security Engineer conducting a security review. Your role ## Security Audit Report ### Summary + - Critical: [count] - High: [count] - Medium: [count] @@ -68,6 +74,7 @@ You are an experienced Security Engineer conducting a security review. Your role ### Findings #### [CRITICAL] [Finding title] + - **Location:** [file:line] - **Description:** [What the vulnerability is] - **Impact:** [What an attacker could do] @@ -75,12 +82,15 @@ You are an experienced Security Engineer conducting a security review. Your role - **Recommendation:** [Specific fix with code example] #### [HIGH] [Finding title] + ... 
### Positive Observations + - [Security practices done well] ### Recommendations + - [Proactive improvements to consider] ``` diff --git a/.github/agents/test-engineer.md b/.github/agents/test-engineer.md index 3e2c6be..bf19149 100644 --- a/.github/agents/test-engineer.md +++ b/.github/agents/test-engineer.md @@ -12,6 +12,7 @@ You are an experienced QA Engineer focused on test strategy and quality assuranc ### 1. Analyze Before Writing Before writing any test: + - Read the code being tested to understand its behavior - Identify the public API / interface (what to test) - Identify edge cases and error paths @@ -30,6 +31,7 @@ Test at the lowest level that captures the behavior. Don't write E2E tests for t ### 3. Follow the Prove-It Pattern for Bugs When asked to write a test for a bug: + 1. Write a test that demonstrates the bug (must FAIL with current code) 2. Confirm the test fails 3. Report the test is ready for the fix implementation @@ -48,13 +50,13 @@ describe('[Module/Function name]', () => { For every function or component: -| Scenario | Example | -|----------|---------| -| Happy path | Valid input produces expected output | -| Empty input | Empty string, empty array, null, undefined | -| Boundary values | Min, max, zero, negative | -| Error paths | Invalid input, network failure, timeout | -| Concurrency | Rapid repeated calls, out-of-order responses | +| Scenario | Example | +| --------------- | -------------------------------------------- | +| Happy path | Valid input produces expected output | +| Empty input | Empty string, empty array, null, undefined | +| Boundary values | Min, max, zero, negative | +| Error paths | Invalid input, network failure, timeout | +| Concurrency | Rapid repeated calls, out-of-order responses | ## Output Format @@ -64,14 +66,17 @@ When analyzing test coverage: ## Test Coverage Analysis ### Current Coverage -- [X] tests covering [Y] functions/components + +- [x] tests covering [Y] functions/components - Coverage gaps identified: 
[list] ### Recommended Tests + 1. **[Test name]** — [What it verifies, why it matters] 2. **[Test name]** — [What it verifies, why it matters] ### Priority + - Critical: [Tests that catch potential data loss or security issues] - High: [Tests for core business logic] - Medium: [Tests for edge cases and error handling] diff --git a/.github/skills/code-review-and-quality/SKILL.md b/.github/skills/code-review-and-quality/SKILL.md index fcf77dd..33c0849 100644 --- a/.github/skills/code-review-and-quality/SKILL.md +++ b/.github/skills/code-review-and-quality/SKILL.md @@ -94,12 +94,12 @@ Small, focused changes are easier to review, faster to merge, and safer to deplo **Splitting strategies when a change is too large:** -| Strategy | How | When | -|----------|-----|------| -| **Stack** | Submit a small change, start the next one based on it | Sequential dependencies | -| **By file group** | Separate changes for groups needing different reviewers | Cross-cutting concerns | -| **Horizontal** | Create shared code/stubs first, then consumers | Layered architecture | -| **Vertical** | Break into smaller full-stack slices of the feature | Feature work | +| Strategy | How | When | +| ----------------- | ------------------------------------------------------- | ----------------------- | +| **Stack** | Submit a small change, start the next one based on it | Sequential dependencies | +| **By file group** | Separate changes for groups needing different reviewers | Cross-cutting concerns | +| **Horizontal** | Create shared code/stubs first, then consumers | Layered architecture | +| **Vertical** | Break into smaller full-stack slices of the feature | Feature work | **When large changes are acceptable:** Complete file deletions and automated refactoring where the reviewer only needs to verify intent, not every line. 
@@ -156,13 +156,13 @@ For each file changed: Label every comment with its severity so the author knows what's required vs optional: -| Prefix | Meaning | Author Action | -|--------|---------|---------------| -| *(no prefix)* | Required change | Must address before merge | -| **Critical:** | Blocks merge | Security vulnerability, data loss, broken functionality | -| **Nit:** | Minor, optional | Author may ignore — formatting, style preferences | -| **Optional:** / **Consider:** | Suggestion | Worth considering but not required | -| **FYI** | Informational only | No action needed — context for future reference | +| Prefix | Meaning | Author Action | +| ----------------------------- | ------------------ | ------------------------------------------------------- | +| _(no prefix)_ | Required change | Must address before merge | +| **Critical:** | Blocks merge | Security vulnerability, data loss, broken functionality | +| **Nit:** | Minor, optional | Author may ignore — formatting, style preferences | +| **Optional:** / **Consider:** | Suggestion | Worth considering but not required | +| **FYI** | Informational only | No action needed — context for future reference | This prevents authors from treating all feedback as mandatory and wasting time on optional suggestions. @@ -198,6 +198,7 @@ Human makes the final call This catches issues that a single model might miss — different models have different blind spots. **Example prompt for a review agent:** + ``` Review this code change for correctness, security, and adherence to our project conventions. The spec says [X]. The change should [Y]. @@ -257,6 +258,7 @@ When reviewing code — whether written by you, another agent, or a human: Part of code review is dependency review: **Before adding any dependency:** + 1. Does the existing stack solve this? (Often it does.) 2. How large is the dependency? (Check bundle impact.) 3. Is it actively maintained? (Check last commit, open issues.) 
@@ -271,25 +273,30 @@ Part of code review is dependency review: ## Review: [PR/Change title] ### Context + - [ ] I understand what this change does and why ### Correctness + - [ ] Change matches spec/task requirements - [ ] Edge cases handled - [ ] Error paths handled - [ ] Tests cover the change adequately ### Readability + - [ ] Names are clear and consistent - [ ] Logic is straightforward - [ ] No unnecessary complexity ### Architecture + - [ ] Follows existing patterns - [ ] No unnecessary coupling or dependencies - [ ] Appropriate abstraction level ### Security + - [ ] No secrets in code - [ ] Input validated at boundaries - [ ] No injection vulnerabilities @@ -297,19 +304,23 @@ Part of code review is dependency review: - [ ] External data sources treated as untrusted ### Performance + - [ ] No N+1 patterns - [ ] No unbounded operations - [ ] Pagination on list endpoints ### Verification + - [ ] Tests pass - [ ] Build succeeds - [ ] Manual verification done (if applicable) ### Verdict + - [ ] **Approve** — Ready to merge - [ ] **Request changes** — Issues must be addressed ``` + ## See Also - For detailed security review guidance, see `references/security-checklist.md` @@ -317,13 +328,13 @@ Part of code review is dependency review: ## Common Rationalizations -| Rationalization | Reality | -|---|---| -| "It works, that's good enough" | Working code that's unreadable, insecure, or architecturally wrong creates debt that compounds. | -| "I wrote it, so I know it's correct" | Authors are blind to their own assumptions. Every change benefits from another set of eyes. | -| "We'll clean it up later" | Later never comes. The review is the quality gate — use it. Require cleanup before merge, not after. | -| "AI-generated code is probably fine" | AI code needs more scrutiny, not less. It's confident and plausible, even when wrong. | -| "The tests pass, so it's good" | Tests are necessary but not sufficient. 
They don't catch architecture problems, security issues, or readability concerns. | +| Rationalization | Reality | +| ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------- | +| "It works, that's good enough" | Working code that's unreadable, insecure, or architecturally wrong creates debt that compounds. | +| "I wrote it, so I know it's correct" | Authors are blind to their own assumptions. Every change benefits from another set of eyes. | +| "We'll clean it up later" | Later never comes. The review is the quality gate — use it. Require cleanup before merge, not after. | +| "AI-generated code is probably fine" | AI code needs more scrutiny, not less. It's confident and plausible, even when wrong. | +| "The tests pass, so it's good" | Tests are necessary but not sufficient. They don't catch architecture problems, security issues, or readability concerns. | ## Red Flags diff --git a/.github/skills/spec-driven-development/SKILL.md b/.github/skills/spec-driven-development/SKILL.md index 3922346..3b7f78c 100644 --- a/.github/skills/spec-driven-development/SKILL.md +++ b/.github/skills/spec-driven-development/SKILL.md @@ -46,13 +46,14 @@ ASSUMPTIONS I'M MAKING: → Correct me now or I'll proceed with these. ``` -Don't silently fill in ambiguous requirements. The spec's entire purpose is to surface misunderstandings *before* code gets written — assumptions are the most dangerous form of misunderstanding. +Don't silently fill in ambiguous requirements. The spec's entire purpose is to surface misunderstandings _before_ code gets written — assumptions are the most dangerous form of misunderstanding. **Write a spec document covering these six core areas:** 1. **Objective** — What are we building and why? Who is the user? What does success look like? 2. **Commands** — Full executable commands with flags, not just tool names. 
+ ``` Build: npm run build Test: npm test -- --coverage @@ -61,6 +62,7 @@ Don't silently fill in ambiguous requirements. The spec's entire purpose is to s ``` 3. **Project Structure** — Where source code lives, where tests go, where docs belong. + ``` src/ → Application source code src/components → React components @@ -85,32 +87,41 @@ Don't silently fill in ambiguous requirements. The spec's entire purpose is to s # Spec: [Project/Feature Name] ## Objective + [What we're building and why. User stories or acceptance criteria.] ## Tech Stack + [Framework, language, key dependencies with versions] ## Commands + [Build, test, lint, dev — full commands] ## Project Structure + [Directory layout with descriptions] ## Code Style + [Example snippet + key conventions] ## Testing Strategy + [Framework, test locations, coverage requirements, test levels] ## Boundaries + - Always: [...] - Ask first: [...] - Never: [...] ## Success Criteria + [How we'll know this is done — specific, testable conditions] ## Open Questions + [Anything unresolved that needs human input] ``` @@ -151,6 +162,7 @@ Break the plan into discrete, implementable tasks: - No task should require changing more than ~5 files **Task template:** + ```markdown - [ ] Task: [Description] - Acceptance: [What must be true when done] @@ -173,13 +185,13 @@ The spec is a living document, not a one-time artifact: ## Common Rationalizations -| Rationalization | Reality | -|---|---| -| "This is simple, I don't need a spec" | Simple tasks don't need *long* specs, but they still need acceptance criteria. A two-line spec is fine. | -| "I'll write the spec after I code it" | That's documentation, not specification. The spec's value is in forcing clarity *before* code. | -| "The spec will slow us down" | A 15-minute spec prevents hours of rework. Waterfall in 15 minutes beats debugging in 15 hours. | -| "Requirements will change anyway" | That's why the spec is a living document. An outdated spec is still better than no spec. 
| -| "The user knows what they want" | Even clear requests have implicit assumptions. The spec surfaces those assumptions. | +| Rationalization | Reality | +| ------------------------------------- | ------------------------------------------------------------------------------------------------------- | +| "This is simple, I don't need a spec" | Simple tasks don't need _long_ specs, but they still need acceptance criteria. A two-line spec is fine. | +| "I'll write the spec after I code it" | That's documentation, not specification. The spec's value is in forcing clarity _before_ code. | +| "The spec will slow us down" | A 15-minute spec prevents hours of rework. Waterfall in 15 minutes beats debugging in 15 hours. | +| "Requirements will change anyway" | That's why the spec is a living document. An outdated spec is still better than no spec. | +| "The user knows what they want" | Even clear requests have implicit assumptions. The spec surfaces those assumptions. | ## Red Flags diff --git a/.github/skills/test-driven-development/SKILL.md b/.github/skills/test-driven-development/SKILL.md index c96a67f..2791b30 100644 --- a/.github/skills/test-driven-development/SKILL.md +++ b/.github/skills/test-driven-development/SKILL.md @@ -38,13 +38,13 @@ Write the test first. It must fail. 
A test that passes immediately proves nothin ```typescript // RED: This test fails because createTask doesn't exist yet -describe('TaskService', () => { - it('creates a task with title and default status', async () => { - const task = await taskService.createTask({ title: 'Buy groceries' }); +describe("TaskService", () => { + it("creates a task with title and default status", async () => { + const task = await taskService.createTask({ title: "Buy groceries" }); expect(task.id).toBeDefined(); - expect(task.title).toBe('Buy groceries'); - expect(task.status).toBe('pending'); + expect(task.title).toBe("Buy groceries"); + expect(task.status).toBe("pending"); expect(task.createdAt).toBeInstanceOf(Date); }); }); @@ -60,7 +60,7 @@ export async function createTask(input: { title: string }): Promise { const task = { id: generateId(), title: input.title, - status: 'pending' as const, + status: "pending" as const, createdAt: new Date(), }; await db.tasks.insert(task); @@ -108,19 +108,19 @@ Bug report arrives // Bug: "Completing a task doesn't update the completedAt timestamp" // Step 1: Write the reproduction test (it should FAIL) -it('sets completedAt when task is completed', async () => { - const task = await taskService.createTask({ title: 'Test' }); +it("sets completedAt when task is completed", async () => { + const task = await taskService.createTask({ title: "Test" }); const completed = await taskService.completeTask(task.id); - expect(completed.status).toBe('completed'); - expect(completed.completedAt).toBeInstanceOf(Date); // This fails → bug confirmed + expect(completed.status).toBe("completed"); + expect(completed.completedAt).toBeInstanceOf(Date); // This fails → bug confirmed }); // Step 2: Fix the bug export async function completeTask(id: string): Promise { return db.tasks.update(id, { - status: 'completed', - completedAt: new Date(), // This was missing + status: "completed", + completedAt: new Date(), // This was missing }); } @@ -150,11 +150,11 @@ Invest 
testing effort according to the pyramid — most tests should be small an Beyond the pyramid levels, classify tests by what resources they consume: -| Size | Constraints | Speed | Example | -|------|------------|-------|---------| -| **Small** | Single process, no I/O, no network, no database | Milliseconds | Pure function tests, data transforms | -| **Medium** | Multi-process OK, localhost only, no external services | Seconds | API tests with test DB, component tests | -| **Large** | Multi-machine OK, external services allowed | Minutes | E2E tests, performance benchmarks, staging integration | +| Size | Constraints | Speed | Example | +| ---------- | ------------------------------------------------------ | ------------ | ------------------------------------------------------ | +| **Small** | Single process, no I/O, no network, no database | Milliseconds | Pure function tests, data transforms | +| **Medium** | Multi-process OK, localhost only, no external services | Seconds | API tests with test DB, component tests | +| **Large** | Multi-machine OK, external services allowed | Minutes | E2E tests, performance benchmarks, staging integration | Small tests should make up the vast majority of your suite. They're fast, reliable, and easy to debug when they fail. @@ -175,21 +175,22 @@ Is it a critical user flow that must work end-to-end? ### Test State, Not Interactions -Assert on the *outcome* of an operation, not on which methods were called internally. Tests that verify method call sequences break when you refactor, even if the behavior is unchanged. +Assert on the _outcome_ of an operation, not on which methods were called internally. Tests that verify method call sequences break when you refactor, even if the behavior is unchanged. 
```typescript // Good: Tests what the function does (state-based) -it('returns tasks sorted by creation date, newest first', async () => { - const tasks = await listTasks({ sortBy: 'createdAt', sortOrder: 'desc' }); - expect(tasks[0].createdAt.getTime()) - .toBeGreaterThan(tasks[1].createdAt.getTime()); +it("returns tasks sorted by creation date, newest first", async () => { + const tasks = await listTasks({ sortBy: "createdAt", sortOrder: "desc" }); + expect(tasks[0].createdAt.getTime()).toBeGreaterThan( + tasks[1].createdAt.getTime(), + ); }); // Bad: Tests how the function works internally (interaction-based) -it('calls db.query with ORDER BY created_at DESC', async () => { - await listTasks({ sortBy: 'createdAt', sortOrder: 'desc' }); +it("calls db.query with ORDER BY created_at DESC", async () => { + await listTasks({ sortBy: "createdAt", sortOrder: "desc" }); expect(db.query).toHaveBeenCalledWith( - expect.stringContaining('ORDER BY created_at DESC') + expect.stringContaining("ORDER BY created_at DESC"), ); }); ``` @@ -200,15 +201,15 @@ In production code, DRY (Don't Repeat Yourself) is usually right. 
In tests, **DA ```typescript // DAMP: Each test is self-contained and readable -it('rejects tasks with empty titles', () => { - const input = { title: '', assignee: 'user-1' }; - expect(() => createTask(input)).toThrow('Title is required'); +it("rejects tasks with empty titles", () => { + const input = { title: "", assignee: "user-1" }; + expect(() => createTask(input)).toThrow("Title is required"); }); -it('trims whitespace from titles', () => { - const input = { title: ' Buy groceries ', assignee: 'user-1' }; +it("trims whitespace from titles", () => { + const input = { title: " Buy groceries ", assignee: "user-1" }; const task = createTask(input); - expect(task.title).toBe('Buy groceries'); + expect(task.title).toBe("Buy groceries"); }); // Over-DRY: Shared setup obscures what each test actually verifies @@ -234,15 +235,15 @@ Preference order (most to least preferred): ### Use the Arrange-Act-Assert Pattern ```typescript -it('marks overdue tasks when deadline has passed', () => { +it("marks overdue tasks when deadline has passed", () => { // Arrange: Set up the test scenario const task = createTask({ - title: 'Test', - deadline: new Date('2025-01-01'), + title: "Test", + deadline: new Date("2025-01-01"), }); // Act: Perform the action being tested - const result = checkOverdue(task, new Date('2025-01-02')); + const result = checkOverdue(task, new Date("2025-01-02")); // Assert: Verify the outcome expect(result.isOverdue).toBe(true); @@ -286,14 +287,14 @@ describe('TaskService', () => { ## Test Anti-Patterns to Avoid -| Anti-Pattern | Problem | Fix | -|---|---|---| -| Testing implementation details | Tests break when refactoring even if behavior is unchanged | Test inputs and outputs, not internal structure | -| Flaky tests (timing, order-dependent) | Erode trust in the test suite | Use deterministic assertions, isolate test state | -| Testing framework code | Wastes time testing third-party behavior | Only test YOUR code | -| Snapshot abuse | Large snapshots 
nobody reviews, break on any change | Use snapshots sparingly and review every change | -| No test isolation | Tests pass individually but fail together | Each test sets up and tears down its own state | -| Mocking everything | Tests pass but production breaks | Prefer real implementations > fakes > stubs > mocks. Mock only at boundaries where real deps are slow or non-deterministic | +| Anti-Pattern | Problem | Fix | +| ------------------------------------- | ---------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | +| Testing implementation details | Tests break when refactoring even if behavior is unchanged | Test inputs and outputs, not internal structure | +| Flaky tests (timing, order-dependent) | Erode trust in the test suite | Use deterministic assertions, isolate test state | +| Testing framework code | Wastes time testing third-party behavior | Only test YOUR code | +| Snapshot abuse | Large snapshots nobody reviews, break on any change | Use snapshots sparingly and review every change | +| No test isolation | Tests pass individually but fail together | Each test sets up and tears down its own state | +| Mocking everything | Tests pass but production breaks | Prefer real implementations > fakes > stubs > mocks. 
Mock only at boundaries where real deps are slow or non-deterministic | ## Browser Testing with DevTools @@ -311,14 +312,14 @@ For anything that runs in a browser, unit tests alone aren't enough — you need ### What to Check -| Tool | When | What to Look For | -|------|------|-----------------| -| **Console** | Always | Zero errors and warnings in production-quality code | -| **Network** | API issues | Status codes, payload shape, timing, CORS errors | -| **DOM** | UI bugs | Element structure, attributes, accessibility tree | -| **Styles** | Layout issues | Computed styles vs expected, specificity conflicts | -| **Performance** | Slow pages | LCP, CLS, INP, long tasks (>50ms) | -| **Screenshots** | Visual changes | Before/after comparison for CSS and layout changes | +| Tool | When | What to Look For | +| --------------- | -------------- | --------------------------------------------------- | +| **Console** | Always | Zero errors and warnings in production-quality code | +| **Network** | API issues | Status codes, payload shape, timing, CORS errors | +| **DOM** | UI bugs | Element structure, attributes, accessibility tree | +| **Styles** | Layout issues | Computed styles vs expected, specificity conflicts | +| **Performance** | Slow pages | LCP, CLS, INP, long tasks (>50ms) | +| **Screenshots** | Visual changes | Before/after comparison for CSS and layout changes | ### Security Boundaries @@ -348,14 +349,14 @@ For detailed testing patterns, examples, and anti-patterns across frameworks, se ## Common Rationalizations -| Rationalization | Reality | -|---|---| -| "I'll write tests after the code works" | You won't. And tests written after the fact test implementation, not behavior. | -| "This is too simple to test" | Simple code gets complicated. The test documents the expected behavior. | -| "Tests slow me down" | Tests slow you down now. They speed you up every time you change the code later. | -| "I tested it manually" | Manual testing doesn't persist. 
Tomorrow's change might break it with no way to know. | -| "The code is self-explanatory" | Tests ARE the specification. They document what the code should do, not what it does. | -| "It's just a prototype" | Prototypes become production code. Tests from day one prevent the "test debt" crisis. | +| Rationalization | Reality | +| -------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- | +| "I'll write tests after the code works" | You won't. And tests written after the fact test implementation, not behavior. | +| "This is too simple to test" | Simple code gets complicated. The test documents the expected behavior. | +| "Tests slow me down" | Tests slow you down now. They speed you up every time you change the code later. | +| "I tested it manually" | Manual testing doesn't persist. Tomorrow's change might break it with no way to know. | +| "The code is self-explanatory" | Tests ARE the specification. They document what the code should do, not what it does. | +| "It's just a prototype" | Prototypes become production code. Tests from day one prevent the "test debt" crisis. | | "Let me run the tests again just to be extra sure" | After a clean test run, repeating the same command adds nothing unless the code has changed since. Run again after subsequent edits, not as reassurance. 
| ## Red Flags diff --git a/docs/superpowers/contracts/agent-automation-bootstrap.json b/docs/superpowers/contracts/agent-automation-bootstrap.json index 43b7e7a..2684d1a 100644 --- a/docs/superpowers/contracts/agent-automation-bootstrap.json +++ b/docs/superpowers/contracts/agent-automation-bootstrap.json @@ -1,12 +1,7 @@ { "title": "agent automation bootstrap", "objective": "Define what success looks like for agent automation bootstrap.", - "acceptance_criteria": [ - "Criterion 1", - "Criterion 2" - ], - "out_of_scope": [ - "Explicitly excluded work item" - ], + "acceptance_criteria": ["Criterion 1", "Criterion 2"], + "out_of_scope": ["Explicitly excluded work item"], "verifier": "pre-commit + task-specific tests" } diff --git a/docs/superpowers/contracts/run-sh-wrapper-smoke.json b/docs/superpowers/contracts/run-sh-wrapper-smoke.json index b62fd99..287571f 100644 --- a/docs/superpowers/contracts/run-sh-wrapper-smoke.json +++ b/docs/superpowers/contracts/run-sh-wrapper-smoke.json @@ -1,12 +1,7 @@ { "title": "run-sh-wrapper-smoke", "objective": "Define what success looks like for run-sh-wrapper-smoke.", - "acceptance_criteria": [ - "Criterion 1", - "Criterion 2" - ], - "out_of_scope": [ - "Explicitly excluded work item" - ], + "acceptance_criteria": ["Criterion 1", "Criterion 2"], + "out_of_scope": ["Explicitly excluded work item"], "verifier": "pre-commit + task-specific tests" } diff --git a/docs/superpowers/contracts/template.json b/docs/superpowers/contracts/template.json index 6ecca7e..0d4fc26 100644 --- a/docs/superpowers/contracts/template.json +++ b/docs/superpowers/contracts/template.json @@ -1,12 +1,7 @@ { "title": "Short contract title", "objective": "One-paragraph objective and success definition.", - "acceptance_criteria": [ - "Criterion 1", - "Criterion 2" - ], - "out_of_scope": [ - "Explicitly excluded item 1" - ], + "acceptance_criteria": ["Criterion 1", "Criterion 2"], + "out_of_scope": ["Explicitly excluded item 1"], "verifier": "Name the 
command(s) or gate responsible for verification" } diff --git a/docs/superpowers/evidence/agent-automation-bootstrap.json b/docs/superpowers/evidence/agent-automation-bootstrap.json index 3e3aba9..84f2698 100644 --- a/docs/superpowers/evidence/agent-automation-bootstrap.json +++ b/docs/superpowers/evidence/agent-automation-bootstrap.json @@ -1,13 +1,7 @@ { "intent": "Describe the expected user-visible outcome for agent automation bootstrap.", - "scope": [ - "Impacted modules/files", - "Constraints/non-goals" - ], - "changes": [ - "Implementation summary item 1", - "Implementation summary item 2" - ], + "scope": ["Impacted modules/files", "Constraints/non-goals"], + "changes": ["Implementation summary item 1", "Implementation summary item 2"], "verification": [ { "command": "pre-commit run --files ", @@ -15,12 +9,6 @@ "evidence": "Paste command output summary" } ], - "risks": [ - "Risk 1", - "Risk 2" - ], - "rollback": [ - "Revert commit(s)", - "Re-run validation checks" - ] + "risks": ["Risk 1", "Risk 2"], + "rollback": ["Revert commit(s)", "Re-run validation checks"] } diff --git a/docs/superpowers/evidence/run-sh-wrapper-smoke.json b/docs/superpowers/evidence/run-sh-wrapper-smoke.json index bddc7fe..187daa6 100644 --- a/docs/superpowers/evidence/run-sh-wrapper-smoke.json +++ b/docs/superpowers/evidence/run-sh-wrapper-smoke.json @@ -1,13 +1,7 @@ { "intent": "Describe the expected user-visible outcome for run-sh-wrapper-smoke.", - "scope": [ - "Impacted modules/files", - "Constraints/non-goals" - ], - "changes": [ - "Implementation summary item 1", - "Implementation summary item 2" - ], + "scope": ["Impacted modules/files", "Constraints/non-goals"], + "changes": ["Implementation summary item 1", "Implementation summary item 2"], "verification": [ { "command": "pre-commit run --files ", @@ -15,12 +9,6 @@ "evidence": "Paste command output summary" } ], - "risks": [ - "Risk 1", - "Risk 2" - ], - "rollback": [ - "Revert commit(s)", - "Re-run validation checks" - ] + 
"risks": ["Risk 1", "Risk 2"], + "rollback": ["Revert commit(s)", "Re-run validation checks"] } diff --git a/docs/superpowers/evidence/template.json b/docs/superpowers/evidence/template.json index 6dcc50e..d172862 100644 --- a/docs/superpowers/evidence/template.json +++ b/docs/superpowers/evidence/template.json @@ -1,9 +1,6 @@ { "intent": "Describe the intended user-visible outcome.", - "scope": [ - "List impacted modules or files", - "List constraints or non-goals" - ], + "scope": ["List impacted modules or files", "List constraints or non-goals"], "changes": [ "Summarize key implementation change #1", "Summarize key implementation change #2" @@ -15,12 +12,6 @@ "evidence": "Paste compact output summary here" } ], - "risks": [ - "Potential risk #1", - "Potential risk #2" - ], - "rollback": [ - "How to revert safely", - "What to validate after rollback" - ] + "risks": ["Potential risk #1", "Potential risk #2"], + "rollback": ["How to revert safely", "What to validate after rollback"] } diff --git a/docs/superpowers/memory/verification_playbook.json b/docs/superpowers/memory/verification_playbook.json index ffc3a72..91ddf2f 100644 --- a/docs/superpowers/memory/verification_playbook.json +++ b/docs/superpowers/memory/verification_playbook.json @@ -7,9 +7,5 @@ "Capture exact command outputs in evidence artifact", "Record residual risks and rollback plan" ], - "forbidden_phrases": [ - "should work", - "probably fine", - "seems right" - ] + "forbidden_phrases": ["should work", "probably fine", "seems right"] } diff --git a/third_party/agent-skills/.claude/commands/ship.md b/third_party/agent-skills/.claude/commands/ship.md index 1dfaf01..f53f049 100644 --- a/third_party/agent-skills/.claude/commands/ship.md +++ b/third_party/agent-skills/.claude/commands/ship.md @@ -19,6 +19,7 @@ In Claude Code, each call passes `subagent_type` matching the persona's `name` f In other harnesses without an Agent tool, invoke each persona's system prompt sequentially and treat their 
outputs as if returned in parallel — the merge phase still works. Constraints (from Claude Code's subagent model): + - Subagents cannot spawn other subagents — do not let one persona delegate to another. - Each subagent gets its own context window and returns only its report to this main session. - If you need teammates that talk to each other instead of just reporting back, use Claude Code Agent Teams and reference these personas as teammate types (see `references/orchestration-patterns.md`). @@ -44,20 +45,25 @@ Produce a single output: ## Ship Decision: GO | NO-GO ### Blockers (must fix before ship) + - [Source persona: Critical finding + file:line] ### Recommended fixes (should fix before ship) + - [Source persona: Important finding + file:line] ### Acknowledged risks (shipping anyway) + - [Risk + mitigation] ### Rollback plan + - Trigger conditions: [what signals would prompt rollback] - Rollback procedure: [exact steps] - Recovery time objective: [target] ### Specialist reports (full) + - [code-reviewer report] - [security-auditor report] - [test-engineer report] diff --git a/third_party/agent-skills/.claude/commands/spec.md b/third_party/agent-skills/.claude/commands/spec.md index 2207935..b858f55 100644 --- a/third_party/agent-skills/.claude/commands/spec.md +++ b/third_party/agent-skills/.claude/commands/spec.md @@ -5,6 +5,7 @@ description: Start spec-driven development — write a structured specification Invoke the agent-skills:spec-driven-development skill. Begin by understanding what the user wants to build. Ask clarifying questions about: + 1. The objective and target users 2. Core features and acceptance criteria 3. 
Tech stack preferences and constraints diff --git a/third_party/agent-skills/.claude/commands/test.md b/third_party/agent-skills/.claude/commands/test.md index a2b9cfd..94c3caf 100644 --- a/third_party/agent-skills/.claude/commands/test.md +++ b/third_party/agent-skills/.claude/commands/test.md @@ -5,11 +5,13 @@ description: Run TDD workflow — write failing tests, implement, verify. For bu Invoke the agent-skills:test-driven-development skill. For new features: + 1. Write tests that describe the expected behavior (they should FAIL) 2. Implement the code to make them pass 3. Refactor while keeping tests green For bug fixes (Prove-It pattern): + 1. Write a test that reproduces the bug (must FAIL) 2. Confirm the test fails 3. Implement the fix diff --git a/third_party/agent-skills/AGENTS.md b/third_party/agent-skills/AGENTS.md index 7b09470..5174a87 100644 --- a/third_party/agent-skills/AGENTS.md +++ b/third_party/agent-skills/AGENTS.md @@ -69,9 +69,9 @@ This ensures OpenCode behaves similarly to Claude Code with full workflow enforc This repo has three composable layers. They have different jobs and should not be confused: -- **Skills** (`skills//SKILL.md`) — workflows with steps and exit criteria. The *how*. Mandatory hops when an intent matches. -- **Personas** (`agents/.md`) — roles with a perspective and an output format. The *who*. -- **Slash commands** (`.claude/commands/*.md`) — user-facing entry points. The *when*. The orchestration layer. +- **Skills** (`skills//SKILL.md`) — workflows with steps and exit criteria. The _how_. Mandatory hops when an intent matches. +- **Personas** (`agents/.md`) — roles with a perspective and an output format. The _who_. +- **Slash commands** (`.claude/commands/*.md`) — user-facing entry points. The _when_. The orchestration layer. Composition rule: **the user (or a slash command) is the orchestrator. Personas do not invoke other personas.** A persona may invoke skills. 
@@ -103,10 +103,15 @@ skills/ ### SKILL.md Format -```markdown +````markdown --- -name: {skill-name} -description: {One sentence describing when to use this skill. Include trigger phrases like "Deploy my app", "Check logs", etc.} +name: { skill-name } +description: + { + One sentence describing when to use this skill. Include trigger phrases like "Deploy my app", + "Check logs", + etc., + } --- # {Skill Title} @@ -122,8 +127,10 @@ description: {One sentence describing when to use this skill. Include trigger ph ```bash bash /mnt/skills/user/{skill-name}/scripts/{script}.sh [args] ``` +```` **Arguments:** + - `arg1` - Description (defaults to X) **Examples:** @@ -140,7 +147,8 @@ bash /mnt/skills/user/{skill-name}/scripts/{script}.sh [args] ## Troubleshooting {Common issues and solutions, especially network/permissions errors} -``` + +```` ### Best Practices for Context Efficiency @@ -168,13 +176,14 @@ After creating or updating a skill: ```bash cd skills zip -r {skill-name}.zip {skill-name}/ -``` +```` ### End-User Installation Document these two installation methods for users: **Claude Code:** + ```bash cp -r skills/{skill-name} ~/.claude/skills/ ``` diff --git a/third_party/agent-skills/README.md b/third_party/agent-skills/README.md index 68e6300..28a80b9 100644 --- a/third_party/agent-skills/README.md +++ b/third_party/agent-skills/README.md @@ -19,15 +19,15 @@ Skills encode the workflows, quality gates, and best practices that senior engin 7 slash commands that map to the development lifecycle. Each one activates the right skills automatically. 
-| What you're doing | Command | Key principle | -|-------------------|---------|---------------| -| Define what to build | `/spec` | Spec before code | -| Plan how to build it | `/plan` | Small, atomic tasks | -| Build incrementally | `/build` | One slice at a time | -| Prove it works | `/test` | Tests are proof | -| Review before merge | `/review` | Improve code health | -| Simplify the code | `/code-simplify` | Clarity over cleverness | -| Ship to production | `/ship` | Faster is safer | +| What you're doing | Command | Key principle | +| -------------------- | ---------------- | ----------------------- | +| Define what to build | `/spec` | Spec before code | +| Plan how to build it | `/plan` | Small, atomic tasks | +| Build incrementally | `/build` | One slice at a time | +| Prove it works | `/test` | Tests are proof | +| Review before merge | `/review` | Improve code health | +| Simplify the code | `/code-simplify` | Clarity over cleverness | +| Ship to production | `/ship` | Faster is safer | Skills also activate automatically based on what you're doing — designing an API triggers `api-and-interface-design`, building UI triggers `frontend-ui-engineering`, and so on. @@ -46,6 +46,7 @@ Skills also activate automatically based on what you're doing — designing an A ``` > **SSH errors?** The marketplace clones repos via SSH. If you don't have SSH keys set up on GitHub, either [add your SSH key](https://docs.github.com/en/authentication/connecting-to-github-with-ssh/adding-a-new-ssh-key-to-your-github-account) or use the full HTTPS URL to force the HTTPS cloning: +> > ```bash > /plugin marketplace add https://github.com/addyosmani/agent-skills.git > /plugin install agent-skills@addy-agent-skills @@ -121,8 +122,6 @@ Skills are plain Markdown - they work with any agent that accepts system prompts - - --- ## All 20 Skills @@ -131,53 +130,53 @@ The commands above are the entry points. 
Under the hood, they activate these 20 ### Define - Clarify what to build -| Skill | What It Does | Use When | -|-------|-------------|----------| -| [idea-refine](skills/idea-refine/SKILL.md) | Structured divergent/convergent thinking to turn vague ideas into concrete proposals | You have a rough concept that needs exploration | +| Skill | What It Does | Use When | +| ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | +| [idea-refine](skills/idea-refine/SKILL.md) | Structured divergent/convergent thinking to turn vague ideas into concrete proposals | You have a rough concept that needs exploration | | [spec-driven-development](skills/spec-driven-development/SKILL.md) | Write a PRD covering objectives, commands, structure, code style, testing, and boundaries before any code | Starting a new project, feature, or significant change | ### Plan - Break it down -| Skill | What It Does | Use When | -|-------|-------------|----------| +| Skill | What It Does | Use When | +| -------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- | -------------------------------------------- | | [planning-and-task-breakdown](skills/planning-and-task-breakdown/SKILL.md) | Decompose specs into small, verifiable tasks with acceptance criteria and dependency ordering | You have a spec and need implementable units | ### Build - Write the code -| Skill | What It Does | Use When | -|-------|-------------|----------| -| [incremental-implementation](skills/incremental-implementation/SKILL.md) | Thin vertical slices - implement, test, verify, commit. 
Feature flags, safe defaults, rollback-friendly changes | Any change touching more than one file | -| [test-driven-development](skills/test-driven-development/SKILL.md) | Red-Green-Refactor, test pyramid (80/15/5), test sizes, DAMP over DRY, Beyonce Rule, browser testing | Implementing logic, fixing bugs, or changing behavior | -| [context-engineering](skills/context-engineering/SKILL.md) | Feed agents the right information at the right time - rules files, context packing, MCP integrations | Starting a session, switching tasks, or when output quality drops | -| [source-driven-development](skills/source-driven-development/SKILL.md) | Ground every framework decision in official documentation - verify, cite sources, flag what's unverified | You want authoritative, source-cited code for any framework or library | -| [frontend-ui-engineering](skills/frontend-ui-engineering/SKILL.md) | Component architecture, design systems, state management, responsive design, WCAG 2.1 AA accessibility | Building or modifying user-facing interfaces | -| [api-and-interface-design](skills/api-and-interface-design/SKILL.md) | Contract-first design, Hyrum's Law, One-Version Rule, error semantics, boundary validation | Designing APIs, module boundaries, or public interfaces | +| Skill | What It Does | Use When | +| ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------- | +| [incremental-implementation](skills/incremental-implementation/SKILL.md) | Thin vertical slices - implement, test, verify, commit. 
Feature flags, safe defaults, rollback-friendly changes | Any change touching more than one file | +| [test-driven-development](skills/test-driven-development/SKILL.md) | Red-Green-Refactor, test pyramid (80/15/5), test sizes, DAMP over DRY, Beyonce Rule, browser testing | Implementing logic, fixing bugs, or changing behavior | +| [context-engineering](skills/context-engineering/SKILL.md) | Feed agents the right information at the right time - rules files, context packing, MCP integrations | Starting a session, switching tasks, or when output quality drops | +| [source-driven-development](skills/source-driven-development/SKILL.md) | Ground every framework decision in official documentation - verify, cite sources, flag what's unverified | You want authoritative, source-cited code for any framework or library | +| [frontend-ui-engineering](skills/frontend-ui-engineering/SKILL.md) | Component architecture, design systems, state management, responsive design, WCAG 2.1 AA accessibility | Building or modifying user-facing interfaces | +| [api-and-interface-design](skills/api-and-interface-design/SKILL.md) | Contract-first design, Hyrum's Law, One-Version Rule, error semantics, boundary validation | Designing APIs, module boundaries, or public interfaces | ### Verify - Prove it works -| Skill | What It Does | Use When | -|-------|-------------|----------| +| Skill | What It Does | Use When | +| ------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- | | [browser-testing-with-devtools](skills/browser-testing-with-devtools/SKILL.md) | Chrome DevTools MCP for live runtime data - DOM inspection, console logs, network traces, performance profiling | Building or debugging anything that runs in a browser | -| [debugging-and-error-recovery](skills/debugging-and-error-recovery/SKILL.md) | 
Five-step triage: reproduce, localize, reduce, fix, guard. Stop-the-line rule, safe fallbacks | Tests fail, builds break, or behavior is unexpected | +| [debugging-and-error-recovery](skills/debugging-and-error-recovery/SKILL.md) | Five-step triage: reproduce, localize, reduce, fix, guard. Stop-the-line rule, safe fallbacks | Tests fail, builds break, or behavior is unexpected | ### Review - Quality gates before merge -| Skill | What It Does | Use When | -|-------|-------------|----------| -| [code-review-and-quality](skills/code-review-and-quality/SKILL.md) | Five-axis review, change sizing (~100 lines), severity labels (Nit/Optional/FYI), review speed norms, splitting strategies | Before merging any change | -| [code-simplification](skills/code-simplification/SKILL.md) | Chesterton's Fence, Rule of 500, reduce complexity while preserving exact behavior | Code works but is harder to read or maintain than it should be | -| [security-and-hardening](skills/security-and-hardening/SKILL.md) | OWASP Top 10 prevention, auth patterns, secrets management, dependency auditing, three-tier boundary system | Handling user input, auth, data storage, or external integrations | -| [performance-optimization](skills/performance-optimization/SKILL.md) | Measure-first approach - Core Web Vitals targets, profiling workflows, bundle analysis, anti-pattern detection | Performance requirements exist or you suspect regressions | +| Skill | What It Does | Use When | +| -------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------- | +| [code-review-and-quality](skills/code-review-and-quality/SKILL.md) | Five-axis review, change sizing (~100 lines), severity labels (Nit/Optional/FYI), review speed norms, splitting strategies | Before merging any change | +| 
[code-simplification](skills/code-simplification/SKILL.md) | Chesterton's Fence, Rule of 500, reduce complexity while preserving exact behavior | Code works but is harder to read or maintain than it should be | +| [security-and-hardening](skills/security-and-hardening/SKILL.md) | OWASP Top 10 prevention, auth patterns, secrets management, dependency auditing, three-tier boundary system | Handling user input, auth, data storage, or external integrations | +| [performance-optimization](skills/performance-optimization/SKILL.md) | Measure-first approach - Core Web Vitals targets, profiling workflows, bundle analysis, anti-pattern detection | Performance requirements exist or you suspect regressions | ### Ship - Deploy with confidence -| Skill | What It Does | Use When | -|-------|-------------|----------| -| [git-workflow-and-versioning](skills/git-workflow-and-versioning/SKILL.md) | Trunk-based development, atomic commits, change sizing (~100 lines), the commit-as-save-point pattern | Making any code change (always) | -| [ci-cd-and-automation](skills/ci-cd-and-automation/SKILL.md) | Shift Left, Faster is Safer, feature flags, quality gate pipelines, failure feedback loops | Setting up or modifying build and deploy pipelines | -| [deprecation-and-migration](skills/deprecation-and-migration/SKILL.md) | Code-as-liability mindset, compulsory vs advisory deprecation, migration patterns, zombie code removal | Removing old systems, migrating users, or sunsetting features | -| [documentation-and-adrs](skills/documentation-and-adrs/SKILL.md) | Architecture Decision Records, API docs, inline documentation standards - document the *why* | Making architectural decisions, changing APIs, or shipping features | -| [shipping-and-launch](skills/shipping-and-launch/SKILL.md) | Pre-launch checklists, feature flag lifecycle, staged rollouts, rollback procedures, monitoring setup | Preparing to deploy to production | +| Skill | What It Does | Use When | +| 
-------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------- | +| [git-workflow-and-versioning](skills/git-workflow-and-versioning/SKILL.md) | Trunk-based development, atomic commits, change sizing (~100 lines), the commit-as-save-point pattern | Making any code change (always) | +| [ci-cd-and-automation](skills/ci-cd-and-automation/SKILL.md) | Shift Left, Faster is Safer, feature flags, quality gate pipelines, failure feedback loops | Setting up or modifying build and deploy pipelines | +| [deprecation-and-migration](skills/deprecation-and-migration/SKILL.md) | Code-as-liability mindset, compulsory vs advisory deprecation, migration patterns, zombie code removal | Removing old systems, migrating users, or sunsetting features | +| [documentation-and-adrs](skills/documentation-and-adrs/SKILL.md) | Architecture Decision Records, API docs, inline documentation standards - document the _why_ | Making architectural decisions, changing APIs, or shipping features | +| [shipping-and-launch](skills/shipping-and-launch/SKILL.md) | Pre-launch checklists, feature flag lifecycle, staged rollouts, rollback procedures, monitoring setup | Preparing to deploy to production | --- @@ -185,11 +184,11 @@ The commands above are the entry points. Under the hood, they activate these 20 Pre-configured specialist personas for targeted reviews: -| Agent | Role | Perspective | -|-------|------|-------------| -| [code-reviewer](agents/code-reviewer.md) | Senior Staff Engineer | Five-axis code review with "would a staff engineer approve this?" 
standard | -| [test-engineer](agents/test-engineer.md) | QA Specialist | Test strategy, coverage analysis, and the Prove-It pattern | -| [security-auditor](agents/security-auditor.md) | Security Engineer | Vulnerability detection, threat modeling, OWASP assessment | +| Agent | Role | Perspective | +| ---------------------------------------------- | --------------------- | -------------------------------------------------------------------------- | +| [code-reviewer](agents/code-reviewer.md) | Senior Staff Engineer | Five-axis code review with "would a staff engineer approve this?" standard | +| [test-engineer](agents/test-engineer.md) | QA Specialist | Test strategy, coverage analysis, and the Prove-It pattern | +| [security-auditor](agents/security-auditor.md) | Security Engineer | Vulnerability detection, threat modeling, OWASP assessment | --- @@ -197,12 +196,12 @@ Pre-configured specialist personas for targeted reviews: Quick-reference material that skills pull in when needed: -| Reference | Covers | -|-----------|--------| -| [testing-patterns.md](references/testing-patterns.md) | Test structure, naming, mocking, React/API/E2E examples, anti-patterns | -| [security-checklist.md](references/security-checklist.md) | Pre-commit checks, auth, input validation, headers, CORS, OWASP Top 10 | -| [performance-checklist.md](references/performance-checklist.md) | Core Web Vitals targets, frontend/backend checklists, measurement commands | -| [accessibility-checklist.md](references/accessibility-checklist.md) | Keyboard nav, screen readers, visual design, ARIA, testing tools | +| Reference | Covers | +| ------------------------------------------------------------------- | -------------------------------------------------------------------------- | +| [testing-patterns.md](references/testing-patterns.md) | Test structure, naming, mocking, React/API/E2E examples, anti-patterns | +| [security-checklist.md](references/security-checklist.md) | Pre-commit checks, auth, input 
validation, headers, CORS, OWASP Top 10 | +| [performance-checklist.md](references/performance-checklist.md) | Core Web Vitals targets, frontend/backend checklists, measurement commands | +| [accessibility-checklist.md](references/accessibility-checklist.md) | Keyboard nav, screen readers, visual design, ARIA, testing tools | --- @@ -218,7 +217,7 @@ Every skill follows a consistent anatomy: │ │ name: lowercase-hyphen-name │ │ │ │ description: Guides agents through [task].│ │ │ │ Use when… │ │ -│ └───────────────────────────────────────────┘ │ +│ └───────────────────────────────────────────┘ │ │ Overview → What this skill does │ │ When to Use → Triggering conditions │ │ Process → Step-by-step workflow │ @@ -277,7 +276,7 @@ agent-skills/ AI coding agents default to the shortest path - which often means skipping specs, tests, security reviews, and the practices that make software reliable. Agent Skills gives agents structured workflows that enforce the same discipline senior engineers bring to production code. -Each skill encodes hard-won engineering judgment: *when* to write a spec, *what* to test, *how* to review, and *when* to ship. These aren't generic prompts - they're the kind of opinionated, process-driven workflows that separate production-quality work from prototype-quality work. +Each skill encodes hard-won engineering judgment: _when_ to write a spec, _what_ to test, _how_ to review, and _when_ to ship. These aren't generic prompts - they're the kind of opinionated, process-driven workflows that separate production-quality work from prototype-quality work. Skills bake in best practices from Google's engineering culture — including concepts from [Software Engineering at Google](https://abseil.io/resources/swe-book) and Google's [engineering practices guide](https://google.github.io/eng-practices/). 
You'll find Hyrum's Law in API design, the Beyonce Rule and test pyramid in testing, change sizing and review speed norms in code review, Chesterton's Fence in simplification, trunk-based development in git workflow, Shift Left and feature flags in CI/CD, and a dedicated deprecation skill treating code as a liability. These aren't abstract principles — they're embedded directly into the step-by-step workflows agents follow. diff --git a/third_party/agent-skills/agents/README.md b/third_party/agent-skills/agents/README.md index 508bb36..a1dbd17 100644 --- a/third_party/agent-skills/agents/README.md +++ b/third_party/agent-skills/agents/README.md @@ -2,27 +2,28 @@ Specialist personas that play a single role with a single perspective. Each persona is a Markdown file consumed as a system prompt by your harness (Claude Code, Cursor, Copilot, etc.). -| Persona | Role | Best for | -|---------|------|----------| -| [code-reviewer](code-reviewer.md) | Senior Staff Engineer | Five-axis review before merge | -| [security-auditor](security-auditor.md) | Security Engineer | Vulnerability detection, OWASP-style audit | -| [test-engineer](test-engineer.md) | QA Engineer | Test strategy, coverage analysis, Prove-It pattern | +| Persona | Role | Best for | +| --------------------------------------- | --------------------- | -------------------------------------------------- | +| [code-reviewer](code-reviewer.md) | Senior Staff Engineer | Five-axis review before merge | +| [security-auditor](security-auditor.md) | Security Engineer | Vulnerability detection, OWASP-style audit | +| [test-engineer](test-engineer.md) | QA Engineer | Test strategy, coverage analysis, Prove-It pattern | ## How personas relate to skills and commands Three layers, each with a distinct job: -| Layer | What it is | Example | Composition role | -|-------|-----------|---------|------------------| -| **Skill** | A workflow with steps and exit criteria | `code-review-and-quality` | The *how* — invoked from 
inside a persona or command | -| **Persona** | A role with a perspective and an output format | `code-reviewer` | The *who* — adopts a viewpoint, produces a report | -| **Command** | A user-facing entry point | `/review`, `/ship` | The *when* — composes personas and skills | +| Layer | What it is | Example | Composition role | +| ----------- | ---------------------------------------------- | ------------------------- | ---------------------------------------------------- | +| **Skill** | A workflow with steps and exit criteria | `code-review-and-quality` | The _how_ — invoked from inside a persona or command | +| **Persona** | A role with a perspective and an output format | `code-reviewer` | The _who_ — adopts a viewpoint, produces a report | +| **Command** | A user-facing entry point | `/review`, `/ship` | The _when_ — composes personas and skills | The user (or a slash command) is the orchestrator. **Personas do not call other personas.** Skills are mandatory hops inside a persona's workflow. ## When to use each ### Direct persona invocation + Pick this when you want one perspective on the current change and the user is in the loop. - "Review this PR" → invoke `code-reviewer` directly @@ -30,12 +31,14 @@ Pick this when you want one perspective on the current change and the user is in - "What tests are missing for the checkout flow?" → invoke `test-engineer` directly ### Slash command (single persona behind it) + Pick this when there's a repeatable workflow you'd otherwise re-explain every time. - `/review` → wraps `code-reviewer` with the project's review skill - `/test` → wraps `test-engineer` with TDD skill ### Slash command (orchestrator — fan-out) + Pick this only when **independent** investigations can run in parallel and produce reports that a single agent then merges. 
 - `/ship` → fans out to `code-reviewer` + `security-auditor` + `test-engineer` in parallel, then synthesizes their reports into a go/no-go decision
@@ -68,6 +71,7 @@ Is the work a single perspective on a single artifact?
 ```

 Why this works:
+
 - Each sub-agent operates on the same diff but produces a **different perspective**
 - They have no dependencies on each other → genuine parallelism, real wall-clock savings
 - Each runs in a fresh context window → main session stays uncluttered
@@ -88,6 +92,7 @@ A `meta-orchestrator` persona whose job is "decide which other persona to call":
 ```

 Why this fails:
+
 - Pure routing layer with no domain value
 - Adds two paraphrasing hops → information loss + 2× token cost
 - The user already knows they want a review; let them call `/review` directly
@@ -96,8 +101,8 @@ Why this fails:
 ## Rules for personas

 1. A persona is a single role with a single output format. If you find yourself adding a second role, create a second persona.
-2. **Personas do not invoke other personas.** Composition is the job of slash commands or the user. On Claude Code this is also a hard platform constraint — *"subagents cannot spawn other subagents"* — so the rule is enforced for you.
-3. A persona may invoke skills (the *how*).
+2. **Personas do not invoke other personas.** Composition is the job of slash commands or the user. On Claude Code this is also a hard platform constraint — _"subagents cannot spawn other subagents"_ — so the rule is enforced for you.
+3. A persona may invoke skills (the _how_).
 4. Every persona file ends with a "Composition" block stating where it fits.

 ## Claude Code interop
diff --git a/third_party/agent-skills/agents/code-reviewer.md b/third_party/agent-skills/agents/code-reviewer.md
index 3bce85c..98e88ca 100644
--- a/third_party/agent-skills/agents/code-reviewer.md
+++ b/third_party/agent-skills/agents/code-reviewer.md
@@ -12,18 +12,21 @@ You are an experienced Staff Engineer conducting a thorough code review.
Your ro
 Evaluate every change across these five dimensions:

 ### 1. Correctness
+
 - Does the code do what the spec/task says it should?
 - Are edge cases handled (null, empty, boundary values, error paths)?
 - Do the tests actually verify the behavior? Are they testing the right things?
 - Are there race conditions, off-by-one errors, or state inconsistencies?

 ### 2. Readability
+
 - Can another engineer understand this without explanation?
 - Are names descriptive and consistent with project conventions?
 - Is the control flow straightforward (no deeply nested logic)?
 - Is the code well-organized (related code grouped, clear boundaries)?

 ### 3. Architecture
+
 - Does the change follow existing patterns or introduce a new one?
 - If a new pattern, is it justified and documented?
 - Are module boundaries maintained? Any circular dependencies?
@@ -31,6 +34,7 @@ Evaluate every change across these five dimensions:
 - Are dependencies flowing in the right direction?

 ### 4. Security
+
 - Is user input validated and sanitized at system boundaries?
 - Are secrets kept out of code, logs, and version control?
 - Is authentication/authorization checked where needed?
@@ -38,6 +42,7 @@ Evaluate every change across these five dimensions:
 - Any new dependencies with known vulnerabilities?

 ### 5. Performance
+
 - Any N+1 query patterns?
 - Any unbounded loops or unconstrained data fetching?
 - Any synchronous operations that should be async?
@@ -64,18 +69,23 @@ Categorize every finding:
 **Overview:** [1-2 sentences summarizing the change and overall assessment]

 ### Critical Issues
+
 - [File:line] [Description and recommended fix]

 ### Important Issues
+
 - [File:line] [Description and recommended fix]

 ### Suggestions
+
 - [File:line] [Description]

 ### What's Done Well
+
 - [Positive observation — always include at least one]

 ### Verification Story
+
 - Tests reviewed: [yes/no, observations]
 - Build verified: [yes/no]
 - Security checked: [yes/no, observations]
diff --git a/third_party/agent-skills/agents/security-auditor.md b/third_party/agent-skills/agents/security-auditor.md
index 07bc30b..0fef188 100644
--- a/third_party/agent-skills/agents/security-auditor.md
+++ b/third_party/agent-skills/agents/security-auditor.md
@@ -10,6 +10,7 @@ You are an experienced Security Engineer conducting a security review. Your role
 ## Review Scope

 ### 1. Input Handling
+
 - Is all user input validated at system boundaries?
 - Are there injection vectors (SQL, NoSQL, OS command, LDAP)?
 - Is HTML output encoded to prevent XSS?
@@ -17,6 +18,7 @@ You are an experienced Security Engineer conducting a security review. Your role
 - Are URL redirects validated against an allowlist?

 ### 2. Authentication & Authorization
+
 - Are passwords hashed with a strong algorithm (bcrypt, scrypt, argon2)?
 - Are sessions managed securely (httpOnly, secure, sameSite cookies)?
 - Is authorization checked on every protected endpoint?
@@ -25,6 +27,7 @@ You are an experienced Security Engineer conducting a security review. Your role
 - Is rate limiting applied to authentication endpoints?

 ### 3. Data Protection
+
 - Are secrets in environment variables (not code)?
 - Are sensitive fields excluded from API responses and logs?
 - Is data encrypted in transit (HTTPS) and at rest (if required)?
@@ -32,6 +35,7 @@ You are an experienced Security Engineer conducting a security review. Your role
 - Are database backups encrypted?

 ### 4.
Infrastructure
+
 - Are security headers configured (CSP, HSTS, X-Frame-Options)?
 - Is CORS restricted to specific origins?
 - Are dependencies audited for known vulnerabilities?
@@ -39,6 +43,7 @@ You are an experienced Security Engineer conducting a security review. Your role
 - Is the principle of least privilege applied to service accounts?

 ### 5. Third-Party Integrations
+
 - Are API keys and tokens stored securely?
 - Are webhook payloads verified (signature validation)?
 - Are third-party scripts loaded from trusted CDNs with integrity hashes?
@@ -46,13 +51,13 @@ You are an experienced Security Engineer conducting a security review. Your role

 ## Severity Classification

-| Severity | Criteria | Action |
-|----------|----------|--------|
+| Severity | Criteria | Action |
+| ------------ | ------------------------------------------------------------- | ------------------------------ |
 | **Critical** | Exploitable remotely, leads to data breach or full compromise | Fix immediately, block release |
-| **High** | Exploitable with some conditions, significant data exposure | Fix before release |
-| **Medium** | Limited impact or requires authenticated access to exploit | Fix in current sprint |
-| **Low** | Theoretical risk or defense-in-depth improvement | Schedule for next sprint |
-| **Info** | Best practice recommendation, no current risk | Consider adopting |
+| **High** | Exploitable with some conditions, significant data exposure | Fix before release |
+| **Medium** | Limited impact or requires authenticated access to exploit | Fix in current sprint |
+| **Low** | Theoretical risk or defense-in-depth improvement | Schedule for next sprint |
+| **Info** | Best practice recommendation, no current risk | Consider adopting |

 ## Output Format

@@ -60,6 +65,7 @@ You are an experienced Security Engineer conducting a security review.
Your role
 ## Security Audit Report

 ### Summary
+
 - Critical: [count]
 - High: [count]
 - Medium: [count]
@@ -68,6 +74,7 @@ You are an experienced Security Engineer conducting a security review. Your role
 ### Findings

 #### [CRITICAL] [Finding title]
+
 - **Location:** [file:line]
 - **Description:** [What the vulnerability is]
 - **Impact:** [What an attacker could do]
@@ -75,12 +82,15 @@ You are an experienced Security Engineer conducting a security review. Your role
 - **Recommendation:** [Specific fix with code example]

 #### [HIGH] [Finding title]
+
 ...

 ### Positive Observations
+
 - [Security practices done well]

 ### Recommendations
+
 - [Proactive improvements to consider]
 ```

diff --git a/third_party/agent-skills/agents/test-engineer.md b/third_party/agent-skills/agents/test-engineer.md
index 3e2c6be..bf19149 100644
--- a/third_party/agent-skills/agents/test-engineer.md
+++ b/third_party/agent-skills/agents/test-engineer.md
@@ -12,6 +12,7 @@ You are an experienced QA Engineer focused on test strategy and quality assuranc
 ### 1. Analyze Before Writing

 Before writing any test:
+
 - Read the code being tested to understand its behavior
 - Identify the public API / interface (what to test)
 - Identify edge cases and error paths
@@ -30,6 +31,7 @@ Test at the lowest level that captures the behavior. Don't write E2E tests for t
 ### 3. Follow the Prove-It Pattern for Bugs

 When asked to write a test for a bug:
+
 1. Write a test that demonstrates the bug (must FAIL with current code)
 2. Confirm the test fails
 3.
Report the test is ready for the fix implementation
@@ -48,13 +50,13 @@ describe('[Module/Function name]', () => {

 For every function or component:

-| Scenario | Example |
-|----------|---------|
-| Happy path | Valid input produces expected output |
-| Empty input | Empty string, empty array, null, undefined |
-| Boundary values | Min, max, zero, negative |
-| Error paths | Invalid input, network failure, timeout |
-| Concurrency | Rapid repeated calls, out-of-order responses |
+| Scenario | Example |
+| --------------- | -------------------------------------------- |
+| Happy path | Valid input produces expected output |
+| Empty input | Empty string, empty array, null, undefined |
+| Boundary values | Min, max, zero, negative |
+| Error paths | Invalid input, network failure, timeout |
+| Concurrency | Rapid repeated calls, out-of-order responses |

 ## Output Format

@@ -64,14 +66,17 @@ When analyzing test coverage:
 ## Test Coverage Analysis

 ### Current Coverage
-- [X] tests covering [Y] functions/components
+
+- [x] tests covering [Y] functions/components
 - Coverage gaps identified: [list]

 ### Recommended Tests
+
 1. **[Test name]** — [What it verifies, why it matters]
 2.
**[Test name]** — [What it verifies, why it matters]

 ### Priority
+
 - Critical: [Tests that catch potential data loss or security issues]
 - High: [Tests for core business logic]
 - Medium: [Tests for edge cases and error handling]
diff --git a/third_party/agent-skills/docs/copilot-setup.md b/third_party/agent-skills/docs/copilot-setup.md
index 660ae02..47f46ba 100644
--- a/third_party/agent-skills/docs/copilot-setup.md
+++ b/third_party/agent-skills/docs/copilot-setup.md
@@ -28,6 +28,7 @@ cp /path/to/agent-skills/agents/security-auditor.md .github/agents/security-audi
 ```

 Invoke agents in Copilot Chat:
+
 - `@code-reviewer Review this PR`
 - `@test-engineer Analyze test coverage for this module`
 - `@security-auditor Check this endpoint for vulnerabilities`
@@ -49,22 +50,26 @@ GitHub Copilot supports project-level instructions via `.github/copilot-instruct
 # Project Coding Standards

 ## Testing
+
 - Write tests before code (TDD)
 - For bugs: write a failing test first, then fix (Prove-It pattern)
 - Test hierarchy: unit > integration > e2e (use the lowest level that captures the behavior)
 - Run `npm test` after every change

 ## Code Quality
+
 - Review across five axes: correctness, readability, architecture, security, performance
 - Every PR must pass: lint, type check, tests, build
 - No secrets in code or version control

 ## Implementation
+
 - Build in small, verifiable increments
 - Each increment: implement → test → verify → commit
 - Never mix formatting changes with behavior changes

 ## Boundaries
+
 - Always: Run tests before commits, validate user input
 - Ask first: Database schema changes, new dependencies
 - Never: Commit secrets, remove failing tests, skip verification
diff --git a/third_party/agent-skills/docs/gemini-cli-setup.md b/third_party/agent-skills/docs/gemini-cli-setup.md
index 1e6f3e5..56feb99 100644
--- a/third_party/agent-skills/docs/gemini-cli-setup.md
+++ b/third_party/agent-skills/docs/gemini-cli-setup.md
@@ -109,15 +109,15 @@ This is useful when you want
to ensure a specific workflow is followed without w

 The repo ships 7 slash commands under `.gemini/commands/` that map to the development lifecycle. Gemini CLI auto-discovers them when you run from the project root.

-| Command | What it does |
-|---------|--------------|
-| `/spec` | Write a structured spec before writing code |
-| `/planning` | Break work into small, verifiable tasks |
-| `/build` | Implement the next task incrementally |
-| `/test` | Run TDD workflow — red, green, refactor |
-| `/review` | Five-axis code review |
-| `/code-simplify` | Reduce complexity without changing behavior |
-| `/ship` | Pre-launch checklist via parallel persona fan-out |
+| Command | What it does |
+| ---------------- | ------------------------------------------------- |
+| `/spec` | Write a structured spec before writing code |
+| `/planning` | Break work into small, verifiable tasks |
+| `/build` | Implement the next task incrementally |
+| `/test` | Run TDD workflow — red, green, refactor |
+| `/review` | Five-axis code review |
+| `/code-simplify` | Reduce complexity without changing behavior |
+| `/ship` | Pre-launch checklist via parallel persona fan-out |

 Each command invokes the corresponding skill automatically — no manual skill loading required.

@@ -126,6 +126,6 @@ Each command invokes the corresponding skill automatically — no manual skill l
 ## Usage Tips

 1. **Prefer skills over GEMINI.md** — Skills activate on demand and keep your context window focused. Only put skills in GEMINI.md if you want them always loaded.
-2. **Skill descriptions matter** — Each SKILL.md has a `description` field in its frontmatter that tells agents when to activate it. The descriptions in this repo are optimized for auto-discovery across all supported tools (Claude Code, Gemini CLI, etc.) by clearly stating both *what* the skill does and *when* it should be triggered.
+2.
**Skill descriptions matter** — Each SKILL.md has a `description` field in its frontmatter that tells agents when to activate it. The descriptions in this repo are optimized for auto-discovery across all supported tools (Claude Code, Gemini CLI, etc.) by clearly stating both _what_ the skill does and _when_ it should be triggered.
 3. **Use agents for review** — Copy `agents/code-reviewer.md` content when requesting structured code reviews.
 4. **Combine with references** — Reference checklists from `references/` when working on specific quality areas like testing or performance.
diff --git a/third_party/agent-skills/docs/getting-started.md b/third_party/agent-skills/docs/getting-started.md
index f40eb14..445857b 100644
--- a/third_party/agent-skills/docs/getting-started.md
+++ b/third_party/agent-skills/docs/getting-started.md
@@ -19,6 +19,7 @@ git clone https://github.com/addyosmani/agent-skills.git
 ### 2. Choose a skill

 Browse the `skills/` directory. Each subdirectory contains a `SKILL.md` with:
+
 - **When to use** — triggers that indicate this skill applies
 - **Process** — step-by-step workflow
 - **Verification** — how to confirm the work is done
@@ -91,11 +92,11 @@ See [skill-anatomy.md](skill-anatomy.md) for the full specification.

 The `agents/` directory contains pre-configured agent personas:

-| Agent | Purpose |
-|-------|---------|
-| `code-reviewer.md` | Five-axis code review |
-| `test-engineer.md` | Test strategy and writing |
-| `security-auditor.md` | Vulnerability detection |
+| Agent | Purpose |
+| --------------------- | ------------------------- |
+| `code-reviewer.md` | Five-axis code review |
+| `test-engineer.md` | Test strategy and writing |
+| `security-auditor.md` | Vulnerability detection |

 Load an agent definition when you need specialized review. For example, ask your coding agent to "review this change using the code-reviewer agent persona" and provide the agent definition.
@@ -103,25 +104,25 @@ Load an agent definition when you need specialized review. For example, ask your

 The `.claude/commands/` directory contains slash commands for Claude Code:

-| Command | Skill Invoked |
-|---------|---------------|
-| `/spec` | spec-driven-development |
-| `/plan` | planning-and-task-breakdown |
-| `/build` | incremental-implementation + test-driven-development |
-| `/test` | test-driven-development |
-| `/review` | code-review-and-quality |
-| `/ship` | shipping-and-launch |
+| Command | Skill Invoked |
+| --------- | ---------------------------------------------------- |
+| `/spec` | spec-driven-development |
+| `/plan` | planning-and-task-breakdown |
+| `/build` | incremental-implementation + test-driven-development |
+| `/test` | test-driven-development |
+| `/review` | code-review-and-quality |
+| `/ship` | shipping-and-launch |

 ## Using References

 The `references/` directory contains supplementary checklists:

-| Reference | Use With |
-|-----------|----------|
-| `testing-patterns.md` | test-driven-development |
-| `performance-checklist.md` | performance-optimization |
-| `security-checklist.md` | security-and-hardening |
-| `accessibility-checklist.md` | frontend-ui-engineering |
+| Reference | Use With |
+| ---------------------------- | ------------------------ |
+| `testing-patterns.md` | test-driven-development |
+| `performance-checklist.md` | performance-optimization |
+| `security-checklist.md` | security-and-hardening |
+| `accessibility-checklist.md` | frontend-ui-engineering |

 Load a reference when you need detailed patterns beyond what the skill covers.

diff --git a/third_party/agent-skills/docs/opencode-setup.md b/third_party/agent-skills/docs/opencode-setup.md
index 84a96d5..43ddb8f 100644
--- a/third_party/agent-skills/docs/opencode-setup.md
+++ b/third_party/agent-skills/docs/opencode-setup.md
@@ -92,11 +93,13 @@ This replaces slash commands like `/spec`, `/plan`, etc.
 ### Example 1: Feature Development

 User:
+
 ```
 Add authentication to this app
 ```

 Agent behavior:
+
 - Detects feature work
 - Invokes `spec-driven-development`
 - Produces a spec before writing code
@@ -107,11 +109,13 @@ Agent behavior:
 ### Example 2: Bug Fix

 User:
+
 ```
 This endpoint is returning 500 errors
 ```

 Agent behavior:
+
 - Invokes `debugging-and-error-recovery`
 - Reproduces → localizes → fixes → adds guards

@@ -120,11 +124,13 @@ Agent behavior:
 ### Example 3: Code Review

 User:
+
 ```
 Review this PR
 ```

 Agent behavior:
+
 - Invokes `code-review-and-quality`
 - Applies structured review (correctness, design, readability, etc.)

diff --git a/third_party/agent-skills/docs/skill-anatomy.md b/third_party/agent-skills/docs/skill-anatomy.md
index a71685a..bea5780 100644
--- a/third_party/agent-skills/docs/skill-anatomy.md
+++ b/third_party/agent-skills/docs/skill-anatomy.md
@@ -25,8 +25,9 @@ description: Guides agents through [task/workflow]. Use when [specific trigger c
 ```

 **Rules:**
+
 - `name`: Lowercase, hyphen-separated. Must match the directory name.
-- `description`: Start with what the skill does in third person, then include one or more clear "Use when" trigger conditions. Include both *what* and *when*. Maximum 1024 characters.
+- `description`: Start with what the skill does in third person, then include one or more clear "Use when" trigger conditions. Include both _what_ and _when_. Maximum 1024 characters.

 **Why this matters:** Agents discover skills by reading descriptions. The description is injected into the system prompt, so it must tell the agent both what the skill provides and when to activate it. Do not summarize the workflow — if the description contains process steps, the agent may follow the summary instead of reading the full skill.

@@ -36,32 +37,40 @@ description: Guides agents through [task/workflow]. Use when [specific trigger c
 # Skill Title

 ## Overview
+
 One-two sentences explaining what this skill does and why it matters.
 ## When to Use
+
 - Bullet list of triggering conditions (symptoms, task types)
 - When NOT to use (exclusions)

 ## [Core Process / The Workflow / Steps]
+
 The main workflow, broken into numbered steps or phases.
 Include code examples where they help.
 Use flowcharts (ASCII) where decision points exist.

 ## [Specific Techniques / Patterns]
+
 Detailed guidance for specific scenarios.
 Code examples, templates, configuration.

 ## Common Rationalizations
+
-| Rationalization | Reality |
-|---|---|
+| Rationalization | Reality |
+| ------------------------------- | ----------------------- |
 | Excuse agents use to skip steps | Why the excuse is wrong |

 ## Red Flags
+
 - Behavioral patterns indicating the skill is being violated
 - Things to watch for during review

 ## Verification
+
 After completing the skill's process, confirm:
+
 - [ ] Checklist of exit criteria
 - [ ] Evidence requirements
 ```
@@ -69,31 +78,38 @@ After completing the skill's process, confirm:
 ## Section Purposes

 ### Overview
+
 The "elevator pitch" for the skill. Should answer: What does this skill do, and why should an agent follow it?

 ### When to Use
+
 Helps agents and humans decide if this skill applies to the current task. Include both positive triggers ("Use when X") and negative exclusions ("NOT for Y").

 ### Core Process
+
 The heart of the skill. This is the step-by-step workflow the agent follows. Must be specific and actionable — not vague advice.

 **Good:** "Run `npm test` and verify all tests pass"
 **Bad:** "Make sure the tests work"

 ### Common Rationalizations
+
 The most distinctive feature of well-crafted skills. These are excuses agents use to skip important steps, paired with rebuttals. They prevent the agent from rationalizing its way out of following the process.

 Think of every time an agent has said "I'll add tests later" or "This is simple enough to skip the spec" — those go here with a factual counter-argument.

 ### Red Flags
+
 Observable signs that the skill is being violated.
Useful during code review and self-monitoring.

 ### Verification
+
 The exit criteria. A checklist the agent uses to confirm the skill's process is complete. Every checkbox should be verifiable with evidence (test output, build result, screenshot, etc.).

 ## Supporting Files

 Create supporting files only when:
+
 - Reference material exceeds 100 lines (keep the main SKILL.md focused)
 - Code tools or scripts are needed
 - Checklists are long enough to justify separate files
diff --git a/third_party/agent-skills/hooks/SDD-CACHE.md b/third_party/agent-skills/hooks/SDD-CACHE.md
index e0f69ac..d5d0cbf 100644
--- a/third_party/agent-skills/hooks/SDD-CACHE.md
+++ b/third_party/agent-skills/hooks/SDD-CACHE.md
@@ -39,7 +39,7 @@ This hook caches fetched content on disk, but **revalidates with the origin serv
 }
 ```

- `${CLAUDE_PROJECT_DIR}` resolves to the directory you launched Claude Code from. The snippet above works when the hooks live inside the same project. If you installed `agent-skills` elsewhere (e.g. as a shared plugin under `~/agent-skills`), replace `${CLAUDE_PROJECT_DIR}/hooks/...` with the absolute path to each script.
+`${CLAUDE_PROJECT_DIR}` resolves to the directory you launched Claude Code from. The snippet above works when the hooks live inside the same project. If you installed `agent-skills` elsewhere (e.g. as a shared plugin under `~/agent-skills`), replace `${CLAUDE_PROJECT_DIR}/hooks/...` with the absolute path to each script.

 2. Make sure `.claude/sdd-cache/` is in your `.gitignore` (already included in this repo).

@@ -55,10 +55,10 @@ The stored body is not raw HTML — `WebFetch` post-processes each response thro

 One cache entry per URL, stored as JSON in `.claude/sdd-cache/.json`:

-| Event | Action |
-|---|---|
-| `PreToolUse WebFetch` | If an entry exists, sends a `HEAD` request with `If-None-Match` / `If-Modified-Since`. On `304`, blocks the fetch and returns the cached content to the agent via stderr, with the original prompt surfaced as metadata. Otherwise allows the fetch. |
-| `PostToolUse WebFetch` | Captures the response, issues a `HEAD` request to record the current `ETag` / `Last-Modified`, and stores `{url, prompt, etag, last_modified, content, fetched_at}`. |
+| Event | Action |
+| ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `PreToolUse WebFetch` | If an entry exists, sends a `HEAD` request with `If-None-Match` / `If-Modified-Since`. On `304`, blocks the fetch and returns the cached content to the agent via stderr, with the original prompt surfaced as metadata.
Otherwise allows the fetch. |
+| `PostToolUse WebFetch` | Captures the response, issues a `HEAD` request to record the current `ETag` / `Last-Modified`, and stores `{url, prompt, etag, last_modified, content, fetched_at}`. |

 **Freshness rules:**

@@ -109,16 +109,21 @@ Expected:
 6. Verify the second `WebFetch` is blocked and the cached content is returned (visible in the session transcript as a tool error with `[sdd-cache]` prefix).

 ### 3. Freshness verification
+
 # Pick the entry you want to corrupt (swap in the actual filename)
+
 ENTRY=.claude/sdd-cache/e49c9f378670cfbb1d7d871b6dee16d9.json

 # Patch its ETag to something the origin will not recognize
+
 jq '.etag = "W/\"stale-etag-forced\""' "$ENTRY" > "$ENTRY.tmp" && mv "$ENTRY.tmp" "$ENTRY"

 # Next PreToolUse should miss (server returns 200, not 304)
+
 echo '{"tool_input":{"url":"...", "prompt":"..."}}' | bash hooks/sdd-cache-pre.sh
-echo "exit=$?" # expect 0 (fetch allowed through)
-```
+echo "exit=$?" # expect 0 (fetch allowed through)
+
+````

 ### 4.
Debugging

@@ -131,7 +136,7 @@ SDD_CACHE_DEBUG=1 claude
 # Option B: sentinel file (persistent)
 mkdir -p .claude/sdd-cache && touch .claude/sdd-cache/.debug
 # …disable with: rm .claude/sdd-cache/.debug
-```
+````

 The log captures URL, detected `tool_response` shape, HEAD status, and why each invocation hit or missed. Useful when a cache miss looks unexpected (typically: the origin stopped emitting validators).

diff --git a/third_party/agent-skills/hooks/SIMPLIFY-IGNORE.md b/third_party/agent-skills/hooks/SIMPLIFY-IGNORE.md
index 9e81af9..eea5458 100644
--- a/third_party/agent-skills/hooks/SIMPLIFY-IGNORE.md
+++ b/third_party/agent-skills/hooks/SIMPLIFY-IGNORE.md
@@ -24,18 +24,33 @@ result[3] = buf[3] ^ key[3];
 "PreToolUse": [
 {
 "matcher": "Read",
- "hooks": [{ "type": "command", "command": "bash ${CLAUDE_PROJECT_DIR}/hooks/simplify-ignore.sh" }]
+ "hooks": [
+ {
+ "type": "command",
+ "command": "bash ${CLAUDE_PROJECT_DIR}/hooks/simplify-ignore.sh"
+ }
+ ]
 }
 ],
 "PostToolUse": [
 {
 "matcher": "Edit|Write",
- "hooks": [{ "type": "command", "command": "bash ${CLAUDE_PROJECT_DIR}/hooks/simplify-ignore.sh" }]
+ "hooks": [
+ {
+ "type": "command",
+ "command": "bash ${CLAUDE_PROJECT_DIR}/hooks/simplify-ignore.sh"
+ }
+ ]
 }
 ],
 "Stop": [
 {
- "hooks": [{ "type": "command", "command": "bash ${CLAUDE_PROJECT_DIR}/hooks/simplify-ignore.sh" }]
+ "hooks": [
+ {
+ "type": "command",
+ "command": "bash ${CLAUDE_PROJECT_DIR}/hooks/simplify-ignore.sh"
+ }
+ ]
 }
 ]
 }
@@ -50,19 +65,19 @@ result[3] = buf[3] ^ key[3];

 One script, three hook events:

-| Event | Action |
-|---|---|
-| `PreToolUse Read` | Backs up file, replaces blocks with `BLOCK_` placeholders in-place |
+| Event | Action |
+| ------------------------- | ------------------------------------------------------------------------- |
+| `PreToolUse Read` | Backs up file, replaces blocks with `BLOCK_` placeholders in-place |
 | `PostToolUse Edit\|Write` | Expands placeholders back to real code, saves model's changes, re-filters |
-|
`Stop` | Restores all files from backup when session ends |
+| `Stop` | Restores all files from backup when session ends |

 Each block is content-hashed (8 hex chars via `shasum`/`sha1sum`) so the round-trip is unambiguous even if the model duplicates or reorders placeholders. Cache is project-scoped to prevent cross-session interference.

 ## Annotation syntax

 ```js
-/* simplify-ignore-start */ // basic — hides the block
-/* simplify-ignore-start: reason */ // with reason — appears in placeholder
+/* simplify-ignore-start */ // basic — hides the block
+/* simplify-ignore-start: reason */ // with reason — appears in placeholder
 /* simplify-ignore-end */
 ```

diff --git a/third_party/agent-skills/references/accessibility-checklist.md b/third_party/agent-skills/references/accessibility-checklist.md
index c8c61e5..161a872 100644
--- a/third_party/agent-skills/references/accessibility-checklist.md
+++ b/third_party/agent-skills/references/accessibility-checklist.md
@@ -13,6 +13,7 @@ Quick reference for WCAG 2.1 AA compliance. Use alongside the `frontend-ui-engin
 ## Essential Checks

 ### Keyboard Navigation
+
 - [ ] All interactive elements focusable via Tab key
 - [ ] Focus order follows visual/logical order
 - [ ] Focus is visible (outline/ring on focused elements)
@@ -22,6 +23,7 @@ Quick reference for WCAG 2.1 AA compliance. Use alongside the `frontend-ui-engin
 - [ ] Modals trap focus while open, return focus on close

 ### Screen Readers
+
 - [ ] All images have `alt` text (or `alt=""` for decorative images)
 - [ ] All form inputs have associated labels (`