@nhorton nhorton (Contributor) commented Jan 16, 2026

Summary

Complete implementation of the rules system (renamed from "policy") with v2 markdown format:

  • Renamed policy → rules throughout codebase for clarity
  • V2 format: Individual .deepwork/rules/*.md files with YAML frontmatter instead of single .deepwork.rules.yml
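  • Hook fixes: JSON-format hooks now exit with code 0; blocking is signaled via {"decision": "block"} in the output, not via the exit code
  • Test coverage: Comprehensive tests, with prominent warning comments on the tests that verify the hook JSON format and exit code contracts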
  • Test consolidation: Merged hook test files into single test_hooks.py

Rules v2 Format

---
name: Rule Name
trigger: "**/*.py"
compare_to: base  # or: default_tip, prompt
---
Instructions for the agent when this rule triggers.
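
As a rough illustration of how a file in this format could be read, here is a minimal sketch that splits the frontmatter from the markdown body. It is not the project's parser: `split_frontmatter` is a hypothetical name, and the hand-rolled `key: value` handling only covers the flat fields shown above (a real implementation would more likely use a YAML library).

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a rule file into (frontmatter fields, markdown body).

    Hypothetical helper: handles only flat `key: value` lines, optional
    quotes around the value, and trailing `# comments` on unquoted values.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("rule file must start with a '---' fence")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter fence is never closed") from None

    fields: dict[str, str] = {}
    for raw in lines[1:end]:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep everything up to the matching quote.
            closing = value.find(value[0], 1)
            if closing == -1:
                raise ValueError(f"unterminated quote in {line!r}")
            value = value[1:closing]
        else:
            # Unquoted value: drop a trailing comment, if any.
            value = value.split(" #", 1)[0].strip()
        fields[key.strip()] = value

    body = "\n".join(lines[end + 1:]).strip()
    return fields, body
```

Applied to the example above, this yields the three fields `name`, `trigger`, and `compare_to`, plus the instruction text as the body.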

Key Changes

| Before | After |
|---|---|
| .deepwork.rules.yml (single file) | .deepwork/rules/*.md (individual files) |
| policy terminology | rules terminology |
| Exit code 2 for blocking | Exit code 0 + JSON {"decision": "block"} |

Test plan

  • All 470 tests pass
  • Rules trigger correctly with compare_to: prompt mode
  • Hook JSON format follows Claude Code contract
  • Exit codes verified against documentation

🤖 Generated with Claude Code

@nhorton nhorton changed the title from "Plan and document policy system changes" to "Implement rules system v2 with markdown format" Jan 17, 2026
@nhorton nhorton marked this pull request as ready for review January 17, 2026 22:18
@nhorton nhorton added this pull request to the merge queue Jan 17, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 17, 2026
claude and others added 21 commits January 17, 2026 16:57
Design docs for next-generation policy system with:
- File correspondence matching (sets and pairs)
- Idempotent command execution
- Queue-based state tracking with detector/evaluator pattern
- Folder-based policy storage using frontmatter markdown files

Key changes from current system:
- Policies move from single .deepwork.policy.yml to .deepwork/policies/*.md
- YAML frontmatter for config, markdown body for instructions
- New 'set' syntax for bidirectional file relationships
- New 'pair' syntax for directional file relationships
- New 'action' field for running commands instead of prompts
- Queue system prevents duplicate policy triggers across sessions

Key changes:
- Restructure taxonomy: detection modes (trigger/safety, set, pair) + action types (prompt, command)
- Add required `name` field for human-friendly promise tag display (e.g., "✓ Source/Test Pairing")
- Remove priority and defer features (not needed yet)
- Clarify .deepwork/tmp is gitignored, so cleanup is not critical
- Shorten output format - group by policy name, use simple arrow notation for correspondence
- Update all examples to include name field
- Don't enforce idempotency, just document it as expected behavior
- Give lint formatters (black, ruff, prettier) as good examples
- Remove output_mode from config (not referenced elsewhere)
- Remove idempotency verification test scenarios

This implements the redesigned policy system with:

- Detection modes: trigger/safety (default), set (bidirectional), pair (directional)
- Action types: prompt (show instructions), command (run idempotent command)
- Variable pattern matching: {path} for multi-segment, {name} for single-segment
- Queue system in .deepwork/tmp/policy/queue/ for state tracking
- Frontmatter markdown format for policy files in .deepwork/policies/
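
The variable pattern matching described above could be sketched roughly as follows. This is an assumption-laden illustration, not the contents of pattern_matcher.py: `compile_pattern` and `extract` are hypothetical names, only the two variables named in the commit message are handled, and each variable is assumed to appear at most once per pattern.

```python
import re
from typing import Optional

# Matches the two variable forms named in the commit message.
_VARIABLE = re.compile(r"\{(path|name)\}")


def compile_pattern(pattern: str) -> "re.Pattern[str]":
    """Translate a variable pattern into a regex with named groups."""
    parts = []
    pos = 0
    for match in _VARIABLE.finditer(pattern):
        # Literal text between variables is matched verbatim.
        parts.append(re.escape(pattern[pos:match.start()]))
        if match.group(1) == "path":
            parts.append(r"(?P<path>.+)")       # may span several segments
        else:
            parts.append(r"(?P<name>[^/]+)")    # exactly one segment
        pos = match.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


def extract(pattern: str, file_path: str) -> Optional[dict]:
    """Return the captured variables, or None if the path does not match."""
    match = compile_pattern(pattern).fullmatch(file_path)
    return match.groupdict() if match else None
```

For example, `extract("src/{path}.py", "src/a/b/c.py")` captures `a/b/c`, while `{name}` in `tests/{name}_test.py` refuses to match across a `/`.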

New core modules:
- pattern_matcher.py: Variable pattern matching with regex
- policy_queue.py: Queue system for policy state persistence
- command_executor.py: Command action execution with substitution

Updates to existing modules:
- policy_parser.py: v2 Policy class with detection modes and action types
- policy_check.py: Uses new v2 system with queue deduplication
- evaluate_policies.py: Updated for v1 backward compatibility
- policy_schema.py: New frontmatter schema for v2 format
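
To make the queue deduplication idea concrete, here is a minimal sketch under stated assumptions. The commit messages only say that the queue lives under .deepwork/tmp/policy/queue/ and tracks state; the hash inputs, the entry file layout, the `status` field, and the function names `entry_hash` and `enqueue_once` are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def entry_hash(rule_name: str, files: list) -> str:
    """Stable identifier for one (rule, triggering files) combination."""
    payload = json.dumps({"rule": rule_name, "files": sorted(files)},
                         sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def enqueue_once(queue_dir: Path, rule_name: str, files: list) -> bool:
    """Record a trigger; return False if the same trigger is already queued."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    entry = queue_dir / f"{entry_hash(rule_name, files)}.json"
    if entry.exists():
        return False  # duplicate trigger, nothing to do
    entry.write_text(json.dumps(
        {"rule": rule_name, "files": sorted(files), "status": "queued"}
    ))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        queue = Path(tmp) / "queue"
        print(enqueue_once(queue, "Source/Test Pairing", ["a.py", "b.py"]))
        print(enqueue_once(queue, "Source/Test Pairing", ["b.py", "a.py"]))
```

Because the file list is sorted before hashing, the same set of files in a different order maps to the same entry, so the second call in the usage example is treated as a duplicate.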

Tests updated to work with both v1 and v2 APIs.

- Update README.md with v2 policy examples and directory structure
- Update doc/architecture.md with v2 detection modes, action types, and queue system
- Bump version to 0.4.0 in pyproject.toml
- Add changelog entry for v2 policy system features

The hook now:
- Checks for v2 policies in .deepwork/policies/ first
- Falls back to v1 policies in .deepwork.policy.yml if no v2 found
- Passes JSON input directly to policy_check.py for v2 (via wrapper)
- Maintains existing behavior for v1 evaluate_policies.py

Remove all legacy v1 policy format (.deepwork.policy.yml) support:

- Remove evaluate_policies.py hook module
- Remove PolicyV1 class and parse_policy_file from policy_parser.py
- Remove v1 schema (POLICY_SCHEMA_V1) from policy_schema.py
- Remove v1 test fixtures and test_evaluate_policies.py
- Update test fixtures to use v2 frontmatter markdown format
- Update documentation to remove v1 references
- Fix policy_stop_hook.sh to handle exit code 2 (block) correctly

Only v2 frontmatter markdown format (.deepwork/policies/*.md) is now supported.

Rename all policy-related terminology to rules throughout the codebase:
- Rename deepwork_policy job to deepwork_rules
- Rename .deepwork.policy.yml to .deepwork.rules.yml
- Rename policy_parser.py, policy_queue.py, policy_check.py to rules_*
- Rename policy_schema.py to rules_schema.py
- Rename policy_stop_hook.sh to rules_stop_hook.sh
- Update all documentation, tests, and references

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous commit renamed deepwork_policy to deepwork_rules but left
duplicate hook entries in settings.json pointing to the old paths.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add 134 new tests covering test plan scenarios:
  - test_pattern_matcher.py: glob patterns, variable extraction, resolution
  - test_command_executor.py: variable substitution, command execution
  - test_rules_queue.py: queue entry lifecycle, hash calculation
  - test_schema_validation.py: required fields, mutual exclusivity
  - Extended test_rules_parser.py with correspondence sets/pairs tests

- Security: Add shlex.quote() to command_executor.py to prevent
  command injection via malicious file paths

- Fix ruff linting issues in pattern_matcher.py, rules_queue.py,
  and rules_check.py (f-strings, datetime.UTC, open mode)

- Update .gitignore comment from "policy" to "rules"

- Remove doc/test_scenarios.md (all scenarios now covered by tests)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
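
The shlex.quote() fix mentioned in the commit above can be illustrated with a small sketch. The `{file}` placeholder and the `build_command` helper are assumptions made for this example; the actual substitution syntax in command_executor.py may differ.

```python
import shlex


def build_command(template: str, file_path: str) -> str:
    """Substitute a file path into a shell command template.

    Quoting the path means shell metacharacters in a file name are passed
    to the command as literal text instead of being interpreted.
    """
    return template.replace("{file}", shlex.quote(file_path))


if __name__ == "__main__":
    print(build_command("ruff format {file}", "src/app.py"))
    print(build_command("ruff format {file}", "a.py; rm -rf ~"))
```

An ordinary path passes through unchanged, while a path containing `;` is wrapped in single quotes and reaches the command as one argument.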
- Replace single .deepwork.rules.yml (v1) with individual .md files
  in .deepwork/rules/ directory (v2 frontmatter markdown format)

- Update install.py to create rules directory structure with:
  - README explaining v2 format
  - Example templates (.md.example files)

- Add v2 example templates in standard_jobs/deepwork_rules/rules/:
  - readme-documentation.md.example (trigger/safety mode)
  - api-documentation-sync.md.example (trigger/safety mode)
  - security-review.md.example (trigger-only mode)
  - source-test-pairing.md.example (set/bidirectional mode)

- Completely rewrite deepwork_rules.define step for v2 format:
  - Detection mode selection (trigger/safety, set, pair)
  - Variable pattern syntax ({path}, {name})
  - Updated examples and file location guidance

- Migrate this repo's bespoke rules to v2:
  - readme-accuracy.md
  - architecture-documentation-accuracy.md
  - standard-jobs-source-of-truth.md
  - version-and-changelog-update.md

- Remove deprecated src/deepwork/templates/default_rules.yml

- Update integration tests for v2 directory structure

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Hooks using JSON output format should always exit with code 0.
The blocking behavior is controlled by the "decision" field in the
JSON output, not the exit code.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
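
A minimal sketch of the contract described in this commit: the hook always exits 0 and expresses blocking through the JSON payload. The `hook_result` helper is hypothetical, and the `reason` field is an assumption that goes beyond what this PR states; check the hook specification before relying on it.

```python
import json
import sys


def hook_result(blocked: bool, reason: str = "") -> tuple:
    """Return (stdout text, exit code) for a JSON-format hook.

    The exit code is always 0; only the "decision" field signals a block.
    """
    payload = {"decision": "block", "reason": reason} if blocked else {}
    return json.dumps(payload), 0


if __name__ == "__main__":
    output, exit_code = hook_result(True, "README is out of date")
    print(output)
    sys.exit(exit_code)
```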
Add prominent warning comments to test files that verify Claude Code hook
JSON format and exit code contracts. These comments reference the official
documentation and clearly mark tests that should not be modified without
consulting the hook specification.

Files updated:
- tests/shell_script_tests/test_hooks_json_format.py
- tests/shell_script_tests/test_hook_wrappers.py
- tests/unit/test_hook_wrapper.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Consolidate test_hooks_json_format.py and test_hook_wrappers.py into a
single test_hooks.py file with logical organization:

- TestClaudeHookWrapper / TestGeminiHookWrapper: Platform wrapper scripts
- TestRulesStopHook / TestUserPromptSubmitHook: Rules-specific hooks
- TestHooksWithTranscript: Transcript input handling
- TestHookExitCodes: Exit code contract tests (DO NOT EDIT)
- TestHookWrapperIntegration: Integration tests with Python hooks
- TestRulesCheckModule: Python module tests

Also moved hooks_dir and src_dir fixtures to conftest.py for sharing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Add manual test files for testing hook/rule functionality

Creates manual_tests/claude/ directory with test files that exercise
different rule styles:
- Trigger/Safety mode (basic conditional)
- Set mode (bidirectional correspondence)
- Pair mode (directional correspondence)
- Command action (automatic command execution)
- Multi-safety (multiple safety patterns)

Each test file includes documentation explaining what it tests,
how to trigger it, and expected behavior. Corresponding rule
definitions added to .deepwork/rules/.
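
The difference between set mode and pair mode can be sketched as below. This reflects one reading of "bidirectional" and "directional" from the descriptions above; the function names are hypothetical, and the sketch works on explicit file names rather than the variable patterns the real rules use.

```python
def missing_for_set(group: set, changed: set) -> set:
    """Set mode (bidirectional): if any member changed, all should change."""
    if not (group & changed):
        return set()          # nothing in the group was touched
    return group - changed    # members that were left behind


def missing_for_pair(trigger: str, expected: set, changed: set) -> set:
    """Pair mode (directional): only a change to `trigger` creates an obligation."""
    if trigger not in changed:
        return set()
    return expected - changed


if __name__ == "__main__":
    group = {"src/app.py", "tests/app_test.py"}
    print(missing_for_set(group, {"tests/app_test.py"}))
    print(missing_for_pair("api/routes.py", {"docs/api.md"}, {"docs/api.md"}))
```

In the usage example, editing only the test file leaves the source file outstanding in set mode, whereas editing only the documentation in pair mode creates no obligation.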

* Move manual test files from manual_tests/claude/ to manual_tests/

Flatten directory structure as requested. Updated all rule definitions
to reference the new paths.

* Reorganize manual tests into subfolders per test type

Group related files together:
- test_trigger_safety_mode/
- test_set_mode/
- test_pair_mode/
- test_command_action/
- test_multi_safety/

Updated rule definitions and README to match new structure.

* Add compare_to: prompt to manual test rules

This ensures rules evaluate against changes since the last prompt
rather than against the merge-base, allowing them to fire during
the current conversation when files are edited.
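
One way to picture the effect of the baseline choice is to compare snapshots of file content hashes. This is purely illustrative: the snapshot dictionaries and `changed_since` are hypothetical and say nothing about how the project actually records the state at prompt time.

```python
def changed_since(baseline: dict, current: dict) -> set:
    """Paths whose content hash differs between two snapshots.

    Both arguments map file path -> content hash; a path missing from one
    side counts as changed (added or deleted).
    """
    paths = baseline.keys() | current.keys()
    return {p for p in paths if baseline.get(p) != current.get(p)}


if __name__ == "__main__":
    merge_base = {"a.py": "h1", "b.py": "h1"}
    at_prompt = {"a.py": "h2", "b.py": "h1"}   # a.py was edited before the prompt
    now = {"a.py": "h2", "b.py": "h3"}         # b.py was edited after the prompt

    print(sorted(changed_since(merge_base, now)))
    print(sorted(changed_since(at_prompt, now)))
```

Against the merge-base snapshot both files count as changed, while against the prompt-time snapshot only the file edited after the prompt does.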

* Add sub-agent testing instructions to manual tests README

Explains that the best way to run these tests is as sub-agents
using a fast model (haiku), with example prompts and verification
commands.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Update manual test files with both-case test instructions

- Updated README with test matrix showing expected results
- Added TEST CASE sections to each test file documenting both
  "should fire" and "should NOT fire" scenarios
- Added test results tracking table to README

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
@nhorton nhorton force-pushed the claude/policy-system-planning-T8939 branch from ca40769 to 66f2032 January 17, 2026 23:58
@nhorton nhorton merged commit b7f8cdb into main Jan 17, 2026
4 checks passed
@nhorton nhorton deleted the claude/policy-system-planning-T8939 branch January 17, 2026 23:59
3 participants