@nearestnabors nearestnabors commented Jan 23, 2026

Summary

  • Adds safeguards to prevent the vale-style-review automation from removing code blocks from documentation
  • The LLM was making overly aggressive suggestions that removed entire code examples (as seen in PR #700, "Editorial improvements for #698")

Changes

  • Add getLinesInCodeBlocks() function to detect lines inside fenced code blocks
  • Skip any suggestion targeting lines inside code blocks
  • Reject suggestions where suggested text is <70% of original length
  • Strengthen LLM prompt with explicit "NEVER MODIFY CODE" instructions
  • Extract validation helpers to reduce cognitive complexity

Test plan

  • Run pnpm vale:review --pr <number> on a PR with code blocks and verify no code is modified
  • Verify suggestions are still generated for prose content outside code blocks

🤖 Generated with Claude Code


Note

Hardens the vale-style-review script to avoid destructive edits and never touch code blocks.

  • Detects fenced code blocks (``` or ~~~) per CommonMark (fence indent ≤3 columns, tabs expanded to tab stops of 4) via new helpers countLeadingWhitespace, countLeadingFenceChars, isFenceCandidate, isValidClosingFence, and getLinesInCodeBlocks; skips any suggestion that targets those lines
  • Strengthens the LLM prompt with an explicit “NEVER MODIFY CODE” rule and guidance to omit code-related issues
  • Adds validation helpers hasValidFields, wouldRemoveContent (suggested text must be ≥70% of the original length), and isDestructiveChange (>50% length change), and applies them in formatReviewComments
  • Introduces constants MIN_SUGGESTED_LENGTH_RATIO, MAX_FENCE_INDENT, and TAB_STOP_WIDTH; adjusts logging and filtering but preserves the overall workflow
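
For readers without the diff open, here is a minimal sketch of how the validation helpers named above might fit together. The helper and constant names come from this PR's summary, but the signatures, the Suggestion shape, MAX_LENGTH_CHANGE_RATIO, and isSafeSuggestion are assumptions made for illustration, not the script's actual code.

```typescript
// Hypothetical shape of one parsed LLM suggestion; the real script's fields may differ.
interface Suggestion {
  line: number; // 1-based line number in the reviewed file
  original: string;
  suggested: string;
}

// Named in the PR summary; the value 0.7 is inferred from the "<70%" rule.
const MIN_SUGGESTED_LENGTH_RATIO = 0.7;
// Hypothetical constant for the ">50% length change" rule.
const MAX_LENGTH_CHANGE_RATIO = 0.5;

// True when the raw LLM output has the fields the later checks rely on.
function hasValidFields(value: unknown): value is Suggestion {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const line = candidate["line"];
  return (
    typeof line === "number" &&
    Number.isInteger(line) &&
    line >= 1 &&
    typeof candidate["original"] === "string" &&
    typeof candidate["suggested"] === "string"
  );
}

// True when the suggestion keeps less than 70% of the original (trimmed) text.
function wouldRemoveContent(s: Suggestion): boolean {
  const before = s.original.trim().length;
  if (before === 0) return false;
  return s.suggested.trim().length / before < MIN_SUGGESTED_LENGTH_RATIO;
}

// True when the trimmed length changes by more than 50% in either direction.
function isDestructiveChange(s: Suggestion): boolean {
  const before = s.original.trim().length;
  const after = s.suggested.trim().length;
  if (before === 0) return after > 0;
  return Math.abs(after - before) / before > MAX_LENGTH_CHANGE_RATIO;
}

// Hypothetical composition: a suggestion survives only if every check passes.
function isSafeSuggestion(value: unknown, codeBlockLines: Set<number>): boolean {
  return (
    hasValidFields(value) &&
    !codeBlockLines.has(value.line) &&
    !wouldRemoveContent(value) &&
    !isDestructiveChange(value)
  );
}
```

With checks like these, a suggestion on a line inside a code block is dropped no matter how small the edit is, and a prose suggestion is dropped when it would delete most of the sentence.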

Written by Cursor Bugbot for commit 606e9b4. This will update automatically on new commits.

The LLM was making overly aggressive suggestions that removed entire
code blocks from documentation. This adds multiple safeguards:

- Add getLinesInCodeBlocks() to detect lines inside fenced code blocks
- Skip any suggestion targeting lines inside code blocks
- Reject suggestions where suggested text is <70% of original length
- Strengthen LLM prompt with explicit "NEVER MODIFY CODE" instructions
- Extract validation helpers to reduce cognitive complexity

Fixes the issue where PR #700 removed 173 lines of code examples.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

cursor[bot]

This comment was marked as outdated.

…tyle-review

- Make isDestructiveChange use trimmed lengths to match wouldRemoveContent
- Update prompt to reflect actual validation logic (allows 30% shorter text)

This fixes false positives where whitespace-only changes were incorrectly rejected,
and aligns the LLM prompt with the actual 70% length threshold in the code.
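
To illustrate the false positive this commit describes, here is a small sketch comparing trimmed and untrimmed length checks. lengthChangeRatio and the sample strings are hypothetical; only the idea of trimming before measuring comes from the commit message.

```typescript
// Relative length change between two strings, optionally ignoring
// leading and trailing whitespace. Hypothetical helper for illustration.
function lengthChangeRatio(original: string, suggested: string, trim: boolean): number {
  const before = (trim ? original.trim() : original).length;
  const after = (trim ? suggested.trim() : suggested).length;
  if (before === 0) return after === 0 ? 0 : 1;
  return Math.abs(after - before) / before;
}

// 12 visible characters followed by 16 trailing spaces (28 characters total).
const paddedOriginal = "Hello world." + " ".repeat(16);
// The suggestion only strips the trailing whitespace.
const cleanedSuggestion = "Hello world.";

// Untrimmed: 16 / 28, about 0.57, which exceeds a 50% change limit.
const untrimmedRatio = lengthChangeRatio(paddedOriginal, cleanedSuggestion, false);
// Trimmed: both sides are 12 characters, so the ratio is 0.
const trimmedRatio = lengthChangeRatio(paddedOriginal, cleanedSuggestion, true);
```

Measured on raw lengths, the whitespace-only cleanup looks like a destructive edit; measured on trimmed lengths, it is a no-op, which matches how wouldRemoveContent is described as already behaving.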

cursor bot commented Jan 23, 2026

Bugbot Autofix resolved both bugs found in the latest run.

  • ✅ Fixed: Inconsistent length validation with trimmed vs untrimmed strings
    • Updated isDestructiveChange to use trimmed lengths, making it consistent with wouldRemoveContent.
  • ✅ Fixed: Prompt contradicts actual validation logic for length
    • Changed prompt from 'NEVER be shorter' to 'should not be significantly shorter' to match the actual 70% threshold.

The function now tracks the opening backtick count and only closes a code block when encountering a line with at least as many backticks. This prevents nested code blocks (e.g., 4+ backticks wrapping 3-backtick blocks) from being incorrectly detected.
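
A minimal sketch of the fence-length tracking described above, assuming 1-based line numbers and that fence lines themselves count as part of the block. It handles only unindented backtick fences and ignores trailing content on the closing fence; linesInBacktickBlocks and leadingBackticks are hypothetical names, not the script's actual functions.

```typescript
// Length of the run of backticks at the very start of a line (0 if none).
function leadingBackticks(line: string): number {
  const match = /^`+/.exec(line);
  return match === null ? 0 : match[0].length;
}

// Returns the 1-based numbers of all lines inside backtick-fenced blocks,
// including the fence lines themselves (an assumption of this sketch).
function linesInBacktickBlocks(content: string): Set<number> {
  const inside = new Set<number>();
  let openLength = 0; // 0 means "not inside a block"
  const lines = content.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const ticks = leadingBackticks(lines[i]);
    if (openLength === 0) {
      if (ticks >= 3) {
        openLength = ticks; // remember how long the opening fence was
        inside.add(i + 1);
      }
      continue;
    }
    inside.add(i + 1);
    // Only a fence at least as long as the opener closes the block,
    // so a shorter inner fence stays part of the outer block.
    if (ticks >= openLength) openLength = 0;
  }
  return inside;
}
```

When a four-backtick fence wraps a three-backtick example, the inner fence lines never satisfy the closing condition, so every line up to the matching four-backtick fence is reported as code.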

cursor bot commented Jan 23, 2026

Bugbot Autofix resolved the bug found in the latest run.

  • ✅ Fixed: Nested code blocks incorrectly detected
    • Fixed by tracking opening backtick count and only closing code blocks when encountering lines with at least as many backticks, preventing nested blocks from being incorrectly parsed.

- Enforce 3-space indentation limit for code fences per CommonMark spec
- Validate closing fences only accept whitespace after backticks
- Prevent misidentification of indented backticks as fence markers
- Ensure non-whitespace content after backticks keeps fence open
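
The indentation and closing-fence rules in this commit could look roughly like the sketch below. countLeadingWhitespace, isFenceCandidate, isValidClosingFence, MAX_FENCE_INDENT, and TAB_STOP_WIDTH are names from this PR, but the signatures and bodies here are assumptions; countLeadingBackticks is a hypothetical helper, and only backtick fences are handled.

```typescript
const MAX_FENCE_INDENT = 3; // CommonMark: a fence may be indented at most 3 columns
const TAB_STOP_WIDTH = 4; // CommonMark: tabs advance to the next multiple of 4

// Width of the leading whitespace in columns, expanding tabs to tab stops.
function countLeadingWhitespace(line: string): number {
  let columns = 0;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (ch === " ") {
      columns += 1;
    } else if (ch === "\t") {
      columns += TAB_STOP_WIDTH - (columns % TAB_STOP_WIDTH);
    } else {
      break;
    }
  }
  return columns;
}

// Number of backticks at the start of the text (0 if none). Hypothetical helper.
function countLeadingBackticks(text: string): number {
  const match = /^`+/.exec(text);
  return match === null ? 0 : match[0].length;
}

// Removes leading spaces and tabs only.
function stripIndent(line: string): string {
  return line.replace(/^[ \t]+/, "");
}

// A line can open a fence if it is indented at most 3 columns
// and then starts with at least 3 backticks.
function isFenceCandidate(line: string): boolean {
  return (
    countLeadingWhitespace(line) <= MAX_FENCE_INDENT &&
    countLeadingBackticks(stripIndent(line)) >= 3
  );
}

// A closing fence needs the same indent limit, at least as many backticks
// as the opener, and nothing but whitespace after them.
function isValidClosingFence(line: string, openLength: number): boolean {
  if (countLeadingWhitespace(line) > MAX_FENCE_INDENT) return false;
  const body = stripIndent(line);
  const ticks = countLeadingBackticks(body);
  return ticks >= openLength && body.slice(ticks).trim() === "";
}
```

Under these rules a fence indented by four spaces or a tab is treated as ordinary content, and a line such as three backticks followed by a language tag cannot close an open block.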

cursor bot commented Jan 23, 2026

Bugbot Autofix resolved both bugs found in the latest run.

  • ✅ Fixed: Closing fence incorrectly accepts non-whitespace content
    • Added validation to ensure closing fences only accept whitespace after backticks per CommonMark specification.
  • ✅ Fixed: Indentation limit not enforced for code fences
    • Implemented leading space counting to enforce the 3-space indentation limit for code fence markers per CommonMark specification.

- Extract magic numbers to constants (MAX_FENCE_INDENT, TAB_STOP_WIDTH)
- Reduce cognitive complexity in getLinesInCodeBlocks by extracting helper functions
- Fix line formatting in isDestructiveChange

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON. A Cloud Agent has been kicked off to fix the reported issue.

The getLinesInCodeBlocks function now detects both backtick (```) and tilde (~~~) fenced code blocks.
- Enhanced isValidClosingFence to match opening fence character
- Updated getLinesInCodeBlocks to track fence type
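
Putting the fence rules together, a compact regex-based sketch of a detector that tracks both fence character and fence length might look like this. It reuses the name getLinesInCodeBlocks from this PR, but the body, the OpenFence type, FENCE_PATTERN, and the choice to count fence lines as part of the block are assumptions, not the merged implementation.

```typescript
// Hypothetical record of the fence that opened the current block.
interface OpenFence {
  char: string; // "`" or "~"
  length: number; // number of fence characters in the opener
}

// Up to 3 leading spaces, then a run of 3+ backticks or 3+ tildes, then the rest.
// A leading tab counts as 4 columns, so it correctly fails to match.
const FENCE_PATTERN = /^ {0,3}(`{3,}|~{3,})(.*)$/;

// Returns the 1-based numbers of all lines inside fenced code blocks,
// including the opening and closing fence lines.
function getLinesInCodeBlocks(content: string): Set<number> {
  const inside = new Set<number>();
  let open: OpenFence | null = null;
  const lines = content.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const match = FENCE_PATTERN.exec(lines[i]);
    if (open === null) {
      if (match !== null) {
        const fence = match[1];
        const info = match[2];
        // CommonMark: the info string of a backtick fence may not contain backticks.
        if (fence.charAt(0) === "`" && info.indexOf("`") !== -1) continue;
        open = { char: fence.charAt(0), length: fence.length };
        inside.add(i + 1);
      }
      continue;
    }
    inside.add(i + 1);
    // Close only on the same fence character, at least the same length,
    // and nothing but whitespace after the fence.
    if (
      match !== null &&
      match[1].charAt(0) === open.char &&
      match[1].length >= open.length &&
      match[2].trim() === ""
    ) {
      open = null;
    }
  }
  return inside;
}
```

Tracking the fence character means a backtick fence appearing inside a tilde block is treated as code content and does not end the block, and the reverse holds as well.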

cursor bot commented Jan 23, 2026

Bugbot Autofix resolved the bug found in the latest run.

  • ✅ Fixed: Code block detection misses tilde-fenced blocks
    • Updated code block detection to support both backtick (```) and tilde (~~~) fenced code blocks per CommonMark specification, ensuring Vale suggestions are never applied to code inside either fence type.

Contributor

docs/.vale.ini

Line 18 in 35be995

BlockIgnores = (?s)```[\s\S]*?```
should be telling Vale to ignore code blocks already.

I think it may be missing parentheses around the rule:

(?s)(```[\s\S]*?```)

or

(?s) *(```[\s\S]*?```)

if there may be leading whitespace (I don't think that would be true for a code block).

Contributor Author

Claude Code:

The issue: Vale's BlockIgnores regex doesn't work for fenced code blocks because:

  1. For .md files: Vale parses Markdown natively and creates code scopes. IgnoredScopes = code works for most content, but lines starting with # inside code blocks get misidentified as headings.
  2. For .mdx files: Vale doesn't have native MDX support and treats MDX as plain text, so:
    - No code scopes are created, so IgnoredScopes = code doesn't help
    - The BlockIgnores regex is supposed to work but has known issues (errata-ai/vale#115, "BlockIgnores not working for .mdx files"; errata-ai/vale#387, "Vale is picking up errors for stuff in code blocks in GitHub markdown")

The bottom line: the commenter's suggestion about parentheses won't fix it; it's a Vale limitation for MDX files. That's exactly why this PR's vale-style-review.ts script adds its own getLinesInCodeBlocks() function to detect code blocks independently.
