Conversation

@northdpole
Collaborator

No description provided.

@PRAteek-singHWY
Contributor

PRAteek-singHWY commented Feb 1, 2026

@northdpole
Thanks a lot for sharing this, sir. It is extremely helpful and very well structured.

I've gone through the RFC and it gives a clear architectural and experimental framework to build the proposal around. I'll spend some time digesting it in detail and start aligning my work proposal with this design and the pre-code experiments outlined here.

@PRAteek-singHWY
Contributor

@northdpole

Thanks for putting this together, sir. The experimental framework is really clear.

I’m particularly interested in Module C (The Librarian) and want to start with the suggested pre-code experiments before proposing any concrete design or implementation.

The negation problem stands out — I’ve worked on gap analysis features before (#716) and have seen how basic similarity metrics can struggle with logical inversions in requirements (e.g., “Use X” vs “Do NOT use X”).
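
To make the inversion issue concrete, here is a minimal sketch using bag-of-words cosine similarity as a stand-in for a "basic similarity metric". The three sentences are made-up examples, not ASVS text, and a real embedding model would give different numbers; the point is only that a requirement and its negation share almost every token.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical requirement texts, for illustration only.
positive = "use md5 for password hashing"
negated = "do not use md5 for password hashing"
unrelated = "rotate session tokens after login"

sim_negated = cosine(positive, negated)      # high, despite opposite meaning
sim_unrelated = cosine(positive, unrelated)  # no shared tokens

print(f"positive vs negated:   {sim_negated:.3f}")
print(f"positive vs unrelated: {sim_unrelated:.3f}")
```

The negated sentence scores far closer to the original than the unrelated one does, even though it means the opposite. That is the failure mode the experiment is meant to measure.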

Plan:
I’ll start with the ASVS re-classification experiment:

  • Extract 50 ASVS requirements and strip metadata
  • Baseline: vector search with cosine similarity
  • Comparison: cross-encoder re-ranking (ms-marco-MiniLM-L-6-v2)
  • Target: >20% accuracy improvement on negative requirements
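
A rough sketch of how the baseline and the re-ranked run could be compared. Everything here is an assumption for illustration: `overlap` stands in for vector cosine similarity, `negation_aware` stands in for the cross-encoder, and the candidate strings are toy data. In a real run the second-stage scorer would wrap the cross-encoder (to my knowledge, `sentence_transformers.CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict(...)` scores a list of text pairs). The ">20%" target could be read as relative or as percentage points, so the sketch reports both.

```python
from typing import Callable, Dict, List, Sequence

Scorer = Callable[[str, str], float]  # (query, candidate) -> relevance score

NEGATIONS = {"not", "never", "avoid"}  # toy list, illustration only


def overlap(query: str, candidate: str) -> float:
    """Toy first stage: shared-token count (stand-in for vector cosine)."""
    return float(len(set(query.lower().split()) & set(candidate.lower().split())))


def negation_aware(query: str, candidate: str) -> float:
    """Toy second stage: penalise pairs that disagree on negation.

    Hypothetical placeholder; a real run would score the pair with the
    cross-encoder instead.
    """
    q_neg = bool(NEGATIONS & set(query.lower().split()))
    c_neg = bool(NEGATIONS & set(candidate.lower().split()))
    return overlap(query, candidate) - (2.0 if q_neg != c_neg else 0.0)


def rerank(query: str, candidates: Sequence[str], first: Scorer,
           second: Scorer, k: int = 10) -> List[str]:
    """Shortlist the top k by `first`, then reorder the shortlist by `second`."""
    shortlist = sorted(candidates, key=lambda c: first(query, c), reverse=True)[:k]
    return sorted(shortlist, key=lambda c: second(query, c), reverse=True)


def top1_accuracy(predictions: Sequence[str], truths: Sequence[str]) -> float:
    """Fraction of items whose top prediction matches the true label."""
    return sum(p == t for p, t in zip(predictions, truths)) / len(truths)


def improvement(baseline: float, candidate: float) -> Dict[str, float]:
    """Report the gain both ways, since the target wording is ambiguous."""
    delta = candidate - baseline
    return {"absolute_points": delta * 100, "relative_percent": delta / baseline * 100}


candidates = ["use md5", "avoid md5", "rotate keys"]
# Baseline: both stages use the same scorer, so the order is unchanged.
baseline_top = rerank("do not use md5", candidates, overlap, overlap, k=3)[0]
# Re-ranked: the second stage reorders the top-2 shortlist.
reranked_top = rerank("do not use md5", candidates, overlap, negation_aware, k=2)[0]
print(baseline_top, "->", reranked_top)
```

Running both configurations over the same 50 stripped requirements and passing the two accuracies to `improvement` would give the number to compare against the target.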

If the experiment is successful, I’m also interested in exploring hybrid search (vector + BM25), especially for cases like CVE identifiers where pure vector search often underperforms.
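
One possible shape for the hybrid step, as a sketch only. The fusion method is not specified above; reciprocal rank fusion (RRF) is used here as one common option, with a small pure-Python BM25 so the example runs without extra packages. The documents and the "vector" ranking are hypothetical: the vector ranking is hard-coded to mimic an embedding model that misses the exact identifier.

```python
import math
from collections import Counter
from typing import Dict, List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Okapi BM25 score of each document for a whitespace-tokenised query."""
    tokenized = [d.lower().split() for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


def to_ranking(scores: List[float]) -> List[int]:
    """Document indices ordered best-first (ties keep original order)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings: List[List[int]], k: int = 60) -> List[int]:
    """Reciprocal rank fusion: sum 1 / (k + rank) over all input rankings."""
    fused: Dict[int, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda doc: fused[doc], reverse=True)


# Hypothetical corpus, for illustration only.
docs = [
    "patch for cve-2021-44228 in log4j",
    "general logging guidance",
    "cve-2014-0160 heartbleed advisory",
]
keyword_ranking = to_ranking(bm25_scores("cve-2021-44228", docs))
vector_ranking = [2, 0, 1]  # assumed output of a vector search that misses the ID
hybrid_ranking = rrf([keyword_ranking, vector_ranking])
print(hybrid_ranking)
```

BM25 matches the identifier as an exact token, so the fused list puts the right document first even though the assumed vector ranking does not.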

I'll take this up step by step.

I’ll share experiment results and observations before proposing any implementation.

I’m using AI tools (similar to Cursor/Windsurf) and have read Section 3.

Thank you.

3 participants