Skip to content

Conversation

@aswinshakil
Copy link
Member

What changes were proposed in this pull request?

SnapshotDeletingService incorrectly checks the Ratis Buffer Limit when we invoke submitSnapshotMoveDeletedKeys , it adds all the keys to the builder and then checks the size of the message. Later, it again adds more deletedKeys, deletedDirs and renamedKeys on it, The message size is already over the ratis buffer limit, instead of creating a new builder we re-use the same builder. Ideally we should stop iterating through the snapshot-related tables when we hit the buffer limit even before going to submitSnapshotMoveDeletedKeys

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14432

How was this patch tested?

Existing Tests

@aswinshakil aswinshakil requested a review from smengcl January 21, 2026 17:46
@aswinshakil aswinshakil self-assigned this Jan 21, 2026
@aswinshakil aswinshakil added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Jan 21, 2026
@smengcl smengcl requested a review from Copilot January 21, 2026 21:37
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes incorrect Ratis buffer limit checking in SnapshotDeletingService. The old implementation would add all snapshot-related entries to a request builder before checking if the message size exceeded the buffer limit, then attempted to recover by truncating the lists. This approach was flawed as it could create oversized messages.

Changes:

  • Replaced the flawed size-checking logic with progressive batch building that checks size limits before adding each entry
  • Introduced proper batching across deletedKeys, renamedKeys, and deletedDirs with early termination on failures
  • Improved error handling by returning the count of successfully submitted entries for accurate retry tracking

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@aswinshakil aswinshakil requested a review from smengcl January 27, 2026 19:10
Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Could you add a test case for this fix? (which would have failed without this fix)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants