@400Ping
Member

@400Ping 400Ping commented Jan 23, 2026

Purpose of PR

  • lib.rs: obtain PyTorch’s current CUDA stream pointer and route CUDA-tensor encoding through stream-aware core APIs to prevent cross-stream races.
  • lib.rs: add encode_from_gpu_ptr_with_stream and encode_batch_from_gpu_ptr_with_stream, and synchronize on the provided stream; keep default-stream entry points intact.
  • amplitude.rs: add a stream-aware GPU norm path so normalization and kernels run on the same stream before host reads.
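The relationship between the new stream-aware entry points and the existing default-stream ones can be sketched roughly as follows. This is a CPU-only illustration, not the PR's actual code: only the function name `encode_from_gpu_ptr_with_stream` comes from the description above, while `encode_from_gpu_ptr`, `RawStream`, `Launch`, and all signatures are assumptions made for the example. It relies on the CUDA convention that a null stream handle refers to the default stream.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for a raw `cudaStream_t` handle.
/// In CUDA, a null handle denotes the default stream.
type RawStream = *mut c_void;

/// Hypothetical record of what a real implementation would submit to the GPU.
#[derive(Debug, PartialEq)]
struct Launch {
    /// Address of the stream the work was queued on (0 = default stream).
    stream: usize,
    /// Whether the stream was synchronized before returning to the host.
    synced: bool,
}

/// Stream-aware entry point (name from the PR description; signature assumed).
/// A real implementation would launch the norm and encode kernels on `stream`
/// and synchronize that same stream before any host-side read.
fn encode_from_gpu_ptr_with_stream(_data: *const f64, _len: usize, stream: RawStream) -> Launch {
    Launch {
        stream: stream as usize,
        synced: true,
    }
}

/// Default-stream entry point kept intact (hypothetical name):
/// it simply forwards a null stream handle to the stream-aware variant.
fn encode_from_gpu_ptr(data: *const f64, len: usize) -> Launch {
    encode_from_gpu_ptr_with_stream(data, len, std::ptr::null_mut())
}
```

Under these assumptions, a caller holding PyTorch's current stream handle would pass it to the `_with_stream` variant, while existing callers keep their behavior because the default entry point resolves to stream 0.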

Related Issues or PRs

Related to #726

Changes Made

  • Bug fix
  • New feature
  • Refactoring
  • Documentation
  • Test
  • CI/CD pipeline
  • Other

Breaking Changes

  • Yes
  • No

Checklist

  • Added or updated unit tests for all changes
  • Added or updated documentation for all changes
  • Successfully built and ran all unit tests or manual tests locally
  • PR title follows "MAHOUT-XXX: Brief Description" format (if related to an issue)
  • Code follows ASF guidelines

Signed-off-by: 400Ping <fourhundredping@gmail.com>
@400Ping 400Ping changed the title [QDP] GPU pointer validation stream sync [QDP] GPU Pointer Validation Stream Sync Jan 23, 2026
@400Ping 400Ping closed this Jan 23, 2026
@400Ping 400Ping reopened this Jan 23, 2026
@400Ping 400Ping changed the title [QDP] GPU Pointer Validation Stream Sync [QDP] PyTorch CUDA stream‑aware encode for GPU tensors Jan 23, 2026
@rich7420
Contributor

@400Ping thanks for the patch!
Please fix the CI errors and merge conflicts.

@400Ping
Member Author

400Ping commented Jan 25, 2026

@CheyuWu PTAL

Signed-off-by: 400Ping <fourhundredping@gmail.com>
Signed-off-by: 400Ping <fourhundredping@gmail.com>
Contributor

@rich7420 rich7420 left a comment

@400Ping thanks for the patch!
Overall LGTM,
but I think we could add more unit tests for the new feature here.

Signed-off-by: 400Ping <jiekaichang@apache.org>
@400Ping 400Ping requested a review from rich7420 January 28, 2026 16:05
@guan404ming guan404ming added this to the Qumat 0.5.1 milestone Jan 29, 2026
@rich7420
Contributor

rich7420 commented Jan 29, 2026

I think this PR should add tests for the feature, as I mentioned previously.

Contributor

@viiccwen viiccwen left a comment

Overall LGTM!
Left some comments.

@cursor cursor bot force-pushed the qdp/pytorch-direct-gpu-stream-sync branch from cd52809 to 4b71ad0 Compare January 29, 2026 10:05
@400Ping 400Ping force-pushed the qdp/pytorch-direct-gpu-stream-sync branch from cd52809 to 4b71ad0 Compare January 29, 2026 11:41
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>
Signed-off-by: 400Ping <jiekaichang@apache.org>

5 participants