[QDP] PyTorch CUDA stream‑aware encode for GPU tensors #930
base: main
Conversation
Signed-off-by: 400Ping <fourhundredping@gmail.com>
@400Ping thanks for the patch!
@CheyuWu PTAL
Signed-off-by: 400Ping <fourhundredping@gmail.com>
rich7420 left a comment
@400Ping thanks for the patch!
overall LGTM
But I think we could add more unit tests for the new feature here.
I think this PR should add tests for the feature, as I mentioned previously.
viiccwen left a comment
overall LGTM!
left comments.
force-pushed from cd52809 to 4b71ad0
Signed-off-by: 400Ping <jiekaichang@apache.org>
Purpose of PR
- lib.rs: obtain PyTorch's current CUDA stream pointer and route CUDA-tensor encoding through stream-aware core APIs to prevent cross-stream races.
- lib.rs: add encode_from_gpu_ptr_with_stream and encode_batch_from_gpu_ptr_with_stream, and synchronize on the provided stream; keep default-stream entry points intact.
- amplitude.rs: add a stream-aware GPU norm path so normalization and kernels run on the same stream before host reads. (A minimal usage sketch follows below.)
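To make the stream handoff concrete, here is a minimal Python sketch. The encoder object and the exact encode_from_gpu_ptr_with_stream signature are assumptions drawn from this PR's description rather than a confirmed public API; torch.cuda.current_stream(...).cuda_stream is the standard PyTorch way to read the raw CUDA stream handle.

```python
import torch

def encode_on_current_stream(tensor, encoder):
    """Encode a CUDA tensor on PyTorch's current stream so the encode
    runs after any kernels already queued on that stream."""
    assert tensor.is_cuda, "expected a GPU tensor"
    # Raw cudaStream_t handle of PyTorch's current stream, as an int.
    stream_ptr = torch.cuda.current_stream(tensor.device).cuda_stream
    # Hypothetical binding from this PR: encode from a device pointer on
    # the caller's stream and synchronize on that stream (not the default
    # stream) before any host-side read of the result.
    return encoder.encode_from_gpu_ptr_with_stream(
        tensor.data_ptr(),  # device pointer to the tensor's storage
        tensor.numel(),     # number of elements to encode
        stream_ptr,
    )
```

On the Rust side, the stream-aware entry points would then issue their kernels and the final synchronization on this handle instead of the default stream, which is what prevents the cross-stream races described above.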
Related Issues or PRs
Related to #726
Changes Made
Breaking Changes
Checklist