
Conversation

@yeahdongcn
Collaborator

Description

This PR adds caching optimizations to reduce the overhead of torchada's runtime patching for frequently accessed attributes.

Changes

  • Add attribute caching to _CudaModuleWrapper for torch.cuda.* access (the caching pattern is sketched after this list)
  • Add attribute caching to _CudartWrapper for torch.cuda.cudart() function lookups
  • Add attribute caching to _CDLLWrapper for ctypes function name translation
  • Cache torch.backends.cuda.is_built() result (constant at runtime)
  • Add string translation cache to _translate_device() for common device strings
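
A minimal sketch of the caching pattern behind these changes, assuming a `__getattr__`-based proxy. The class and function names follow the list above, but the bodies, the `lru_cache` usage, and the cuda-to-musa string mapping are illustrative, not the PR's actual code:

```python
import functools

import torch


class _CudaModuleWrapper:
    """Illustrative proxy that memoizes resolved attributes of a wrapped module."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self._attr_cache = {}

    def __getattr__(self, name):
        # __getattr__ only fires for names not found on the wrapper itself,
        # so the _wrapped/_attr_cache lookups below never recurse.
        try:
            return self._attr_cache[name]          # fast path: dict hit
        except KeyError:
            value = getattr(self._wrapped, name)   # slow path: resolve once
            self._attr_cache[name] = value
            return value


# is_built() is constant at runtime, so compute it at most once.
@functools.lru_cache(maxsize=1)
def _cuda_is_built():
    return torch.backends.cuda.is_built()


# Memoize common device-string translations ("cuda", "cuda:0", ...).
@functools.lru_cache(maxsize=None)
def _translate_device(device_str: str) -> str:
    return device_str.replace("cuda", "musa", 1)  # illustrative mapping
```

Note that a cache hit still pays one dict lookup inside `__getattr__`, which is consistent with the cached timings in the table below landing around 100-150 ns rather than at plain attribute-access speed.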

Performance Improvements

| Operation | Before | After | Speedup |
|---|---|---|---|
| torch.cuda.Stream (attr) | 842 ns | 131 ns | 6.4x |
| torch.cuda.device_count() | 855 ns | 147 ns | 5.8x |
| cudart.cudaHostRegister | 385 ns | 84 ns | 4.6x |
| torch.backends.cuda.is_built() | 301 ns | 154 ns | 2.0x |

All patched operations now complete in under 700 ns, which is negligible compared to typical GPU kernel launch times (5,000-20,000 ns).
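
For reference, a micro-benchmark along these lines reproduces the kind of numbers in the table; the snippet is illustrative, and absolute timings depend on the machine and PyTorch build:

```python
import timeit

import torch

N = 1_000_000

# Attribute access through the (patched) torch.cuda module.
t = timeit.timeit(lambda: torch.cuda.Stream, number=N)
print(f"torch.cuda.Stream (attr): {t / N * 1e9:.0f} ns")

# Cached constant lookup.
t = timeit.timeit(torch.backends.cuda.is_built, number=N)
print(f"torch.backends.cuda.is_built(): {t / N * 1e9:.0f} ns")
```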

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
@yeahdongcn yeahdongcn requested a review from yafengio January 29, 2026 09:06
@yeahdongcn
Collaborator Author

Also tested together with SGLang; everything works as expected.

@yeahdongcn yeahdongcn merged commit 09d0e02 into main Jan 29, 2026