fix/nested shard reads #3655
"array_fixture",
[
    ArrayRequest(shape=(128,) * 3, dtype="uint16", order="F"),
    ArrayRequest(shape=(127, 128, 129), dtype="uint16", order="F"),
By making the shape of the array irregular w.r.t. the chunk shape, we hit the partial decode path, which is required to trigger the bug reported in #3652.
spath,
data=data,
chunks=(64,) * data.ndim,
compressors=None,
Setting compressors to None here is also required to trigger the partial decode path.
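To make the "irregular w.r.t. the chunk shape" point concrete, here is a small arithmetic sketch. It only compares the two array shapes from the test parametrization against the 64-element chunk edge; `grid_summary` is a hypothetical helper, not part of the library. Whether a nonzero remainder is what routes a read into the partial decode path is the reviewer's claim above, not something this sketch demonstrates.

```python
from math import ceil


def grid_summary(shape, chunks):
    """Return (chunks per axis, leftover elements per axis) for an array.

    Hypothetical helper for illustration only; it is not a library function.
    """
    grid = tuple(ceil(s / c) for s, c in zip(shape, chunks))
    remainders = tuple(s % c for s, c in zip(shape, chunks))
    return grid, remainders


chunks = (64,) * 3

# Old fixture shape: every axis is an exact multiple of the chunk edge.
regular = grid_summary((128,) * 3, chunks)

# New fixture shape: two of the three axes leave a remainder, so the
# chunks along those edges are only partly covered by the array.
irregular = grid_summary((127, 128, 129), chunks)

print(regular)    # ((2, 2, 2), (0, 0, 0))
print(irregular)  # ((2, 2, 3), (63, 0, 1))
```

With the old shape, every chunk is fully covered by the array. With the new shape, the first axis ends in a 63-element edge and the last axis needs a third chunk holding a single element.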
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3655      +/-   ##
==========================================
- Coverage   60.88%   60.55%   -0.34%
==========================================
  Files          86       86
  Lines       10182    10226      +44
==========================================
- Hits         6199     6192       -7
- Misses       3983     4034      +51
2cfd6d5 to 4379a66
…x/nested-shard-reads
… into fix/nested-shard-reads
Fixes #3652 by properly allowing byte range requests in the partial decoding pathway of the sharding codec.
The failure was caused by a certain store-like class (defined just for sharding) failing to handle byte range requests.
There are 2 ways to fix this. One leaves a lot of the cruft in the sharding codec intact and just modifies some existing methods. The other removes the cruft and re-uses something we have already defined elsewhere in the codebase: the Store API.
That's what this PR does.

Edit: this PR now keeps the cruft, with the goal of keeping things small and simple.

In a later PR, ShardingByteGetter and ShardingByteSetter can be removed, and the sharding codec can instead create Store classes as part of its decoding process. When reading a full shard, a _ShardReaderStore is created, which maps string keys (the names of chunks) onto contiguous ranges in a buffer (the shard bytes). When reading a partial shard, the separate chunk byte regions are used to create a MemoryStore.

Using the Store API here means nested sharding just works, and we get an easy way to add things like caching later.
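As a rough illustration of the store-based idea described above, the sketch below maps chunk names onto contiguous byte ranges of a single shard buffer and accepts an optional byte range on reads. `ToyShardStore`, its `get` signature, and the key names are all hypothetical; this is not the library's Store API and not the proposed _ShardReaderStore, just a minimal model of the same mapping.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyShardStore:
    """Read-only view of one shard: chunk name -> (offset, length) in a buffer.

    Hypothetical class for illustration; not the library's Store API.
    """

    shard: bytes
    index: dict[str, tuple[int, int]]

    def get(self, key: str, byte_range: tuple[int, int] | None = None) -> bytes | None:
        entry = self.index.get(key)
        if entry is None:
            return None  # missing chunk
        offset, length = entry
        chunk = memoryview(self.shard)[offset : offset + length]
        if byte_range is not None:
            # The byte range is relative to the start of the chunk.
            start, stop = byte_range
            chunk = chunk[start:stop]
        return bytes(chunk)


# An outer shard whose first chunk is itself a (nested) inner shard.
inner_shard = b"aaaabbbb"
outer = ToyShardStore(b"XX" + inner_shard + b"cccc", {"c/0": (2, 8), "c/1": (10, 4)})

# Reading the nested shard is just another store built on the bytes
# returned by the outer one.
inner = ToyShardStore(outer.get("c/0"), {"c/0/0": (0, 4), "c/0/1": (4, 4)})

whole_chunk = inner.get("c/0/1")
ranged_chunk = inner.get("c/0/1", byte_range=(1, 3))
print(whole_chunk, ranged_chunk)  # b'bbbb' b'bb'
```

Because the inner store has the same interface as the outer one, nesting needs no special casing in this toy model, which is the property the description attributes to using the Store API.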