Chunkwise image loader #279
base: main
… image-reader-chunkwise
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #279 +/- ##
==========================================
+ Coverage 55.16% 62.91% +7.74%
==========================================
Files 26 27 +1
Lines 2844 3117 +273
==========================================
+ Hits 1569 1961 +392
+ Misses 1275 1156 -119
melonora
left a comment
Thanks for your contribution! I have two minor suggestions. I also saw that you use the width-by-height convention. Personally, I don't have a strong opinion here, though we could also stick to array API conventions. @LucaMarconato WDYT? Pre-approving for now.
melonora
left a comment
Sorry, I had to change this after rethinking memmap. As far as I am aware, it does not always work, for example when dealing with compressed TIFFs.
Co-authored-by: Wouter-Michiel Vierdag <w-mv@hotmail.com>
Hi @LucaMarconato, I just saw that you pushed some updates. Could you comment on the current implementation, and is this PR still of interest to you?
@lucas-diedrich yes, I'm still interested in it, and I have time now to go through it. I added my "code review" in terms of "TODO" items for myself. I'll check the code and push some modifications if I see that minor fixes are needed; I'll comment on any larger changes, but the code looks good. I'll also do a benchmark with …
There was a problem with the dimension of the returned data.
@lucas-diedrich I'm done with edits, please double check the changes if you have time. The main comment, added in the notes in … In summary, redefining … I will now do the benchmark with …
…e computation order of chunks/assembly
Hi @LucaMarconato, thanks for implementing the changes and apologies for the confusing coords convention! The tests are currently failing when I pass an asymmetric chunk size (e.g. …
…tion. Switch axes order everywhere from (x, y) to (y, x)
…d documentation
1. Enforce standard dimension order convention (y, x).
2. Change local variable names to better distinguish between chunk-level coordinates and pixel-level coordinates
   - All chunk indices are indicated as such with the prefix.
   - The pixel-coordinates of individual chunks are now consistently named (y, x, height, width)
…e it self-documenting
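To make the (y, x, height, width) convention from the commits above concrete, and to show how an asymmetric chunk size behaves at the image edges, here is a minimal standard-library sketch. The name chunk_grid and its signature are hypothetical and chosen for illustration only; this is not the PR's _compute_chunks, whose actual interface is not shown in this conversation.

```python
from typing import List, Tuple

# Pixel-level box of one chunk, in (y, x, height, width) order.
Chunk = Tuple[int, int, int, int]


def chunk_grid(
    image_shape: Tuple[int, int], chunk_shape: Tuple[int, int]
) -> List[List[Chunk]]:
    """Split a (height, width) image into a row-major grid of pixel boxes.

    Hypothetical helper. Chunks on the bottom and right edges are clipped,
    so the boxes tile the image exactly even when the chunk size does not
    divide the image size or when chunk height and width differ.
    """
    image_height, image_width = image_shape
    chunk_height, chunk_width = chunk_shape
    if chunk_height <= 0 or chunk_width <= 0:
        raise ValueError("chunk dimensions must be positive")

    grid: List[List[Chunk]] = []
    for y in range(0, image_height, chunk_height):
        height = min(chunk_height, image_height - y)
        row: List[Chunk] = []
        for x in range(0, image_width, chunk_width):
            width = min(chunk_width, image_width - x)
            row.append((y, x, height, width))
        grid.append(row)
    return grid
```

Indexing the result as grid[chunk_y][chunk_x] keeps chunk-level indices separate from the pixel-level coordinates stored in each box, which is the distinction the renaming commit is after.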
Description
This PR addresses the challenge that the currently implemented and planned image loaders require loading imaging data entirely into memory, typically as NumPy arrays. Given the large size of microscopy datasets, this is not always feasible.
To mitigate this issue, and as discussed with @LucaMarconato, this PR aims to introduce a generalizable approach for reading large microscopy files in chunks, enabling efficient handling of data that does not fit into memory.
Some related discussions.
Strategy
In this PR, we focus on .tiff images, as implemented in the _tiff_to_chunks function.
- … (tifffile.memmap)
- … (_compute_chunks)
- … dask.array which is memory-mapped and avoids memory overflow (_read_chunks)
- … dask.array (via dask.array.block)
The strategy is implemented in src/spatialdata_io/readers/generic.py and src/spatialdata_io/readers/_utils/_image.py
Future extensions
The strategy can be implemented for any image type, as long as it is possible to implement
We have implemented similar readers for openslide-compatible whole slide images and the Carl-Zeiss microscopy format.
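The chunkwise strategy described under Strategy (memory-map the file, compute chunk boxes, read each chunk lazily, assemble with dask.array.block) can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: read_chunk and lazy_image are hypothetical names, the data is assumed to be an uncompressed (c, y, x) array, and numpy.memmap on a raw file stands in for tifffile.memmap so that the example does not need a TIFF file. As noted in the review, memory-mapping may not work for compressed TIFFs.

```python
import dask
import dask.array as da
import numpy as np


def read_chunk(path, shape, dtype, y, x, height, width):
    """Read one (c, height, width) window from a raw memory-mapped file.

    Hypothetical reader; np.memmap stands in for tifffile.memmap here.
    """
    source = np.memmap(path, dtype=dtype, mode="r", shape=shape)
    # Copy only the requested window into memory.
    return np.array(source[:, y : y + height, x : x + width])


def lazy_image(path, shape, dtype, chunk_shape):
    """Assemble a lazy (c, y, x) dask array from chunkwise delayed reads."""
    channels, image_height, image_width = shape
    chunk_height, chunk_width = chunk_shape
    lazy_read = dask.delayed(read_chunk)

    rows = []
    for y in range(0, image_height, chunk_height):
        height = min(chunk_height, image_height - y)  # clip bottom edge
        row = []
        for x in range(0, image_width, chunk_width):
            width = min(chunk_width, image_width - x)  # clip right edge
            row.append(
                da.from_delayed(
                    lazy_read(path, shape, dtype, y, x, height, width),
                    shape=(channels, height, width),
                    dtype=dtype,
                )
            )
        rows.append(row)

    # For a two-level nested list, da.block joins the inner lists along the
    # last axis (x) and the outer list along the second-to-last axis (y).
    return da.block(rows)
```

Nothing is read until the result is computed or sliced, and each task touches only its own window, so peak memory is governed by the chunk size rather than the full image size.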