PrePATH: A Toolkit for Preprocessing Whole Slide Images

Tip

🚀 Contribute Your Foundation Model! We welcome submissions of new pathology foundation models to our benchmark. 👉 Submit Your Model Here — Help advance the field by adding your model to PrePATH!

PrePATH is a comprehensive preprocessing toolkit for whole slide images (WSIs), built upon CLAM and ASlide.

TODO

H0-mini
OpenMidnight
TITAN (Slide level)

Installation

Prerequisites

Anaconda or Miniconda
openslide-tools (system dependency)

Setup Instructions

The following instructions demonstrate installation for the GPFM model. For other foundation models, please refer to their respective repositories for environment-specific requirements.

git clone https://github.com/birkhoffkiki/PrePATH.git
cd PrePATH
conda create --name gpfm python=3.10
conda activate gpfm
pip install -r requirements/gpfm.txt
cd models/ckpts/
wget https://github.com/birkhoffkiki/GPFM/releases/download/ckpt/GPFM.pth
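Before moving on, it can help to confirm that the checkpoint download actually landed where the commands above put it. The following is a minimal sketch, not part of PrePATH: the helper name `checkpoint_ready` is hypothetical, and the path `models/ckpts/GPFM.pth` is inferred from the `cd` and `wget` commands above. It only checks that the file exists and is non-empty, so it cannot detect a truncated download.

```python
from pathlib import Path


def checkpoint_ready(repo_root: str) -> bool:
    """Return True if the GPFM checkpoint exists and is non-empty.

    Hypothetical helper: the location is inferred from the setup
    commands (cd models/ckpts/ followed by wget .../GPFM.pth).
    """
    ckpt = Path(repo_root) / "models" / "ckpts" / "GPFM.pth"
    return ckpt.is_file() and ckpt.stat().st_size > 0


if __name__ == "__main__":
    # Run from the PrePATH repository root.
    print("GPFM checkpoint found:", checkpoint_ready("."))
```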

Notes:

ASlide is installed as a Python package from GitHub; it is already listed in requirements/gpfm.txt, so no separate step is needed.
For other foundation models, take the environment configuration from their respective repositories.

Usage

⚡Using PrePATH to extract Patch-Level Features

Step 1: Coordinate Extraction

Extract coordinates of foreground patches from whole slide images:

# Configure variables in the script before execution
bash scripts/get_coors/example.sh
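To make the idea of "foreground patch coordinates" concrete, here is a minimal sketch of one way such coordinates could be derived from a binary tissue mask. It is an illustration only and not the algorithm used by the script: the function name, the `min_tissue_fraction` parameter, and the plain grid scan are all assumptions, and how the tissue mask itself is produced is left out.

```python
from typing import List, Sequence, Tuple


def foreground_patch_coords(
    mask: Sequence[Sequence[bool]],
    patch_size: int,
    step: int,
    min_tissue_fraction: float = 0.5,
) -> List[Tuple[int, int]]:
    """Return (x, y) top-left corners of patches that are mostly tissue.

    mask[y][x] is True where a pixel is considered foreground.
    All names and the threshold default are illustrative assumptions.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    coords: List[Tuple[int, int]] = []
    # Slide a patch-sized window over the mask on a regular grid.
    for y in range(0, height - patch_size + 1, step):
        for x in range(0, width - patch_size + 1, step):
            tissue = sum(
                mask[yy][xx]
                for yy in range(y, y + patch_size)
                for xx in range(x, x + patch_size)
            )
            # Keep the patch only if enough of it is foreground.
            if tissue / (patch_size * patch_size) >= min_tissue_fraction:
                coords.append((x, y))
    return coords


if __name__ == "__main__":
    # 4x4 mask whose left half is tissue.
    demo_mask = [[True, True, False, False] for _ in range(4)]
    print(foreground_patch_coords(demo_mask, patch_size=2, step=2))
```

In practice the mask would be computed on a downsampled view of the slide and the coordinates scaled back to the level used for patching.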

Step 2: Feature Extraction

Extract patch-level features using the selected foundation model:

# Refer to the script for detailed configuration options
bash scripts/extract_feature/one_gpu_example.sh

If you have multiple GPUs, you can use the exe.sh script for parallel processing:

bash scripts/extract_feature/exe.sh
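How exe.sh divides the work is defined in the script itself and is not described here. As a rough illustration of the general approach, the sketch below splits a list of slide IDs across GPUs in round-robin order so that each GPU process can work on its own shard. The function name `shard_slides` and the round-robin policy are assumptions for illustration, not PrePATH's behavior.

```python
from typing import Dict, List


def shard_slides(slide_ids: List[str], num_gpus: int) -> Dict[int, List[str]]:
    """Assign slides to GPU indices in round-robin order.

    Hypothetical helper: sorting first makes the assignment
    reproducible regardless of the input order.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    shards: Dict[int, List[str]] = {gpu: [] for gpu in range(num_gpus)}
    for index, slide_id in enumerate(sorted(slide_ids)):
        shards[index % num_gpus].append(slide_id)
    return shards


if __name__ == "__main__":
    for gpu, slides in shard_slides(["s3", "s1", "s2", "s5", "s4"], 2).items():
        print(f"GPU {gpu}: {slides}")
```

Round-robin keeps shard sizes within one slide of each other, but it does not account for slides that contain many more patches than others.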

Step 3: (Optional) Extract patches and pack them into HDF5 files

This is useful for pretraining, or if you encounter the "Corrupt JPEG data" error during feature extraction.
The error can occur with kfb or sdpc images, because multiprocessing support for these formats is limited.

# Refer to the script for detailed configuration options
bash scripts/crop_image/example_packed2h5.sh

Step 4: (Optional) Extract features from HDF5 packed patches

If you have packed patches into HDF5 files in Step 3, you can extract features from them directly:

# Refer to the script for detailed configuration options
bash scripts/extract_feature/one_gpu_from_h5_example.sh

⚡Extract patches directly without feature extraction (e.g., for pretraining)

Step 1: Coordinate Extraction

Extract coordinates of foreground patches from whole slide images:

# Configure variables in the script before execution
bash scripts/get_coors/example.sh

Step 2: Patch Extraction

Extract patches based on the coordinates. We strongly recommend packing all patches into HDF5 files for efficient storage and retrieval:

# Refer to the script for detailed configuration options
bash scripts/crop_image/example_packed2h5.sh

Supported Foundation Models

Note: Each foundation model requires its corresponding Python environment to be properly configured.

Model	Identifier	Reference
ResNet50	`resnet50`	Standard ImageNet pretrained model
GPFM	`gpfm`	GitHub
CTransPath	`ctranspath`	GitHub
PLIP	`plip`	GitHub
CONCH	`conch`	HuggingFace
CONCH-1.5	`conch15`	HuggingFace
UNI	`uni`	HuggingFace
UNI-2	`uni2`	HuggingFace
mSTAR	`mstar`	GitHub
Phikon	`phikon`	HuggingFace
Phikon2	`phikon2`	HuggingFace
Virchow-2	`virchow2`	HuggingFace
Prov-GigaPath	`gigapath`	HuggingFace
CHIEF	`chief`	GitHub
H-Optimus-0	`h-optimus-0`	HuggingFace
H0-mini	`h0-mini`	HuggingFace
H-Optimus-1	`h-optimus-1`	HuggingFace
OpenMidnight	`openmidnight`	HuggingFace
Lunit	`lunit`	GitHub
Hibou-L	`hibou-l`	GitHub
MUSK	`musk`	HuggingFace
OmiCLIP	`omiclip`	GitHub
PathoCLIP	`pathoclip`	GitHub
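When driving the scripts from your own code, it can be convenient to validate a model identifier before launching a long job. The sketch below collects the identifiers from the table above into a set and checks a user-supplied name against it. The helper `validate_model_identifier` is hypothetical, and normalizing case and whitespace is an assumption; PrePATH itself may treat identifiers as case-sensitive.

```python
# Identifiers copied from the "Supported Foundation Models" table.
SUPPORTED_MODELS = frozenset({
    "resnet50", "gpfm", "ctranspath", "plip", "conch", "conch15",
    "uni", "uni2", "mstar", "phikon", "phikon2", "virchow2",
    "gigapath", "chief", "h-optimus-0", "h0-mini", "h-optimus-1",
    "openmidnight", "lunit", "hibou-l", "musk", "omiclip", "pathoclip",
})


def validate_model_identifier(name: str) -> str:
    """Return the normalized identifier or raise ValueError.

    Hypothetical helper: lowercasing and stripping are assumptions
    made for convenience, not documented PrePATH behavior.
    """
    key = name.strip().lower()
    if key not in SUPPORTED_MODELS:
        known = ", ".join(sorted(SUPPORTED_MODELS))
        raise ValueError(f"Unknown model identifier {name!r}. Known: {known}")
    return key


if __name__ == "__main__":
    print(validate_model_identifier("gpfm"))
```

Note that the identifier is not always the display name: Virchow-2 is `virchow2` and Prov-GigaPath is `gigapath`.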

Supported WSI Formats

PrePATH supports the following whole slide image formats:

KFB (.kfb)
SDPC (.sdpc)
TRON (.tron)
All formats supported by OpenSlide (including .svs, .tiff, .ndpi, .vms, .vmu, .scn, .mrxs, .tif, .bif, and others)
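To collect slides for processing, one option is to scan a directory for files with the extensions listed above. This is a minimal sketch rather than PrePATH code: the helper `find_slides` is hypothetical, and the set contains only the extensions named in this list, so any other OpenSlide-readable format would need to be added by hand.

```python
from pathlib import Path
from typing import List

# Only the extensions explicitly named in the list above.
LISTED_EXTENSIONS = {
    ".kfb", ".sdpc", ".tron",
    ".svs", ".tiff", ".ndpi", ".vms", ".vmu",
    ".scn", ".mrxs", ".tif", ".bif",
}


def find_slides(root: str) -> List[Path]:
    """Recursively list files under root with a listed WSI extension.

    Hypothetical helper: matching is case-insensitive on the suffix.
    """
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in LISTED_EXTENSIONS
    )


if __name__ == "__main__":
    for slide in find_slides("."):
        print(slide)
```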
