wsipipe.preprocess.patching package
Patches are generated according to settings of patch finders. Patches are then stored as patchsets.
patch_finder module
Patch Finders describe how patches are created for a slide.
They work on a labelled image: a numpy array of integers giving the annotation category of each pixel.
The labelled image can be taken from any level of the pyramid at which the corresponding numpy array fits into memory.
A patch finder creates a dataframe with columns x, y and label, where x and y give the top left corner of the patch and label is the label applied to the patch.
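The idea of turning a labelled image into a patch dataframe can be sketched as follows. This is a minimal illustration, not wsipipe's implementation: the helper name grid_patches is hypothetical, coordinates are in pixels of the labelled image, each patch takes the maximum label inside it, and label 0 is assumed to mean background.

```python
import numpy as np
import pandas as pd


def grid_patches(labels_image: np.ndarray, patch_size: int, stride: int) -> pd.DataFrame:
    """Hypothetical helper: tile a labelled image and label each patch.

    Each patch is given the maximum label found inside it, and patches
    whose label is 0 (assumed to be background) are dropped.
    """
    height, width = labels_image.shape
    rows = []
    for y in range(0, height - patch_size + 1, stride):
        for x in range(0, width - patch_size + 1, stride):
            window = labels_image[y:y + patch_size, x:x + patch_size]
            label = int(window.max())
            if label != 0:
                rows.append({"x": x, "y": y, "label": label})
    return pd.DataFrame(rows, columns=["x", "y", "label"])


# A 4x4 labelled image: 0 = background, 1 and 2 are annotation categories.
labels_image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
])
patches = grid_patches(labels_image, patch_size=2, stride=2)
print(patches)
```

Only the two patches that overlap an annotation survive; the two all-background patches are removed.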
- class GridPatchFinder(labels_level, patch_level, patch_size, stride, border=0, jitter=0, remove_background=True, pool_mode='max')
Bases:
PatchFinder
- Parameters
labels_level (int) –
patch_level (int) –
patch_size (int) –
stride (int) –
border (int) –
jitter (int) –
remove_background (bool) –
pool_mode (str) –
- class PatchFinder
Bases:
object
Generic patch finder class
- Parameters
labels_image (np.array) – The whole slide image represented as a 2D numpy array in which the classification of each pixel is given by an integer, for example an image such as those output by AnnotationSet.render
slide_shape (Size) – The size of the WSI at the level at which the labels are rendered. This may be different to the labels image shape, as the labels image may not include blank parts of the slide in the bottom right.
- abstract property labels_level
- class RandomPatchFinder(labels_level, patch_level, patch_size, border=0, npatches=1000, pool_mode='mode')
Bases:
PatchFinder
- Parameters
labels_level (int) –
patch_level (int) –
patch_size (int) –
border (int) –
npatches (int) –
pool_mode (str) –
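As a rough sketch of random patch sampling, the following draws top left corners uniformly and labels each patch with the most common pixel label inside it. This is an illustration under stated assumptions, not wsipipe's implementation: the function random_patches and its seed argument are hypothetical, and the most-common-label rule is only one plausible reading of pool_mode='mode'.

```python
import numpy as np
import pandas as pd


def random_patches(labels_image: np.ndarray, patch_size: int,
                   npatches: int, seed: int = 0) -> pd.DataFrame:
    """Hypothetical helper: sample patch corners uniformly at random."""
    rng = np.random.default_rng(seed)
    height, width = labels_image.shape
    # The upper bound is exclusive, so every patch fits inside the image.
    xs = rng.integers(0, width - patch_size + 1, size=npatches)
    ys = rng.integers(0, height - patch_size + 1, size=npatches)
    labels = []
    for x, y in zip(xs, ys):
        window = labels_image[y:y + patch_size, x:x + patch_size]
        # Most frequent non-negative integer label in the window.
        labels.append(int(np.bincount(window.ravel()).argmax()))
    return pd.DataFrame({"x": xs, "y": ys, "label": labels})


labels_image = np.zeros((8, 8), dtype=int)
labels_image[:, 4:] = 1  # right half annotated as category 1
sampled = random_patches(labels_image, patch_size=2, npatches=5)
print(sampled)
```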
patchset module
PatchSets are sets of patches and all the information required to create them from the slides.
- Many patches in the set may share the same details (which we call PatchSettings):
the path of the slide to read from
the level of the slide at which to create the patch
the size of the patch to be created
how to load the slide
- To create an individual patch, you need to know:
the top left position of the patch
the label to be applied to the patch
A PatchSet therefore consists of a dataframe and a settings list.
- The settings list is a list of PatchSettings each of which contains:
slide_path, level, patch_size, loader
- In the dataframe each row represents a patch and contains columns:
x (left), y (top), label, settings (index to list)
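The split between per-patch rows and shared settings can be illustrated with a small sketch. SettingSketch is a hypothetical stand-in for PatchSetting with the loader omitted, and the slide path is made up for the example.

```python
from dataclasses import dataclass
from pathlib import Path

import pandas as pd


@dataclass
class SettingSketch:
    """Hypothetical stand-in for PatchSetting (the loader is omitted)."""
    level: int
    patch_size: int
    slide_path: Path


# Details shared by many patches live once in the settings list.
settings_list = [
    SettingSketch(level=0, patch_size=256, slide_path=Path("slides/a.tif")),
    SettingSketch(level=1, patch_size=256, slide_path=Path("slides/a.tif")),
]

# Each row is one patch; the settings column indexes into settings_list.
df = pd.DataFrame({
    "x": [0, 256, 0],
    "y": [0, 0, 512],
    "label": [1, 1, 2],
    "settings": [0, 0, 1],
})

for row in df.itertuples(index=False):
    setting = settings_list[row.settings]
    print(row.x, row.y, row.label, setting.slide_path, setting.level)
```

Storing an index rather than repeating the settings on every row keeps the dataframe small when thousands of patches come from the same slide and level.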
- class PatchSet(df, settings)
Bases:
object
- Parameters
df (pandas.DataFrame) –
settings (List[PatchSetting]) –
- description()
Returns a basic summary of the patchset.
The summary gives the labels and the total number of patches for each label.
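A per-label count of this kind can be computed from the patch dataframe with a pandas group-by. This is a sketch of the idea only; the exact form of the value returned by description() is not specified here.

```python
import pandas as pd

# A small patch dataframe with three patches of label 1 and one of label 2.
patch_df = pd.DataFrame({
    "x": [0, 256, 0, 256],
    "y": [0, 0, 256, 256],
    "label": [1, 1, 2, 1],
})

# Number of patches for each label.
label_counts = patch_df.groupby("label").size()
print(label_counts)
```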
- export_patches(output_dir)
Creates all patches in a patch set
Writes patches into subdirectories named after their label. Patches are named slide_path_x_y_level_patch_size.png
- Parameters
output_dir (Path) – the directory in which the patches are saved
- Return type
None
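The output layout described above can be sketched with pathlib. The helper patch_output_path is hypothetical, and using the slide file's stem for the slide_path part of the name is an assumption; the library may derive that part differently.

```python
from pathlib import Path


def patch_output_path(output_dir: Path, slide_path: Path, x: int, y: int,
                      level: int, patch_size: int, label: int) -> Path:
    """Hypothetical helper: build <output_dir>/<label>/<name>.png."""
    # Assumption: the slide file stem stands in for "slide_path" in the name.
    name = f"{slide_path.stem}_{x}_{y}_{level}_{patch_size}.png"
    return output_dir / str(label) / name


out = patch_output_path(
    Path("patches"), Path("slides/tumor_001.tif"),
    x=512, y=1024, level=0, patch_size=256, label=1,
)
print(out)
```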
- class PatchSetting(level, patch_size, slide_path, loader)
Bases:
object
Patch Setting Definition
- Parameters
level (int) – The level at which patches are extracted
patch_size (int) – The size of the patches to be created; patches are assumed to be square
slide_path (Path) – the path to the whole slide image
loader (Loader) – A method for loading the slide
- classmethod from_sdict(sdict)
Converts a dictionary to a PatchSetting
- Parameters
sdict (dict) –
- level: int
- patch_size: int
- slide_path: Path
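A dictionary-to-setting conversion can be sketched with a dataclass and a classmethod. The class SettingFromDict is hypothetical, the loader is omitted, and the dictionary keys are assumed to match the attribute names listed above; the keys the library actually expects are not documented here.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SettingFromDict:
    """Hypothetical stand-in for PatchSetting (the loader is omitted)."""
    level: int
    patch_size: int
    slide_path: Path

    @classmethod
    def from_sdict(cls, sdict: dict) -> "SettingFromDict":
        # Assumption: keys are named after the attributes; values may be strings.
        return cls(
            level=int(sdict["level"]),
            patch_size=int(sdict["patch_size"]),
            slide_path=Path(sdict["slide_path"]),
        )


setting = SettingFromDict.from_sdict(
    {"level": "0", "patch_size": 256, "slide_path": "slides/a.tif"}
)
print(setting)
```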
patchset_utils module
Utilities for creating sets of patches
- combine(patchsets)
Combines multiple patchsets into one
This gives a combined dataframe containing all patches in a dataset, for example for use when sampling patches. It also renumbers the settings so that the indexes in the dataframe match the correct setting in the combined settings list.
- Parameters
patchsets (List[PatchSet]) – A list of PatchSets
- Returns
A combined patchset
- Return type
patchset (PatchSet)
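The renumbering step can be sketched by offsetting each dataframe's settings index by the number of settings already collected. This is an illustration, not the library's code: combine_sketch is a hypothetical function that works on plain (dataframe, settings list) pairs, and the settings are represented by placeholder strings.

```python
import pandas as pd


def combine_sketch(patchsets):
    """Hypothetical helper: merge (df, settings) pairs into one pair."""
    frames = []
    combined_settings = []
    for df, settings in patchsets:
        # Indexes in this dataframe start after the settings gathered so far.
        offset = len(combined_settings)
        shifted = df.copy()
        shifted["settings"] = shifted["settings"] + offset
        frames.append(shifted)
        combined_settings.extend(settings)
    combined_df = pd.concat(frames, ignore_index=True)
    return combined_df, combined_settings


ps_a = (
    pd.DataFrame({"x": [0, 256], "y": [0, 0], "label": [1, 1], "settings": [0, 0]}),
    ["setting for slide A"],
)
ps_b = (
    pd.DataFrame({"x": [0], "y": [512], "label": [2], "settings": [0]}),
    ["setting for slide B"],
)
combined_df, combined_settings = combine_sketch([ps_a, ps_b])
print(combined_df)
```

Both inputs use index 0 locally, but after combining, the patch from the second slide points at index 1 in the combined settings list.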
- load_patchsets_from_directory(patchsets_dir)
Loads PatchSets from a directory
Loads patchsets for a whole dataset stored in subdirectories of patchsets_dir
- Parameters
patchsets_dir (Path) – a path to a directory containing subdirectories with PatchSets
- Returns
A list of PatchSets, one for each slide
- Return type
patchset (List[PatchSet])
- make_and_save_patchsets_for_dataset(dataset, loader, tissue_detector, patch_finder, output_dir, project_root=PosixPath('/'))
Creates PatchSets for all slides in a dataset
For each slide in the dataset this creates the PatchSet and then saves it in a subdirectory of output_dir
- Parameters
dataset (pd.DataFrame) – a dataframe containing columns slide and annotation
loader (Loader) – loader to use to load slide and annotations
tissue_detector (TissueDetector) – tissue detector to use to remove background
patch_finder (PatchFinder) – patch finder to use to create patches
output_dir (Path) – a directory to save the patchsets in
project_root (Path, optional) – paths will be stored relative to the project root. Defaults to root (absolute paths)
- Returns
A list of PatchSets, one for each slide
- Return type
patchset (List[PatchSet])
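Storing paths relative to a project root can be illustrated with pathlib alone. The paths below are made up for the example, and this shows only the general idea of relative storage, not how the library performs it internally.

```python
from pathlib import PurePosixPath

project_root = PurePosixPath("/data/project")
slide_path = PurePosixPath("/data/project/slides/a.tif")

# Path as it would be stored, relative to the project root.
stored = slide_path.relative_to(project_root)

# Joining with the project root recovers the full path.
restored = project_root / stored

# With the default root "/", the stored path keeps every component
# of the absolute path.
stored_default = slide_path.relative_to(PurePosixPath("/"))

print(stored, restored, stored_default)
```

Relative storage lets saved patchsets move between machines as long as the directory layout beneath the project root is the same.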
- make_patchset_for_slide(slide_path, annot_path, loader, tissue_detector, patch_finder, project_root=PosixPath('/'))
Creates a PatchSet for a single slide
- Parameters
slide_path (Path) – path to whole slide image
annot_path (Path) – annotation information for slide
loader (Loader) – loader to use to load slide and annotations
tissue_detector (TissueDetector) – tissue detector to use to remove background
patch_finder (PatchFinder) – patch finder to use to create patches
project_root (Path, optional) – paths will be stored relative to the project root. Defaults to root (absolute paths)
- Returns
A PatchSet for the slide
- Return type
patchset (PatchSet)
- make_patchsets_for_dataset(dataset, loader, tissue_detector, patch_finder, project_root=PosixPath('/'))
Creates PatchSets for all slides in a dataset
For each slide in the dataset this creates the PatchSet
- Parameters
dataset (pd.DataFrame) – a dataframe containing columns slide and annotation
loader (Loader) – loader to use to load slide and annotations
tissue_detector (TissueDetector) – tissue detector to use to remove background
patch_finder (PatchFinder) – patch finder to use to create patches
project_root (Path, optional) – paths will be stored relative to the project root. Defaults to root (absolute paths)
- Returns
A list of PatchSets, one for each slide
- Return type
patchset (List[PatchSet])
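The expected shape of the dataset argument and the per-slide loop can be sketched as below. The file paths are invented, and make_for_slide is a hypothetical stand-in for the per-slide step; it simply records its inputs rather than creating real patches.

```python
import pandas as pd

# One row per slide, with the slide path and its annotation path.
dataset = pd.DataFrame({
    "slide": ["slides/a.tif", "slides/b.tif"],
    "annotation": ["annotations/a.xml", "annotations/b.xml"],
})


def make_for_slide(slide_path: str, annot_path: str) -> dict:
    """Hypothetical stand-in for the per-slide patchset creation."""
    return {"slide": slide_path, "annotation": annot_path}


# One result per slide, in dataset order.
results = [
    make_for_slide(row.slide, row.annotation)
    for row in dataset.itertuples(index=False)
]
print(results)
```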
- visualise_patches_on_slide(ps, vis_level, project_root=PosixPath('/'))
Draws patches on a thumbnail of the slide
Visualises where on the slide the patches occur. Assumes a patch set for one slide with only one set of settings
- Parameters
ps (PatchSet) – A PatchSet for one slide
vis_level (int) – the level at which to create a slide image to draw patches on
project_root (Path) –
- Returns
A thumbnail of the slide with patch locations drawn on
- Return type
thumb (Image)
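Drawing patches on a thumbnail requires mapping patch coordinates to the visualisation level. The sketch below shows one way to do that mapping under explicit assumptions: x, y and patch_size are taken to be in pixels at patch_level, each pyramid level is assumed to halve the resolution of the previous one, and vis_level is assumed to be at least patch_level. The function patch_rect_at_level is hypothetical, and real slides may use other downsample factors.

```python
def patch_rect_at_level(x: int, y: int, patch_size: int,
                        patch_level: int, vis_level: int) -> tuple:
    """Hypothetical helper: patch rectangle in vis_level pixel coordinates.

    Assumes a factor of 2 between consecutive levels and
    vis_level >= patch_level.
    """
    factor = 2 ** (vis_level - patch_level)
    left, top = x // factor, y // factor
    right, bottom = (x + patch_size) // factor, (y + patch_size) // factor
    return (left, top, right, bottom)


rect = patch_rect_at_level(x=1024, y=2048, patch_size=256,
                           patch_level=0, vis_level=5)
print(rect)  # (32, 64, 40, 72)
```

A rectangle in this form could then be passed to a drawing routine to outline the patch on the thumbnail.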