simlearner3d.processing

Processing includes image normalization and standardization, as well as augmentation routines adapted for stereo matching.

simlearner3d.processing.datamodule.hdf5

class simlearner3d.processing.datamodule.hdf5.HDF5StereoDataModule(data_dir: str, split_csv_path: str, hdf5_file_path: str, tile_width: numbers.Number = 1024, tile_height: numbers.Number = 1024, patch_size: numbers.Number = 768, sign_disp_multiplier: numbers.Number = 1, masq_divider: numbers.Number = 1, subtile_overlap_train: numbers.Number = 0, subtile_overlap_predict: numbers.Number = 0, batch_size: int = 12, num_workers: int = 1, prefetch_factor: int = 2, transforms: Optional[Dict[str, List[Callable]]] = None, **kwargs)[source]

Datamodule to feed train and validation data to the model.
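A minimal construction sketch based on the signature above; the paths are hypothetical placeholders, and the remaining values shown are the documented defaults:

from simlearner3d.processing.datamodule.hdf5 import HDF5StereoDataModule

# Hypothetical paths, for illustration only.
datamodule = HDF5StereoDataModule(
    data_dir="data/raw",
    split_csv_path="data/split.csv",
    hdf5_file_path="data/dataset.hdf5",
    tile_width=1024,
    tile_height=1024,
    patch_size=768,
    batch_size=12,
    num_workers=1,
)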

property dataset: simlearner3d.processing.dataset.hdf5.HDF5Dataset

Abstraction to ease HDF5 dataset instantiation.

Parameters

image_paths_by_split_dict (IMAGE_PATHS_BY_SPLIT_DICT_TYPE, optional) – Maps each split (val/train/test) to file paths. If specified, the HDF5 file is created at dataset initialization time. Otherwise, a precomputed HDF5 file is used directly without I/O to the HDF5 file. This is useful for multi-GPU training, where data creation is performed in the prepare_data method, and the dataset is then loaded again on each GPU in the setup method. Defaults to None.

Returns

the dataset with train, val, and test data.

Return type

HDF5Dataset

prepare_data(stage: Optional[str] = None)[source]

Prepare dataset containing train, val, test data.

setup(stage: Optional[str] = None) → None[source]

Instantiate the (already prepared) dataset (called on all GPUs).
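A usage sketch of the standard LightningDataModule flow, assuming the datamodule instance from the construction example above:

# Prepare data once (single process), then set up on every process/GPU.
datamodule.prepare_data()
datamodule.setup(stage="fit")
train_loader = datamodule.train_dataloader()
batch = next(iter(train_loader))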

test_dataloader()[source]

An iterable or collection of iterables specifying test samples.

For more information about multiple dataloaders, see the Lightning documentation.

For data processing use the following pattern: download in prepare_data(), process and split in setup().

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a test dataset and a test_step(), you don’t need to implement this method.

train_dataloader()[source]

An iterable or collection of iterables specifying training samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.

For data processing use the following pattern: download in prepare_data(), process and split in setup().

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

val_dataloader()[source]

An iterable or collection of iterables specifying validation samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set reload_dataloaders_every_n_epochs to a positive integer.

It’s recommended that all data downloads and preparation happen in prepare_data().

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.

simlearner3d.processing.dataset.hdf5

class simlearner3d.processing.dataset.hdf5.HDF5Dataset(hdf5_file_path: str, image_paths_by_split_dict: typing.Dict[typing.Union[typing.Literal['train'], typing.Literal['val'], typing.Literal['test']], typing.List[typing.Tuple[str]]], images_pre_transform: typing.Callable = <function read_images_and_create_full_data_obj>, tile_width: numbers.Number = 1024, tile_height: numbers.Number = 1024, patch_size: numbers.Number = 768, subtile_width: numbers.Number = 50, sign_disp_multiplier: numbers.Number = 1, masq_divider: numbers.Number = 1, subtile_overlap_train: numbers.Number = 0, train_transform: typing.Optional[typing.List[typing.Callable]] = None, eval_transform: typing.Optional[typing.List[typing.Callable]] = None)[source]

Single-file HDF5 dataset for collections of large image tiles.

property samples_hdf5_paths

Index all samples in the dataset, if not already done before.
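A construction sketch for reusing a precomputed HDF5 file; per the datamodule documentation above, passing image_paths_by_split_dict=None is assumed to skip HDF5 creation, and the file path is a hypothetical placeholder:

from simlearner3d.processing.dataset.hdf5 import HDF5Dataset

# Reuse an existing HDF5 file: no split dict, so no I/O to create it.
dataset = HDF5Dataset(
    hdf5_file_path="data/dataset.hdf5",
    image_paths_by_split_dict=None,
    tile_width=1024,
    tile_height=1024,
    patch_size=768,
)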

simlearner3d.processing.dataset.hdf5.create_hdf5(image_paths_by_split_dict: dict, hdf5_file_path: str, tile_width: numbers.Number = 1024, tile_height: numbers.Number = 1024, patch_size: numbers.Number = 768, subtile_width: numbers.Number = 50, subtile_overlap_train: numbers.Number = 0, images_pre_transform: typing.Callable = <function read_images_and_create_full_data_obj>)[source]

Create an HDF5 dataset file from left images, right images, disparities, and masks (masqs).

Parameters

  • image_paths_by_split_dict (IMAGE_PATHS_BY_SPLIT_DICT_TYPE): should look like

    image_paths_by_split_dict = {'train': [('dir/left.tif', 'dir/right.tif', 'dir/disp1.tif', 'dir/msq1.tif'), …], 'test': […]}

  • hdf5_file_path (str): path to the HDF5 dataset.

  • tile_width (Number, optional): width of an image tile. 1024 by default.

  • tile_height (Number, optional): height of an image tile. 1024 by default.

  • patch_size (Number, optional): considered subtile size for training. 768 by default.

  • subtile_width (Number, optional): effective width of a subtile (i.e. receptive field). 50 by default.

  • pre_filter: function to filter out specific subtiles. pre_filter_below_n_points by default.

  • subtile_overlap_train (Number, optional): overlap for data augmentation of the train set. 0 by default.

  • images_pre_transform (Callable): function to load images and ground truth and create one Data object.
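A call sketch following the dictionary layout documented above; the .tif paths are hypothetical, and each tuple is ordered (left, right, disparity, masq):

from simlearner3d.processing.dataset.hdf5 import create_hdf5

image_paths_by_split_dict = {
    "train": [("dir/left.tif", "dir/right.tif", "dir/disp1.tif", "dir/msq1.tif")],
    "val": [("dir/left2.tif", "dir/right2.tif", "dir/disp2.tif", "dir/msq2.tif")],
    "test": [],
}

create_hdf5(
    image_paths_by_split_dict=image_paths_by_split_dict,
    hdf5_file_path="data/dataset.hdf5",  # hypothetical output path
)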

simlearner3d.processing.dataset.toy_dataset

Generation of a toy dataset for testing purposes.

class simlearner3d.processing.dataset.toy_dataset.TASK_NAMES(value)[source]

An enumeration.

simlearner3d.processing.dataset.toy_dataset.make_toy_dataset_from_test_file()[source]

Prepare a toy dataset from a single, small LAS file.

The file is first duplicated to get 2 LAS files in each split (train/val/test), and then each file is split into .data files, resulting in a training-ready dataset located in td_prepared.

Parameters
  • src_las_path (str) – input, small LAS file to generate the toy dataset from

  • split_csv (str) – path to a CSV with a basename (e.g. '123_456.las') and a split

  • prepared_data_dir (str) – where to copy files (raw subfolder) and to prepare dataset files

Returns

path to directory containing prepared dataset.

Return type

str
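A usage sketch; per the signature above, the function takes no arguments and returns the prepared dataset directory:

from simlearner3d.processing.dataset.toy_dataset import make_toy_dataset_from_test_file

# Returns the path to the directory containing the prepared toy dataset.
prepared_data_dir = make_toy_dataset_from_test_file()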

simlearner3d.processing.dataset.utils

simlearner3d.processing.dataset.utils.find_file_in_dir(data_dir: str, basename: str) → str[source]

Query files matching a basename in data_dir and its subdirectories.

Parameters

data_dir (str) – data directory

basename (str) – basename of the file to search for

Returns

first file path matching the query.

Return type

str
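A usage sketch; the directory and basename are hypothetical placeholders:

from simlearner3d.processing.dataset.utils import find_file_in_dir

# Returns the first file path matching the query.
file_path = find_file_in_dir(data_dir="data/raw", basename="123_456.las")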

simlearner3d.processing.transforms.compose

class simlearner3d.processing.transforms.compose.CustomCompose(transforms: List[Callable])[source]

Composes several transforms together. Edited to bypass downstream transforms if None is returned by a transform.

Parameters

transforms (List[Callable]) – list of transforms to compose.
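A sketch of the None-bypass behaviour, assuming transforms are callables applied in order and that the composition is itself invoked as a callable on a data object:

from simlearner3d.processing.transforms.compose import CustomCompose

def drop_negative(data):
    # Hypothetical filter: returning None bypasses all downstream transforms.
    return None if data < 0 else data

def double(data):
    # Hypothetical downstream transform.
    return data * 2

transform = CustomCompose([drop_negative, double])
transform(2.0)   # -> 4.0
transform(-1.0)  # -> None; double is bypassed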

simlearner3d.processing.transforms.transforms

class simlearner3d.processing.transforms.transforms.StandardizeIntensity[source]

Standardize gray scale image values.

standardize_channel(channel_data: torch.Tensor, clamp_sigma: int = 3)[source]

Sample-wise standardization y* = (y - y_mean) / y_std, with clamping to ignore large values.
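A minimal sketch of sample-wise standardization with sigma clamping, as an illustration of the formula above rather than the library's exact implementation:

import torch

def standardize_channel_sketch(channel_data: torch.Tensor, clamp_sigma: int = 3) -> torch.Tensor:
    # y* = (y - y_mean) / y_std
    standardized = (channel_data - channel_data.mean()) / channel_data.std()
    # Clamp to [-clamp_sigma, clamp_sigma] so extreme values are ignored.
    return standardized.clamp(-clamp_sigma, clamp_sigma)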

class simlearner3d.processing.transforms.transforms.StandardizeIntensityCenterOnZero[source]

Standardize gray scale image values.

class simlearner3d.processing.transforms.transforms.ToTensor(input: numpy.ndarray)[source]

Turn np.ndarrays into Tensors.

simlearner3d.processing.transforms.augmentations