BrodenHandle

class hybrid_learning.datasets.custom.broden.BrodenHandle(labels, dataset_root, annotations=None, annotations_fp=None, prune_na=True, prune_na_rule='all', broden_split=None, max_num_samples=None, shuffle=False, **dataset_args)

Bases: BaseDataset

Handle to collect a sub-dataset of a dataset following Broden format.

Note

The original Broden dataset is not required to use this handle; a dataset only has to follow the format used by the Broden dataset. The following explains the format specifics relevant for the datasets that can be handled, using the original Broden dataset as reference. (No code from the original datasets was used.)

About the Original Broden Dataset

The Broden dataset is the broad and densely labeled dataset initially prepared for the paper Network Dissection. It is a combination of the following datasets:

ADE (scene, object, part)
Pascal-Context (object)
Pascal-Part (part)
OpenSurfaces (material)
DTD (texture)
and a generated color dataset, with 11 human selected colors

The original Broden data features both pixel-level semantic segmentation annotations (for categories see SEG_CATS), and image-level classification annotations (for categories see CLS_CATS).

The annotations attribute stores the raw annotation information as pandas.DataFrame as it is loaded from the index file (see INDEX_CSV_FILE) within the dataset_root. For the format of the annotations see annotations directly.

Note

To create sub-sets, one can also provide the annotations information on init.

Default Output Format

The getitem() method yields tuples of input image and a dictionary {label_name: annotation} containing the annotations for all specified labels. For the exact output format of the annotations have a look at the getitem() doc. By default, for classification, the annotation is bool, and for segmentation, it is a numpy.ndarray binary mask for the label. If the label information is missing for the selected item, None is returned instead. This output is transformed by transforms before yielding it as output of __getitem__().

Note

To collect a single custom label/merged annotations from the Broden dataset, refer to the custom_label() builder.
To modify the internal annotations table after init, use prune() or directly modify annotations.

Public Data Attributes:

`CAT_SEP`	Separator string if the category is specified for a label.
`LABEL_CSV_FILE`	Path to the file containing meta-information about the labels, relative to a dataset root.
`INDEX_CSV_FILE`	Path to the file containing the annotation information, relative to a dataset root.
`IMAGES_ROOT`	Root directory for annotated image files.
`SEG_CATS`	Categories that provide segmentation data.
`CLS_CATS`	Categories that provide classification data.

Inherited from BaseDataset

settings

Settings of the instance.

Public Methods:

`standard_prune`([max_num_samples, prune_na, ...])	Apply the specified standard pruning operations.
`cut_to`([max_num_samples, shuffle])	Reduce the number of samples to the first `max_num_samples`, and optionally shuffle.
`save_annotations_table`(annotations_fp)	Save the current `annotations` to a CSV file at given `annotations_fp`.
`parse_label`(label_spec, label_infos)	Given a label specifier, parse it to a `BrodenLabel` given `label_infos`.
`getitem`(i)	Provide tuple of input image and dictionary with annotations for all labels.
`load_anns`(i)	Load all annotation information for row `i`.
`descriptor`(i)	Return the relative image file path for item `i`.
`image_filepath`(i)	Get the path to the image file for row `i`.
`load_ann`(label[, i, raw_ann_row, ...])	Load the annotation information for `label` at row `i`.
`process_seg_mask`(label, rgb_masks)	Collect the binary segmentation mask for `label` from given relative file paths.
`prune`(condition[, by_target, show_progress_bar])	Prune all items that fulfill `condition` from this dataset.
`balance`(condition[, proportion, by_target, ...])	Restrict this dataset to a subset with an exact `proportion` fulfilling `condition`.
`shuffle`()	Shuffle the held annotations and return self.

Inherited from BaseDataset

`getitem`(i)	Provide tuple of input image and dictionary with annotations for all labels.
`descriptor`(i)	Return the relative image file path for item `i`.

Special Methods:

`__init__`(labels, dataset_root[, ...])	Init.
`__len__`()	Number of data points in the dataset; to be implemented in subclasses.

Inherited from BaseDataset

`__init__`(labels, dataset_root[, ...])	Init.
`__len__`()	Number of data points in the dataset; to be implemented in subclasses.
`__getitem__`(idx)	Get item from `idx` in dataset with transformations applied.
`__repr__`()	Nice printing function.

Inherited from Dataset

`__getitem__`(idx)	Get item from `idx` in dataset with transformations applied.
`__add__`(other)

__init__(labels, dataset_root, annotations=None, annotations_fp=None, prune_na=True, prune_na_rule='all', broden_split=None, max_num_samples=None, shuffle=False, **dataset_args)

Init.

For further arguments see the details in standard_prune().

Warning

Currently, no labels with duplicate names are allowed. Therefore, a label may only occur for one category.

Parameters

labels (Sequence[BrodenLabel]) – list of labels to collect for each sample.
dataset_root (str) – the path to the root directory holding the annotation files and the images/ directory with the images and segmentations
annotations (Optional[DataFrame]) – optional initializer for annotations, which is by default loaded from INDEX_CSV_FILE; use to create sub-sets
annotations_fp (Optional[str]) – optional path to the annotations file; by default, dataset_root/index.csv is assumed
dataset_args – arguments to BaseDataset.
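prune_na (bool) – see standard_prune()
prune_na_rule (str) – see standard_prune()
broden_split (Optional[str]) – see standard_prune()
max_num_samples (Optional[int]) – see standard_prune()
shuffle (bool) – see standard_prune()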

__len__(): Number of data points in the dataset; to be implemented in subclasses.

static _to_cat_info(cat_info_str)

Transform category info str of cat1(freq1);cat2(freq2);… to a dict.

Parameters: cat_info_str (str) –
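
The parsing step can be pictured with a minimal sketch. It is not the library's implementation: the helper name parse_cat_info is hypothetical, and it assumes that every frequency is a non-negative integer and that category names contain no parentheses.

```python
import re
from typing import Dict


def parse_cat_info(cat_info_str: str) -> Dict[str, int]:
    """Parse 'cat1(freq1);cat2(freq2)' into {'cat1': freq1, 'cat2': freq2}.

    Hypothetical stand-in for _to_cat_info(); assumes integer frequencies.
    """
    cat_info: Dict[str, int] = {}
    for entry in cat_info_str.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing or doubled semicolon
        match = re.fullmatch(r"(?P<cat>[^()]+)\((?P<freq>\d+)\)", entry)
        if match is None:
            raise ValueError(f"Malformed category entry: {entry!r}")
        cat_info[match.group("cat").strip()] = int(match.group("freq"))
    return cat_info


print(parse_cat_info("object(120);part(35)"))
```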

balance(condition, proportion=0.5, by_target=False, show_progress_bar=False)

Restrict this dataset to a subset with an exact proportion fulfilling condition. For this, annotations is modified accordingly. After splitting the dataset by condition, the split that is too large to fulfill proportion is reduced by random sub-sampling, which determines the final size of the dataset.

If there is only one class in the dataset, only shuffling is applied.

Parameters

condition (Callable[[Tuple[Any, Any]], bool]) – callable that accepts the output of __getitem__() and returns a bool stating whether this item belongs to the first split
proportion (float) – the aimed-for proportion of the first split on the final dataset
show_progress_bar (bool) – whether to show a progress bar while collecting the selector for condition
by_target (bool) – only load the target annotations of each item (the transforms are applied with dummy input) and apply condition to the target; asserts that transforms yields a tuple of (input, target); this is useful to avoid the costly loading of input images if they do not contribute to the transformations or the condition.

Returns

self

Return type

BrodenHandle
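
The sub-sampling idea described for balance() can be sketched on plain index lists. This is an illustration under stated assumptions, not the library's code: the function balanced_indices, its seed parameter, and the rounding of target sizes are all hypothetical, and the flags list stands in for the result of applying condition to every item.

```python
import random
from typing import List, Sequence


def balanced_indices(flags: Sequence[bool], proportion: float = 0.5,
                     seed: int = 0) -> List[int]:
    """Select indices such that about `proportion` of them have a True flag."""
    if not 0 < proportion < 1:
        raise ValueError("proportion must lie strictly between 0 and 1")
    rng = random.Random(seed)
    pos = [i for i, flag in enumerate(flags) if flag]
    neg = [i for i, flag in enumerate(flags) if not flag]
    if not pos or not neg:
        # Only one class present: nothing to balance, so only shuffle.
        indices = pos + neg
        rng.shuffle(indices)
        return indices
    # The scarcer split (relative to its target share) fixes the final size.
    total = min(len(pos) / proportion, len(neg) / (1 - proportion))
    n_pos = min(len(pos), round(total * proportion))
    n_neg = min(len(neg), round(total * (1 - proportion)))
    # Randomly sub-sample each split down to its target size.
    selected = rng.sample(pos, n_pos) + rng.sample(neg, n_neg)
    rng.shuffle(selected)
    return selected


flags = [True] * 10 + [False] * 30
selected = balanced_indices(flags, proportion=0.5)
print(len(selected), sum(flags[i] for i in selected))
```

With 10 positive and 30 negative items and a proportion of 0.5, the negative split is the one that gets reduced, so the sketch keeps 10 items of each kind.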

classmethod custom_label(dataset_root, label, prune_empty=True, balance_pos_to=None, verbose=True, **init_args)

Return a BrodenHandle instance with output restricted to single label.

The transformations in transforms will be chosen such that __getitem__() outputs a tuple of (input_image, annotation) where

input_image is encoded as torch.Tensor
annotation is a torch.Tensor holding either the binary mask for the specified label or the bool classification value.

The label may either be a label as would be specified in __init__ or a formula specification string representing a formula of Boolean Merge operations that can be parsed using a BooleanLogic.

Parameters

dataset_root (str) – the dataset_root parameter for init of the BrodenHandle
label (str) – the label to restrict to; may be a valid string label name, a valid BrodenLabel, or a valid string representation of a Merge operation whose all_in_keys are all valid string label names
init_args – further init arguments to the BrodenHandle
balance_pos_to (Optional[float]) – if a value given, balance the resulting BrodenHandle instance such that the proportion of the True entries is this value; only use for classification examples
prune_empty (Union[bool, str]) – whether to prune empty entries (None values and empty masks) using prune()
verbose (bool) – show progress bars

Returns

BrodenHandle instance for dataset_root with transforms and labels selected such that the output of getitem() is transformed to the format specified above

cut_to(max_num_samples=None, shuffle=False)

Reduce the number of samples to the first max_num_samples, and optionally shuffle.

Parameters

max_num_samples (Optional[int]) –
shuffle (bool) –

Return type

BrodenHandle

descriptor(i)

Return the relative image file path for item i. This is unique within a Broden dataset and can be used as an ID e.g. for caching.

Parameters: i (int) –
Return type: str

getitem(i)

Provide tuple of input image and dictionary with annotations for all labels. (See labels). Used for __getitem__().

The output format is a tuple of (input_image, {label_name: annotation}). The input image is an RGB image as Image. The annotations dictionary holds one entry for each label from labels, and the annotation for a label is
- for classification: a bool value
- for segmentation: a binary mask as numpy.ndarray
In case the label is not available, its value in the annotations dict is None.

Returns: tuple of input image and annotations dict
Parameters: i (int) –
Return type: Tuple[Image, Dict[str, Union[bool, ndarray]]]

image_filepath(i)

Get the path to the image file for row i. Information is retrieved from annotations.

Parameters: i (int) –
Return type: str

static label_info_for(label_name, label_infos)

Obtain information for label given by name from label information. A label may have samples in different categories.

The output features the following information (compare Broden README):

Number

the label ID (used for annotation in the segmentation masks)

Name

the trivial unique name

Category

the categories the labels have samples in, specified as semi-colon separated list of entries in {'color', 'object', 'material', 'part', 'scene', 'texture'}, each entry followed by the total amount of samples for the label for that category; use _to_cat_info() to process those

Frequency

total number of images having that label over all categories

Coverage

the mean(?) pixels per image

Syns

synonyms

Parameters

label_name (str) – the name of the label
label_infos (DataFrame) – the meta-information on all Broden labels as can by default be loaded from LABEL_CSV_FILE.

Returns

pandas.Series with above fields filled

Raises

ValueError if the label is not unique or cannot be found

Return type

Series
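
The lookup and its error cases can be sketched with pandas. This is only an illustration: the helper find_label_row is hypothetical, the lowercase column names 'number', 'name' and 'category' are assumptions about the label table, and the table rows are made-up examples.

```python
import pandas as pd


def find_label_row(label_name: str, label_infos: pd.DataFrame) -> pd.Series:
    """Return the single row of `label_infos` whose 'name' equals `label_name`."""
    matches = label_infos[label_infos["name"] == label_name]
    if len(matches) == 0:
        raise ValueError(f"Label {label_name!r} not found")
    if len(matches) > 1:
        raise ValueError(f"Label {label_name!r} is not unique")
    return matches.iloc[0]


# Made-up label table in the assumed column layout.
label_infos = pd.DataFrame({
    "number": [1, 2],
    "name": ["wall", "sky"],
    "category": ["object(100);material(20)", "object(80)"],
})
row = find_label_row("sky", label_infos)
print(row["number"], row["category"])
```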

load_ann(label, i=None, raw_ann_row=None, loaded_rgb_masks=None)

Load the annotation information for label at row i. Information is retrieved from annotations. If the annotation information is missing for the given label category, return None.

Note

If loaded_rgb_masks is given, this function has the side effect of updating this dict with newly loaded masks! This is used to speed up loading of several labels from the same mask.

Parameters

label (BrodenLabel) – the label to restrict the annotation to
i (Optional[int]) – the index of the row in annotations that holds the information for the single annotation of interest
raw_ann_row (Optional[Series]) – optionally directly hand over the row of interest instead of providing its index (see i)
loaded_rgb_masks (Optional[Dict[str, List[Image]]]) – RGB segmentation masks loaded so far (for speed-up); gets updated with any newly loaded masks

Returns

One of

None if category information is missing,
the binary segmentation mask for the label in case of a segmentation category,
the boolean truth value whether the label holds for the image in case of a classification category

Return type

Optional[Union[bool, ndarray]]

classmethod load_annotations_table(annotations_fp)

Load the annotation information from the file at annotations_fp. For simplicity of parsing, all category columns and the "image" column are parsed as strings.

Parameters: annotations_fp (str) – the path to the annotations .csv file
Returns: annotations table with correct types of the category columns
Return type: DataFrame

load_anns(i)

Load all annotation information for row i. Information is retrieved from annotations. For details on the output format see load_ann().

Parameters: i (int) –
Return type: Dict[str, Union[bool, ndarray]]

parse_label(label_spec, label_infos)

Given a label specifier, parse it to a BrodenLabel given label_infos.

Parameters

label_spec (Union[str, BrodenLabel]) – the label specifier to turn into a BrodenLabel
label_infos (DataFrame) – the meta-information about all Broden labels; contains the information about available labels

Returns

the BrodenLabel instance with information of the label_spec

Return type

BrodenLabel

process_seg_mask(label, rgb_masks)

Collect the binary segmentation mask for label from given relative file paths. Pixels belonging to the given label are 1, others 0.

Parameters

label (BrodenLabel) – the label to look for (number needed)
rgb_masks (List[Image]) – a list of RGB masks with label information encoded in red and green channel; for details on encoding see to_seg_mask()

Returns

binary segmentation mask for label merged from the segmentation masks at given file paths

Raises

ValueError for invalid label category

Return type

ndarray

prune(condition, by_target=False, show_progress_bar=False)

Prune all items that fulfill condition from this dataset. For this, annotations is modified accordingly.

Parameters

condition (Callable[[Tuple[Any, Any]], bool]) – callable that accepts the output of __getitem__() and returns a bool stating whether this item is to be pruned
show_progress_bar (bool) – whether to show a progress bar while collecting the selector for condition
by_target (bool) – only load the target annotations of each item (the transforms are applied with dummy input) and apply condition to the target; asserts that transforms yields a tuple of (input, target); this is useful to avoid the costly loading of input images if they do not contribute to the transformations or the condition.

Returns

this instance (with modified annotations)

Return type

BrodenHandle
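
A condition callable for prune() might look like the following sketch. It assumes by_target=False and items of the form (input, target) with a single target, as produced by custom_label(); the function name is_empty and the stand-in items are hypothetical. The list comprehension only mimics the effect of pruning; it does not call the library.

```python
from typing import Any, Tuple

import numpy as np


def is_empty(item: Tuple[Any, Any]) -> bool:
    """Return True for items whose target is None or an all-zero mask."""
    _, target = item
    if target is None:
        return True
    arr = np.asarray(target)
    if arr.ndim == 0:
        # Scalar classification values (True/False) are never treated as empty.
        return False
    return not arr.any()


# Stand-in items: (descriptor, target) instead of real (image, target) tuples.
items = [
    ("img0", None),                # missing annotation
    ("img1", np.zeros((2, 2))),    # empty segmentation mask
    ("img2", np.eye(2)),           # non-empty segmentation mask
    ("img3", False),               # negative classification value
]
kept = [name for name, target in items if not is_empty((name, target))]
print(kept)
```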

save_annotations_table(annotations_fp)

Save the current annotations to a CSV file at given annotations_fp.

Parameters: annotations_fp (str) –

shuffle()

Shuffle the held annotations and return self.

Return type: BrodenHandle

standard_prune(max_num_samples=None, prune_na=True, prune_na_rule='all', broden_split=None, shuffle=False)

Apply the specified standard pruning operations. Pruning is applied to the annotations table.

Parameters

prune_na (bool) – whether to prune all entries (rows) from the annotations table in which 'all' or 'any' of the covered label categories are NaN (see also prune_na_rule)
prune_na_rule (str) –
if prune_na is True, rule by which to select candidates for pruning:
- 'all': all categories occurring in the specified labels must be NaN
- 'any': any must be NaN
broden_split (Optional[str]) – the original dataset had a fixed split into training ('train') and validation ('val') data; choose the corresponding original split (see also annotations, where the split meta-information is stored)
max_num_samples (Optional[int]) – the maximum number of samples to select; if set to None, no restriction is applied
shuffle (bool) – whether to shuffle the dataset (before restricting to max_num_samples)

Returns

self

Return type

BrodenHandle
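
The difference between the 'all' and 'any' rules can be illustrated on a small table. This is a sketch, not the library's implementation: the helper prune_na_rows, the example rows, and the file names are hypothetical.

```python
from typing import Sequence

import numpy as np
import pandas as pd


def prune_na_rows(annotations: pd.DataFrame, cats: Sequence[str],
                  rule: str = "all") -> pd.DataFrame:
    """Drop rows in which all (or any) of the columns in `cats` are NaN."""
    is_na = annotations[list(cats)].isna()
    if rule == "all":
        to_prune = is_na.all(axis=1)
    elif rule == "any":
        to_prune = is_na.any(axis=1)
    else:
        raise ValueError(f"Unknown rule: {rule!r}")
    return annotations[~to_prune]


# Made-up annotations table with two category columns.
annotations = pd.DataFrame({
    "image": ["a.jpg", "b.jpg", "c.jpg"],
    "object": ["a_object.png", np.nan, np.nan],
    "scene": ["12", "7", np.nan],
})
pruned_all = prune_na_rows(annotations, ["object", "scene"], rule="all")
pruned_any = prune_na_rows(annotations, ["object", "scene"], rule="any")
print(list(pruned_all["image"]), list(pruned_any["image"]))
```

Under 'all', only the row with no information in any requested category is dropped; under 'any', every row with at least one missing category is dropped.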

static to_seg_mask(seg, label_num)

Given a Broden RGB segmentation, reduce it to a binary mask for label_num.

Broden segmentations are saved as RGB images, where the label number of a pixel is (256 * green + red), with red the red channel value of the pixel and green its green channel value. A label number of 0 means background.

The label number is the 'number' field from label_info_for() or, equivalently, the BrodenLabel.number attribute. One can specify either a single label number as int or an iterable of label numbers.

Parameters

seg (Image) – the original RGB segmentation mask encoded as described above
label_num (int) – the label number to restrict the mask to

Returns

union of binary segmentation masks for given label numbers

Return type

ndarray
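
The encoding described above can be decoded with a few NumPy operations. The following is a minimal sketch rather than the library's code: the helper names decode_label_numbers and binary_mask are hypothetical, and the segmentation is assumed to be already available as an (H, W, 3) uint8 array (for instance obtained from an RGB image via numpy.asarray).

```python
import numpy as np


def decode_label_numbers(rgb: np.ndarray) -> np.ndarray:
    """Turn an (H, W, 3) RGB segmentation into an (H, W) array of label numbers."""
    rgb = np.asarray(rgb)
    # Cast first so that 256 * green cannot overflow the uint8 range.
    red = rgb[..., 0].astype(np.int64)
    green = rgb[..., 1].astype(np.int64)
    return 256 * green + red


def binary_mask(rgb: np.ndarray, label_nums) -> np.ndarray:
    """Boolean mask that is True where a pixel has one of `label_nums`."""
    wanted = np.atleast_1d(label_nums)  # accept a single int or an iterable
    return np.isin(decode_label_numbers(rgb), wanted)


# 2x2 example: label numbers 5, 300 (= 256 * 1 + 44), 0 (background), 5.
rgb = np.array(
    [[[5, 0, 0], [44, 1, 0]],
     [[0, 0, 0], [5, 0, 0]]],
    dtype=np.uint8,
)
print(decode_label_numbers(rgb))
print(binary_mask(rgb, 5))
print(binary_mask(rgb, [5, 300]))
```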

CAT_SEP = '>>': Separator string if the category is specified for a label. Then the format is "{label}{sep}{category}".
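
Splitting such a specifier can be sketched as follows. The helper split_label_spec and the label name 'leg' are hypothetical and only illustrate the "{label}{sep}{category}" format; the actual parsing in the library is done by parse_label() and may differ.

```python
from typing import Optional, Tuple

CAT_SEP = ">>"  # same value as the class attribute documented above


def split_label_spec(spec: str, sep: str = CAT_SEP) -> Tuple[str, Optional[str]]:
    """Split '{label}{sep}{category}' into (label, category or None)."""
    name, found, category = spec.partition(sep)
    return name, (category if found else None)


print(split_label_spec("leg>>part"))
print(split_label_spec("leg"))
```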

CLS_CATS = ('scene', 'texture'): Categories that provide classification data.

IMAGES_ROOT: str = 'images': Root directory for annotated image files. Relative to the dataset_root. Annotations can be found in INDEX_CSV_FILE.

INDEX_CSV_FILE: str = 'index.csv': Path to the file containing the annotation information, relative to a dataset root. For the encoding see the documentation of this class.

LABEL_CSV_FILE: str = 'label.csv': Path to the file containing meta-information about the labels, relative to a dataset root. For details on the encoding see label_info_for().

SEG_CATS = ('object', 'part', 'color', 'material'): Categories that provide segmentation data.

__parameters__ = ()

annotations: pd.DataFrame

The actual annotation (meta-)information. The columns used here are described below.

Preliminary Remarks

All file-paths are relative to dataset_root/images.
Several files or class labels may be given, separated by semi-colon.
A mask for a category is an RGB-image encoding segmentation masks for all different labels of that category. For the encoding see process_seg_mask().
An annotation may have labels in different categories (i.e. entries in these category columns). If annotation information for a category is missing, this column is None.

The Columns

The following columns are used here:

image: The file-path to the original image file of this annotation
split: The dataset split for which this annotation was used (train or val)
category columns:
- color: color mask file-path
- object: object mask file-path (semantic object segmentation)
- part: part mask file-path (same as object masks, only parts belong to a super-object)
- material: material mask file-path
- scene: label number of the depicted scene
- texture: texture label numbers
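
How a single cell of such a category column might be interpreted is sketched below. The helpers split_entries and has_label are hypothetical, and the sketch assumes that classification columns hold semi-colon separated label numbers as strings and that a missing cell is None.

```python
from typing import List, Optional


def split_entries(cell: Optional[str]) -> List[str]:
    """Split a semi-colon separated cell into its entries."""
    if cell is None:
        return []
    return [entry for entry in cell.split(";") if entry]


def has_label(cell: Optional[str], label_num: int) -> Optional[bool]:
    """Classification value for `label_num`, or None if the category is missing."""
    if cell is None:
        return None
    return str(label_num) in split_entries(cell)


print(split_entries("a_part_1.png;a_part_2.png"))
print(has_label("12;45", 45), has_label("12;45", 4), has_label(None, 4))
```

Comparing whole entries rather than substrings matters here: label number 4 must not match the entry "45".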

labels: List[BrodenLabel]: The labels to load the values for in each line of the Broden annotations.