
class hybrid_learning.datasets.custom.coco.base.COCODataset(dataset_root=None, annotations_fp=None, split=None, img_size=(400, 400), device=None, **kwargs)[source]

Bases: ABC, BaseDataset

Attributes and functions common to keypoint-based datasets derived from MS COCO.

The handle encompasses functionality for:

Item retrieval

hybrid_learning.datasets.base.BaseDataset.__getitem__() must be implemented in subclasses; available items:


Use subset() for subsetting according to given conditions. To create/store annotations reflecting this subset, use to_raw_anns().

Public Data Attributes:


COMMERCIAL_LICENSE_IDS

IDs of COCO image licenses that allow for commercial use.

DATASET_ROOT_TEMPL

Default root directory template for image files that accepts the split ('train' or 'val').

ANNOTATION_FP_TEMPL

Default template for the annotation file path that accepts the split ('train' or 'val') and the root directory.

DEFAULT_IMG_SIZE

Default target size of images to use for the default transforms as (height, width).

settings

Return information to init new dataset.

license_mapping

The mapping of image IDs to license descriptions and URLs.

Inherited from BaseDataset


Public Methods:

subset(*[, license_ids, body_parts, num, ...])

Restrict the items by the given selection criteria and an optional custom condition.


Wrapper around subset() that only shuffles the instance.

to_raw_anns([description, save_as])

Create the content of a new valid annotations file restricted to the current image IDs.

copy_to([root_root, description, overwrite, ...])

Create a new dataset by copying used images and annotations to new root folder.


Load unmodified image by index in dataset.


Return the image file name for the item at index i.


Path to image file at index i.


Get attribution information for image at index i.


Load the dict with meta information for image at index i.


Return the list of raw annotations for image at index i.


getitem(item)

Item selection differs depending on the desired annotation data.

Inherited from BaseDataset


Special Methods:

__init__([dataset_root, annotations_fp, ...])



Inherited from BaseDataset

Nice printing function.

Inherited from Dataset


__init__(dataset_root=None, annotations_fp=None, split=None, img_size=(400, 400), device=None, **kwargs)[source]



Length is given by the length of the index mapping.

copy_to(root_root=None, description=None, overwrite=False, dataset_root=None)[source]

Create a new dataset by copying used images and annotations to new root folder.

The following files and folders will be created:

  • new_root_root/annotations/: annotations root folder

  • new_root_root/annotations/<anns_file>: An annotations file of the same basename as annotations_fp is created and stored in the annotations folder (see to_raw_anns()).

  • new_root_root/images/<img_root>/: An images root is created of the same basename as hybrid_learning.datasets.base.BaseDataset.dataset_root.

  • new_root_root/images/<img_root>/<img_file>: Each image file used in this dataset is copied to the new images root keeping the file basename.

Parameters

  • root_root (Optional[str]) – root directory under which to create the annotations root and new dataset_root

  • description (Optional[str]) – description used in the annotations info; see to_raw_anns()

  • overwrite (bool) – do not raise if a file or folder already exists

  • dataset_root (Optional[str]) – if root_root is not given, it is assumed to be dataset_root/../...


Returns

a dict with the dataset_root and annotations_fp settings to init the new dataset

Return type

Dict[str, str]
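The folder layout listed above can be previewed with plain path arithmetic, without copying anything. This is a minimal sketch only: planned_layout is a hypothetical helper that is not part of the library, the example paths are made up, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def planned_layout(root_root: str, dataset_root: str, annotations_fp: str,
                   img_files: List[str]) -> Dict[str, object]:
    """Compute the paths that a copy with the documented layout would create.

    Hypothetical helper for illustration; it does not touch the file system.
    """
    root = PurePosixPath(root_root)
    anns_root = root / "annotations"
    # The new images root keeps the basename of the old dataset_root.
    img_root = root / "images" / PurePosixPath(dataset_root).name
    return {
        # The annotations file keeps the basename of the old annotations_fp.
        "annotations_fp": str(anns_root / PurePosixPath(annotations_fp).name),
        "dataset_root": str(img_root),
        # Each image keeps its file basename.
        "images": [str(img_root / PurePosixPath(f).name) for f in img_files],
    }


layout = planned_layout(
    root_root="/tmp/coco_subset",
    dataset_root="/data/coco/images/val2017",
    annotations_fp="/data/coco/annotations/person_keypoints_val2017.json",
    img_files=["/data/coco/images/val2017/000000000139.jpg"],
)
print(layout["annotations_fp"])
print(layout["dataset_root"])
```

The two string entries correspond to the dataset_root and annotations_fp settings that the returned dict is documented to contain.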


Return the image file name for the item at index i. This is unique within the COCO dataset and may serve as ID e.g. for caching.


i (int) –

Return type


abstract classmethod get_default_transforms(img_size, device=None)[source]

Create the default transformation for this dataset depending on img_size.

Return type


abstract getitem(item)[source]

Item selection differs depending on the desired annotation data. Used for __getitem__().


item (int) –


Get attribution information for image at index i. This encompasses:


the image file path


the static flickr URL (image source)


link to the image flickr page featuring author information


the license name


link to the license


the COCO image ID


the COCO license ID


i (int) –

Return type

Dict[str, Union[int, str]]

The license information is taken from license_mapping.


Path to image file at index i.


i (int) –

Return type



Load the dict with meta information for image at index i.


i (int) –

Return type

Dict[str, Any]

classmethod img_id_iterator(coco, img_ann_ids=None, license_ids=(4, 5, 6, 7, 8), num=None, shuffle=False, condition=None, show_progress_bar=True)[source]

Generator that iterates over image IDs of the COCO dataset that fulfill given selection criteria. For details see subset().

Parameters

  • coco (Optional[COCO]) – coco handler to get img_ann_ids

  • img_ann_ids (Optional[Sequence[Tuple[int, Sequence[int]]]]) – image and annotation IDs to select from; format should be as for img_ann_ids; defaults to IDs of all images and annotations given by coco

  • license_ids (Optional[Iterable[int]]) – IDs of accepted licenses; if set to None, all licenses are accepted

  • num (Optional[int]) – number of images to produce (take first num ones)

  • shuffle (bool) – whether to shuffle the IDs (before applying effect of num)

  • condition (Optional[Callable[[Dict[str, Any], Dict[str, Any]], bool]]) – callable that accepts image meta data and annotation metadata, and returns a bool stating whether to skip the annotation instance or not

  • show_progress_bar (bool) – whether to show a progress bar while iterating over all image IDs

Return type

Generator[Tuple[int, List[int]], None, None]
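The interplay of license_ids, shuffle and num can be illustrated in plain Python. The following is a sketch under stated assumptions, not the library's implementation: iter_selected, the img_license lookup table and the seed argument are hypothetical, and the custom condition is left out.

```python
import random
from typing import Dict, Iterable, Iterator, List, Optional, Sequence, Tuple

ImgAnnIds = Tuple[int, List[int]]  # (image_id, [annotation_id, ...])


def iter_selected(img_ann_ids: Sequence[ImgAnnIds],
                  img_license: Dict[int, int],
                  license_ids: Optional[Iterable[int]] = (4, 5, 6, 7, 8),
                  num: Optional[int] = None,
                  shuffle: bool = False,
                  seed: Optional[int] = None) -> Iterator[ImgAnnIds]:
    """Yield (image_id, annotation_ids) tuples that pass the license filter.

    img_license maps image ID to license ID (an assumption of this sketch).
    """
    candidates = list(img_ann_ids)
    if shuffle:
        # Shuffle first, so that num picks from the shuffled order.
        random.Random(seed).shuffle(candidates)
    accepted = None if license_ids is None else set(license_ids)
    produced = 0
    for img_id, ann_ids in candidates:
        if num is not None and produced >= num:
            return
        if accepted is not None and img_license[img_id] not in accepted:
            continue  # license not accepted
        yield img_id, list(ann_ids)
        produced += 1


ids = [(1, [10]), (2, [20, 21]), (3, [30])]
licenses = {1: 4, 2: 1, 3: 5}
print(list(iter_selected(ids, licenses)))          # images 1 and 3 pass
print(list(iter_selected(ids, licenses, num=1)))   # stops after the first hit
```

Writing the selection as a generator means that checking stops as soon as num matching images have been produced.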


Load unmodified image by index in dataset.


i (int) –

Return type



Return the list of raw annotations for image at index i.


i (int) –

Return type

List[Dict[str, Any]]


Wrapper around subset() that only shuffles the instance.



Return type


subset(*, license_ids=(4, 5, 6, 7, 8), body_parts=None, num=None, shuffle=False, condition=None, show_progress_bar=True)[source]

Restrict the items by the given selection criteria and an optional custom condition. Operation changes img_ann_ids. Selection criteria are:

  • Len: Maximum total number of images (and whether to shuffle before selecting the first X IDs)

  • License: IDs of licenses one of which the image must have

  • Contained body parts: body parts (collections of keypoint names), one of which must be fully contained in the image; e.g. [["left_eye"], ["right_eye"]] means either left_eye or right_eye must be visible, while [["left_eye", "right_eye"]] means both must be.

  • Any custom condition specified via condition.

Parameters

  • license_ids (Optional[Iterable[int]]) – IDs of accepted licenses; if set to None, all licenses are accepted

  • body_parts (Optional[Union[Sequence[str], Sequence[Sequence[str]]]]) – sequence of body parts any of which must be visible in the image; a body part is a sequence of string keypoint names; if just one body part is given, this may be provided as list of strings, e.g. ["left_eye", "right_eye"]

  • num (Optional[int]) – number of images to produce (take first num ones)

  • shuffle (bool) – whether to shuffle the IDs (before applying num)

  • condition (Optional[Callable[[Dict[str, Any], Dict[str, Any]], bool]]) – callable that accepts image meta data and annotation metadata, and returns a bool stating whether to skip the annotation instance or not

  • show_progress_bar (bool) – whether to show the progress of image checking
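The body_parts semantics (any one body part must be fully visible, and a flat list of strings counts as a single body part) can be sketched as follows. Both helper names are hypothetical, and the set of visible keypoint names is assumed to be already extracted from the annotations.

```python
from typing import List, Sequence, Set, Union

BodyParts = Union[Sequence[str], Sequence[Sequence[str]]]


def normalize_body_parts(body_parts: BodyParts) -> List[List[str]]:
    """Wrap a single body part given as a flat list of keypoint names."""
    if all(isinstance(kpt, str) for kpt in body_parts):
        return [list(body_parts)]
    return [list(part) for part in body_parts]


def any_body_part_visible(visible: Set[str], body_parts: BodyParts) -> bool:
    """True if at least one body part has all of its keypoints visible."""
    return any(set(part) <= set(visible)
               for part in normalize_body_parts(body_parts))


only_left = {"left_eye"}
print(any_body_part_visible(only_left, [["left_eye"], ["right_eye"]]))  # either eye
print(any_body_part_visible(only_left, [["left_eye", "right_eye"]]))    # both eyes
```

With only left_eye visible, the first call accepts the image and the second rejects it, matching the two examples given in the selection criteria.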

Return type


to_raw_anns(description=None, save_as=None)[source]

Create the content of a new valid annotations file restricted to the current image IDs. Optionally also save to save_as. The content is based on the contents of annotations_fp.

If the restricted content is stored into a JSON annotations file, it can be used to init further instances that are already restricted to the current images. Useful in combination with subset().

Parameters

  • description (Optional[str]) – change the 'description' under 'info' to description; no change if set to None

  • save_as (Optional[str]) – if not None, dump the new annotations to a file located there (will overwrite)


Returns

dict in the format of a COCO annotations file with images and annotations restricted to those used in this dataset instance
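Restricting annotations content to a set of image IDs can be sketched on a plain dict in COCO annotations format (top-level keys such as 'info', 'images' and 'annotations'). restrict_anns is a hypothetical stand-in written for illustration, not the library's code, and the sample data is made up.

```python
import json
from typing import Any, Dict, Iterable, Optional


def restrict_anns(anns: Dict[str, Any], img_ids: Iterable[int],
                  description: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of anns with images/annotations limited to img_ids."""
    keep = set(img_ids)
    # Shallow copy: other top-level entries (e.g. 'licenses') carry over.
    restricted = dict(anns)
    restricted["images"] = [img for img in anns["images"]
                            if img["id"] in keep]
    restricted["annotations"] = [ann for ann in anns["annotations"]
                                 if ann["image_id"] in keep]
    if description is not None:
        # Replace only the description; keep the remaining info fields.
        restricted["info"] = {**anns.get("info", {}),
                              "description": description}
    return restricted


anns = {
    "info": {"description": "COCO 2017", "year": 2017},
    "licenses": [{"id": 4}],
    "images": [{"id": 1, "license": 4}, {"id": 2, "license": 4}],
    "annotations": [{"id": 10, "image_id": 1}, {"id": 20, "image_id": 2}],
}
subset_anns = restrict_anns(anns, img_ids=[1], description="demo subset")
print(json.dumps(subset_anns, indent=2))
```

Dumping the result with json.dump to a file would give an annotations file in the same format, which is what the save_as option is described as doing.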

ANNOTATION_FP_TEMPL: str = '{root}\\annotations\\person_keypoints_{split}2017.json'

Default template for the annotation file path that accepts the split ('train' or 'val') and the root directory.

COMMERCIAL_LICENSE_IDS: Tuple[int] = (4, 5, 6, 7, 8)

IDs of COCO image licenses that allow for commercial use.

DATASET_ROOT_TEMPL: str = '..\\dataset\\coco\\images\\{split}2017'

Default root directory template for image files that accepts the split ('train' or 'val').
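Both templates are ordinary format strings with the placeholders {split} and, for the annotations path, {root}. A short sketch of filling them in; the template values are copied as rendered on this page (including the backslash separators), while the root value passed in is an assumption chosen for illustration.

```python
# Template values as shown in this reference.
ANNOTATION_FP_TEMPL = '{root}\\annotations\\person_keypoints_{split}2017.json'
DATASET_ROOT_TEMPL = '..\\dataset\\coco\\images\\{split}2017'

# Fill in the split ('train' or 'val'); the root below is an assumed example.
dataset_root = DATASET_ROOT_TEMPL.format(split='val')
annotations_fp = ANNOTATION_FP_TEMPL.format(root='..\\dataset\\coco',
                                            split='val')

print(dataset_root)
print(annotations_fp)
```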

DEFAULT_IMG_SIZE: Tuple[int, int] = (400, 400)

Default target size of images to use for the default transforms as (height, width).

annotations_fp

File path to the COCO annotations JSON file with image and keypoint annotations.

coco: COCO

Internal COCO handle.

img_ann_ids: List[Tuple[int, List[int]]]

Mapping of indices in this dataset to COCO image and annotation IDs. Each entry in the list is a tuple of the form (image_id, [annotation_id, ...]) where the annotations belong to the corresponding image.
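How such an index mapping can drive the length and item access of a dataset with an abstract getitem() is sketched below. The classes IndexedDataset and AnnotationCount are hypothetical and only mirror the documented pattern; the real base class may do more in __getitem__(), for example apply transformations.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Sequence, Tuple


class IndexedDataset(ABC):
    """Toy base class: items are addressed via an index mapping."""

    def __init__(self, img_ann_ids: Sequence[Tuple[int, List[int]]]):
        # Each entry: (image_id, [annotation_id, ...])
        self.img_ann_ids = list(img_ann_ids)

    def __len__(self) -> int:
        # Length is given by the length of the index mapping.
        return len(self.img_ann_ids)

    def __getitem__(self, i: int) -> Any:
        # Delegate item selection to the subclass.
        return self.getitem(i)

    @abstractmethod
    def getitem(self, i: int) -> Any:
        """Select the item for dataset index i."""


class AnnotationCount(IndexedDataset):
    """Toy subclass returning the image ID and its annotation count."""

    def getitem(self, i: int) -> Tuple[int, int]:
        img_id, ann_ids = self.img_ann_ids[i]
        return img_id, len(ann_ids)


ds = AnnotationCount([(139, [1, 2]), (285, [3])])
print(len(ds), ds[0])
```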

property license_mapping: Dict[int, Dict[str, Any]]

The mapping of image IDs to license descriptions and URLs. This is extracted from the annotations file loaded by coco.


dict {ID: license_info}

property settings: Dict[str, Any]

Return information to init new dataset.