BaseDataset
- class hybrid_learning.datasets.base.BaseDataset(split=None, dataset_root=None, transforms=None, transforms_cache=None, after_cache_transforms=None, device='cpu')[source]
Bases:
DatasetAbstract base class for tuple datasets with storage location.
Derived datasets should yield tuples of
(input, target). The transformationtransformsis applied to data tuples before return from__getitem__()can be controlled. The default fortransformsis given by The default fortransformsis given by_default_transforms. Override in sub-classes if necessary. The default combination of collected dataset tuples and_default_transformsshould yield a tuple oftorch.Tensoror dicts thereof.The
hybrid_learning.datasets.base.BaseDataset.dataset_rootis assumed to provide information about the storage location. Best, all components (input data, annotations, etc.) should be stored relative to this root location.The transformed tuple values are cached by
transforms_cacheif it is given. Then values are only collected and transformed if they cannot be loaded from the cache. To get the cache descriptor for an entry the, thedescriptor()method is consulted. Make sure to override this appropriately (e.g. by image ID or image file name).Note
In case a
CacheTupleis used, make sure thatNoneis returned if any tuple value isNone.Public Data Attributes:
Settings of the instance.
Public Methods:
getitem(idx)Get data item tuple from
idxin this dataset.descriptor(i)Return a unique descriptor for the item at position
i.Special Methods:
__init__([split, dataset_root, transforms, ...])Init.
__len__()Number of data points in the dataset; to be implemented in subclasses.
__getitem__(idx)Get item from
idxin dataset with transformations applied.__repr__()Nice printing function.
Inherited from : py: class:Dataset
__getitem__(idx)Get item from
idxin dataset with transformations applied.__add__(other)
- __getitem__(idx)[source]
Get item from
idxin dataset with transformations applied. Transformations must be stored as single tuple transformation intransforms.- Returns
tuple output of
getitem()transformed bytransforms- Parameters
idx (int) –
- __init__(split=None, dataset_root=None, transforms=None, transforms_cache=None, after_cache_transforms=None, device='cpu')[source]
Init.
- Parameters
split (Optional[DatasetSplit]) – The split of the dataset (e.g.
DatasetSplit.TRAIN,DatasetSplit.VAL,DatasetSplit.TEST).dataset_root (Optional[str]) – The location where to store the dataset.
transforms (Optional[Callable]) – The transformations to be applied to the data when loaded; defaults to
_default_transformstransforms_cache (Optional[Cache]) – optional cache instance for caching transformed tuples; must return
Nonein case one of the tuple values has not been cached yet; seetransforms_cacheafter_cache_transforms (Optional[Callable]) – transformations applied after consulting the cache (no matter, whether the tuples was retrieved from cache or not); by default, tensor gradients are disabled and tensors are moved to a common device
device (Optional[Union[str, device]]) – device to use in the default
after_cache_transforms
- abstract descriptor(i)[source]
Return a unique descriptor for the item at position
i. This can e.g. be an image ID or the image file name. It is used for caching.
- abstract getitem(idx)[source]
Get data item tuple from
idxin this dataset.- Parameters
idx (int) – index to retrieve data point from
- Returns
tuple
(input, label)withinputone of: image (asPIL.Image.Image), Radar/Lidar point cloudlabelone of:None
class label (as
torch.Tensororbool),semantic segmentation map (as
PIL.Image.Imageortorch.Tensorcompatible with torchvision transforms),bounding box
string-indexed dict of combinations
- Return type
Tuple[Union[Tensor, Image], Union[Tensor, Image, Dict[Tensor, Image]]]
- __parameters__ = ()
- after_cache_transforms: Callable
Transformation function applied after consulting the cache (no matter, whether the tuples was retrieved from cache or not). Use these transformations instead of
transformsto ensure the transformation is always applied, regardless of caching. By default, tensor gradients are disabled and tensors are moved to a common device (see_get_default_after_cache_trafo()).
- dataset_root: str
Assuming the dataset is saved in some storage location, a root from which to navigate to the dataset information.
- property settings: Dict[str, Any]
Settings of the instance.
transformsinfo is skipped if set to default.
- split: Optional[DatasetSplit]
Optional specification what use-case this dataset is meant to represent, e.g. training, validation, or testing.
- transforms: Callable
Transformation function applied to each item tuple before return. Applied in
__getitem__(). Default transformations are sub-class-specific. Items transformed usingtransformscan be cached by settingtransforms_cache. If the transformations should be applied always, regardless of caching, useafter_cache_transforms.
- transforms_cache: Optional[Cache]
Cache for the transformed
(input, target)tuples. If set,__getitem__()will first try to load the tuple from cache before loading and transforming it normally. Items not in the cache are put in there aftertransformsis applied.