BatchIntersectDecode2D

class hybrid_learning.datasets.transforms.encoder.BatchIntersectDecode2D(proto_shape=None, kernel_size=None)[source]

Bases: BatchConvOp

Given a batch of IoU-encoded masks, estimates the original segmentation masks.

This estimation is done by

  1. “bloating” each pixel: create a mask holding the proto_shape at the pixel’s location, weighted by the pixel value

  2. adding up all bloated pixel masks to obtain one mask
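The two steps above can be sketched directly in numpy. This is a minimal, unoptimized version assuming a single 2D single-channel mask; the function name `bloat_decode` is illustrative, not part of the package API:

```python
import numpy as np

def bloat_decode(enc: np.ndarray, proto_shape: np.ndarray) -> np.ndarray:
    """Naive decoder: paste a weighted copy of the L1-normalized
    proto_shape centered at every pixel location and sum the copies."""
    proto = proto_shape / proto_shape.sum()  # L1-normalize
    h, w = enc.shape
    kh, kw = proto.shape
    # Offsets of the kernel center within the kernel:
    top, left = (kh - 1) // 2, (kw - 1) // 2
    # Work on a padded canvas so pastes near the border fit:
    out = np.zeros((h + kh - 1, w + kw - 1))
    for y in range(h):
        for x in range(w):
            # Step 1: bloat pixel (y, x) into a weighted proto_shape mask.
            # Step 2: add it onto the accumulated output mask.
            out[y:y + kh, x:x + kw] += enc[y, x] * proto
    # Crop back to the input size (centered):
    return out[top:top + h, left:left + w]
```

The convolution form described below produces the same result in a single vectorized operation.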

Consider the convolution that describes the IoU encoding of the given mask. The above steps can then be simplified to a single convolution whose kernel and padding are those of the encoding convolution, but with

  • kernel and padding flipped along each dimension; in 2D, the two flips are equivalent to a rotation by 180°,

  • the kernel normalized by its L1 norm (so that the kernel entries sum up to 1).
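This simplification can be sketched with torch. A minimal version, assuming odd kernel sizes (so the size-preserving padding is symmetric) and a single-channel batch; the name `decode_conv` is illustrative:

```python
import torch
import torch.nn.functional as F

def decode_conv(enc: torch.Tensor, proto_shape: torch.Tensor) -> torch.Tensor:
    """Decode a batch of IoU-encoded masks (shape (B, 1, H, W)) with one
    convolution: the kernel is the proto_shape flipped along each image
    dimension and L1-normalized."""
    # Flip along each image dimension (in 2D: a 180° rotation):
    kernel = torch.flip(proto_shape, dims=(-2, -1))
    kernel = kernel / kernel.sum()  # L1-normalize: entries sum to 1
    kernel = kernel.reshape(1, 1, *kernel.shape[-2:])
    kh, kw = kernel.shape[-2:]
    # Padding keeping the input size; symmetric only for odd kernel sizes
    # (for even sizes the decoder needs the encoder's padding flipped):
    return F.conv2d(enc, kernel, padding=((kh - 1) // 2, (kw - 1) // 2))
```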

Derivation

The decoder formulas can be derived as follows: Consider a pixel \(p\) in the segmentation mask to be estimated, with coordinates \((p_{a})_{a \in \text{axes}}\), where the image axes are the axes describing a single image, e.g. (width, height) in 2D. The bloat mask of a pixel \(p^{iou}\) in the IoU-encoded mask can only contribute to the value of \(p\) if the proto_shape centered at \(p^{iou}\) reaches \(p\), i.e. if the distance to \(p\) along each image axis \(a\) satisfies

\[\begin{split}- \text{ceil}(0.5 \cdot (\text{proto\_shape\_size}[a] - 1)) &\leq p_{a} - p^{iou}_{a} \\ &\leq \text{floor}(0.5 \cdot (\text{proto\_shape\_size}[a] - 1))\end{split}\]

where \(p_{a} - p^{iou}_{a} < 0\) means \(p^{iou}\) lies to the left of \(p\) along axis \(a\), and to the right otherwise. This describes a kernel of the same size as the IoU encoding kernel, but with the padding flipped in each dimension. The kernel entries, i.e. the contribution of \(p^{iou}\) at kernel position \((pos_{a})_{a \in \text{axes}}\) to \(p\), are:

\[\begin{split}\text{proto\_shape}\left[ \left( p_{a} - p^{iou}_{a} + \text{floor}(0.5 \cdot (\text{proto\_shape\_size}[a] - 1)) \right)_{a} \right] \\ = \text{proto\_shape}[(\text{proto\_shape\_size}[a] - pos_{a})_{a}]\end{split}\]

which is the proto_shape kernel of the IoU encoding but flipped in each dimension.
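The bounds of the inequality and the final flip identity can be checked numerically. A small sketch, where `reach` is a hypothetical helper and indices are taken 0-based:

```python
import math
import numpy as np

def reach(size: int) -> tuple:
    """Offsets p_a - p^iou_a at which the proto_shape centered at
    p^iou still covers p, per the inequality above."""
    return (-math.ceil(0.5 * (size - 1)), math.floor(0.5 * (size - 1)))

# Odd sizes give a symmetric reach, even sizes an asymmetric one;
# this asymmetry is why the decoder's padding is the encoder's flipped.
print(reach(5))  # → (-2, 2)
print(reach(4))  # → (-2, 1)

# The decoder kernel is the encoding proto_shape flipped in each image
# dimension; with 0-based indices, flipped[pos] == proto[size - 1 - pos].
proto = np.arange(6, dtype=float).reshape(2, 3)  # asymmetric example shape
flipped = np.flip(proto, axis=(0, 1))
assert np.array_equal(flipped, np.rot90(proto, 2))  # 2D: 180° rotation
```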

Public Data Attributes:

decoding_proto_shape

The kernel of the decoder (flipped and normalized IoU encoding proto shape).

proto_shape

The (L1-normalized) proto shape used for the IoU encoding.

Inherited from BatchConvOp

proto_shape

The (L1-normalized) proto shape used for the IoU encoding.

kernel_size

The kernel size of the proto-type shape.

settings

Settings to reproduce the instance.

Inherited from BatchWindowOp

AREA_DIMS

Indices of axes in which the image area is defined.

kernel_size

The kernel size of the proto-type shape.

settings

Settings to reproduce the instance.

Inherited from Module

dump_patches

This allows better BC support for load_state_dict().

T_destination

alias of TypeVar('T_destination', bound=Mapping[str, Tensor])

Public Methods:

conv_op(masks)

Forward pass: Apply decoding convolution

Inherited from BatchWindowOp

forward(masks)

Wrapper for the convolutional operation on batch of masks.

conv_op(masks)

Forward pass: Apply decoding convolution

Inherited from Module

register_buffer(name, tensor[, persistent])

Adds a buffer to the module.

register_parameter(name, param)

Adds a parameter to the module.

add_module(name, module)

Adds a child module to the current module.

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self.

cuda([device])

Moves all model parameters and buffers to the GPU.

xpu([device])

Moves all model parameters and buffers to the XPU.

cpu()

Moves all model parameters and buffers to the CPU.

type(dst_type)

Casts all parameters and buffers to dst_type.

float()

Casts all floating point parameters and buffers to float datatype.

double()

Casts all floating point parameters and buffers to double datatype.

half()

Casts all floating point parameters and buffers to half datatype.

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

to_empty(*, device)

Moves the parameters and buffers to the specified device without copying storage.

to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

register_backward_hook(hook)

Registers a backward hook on the module.

register_full_backward_hook(hook)

Registers a backward hook on the module.

register_forward_pre_hook(hook)

Registers a forward pre-hook on the module.

register_forward_hook(hook)

Registers a forward hook on the module.

state_dict([destination, prefix, keep_vars])

Returns a dictionary containing a whole state of the module.

load_state_dict(state_dict[, strict])

Copies parameters and buffers from state_dict into this module and its descendants.

parameters([recurse])

Returns an iterator over module parameters.

named_parameters([prefix, recurse])

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

buffers([recurse])

Returns an iterator over module buffers.

named_buffers([prefix, recurse])

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

children()

Returns an iterator over immediate children modules.

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

modules()

Returns an iterator over all modules in the network.

named_modules([memo, prefix, remove_duplicate])

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

train([mode])

Sets the module in training mode.

eval()

Sets the module in evaluation mode.

requires_grad_([requires_grad])

Change if autograd should record operations on parameters in this module.

zero_grad([set_to_none])

Sets gradients of all model parameters to zero.

share_memory()

See torch.Tensor.share_memory_()

extra_repr()

Set the extra representation of the module

Special Methods:

__init__([proto_shape, kernel_size])

Init.

Inherited from BatchWindowOp

__repr__()

Representation based on this instance's settings.

Inherited from Module

__init__([proto_shape, kernel_size])

Init.

__call__(*input, **kwargs)

Call self as a function.

__setstate__(state)

__getattr__(name)

__setattr__(name, value)

Implement setattr(self, name, value).

__delattr__(name)

Implement delattr(self, name).

__repr__()

Representation based on this instance's settings.

__dir__()

Default dir() implementation.


Parameters
  • proto_shape (ndarray) –

  • kernel_size (Tuple[int, ...]) –

__init__(proto_shape=None, kernel_size=None)[source]

Init.

Parameters
  • proto_shape (Optional[ndarray]) – the proto shape used for IoU encoding in a form accepted by numpy.ndarray()

  • kernel_size (Optional[Tuple[int, ...]]) – if proto_shape is None, use all-ones rectangular shape of kernel_size

conv_op(masks)[source]

Forward pass: Apply decoding convolution

Parameters

masks (Tensor) –

Return type

Tensor

decoder_conv

Convolution to calculate the intersection for each location of proto_shape

property decoding_proto_shape: numpy.ndarray

The kernel of the decoder (flipped and normalized IoU encoding proto shape).

padding

Padding to obtain same size as input after convolution

property proto_shape: numpy.ndarray

The (L1-normalized) proto shape used for the IoU encoding.

training: bool