ConceptDetectionModel2D

class hybrid_learning.concepts.models.concept_models.concept_detection.ConceptDetectionModel2D(concept=None, model=None, layer_id=None, kernel_size=None, in_channels=None, concept_name=None, apply_sigmoid=True, apply_padding=True, ensemble_count=1, use_laplace=False, use_bias=True)[source]

Bases: Module

Pytorch model implementation of a concept embedding for 2D conv layers. The model itself is simply an ensemble (see ensemble_count) of convolutional layers with optional sigmoid activation (see apply_sigmoid). Each ensemble member is to tell from the activation map of a main_model which spatial regions of the activation map belong to a given concept and which do not. These regions are windows of the concept model kernel_size.

Additional features compared to a normal Conv2D layer:

  • Convenience: During init, in_channels and kernel_size can be automatically determined from a given main model and concept data. Also, if apply_padding is set to True, a zero padding is automatically determined such that the output size of the convolution equals the input size (assuming constantly sized inputs).

  • Flexible architecture: Via use_bias, the bias can be disabled during init (it is then assumed to be constantly 0).

  • Storage of meta information: If given during init, meta information like references to the main_model and the concept are kept for reproducibility.

  • Storage: An ensemble can be turned into a generic save format that also captures meta and architecture specification (see to_embedding()).

The model forward works as follows:

Input

Activation map output of a 2D convolutional layer.

Output

List of heatmaps (one for each ensemble member) showing which centers of boxes of kernel_size belong to the concept. The heatmap values are the sigmoid of a convolution operation if apply_sigmoid is True.
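
For illustration, the per-member computation can be sketched with plain torch modules (a minimal sketch under the assumptions above; the actual internal layout of ConceptDetectionModel2D may differ):

import torch
from torch import nn

# One ensemble member, mirroring the documented behavior:
# optional zero-padding -> single-output-channel Conv2d -> optional sigmoid.
in_channels, kernel_size = 256, (3, 3)
pad = nn.ZeroPad2d((1, 1, 1, 1))  # keeps output size == input size for a 3x3 kernel
conv = nn.Conv2d(in_channels, out_channels=1, kernel_size=kernel_size)
act = nn.Sigmoid()

act_map = torch.randn(4, in_channels, 14, 14)  # activation map of the main model
heatmap = act(conv(pad(act_map)))              # heatmap values in [0, 1]
assert heatmap.shape == (4, 1, 14, 14)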


Public Data Attributes:

concept

The concept for which this model was configured.

concept_name

The name of the associated concept if known.

main_model_stump

Stump of the main model for which this instance was configured.

main_model

Shortcut to access the main model.

layer_id

Layer to extract concept from.

kernel_size

Size of the convolution kernel.

in_channels

Number of input channels.

apply_sigmoid

Whether a sigmoid is applied to the output of the forward function before returning it.

apply_padding

Whether a zero-padding is applied to the input of the forward function.

settings

The current model settings as dictionary.

Inherited from Module

dump_patches

This allows better BC support for load_state_dict().

T_destination

alias of TypeVar('T_destination', bound=Mapping[str, Tensor])

Public Methods:

reset_parameters()

Randomly (re)initialize weight and bias.

to_embedding()

Return the plain representation of the ensemble as list of ConceptEmbedding.

forward(inp)

Torch model forward evaluation method.

Inherited from Module

register_buffer(name, tensor[, persistent])

Adds a buffer to the module.

register_parameter(name, param)

Adds a parameter to the module.

add_module(name, module)

Adds a child module to the current module.

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self.

cuda([device])

Moves all model parameters and buffers to the GPU.

xpu([device])

Moves all model parameters and buffers to the XPU.

cpu()

Moves all model parameters and buffers to the CPU.

type(dst_type)

Casts all parameters and buffers to dst_type.

float()

Casts all floating point parameters and buffers to float datatype.

double()

Casts all floating point parameters and buffers to double datatype.

half()

Casts all floating point parameters and buffers to half datatype.

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

to_empty(*, device)

Moves the parameters and buffers to the specified device without copying storage.

to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

register_backward_hook(hook)

Registers a backward hook on the module.

register_full_backward_hook(hook)

Registers a backward hook on the module.

register_forward_pre_hook(hook)

Registers a forward pre-hook on the module.

register_forward_hook(hook)

Registers a forward hook on the module.

state_dict([destination, prefix, keep_vars])

Returns a dictionary containing a whole state of the module.

load_state_dict(state_dict[, strict])

Copies parameters and buffers from state_dict into this module and its descendants.

parameters([recurse])

Returns an iterator over module parameters.

named_parameters([prefix, recurse])

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

buffers([recurse])

Returns an iterator over module buffers.

named_buffers([prefix, recurse])

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

children()

Returns an iterator over immediate children modules.

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

modules()

Returns an iterator over all modules in the network.

named_modules([memo, prefix, remove_duplicate])

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

train([mode])

Sets the module in training mode.

eval()

Sets the module in evaluation mode.

requires_grad_([requires_grad])

Change if autograd should record operations on parameters in this module.

zero_grad([set_to_none])

Sets gradients of all model parameters to zero.

share_memory()

See torch.Tensor.share_memory_()

extra_repr()

Set the extra representation of the module.

Special Methods:

__init__([concept, model, layer_id, ...])

Init.

Inherited from Module

__call__(*input, **kwargs)

Call self as a function.

__setstate__(state)

__getattr__(name)

__setattr__(name, value)

Implement setattr(self, name, value).

__delattr__(name)

Implement delattr(self, name).

__repr__()

Return repr(self).

__dir__()

Default dir() implementation.


__init__(concept=None, model=None, layer_id=None, kernel_size=None, in_channels=None, concept_name=None, apply_sigmoid=True, apply_padding=True, ensemble_count=1, use_laplace=False, use_bias=True)[source]

Init.

Parameters
  • model (Optional[Module]) – model the concept should be embedded in; used to create (and later accessible in) main_model_stump; used for kernel_size and in_channels auto-inference

  • layer_id (Optional[str]) – the layer index in state_dict(), the output of which is to be fed to the concept model; used to create (and later accessible in) main_model_stump; used for kernel_size and in_channels auto-inference

  • concept (Optional[SegmentationConcept2D]) – Concept to train for; must be a segmentation concept featuring ground truth masks; used for kernel_size and in_channels auto-inference

  • in_channels (Optional[int]) – Number of filters of the Conv2d layer to analyse; the value is automatically determined if in_channels or kernel_size is None; an automatically determined value overwrites a given one with a warning

  • kernel_size (Optional[Tuple[int, int]]) – Size in activation map pixels of a window for which to assess whether it is part of the concept or not; by default it is determined by the relative sizes in the concept’s rel_size and the layer output size; if concept.rel_size is not set, kernel_size is set to (1, 1) with a warning

  • concept_name (Optional[str]) – The concept name identifier to use for concept_name; defaults to the name given in concept

  • apply_sigmoid (bool) – see apply_sigmoid

  • apply_padding (bool) – see apply_padding

  • ensemble_count (int) – number of deep ensemble models, see ensemble_count

  • use_laplace (bool) – if True, the covariance of the predictions is approximated using a Laplace approximation

  • use_bias (bool) – see use_bias
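
Since all arguments are optional, a model can plausibly be constructed without any main model or concept by passing in_channels and kernel_size explicitly, so that no auto-inference is needed (a sketch; argument values are illustrative):

from hybrid_learning.concepts.models.concept_models.concept_detection import ConceptDetectionModel2D

c_model = ConceptDetectionModel2D(
    in_channels=256,     # number of channels of the inspected layer's output
    kernel_size=(3, 3),  # assumed concept size in activation map pixels
    apply_sigmoid=True,  # heatmap values in [0, 1]
    apply_padding=True,  # output spatial size == input spatial size
    ensemble_count=3,    # three independent convolutional members
)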

forward(inp)[source]

Torch model forward evaluation method.

Parameters

inp (Tensor) –

Return type

Tensor

static from_embedding(embeddings_list, legacy_warnings=True, **kwargs)[source]

Initialize a concept localization model from an embedding. The weight and bias are obtained as follows:

Weight

The weight is the normal vector of the embedding.

Bias

Given the embedding’s support_factor as \(b\), the bias calculates as (compare to_embedding()):

\[\text{bias} = - b \cdot (|\text{weight}|^2)\]
Parameters
  • embeddings_list (Union[ConceptEmbedding, Sequence[ConceptEmbedding]]) – the embeddings to use

  • legacy_warnings (bool) – whether to give warnings about legacy, non-captured embedding attributes

  • kwargs – any keyword arguments to the concept model (overwrite the values obtained from embedding)

Returns

a concept localization model initialized with the embedding information

Return type

ConceptDetectionModel2D
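
A numeric sketch of the documented recovery of weight and bias from an embedding (tensor shapes and values are hypothetical; only the support_factor and normal vector semantics are taken from the documentation above):

import torch

normal_vec = torch.randn(256, 3, 3)  # the embedding's normal vector
b = 0.5                              # the embedding's support_factor

weight = normal_vec                  # weight = normal vector
bias = -b * weight.pow(2).sum()      # bias = -b * |weight|^2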

reset_parameters()[source]

Randomly (re)initialize weight and bias.

Return type

None

to_embedding()[source]

Return the plain representation of the ensemble as a list of ConceptEmbedding. The embeddings hold:

As parameters

weight and bias of the concept layers, and

As meta info

the concept and main_model with layer_id.

Return type

List[ConceptEmbedding]

Note

This must be a deep copy to avoid the parameters being overwritten in a subsequent training session.

The resulting embedding describes the decision hyperplane of the concept model. Its normal vector \(n\) is the concept layer weight. The orthogonal support vector given by \(b\cdot n\) for a scalar factor \(b\) must fulfill

\[\forall v: (v - b\cdot n) \circ n = d(v) = (v \circ \text{weight}) + \text{bias}\]

i.e.

\[n = \text{weight} \quad\text{and}\quad b = - \frac{\text{bias}} {|\text{weight}|^2}.\]

Here, \(d(v)\) is the signed distance measure of a vector from the hyperplane, i.e.

\[\begin{split}d(v) \begin{cases} > 0 & \text{iff vector yields a positive prediction,}\\ \equiv 0 & \text{iff vector on decision boundary hyperplane,}\\ < 0 & \text{iff vector yields a negative prediction.} \end{cases}\end{split}\]
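
These relations can be checked numerically (a self-contained sketch with hypothetical values):

import torch

weight = torch.randn(256, 3, 3)
bias = torch.tensor(-1.25)
b = -bias / weight.pow(2).sum()  # b = -bias / |weight|^2

v = torch.randn(256, 3, 3)
d_conv = (v * weight).sum() + bias          # d(v) = (v o weight) + bias
d_geom = ((v - b * weight) * weight).sum()  # d(v) = (v - b*n) o n with n = weight
assert torch.isclose(d_conv, d_geom, atol=1e-4)
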
activation: Optional[torch.nn.modules.activation.Sigmoid]

The activation layer to obtain heatmaps in [0,1]. Defaults to a sigmoid if apply_sigmoid is set to True during init. If set to None, no activation is applied.

property apply_padding: bool

Whether a zero-padding is applied to the input of the forward function. The padding should ensure that the output size equals the input size.

property apply_sigmoid: bool

Whether a sigmoid is applied to the output of the forward function before returning it.

property concept: Optional[SegmentationConcept2D]

The concept for which this model was configured.

property concept_name: Optional[str]

The name of the associated concept if known. Defaults to the name of concept if given.

ensemble_count: int

Number of deep ensemble models. This is also the first dimension of the forward output. Each ensemble member is simply a separate convolutional layer, and all members are run in parallel.

property in_channels: int

Number of input channels. This is the number of output channels of the layer to investigate.

property kernel_size: Tuple[int, ...]

Size of the convolution kernel. This is the assumed concept size in activation map pixels.

property layer_id: str

Layer to extract concept from. Shortcut to access the information from main_model_stump.

property main_model: torch.nn.modules.module.Module

Shortcut to access the main model. It is wrapped by main_model_stump.

property main_model_stump: ModelStump

Stump of the main model for which this instance was configured. The concept model is assumed to accept as input the output of this model stump (i.e. the corresponding layer of the main_model).

Implementation detail: The actual attribute is wrapped into a tuple to hide the parameters, since these shall not be updated; see https://discuss.pytorch.org/t/how-to-exclude-parameters-from-model/6151

padding: Optional[torch.nn.modules.padding.ZeroPad2d]

The padding to apply before the convolution. Defaults to a padding such that the output size equals the input size if apply_padding is set to True during init. If set to None, no padding is applied.
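
One plausible way to derive such a "same size" zero padding for a given kernel size (the library's exact computation may differ) is to distribute kernel_size - 1 padding pixels over the two sides of each dimension:

from torch import nn

def same_padding(kernel_size):
    # ZeroPad2d expects (left, right, top, bottom); even kernels get the
    # extra padding pixel on the right/bottom side here.
    h, w = kernel_size
    return nn.ZeroPad2d(((w - 1) // 2, w // 2, (h - 1) // 2, h // 2))

pad = same_padding((3, 4))  # -> ZeroPad2d((1, 2, 1, 1))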

property settings: Dict[str, Any]

The current model settings as dictionary.

training: bool

Whether the module is in training mode (inherited from Module).

use_bias: bool

Whether the convolution should have and learn a bias, or the bias should be constantly 0.

use_laplace: bool

Whether training handles should use Laplace approximation.