loki2.models.base.vision_transformer

Vision Transformer implementation for Loki2.

This module provides a Vision Transformer (ViT) backbone architecture based on the implementation from CellViT++.

The truncated normal initialization (trunc_normal_) follows the method described at https://people.sc.fsu.edu/~jburkardt/presentations/truncated_normal.pdf.

Module Contents

loki2.models.base.vision_transformer.trunc_normal_(tensor: torch.Tensor, mean: float = 0.0, std: float = 1.0, a: float = -2.0, b: float = 2.0) torch.Tensor

Initialize tensor with truncated normal distribution.

Parameters:
  • tensor – Tensor to initialize.

  • mean – Mean of the normal distribution. Defaults to 0.0.

  • std – Standard deviation of the normal distribution. Defaults to 1.0.

  • a – Lower bound of truncation. Defaults to -2.0.

  • b – Upper bound of truncation. Defaults to 2.0.

Returns:

The initialized tensor (modified in-place).

Return type:

torch.Tensor
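A minimal usage sketch; the tensor shape and the std value are illustrative choices, not taken from the Loki2 code:

import torch
from loki2.models.base.vision_transformer import trunc_normal_

# Fill a parameter tensor in-place; std=0.02 is a common choice for ViT weights.
weight = torch.empty(196, 768)
trunc_normal_(weight, mean=0.0, std=0.02)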

loki2.models.base.vision_transformer.drop_path(x: torch.Tensor, drop_prob: float = 0.0, training: bool = False) torch.Tensor

Drop paths (Stochastic Depth) per sample.

Parameters:
  • x – Input tensor.

  • drop_prob – Drop probability. Defaults to 0.0.

  • training – Whether in training mode. Defaults to False.

Returns:

Output tensor with dropped paths applied.

Return type:

torch.Tensor
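A minimal sketch of the functional form, assuming the standard stochastic-depth behavior the docstring describes; shapes are illustrative:

import torch
from loki2.models.base.vision_transformer import drop_path

x = torch.randn(8, 197, 768)                     # (batch, tokens, embed_dim)

# With training=False the input is returned unchanged.
y_eval = drop_path(x, drop_prob=0.1, training=False)

# With training=True, stochastic depth zeroes whole samples with probability
# drop_prob and rescales the surviving ones by 1 / (1 - drop_prob).
y_train = drop_path(x, drop_prob=0.1, training=True)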

class loki2.models.base.vision_transformer.PatchEmbed(img_size: int = 224, patch_size: int = 16, in_chans: int = 3, embed_dim: int = 768)

Bases: torch.nn.Module

Image to Patch Embedding (without positional embedding).

Converts input images into patch embeddings using a convolutional layer.

Parameters:
  • img_size – Input image size. Defaults to 224.

  • patch_size – Patch token size (one dimension only, tokens are squared). Defaults to 16.

  • in_chans – Number of input channels. Defaults to 3.

  • embed_dim – Embedding dimension. Defaults to 768.

num_patches
img_size
patch_size
proj
forward(x)
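A usage sketch with the documented defaults; a 224×224 image split into 16×16 patches yields (224 / 16)² = 196 patch tokens, so the expected output shape below is an assumption based on that arithmetic:

import torch
from loki2.models.base.vision_transformer import PatchEmbed

embed = PatchEmbed(img_size=224, patch_size=16, in_chans=3, embed_dim=768)
images = torch.randn(2, 3, 224, 224)    # (batch, channels, height, width)
tokens = embed(images)                  # expected shape: (2, 196, 768)
print(embed.num_patches)                # (224 // 16) ** 2 == 196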
class loki2.models.base.vision_transformer.Attention(dim: int, num_heads: int = 8, qkv_bias: bool = False, qk_scale: float = None, attn_drop: float = 0.0, proj_drop: float = 0.0)

Bases: torch.nn.Module

Attention Module (Multi-Head Attention, MHA)

Parameters:
  • dim (int) – Embedding dimension

  • num_heads (int, optional) – Number of attention heads. Defaults to 8.

  • qkv_bias (bool, optional) – If bias should be used for query (q), key (k), and value (v). Defaults to False.

  • qk_scale (float, optional) – Scaling parameter. Defaults to None.

  • attn_drop (float, optional) – Dropout for attention layer. Defaults to 0.0.

  • proj_drop (float, optional) – Dropout for projection layers. Defaults to 0.0.

num_heads
head_dim
scale
qkv
attn_drop
proj
proj_drop
forward(x)
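A standalone construction sketch; the forward return is not documented here, so the comment only notes the DINO-style convention this backbone appears to follow:

import torch
from loki2.models.base.vision_transformer import Attention

attn = Attention(dim=768, num_heads=8, qkv_bias=True)
x = torch.randn(2, 197, 768)            # (batch, tokens, embed_dim)
result = attn(x)                        # DINO-style attention modules return the projected
                                        # tokens (and often the attention weights as well)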
class loki2.models.base.vision_transformer.Mlp(in_features: int, hidden_features: int = None, out_features: int = None, act_layer: Callable = nn.GELU, drop: float = 0.0)

Bases: torch.nn.Module

MLP module (feed-forward network) used inside the Transformer blocks.

Applies two fully connected layers with an activation in between and dropout.

Parameters:
  • in_features – Number of input features.

  • hidden_features – Number of hidden features. Defaults to None.

  • out_features – Number of output features. Defaults to None.

  • act_layer – Activation layer. Defaults to nn.GELU.

  • drop – Dropout rate. Defaults to 0.0.

out_features
hidden_features
fc1
act
fc2
drop
forward(x)
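A standalone usage sketch; the arguments are illustrative and sized to match a ViT-Base block (hidden_features = 4 × embed_dim):

import torch
from loki2.models.base.vision_transformer import Mlp

mlp = Mlp(in_features=768, hidden_features=3072, out_features=768, drop=0.0)
x = torch.randn(2, 197, 768)
y = mlp(x)                              # expected shape: (2, 197, 768)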
class loki2.models.base.vision_transformer.DropPath(drop_prob=None)

Bases: torch.nn.Module

Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).

drop_prob
forward(x)
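A short sketch, assuming the module wraps the drop_path function above using its own training flag (the standard stochastic-depth wrapper):

import torch
from loki2.models.base.vision_transformer import DropPath

dp = DropPath(drop_prob=0.1)
x = torch.randn(4, 197, 768)

dp.train()
y = dp(x)        # stochastic depth active: whole samples may be zeroed

dp.eval()
y = dp(x)        # evaluation mode: the input passes through unchanged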
class loki2.models.base.vision_transformer.Block(dim: int, num_heads: int, mlp_ratio: float = 4.0, qkv_bias: bool = False, qk_scale: float = None, drop: float = 0.0, attn_drop: float = 0.0, drop_path: float = 0.0, act_layer: Callable = nn.GELU, norm_layer: Callable = nn.LayerNorm)

Bases: torch.nn.Module

Transformer encoder block.

Multi-head self-attention followed by an MLP, each preceded by layer normalization and connected through residual paths with optional stochastic depth (DropPath).

Parameters:
  • dim – Embedding dimension.

  • num_heads – Number of attention heads.

  • mlp_ratio – Ratio of the MLP hidden dimension to the embedding dimension. Defaults to 4.0.

  • qkv_bias – If bias should be used for query, key, and value projections. Defaults to False.

  • qk_scale – Scaling parameter. Defaults to None.

  • drop – Dropout rate for the MLP and projection layers. Defaults to 0.0.

  • attn_drop – Dropout rate for the attention layer. Defaults to 0.0.

  • drop_path – Stochastic depth rate. Defaults to 0.0.

  • act_layer – Activation layer. Defaults to nn.GELU.

  • norm_layer – Normalization layer. Defaults to nn.LayerNorm.

norm1
attn
drop_path
norm2
mlp_hidden_dim
mlp
forward(x, return_attention=False)
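A usage sketch assuming the DINO-style convention that return_attention=True yields the block's attention map instead of the token output:

import torch
from loki2.models.base.vision_transformer import Block

block = Block(dim=768, num_heads=12, mlp_ratio=4.0, qkv_bias=True, drop_path=0.1)
x = torch.randn(2, 197, 768)

y = block(x)                             # expected: same shape as the input
attn = block(x, return_attention=True)   # DINO-style: the attention map of this block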
class loki2.models.base.vision_transformer.VisionTransformer(img_size: List[int] = [224], patch_size: int = 16, in_chans: int = 3, num_classes: int = 0, embed_dim: int = 768, depth: int = 12, num_heads: int = 12, mlp_ratio: float = 4.0, qkv_bias: bool = False, qk_scale: float = None, drop_rate: float = 0.0, attn_drop_rate: float = 0.0, drop_path_rate: float = 0.0, norm_layer: Callable = nn.LayerNorm, **kwargs)

Bases: torch.nn.Module

Vision Transformer

patch_embed
num_patches
cls_token
pos_embed
pos_drop
dpr
blocks
norm
head
interpolate_pos_encoding(x, w, h)
prepare_tokens(x)
forward(x: torch.Tensor) Tuple[torch.Tensor, torch.Tensor]

Forward pass

Parameters:

x (torch.Tensor) – Input batch

Returns:

Class token (raw)

Return type:

Tuple[torch.Tensor, torch.Tensor]

get_last_selfattention(x)
get_intermediate_layers(x, n=1)
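A construction sketch mirroring the documented defaults (a ViT-Base-like configuration); the comments on the helper methods follow the usual DINO-style conventions and should be checked against the implementation:

import torch
from loki2.models.base.vision_transformer import VisionTransformer

vit = VisionTransformer(
    img_size=[224],
    patch_size=16,
    embed_dim=768,
    depth=12,
    num_heads=12,
    mlp_ratio=4.0,
    qkv_bias=True,
)
images = torch.randn(2, 3, 224, 224)

out = vit(images)                                  # per the signature, a tuple whose first element is the raw class token
attn = vit.get_last_selfattention(images)          # DINO-style: self-attention of the final block
feats = vit.get_intermediate_layers(images, n=4)   # DINO-style: outputs of the last n blocks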