<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://mscneuro.neuro.uni-bremen.de/index.php?action=history&amp;feed=atom&amp;title=Layers</id>
	<title>Layers - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://mscneuro.neuro.uni-bremen.de/index.php?action=history&amp;feed=atom&amp;title=Layers"/>
	<link rel="alternate" type="text/html" href="https://mscneuro.neuro.uni-bremen.de/index.php?title=Layers&amp;action=history"/>
	<updated>2026-04-21T20:54:27Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.5</generator>
	<entry>
		<id>https://mscneuro.neuro.uni-bremen.de/index.php?title=Layers&amp;diff=502&amp;oldid=prev</id>
		<title>Davrot: Created page with &quot;Layers, layers, everywhere. If you understand PyTorch layers then you understand most of PyTorch. Well, and torch.tensor…  Questions to [mailto:davrot@uni-bremen.de David Rotermund]  Open Book &amp;#x26; Website recommendation If you don’t know what these layers mean then don’t be ashamed and check out these resources:   Dive into Deep Learning https://d2l.ai/  as well as corresponding Youtube channel from [https://www.youtube.com/playlist?list=PLZSO_6-bSqHQHBCoGaObUlj...&quot;</title>
		<link rel="alternate" type="text/html" href="https://mscneuro.neuro.uni-bremen.de/index.php?title=Layers&amp;diff=502&amp;oldid=prev"/>
		<updated>2025-10-21T14:25:47Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;Layers, layers, everywhere. If you understand PyTorch layers then you understand most of PyTorch. Well, and torch.tensor…  Questions to [mailto:davrot@uni-bremen.de David Rotermund]  Open Book &amp;amp; Website recommendation If you don’t know what these layers mean then don’t be ashamed and check out these resources:   Dive into Deep Learning https://d2l.ai/  as well as corresponding Youtube channel from [https://www.youtube.com/playlist?list=PLZSO_6-bSqHQHBCoGaObUlj...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Layers, layers, everywhere. If you understand PyTorch layers then you understand most of PyTorch. Well, and torch.tensor…&lt;br /&gt;
&lt;br /&gt;
Questions to [mailto:davrot@uni-bremen.de David Rotermund]&lt;br /&gt;
&lt;br /&gt;
Open Book &amp;amp; Website recommendation: If you don’t know what these layers mean, then don’t be ashamed and check out these resources:&lt;br /&gt;
&lt;br /&gt;
Dive into Deep Learning https://d2l.ai/&lt;br /&gt;
&lt;br /&gt;
as well as the corresponding YouTube channel by [https://www.youtube.com/playlist?list=PLZSO_6-bSqHQHBCoGaObUljoXAyyqhpFW Alex Smola]&lt;br /&gt;
&lt;br /&gt;
== torch.tensor ==&lt;br /&gt;
Before we can use layers we need to understand [https://pytorch.org/docs/stable/tensors.html torch.tensor] a bit better. You can think of a tensor as a np.ndarray with additional properties and options. Most of it works like numpy arrays. However, a tensor can also carry a gradient!&lt;br /&gt;
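A minimal sketch of that extra gradient (the function and values are made up for illustration):&lt;br /&gt;

```python
import torch

# requires_grad=True tells autograd to track operations on this tensor
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x**2).sum()  # y = sum(x_i^2)
y.backward()      # fills x.grad with dy/dx = 2*x
grad = x.grad
```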
&lt;br /&gt;
There are official tutorials about tensors:&lt;br /&gt;
&lt;br /&gt;
* [https://pytorch.org/docs/stable/tensors.html Tensors]&lt;br /&gt;
* [https://pytorch.org/tutorials/beginner/nlp/pytorch_tutorial.html Introduction to Torch’s tensor library] However, I feel that they are a bit useless.&lt;br /&gt;
&lt;br /&gt;
Let’s be very blunt here: If you don’t understand the basics of [https://numpy.org/doc/stable/reference/arrays.ndarray.html Numpy ndarray] then you will be lost. If you know the usual stuff about np.ndarrays then you understand most of torch.tensor too. You only need some extra information, which I will try to provide here. &lt;br /&gt;
&lt;br /&gt;
The [https://pytorch.org/docs/stable/tensors.html#data-types torch.tensor data types] (for the CPU) are very similar to Numpy’s, e.g. np.float32 -&amp;amp;#x3E; torch.float32&lt;br /&gt;
&lt;br /&gt;
=== There and back again (np.ndarray &amp;amp;#x3C;-&amp;amp;#x3E; torch.tensor) ===&lt;br /&gt;
Let us convert x_np: np.ndarray into a torch.Tensor:&amp;lt;syntaxhighlight lang=&amp;quot;python&amp;quot;&amp;gt;x_torch: torch.Tensor = torch.tensor(x_np)&amp;lt;/syntaxhighlight&amp;gt;And we convert it back into a np.ndarray:&amp;lt;syntaxhighlight lang=&amp;quot;python&amp;quot;&amp;gt;x_np: np.ndarray = x_torch.detach().numpy()&amp;lt;/syntaxhighlight&amp;gt;or, if x_torch is on the GPU:&amp;lt;syntaxhighlight lang=&amp;quot;python&amp;quot;&amp;gt;x_np: np.ndarray = x_torch.cpu().detach().numpy()&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.Tensor.detach.html?highlight=detach#torch.Tensor.detach Tensor.detach()]: Get rid of the gradient.&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.Tensor.numpy.html#torch.Tensor.numpy Tensor.numpy()] : “Returns self tensor as a NumPy ndarray. This tensor and the returned ndarray share the same underlying storage. Changes to self tensor will be reflected in the ndarray and vice versa.”&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.tensor.html#torch.tensor torch.tensor()]: Put stuff in there and get a torch.tensor out of it.&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.Tensor.type.html#torch.Tensor.type Tensor.type()]: the equivalent of numpy’s .astype()&lt;br /&gt;
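A small round-trip sketch combining the calls above (array contents are made up):&lt;br /&gt;

```python
import numpy as np
import torch

x_np = np.arange(6, dtype=np.float32)

# np.ndarray -> torch.Tensor
x_torch = torch.tensor(x_np)

# Tensor.type() plays the role of numpy's .astype()
x_float64 = x_torch.type(torch.float64)

# torch.Tensor -> np.ndarray (detach drops the gradient first)
x_back = x_torch.detach().numpy()
```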
&lt;br /&gt;
=== GPU interactions ===&lt;br /&gt;
[https://pytorch.org/docs/stable/generated/torch.Tensor.to.html?highlight=#torch-tensor-to torch.Tensor.to]: tensors and layers can live on different devices, e.g. CPU and GPU. You need to move them to where you need them. Typically, only objects on the same device can interact. &amp;lt;syntaxhighlight lang=&amp;quot;python&amp;quot;&amp;gt;device_gpu = torch.device(&amp;quot;cuda:0&amp;quot;)&lt;br /&gt;
device_cpu = torch.device(&amp;quot;cpu&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
something_on_gpu =  something_on_cpu.to(device_gpu)&lt;br /&gt;
something_on_cpu =  something_on_gpu.to(device_cpu)&lt;br /&gt;
something_on_cpu =  something_on_gpu.cpu()&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available torch.cuda.is_available()]: Is there a CUDA GPU in my system?&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.cuda.get_device_capability.html#torch.cuda.get_device_capability torch.cuda.get_device_capability()]: If you need to know what your GPU is capable of. You get a number and need to check [[wikipedia:CUDA#Version_features_and_specifications|here]] what this number means.&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.cuda.get_device_name.html#torch.cuda.get_device_name torch.cuda.get_device_name()]: In case you forgot what GPU you bought, or if you have sooo many GPUs in your system that you don’t know which is which.&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.cuda.get_device_properties.html#torch.cuda.get_device_properties torch.cuda.get_device_properties()]: The two above plus how much memory the card has.&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.Tensor.device.html#torch.Tensor.device torch.Tensor.device]: Where does this tensor live?&lt;br /&gt;
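A common pattern tying these together: pick the GPU if one is available, otherwise fall back to the CPU (a sketch, the tensor is just a placeholder):&lt;br /&gt;

```python
import torch

# Select the device once, then move everything to it
device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")

x = torch.zeros((3, 3))
x = x.to(device)  # no-op cost if x is already on that device
```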
&lt;br /&gt;
=== Dimensions ===&lt;br /&gt;
&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.squeeze.html#torch.squeeze torch.squeeze()]: get rid of (a selected) dimension with size 1&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.unsqueeze.html#torch.unsqueeze torch.unsqueeze()]: add a dimension of size 1 at a defined position&lt;br /&gt;
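A quick sketch of both (the shapes are made up):&lt;br /&gt;

```python
import torch

x = torch.zeros((2, 1, 3))

# remove the size-1 dimension at position 1: (2, 1, 3) -> (2, 3)
squeezed = torch.squeeze(x, dim=1)

# add a size-1 dimension at position 0: (2, 3) -> (1, 2, 3)
unsqueezed = torch.unsqueeze(squeezed, dim=0)
```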
&lt;br /&gt;
== Layers (i.e. [https://pytorch.org/docs/stable/nn.html basic building blocks]) ==&lt;br /&gt;
Obviously there are &amp;#039;&amp;#039;&amp;#039;a lot&amp;#039;&amp;#039;&amp;#039; of available layers. I will list only the ones relevant for typical daily use.&lt;br /&gt;
&lt;br /&gt;
Please note the upper-case first letter. If the thing you want to use doesn’t start with an upper-case letter, then something is wrong: it is not a layer.&lt;br /&gt;
&lt;br /&gt;
I will skip the following layer types (because you will not care about them):&lt;br /&gt;
&lt;br /&gt;
* [https://pytorch.org/docs/stable/nn.html#sparse-layers Sparse Layers]&lt;br /&gt;
* [https://pytorch.org/docs/stable/nn.html#vision-layers Vision Layers]&lt;br /&gt;
* [https://pytorch.org/docs/stable/nn.html#shuffle-layers Shuffle Layers]&lt;br /&gt;
* [https://pytorch.org/docs/stable/nn.html#dataparallel-layers-multi-gpu-distributed DataParallel Layers (multi-GPU, distributed)]&lt;br /&gt;
&lt;br /&gt;
In the following I will mark the relevant layers.&lt;br /&gt;
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#convolution-layers Convolution Layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html#torch.nn.Conv1d torch.nn.Conv1d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 1D convolution over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d torch.nn.Conv2d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 2D convolution over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Conv3d.html#torch.nn.Conv3d torch.nn.Conv3d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 3D convolution over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ConvTranspose1d&lt;br /&gt;
|Applies a 1D transposed convolution operator over an input image composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ConvTranspose2d&lt;br /&gt;
|Applies a 2D transposed convolution operator over an input image composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ConvTranspose3d&lt;br /&gt;
|Applies a 3D transposed convolution operator over an input image composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyConv1d&lt;br /&gt;
|A torch.nn.Conv1d module with lazy initialization of the in_channels argument of the Conv1d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyConv2d&lt;br /&gt;
|A torch.nn.Conv2d module with lazy initialization of the in_channels argument of the Conv2d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyConv3d&lt;br /&gt;
|A torch.nn.Conv3d module with lazy initialization of the in_channels argument of the Conv3d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyConvTranspose1d&lt;br /&gt;
|A torch.nn.ConvTranspose1d module with lazy initialization of the in_channels argument of the ConvTranspose1d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyConvTranspose2d&lt;br /&gt;
|A torch.nn.ConvTranspose2d module with lazy initialization of the in_channels argument of the ConvTranspose2d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyConvTranspose3d&lt;br /&gt;
|A torch.nn.ConvTranspose3d module with lazy initialization of the in_channels argument of the ConvTranspose3d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Unfold&lt;br /&gt;
|Extracts sliding local blocks from a batched input tensor.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Fold&lt;br /&gt;
|Combines an array of sliding local blocks into a large containing tensor.&lt;br /&gt;
|}&lt;br /&gt;
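To see the shape bookkeeping of a Conv2d in action (channel counts, kernel size and image size below are made up): with no padding and stride 1, a 3&amp;#x00D7;3 kernel shrinks a 28&amp;#x00D7;28 input to 26&amp;#x00D7;26.&lt;br /&gt;

```python
import torch

# 1 input channel -> 4 output channels, 3x3 kernel, stride 1, no padding
conv = torch.nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3)

x = torch.zeros((7, 1, 28, 28))  # (batch, channels, height, width)
y = conv(x)                      # spatial size: 28 - 3 + 1 = 26
```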
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#pooling-layers Pooling layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.MaxPool1d.html#torch.nn.MaxPool1d torch.nn.MaxPool1d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 1D max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html#torch.nn.MaxPool2d torch.nn.MaxPool2d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 2D max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.MaxPool3d.html#torch.nn.MaxPool3d torch.nn.MaxPool3d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 3D max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MaxUnpool1d&lt;br /&gt;
|Computes a partial inverse of MaxPool1d.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MaxUnpool2d&lt;br /&gt;
|Computes a partial inverse of MaxPool2d.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MaxUnpool3d&lt;br /&gt;
|Computes a partial inverse of MaxPool3d.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.AvgPool1d.html#torch.nn.AvgPool1d torch.nn.AvgPool1d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 1D average pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.AvgPool2d.html#torch.nn.AvgPool2d torch.nn.AvgPool2d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 2D average pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.AvgPool3d.html#torch.nn.AvgPool3d torch.nn.AvgPool3d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a 3D average pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.FractionalMaxPool2d&lt;br /&gt;
|Applies a 2D fractional max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.FractionalMaxPool3d&lt;br /&gt;
|Applies a 3D fractional max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LPPool1d&lt;br /&gt;
|Applies a 1D power-average pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LPPool2d&lt;br /&gt;
|Applies a 2D power-average pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AdaptiveMaxPool1d&lt;br /&gt;
|Applies a 1D adaptive max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AdaptiveMaxPool2d&lt;br /&gt;
|Applies a 2D adaptive max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AdaptiveMaxPool3d&lt;br /&gt;
|Applies a 3D adaptive max pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AdaptiveAvgPool1d&lt;br /&gt;
|Applies a 1D adaptive average pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AdaptiveAvgPool2d&lt;br /&gt;
|Applies a 2D adaptive average pooling over an input signal composed of several input planes.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AdaptiveAvgPool3d&lt;br /&gt;
|Applies a 3D adaptive average pooling over an input signal composed of several input planes.&lt;br /&gt;
|}&lt;br /&gt;
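A small sketch of MaxPool2d on a hand-made 4&amp;#x00D7;4 input: a 2&amp;#x00D7;2 window (with its default stride of 2) keeps the maximum of each non-overlapping block.&lt;br /&gt;

```python
import torch

x = torch.arange(16.0).reshape(1, 1, 4, 4)  # values 0..15 in a 4x4 grid

pool = torch.nn.MaxPool2d(kernel_size=2)  # stride defaults to kernel_size
y = pool(x)  # each 2x2 block collapses to its maximum
```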
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#padding-layers Padding Layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.ReflectionPad1d&lt;br /&gt;
|Pads the input tensor using the reflection of the input boundary.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ReflectionPad2d&lt;br /&gt;
|Pads the input tensor using the reflection of the input boundary.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ReflectionPad3d&lt;br /&gt;
|Pads the input tensor using the reflection of the input boundary.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ReplicationPad1d&lt;br /&gt;
|Pads the input tensor using replication of the input boundary.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ReplicationPad2d&lt;br /&gt;
|Pads the input tensor using replication of the input boundary.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ReplicationPad3d&lt;br /&gt;
|Pads the input tensor using replication of the input boundary.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ZeroPad1d&lt;br /&gt;
|Pads the input tensor boundaries with zero.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ZeroPad2d&lt;br /&gt;
|Pads the input tensor boundaries with zero.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ZeroPad3d&lt;br /&gt;
|Pads the input tensor boundaries with zero.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ConstantPad1d&lt;br /&gt;
|Pads the input tensor boundaries with a constant value.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ConstantPad2d&lt;br /&gt;
|Pads the input tensor boundaries with a constant value.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ConstantPad3d&lt;br /&gt;
|Pads the input tensor boundaries with a constant value.&lt;br /&gt;
|}&lt;br /&gt;
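As a sketch of what padding layers do to shapes (input values made up): ZeroPad2d with padding 1 grows each spatial side by one zero-filled border.&lt;br /&gt;

```python
import torch

x = torch.ones((1, 1, 2, 2))

pad = torch.nn.ZeroPad2d(1)  # one row/column of zeros on every side
y = pad(x)                   # (1, 1, 2, 2) -> (1, 1, 4, 4)
```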
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity Non-linear Activations (weighted sum, nonlinearity)] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.ELU&lt;br /&gt;
|Applies the Exponential Linear Unit (ELU) function, element-wise, as described in the paper: Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Hardshrink&lt;br /&gt;
|Applies the Hard Shrinkage (Hardshrink) function element-wise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Hardsigmoid&lt;br /&gt;
|Applies the Hardsigmoid function element-wise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Hardtanh&lt;br /&gt;
|Applies the HardTanh function element-wise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Hardswish&lt;br /&gt;
|Applies the Hardswish function, element-wise, as described in the paper: Searching for MobileNetV3.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.LeakyReLU.html#torch.nn.LeakyReLU torch.nn.LeakyReLU]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies the element-wise function:&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LogSigmoid&lt;br /&gt;
|Applies the element-wise function:&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MultiheadAttention&lt;br /&gt;
|Allows the model to jointly attend to information from different representation subspaces as described in the paper: Attention Is All You Need.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.PReLU&lt;br /&gt;
|Applies the element-wise function:&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html#torch.nn.ReLU torch.nn.ReLU]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies the rectified linear unit function element-wise:&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.ReLU6&lt;br /&gt;
|Applies the element-wise function:&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.RReLU&lt;br /&gt;
|Applies the randomized leaky rectified linear unit function, element-wise, as described in the paper:&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.SELU&lt;br /&gt;
|Applies the SELU function element-wise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.CELU&lt;br /&gt;
|Applies the element-wise function…&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.GELU&lt;br /&gt;
|Applies the Gaussian Error Linear Units function:&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Sigmoid.html#torch.nn.Sigmoid torch.nn.Sigmoid]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies the element-wise function:…&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.SiLU&lt;br /&gt;
|Applies the Sigmoid Linear Unit (SiLU) function, element-wise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Mish&lt;br /&gt;
|Applies the Mish function, element-wise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Softplus&lt;br /&gt;
|Applies the Softplus function&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Softshrink&lt;br /&gt;
|Applies the soft shrinkage function elementwise:&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Softsign&lt;br /&gt;
|Applies the element-wise function:…&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Tanh.html#torch.nn.Tanh torch.nn.Tanh]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies the Hyperbolic Tangent (Tanh) function element-wise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Tanhshrink&lt;br /&gt;
|Applies the element-wise function:&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Threshold&lt;br /&gt;
|Thresholds each element of the input Tensor.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.GLU&lt;br /&gt;
|Applies the gated linear unit function&lt;br /&gt;
|}&lt;br /&gt;
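These activation layers are just element-wise functions wrapped as modules. A sketch with ReLU (input values made up): negative entries are clamped to zero, positive entries pass through.&lt;br /&gt;

```python
import torch

relu = torch.nn.ReLU()

x = torch.tensor([-1.0, 0.0, 2.0])
y = relu(x)  # max(0, x) applied element-wise
```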
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#non-linear-activations-other Non-linear Activations (other)] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.Softmin&lt;br /&gt;
|Applies the Softmin function to an n-dimensional input Tensor rescaling them so that the elements of the n-dimensional output Tensor lie in the range [0, 1] and sum to 1.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html#torch.nn.Softmax torch.nn.Softmax]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies the Softmax function to an n-dimensional input Tensor rescaling them so that the elements of the n-dimensional output Tensor lie in the range [0,1] and sum to 1.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.Softmax2d&lt;br /&gt;
|Applies SoftMax over features to each spatial location.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LogSoftmax&lt;br /&gt;
|Applies the LogSoftmax&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AdaptiveLogSoftmaxWithLoss&lt;br /&gt;
|Efficient softmax approximation as described in Efficient softmax approximation for GPUs by Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, and Hervé Jégou.&lt;br /&gt;
|}&lt;br /&gt;
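The key property of Softmax, sketched on random logits (shape made up): along the chosen dimension, the outputs are non-negative and sum to 1.&lt;br /&gt;

```python
import torch

softmax = torch.nn.Softmax(dim=1)

logits = torch.randn(2, 5)   # e.g. 2 samples, 5 classes
p = softmax(logits)          # each row is now a probability distribution
row_sums = p.sum(dim=1)
```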
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#normalization-layers Normalization Layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html#torch.nn.BatchNorm1d torch.nn.BatchNorm1d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies Batch Normalization over a 2D or 3D input as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift .&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html#torch.nn.BatchNorm2d torch.nn.BatchNorm2d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies Batch Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift .&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm3d.html#torch.nn.BatchNorm3d torch.nn.BatchNorm3d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies Batch Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift .&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyBatchNorm1d&lt;br /&gt;
|A torch.nn.BatchNorm1d module with lazy initialization of the num_features argument of the BatchNorm1d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyBatchNorm2d&lt;br /&gt;
|A torch.nn.BatchNorm2d module with lazy initialization of the num_features argument of the BatchNorm2d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyBatchNorm3d&lt;br /&gt;
|A torch.nn.BatchNorm3d module with lazy initialization of the num_features argument of the BatchNorm3d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.GroupNorm&lt;br /&gt;
|Applies Group Normalization over a mini-batch of inputs as described in the paper Group Normalization&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.SyncBatchNorm&lt;br /&gt;
|Applies Batch Normalization over a N-Dimensional input (a mini-batch of [N-2]D inputs with additional channel dimension) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift .&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.InstanceNorm1d&lt;br /&gt;
|Applies Instance Normalization over a 2D (unbatched) or 3D (batched) input as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.InstanceNorm2d&lt;br /&gt;
|Applies Instance Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.InstanceNorm3d&lt;br /&gt;
|Applies Instance Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension) as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyInstanceNorm1d&lt;br /&gt;
|A torch.nn.InstanceNorm1d module with lazy initialization of the num_features argument of the InstanceNorm1d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyInstanceNorm2d&lt;br /&gt;
|A torch.nn.InstanceNorm2d module with lazy initialization of the num_features argument of the InstanceNorm2d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LazyInstanceNorm3d&lt;br /&gt;
|A torch.nn.InstanceNorm3d module with lazy initialization of the num_features argument of the InstanceNorm3d that is inferred from the input.size(1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LayerNorm&lt;br /&gt;
|Applies Layer Normalization over a mini-batch of inputs as described in the paper Layer Normalization&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LocalResponseNorm&lt;br /&gt;
|Applies local response normalization over an input signal composed of several input planes, where channels occupy the second dimension.&lt;br /&gt;
|}&lt;br /&gt;
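A sketch of BatchNorm1d in training mode (batch size and feature count made up): with its default affine parameters, each feature is normalized to roughly zero mean over the batch.&lt;br /&gt;

```python
import torch

bn = torch.nn.BatchNorm1d(num_features=3)  # module starts in training mode

x = torch.randn(8, 3)          # (batch, features)
y = bn(x)                      # normalized per feature over the batch
feature_means = y.mean(dim=0)  # should be ~0 for every feature
```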
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#recurrent-layers Recurrent Layers] ===&lt;br /&gt;
RNN, GRU, LSTM and such live here. If you don’t know what these mean, then you don’t need them…&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.RNNBase&lt;br /&gt;
|Base class for RNN modules (RNN, LSTM, GRU).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.RNN.html#torch.nn.RNN torch.nn.RNN]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html#torch.nn.LSTM torch.nn.LSTM]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.GRU.html#torch.nn.GRU torch.nn.GRU]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.RNNCell&lt;br /&gt;
|An Elman RNN cell with tanh or ReLU non-linearity.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.LSTMCell&lt;br /&gt;
|A long short-term memory (LSTM) cell.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.GRUCell&lt;br /&gt;
|A gated recurrent unit (GRU) cell&lt;br /&gt;
|}&lt;br /&gt;
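In case you do need them, a shape-only sketch of an LSTM (all sizes made up): with batch_first=True the input is (batch, sequence, features) and the output keeps the first two dimensions while the last becomes hidden_size.&lt;br /&gt;

```python
import torch

lstm = torch.nn.LSTM(input_size=4, hidden_size=8, batch_first=True)

x = torch.zeros((2, 5, 4))     # (batch, sequence length, input features)
out, (h, c) = lstm(x)          # out: per-time-step hidden states
                               # h, c: final hidden and cell states
```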
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#transformer-layers Transformer Layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.Transformer&lt;br /&gt;
|A transformer model.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.TransformerEncoder&lt;br /&gt;
|TransformerEncoder is a stack of N encoder layers.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.TransformerDecoder&lt;br /&gt;
|TransformerDecoder is a stack of N decoder layers&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.TransformerEncoderLayer&lt;br /&gt;
|TransformerEncoderLayer is made up of self-attn and feedforward network.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.TransformerDecoderLayer&lt;br /&gt;
|TransformerDecoderLayer is made up of self-attn, multi-head-attn and feedforward network.&lt;br /&gt;
|}&lt;br /&gt;
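A shape-only sketch of a single encoder layer (d_model, head count and sequence length made up): a TransformerEncoderLayer maps a (batch, sequence, d_model) input to an output of the same shape.&lt;br /&gt;

```python
import torch

layer = torch.nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)

x = torch.randn(2, 10, 16)  # (batch, sequence length, d_model)
y = layer(x)                # self-attention + feedforward, shape preserved
```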
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#linear-layers Linear Layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Identity.html#torch.nn.Identity torch.nn.Identity]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|A placeholder identity operator that is argument-insensitive&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear torch.nn.Linear]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a linear transformation to the incoming data&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Bilinear.html#torch.nn.Bilinear torch.nn.Bilinear]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Applies a bilinear transformation to the incoming data&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html#torch.nn.LazyLinear torch.nn.LazyLinear]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|A torch.nn.Linear module where in_features is inferred.&lt;br /&gt;
|}&lt;br /&gt;
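The workhorse torch.nn.Linear, sketched with made-up sizes: it maps the last dimension from in_features to out_features via y = x W^T + b.&lt;br /&gt;

```python
import torch

linear = torch.nn.Linear(in_features=10, out_features=5)

x = torch.zeros((3, 10))  # (batch, in_features)
y = linear(x)             # (batch, out_features)
```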
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#dropout-layers Dropout Layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html#torch.nn.Dropout torch.nn.Dropout]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Dropout1d.html#torch.nn.Dropout1d torch.nn.Dropout1d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Randomly zero out entire channels (a channel is a 1D feature map).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Dropout2d.html#torch.nn.Dropout2d torch.nn.Dropout2d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Randomly zero out entire channels (a channel is a 2D feature map).&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Dropout3d.html#torch.nn.Dropout3d torch.nn.Dropout3d]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Randomly zero out entire channels (a channel is a 3D feature map)&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.AlphaDropout&lt;br /&gt;
|Applies Alpha Dropout over the input.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.FeatureAlphaDropout&lt;br /&gt;
|Randomly masks out entire channels (a channel is a feature map)&lt;br /&gt;
|}&lt;br /&gt;
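The important detail with dropout, sketched below (p and the input are made up): in training mode elements are zeroed with probability p and the survivors are rescaled by 1/(1-p); in eval mode the layer is the identity. Don’t forget to switch with .train() / .eval().&lt;br /&gt;

```python
import torch

drop = torch.nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()          # training mode: random zeros, survivors scaled by 2
y_train = drop(x)

drop.eval()           # evaluation mode: dropout is disabled
y_eval = drop(x)
```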
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#sparse-layers Sparse Layers] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.Embedding&lt;br /&gt;
|A simple lookup table that stores embeddings of a fixed dictionary and size.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.EmbeddingBag&lt;br /&gt;
|Computes sums or means of ‘bags’ of embeddings, without instantiating the intermediate embeddings.&lt;br /&gt;
|}&lt;br /&gt;
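An Embedding is a trainable lookup table: integer indices in, dense vectors out. A sketch with made-up sizes:&lt;br /&gt;

```python
import torch

# 10 possible indices, each mapped to a 4-dimensional vector
emb = torch.nn.Embedding(num_embeddings=10, embedding_dim=4)

idx = torch.tensor([[1, 2, 3]])  # (batch, sequence of indices)
v = emb(idx)                     # each index replaced by its vector
```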
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#distance-functions Distance Functions] ===&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.CosineSimilarity&lt;br /&gt;
|Returns cosine similarity&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.PairwiseDistance&lt;br /&gt;
|Computes the pairwise distance between input vectors, or between columns of input matrices.&lt;br /&gt;
|}&lt;br /&gt;
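A quick sketch of torch.nn.CosineSimilarity on a batch of 2D vectors; the inputs are chosen so the results are exact:&lt;br /&gt;

```python
import torch

cos = torch.nn.CosineSimilarity(dim=1)  # compare row by row

a = torch.tensor([[1.0, 0.0], [1.0, 1.0]])
b = torch.tensor([[1.0, 0.0], [-1.0, -1.0]])

# identical direction gives 1, opposite direction gives -1
print(cos(a, b))  # tensor([ 1., -1.])
```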
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#loss-functions Loss Functions] ===&lt;br /&gt;
There are a huge number of loss functions and I will only list a few selected ones. However, in 90% of the cases you will only use:&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html#torch.nn.MSELoss torch.nn.MSELoss]&lt;br /&gt;
* [https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss torch.nn.CrossEntropyLoss]&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|torch.nn.L1Loss&lt;br /&gt;
|Creates a criterion that measures the mean absolute error (MAE) between each element in the input&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html#torch.nn.MSELoss torch.nn.MSELoss]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html#torch.nn.CrossEntropyLoss torch.nn.CrossEntropyLoss]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|This criterion computes the cross entropy loss between input logits and target.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.CTCLoss&lt;br /&gt;
|The Connectionist Temporal Classification loss.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.NLLLoss&lt;br /&gt;
|The negative log likelihood loss.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.PoissonNLLLoss&lt;br /&gt;
|Negative log likelihood loss with Poisson distribution of target.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.GaussianNLLLoss&lt;br /&gt;
|Gaussian negative log likelihood loss.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.KLDivLoss&lt;br /&gt;
|The Kullback-Leibler divergence loss.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.BCELoss&lt;br /&gt;
|Creates a criterion that measures the Binary Cross Entropy between the target and the input probabilities.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.BCEWithLogitsLoss&lt;br /&gt;
|This loss combines a Sigmoid layer and the BCELoss in one single class.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MarginRankingLoss&lt;br /&gt;
|Creates a criterion that measures the loss given inputs x1, x2 and a label tensor y containing 1 or -1.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.HingeEmbeddingLoss&lt;br /&gt;
|Measures the loss given an input tensor x and a labels tensor y (containing 1 or -1).&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MultiLabelMarginLoss&lt;br /&gt;
|Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss)&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.HuberLoss&lt;br /&gt;
|Creates a criterion that uses a squared term if the absolute element-wise error falls below delta and a delta-scaled L1 term otherwise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.SmoothL1Loss&lt;br /&gt;
|Creates a criterion that uses a squared term if the absolute element-wise error falls below beta and an L1 term otherwise.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.SoftMarginLoss&lt;br /&gt;
|Creates a criterion that optimizes a two-class classification logistic loss&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MultiLabelSoftMarginLoss&lt;br /&gt;
|Creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.CosineEmbeddingLoss&lt;br /&gt;
|Creates a criterion that measures the loss given input tensors x1, x2 and a tensor label y with values 1 or -1.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.MultiMarginLoss&lt;br /&gt;
|Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss)&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.TripletMarginLoss&lt;br /&gt;
|Creates a criterion that measures the triplet loss given input tensors x1, x2, x3 and a margin greater than 0.&lt;br /&gt;
|-&lt;br /&gt;
|torch.nn.TripletMarginWithDistanceLoss&lt;br /&gt;
|Creates a criterion that measures the triplet loss given input tensors a, p and n (anchor, positive and negative examples, respectively).&lt;br /&gt;
|}&lt;br /&gt;
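The two workhorses from the list above in a minimal sketch. The key point for CrossEntropyLoss is that it expects raw logits (it applies log-softmax internally), not probabilities:&lt;br /&gt;

```python
import torch

# MSELoss: mean of element-wise squared errors, typical for regression
mse = torch.nn.MSELoss()
pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 2.0, 5.0])
print(mse(pred, target))  # (0 + 0 + 4) / 3, about 1.3333

# CrossEntropyLoss: takes raw logits and integer class indices
ce = torch.nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 0.5, 0.1]])  # one sample, three classes, no softmax applied
labels = torch.tensor([0])                # correct class index for that sample
print(ce(logits, labels))
```

Putting a softmax in front of CrossEntropyLoss is a common bug; if you really have log-probabilities, use torch.nn.NLLLoss instead.&lt;br /&gt;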
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#utilities Utilities] ===&lt;br /&gt;
In this category you will find a lot of utility functions… A lot!&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html#torch.nn.Flatten torch.nn.Flatten]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Flattens a contiguous range of dims into a tensor.&lt;br /&gt;
|-&lt;br /&gt;
|&amp;#039;&amp;#039;&amp;#039;[https://pytorch.org/docs/stable/generated/torch.nn.Unflatten.html#torch.nn.Unflatten torch.nn.Unflatten]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
|Unflattens a tensor dim expanding it to a desired shape.&lt;br /&gt;
|}&lt;br /&gt;
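A short sketch of Flatten and Unflatten, e.g. for feeding conv features into a linear layer; the batch and image sizes are illustrative:&lt;br /&gt;

```python
import torch

x = torch.zeros(8, 3, 28, 28)  # e.g. a batch of 8 RGB 28x28 images

flatten = torch.nn.Flatten()   # default start_dim=1: keep the batch dim, merge the rest
flat = flatten(x)
print(flat.shape)              # torch.Size([8, 2352]), since 3 * 28 * 28 = 2352

unflatten = torch.nn.Unflatten(dim=1, unflattened_size=(3, 28, 28))
print(unflatten(flat).shape)   # torch.Size([8, 3, 28, 28]), the original shape
```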
&lt;br /&gt;
=== [https://pytorch.org/docs/stable/nn.html#quantized-functions Quantization] ===&lt;br /&gt;
The probability that you will need it is low, but I list it here because we are working on it. And in case I ever need to find the [https://pytorch.org/docs/stable/quantization.html#quantization-doc link]…&lt;/div&gt;</summary>
		<author><name>Davrot</name></author>
	</entry>
</feed>