      Quick Overview of Neural Architectures

      I know it has been done a thousand times before, but a Deep Learning forum couldn't possibly do without such a topic 😉 At the same time, it will serve as an explanation of what the sub-categories mean.

      Convolutional Neural Networks (CNNs)

      As the name suggests, all networks which employ the operation of convolution belong to this category (unless a more specific one is available, e.g. Graph Neural Networks). To a first approximation, one can say that a characteristic trait of CNNs is that sets of convolution kernels are applied indiscriminately to each element of the input and intermediate layers. These sets usually vary between layers, aiming to detect more abstract regularities in the data as the depth increases. The three most typical parameters of a CNN layer are the kernel size, dilation and stride. The latter two go beyond the approximation above: dilation spreads out the kernel so that some neighbouring elements are skipped within an individual convolution, while stride moves the kernel in larger steps so that some positions are not visited at all. CNNs are best known for, and used profusely in, image classification, object detection and image segmentation. Examples of famous CNNs include ResNet, VGG and AlexNet.
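
      To make this concrete, here is a minimal sketch of a tiny CNN, assuming PyTorch is available. The class name `TinyCNN`, the layer sizes and the 10-class output are purely illustrative; the point is only to show how kernel size, stride and dilation enter as layer parameters and how the same small kernels are slid over every position of the input.

      ```python
      import torch
      import torch.nn as nn

      # A toy CNN block: two convolutional layers followed by a classifier head.
      # All sizes are hypothetical and chosen for illustration only.
      class TinyCNN(nn.Module):
          def __init__(self, num_classes: int = 10):
              super().__init__()
              # kernel_size, stride and dilation are the three parameters
              # mentioned above; the same 3x3 kernels are applied at every
              # spatial position of the input / intermediate feature maps.
              self.features = nn.Sequential(
                  nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1),
                  nn.ReLU(),
                  nn.Conv2d(16, 32, kernel_size=3, stride=2, dilation=2, padding=2),
                  nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1),          # global average pooling
              )
              self.classifier = nn.Linear(32, num_classes)

          def forward(self, x: torch.Tensor) -> torch.Tensor:
              h = self.features(x)                  # (N, 32, 1, 1)
              return self.classifier(h.flatten(1))  # (N, num_classes)

      model = TinyCNN()
      logits = model(torch.randn(2, 3, 64, 64))     # two fake RGB images
      print(logits.shape)                           # torch.Size([2, 10])
      ```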

      Transformers (NLP)

      This and the next category might be a bit tricky. Transformers are essentially based on the use of attention to selectively weight different portions of information and produce new representations in consecutive layers. They are most renowned for their use in Natural Language Processing (NLP), with the flagship for a number of years being BERT. In this setting, the input consists of encoded representations of words, word parts and special tokens - the so-called embeddings. They are then transformed across consecutive layers and - based on their aggregation or a selected token - classification/regression-type tasks can be performed. Alternatively, the output representations can be fed to a decoder which acts in the opposite direction and can, for example, output a translation in a different language than the original. In the Transformers category I would like to stick to this type of language model, whereas the Attention-based category is dedicated to other uses of the attention mechanism.
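
      As a rough illustration, here is a minimal sketch of a single self-attention layer over token embeddings, again assuming PyTorch. Everything here is hypothetical (the `TinyEncoderLayer` name, the dimensions, the toy classifier on the first token); real Transformers such as BERT stack many such layers and use multi-head attention, feed-forward blocks and positional information on top of this core operation.

      ```python
      import math
      import torch
      import torch.nn as nn

      def scaled_dot_product_attention(q, k, v):
          """Core attention: each token weights all others by similarity
          and mixes their value vectors accordingly."""
          scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
          weights = torch.softmax(scores, dim=-1)
          return weights @ v

      class TinyEncoderLayer(nn.Module):
          """One self-attention layer producing new token representations."""
          def __init__(self, d_model: int = 64):
              super().__init__()
              self.q = nn.Linear(d_model, d_model)
              self.k = nn.Linear(d_model, d_model)
              self.v = nn.Linear(d_model, d_model)
              self.norm = nn.LayerNorm(d_model)

          def forward(self, x):
              attended = scaled_dot_product_attention(self.q(x), self.k(x), self.v(x))
              return self.norm(x + attended)   # residual connection + normalisation

      # Toy setup: a batch of 2 "sentences", 8 token ids each, turned into embeddings.
      vocab_size, d_model = 1000, 64
      embed = nn.Embedding(vocab_size, d_model)
      layer = TinyEncoderLayer(d_model)

      token_ids = torch.randint(0, vocab_size, (2, 8))
      hidden = layer(embed(token_ids))                      # new per-token representations
      cls_representation = hidden[:, 0]                     # pick a selected token ([CLS]-style)
      logits = nn.Linear(d_model, 2)(cls_representation)    # toy sentence classifier
      print(logits.shape)                                   # torch.Size([2, 2])
      ```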

      Attention-based

      • TODO

      Recurrent Neural Networks

      • TODO

      Autoencoders

      • TODO

      Bayesian Networks

      • TODO

      Generative Adversarial Networks

      • TODO

      Graph Neural Networks

      • TODO