That’s why it’s so important to choose the right deep learning architecture. In the next tutorial I will explain step by step how a neural network works and what backpropagation is, along with a programmatic implementation of a neural network using Python and Keras. In graphs, on the other hand, the fact that the nodes are inter-related via edges creates statistical dependence between samples in the training set. Perceptrons were popularized by Frank Rosenblatt in the early 1960s. This is also widely used as a photo editor on many Android and iOS devices. Simplicity is one of their greatest advantages. GRUs are used for smaller and less frequent datasets, where they show better performance. MobileNet is essentially a streamlined version of the Xception architecture optimized for mobile applications. Deep learning is able to solve a plethora of once impossible problems. RNNs consist of a rich set of deep learning architectures.
Reason 2: Evolution of compute power. I would say this is the most important reason for the rise of deep neural networks: training them requires an enormous number of computations per second, and the evolution of GPUs and TPUs turned that dream into reality, with more still to come. NNs are arranged in layers, stacked one on top of another. This is the formula learned by the neural network; in it, the 32 is termed the bias. We will try to understand each of these use cases in detail in further articles. And deep learning architectures are based on these networks. In the simplest form, NAS is the problem of choosing operations in different layers of a neural network. By Edwin Lisowski | Jul 21, 2020 | Machine Learning. Hochreiter & Schmidhuber (1997) [4] solved the problem of getting a … Weight: This is something the model learns during training. We can think of the architecture of a neural network as similar to the human brain: whatever we see is the input, and depending on the context we judge which inputs are important, that is, what to remember and what to discard. In neural network terms, this corresponds to assigning weights to the inputs with the help of an activation function. In the last part of this deep learning series we learned how to prepare input for neural networks in natural language processing using tokenization and word embeddings. In fact, we can indicate at least six types of neural networks and deep learning architectures that are built on them. Now we will try to understand the basic architecture of neural networks. This feedback allows them to maintain the memory of past inputs and solve problems in time.
Unlike other models, each layer in a DBN learns the entire input. It’s a bit like a machine learning framework: it allows you to make more practical use of this technology, accelerates your work, and enables various endeavors without the need to build an ML algorithm entirely from scratch. Typically, DSNs consist of three or more modules. Pixel to image: This means generating a picture from a drawn sketch. Training a deep convolutional neural network. They can use their internal state (memory) to process variable-length sequences of inputs. Long Short-Term Memory. VGG-16. The input gate controls when new information can flow into the memory. At this point, we should also mention the last, and considered the most straightforward, architecture. Therefore, we can state that a DBN is a stack of RBMs. AlexNet. However, LSTM has feedback connections. It is a multi-layer neural network designed to analyze visual inputs and perform tasks such as image classification, segmentation and object detection, which can be useful for autonomous vehicles. Deep neural networks have become invaluable tools for supervised machine learning, e.g., classification of text or images. You have to know that neural networks are by no means homogeneous. For example, if we give the model the sentence “Parrot is sitting on tree”, it will output an image of a parrot sitting on a tree. With our help, your organization can benefit from deep learning architecture. Autoencoders are mainly used for dimensionality reduction and, naturally, anomaly detection (for instance, fraud). Today, we want to get deeper into this subject. DBNs work holistically and regulate each layer in order. The CNN’s hidden layers typically consist of a series of convolutional layers.
It is the year 1994, and this is one of the very first convolutional neural networks, and what … Simply put, autoencoders condense the input into a lower-dimensional code. Text-to-image synthesis: This means we give the model a text as input and it generates an image based on that text. A typical LSTM architecture is composed of a cell, an input gate, an output gate, and a forget gate. I decided to start with the basics and build on them. The output layer is also associated with an activation function, which gives the probability of each label. Deep learning using deep neural networks is taking machine intelligence to the next level in computer vision, speech recognition, natural language processing, etc. Considered the first generation of neural networks, perceptrons are simply computational models of a single neuron. This is at a very high level. There are many more applications, such as image colorization, image inpainting, and machine translation. Given enough labeled training datasets and suitable models, deep learning approaches can help humans establish mapping functions for operation convenience. Each module consists of an input layer, a hidden layer, and an output layer. Many of today’s core architectures were developed in the 1990s: LSTM (Long Short-Term Memory) in 1997 and CNNs (Convolutional Neural Networks) in 1998. We saved DSN for last because this deep learning architecture is different from the others. Today, we can indicate six of the most common deep learning architectures. Don’t worry if you don’t know these abbreviations; we are going to explain each one of them. Soon, abbreviations like RNN, CNN, or DSN will no longer be mysterious. This is the primary job of a neural network: to transform input into a meaningful output.
This video describes the variety of neural network architectures available to solve various problems in science and engineering. However, there’s also the other side of the coin. Activation function: This can be understood as a kind of threshold that determines whether a neuron is activated. For example, if we give the model an image of a boy using a laptop, it will decode the image into the output text “boy using laptop”. It’s a type of LSTM. This architecture is commonly used for image processing, image recognition, video analysis, and NLP. The different types of neural network architectures are: Single Layer Feed Forward Network. Output layer: This is the last layer of the neural network and is responsible for the prediction. When it comes to deep learning, you have various types of neural networks. At a very high level, based on this value the network decides how important each input is and whether a given input should be used at all. Simplifying deep neural networks for neuromorphic architectures. I will explain every term related to deep learning in my next article. This construction enables DSNs to learn more complex classification than would be possible with just one module. DBNs use probabilities and unsupervised learning to produce outputs. We can apply object detection to traffic in a metropolitan city. Let’s talk for a second about autoencoders. This can be explained with the picture below. This architecture has been designed to improve the training issue, which is quite complicated when it comes to traditional deep learning models. In my next tutorial I will use exactly this use case and explain every step of implementing this conversion using Keras and a fully connected layer, i.e. the Dense layer in Keras.
This makes them useful when it comes to, for instance, speech recognition[1]. As you can see, although deep learning architectures are, generally speaking, based on the same idea, there are various ways to achieve a goal. Let … A DBN is a multilayer network (typically deep, including many hidden layers) in which each pair of connected layers is a Restricted Boltzmann Machine (RBM). RNNs are very useful when it comes to fields where the sequence of presented information is key. This indicates that biological neural networks are, to some degree, architecture agnostic. Weibo Liu, Zidong Wang, Xiaohui Liu, Nianyin Zeng, Yurong Liu and Fuad E. Alsaadi, “A survey of deep neural network architectures and their applications”, Neurocomputing, vol. 234, 2017, … Neural networks are complex structures made of artificial neurons that can take in multiple inputs to produce a single output. CNNs consist of an input and an output layer, as well as multiple hidden layers. Our team of experts will turn your data into business insights. Question answering: This is also one of the most important use cases of NLP. Here we train the model on sequences of questions and answers so that it learns the sequence and can then be used to answer new questions. The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. In this series we will try to understand the core concepts of deep neural networks, the rise of neural networks, and what neural networks can do, that is, which tasks we can accomplish by applying neural network concepts in industry. However, they are vulnerable to input adversarial attacks preventing them from being autonomously deployed in critical applications. All the nodes of the input layer are connected to the nodes of the hidden layers.
We will look at deep architectures more closely once we cover supervised, unsupervised and semi-supervised learning in a later article. We mainly use an RNN as both the encoder and the decoder in these use cases. First of all, we have to state that deep learning architecture consists of deep/neural networks of varying topologies. As we are aware, we will soon be entering the world of quantum computing. AlexNet is the first deep architecture which was introduced by one of the pioneers in deep … Deep neural networks (DNNs), which employ deep architectures in NNs, can represent functions with higher complexity if the numbers of layers and units in a single layer are increased. Each input (for instance, an image) will pass through a series of convolution layers with various filters. With the above example in mind, we will now look at the basic architecture of neural networks. This means that it can process not only single data points (such as images) but also entire sequences of data (such as audio or video files)[3]. RNN: Recurrent Neural Networks. With the three reasons above, we have seen why deep learning has become popular recently. Chatbots are among the most important use cases and are widely used in the industry nowadays. We discuss various architectures that support DNN executions in terms of computing units, dataflow optimization, targeted network topologies, architectures on emerging technologies, and accelerators for emerging applications. These six architectures are the most common ones in the modern deep learning architecture world, along with the different terms associated with neural networks. Let’s get started. Image captioning: This is one of the most important use cases of deep learning. Here we give an image to the network, and the network interprets the image and adds a caption to it.
At the time of its introduction, this model was considered to be very deep. Currently, we can indicate two types of RNN. I recommend going through the ImageNet website and exploring what is there. Every hidden layer is associated with an activation function. As a result, you can classify the output. We will then move on to understanding the different Deep Learning Architectures, including how to set up your architecture … I will start with a confession: there was a time when I didn’t really understand deep learning. Now that we’ve seen some of the components of deep networks, let’s take a look at the four major architectures of deep networks and how we use the smaller networks to build them. What does it mean? In 1969, Minsky and Papert published a book called “Perceptrons” that analyzed what they could do and showed their limitations. The general principle is that neural networks are based on several layers that process data: an input layer (raw data), hidden layers (they process and combine input data), and an output layer (it produces the outcome: result, estimation, forecast, etc.). However, artificial networks rely on their fine-tuned weights and hand-crafted architectures for their remarkable performance. We can have multiple hidden layers in the network. This is an example of the encoder-decoder architecture of deep neural networks. This is a widely used application of deep learning nowadays, and there are many use cases for object detection. Deep RNN: Multiple layers are present. Reason 1: Availability of large amounts of data. This is one of the reasons for the evolution of deep learning. It’s also a type of RNN. I would look at the research papers and articles on the topic and feel like it is a very complex topic.
This tutorial provides a brief recap on the basics of deep neural networks and is for those who are interested in understanding how those models are mapping to hardware architectures. This is also one of the most important use cases, which we will discuss later. Coming to ImageNet, it is a huge image repository containing more than 1 million images across 1000 categories. DSNs are also frequently called DCN (Deep Convex Network). Various deep learning techniques (LeCun et al., 1998; Srivastava et al., 2014; Ioffe and Szegedy, 2015) enable the effective optimization of deep ANNs by constructing multiple levels of feature hierarchies and show remarkable results, which occasionally outperform human-level performance (Krizhevsky et al., 20… For each DNN, multiple performance indices are observed, such as recognition accuracy, model complexity, computational complexity, memory usage, and inference time. The data produced in 2019 exceeds all the data produced between 2000 and 2018, and the data produced by the end of 2020 will exceed all the data produced between 2000 and 2019. Reason 3: Ability to run matrix multiplication on GPUs. This is related to the second reason mentioned above. These modules are stacked one on top of another, which means that the input of a given module is based on the output of prior modules/layers. Abstract: This paper presents an in-depth analysis of the majority of the deep neural networks (DNNs) proposed in the state of the art for image recognition. Architecture Disentanglement for Deep Neural Networks. If you don’t, the information that comes out of the autoencoder can be unclear or biased.
The first layer is known as the input layer: through this layer we pass all the desired input to the model. The input then goes through the hidden layers, and after all the calculations in the hidden layers the result is passed to the output layer for prediction and re-learning. The cell remembers values over arbitrary time intervals, and these three gates regulate the flow of information into and out of the cell. Autoencoders are a specific type of feedforward neural network. The VGG network, introduced in 2014, offers a deeper yet simpler variant of the convolutional structures discussed above. Here we understand how neural networks work and the benefits they offer for supervised as well as unsupervised learning before building our very own neural network. What is the basic architecture of a neural network at a very high level? Now we will feed this input and output to our network, and the network will assign weights to these inputs by itself, based on their importance. Hidden layers: These are the middle layers of a neural network, also known as the black box. Accordingly, designing efficient hardware architectures for deep neural networks is an important step towards enabling the wide deployment of DNNs in AI systems. Earlier in the book, we introduced four major network architectures: Unsupervised Pretrained Networks (UPNs); Convolutional Neural Networks (CNNs); Recurrent Neural Networks; Recursive Neural Networks. We will look at every activation function in detail, along with its mathematical function and graph, in a later article.
Next, you have to flatten the output and feed it into the fully connected layer, where every neuron in one layer is connected to every neuron in the subsequent layer. The goal of neural architecture search (NAS) is to find novel networks for new problem domains and criteria automatically and efficiently. In this work, we propose new architectures for Deep Neural Networks (DNN) and exemplarily show their effectiveness for solving supervised Machine Learning (ML) problems; for a general overview about DNN and ML see, e.g., [40,21,1,22] and reference therein. You need high-quality, representative training data. The general idea is that the input and the output are pretty much the same. According to a paper “An Evaluation of Deep Learning Miniature Concerning in Soft Computing”[8] published in 2015, “the central idea of the DSN design relates to the concept of stacking, as proposed originally, where simple modules of functions or classifiers are composed first and then they are stacked on top of each other in order to learn complex functions or classifiers.” I tried understanding neural networks and their various types, but it still looked difficult. Then one day, I decided to take one step at a time. Pruning Deep Convolutional Neural Networks Architectures with Evolution Strategy. For example, if we provide the temperature in Celsius as the input and the temperature in Fahrenheit as the output, the model learns the conversion formula from Celsius to Fahrenheit: Fahrenheit = (Celsius × 9/5) + 32. If you want to find out more about this tremendous technology, get in touch with us. Thanks to many layers, DSNs consider training not as a single problem that has to be solved, but as a set of individual problems. Before that, we will try to understand at a high level what a neural network does and, basically, the concept of weight in neural networks.
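The Celsius-to-Fahrenheit example above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the Keras implementation announced for the next tutorial: the function name `train_linear_neuron`, the sample temperatures, the learning rate, the number of epochs and the input scaling are all assumptions chosen for this sketch. It trains a single neuron (one weight, one bias, no activation) with batch gradient descent on the mean squared error.

```python
def train_linear_neuron(samples, learning_rate=0.1, epochs=5000, scale=100.0):
    """Fit fahrenheit ~= weight * celsius + bias with batch gradient descent.

    Inputs are divided by `scale` (an assumed constant) so that plain
    gradient descent converges quickly; the weight is rescaled on return.
    """
    weight, bias = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for celsius, fahrenheit in samples:
            x = celsius / scale
            error = (weight * x + bias) - fahrenheit
            # Gradients of the mean squared error w.r.t. weight and bias.
            grad_w += 2.0 * error * x / n
            grad_b += 2.0 * error / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight / scale, bias


# Hypothetical training set: inputs in Celsius, targets in Fahrenheit.
celsius_values = [-40, -10, 0, 8, 15, 22, 38, 100]
samples = [(c, c * 9 / 5 + 32) for c in celsius_values]

weight, bias = train_linear_neuron(samples)
print(weight, bias)
```

After training, the learned weight should be close to 9/5 and the learned bias close to 32, which is the sense in which the network "learns the formula" from examples alone.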
In CNNs, the first layers only filter inputs for basic features, and the later layers recombine all the simple patterns found by the previous layers. By training the neural network with lots of examples of this type, my model will also develop human-like judgment and will give less importance to my name and more importance to “how much I studied”; this is a basic example for understanding the concept of weight in neural networks. This is again an encoder-decoder architecture, in which we give an image as input, the image is encoded by a CNN, and the encoded output is then given to an RNN that decodes it as text. The deep learning networks mainly used for this use case are GANs. Let’s say that RNNs have a memory. I will walk you through the deep architecture of GANs in a later article. DSN/DCN comprises a deep network, but it’s actually a set of individual deep networks. At a very high level, when an input is passed to the neural network, the model assigns a value to that input based on its importance, and that value is nothing but a weight. As I understand it, the weight for “how much I studied” will be larger, because this is the important factor in whether I pass the exam or not, and the weight for the input “my name” will be smaller, because a name does not decide whether a person passes the exam. Many people thought these limitations applied to all neural network models. Authors: Francisco E. Fernandes Jr. and Gary G. Yen. RNN is one of the fundamental network architectures from which other deep learning... LSTM: Long Short-Term Memory. Each node of the hidden layers is connected to the output layer, and the output generated by the hidden layers is transferred to the output layer for evaluation. Also, if anyone is interested in cloud computing, they can go through my blog below for a step-by-step understanding of cloud computing.
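The idea of weights in the exam example can be sketched as a single artificial neuron. This is a hedged illustration only: the feature names, the weight values and the bias below are hand-picked, hypothetical numbers, not values learned from data, and `pass_probability` is a name introduced just for this sketch. A trained network would arrive at such weights by itself.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def pass_probability(features, weights, bias):
    """Weighted sum of the inputs plus bias, passed through a sigmoid."""
    z = sum(weights[name] * value for name, value in features.items()) + bias
    return sigmoid(z)


# Hypothetical weights: study time matters a lot, the name not at all.
weights = {"hours_studied": 0.8, "prior_knowledge": 0.5, "name_length": 0.0}
bias = -4.0

student_a = {"hours_studied": 8, "prior_knowledge": 1, "name_length": 4}
student_b = {"hours_studied": 8, "prior_knowledge": 1, "name_length": 11}
student_c = {"hours_studied": 1, "prior_knowledge": 1, "name_length": 4}

p_a = pass_probability(student_a, weights, bias)
p_b = pass_probability(student_b, weights, bias)
p_c = pass_probability(student_c, weights, bias)
print(p_a, p_b, p_c)
```

Students A and B differ only in their name and get the same prediction, while student C, who studied far less, gets a much lower one. That is what "less weight on the name, more weight on how much I studied" means in practice.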
The major difference is that GRU has fewer parameters than LSTM, as it lacks an output gate[5]. DBNs can be used i.a. As we know, we need to pass a matrix as the input to our neural networks, so we need a large number of matrix calculations, and to perform them we need high computation power or parallel computation. A DBN is composed of multiple layers of latent variables (“hidden units”), with connections between the layers but not between units within each layer[7]. The basic neural network consists of the input layer, weights, bias, activation function, hidden layers and output layer. Neural Network: Architecture. This is something the model learns, and it is also something we provide at input time. Today, LSTMs are commonly used in such fields as text compression, handwriting recognition, speech recognition, gesture recognition, and image captioning[4]. Earlier, we did not have large amounts of data. After the shift from the paper world to the digital world around 2003–04, the generation of data started growing exponentially, and every year it grows even more. Architecture… The control layer controls how the signal flows from one layer to the other. Through this layer we feed the prepared input and the corresponding labels to the model. There are mainly three reasons why deep neural networks became popular after 2010; we will try to understand them one by one. They are easy to build and train. A model is simply a mathematical object or entity that contains some theoretical background on AI to be able to learn from a dataset. Popular models in supervised learning include decision trees, support vector machines, and of course, neural networks (NNs).
Let’s start with the first one. Architecture of Neural Networks: We found a non-linear model by combining two linear models with some equation, weight, bias, and sigmoid function. We have now seen when neural networks evolved. RNN is one of the fundamental network architectures from which other deep learning architectures are built. In this case, what inputs can we think of? To make it very simple, suppose tomorrow is my exam and we have to predict whether I am going to pass it or not. In this case our desired output y is 0 (fail the exam) or 1 (pass the exam). They appeared to have a very powerful learning algorithm and lots of grand claims were made for what they could learn to do. Object detection: This basically means localizing and classifying each object in the image. Each network within DSN has its own hidden layers that process data. Every processed information is captured, stored, and utilized to calculate the final outcome. LSTM derives from neural network architectures and is based on the concept of a memory cell. In this article, we are going to show you the most popular and versatile types of deep learning architecture. Encoder (condenses the input and produces the code), Decoder (rebuilds the input using the code). The input and output are both fed to the network at the time of model training. The advanced model for this use case is CycleGAN, which is generally used in image-to-image translation. We will discuss such encoder-decoder architectures further in a later article. I want to make it very clear that neural networks are not something that has evolved only recently. Paper: ImageNet Classification with Deep Convolutional Neural Networks. Deep learning is represented by a spectrum of architectures that can build solutions for a range of problem areas. As a result, the DL model can extract more hierarchical information.
Deep Neural Networks (DNNs) are central to deep learning, and understanding their internal working mechanism is crucial if they are to be used for emerging applications in medical and industrial AI. DOI: 10.1016/j.neucom.2016.12.038 Corpus ID: 207116476. We can use this application for virtual attendance systems and in hospitals. Here’s how CNNs work: First, the input is received by the network. The output gate controls when the information that is contained in the cell is used in the output. Different Types of Neural Network Architecture. A Convolutional Neural Network (CNN) is a deep learning algorithm that can recognize and classify features in images for computer vision. in image recognition and NLP. One of autoencoders’ main tasks is to identify and determine what constitutes regular data and then identify the anomalies or aberrations. Deep learning, that is, the concepts of neural networks, mostly started becoming popular after 2012, when AlexNet, developed by Alex Krizhevsky, Ilya Sutskever and Geoffrey Hinton at the University of Toronto, was introduced and was able to classify images correctly among the 1000 labels of the ImageNet dataset. Thanks to the development of numerous layers of neural networks (each providing some function), deep learning is now more practical. So just imagine how rapidly we are entering the world of big data. Over the last few years, deep learning has made tremendous progress and has become a prevalent tool for performing various cognitive tasks such as object detection, speech recognition, and reasoning. While often offering superior results over traditional techniques and successfully expressing complicated patterns in data, deep architectures are known to be challenging to design and train such that they generalize well to new data. Virtually every deep neural network architecture is nowadays trained using mini-batches. We have seen the most important use cases of neural networks listed above.
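The way autoencoders flag anomalies (condense the input into a code, rebuild it, and look at how badly the rebuild fails) can be sketched with a deliberately tiny example. Everything here is an assumption made for illustration: the "regular" data is taken to be 2-D points lying close to the line y = x, the encoder and decoder are hand-set linear maps rather than trained networks, and the threshold of 0.5 is arbitrary. A real autoencoder would learn its encoder and decoder from representative training data.

```python
def encode(point):
    """Condense a 2-D point into a 1-D code (hand-set, not learned)."""
    x, y = point
    return (x + y) / 2.0


def decode(code):
    """Rebuild a 2-D point from the 1-D code."""
    return (code, code)


def reconstruction_error(point):
    """Squared distance between the input and its reconstruction."""
    rebuilt_x, rebuilt_y = decode(encode(point))
    return (point[0] - rebuilt_x) ** 2 + (point[1] - rebuilt_y) ** 2


def is_anomaly(point, threshold=0.5):
    """Flag inputs that the encoder/decoder pair cannot rebuild well."""
    return reconstruction_error(point) > threshold


regular_point = (1.0, 1.1)    # close to the pattern the code can represent
unusual_point = (1.0, -1.0)   # far from that pattern

print(reconstruction_error(regular_point), is_anomaly(regular_point))
print(reconstruction_error(unusual_point), is_anomaly(unusual_point))
```

Because the code has fewer dimensions than the input, only inputs that follow the regular pattern survive the round trip; the large reconstruction error of the unusual point is what marks it as an anomaly.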
In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural networks, most commonly applied to analyzing visual imagery. Now we will try to understand where deep learning is mostly used nowadays, going through the applications of deep learning one by one. They are commonly used in NLP (i.a. Let’s start with an illustration to better understand the architecture of a neural network and a deep neural network. To help you understand, I want to share one piece of information. To start we chose the state-of-the-art fast style-transfer neural network from Ghiasi and colleagues. This abbreviation stands for Gated Recurrent Unit. The inputs could be “how much I studied”, “how smart I am”, “my previous knowledge”, “my name”. An overview of UNAS training and deployment on the target devices. Also, if you want to understand more about tokenization and word embeddings, you can go through the link below for a step-by-step explanation. Exposing the Robustness and Vulnerability of Hybrid 8T-6T SRAM Memory Architectures to Adversarial Attacks in Deep Neural Networks. It provides highly tuned implementations of neural network operations such as backpropagation, pooling, normalization and many more. As you know from our previous article about machine learning and deep learning, DL is an advanced technology based on neural networks that try to imitate the way the human cortex works. The forget gate controls when a piece of information can be forgotten, allowing the cell to process new data. The memory cell can retain its value for a short or long time as a function of its inputs, which allows the cell to remember what’s essential and not just its last computed value. Codeless Deep Learning with KNIME: Build, train and deploy various deep neural network architectures using KNIME Analytics Platform. KNIME Analytics Platform is open source software used to create and design data science workflows.
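The interplay of the memory cell with the input, forget and output gates can be sketched as a single LSTM time step on scalar values. This follows the commonly used LSTM formulation; the function name `lstm_step`, the parameter layout (one `(w_x, w_h, b)` triple per gate) and the specific numbers are assumptions made for this sketch, and real implementations work on vectors and learn the parameters during training.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step on scalars.

    params maps each of "forget", "input", "output", "candidate"
    to a (w_x, w_h, b) triple (hypothetical layout for this sketch).
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    forget_gate = sigmoid(pre_activation("forget"))   # how much old memory to keep
    input_gate = sigmoid(pre_activation("input"))     # how much new information to admit
    output_gate = sigmoid(pre_activation("output"))   # how much of the cell to expose
    candidate = math.tanh(pre_activation("candidate"))

    c = forget_gate * c_prev + input_gate * candidate  # updated memory cell
    h = output_gate * math.tanh(c)                     # new hidden state / output
    return h, c


# Hand-set parameters: forget gate saturated open, input gate saturated shut.
remember = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
# The opposite setting: drop the old memory, take in the new input.
overwrite = dict(remember, forget=(0.0, 0.0, -20.0), input=(0.0, 0.0, 20.0))

h_keep, c_keep = lstm_step(x=5.0, h_prev=0.0, c_prev=0.7, params=remember)
h_new, c_new = lstm_step(x=5.0, h_prev=0.0, c_prev=0.7, params=overwrite)
print(c_keep, c_new)
```

With the first parameter set the cell keeps its previous value almost unchanged regardless of the new input, and with the second it replaces that value with the new candidate. Learned gates sit between these extremes, which is how the cell can retain a value for a short or a long time.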
They are also known as shift invariant or space invariant artificial neural networks (SIANN), based on their shared-weights architecture and translation invariance characteristics. Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns. Abstract: Deep convolutional neural networks (DCNNs) have been successfully used in many computer vision tasks. A CNN can take in an input image, assign importance to various aspects/objects in the image, and be able to differentiate one from the others[6]. chatbots), speech synthesis, and machine translations. Input layer: This is the first layer of any neural network. The name ‘convolutional’ derives from a mathematical operation involving the convolution of different functions. Bidirectional RNN: They work two ways; the output layer can get information from past and future states simultaneously[2]. Although building these types of deep architectures can be complex, various open source solutions, such as Caffe, Deeplearning4j, TensorFlow, and DDL, are available to get you up and running quickly. Bias: At a high level, this is also something the model learns. In this model, the code is a compact version of the input. Moreover, the recurrent network might have connections that feed back into prior layers (or even into the same layer). In this article, we focus on summarizing the recent advances in accelerator designs for deep neural networks (DNNs), that is, DNN accelerators. During a person's lifetime, numerous distinct neuronal architectures are responsible for performing the same tasks. Based on this, the outcome is produced. Image generation: This means having the neural network generate images of the same kind, that is, if we give an image to the network, it will mimic that image and be able to generate images of the same type. These solutions can be feed-forward focused or recurrent networks that permit consideration of previous inputs.
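The operation that gives convolutional networks their name, and the shared-weights idea behind it, can be sketched in plain Python. The function name `conv2d_valid`, the toy image and the kernel are assumptions chosen for illustration. Note that, as in most deep learning libraries, the kernel is slid over the image without being flipped, which is strictly speaking a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The same four kernel weights are reused at every position, so the filter responds to the edge wherever it appears in the image. In a CNN the kernel values are learned, and many such filters are stacked layer after layer.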
Go deeper into neural networks in this developerWorks tutorial on recurrent … What are the applications of neural networks in the industry? [1] https://en.wikipedia.org/wiki/Recurrent_neural_network, [2] https://en.wikipedia.org/wiki/Bidirectional_recurrent_neural_networks, [3] https://en.wikipedia.org/wiki/Long_short-term_memory, [4] https://developer.ibm.com/technologies/artificial-intelligence/articles/cc-machine-learning-deep-learning-architectures/, [5] https://en.wikipedia.org/wiki/Gated_recurrent_unit, [6] https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53, [7] https://en.wikipedia.org/wiki/Deep_belief_network, [8] https://www.researchgate.net/figure/A-Deep-Stacking-Network-Architecture_fig1_272885058. There are many modern architectures for this use case now, such as Transformers, which we will discuss later. Now your question will be: why were these things not popular at that time? LeNet5. The VGG networks, along with the earlier AlexNet from 2012, follow the now archetypal layout of basic conv nets: a series of convolutional, max-pooling, and activation layers before some fully-connected classification layers at the end.