NPTEL Deep Learning – IIT Ropar Week 4 Assignment Answer

ABOUT THE COURSE :
Deep Learning has received a lot of attention over the past few years and has been employed successfully by companies like Google, Microsoft, IBM, Facebook, and Twitter to solve a wide range of problems in Computer Vision and Natural Language Processing. In this course we will learn about the building blocks used in these Deep Learning based solutions. Specifically, we will learn about feedforward neural networks, convolutional neural networks, recurrent neural networks and attention mechanisms. We will also look at various optimization algorithms such as Gradient Descent, Nesterov Accelerated Gradient Descent, Adam, AdaGrad and RMSProp, which are used for training such deep neural networks. By the end of this course, students will have knowledge of the deep architectures used for solving various Vision and NLP tasks.

INTENDED AUDIENCE: Any Interested Learners

PREREQUISITES: Working knowledge of Linear Algebra and Probability Theory. It would be beneficial if the participants have done a course on Machine Learning.


Course layout

Week 1 :  (Partial) History of Deep Learning, Deep Learning Success Stories, McCulloch Pitts Neuron, Thresholding Logic, Perceptrons, Perceptron Learning Algorithm

Week 2 :  Multilayer Perceptrons (MLPs), Representation Power of MLPs, Sigmoid Neurons, Gradient Descent, Feedforward Neural Networks, Representation Power of Feedforward Neural Networks

Week 3 :  Feedforward Neural Networks, Backpropagation

Week 4 :  Gradient Descent (GD), Momentum Based GD, Nesterov Accelerated GD, Stochastic GD, AdaGrad, RMSProp, Adam, Eigenvalues and Eigenvectors, Eigenvalue Decomposition, Basis

Week 5 :  Principal Component Analysis and its interpretations, Singular Value Decomposition

Week 6 :  Autoencoders and relation to PCA, Regularization in autoencoders, Denoising autoencoders, Sparse autoencoders, Contractive autoencoders

Week 7 :  Regularization: Bias Variance Tradeoff, L2 regularization, Early stopping, Dataset augmentation, Parameter sharing and tying, Injecting noise at input, Ensemble methods, Dropout

Week 8 :  Greedy Layerwise Pre-training, Better activation functions, Better weight initialization methods, Batch Normalization

Week 9 :  Learning Vectorial Representations Of Words

Week 10 :  Convolutional Neural Networks, LeNet, AlexNet, ZF-Net, VGGNet, GoogLeNet, ResNet, Visualizing Convolutional Neural Networks, Guided Backpropagation, Deep Dream, Deep Art, Fooling Convolutional Neural Networks

Week 11 :  Recurrent Neural Networks, Backpropagation through time (BPTT), Vanishing and Exploding Gradients, Truncated BPTT, GRU, LSTMs

Week 12 :  Encoder Decoder Models, Attention Mechanism, Attention over images

NPTEL Deep Learning – IIT Ropar Week 4 Assignment Answer

Week 4 : Assignment 4

Due date: 2025-02-19, 23:59 IST.
1 point

Using the Adam optimizer with β1 = 0.9, β2 = 0.999, and ε = 10⁻⁸, what would be the bias-corrected first moment estimate after the first update if the initial gradient is 4?
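
A minimal Python sketch of the first Adam update can make this concrete. It assumes the moment estimates start at zero and t = 1; the variable names are illustrative, not from the course material:

    beta1, beta2, eps = 0.9, 0.999, 1e-8   # eps enters only the parameter update, not the moments
    g = 4.0                                # gradient at the first step
    m, v, t = 0.0, 0.0, 1                  # first and second moments start at zero

    m = beta1 * m + (1 - beta1) * g        # m_1 = 0.1 * 4 = 0.4
    v = beta2 * v + (1 - beta2) * g**2     # v_1 = 0.001 * 16 = 0.016
    m_hat = m / (1 - beta1**t)             # bias-corrected first moment: 0.4 / 0.1 = 4.0
    v_hat = v / (1 - beta2**t)             # bias-corrected second moment: 0.016 / 0.001 = 16.0
    print(m_hat, v_hat)                    # 4.0 16.0

The correction divides out the shrinkage caused by the zero initialization, so after a single step the corrected first moment equals the gradient itself.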

 
 
 
 
1 point
In a mini-batch gradient descent algorithm, if the total number of training samples is 50,000 and the batch size is 100, how many iterations are required to complete 10 epochs?
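
The count follows directly from the numbers in the question; a quick sketch, assuming each epoch passes over every sample exactly once:

    num_samples, batch_size, num_epochs = 50_000, 100, 10
    iters_per_epoch = num_samples // batch_size   # 500 parameter updates per epoch
    total_iters = iters_per_epoch * num_epochs    # 5000 updates over 10 epochs
    print(iters_per_epoch, total_iters)           # 500 5000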
 
 
 
 

1 point
In a stochastic gradient descent algorithm, the learning rate starts at 0.1 and decays exponentially with a decay rate of 0.1 per epoch. What will be the learning rate after 5 epochs?
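
The phrase "decays exponentially with a decay rate of 0.1 per epoch" can be read in more than one way; the sketch below compares the two most common formulas and is an illustration of the candidate readings, not the course's official convention:

    import math

    lr0, k, t = 0.1, 0.1, 5
    lr_exp = lr0 * math.exp(-k * t)    # lr0 * e**(-0.1 * 5) ≈ 0.0607
    lr_mult = lr0 * (1 - k) ** t       # lr0 * 0.9**5       ≈ 0.0590
    print(round(lr_exp, 4), round(lr_mult, 4))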
1 point
In the context of Adam optimizer, what is the purpose of bias correction?
 
 
 
 
1 point
The figure below shows the contours of a surface.

Suppose that a man walks from -1 to +1 along both the horizontal (x) axis and the vertical (y) axis. The statement that the man would have seen the slope change more rapidly along the x-axis than along the y-axis is,
 
 
 
1 point
What is the primary benefit of using Adagrad compared to other optimization algorithms?
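
Adagrad's distinguishing property is a per-parameter step size that shrinks with the accumulated history of squared gradients, which is especially helpful for sparse features. A minimal sketch under that reading, with toy gradients and illustrative names:

    import numpy as np

    eta, eps = 0.1, 1e-8
    w = np.array([1.0, 1.0])
    grad_sq_sum = np.zeros_like(w)                     # per-parameter accumulator

    # w[0] receives a gradient every step; w[1] only occasionally (sparse)
    for g in [np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.array([1.0, 0.5])]:
        grad_sq_sum += g ** 2
        w -= eta * g / (np.sqrt(grad_sq_sum) + eps)    # adaptive, per-parameter step

    print(w)   # w[0]'s effective step has shrunk; w[1] still gets a near-full step when updated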
 
 
 
 
1 point
What are the benefits of using stochastic gradient descent compared to vanilla gradient descent?
 
 
 
 
1 point
What is the role of activation functions in deep learning?
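
One way to see the role of activation functions is that without a non-linearity, stacked layers collapse into a single linear map. A short sketch demonstrating this with random toy matrices (illustrative only):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))
    W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=(5, 2))

    # Two linear layers are equivalent to one linear layer with weights W1 @ W2
    print(np.allclose((x @ W1) @ W2, x @ (W1 @ W2)))        # True

    # Inserting a ReLU between the layers breaks this equivalence,
    # letting the network represent non-linear functions
    relu = lambda z: np.maximum(z, 0.0)
    print(np.allclose(relu(x @ W1) @ W2, x @ (W1 @ W2)))    # False (in general)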
 
 
 
 
1 point
What is the advantage of using mini-batch gradient descent over batch gradient descent?
 
 
 
 
1 point
In the Nesterov Accelerated Gradient (NAG) algorithm, the gradient is computed at:
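
What sets NAG apart from plain momentum is where the gradient is evaluated: at a look-ahead point w − γu rather than at the current weights. A minimal sketch on a toy quadratic loss, with illustrative hyperparameters:

    import numpy as np

    def grad(w):
        return 2 * w                     # gradient of the toy loss L(w) = w**2

    eta, gamma = 0.1, 0.9                # learning rate, momentum
    w, u = np.array([1.0]), np.array([0.0])

    for _ in range(3):
        w_lookahead = w - gamma * u                 # where momentum is about to carry the weights
        u = gamma * u + eta * grad(w_lookahead)     # gradient computed at the look-ahead point
        w = w - u

    print(w)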
 
 
 
 
