NPTEL Deep Learning – IIT Ropar Week 4 Assignment Answer

ABOUT THE COURSE :
Deep Learning has received a lot of attention over the past few years and has been employed successfully by companies like Google, Microsoft, IBM, Facebook, and Twitter to solve a wide range of problems in Computer Vision and Natural Language Processing. In this course we will learn about the building blocks used in these Deep Learning based solutions. Specifically, we will learn about feedforward neural networks, convolutional neural networks, recurrent neural networks and attention mechanisms. We will also look at various optimization algorithms such as Gradient Descent, Nesterov Accelerated Gradient Descent, Adam, AdaGrad and RMSProp, which are used for training such deep neural networks. By the end of this course, students will have knowledge of the deep architectures used for solving various Vision and NLP tasks.

INTENDED AUDIENCE: Any Interested Learners

PREREQUISITES: Working knowledge of Linear Algebra and Probability Theory. It would be beneficial if the participants have done a course on Machine Learning.


Course layout

Week 1 :  (Partial) History of Deep Learning, Deep Learning Success Stories, McCulloch Pitts Neuron, Thresholding Logic, Perceptrons, Perceptron Learning Algorithm

Week 2 :  Multilayer Perceptrons (MLPs), Representation Power of MLPs, Sigmoid Neurons, Gradient Descent, Feedforward Neural Networks, Representation Power of Feedforward Neural Networks

Week 3 :  Feedforward Neural Networks, Backpropagation

Week 4 :  Gradient Descent (GD), Momentum Based GD, Nesterov Accelerated GD, Stochastic GD, AdaGrad, RMSProp, Adam, Eigenvalues and Eigenvectors, Eigenvalue Decomposition, Basis

Week 5 :  Principal Component Analysis and its interpretations, Singular Value Decomposition

Week 6 :  Autoencoders and relation to PCA, Regularization in autoencoders, Denoising autoencoders, Sparse autoencoders, Contractive autoencoders

Week 7 :  Regularization: Bias Variance Tradeoff, L2 regularization, Early stopping, Dataset augmentation, Parameter sharing and tying, Injecting noise at input, Ensemble methods, Dropout

Week 8 :  Greedy Layerwise Pre-training, Better activation functions, Better weight initialization methods, Batch Normalization

Week 9 :  Learning Vectorial Representations Of Words

Week 10 :  Convolutional Neural Networks, LeNet, AlexNet, ZF-Net, VGGNet, GoogLeNet, ResNet, Visualizing Convolutional Neural Networks, Guided Backpropagation, Deep Dream, Deep Art, Fooling Convolutional Neural Networks

Week 11 :  Recurrent Neural Networks, Backpropagation through time (BPTT), Vanishing and Exploding Gradients, Truncated BPTT, GRU, LSTMs

Week 12 :  Encoder Decoder Models, Attention Mechanism, Attention over images

NPTEL Deep Learning – IIT Ropar Week 4 Assignment Answer

Week 4 : Assignment 4

Due date: 2025-02-19, 23:59 IST.
1 point

Using the Adam optimizer with β1 = 0.9, β2 = 0.999, and ε = 10⁻⁸, what would be the bias-corrected first moment estimate after the first update if the initial gradient is 4?
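
A minimal Python sketch of the first Adam update can make this concrete. It assumes the moment estimates start at zero and t = 1; the variable names are illustrative, not from the course material:

    beta1, beta2, eps = 0.9, 0.999, 1e-8   # eps enters only the parameter update, not the moments
    g = 4.0                                # gradient at the first step
    m, v, t = 0.0, 0.0, 1                  # first and second moments start at zero

    m = beta1 * m + (1 - beta1) * g        # m_1 = 0.1 * 4 = 0.4
    v = beta2 * v + (1 - beta2) * g**2     # v_1 = 0.001 * 16 = 0.016
    m_hat = m / (1 - beta1**t)             # bias-corrected first moment: 0.4 / 0.1 = 4.0
    v_hat = v / (1 - beta2**t)             # bias-corrected second moment: 0.016 / 0.001 = 16.0
    print(m_hat, v_hat)                    # 4.0 16.0

The correction divides out the shrinkage caused by the zero initialization, so after a single step the corrected first moment equals the gradient itself.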

 
 
 
 
1 point
In a mini-batch gradient descent algorithm, if the total number of training samples is 50,000 and the batch size is 100, how many iterations are required to complete 10 epochs?
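
The count follows directly from the numbers in the question; a quick sketch, assuming each epoch passes over every sample exactly once:

    num_samples, batch_size, num_epochs = 50_000, 100, 10
    iters_per_epoch = num_samples // batch_size   # 500 parameter updates per epoch
    total_iters = iters_per_epoch * num_epochs    # 5000 updates over 10 epochs
    print(iters_per_epoch, total_iters)           # 500 5000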
 
 
 
 

1 point
In a stochastic gradient descent algorithm, the learning rate starts at 0.1 and decays exponentially with a decay rate of 0.1 per epoch. What will be the learning rate after 5 epochs?
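
The phrase "decays exponentially with a decay rate of 0.1 per epoch" can be read in more than one way; the sketch below compares the two most common formulas and is an illustration of the candidate readings, not the course's official convention:

    import math

    lr0, k, t = 0.1, 0.1, 5
    lr_exp = lr0 * math.exp(-k * t)    # lr0 * e**(-0.1 * 5) ≈ 0.0607
    lr_mult = lr0 * (1 - k) ** t       # lr0 * 0.9**5       ≈ 0.0590
    print(round(lr_exp, 4), round(lr_mult, 4))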
1 point
In the context of Adam optimizer, what is the purpose of bias correction?
 
 
 
 
1 point
The figure below shows the contours of a surface.

Suppose that a man walks from -1 to +1 along both the horizontal (x) axis and the vertical (y) axis. The statement that the man would have seen the slope change more rapidly along the x-axis than along the y-axis is,
 
 
 
1 point
What is the primary benefit of using Adagrad compared to other optimization algorithms?
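
Adagrad's distinguishing property is a per-parameter step size that shrinks with the accumulated history of squared gradients, which is especially helpful for sparse features. A minimal sketch under that reading, with toy gradients and illustrative names:

    import numpy as np

    eta, eps = 0.1, 1e-8
    w = np.array([1.0, 1.0])
    grad_sq_sum = np.zeros_like(w)                     # per-parameter accumulator

    # w[0] receives a gradient every step; w[1] only occasionally (sparse)
    for g in [np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.array([1.0, 0.5])]:
        grad_sq_sum += g ** 2
        w -= eta * g / (np.sqrt(grad_sq_sum) + eps)    # adaptive, per-parameter step

    print(w)   # w[0]'s effective step has shrunk; w[1] still gets a near-full step when updated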
 
 
 
 
1 point
What are the benefits of using stochastic gradient descent compared to vanilla gradient descent?
 
 
 
 
1 point
What is the role of activation functions in deep learning?
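
One way to see the role of activation functions is that without a non-linearity, stacked layers collapse into a single linear map. A short sketch demonstrating this with random toy matrices (illustrative only):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))
    W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=(5, 2))

    # Two linear layers are equivalent to one linear layer with weights W1 @ W2
    print(np.allclose((x @ W1) @ W2, x @ (W1 @ W2)))        # True

    # Inserting a ReLU between the layers breaks this equivalence,
    # letting the network represent non-linear functions
    relu = lambda z: np.maximum(z, 0.0)
    print(np.allclose(relu(x @ W1) @ W2, x @ (W1 @ W2)))    # False (in general)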
 
 
 
 
1 point
What is the advantage of using mini-batch gradient descent over batch gradient descent?
 
 
 
 
1 point
In the Nesterov Accelerated Gradient (NAG) algorithm, the gradient is computed at:
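
What sets NAG apart from plain momentum is where the gradient is evaluated: at a look-ahead point w − γu rather than at the current weights. A minimal sketch on a toy quadratic loss, with illustrative hyperparameters:

    import numpy as np

    def grad(w):
        return 2 * w                     # gradient of the toy loss L(w) = w**2

    eta, gamma = 0.1, 0.9                # learning rate, momentum
    w, u = np.array([1.0]), np.array([0.0])

    for _ in range(3):
        w_lookahead = w - gamma * u                 # where momentum is about to carry the weights
        u = gamma * u + eta * grad(w_lookahead)     # gradient computed at the look-ahead point
        w = w - u

    print(w)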
 
 
 
 
