NPTEL Deep Learning IIT Ropar Week 3 Assignment Answer

ABOUT THE COURSE:
Deep Learning has received a lot of attention over the past few years and has been employed successfully by companies like Google, Microsoft, IBM, Facebook, Twitter, etc. to solve a wide range of problems in Computer Vision and Natural Language Processing. In this course we will learn about the building blocks used in these Deep Learning based solutions. Specifically, we will learn about feedforward neural networks, convolutional neural networks, recurrent neural networks and attention mechanisms. We will also look at various optimization algorithms such as Gradient Descent, Nesterov Accelerated Gradient Descent, Adam, AdaGrad and RMSProp, which are used for training such deep neural networks. At the end of this course, students will have knowledge of the deep architectures used for solving various Vision and NLP tasks.

INTENDED AUDIENCE: Any Interested Learners

PREREQUISITES: Working knowledge of Linear Algebra, Probability Theory. It would be beneficial if the participants have done a course on Machine Learning.

Course layout

Week 1 : (Partial) History of Deep Learning, Deep Learning Success Stories, McCulloch Pitts Neuron, Thresholding Logic, Perceptrons, Perceptron Learning Algorithm

Week 2 : Multilayer Perceptrons (MLPs), Representation Power of MLPs, Sigmoid Neurons, Gradient Descent, Feedforward Neural Networks, Representation Power of Feedforward Neural Networks

Week 3 : FeedForward Neural Networks, Backpropagation

Week 4 : Gradient Descent (GD), Momentum Based GD, Nesterov Accelerated GD, Stochastic GD, AdaGrad, RMSProp, Adam, Eigenvalues and eigenvectors, Eigenvalue Decomposition, Basis

Week 5 : Principal Component Analysis and its interpretations, Singular Value Decomposition

Week 6 : Autoencoders and relation to PCA, Regularization in autoencoders, Denoising autoencoders, Sparse autoencoders, Contractive autoencoders

Week 7 : Regularization: Bias Variance Tradeoff, L2 regularization, Early stopping, Dataset augmentation, Parameter sharing and tying, Injecting noise at input, Ensemble methods, Dropout

Week 8 : Greedy Layerwise Pre-training, Better activation functions, Better weight initialization methods, Batch Normalization

Week 9 : Learning Vectorial Representations Of Words

Week 10: Convolutional Neural Networks, LeNet, AlexNet, ZF-Net, VGGNet, GoogLeNet, ResNet, Visualizing Convolutional Neural Networks, Guided Backpropagation, Deep Dream, Deep Art, Fooling Convolutional Neural Networks

Week 11: Recurrent Neural Networks, Backpropagation through time (BPTT), Vanishing and Exploding Gradients, Truncated BPTT, GRU, LSTMs

Week 12: Encoder Decoder Models, Attention Mechanism, Attention over images

Week 3 : Assignment 3

Due date: 2025-02-12, 23:59 IST.

Use the following data to answer questions 1 and 2.
A neural network contains an input layer $h_{0} = x$, three hidden layers $(h_{1}, h_{2}, h_{3})$, and an output layer $O$. All the hidden layers use the Sigmoid activation function, and the output layer uses the Softmax activation function. Suppose the input $x \in R^{200}$, and all the hidden layers contain 10 neurons each. The output layer contains 4 neurons.

How many parameters (including biases) are there in the entire network?

2274
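The count can be checked layer by layer: each fully connected layer contributes (inputs × neurons) weights plus one bias per neuron, giving (200·10 + 10) + (10·10 + 10) + (10·10 + 10) + (10·4 + 4) = 2274. A minimal sketch of that arithmetic follows; the helper name `count_parameters` is an illustrative assumption, not part of the assignment.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected feedforward network.

    layer_sizes lists the width of every layer, input first and output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix of shape (fan_out, fan_in)
        total += fan_out           # one bias per neuron in the receiving layer
    return total


# Input of size 200, three hidden layers of 10 neurons, output of 4 neurons.
print(count_parameters([200, 10, 10, 10, 4]))  # 2274
```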

1 point

Suppose all elements in the input vector are zero, and the corresponding true label is also 0. Further, suppose that all the parameters (weights and biases) are initialized to zero. What is the loss value if the cross-entropy loss function is used? Use the natural logarithm (ln).

1.38
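The reasoning behind this value: with every weight and bias at zero, the four output pre-activations are all zero whatever the hidden activations are, so Softmax returns the uniform distribution (0.25 for each class) and the cross-entropy is -ln(0.25) = ln 4 ≈ 1.386. The sketch below reproduces this, assuming the true label 0 is a class index and the target is one-hot; the `softmax` helper is written here for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# With every weight and bias at zero, the output pre-activations are all zero
# regardless of the input or the hidden activations.
output_scores = [0.0, 0.0, 0.0, 0.0]
probs = softmax(output_scores)          # uniform: [0.25, 0.25, 0.25, 0.25]

true_label = 0                          # class index given in the question
loss = -math.log(probs[true_label])     # cross-entropy with a one-hot target
print(round(loss, 4))                   # 1.3863
```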

1 point

Use the following data to answer questions 3 and 4.
The diagram below shows a neural network. The network contains two hidden layers and one output layer. The input to the network is a column vector $x \in R^{3}$. The first hidden layer contains 9 neurons, the second hidden layer contains 5 neurons and the output layer contains 2 neurons. Each neuron in the $l^{th}$ layer is connected to all the neurons in the $(l + 1)^{th}$ layer. Each neuron has a bias connected to it (not explicitly shown in the figure).

In the diagram, $W_{1}$ is a matrix and $x, a_{1}, h_{1},$ and $O$ are all column vectors. The notation $W_{i}[j, :]$ denotes the $j^{th}$ row of the matrix $W_{i}$, $W_{i}[:, j]$ denotes the $j^{th}$ column of the matrix $W_{i}$, and $W_{k_{ij}}$ denotes the element at the $i^{th}$ row and $j^{th}$ column of the matrix $W_{k}$.

1 point

Choose the correct dimensions of $W_{1}$ and $a_{1}$

How many learnable parameters (including biases) are there in the network?
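Both questions follow from the layer sizes alone. The sketch below assumes the usual convention $a_{1} = W_{1} x + b_{1}$, under which $W_{1}$ has one row per neuron in the first hidden layer and one column per input component; if the figure uses the transposed convention, the shape of $W_{1}$ would be transposed as well. The variable names are illustrative.

```python
layer_sizes = [3, 9, 5, 2]  # input, hidden 1, hidden 2, output

# Assuming a_1 = W_1 x + b_1, W_1 has one row per neuron in hidden layer 1
# and one column per input component.
w1_shape = (layer_sizes[1], layer_sizes[0])   # (9, 3)
a1_shape = (layer_sizes[1], 1)                # (9, 1)

# Weights plus biases, layer by layer: (3*9 + 9) + (9*5 + 5) + (5*2 + 2)
per_layer = [n_in * n_out + n_out
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total_parameters = sum(per_layer)

print(w1_shape, a1_shape)            # (9, 3) (9, 1)
print(per_layer, total_parameters)   # [36, 50, 12] 98
```

Under these assumptions $W_{1}$ is 9 × 3, $a_{1}$ is 9 × 1, and the network has 98 learnable parameters.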

1 point

We have a multi-class classification problem that we decide to solve by training a feedforward neural network. What activation function should we use in the output layer to get the best results?

Logistic

Step function

Softmax

Linear
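Softmax is the standard choice for mutually exclusive classes because it turns the output scores into a single probability distribution over the classes. The sketch below contrasts it with applying the logistic function to each output independently; the scores are hypothetical values chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def logistic(score):
    return 1.0 / (1.0 + math.exp(-score))

scores = [2.0, 1.0, 0.1]   # hypothetical pre-activations for three classes

class_probs = softmax(scores)                 # one distribution over classes
independent = [logistic(s) for s in scores]   # three unrelated probabilities

print(sum(class_probs))    # sums to 1 (up to floating-point rounding)
print(sum(independent))    # generally not 1
```

Because the Softmax outputs sum to 1, they pair naturally with the cross-entropy loss used elsewhere in this assignment.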

1 point

Which of the following statements about backpropagation is true?

It is used to compute the output of a neural network.

It is used to optimize the weights in a neural network.

It is used to initialize the weights in a neural network.

It is used to regularize the weights in a neural network.
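Of the options listed, the one that matches backpropagation is optimizing the weights: strictly speaking, backpropagation computes the gradient of the loss with respect to each weight via the chain rule, and an update rule such as gradient descent then uses those gradients. A minimal sketch for a single sigmoid neuron with squared error loss follows; all numeric values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    return (sigmoid(w * x + b) - y) ** 2

def gradients(w, b, x, y):
    """Chain rule: dL/dw = dL/dy_hat * dy_hat/dz * dz/dw (likewise for b)."""
    y_hat = sigmoid(w * x + b)
    dl_dyhat = 2.0 * (y_hat - y)
    dyhat_dz = y_hat * (1.0 - y_hat)
    dl_dz = dl_dyhat * dyhat_dz
    return dl_dz * x, dl_dz          # dL/dw, dL/db

w, b, x, y = 0.5, -0.2, 1.5, 1.0     # hypothetical values
grad_w, grad_b = gradients(w, b, x, y)

# Sanity check against a central finite difference.
eps = 1e-6
numeric_w = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)

# The gradients are then consumed by an update rule such as gradient descent.
lr = 0.1
new_w, new_b = w - lr * grad_w, b - lr * grad_b
print(loss(w, b, x, y), loss(new_w, new_b, x, y))
```

The second printed loss is smaller than the first, which is the sense in which backpropagation serves weight optimization.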

1 point

Given two probability distributions $p$ and $q$ , under what conditions is the cross entropy between them minimized?

All the values in $p$ are lower than corresponding values in $q$

All the values in $p$ are higher than corresponding values in $q$

$p = 0$ (0 is a vector)

$p = q$
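Treating $p$ as fixed and varying $q$, the cross entropy $H(p, q) = -\sum_{i} p_{i} \ln q_{i}$ is smallest when $q = p$, where it equals the entropy of $p$. The sketch below checks this numerically for a few candidate distributions; the distributions are hypothetical examples.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * ln(q_i); terms with p_i = 0 contribute nothing."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]                     # hypothetical "true" distribution
candidates = {
    "q = p":      [0.7, 0.2, 0.1],
    "uniform":    [1 / 3, 1 / 3, 1 / 3],
    "mismatched": [0.1, 0.2, 0.7],
}

for name, q in candidates.items():
    print(name, round(cross_entropy(p, q), 4))
```

The smallest value is printed for `q = p`; any other choice of $q$ adds a non-negative KL divergence term on top of the entropy of $p$.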

1 point

Given that the probability of Event A occurring is 0.18 and the probability of Event B occurring is 0.92, which of the following statements is correct?

Event A has a low information content

Event A has a high information content

Event B has a low information content

Event B has a high information content
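Information content is defined as $I(x) = -\log P(x)$, so rarer events carry more information. A short sketch using base-2 logarithms (bits) follows; the choice of base is an assumption and does not affect the comparison.

```python
import math

def information_content(probability):
    """Self-information in bits: I(x) = -log2 P(x)."""
    return -math.log2(probability)

info_a = information_content(0.18)   # Event A
info_b = information_content(0.92)   # Event B
print(round(info_a, 3), round(info_b, 3))   # 2.474 0.12
```

Event A (about 2.47 bits) therefore has a high information content, while Event B (about 0.12 bits) has a low one.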

Use the following data to answer questions 9 and 10.
The following diagram represents a neural network containing two hidden layers and one output layer. The input to the network is a column vector $x \in R^{3}$. The activation function used in the hidden layers is sigmoid. The output layer doesn’t contain any activation function, and the loss used is the squared error loss $(pred_{y} - true_{y})^{2}$.

The network doesn’t contain any biases, and the weights of the network are given below:

$W_{1} = \begin{bmatrix} 1 & 1 & 3 \\ 2 & -1 & 1 \\ 1 & 2 & -2 \end{bmatrix}$, $W_{2} = \begin{bmatrix} 1 & 1 & 2 \\ 3 & 1 & 1 \end{bmatrix}$, $W_{3} = \begin{bmatrix} 1 & 2 \end{bmatrix}$

The input to the network is $x = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$.

The target value is $y = 5$.

What is the predicted output for the given input x after doing the forward pass?

2.96 (approximately)
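The forward pass can be traced step by step. The sketch below assumes each layer computes $a_{k} = W_{k} h_{k-1}$ with $h_{0} = x$, a sigmoid on the two hidden layers, and a linear output, as described in the problem statement; the helper names are illustrative. Under these assumptions $a_{1} = [6, 1, 3]$ and the prediction evaluates to roughly 2.96.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

W1 = [[1, 1, 3], [2, -1, 1], [1, 2, -2]]
W2 = [[1, 1, 2], [3, 1, 1]]
W3 = [[1, 2]]
x = [1, 2, 1]

a1 = matvec(W1, x)                  # [6, 1, 3]
h1 = [sigmoid(z) for z in a1]
a2 = matvec(W2, h1)
h2 = [sigmoid(z) for z in a2]
prediction = matvec(W3, h2)[0]      # linear output, no activation

print(a1)
print(round(prediction, 2))
```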

1 point

Compute and enter the loss between the output generated by input x and the true output y.
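A worked version of the loss, assuming the forward pass with the weights given above yields $pred_{y} \approx 2.9558$ (sigmoid hidden layers, linear output, no biases) and using the squared error loss from the problem statement:

```latex
L = (pred_{y} - true_{y})^{2} \approx (2.9558 - 5)^{2} = (-2.0442)^{2} \approx 4.18
```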
