NPTEL Introduction to Machine Learning Week 2 Assignment Answer

ABOUT THE COURSE :
With the increased availability of data from varied sources, there has been increasing attention paid to data-driven disciplines such as analytics and machine learning. In this course we intend to introduce some of the basic concepts of machine learning from a mathematically well-motivated perspective. We will cover the different learning paradigms and some of the more popular algorithms and architectures used in each of these paradigms.
INTENDED AUDIENCE : This is an elective course intended for senior UG/PG students (BE/ME/MS/PhD).
PREREQUISITES : We will assume that the students know programming for some of the assignments. If the students have done introductory courses on probability theory and linear algebra, it would be helpful. We will review some of the basic topics in the first two weeks as well.
INDUSTRY SUPPORT : Any company in the data analytics/data science/big data domain would value this course.


Course layout

Week 0: Probability Theory, Linear Algebra, Convex Optimization – (Recap)
Week 1: Introduction: Statistical Decision Theory – Regression, Classification, Bias Variance
Week 2: Linear Regression, Multivariate Regression, Subset Selection, Shrinkage Methods, Principal Component Regression, Partial Least Squares
Week 3: Linear Classification, Logistic Regression, Linear Discriminant Analysis
Week 4: Perceptron, Support Vector Machines
Week 5: Neural Networks – Introduction, Early Models, Perceptron Learning, Backpropagation, Initialization, Training & Validation, Parameter Estimation – MLE, MAP, Bayesian Estimation
Week 6: Decision Trees, Regression Trees, Stopping Criterion & Pruning, Loss Functions, Categorical Attributes, Multiway Splits, Missing Values, Decision Trees – Instability, Evaluation Measures
Week 7: Bootstrapping & Cross Validation, Class Evaluation Measures, ROC curve, MDL, Ensemble Methods – Bagging, Committee Machines and Stacking, Boosting
Week 8: Gradient Boosting, Random Forests, Multi-class Classification, Naive Bayes, Bayesian Networks
Week 9: Undirected Graphical Models, HMM, Variable Elimination, Belief Propagation
Week 10: Partitional Clustering, Hierarchical Clustering, Birch Algorithm, CURE Algorithm, Density-based Clustering
Week 11: Gaussian Mixture Models, Expectation Maximization
Week 12: Learning Theory, Introduction to Reinforcement Learning, Optional videos (RL framework, TD learning, Solution Methods, Applications)


Week 2: Assignment 2

Due date: 2025-02-05, 23:59 IST.
1 point

In a linear regression model y = θ0 + θ1x1 + θ2x2 + … + θpxp, what is the purpose of adding an intercept term (θ0)?
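As a concrete illustration of why the intercept matters, here is a minimal pure-Python sketch (the data points are toy values chosen for this example, not from the assignment): the same points are fitted with and without θ0, using the standard one-feature closed forms.

```python
# Toy data whose true relationship, y = 5 + x, is offset from the origin.
xs = [1.0, 2.0, 3.0]
ys = [6.0, 7.0, 8.0]

# No-intercept least squares: the line is forced through (0, 0),
# so the slope becomes a biased compromise (a = sum(x*y) / sum(x*x)).
a = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(a)  # ~3.14, neither the true slope nor the true offset

# With an intercept (closed form for one feature):
n, sx, sy = len(xs), sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
a0 = (sy - a1 * sx) / n                         # intercept
print(a0, a1)  # 5.0 1.0 -- recovers the true offset and slope
```

Without θ0 the model can only represent lines through the origin, so any constant offset in the data shows up as systematic error.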

1 point
Which of the following is true about the cost function (objective function) used in linear regression?
1 point
Which of these would most likely indicate that Lasso regression is a better choice than Ridge regression?
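One way to see the contrast this question points at: under an orthonormal design, both penalties act on each OLS coefficient independently via standard closed forms, with ridge rescaling every coefficient and lasso soft-thresholding them. A minimal sketch (toy coefficients and a hypothetical regularization strength `lam` assumed):

```python
def ridge_shrink(beta_ols, lam):
    # Ridge rescales each coefficient toward zero but never reaches it.
    return beta_ols / (1.0 + lam)

def lasso_shrink(beta_ols, lam):
    # Lasso soft-thresholds: coefficients smaller than lam in magnitude
    # are set exactly to zero, producing a sparse model.
    mag = abs(beta_ols) - lam
    if mag <= 0:
        return 0.0
    return mag if beta_ols > 0 else -mag

ols = [3.0, 0.4, -0.2]  # toy OLS coefficients
lam = 0.5
print([ridge_shrink(b, lam) for b in ols])  # all three stay nonzero
print([lasso_shrink(b, lam) for b in ols])  # the two small ones become 0
```

This is why lasso tends to be the better choice when only a few of many features are truly relevant: it can drop the rest exactly, which ridge never does.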
1 point
Which of the following conditions must hold for the least squares estimator in linear regression to be unbiased?
1 point
When performing linear regression, which of the following is most likely to cause overfitting?
1 point

You have trained a complex regression model on a dataset. To reduce its complexity, you decide to apply Ridge regression, using a regularization parameter λ. How does the relationship between bias and variance change as λ becomes very large? Select the correct option.
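A minimal sketch of the effect (one feature, no intercept, toy data assumed): in this special case the ridge estimate has the closed form θ = Σxy / (Σx² + λ), so as λ grows the estimate is pulled toward zero — bias rises while the estimate's sensitivity to the particular sample (its variance) falls.

```python
def ridge_1d(xs, ys, lam):
    # Closed-form ridge estimate for a single feature with no intercept.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # roughly y = 2x plus noise

for lam in [0.0, 1.0, 100.0, 1e6]:
    print(lam, ridge_1d(xs, ys, lam))
# The estimate starts near 2 (the OLS fit at lam = 0) and decays toward 0
# as lam -> infinity: maximal shrinkage, hence high bias and low variance.
```

In the limit λ → ∞ the fitted coefficients go to zero and the predictions barely depend on the training sample at all.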

1 point
Given a training data set of 10,000 instances, with each input instance having 12 dimensions and each output instance having 3 dimensions, the dimensions of the design matrix used in applying linear regression to this data are
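For the bookkeeping behind this question, a small sketch (assuming the usual convention of appending a column of 1's to the inputs for the intercept):

```python
n_instances, n_inputs, n_outputs = 10_000, 12, 3

# The design matrix holds one row per instance and one column per input
# dimension, plus the appended column of 1's for the intercept.
design_rows = n_instances
design_cols = n_inputs + 1
print((design_rows, design_cols))  # (10000, 13)

# The output dimension does not enter the design matrix; it sets the
# shape of the separate target matrix instead.
target_shape = (n_instances, n_outputs)  # (10000, 3)
```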
1 point

The linear regression model y = a0 + a1x1 + a2x2 + … + apxp is to be fitted to a set of N training data points having p attributes each. Let X be the N × (p+1) matrix of input values (augmented by 1's), Y be the N × 1 vector of target values, and θ be the (p+1) × 1 vector of parameter values (a0, a1, a2, …, ap). If the sum of squared errors is minimized to obtain the optimal regression model, which of the following equations holds?
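The sum-of-squared-errors minimizer is characterized by the normal equation XᵀXθ = XᵀY, i.e. θ = (XᵀX)⁻¹XᵀY when XᵀX is invertible. A pure-Python sketch on a tiny exact-fit example (p = 1, toy data assumed), where XᵀX is 2×2 and its inverse can be written out by hand:

```python
def fit_normal_equation(xs, ys):
    # Build the entries of X^T X and X^T Y for the augmented design
    # [[1, x_1], [1, x_2], ...], then solve the 2x2 system directly.
    n = len(xs)
    sx, sxx = sum(xs), sum(x * x for x in xs)
    sy, sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx            # determinant of the 2x2 X^T X
    a0 = (sxx * sy - sx * sxy) / det   # intercept
    a1 = (n * sxy - sx * sy) / det     # slope
    return a0, a1

a0, a1 = fit_normal_equation([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a0, a1)  # 1.0 2.0 -- the data lie exactly on y = 1 + 2x
```

For larger p the same system XᵀXθ = XᵀY is solved with a general linear solver rather than an explicit inverse.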

1 point
Which of the following scenarios is most appropriate for using Partial Least Squares (PLS) regression instead of ordinary least squares (OLS)?
1 point
Consider forward selection, backward selection and best subset selection with respect to the same data set. Which of the following is true?
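A quick way to compare the three strategies is to count how many models each one fits for p candidate attributes; the sketch below assumes the standard greedy variants, where forward selection adds one attribute per step.

```python
p = 10

# Best subset selection scores every subset of the p attributes.
best_subset_models = 2 ** p

# Forward selection evaluates p candidates at step 1, p-1 at step 2, ...,
# so it fits only p + (p-1) + ... + 1 models along one greedy path
# (backward selection walks a similar single path in reverse).
forward_models = p * (p + 1) // 2

print(best_subset_models, forward_models)  # 1024 55
```

Because the greedy methods explore only one path through the subset lattice, they can miss the subset that best subset selection would find, even though they are far cheaper.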
