
ECM3420 Learning from Data



Answer ALL questions.

Please use EXAM ANSWER SHEET for writing your answers.

The marks for this module are calculated from 60% of the percentage mark for this paper plus 40% of the percentage mark for associated coursework.

This is an Open Book exam

Section 1: Multiple Choice Questions

There are two types of questions:

  • [Select one answer only]: you must select one and only one answer. Selecting more or fewer than one answer will result in ZERO marks.
  • [Select all the correct statements]: in these questions you must select all of the correct answers and only the correct answers. Selecting more or fewer than the correct answers will result in ZERO marks.

1- What methods could be used to help reduce overfitting in decision trees? [Select all the correct statements].

☐ A) Pruning.

☐ B) Enforce a minimum number of samples in leaf nodes.

☐ C) Make sure that each leaf-node is one pure class.

☐ D) Make sure that your data is normalized.

☐ E) Enforce a maximum depth for the tree.

☐ F) Use “entropy” to calculate the information gain. (2 marks)

2- Which of the following statements about Neural Networks is/are true? [Select all the correct statements].

☐ A) Optimize a convex cost function.

☐ B) Always output values between 0 and 1.

☐ C) Can be used in an ensemble.

☐ D) Can be used for regression as well as classification. (2 marks)

3- In neural networks, what is/are true about the nonlinear activation functions such as sigmoid, tanh, and ReLU? [Select all the correct statements].

☐ A) Used to speed up the gradient calculation in backpropagation, as compared to linear units

☐ B) Are applied only to the output units

☐ C) Help to learn nonlinear decision boundaries

☐ D) Always output values between 0 and 1 (2 marks)
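As a small illustration of the three activation functions named in this question, the sketch below evaluates each one on a few hypothetical inputs so their output ranges can be compared. The input values and helper names are assumptions chosen for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


# Hypothetical inputs, chosen only to show negative, zero and positive cases.
inputs = [-3.0, -1.0, 0.0, 1.0, 3.0]

sigmoid_out = [sigmoid(x) for x in inputs]
tanh_out = [math.tanh(x) for x in inputs]
relu_out = [relu(x) for x in inputs]

for x, s, t, r in zip(inputs, sigmoid_out, tanh_out, relu_out):
    print(f"x={x:+.1f}  sigmoid={s:.3f}  tanh={t:+.3f}  relu={r:.1f}")
```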

4- Suppose we are given data comprising points of several different classes. Each class has a different probability distribution from which the sample points are drawn. We do not have the class labels. We use k-means clustering to try to guess the classes. Which of the following circumstances would undermine its effectiveness? [Select all the correct statements].

☐ A) Some of the classes are not normally distributed.

☐ B) The variance of each distribution is small in all directions.

☐ C) Each class has the same mean.

☐ D) You choose k = n, the number of sample points (2 marks)

5- You have used the same data to train two different Decision Tree (DT) classifiers. The first DT has 2 levels (DT2), the second DT has 6 levels (DT6). In terms of the bias-variance decomposition, the “DT6” model is likely to have: [Select all the correct statements].

☐ A) Higher variance than “DT2” model.

☐ B) Lower variance than “DT2” model.

☐ C) Higher bias than “DT2” model.

☐ D) Lower bias than “DT2” model. (2 marks)

6- Which of the following are true about bagging? [Select all the correct statements].

☐ A) In bagging, we choose random subsamples of the input points with replacement

☐ B) Bagging is ineffective with logistic regression, because all of the learners learn exactly the same decision boundary

☐ C) The main purpose of bagging is to decrease the bias of learning algorithms.

☐ D) If we use decision trees that have one sample point per leaf, bagging never gives lower training error than one ordinary decision tree. (2 marks)
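The sampling step that bagging relies on, drawing points with replacement, can be sketched as follows. This is a minimal illustration using only the standard library; the point indices, the seed and the function name `bootstrap_sample` are assumptions, and no learner is trained here.

```python
import random


def bootstrap_sample(points, rng):
    """Draw len(points) items uniformly at random, with replacement."""
    return rng.choices(points, k=len(points))


rng = random.Random(0)        # fixed seed so the sketch is repeatable
points = list(range(10))      # hypothetical training point indices

# One bootstrap sample per base learner in the ensemble.
samples = [bootstrap_sample(points, rng) for _ in range(3)]

for s in samples:
    # Because sampling is with replacement, a sample may repeat some
    # points and omit others.
    print(sorted(s))
```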

7- Regarding variance and bias, which of the following statements are true? (Here ‘high’ and ‘low’ are relative to the ideal model.) [Select all the correct statements].

☐ A) Models which overfit have a high bias.

☐ B) Models which overfit have a low bias.

☐ C) Models which underfit have a high variance.

☐ D) Models which underfit have a low variance. (2 marks)

8- High entropy means that the partitions in classification are [select one answer only]

◯ A) pure

◯ B) not pure

◯ C) useful

◯ D) useless (2 marks)

9- Suppose we would like to perform clustering analysis on a spatial dataset such as the geometrical locations of properties and houses. We wish to produce clusters of many different sizes and shapes. Which of the following methods is the most appropriate? [select one answer only]

◯ A) Decision Trees

◯ B) Density-based clustering

◯ C) Model-based clustering

◯ D) K-means clustering (2 marks)

10 – You are dealing with an imbalanced dataset. You decide to use Synthetic Minority Oversampling Technique (SMOTE) to deal with the class imbalance in the data. Select the most appropriate way to apply SMOTE out of the following. [select one answer only]

◯ A) SMOTE should be applied on all records in the dataset (training and testing).

◯ B) SMOTE should be applied on training records only.

◯ C) SMOTE should be applied on testing records only.

◯ D) SMOTE should be applied on a random subset of both training and testing.

◯ E) SMOTE is not a suitable method to deal with the class imbalance. (2 marks)
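For readers unfamiliar with how SMOTE generates records, the sketch below shows the core idea: a synthetic minority point is placed on the line segment between a minority point and a nearby minority neighbour. It is a simplified variant that uses only the single nearest neighbour, not the full algorithm, and all data and names are hypothetical. Here the data is split first and only the training portion is oversampled, which is the commonly recommended practice because it keeps synthetic records out of the evaluation data.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(p, q))


def smote_like(minority, n_new, rng):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority point and its nearest minority neighbour."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        a = minority[i]
        neighbours = [p for j, p in enumerate(minority) if j != i]
        b = min(neighbours, key=lambda p: sq_dist(a, p))
        t = rng.random()  # position along the segment from a to b
        synthetic.append(tuple(u + t * (v - u) for u, v in zip(a, b)))
    return synthetic


rng = random.Random(42)

# Hypothetical data, already split into training and testing portions.
train_majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
                  (0.3, 0.2), (0.4, 0.1), (0.2, 0.4)]
train_minority = [(2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
test_set = [((0.1, 0.2), 0), ((2.1, 2.0), 1)]  # not modified below

n_needed = len(train_majority) - len(train_minority)
synthetic = smote_like(train_minority, n_needed, rng)
balanced_minority = train_minority + synthetic

print(len(train_majority), len(balanced_minority), len(test_set))
```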

11 – You have been asked to develop prediction models for the London Stock Exchange market. Your models should be able to predict if the price of a given “stock market share” is likely to increase or decrease in the future. What would be the most suitable validation method to evaluate your models? [select one answer only]

◯ A) K-Fold cross validation

◯ B) Out-of-time validation sampling

◯ C) Hold-out validation sampling.

◯ D) Leave-one-out Cross Validation. (2 marks)
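To make the term "out-of-time validation" concrete, the sketch below splits a time-ordered series at a cutoff date so that every validation record is later than every training record. The prices, dates and the function name `out_of_time_split` are hypothetical and serve only as an illustration of the split itself.

```python
from datetime import date


def out_of_time_split(records, cutoff):
    """Train on records dated strictly before `cutoff`; validate on the rest."""
    ordered = sorted(records, key=lambda r: r[0])
    train = [r for r in ordered if r[0] < cutoff]
    valid = [r for r in ordered if r[0] >= cutoff]
    return train, valid


# Hypothetical (date, closing price) records.
prices = [
    (date(2023, 1, 2), 100.0),
    (date(2023, 1, 3), 101.5),
    (date(2023, 1, 4), 99.8),
    (date(2023, 1, 5), 102.3),
    (date(2023, 1, 6), 103.1),
]

train, valid = out_of_time_split(prices, date(2023, 1, 5))
print(len(train), "training records;", len(valid), "validation records")
```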

12 – If we know the support of itemset {a, b} is 12, which of the following numbers are the possible supports of itemset {a, b, c}? [Select all possible answers].

☐ A) 12

☐ B) 13

☐ C) 10

☐ D) 15 (2 marks)
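The notion of support used in this question can be sketched as a simple count of the transactions containing an itemset. The transactions below are hypothetical and are not the data behind the question. Because every transaction that contains {a, b, c} also contains {a, b}, the count for the larger itemset cannot exceed the count for the smaller one.

```python
def support(itemset, transactions):
    """Count the transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Hypothetical transaction database.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "b", "c", "d"},
    {"a", "c"},
    {"b", "d"},
]

s_ab = support({"a", "b"}, transactions)
s_abc = support({"a", "b", "c"}, transactions)
print(s_ab, s_abc)
```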

13 – Below are the 8 actual values of the target/output variable in the training file.


What is the entropy of the target variable? [Select one answer only].

◯ A) -(5/8 log(5/8) + 3/8 log(3/8))

◯ B) 5/8 log(5/8) + 3/8 log(3/8)

◯ C) 3/8 log(5/8) + 5/8 log(3/8)

◯ D) 5/8 log(3/8) – 3/8 log(5/8) (2 marks)
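As background for the entropy questions in this section, the sketch below computes the Shannon entropy of a list of class labels from its class proportions. The label lists are hypothetical and are not the values from the training file; the base-2 logarithm is also an assumption, since the question does not state a base.

```python
import math
from collections import Counter


def entropy(labels, base=2):
    """Shannon entropy: the sum over classes of -p * log(p)."""
    n = len(labels)
    counts = Counter(labels)
    return sum(-(c / n) * math.log(c / n, base) for c in counts.values())


# Hypothetical label lists for illustration.
pure = ["yes"] * 8                   # a single class
balanced = ["yes", "no"] * 4         # two equally frequent classes
mixed = ["yes"] * 6 + ["no"] * 2     # two classes, unequal proportions

for name, labels in [("pure", pure), ("balanced", balanced), ("mixed", mixed)]:
    print(name, round(entropy(labels), 4))
```

A partition containing a single class has zero entropy, and for two classes the entropy is largest when they are equally frequent.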