Assignment 4: Ensemble methods and other topics of Machine Learning

This is a machine learning assignment from Australia on ensemble methods and other machine learning topics.

 

Quiz instructions

Please answer the following questions. You can have multiple attempts and only your latest attempt will be marked.

3 pts

Question 1

The following code shows an incorrect implementation of the AdaBoost training algorithm.

Please point out all the mistakes you can find, and state what the input and output variables of weak_classifier_train should be.

def Adaboost_train(train_data, train_label, T):
    # train_data: N x d matrix
    # train_label: N x 1 vector
    # T: the number of weak classifiers in the ensemble
    ensemble_models = []
    for t in range(0,T):
        # model_param_t returns the model parameters of the learned weak classifier
        model_param_t = weak_classifier_train(train_data, train_label)
        # definition of model
        ensemble_models.append(model_param_t)
    return ensemble_models
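For comparison, here is a minimal sketch of how a standard discrete AdaBoost training loop is usually structured. It assumes labels in {-1, +1} stored as a 1-D array and uses a decision stump as the weak learner. The names `stump_train`, `stump_predict`, `adaboost_train`, and `adaboost_predict` are hypothetical and not part of the assignment; this is one possible arrangement, not the expected answer.

```python
import numpy as np

def stump_train(X, y, w):
    """Hypothetical weak learner: the decision stump with the lowest
    weighted error. Returns (feature index, threshold, polarity)."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = np.sum(w[pred != y])
                if err < best_err:
                    best, best_err = (j, thr, pol), err
    return best

def stump_predict(model, X):
    j, thr, pol = model
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def adaboost_train(X, y, T):
    n = X.shape[0]
    w = np.full(n, 1.0 / n)                    # uniform initial sample weights
    ensemble = []
    for _ in range(T):
        model = stump_train(X, y, w)           # weak learner receives the weights
        pred = stump_predict(model, X)
        err = np.sum(w[pred != y])             # weighted training error
        err = np.clip(err, 1e-10, 1 - 1e-10)   # avoid division by zero / log(0)
        alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this learner
        w = w * np.exp(-alpha * y * pred)      # up-weight misclassified samples
        w = w / w.sum()                        # renormalise to a distribution
        ensemble.append((alpha, model))        # store the vote weight with the model
    return ensemble

def adaboost_predict(ensemble, X):
    score = sum(alpha * stump_predict(m, X) for alpha, m in ensemble)
    return np.where(score >= 0, 1, -1)
```

In this sketch the weak learner takes the data, the labels, and the current sample weights, and the ensemble stores each learner together with its vote weight `alpha`, since both are needed at prediction time.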

 

Question 2

Suppose we have two kernel functions $k_1$ and $k_2$ such that there are two implicit high-dimensional feature maps $\phi_1$ and $\phi_2$ that satisfy $k_j(x, x') = \langle \phi_j(x), \phi_j(x') \rangle$ for $j = 1, 2$, where $\langle \cdot, \cdot \rangle$ is the dot product (a.k.a. inner product) in the D-dimensional space.

$[\phi_j(x)]_i$ denotes the i-th dimension of the j-th mapped feature.

Is the product of the two kernel functions, that is, $k(x, x') = k_1(x, x') \, k_2(x, x')$, still a valid kernel function? If yes, prove it. If no, please explain why. (You can attach images for your derivation.)
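As a numerical sanity check rather than a proof, the sketch below uses two small, explicitly chosen feature maps and checks that the product of the two kernels equals an inner product under the feature map made of all pairwise products of their coordinates. The maps `phi1` and `phi2` are hypothetical examples chosen for illustration and do not come from the assignment.

```python
import numpy as np

def phi1(x):
    # Hypothetical feature map 1: identity (gives the linear kernel).
    return x

def phi2(x):
    # Hypothetical feature map 2: the features and their squares.
    return np.concatenate([x, x ** 2])

def k1(x, y):
    return phi1(x) @ phi1(y)

def k2(x, y):
    return phi2(x) @ phi2(y)

def phi_prod(x):
    # All pairwise products phi1(x)[i] * phi2(x)[j], flattened into one vector.
    return np.outer(phi1(x), phi2(x)).ravel()

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)

lhs = k1(x, y) * k2(x, y)        # product of the two kernels
rhs = phi_prod(x) @ phi_prod(y)  # inner product in the product feature space
print(np.isclose(lhs, rhs))
```

The two sides agree because the double sum over pairwise products factorises into the product of the two individual inner products.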

 

Question 3

True

False

Assume that the weak learners are a finite set of linear classifiers. AdaBoost cannot achieve zero training error if the training data is not linearly separable.

 

Question 4

True

False

Random forest uses a different subset of the training data to build each decision tree in the ensemble.

 

Question 5

True

False

AdaBoost is an ensemble method; it can be used to boost the performance of any classifier.

 

Question 6

True

False

Assume that the weak learners are a finite set of decision stumps. Subtracting a constant vector, say [1, 0.5, 3, …], from all feature vectors will not impact the predictive accuracy on the test set.

Question 7

The above equation shows a variant of ridge regression with a bias term $b$. Please show how to calculate $w$ and $b$, where $x_i$ is the i-th data sample, $y_i$ is the i-th target value, and $w$ denotes the model parameters.
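The equation this question refers to is not shown here, so the following is only a sketch under an assumed objective: minimise $\sum_i (w^\top x_i + b - y_i)^2 + \lambda \|w\|^2$ over $w$ and $b$, with the bias left unregularised. Under that assumption, setting the gradient with respect to $b$ to zero gives $b = \bar{y} - w^\top \bar{x}$, and $w$ then solves a standard ridge problem on mean-centred data. The function name `ridge_with_bias` and the symbol `lam` are hypothetical.

```python
import numpy as np

def ridge_with_bias(X, y, lam):
    # Assumed objective: sum_i (w @ x_i + b - y_i)**2 + lam * ||w||**2,
    # where the bias b is NOT regularised.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                   # centre the features
    yc = y - y_mean                   # centre the targets
    d = X.shape[1]
    # Ridge solution on the centred data.
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(d), Xc.T @ yc)
    # The optimal bias makes the mean residual zero.
    b = y_mean - x_mean @ w
    return w, b
```

If the original objective regularises the bias as well, or scales the loss differently, the closed form changes accordingly.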

Question 8

We use the following convolutional neural network to classify a set of 32$\times$32 color images, that is, the input size is 32$\times$32$\times$3:

1) Layer 1: convolutional layer with the ReLU nonlinear activation function, 100 5$\times$5 filters with stride 2.

2) Layer 2: 2$\times$2 max-pooling layer

3) Layer 3: convolutional layer with the ReLU nonlinear activation function, 50 3$\times$3 filters with stride 1.

4) Layer 4: 2$\times$2 max-pooling layer

5) Layer 5: fully-connected layer

6) Layer 6: classification layer

How many parameters are in the first layer (1 point), the second layer (1 point) and the third layer (assume bias term is used) (1 point)?
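The arithmetic behind such a count can be sketched as follows, assuming every convolutional filter spans all input channels and has one bias term. The helper `conv_params` is a hypothetical name used for illustration.

```python
def conv_params(kernel_size, in_channels, n_filters, bias=True):
    # Each filter has kernel_size * kernel_size * in_channels weights,
    # plus one bias term if bias is used.
    per_filter = kernel_size * kernel_size * in_channels + (1 if bias else 0)
    return per_filter * n_filters

layer1 = conv_params(5, 3, 100)    # 5x5 filters over 3 input channels
layer2 = 0                         # max-pooling has no learnable parameters
layer3 = conv_params(3, 100, 50)   # 3x3 filters over 100 input channels
print(layer1, layer2, layer3)      # 7600 0 45050
```

Stride and pooling change the spatial size of the feature maps but not the number of convolutional parameters; the number of input channels to Layer 3 equals the number of filters in Layer 1.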
