## Overview

Submit your writeup, including any code, as a PDF via Gradescope.

We recommend reading through the entire homework beforehand and carefully using functions for testing procedures, plotting, and running experiments. Taking the time to reuse code will help in the long run!

Data science is a collaborative activity. While you may talk with others about the homework, please write up your solutions individually. If you discuss the homework with your peers, include their names on your submission. Please make sure any handwritten answers are legible, as we may deduct points otherwise.

### 1. Bayesian fidget spinners

Nat’s company manufactures fidget spinners. The company uses two factories, which we’ll call factory 0 and factory 1. Each fidget spinner from factory $k$ is defective with probability $q_k$ ($k \in \{0, 1\}$). Nat knows that factory 0 produces fewer defective fidget spinners than factory 1 (in other words, $q_0 < q_1$).

She receives $n$ boxes full of fidget spinners, but the boxes aren’t labeled (in other words, she doesn’t know which box is from which factory). For each box, she starts randomly pulling out fidget spinners until she finds a defective one, and records how many fidget spinners she pulled out (including the defective one). She calls this number $x_i$ for box $i$, for $i = 1, \ldots, n$.
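The data collection described above can be simulated to build intuition. The following is a minimal sketch, assuming each pull is an independent draw, so that the count up to and including the first defective spinner is geometric with success probability $q_k$. The function name `simulate_boxes`, the parameter `p_factory1` (the chance that a box comes from factory 1), and all numeric values are hypothetical and not part of the assignment.

```python
import numpy as np

def simulate_boxes(n, q0, q1, p_factory1, seed=0):
    """Simulate hidden factory labels z and recorded counts x for n boxes.

    q0, q1: per-spinner defect probabilities (illustrative values only).
    p_factory1: assumed probability that a box comes from factory 1.
    """
    rng = np.random.default_rng(seed)
    z = rng.binomial(1, p_factory1, size=n)   # hidden factory label for each box
    q = np.where(z == 1, q1, q0)              # defect probability for each box
    # Number of pulls up to and including the first defective spinner (>= 1)
    x = rng.geometric(q)
    return z, x

# Hypothetical settings: factory 0 is more reliable than factory 1
z, x = simulate_boxes(n=5, q0=0.05, q1=0.3, p_factory1=0.5)
print("labels:", z)
print("counts:", x)
```

Under these assumptions, boxes from factory 0 tend to produce larger counts, since a defective spinner takes longer to appear.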

She wants to estimate the following pieces of information:

• Which boxes came from factory 0, and which came from factory 1? She defines a binary random variable $z_i$ for each box with the factory label (i.e., $z_i = 0$ if box $i$ is from factory 0, and $z_i = 1$ if box $i$ is from factory 1).

• How reliable is each factory? In other words, what are $q_0$ and $q_1$?
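To make the first question concrete, consider the simplified case where $q_0$ and $q_1$ are already known. Bayes' rule then gives the probability that a box came from factory 1 given its count. The sketch below assumes a geometric likelihood, $q_k (1 - q_k)^{x_i - 1}$, and a fixed prior probability for factory 1. The names `posterior_factory1` and `p_factory1` and the numbers used are illustrative assumptions, and this is not the full model, in which $q_0$ and $q_1$ are unknown.

```python
import numpy as np

def posterior_factory1(x, q0, q1, p_factory1):
    """P(z_i = 1 | x_i) when q0, q1 and the prior p_factory1 are treated as known."""
    x = np.asarray(x, dtype=float)
    like0 = q0 * (1 - q0) ** (x - 1)   # geometric likelihood under factory 0
    like1 = q1 * (1 - q1) ** (x - 1)   # geometric likelihood under factory 1
    num = p_factory1 * like1
    return num / (num + (1 - p_factory1) * like0)

# Hypothetical values: a short count points to the less reliable factory 1
probs = posterior_factory1([1, 5, 20], q0=0.1, q1=0.5, p_factory1=0.5)
print(probs)
```

With these illustrative values the probability of factory 1 falls as the count grows, matching the intuition that a long run without defects suggests the more reliable factory.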

Inspired by what she learned about Gaussian mixture models, she sets up the following probability model:

(a) Draw a graphical model for the probability model described above if $n = 2$ (i.e., there are only two boxes of fidget spinners).

Nat decides to implement the model above setting the following hyperparameters:

(b) Which one of the following explains why Nat chose this value of :

(i) Factory 0 produces more boxes than factory 1

(ii) Factory 0 produces fewer boxes than factory 1

(iii) Factory 0 is better (i.e., it is less likely to produce defective fidget spinners)

(iv) Factory 0 is worse (i.e., it is more likely to produce defective fidget spinners)

(c) Which one of the following explains why Nat chose these values of a and b?

(i) Factory 0 produces more boxes than factory 1

(ii) Factory 0 produces fewer boxes than factory 1

(iii) Factory 0 is better (i.e., it is less likely to produce defective fidget spinners)

(iv) Factory 0 is worse (i.e., it is more likely to produce defective fidget spinners)