
CS126 Naive Bayes Week 2




  • Implement the classification portion of Naive Bayes
  • Connect your implementation of Naive Bayes to a sketchpad implemented in Cinder



In this assignment you will refactor and improve your code based on feedback given to you in code review. You will implement the classification portion of this assignment on top of your existing code for training, loading, and saving a model.

Next, you will install Cinder and learn how to use this library. We have provided an implementation of a sketchpad using Cinder, and we expect you to fully understand our code.

Next, you will connect our sketchpad to your implementation of Naive Bayes such that a number from 0 – 9 drawn using the sketchpad can be classified.



You will need to submit your GitHub repository to Gradescope. There is a linter which is run on each submission; the results of the linter will be viewable by your code moderator when grading.


Getting Started

You will develop this assignment in the same repository as last week, using your existing code. This assignment will use the Cinder framework — you should have set this up in your Cinder directory, but if not, make sure to do so before starting this week’s assignment. Although we will be providing you Cinder starter code, we suggest taking a look at this Cinder tutorial and the Cinder documentation if you need a refresher.

You should make a branch called week2 on top of your week1 branch (ex: git checkout -b week2 week1) and only commit to this branch. Nothing should be committed to master. When you’re finished with week2, you can open a pull request to request that your code be merged into master, and this pull request will be reviewed by your code moderator.


The Data Files

For this assignment you will need a set of pre-labeled data files that we will use for validating data, which you can download here.

The .zip file contains a few text files:

  • testimagesandlabels: 1000 images for validation/testing and their labels (in the same format as week one)

Like last week, we expect you not to commit these files to your git repository. Make sure they are excluded by your .gitignore. If you accidentally committed them to your repo, you should remove them with git rm -r --cached <filename>.

The specification of the content inside the images and labels files is the same as the one provided last week.


Part 0 – Testing

Similar to last week, you should test the mathematical correctness of your classifier by using a small dataset where you can cross-check the answers by hand. This includes the likelihood scores and the actual prediction made by the classifier.

Remember that we want to test for mathematical correctness; for example, the following test is not sufficient:

REQUIRE(accuracy >= 0.7);

Testing whether the accuracy of your classifier is over 70% works as a sanity check, but you need to test whether the math behind it is correct — for example, there might be a bug that mixes up the classifications for classes 0 and 9 but classifies everything else correctly.
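As an illustration of a hand-checkable test, here is a minimal sketch. It assumes the likelihood score of a class is the log of its prior plus the logs of the per-pixel probabilities; the linked math document is the authority on the exact formula. The names LogLikelihoodScore and NearlyEqual are hypothetical and are not part of any starter code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: log-likelihood score of one class, given its prior and
// the probability the model assigns to each observed pixel value.
double LogLikelihoodScore(double prior, const std::vector<double>& pixel_probs) {
  assert(prior > 0.0);
  double score = std::log(prior);
  for (double p : pixel_probs) {
    assert(p > 0.0);  // smoothing should keep every probability above zero
    score += std::log(p);
  }
  return score;
}

// Floating-point results rarely match exactly, so compare within a tolerance.
bool NearlyEqual(double a, double b, double tolerance = 1e-9) {
  return std::fabs(a - b) <= tolerance;
}
```

With a prior of 0.5 and pixel probabilities 0.8 and 0.25, the product worked out by hand is 0.5 × 0.8 × 0.25 = 0.1, so the score should equal log(0.1). In a Catch2 test this check could be written as REQUIRE(NearlyEqual(LogLikelihoodScore(0.5, {0.8, 0.25}), std::log(0.1))).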


Part 1 – Improving your Code

Review the feedback on your code for the previous week in Gradescope and implement the changes that your moderator suggested. You will be evaluated on improvement this week, so make sure to click through each section of the Gradescope rubric to see comments specific to each topic and to implement verbal feedback from code review.


Part 2 – Classification

For the math behind this portion of the assignment, please refer to this document.

Deliverables (the deliverables for this part of the assignment are also listed in the blue box at the end of section 2.3 in the above document):

  • Given a trained model and a new image that doesn’t belong to the training dataset, you should be able to calculate the “likelihood scores” for each of the digits 0-9.
  • You should be able to determine which digit has the highest likelihood score, and classify the image as that digit.
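The two deliverables above might be sketched as follows. This is only an outline under stated assumptions: pixels are treated as binary (shaded or unshaded), scores are computed in log space, and the Model struct with its priors and shaded_probs fields is a hypothetical layout, not the one your week 1 code necessarily uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical model layout, for illustration only:
//   priors[c]          = P(class = c)
//   shaded_probs[c][i] = P(pixel i is shaded | class = c)
struct Model {
  std::vector<double> priors;
  std::vector<std::vector<double>> shaded_probs;
};

// Returns one log-likelihood score per class for a flattened binary image
// (true means the pixel is shaded).
std::vector<double> LikelihoodScores(const Model& model,
                                     const std::vector<bool>& pixels) {
  std::vector<double> scores;
  for (std::size_t c = 0; c < model.priors.size(); ++c) {
    assert(model.shaded_probs[c].size() == pixels.size());
    double score = std::log(model.priors[c]);
    for (std::size_t i = 0; i < pixels.size(); ++i) {
      double p = model.shaded_probs[c][i];
      score += std::log(pixels[i] ? p : 1.0 - p);
    }
    scores.push_back(score);
  }
  return scores;
}

// Returns the index of the class with the highest score. On a tie, the
// lowest class index wins because std::max_element returns the first maximum.
std::size_t Classify(const Model& model, const std::vector<bool>& pixels) {
  std::vector<double> scores = LikelihoodScores(model, pixels);
  assert(!scores.empty());
  auto best = std::max_element(scores.begin(), scores.end());
  return static_cast<std::size_t>(std::distance(scores.begin(), best));
}
```

Summing logarithms instead of multiplying raw probabilities avoids floating-point underflow when an image has hundreds of pixels, and it does not change which class has the highest score.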


Part 3 – Validation

What good is a classifier if you don’t know how accurate it is? We’ve given you a set of images and labels inside the .zip file which you should use to validate your model’s accuracy. This data follows the same format as the training images and labels described in week 1’s documentation, and you should be able to parse it in a similar manner. Remember that we do not want to test our model on our training images and labels, which is why we provided you with separate data for testing.

You should incorporate the following functionality into your existing project:

  • classify each of the images in testimagesandlabels
  • compare the result of your classifier to the actual labels given in the same file
  • print out the accuracy of your Naive Bayes classifier
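The comparison step can be kept separate from file parsing and printing, which also makes it easy to unit test. A minimal sketch, assuming labels and predictions are stored as integers in parallel vectors (ComputeAccuracy is a hypothetical name):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of predictions that match the true labels, in the range [0, 1].
// Returns 0 for an empty dataset rather than dividing by zero.
double ComputeAccuracy(const std::vector<int>& predicted,
                       const std::vector<int>& actual) {
  assert(predicted.size() == actual.size());
  if (actual.empty()) {
    return 0.0;
  }
  std::size_t correct = 0;
  for (std::size_t i = 0; i < actual.size(); ++i) {
    if (predicted[i] == actual[i]) {
      ++correct;
    }
  }
  return static_cast<double>(correct) / static_cast<double>(actual.size());
}
```

For example, predictions {1, 2, 3, 4} against labels {1, 2, 0, 4} contain three matches out of four, giving an accuracy of 0.75.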

Remember to make your code flexible, so it should be easy to change/modify the following:

  • Decide whether to save/load the model from a file, or to train a new model
  • Differentiate training the model from testing the model (classification)
  • Change the filenames corresponding to the files containing test images and labels


Part 4 – Visualization

Finally, you will want to see your classifier in action as it performs real-time classification of sketches using Cinder. We have provided you with some starter code, but it is your job to fill in the blanks and get the application up and running. When you are finished, you should be able to draw an image in the sketchpad and classify it by pressing the enter key. You can clear the drawn image by pressing the delete key (or FN-Delete if you have a Mac). Don’t worry if the sketchpad doesn’t classify all the sketches correctly. After all, Naive Bayes is a pretty naive model that makes some sketch-y assumptions. If you aren’t happy with the performance, it might be a cool final project idea to implement a more sophisticated machine learning algorithm!


Command line arguments – More EC

Last week, you may have implemented functionality to parse command line arguments for extra credit. This week, you can extend that functionality to allow the user to choose whether to test their model.

Note that this means that the command line parsing you implemented last week should still work and that you must consider logical combinations of the options provided: for example, you would need to handle the following cases (note that this isn’t a complete list of all the cases you must handle):

  • Allow the user to train the model only (logically, they would need to then save the model to a file or there wouldn’t be any point in training the model — but we will leave how to handle cases like this to you)
  • Allow the user to test a model they’ve loaded in
  • Allow the user to train a model and test it
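One way to handle combinations like these is to parse the arguments into a small struct first and validate afterwards. The sketch below is only an assumption about how this could look: the flag names --train, --load, --save, and --test are hypothetical, and the rule that exactly one of --train or --load must be given is one possible policy, not a requirement of the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical options; an empty string means the flag was not given.
struct Options {
  std::string train_path;  // --train <file>: train a new model from this data
  std::string load_path;   // --load <file>: load a previously saved model
  std::string save_path;   // --save <file>: save the model after training
  std::string test_path;   // --test <file>: validate the model on this data
};

// Parses "--flag value" pairs. Throws std::invalid_argument on bad input.
Options ParseOptions(const std::vector<std::string>& args) {
  Options options;
  for (std::size_t i = 0; i < args.size(); i += 2) {
    std::string* target = nullptr;
    if (args[i] == "--train") {
      target = &options.train_path;
    } else if (args[i] == "--load") {
      target = &options.load_path;
    } else if (args[i] == "--save") {
      target = &options.save_path;
    } else if (args[i] == "--test") {
      target = &options.test_path;
    } else {
      throw std::invalid_argument("unknown option: " + args[i]);
    }
    if (i + 1 >= args.size()) {
      throw std::invalid_argument("missing value for " + args[i]);
    }
    assert(target != nullptr);
    *target = args[i + 1];
  }
  // One possible policy: the model comes from exactly one source.
  if (options.train_path.empty() == options.load_path.empty()) {
    throw std::invalid_argument("pass exactly one of --train or --load");
  }
  return options;
}
```

In main, the arguments could be passed as std::vector<std::string>(argv + 1, argv + argc). Keeping validation in one place makes it easier to decide later how to treat cases such as training without saving.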


Other Extra Credit Opportunities

Important: you should focus on finishing your assignment before working on any of the suggested extra credit features. Furthermore, try to gauge the amount of time you have left: if you start an extra credit feature, you should finish it. Any non-trivial enhancements and/or additional classification algorithms will be awarded extra credit. Here are some ideas that might interest you (if you already implemented one of these last week, you will have to implement something else):

  • k-Nearest Neighbors
  • Gaussian Naive Bayes
  • Decision Trees
  • Voting/Boosting
  • Artificial Neural Networks (difficult!)
  • Output a confusion matrix generated by your classifier (simple!)

Feel free to use Machine Learning libraries for the extra credit portion.
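For the confusion matrix idea, a minimal sketch is shown below. It assumes labels are integers in the range 0 to num_classes - 1 and that rows index the actual class while columns index the predicted class; BuildConfusionMatrix is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a num_classes x num_classes table where matrix[a][p] counts the
// images whose actual label is a and whose predicted label is p.
std::vector<std::vector<int>> BuildConfusionMatrix(
    const std::vector<int>& actual, const std::vector<int>& predicted,
    int num_classes) {
  assert(num_classes > 0);
  assert(actual.size() == predicted.size());
  const std::size_t n = static_cast<std::size_t>(num_classes);
  std::vector<std::vector<int>> matrix(n, std::vector<int>(n, 0));
  for (std::size_t i = 0; i < actual.size(); ++i) {
    assert(actual[i] >= 0 && actual[i] < num_classes);
    assert(predicted[i] >= 0 && predicted[i] < num_classes);
    ++matrix[static_cast<std::size_t>(actual[i])]
            [static_cast<std::size_t>(predicted[i])];
  }
  return matrix;
}
```

Correct classifications land on the diagonal, so a bug such as swapping classes 0 and 9 would show up as large off-diagonal counts at [0][9] and [9][0].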


Grading and Deliverables

  • Improve your code by reading and implementing the changes your moderator suggested. Your implementation must still support all the features from last week’s assignment (proper use of operator overloading, training the model accurately, etc)
  • Classify images from a file based on your model.
  • Use Cinder to create a visual representation of your classifier. Note that you’ll need to adapt the starter code we’ve provided.
  • You must test your classifier for correctness using unit tests (Does your classifier hit a certain percentage of accuracy? Does it behave as expected for small sets of test data? Does it work for different image sizes, given training and testing on the same image size? Is the math behind each step of classification correct?)
  • Lastly, you must follow the Google C++ Style Guide with regard to naming and whitespace.



It might help to review the workshops conducted on this assignment to become familiar with the Cinder framework. Also, be sure to review the documents hyperlinked in the documentation; they contain important information that will be required for the assignment.


Assignment Rubric

This rubric is not a comprehensive checklist. Please make sure to go over the feedback you received on your previous MPs, and ask your moderator or post on Campuswire if you have any questions. Similar to API Adventures, we expect you to take your feedback from Naive Bayes Part 1 into account; you will lose points for not changing your code in accordance with your moderator’s feedback.