This assignment gives you an opportunity to demonstrate how you can use what you have learned from lectures and tutorials to develop computer vision algorithms for real-world application problems. To complete this assignment successfully, you need to understand the fundamental algorithms covered in the lectures, conduct some research into the machine perception problem to design suitable features and classification methods, and use the skills that you have developed through the practical exercises to build the essential components of a computer vision algorithm.
External code may be used in your assignment, but the sources must be cited in both your written report and your source code. Feel free to use the work you have done in your practical exercises.
A substantial attempt at this assignment is required to pass this unit. A mark of 20% or more is considered a substantial attempt. This means you will not pass this unit if your total mark for this assignment is lower than 20%, even if you achieve full marks in your mid-semester test and final exam.
In this assignment, you will develop two computer vision programs: one for the detection of building signage and the extraction of its digits (for an example, see Figure 1) and the other for coral image classification (for an example, see Figure 2). A typical machine perception approach to solving the first problem would consist of at least the following steps:
- Read an input image.
- Perform necessary image-processing operations.
- Detect and localize the building signage.
- Segment the numbers into individual digits.
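The last step above, segmenting the number into individual digits, can be illustrated with a minimal sketch. It assumes the signage has already been cropped and binarised (1 for digit pixels, 0 for background) and that the digits are separated by at least one empty column. The function name `split_digits_by_projection` and the toy image are hypothetical; this is one simple option (a vertical projection profile), not a required method.

```python
def split_digits_by_projection(binary, min_width=1):
    """Split a binary image (rows of 0/1) into column ranges that contain ink.

    Returns a list of (start_col, end_col) pairs, with end_col exclusive.
    """
    if not binary:
        return []
    n_cols = len(binary[0])
    # Column profile: number of foreground pixels in each column.
    profile = [sum(row[c] for row in binary) for c in range(n_cols)]

    segments = []
    start = None
    for c, count in enumerate(profile):
        if count > 0 and start is None:
            start = c  # entering a run of inked columns
        elif count == 0 and start is not None:
            if c - start >= min_width:
                segments.append((start, c))
            start = None  # leaving the run
    if start is not None and n_cols - start >= min_width:
        segments.append((start, n_cols))  # run touches the right border
    return segments


# Toy 3x9 "sign" with three blobs separated by empty columns (made-up data).
toy = [
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 0, 0, 0, 1],
]
digit_columns = split_digits_by_projection(toy)
```

Each returned column range can then be turned into a bounding box for one digit. Real signage images may contain touching digits or noise, in which case a connected-component or contour-based method may be more suitable.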
For the second task, the machine perception approach would include the following steps:
- Read an input image.
- Preprocess the image.
- Learn or extract image features.
- Select a suitable classification method, with proper model selection, to classify the image into the positive or negative category.
You will implement a suitable machine perception pipeline to perform the above steps by writing Jupyter notebook programs. Your implementation will be evaluated primarily on the following two tasks:
- Task 1: Building signage detection and extraction.
- Task 2: Coral image classification.
It is expected that you will make use of the computer vision algorithms discussed in the lectures and the skills acquired through the practical exercises to complete the required tasks. You will also need to conduct your own research to understand different approaches to solving each task and decide on your own implementation and/or invention according to the specific settings of this assignment.
3 The Tasks
3.1 Task 1: Building signage detection and digit extraction (50 marks)
Develop a program that reads in colour images from a specified directory. For each image, detect the building number area and extract the digits, and finally output the images with bounding boxes for the building number area and for the individual digits. Your program is considered working if
- The detected area meets the following criteria:
– It must be a rectangle not exceeding the maximum allowable size, which is specified as follows:
∗ The IoU (i.e., intersection over union) is above 50%.
– It must contain all the digits of the building number.
- All three digits are extracted.
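The IoU criterion above can be checked with a short function. This is a minimal sketch that assumes both the detected area and a reference (ground-truth) area are axis-aligned rectangles given as `(x_min, y_min, x_max, y_max)`; the box format, the function name `iou`, and the example coordinates are assumptions for illustration, not part of the specification.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up boxes: a detection shifted 10 pixels to the left of the reference.
predicted = (10, 10, 60, 40)
ground_truth = (20, 10, 70, 40)
score = iou(predicted, ground_truth)
passes = score > 0.5  # the 50% threshold stated above
```

Because the union grows with the detected area, a box that is much larger than the reference area scores a low IoU even if it contains every digit, which is how the IoU threshold limits the allowable size.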
An illustrative example for building number detection and digit extraction is provided in Figure 1.
Note that in this task, each test image will contain only one building number. Thus, your program must not report more than one detected area. Otherwise, it will be considered a failed detection.
3.2 Task 2: Coral image classification (50 marks)
In this task, you will develop machine learning programs to train and test image classification algorithms that separate coral images from non-coral images. You can use traditional machine learning methods (e.g., nearest neighbor methods, support vector machines) or modern deep learning methods. However, you are required to use at least two different methods and compare their performance. You are also required to implement proper model selection methods to select the hyper-parameters used in training the models.
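As an illustration of hyper-parameter selection, the sketch below picks the number of neighbours for a nearest neighbor classifier by cross-validation. It is written in plain Python so that it is self-contained; the feature vectors, the candidate values, the interleaved fold assignment, and all function names are assumptions made for this example. It covers only one method, so the comparison between two methods still has to be done separately.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to query."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(xs, ys, k, n_folds=5):
    """Mean validation accuracy of k-NN over n_folds interleaved folds."""
    scores = []
    for fold in range(n_folds):
        val_idx = set(range(fold, len(xs), n_folds))
        tr_x = [x for i, x in enumerate(xs) if i not in val_idx]
        tr_y = [y for i, y in enumerate(ys) if i not in val_idx]
        correct = sum(
            knn_predict(tr_x, tr_y, xs[i], k) == ys[i] for i in val_idx
        )
        scores.append(correct / len(val_idx))
    return sum(scores) / n_folds


def select_k(xs, ys, candidates=(1, 3, 5)):
    """Return the candidate k with the best cross-validated accuracy."""
    results = {k: cross_val_accuracy(xs, ys, k) for k in candidates}
    best_k = max(results, key=results.get)
    return best_k, results


# Made-up 2-D features: two well-separated clusters standing in for
# non-coral (label 0) and coral (label 1) images.
xs = [(i * 0.1, 0.0) for i in range(10)] + [(5 + i * 0.1, 1.0) for i in range(10)]
ys = [0] * 10 + [1] * 10
best_k, results = select_k(xs, ys)
```

The selected value is then used to retrain on the full training set before evaluating once on the testing images. The same loop can wrap any other classifier and hyper-parameter grid.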
3.3 Training Data
The training images are provided. The testing images will be provided a week before the due date.
4 Specifications and Marking Guide
4.1 Report: 50%
A written report must be submitted, in PDF format, to Blackboard by the due date. This submission must contain:
- A completed assignment cover sheet
- Printout of your source code
- A document that includes:
– A statement of how much of the assignment you have attempted.
– The details of your implementation for each task: this must clearly indicate your approach, the features you extract, and the methods you use for model selection. It must allow the marker to understand how you approach the machine learning tasks. If a validation dataset is not available, you are expected to split the training dataset into two subsets: one for training and the other for model selection.
– The performance of your program on the testing dataset.
– Supporting diagrams, figures, and tables that help describe your programs and their performance clearly.
– References that your implementation is based on or inspired by.
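The split of the training dataset into a training subset and a model-selection subset, mentioned in the list above, can be sketched as follows. The 80/20 ratio, the fixed seed, the class names, and the function name `stratified_split` are assumptions chosen for illustration; the split is stratified so that both subsets keep roughly the same class proportions.

```python
import random


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_indices, val_indices), preserving class proportions."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one validation sample per class.
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Made-up label list: 50 coral and 30 non-coral training images.
labels = ["coral"] * 50 + ["non-coral"] * 30
train_idx, val_idx = stratified_split(labels)
```

The returned index lists can be used to select image files or feature rows. Reporting the seed and the split ratio in the report makes the model selection procedure reproducible for the marker.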
Your report will be marked based on: 1) clarity and presentation (20%); 2) the description of your implementation and the justification of your design (40%); and 3) experimental results on the validation data and testing data, and discussion (40%).
4.2 Implementation: 50%
Your implementation will be marked based on the quality of your code (30%) and on whether your Jupyter Notebook programs run in Google Colab or with the d2l package provided by the textbook and produce reasonable performance (70%). Your code is expected to be well written, well structured, and commented.