Due: 9/28/15 at 11:59PM Eastern Time

Objective: In class, we learned about the Viola-Jones face detector, a seminal algorithm in computer vision. Here you will have a chance to work with the Viola-Jones detector first-hand by exploring its strengths and weaknesses on a well-known benchmark data set and on a data set you will collect on your own. This homework will help you develop your biometric evaluation skills by analyzing the error rates associated with the detector output. After completing this assignment, you will have enough experience to develop new apps with face detection capabilities using the popular OpenCV library.

Grading: You will be graded based on the code you develop, plus your answers to the following questions. You will not be graded on the performance of your face detector per se, but rather on the analysis of its capabilities. This assignment is worth 175 points.

Step 1: Download and install the OpenCV library. You may do this by compiling from source, or by installing it via a package manager. OpenCV supports Linux, OS X, and Windows. You are free to develop under any of these environments. OpenCV is the most popular open source computer vision and machine learning library, with many useful algorithms for visual recognition and 3D reconstruction. Here we will focus on its face detection capabilities.

Step 2: Download the Face Detection Data Set and Benchmark (FDDB) images and the associated face annotations. To gain a sense of the design and usage of FDDB, it's a good idea to read the technical report released by the UMass Computer Vision Laboratory. The data set is organized as a 10-fold cross-validation test, with training on nine folds and testing on one fold during each iteration. The individual images for each testing fold are listed in the corresponding FDDB-folds/FDDB-fold-*.txt file.
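To make the fold files concrete, here is a minimal sketch of turning the lines of one fold file into image paths. It assumes each line holds an image name without a file extension (for example 2002/08/11/big/img_591) and that the images are JPEGs under a root directory; the function name, the `image_root` argument, and the ".jpg" default are illustrative choices, so check them against your own download.

```python
from pathlib import Path


def parse_fold_lines(lines, image_root, extension=".jpg"):
    """Map the lines of one FDDB fold file to (image_name, image_path) pairs.

    Assumption: every non-blank line is an image name without an extension.
    """
    pairs = []
    for line in lines:
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        pairs.append((name, Path(image_root) / (name + extension)))
    return pairs
```

In practice you would pass an open file handle, e.g. `parse_fold_lines(open(fold_path), image_root)`, where both paths point into your local copy of the data set.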

Next, download and compile the companion scoring program. You will use the "evaluation" program to generate raw data for the ROC curves you will use for analysis. This program expects annotations and detection results in a particular format described in the FDDB data set README.
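As a rough illustration of the detection output, the sketch below formats the results for one image. It assumes the rectangular-region layout as I read it in the FDDB README: the image name, then the number of detections, then one line per detection with left x, top y, width, height, and score. Verify this against the README in your copy of the data set; the function name and the four-decimal score format are my own choices.

```python
def format_fddb_entry(image_name, detections):
    """Format the detections for one image as an FDDB-style text block.

    `detections` is a list of (x, y, width, height, score) tuples.
    Assumed layout: name line, count line, then one line per rectangle.
    """
    lines = [image_name, str(len(detections))]
    for x, y, w, h, score in detections:
        lines.append(f"{x} {y} {w} {h} {score:.4f}")
    return "\n".join(lines) + "\n"
```

Writing one such block per image, in the same order as the fold file, yields the detection file for that fold. An image with no detections still gets its name line followed by a count of 0.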

Step 3: Using OpenCV, write a face detector that is able to take RGB images as input and output bounding box coordinates and detection scores in a format that can be parsed by the FDDB evaluation software. Your detector should have an optional mode to output augmented versions of the input images with bounding boxes and scores surrounding the detected faces, as in the image above. You need only support rectangular detection regions. The OpenCV documentation will serve as a good guide to help you write your detector. The OpenCV Cascade Classifier has different parameter settings and various cascades (these are described in the OpenCV documentation). Make your implementation flexible enough so that the user can easily adjust the detector’s parameters and choice of cascade(s), but fall back to default values when they are not specified. You are free to use C++ or Python (explain why you chose one language over the other). Document your code by adding comments explaining its usage and the default parameter and cascade settings.
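One possible shape for such a detector in Python is sketched below; it is a starting point under stated assumptions, not a reference solution. The option names, the cascade file name used as a default, and the 30-pixel minimum size are illustrative choices. The scale factor and neighbor defaults mirror the values documented for OpenCV's detectMultiScale. Scoring is left out here (see Tip #1), and only boxes are drawn.

```python
import argparse

# Assumed default: a frontal-face Haar cascade file shipped with OpenCV.
DEFAULT_CASCADE = "haarcascade_frontalface_default.xml"


def build_parser():
    """Command-line options with defaults for every detector parameter."""
    parser = argparse.ArgumentParser(description="Viola-Jones face detector (sketch)")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument("--cascade", default=DEFAULT_CASCADE,
                        help="path to a cascade XML file")
    parser.add_argument("--scale-factor", type=float, default=1.1,
                        help="image pyramid scale step")
    parser.add_argument("--min-neighbors", type=int, default=3,
                        help="raw detections required to keep a box")
    parser.add_argument("--min-size", type=int, default=30,
                        help="smallest face side in pixels (illustrative default)")
    parser.add_argument("--draw", metavar="OUT", default=None,
                        help="optional path for an annotated output image")
    return parser


def detect_faces(args):
    """Run the cascade on one image and return a list of (x, y, w, h) boxes."""
    import cv2  # imported here so option parsing works without OpenCV installed

    cascade = cv2.CascadeClassifier(args.cascade)
    if cascade.empty():
        raise IOError("could not load cascade: " + args.cascade)
    image = cv2.imread(args.image)
    if image is None:
        raise IOError("could not read image: " + args.image)

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    found = cascade.detectMultiScale(
        gray,
        scaleFactor=args.scale_factor,
        minNeighbors=args.min_neighbors,
        minSize=(args.min_size, args.min_size),
    )
    boxes = [tuple(int(v) for v in box) for box in found]

    if args.draw:
        for x, y, w, h in boxes:
            cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imwrite(args.draw, image)
    return boxes
```

A call such as `detect_faces(build_parser().parse_args(["photo.jpg", "--min-neighbors", "5", "--draw", "out.jpg"]))` would override one default and write an annotated image; the file names are placeholders.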

Tip #1: You may be asking yourself: "how do I score a candidate window?" There is more than one way to do this, and you are free to use any published strategies for candidate window score calculations. You are also welcome to design your own method. A good scoring strategy should reflect the quality of a bounding box. You will receive full credit in either case if the scoring method works.
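As one example of what a score could look like (not the required method), the sketch below counts how many raw candidate windows support a final detection. It assumes you have access to the ungrouped candidates, for instance from a second detector pass with grouping turned off; how you obtain them, the helper names, and the center-inside test are all illustrative assumptions.

```python
def center_inside(candidate, box):
    """True if the center of `candidate` lies within `box`; both are (x, y, w, h)."""
    cx = candidate[0] + candidate[2] / 2.0
    cy = candidate[1] + candidate[3] / 2.0
    x, y, w, h = box
    return x <= cx <= x + w and y <= cy <= y + h


def neighbor_count_score(box, raw_candidates):
    """Score a final box by the number of raw candidates centered inside it."""
    return sum(1 for candidate in raw_candidates if center_inside(candidate, box))
```

The intuition is that a true face tends to trigger the cascade at many nearby positions and scales, so a higher count suggests a more reliable box. Whether that holds on your data is something to examine in Question 1.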

Question 1: Explain your scoring method. If you chose an existing method, cite your source in your documentation and explain why you chose it (and any modifications you opted to make). If you designed your own method, explain how it works.

Step 4: Using your default parameters, test your detector on all ten folds of FDDB. Follow the instructions in the data set documentation (this is the test labeled "EXP-1"). Generate discrete and continuous ROC curves for your results using the data set's evaluation program and a graphing tool of your choice. The discrete ROC curve reflects the presence or absence of valid detections, and the continuous ROC curve reflects the quality of those detections.
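To build intuition for the two curves, here is a simplified sketch of the two per-detection scores. It assumes the definitions in the FDDB technical report: the discrete score is 1 when the overlap ratio between a detection and its matched annotation exceeds 0.5, and the continuous score is the overlap ratio itself. The sketch uses rectangles for both regions, whereas the FDDB annotations are ellipses, so treat it as an approximation of what the evaluation program computes.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def discrete_score(detection, truth, threshold=0.5):
    """1 if the detection overlaps the annotation enough, else 0."""
    return 1 if iou(detection, truth) > threshold else 0


def continuous_score(detection, truth):
    """The overlap ratio itself, so better-aligned boxes score higher."""
    return iou(detection, truth)
```

For example, two 10x10 boxes offset horizontally by 5 pixels have an overlap ratio of 1/3: the discrete score is 0, while the continuous score still credits the partial overlap.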

Question 2: Which point on the discrete curve and which on the continuous curve do you think is operationally best? Explain your reasoning.

Question 3: What types of faces is your detector missing? Provide some examples and explain why you think this is happening.

Step 5: Based on your analysis in Question 3, adjust your parameters and explore different cascades to improve your true positive rates. Generate new discrete and continuous curves reflecting your improved results.
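A simple way to organize this exploration is to enumerate the configurations you plan to try. The sketch below builds such a grid; the specific values and the two cascade file names are only examples, and the dictionary keys are hypothetical names you would map onto your own detector's options.

```python
from itertools import product


def parameter_grid(scale_factors, min_neighbors_values, cascades):
    """Return every combination of the given settings as a list of dicts."""
    return [
        {"scale_factor": s, "min_neighbors": n, "cascade": c}
        for s, n, c in product(scale_factors, min_neighbors_values, cascades)
    ]


# Example values only; choose ranges based on your Question 3 analysis.
grid = parameter_grid(
    scale_factors=[1.05, 1.1, 1.2],
    min_neighbors_values=[1, 3, 5],
    cascades=["haarcascade_frontalface_default.xml",
              "haarcascade_frontalface_alt.xml"],
)
```

Each entry can then drive one full run over the FDDB folds. Keeping the list explicit makes it easier to report how you arrived at your best setting.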

Question 4: Explain how you selected your best set of parameters.

Question 5: What happened to the number of false positives as you improved your true positive rates?

Question 6: Provide examples of faces you can now detect but could not before (use OpenCV to draw bounding boxes around them).

Step 6: Collect your own data set (at least 50 images). Use Creative Commons tagged images from the web or some of your own images that you don't mind sharing so the class can build a new face detection benchmark set for research. Make sure there is a diverse pool of images. Generate ground truth in the form of a count of the visible faces in each image.

Question 7: Explain the criteria you used to collect your data set and describe its overall composition. Make sure to state how many faces are in your images.

Step 7: Run your detector against your data set. Adjust your parameters to achieve some measure of satisfactory performance. There is no need to generate full ROC curves for this step.
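Since the ground truth for your own data set is only a face count per image, one workable approach is to inspect each output image by hand, record how many detections landed on real faces and how many did not, and then aggregate. The sketch below does the aggregation; the record field names are hypothetical.

```python
def summarize(records):
    """Return (true positive rate, total false positives) over all images.

    Each record is a dict with hypothetical keys:
      faces            - ground-truth number of visible faces
      true_detections  - detections that landed on a real face
      false_detections - detections that did not
    """
    total_faces = sum(r["faces"] for r in records)
    total_true = sum(r["true_detections"] for r in records)
    total_false = sum(r["false_detections"] for r in records)
    tpr = total_true / total_faces if total_faces else 0.0
    return tpr, total_false
```

With one image containing 3 faces (2 found, 1 false detection) and another containing 1 face (found, no false detections), this gives a true positive rate of 0.75 with 1 false positive.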

Question 8: What is the best performance you achieved with respect to True Positive Rate vs. Number of False Positives on your data set?

Question 9: Provide some examples of faces you are able to detect (with actual bounding boxes) and some you are not able to detect.

Question 10: If you could design a better detector, what problems would you focus on?

Deliverables. You must turn in the following deliverables to receive full credit for this assignment: (1) Source code for your face detector and any additional code you developed to run the experiments; (2) Your self-collected data set; and (3) a report including the ROC curves from Steps 4 and 5, and answers to Questions 1-10.

Have questions about this assignment? Ask them! If globally applicable, your question and its answer will be posted to the course website for others to see.

Tip #2: Start early! If you run into trouble with your development early on, having ample time for debugging will help.


Q: Does the detector need to output an ellipse to match the FDDB ground-truth?

A: No. A rectangular bounding box is sufficient. FDDB's evaluation program will accept this.

Q: Is the score output by the detector supposed to be a similarity score reflecting a match to the ground-truth in the data set?

A: No. The FDDB evaluation program will take care of calculating correspondence with the ground-truth for you. The score output by the detector should be a reflection of the quality or confidence of a returned bounding box.

Q: When you ask us to analyze the ROC curve after changing the parameters of the detector, do you mean we can just focus on fold-1 instead of all folds?

A: No. You must recompute all 10 FDDB folds for that task.

Q: Can I submit my assignment after the deadline?

A: For this first assignment, we will consider late submissions. However, there will be a penalty of 10 points per day you are late, which will be automatically subtracted from your total score.