CVPR 2016 (6/26/16 PM only)

Terrance Boult, University of Colorado Colorado Springs

Coinciding with the rise of large-scale statistical learning within the computer vision community has been a dramatic improvement in automated methods for human biometrics, object recognition, and scene parsing, among many other applications. Despite this progress, a tremendous gap exists between the performance of algorithms in the laboratory and the performance of those same methods in the real world. A major contributing factor to this is the way in which machine learning algorithms are typically evaluated: without the expectation that a class unknown to the algorithm at training time will be experienced during operational deployment.

Both recognition and classification are common terms in the computer vision literature. What is the difference? In classification, one assumes there is a given set of classes between which we must discriminate. For recognition, we assume there are some classes we can recognize in a much larger space of things we do not recognize. A motivating question for this tutorial is: What is the general recognition problem? This question, of course, is a central theme in most applications involving visual recognition. How one should approach multi-class recognition is still an open issue. Should it be performed as a series of binary classifications, or by detection, where a search is performed for each of the possible classes? What happens when some classes are ill-sampled, not sampled at all, or undefined?

For some problems, we do not need, and often cannot have, knowledge of the entire set of possible classes. For instance, in a recognition application for biologists, a single species of fish might be of interest. However, the classifier must consider the set of all other possible objects in relevant settings as potential negatives. Similarly, verification problems for security-oriented face matching constrain the target of interest to a single claimed identity, while considering the set of all other possible people as potential impostors. In general object recognition, there is a finite set of known objects amid a myriad of unknown objects, combinations, and configurations; labeling something as new, novel, or unknown should always be a valid outcome. This leads to what is sometimes called "open set" recognition, in contrast to systems that make closed-world assumptions or use "closed set" evaluation.
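The closed-set vs. open-set distinction can be made concrete with a minimal sketch: a closed-set classifier always commits to one of its known classes, while an open-set one may return "unknown." The class names, scores, and the fixed rejection threshold below are purely illustrative; none of the tutorial's actual algorithms use so simple a rule.

```python
# Illustrative sketch only: closed-set vs. open-set decisions over
# per-class scores. The classes and the 0.5 threshold are toy values.

def closed_set_decide(scores):
    """Closed world: always commit to the best-scoring known class."""
    return max(scores, key=scores.get)

def open_set_decide(scores, reject_threshold=0.5):
    """Open world: return 'unknown' when no known class is convincing."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= reject_threshold else "unknown"

scores = {"fish": 0.31, "boat": 0.28, "diver": 0.22}
print(closed_set_decide(scores))  # fish (forced choice)
print(open_set_decide(scores))    # unknown (no class is convincing)
```

The point of the sketch is that the closed-set rule has no way to express "none of the above," which is exactly the failure mode open set recognition addresses.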

The purpose of this tutorial is to introduce the CVPR audience to this difficult problem in statistical learning, specifically in the context of important vision applications. Topics include recent formalizations of the open set recognition problem; statistical extreme value theory for visual recognition, which facilitates generalization in probabilistic decision models; and new supervised learning algorithms that minimize the risk of the unknown. Original case studies will be covered for applications related to the analysis of faces, objects and scenes. The tutorial is composed of three parts, each lasting approximately one hour. A complete outline follows.

**Part 1:** An introduction to the open set recognition problem

- General introduction: where do we find open set problems in computer vision?
- Decision models in machine learning
- Theoretical background: the risk of the unknown
- The compact abating probability model (Scheirer et al. T-PAMI 2014)
- The open world recognition model (Bendale and Boult CVPR 2015)
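The core intuition behind the compact abating probability (CAP) model is that class support must decay toward zero as a sample moves away from known training data, so anything far from every known class can be rejected as unknown. The sketch below, with an assumed exponential decay and toy 1-D exemplars, is only an illustration of that abating property, not the formulation from the T-PAMI 2014 paper.

```python
import math

# Illustrative CAP-style sketch: support for each class decays
# (abates) with distance to its nearest training exemplar, so points
# far from all known classes fall below delta and become "unknown".
# The exponential decay, bandwidth, and threshold are assumptions.

def abating_support(x, exemplars, bandwidth=1.0):
    """Support that decays with distance to the nearest exemplar."""
    d = min(abs(x - e) for e in exemplars)
    return math.exp(-d / bandwidth)

def cap_decide(x, classes, delta=0.1):
    """Best known class, or 'unknown' once all supports abate below delta."""
    supports = {c: abating_support(x, ex) for c, ex in classes.items()}
    best = max(supports, key=supports.get)
    return best if supports[best] >= delta else "unknown"

classes = {"cat": [0.0, 0.5], "dog": [5.0, 5.5]}
print(cap_decide(0.2, classes))    # cat (near cat exemplars)
print(cap_decide(100.0, classes))  # unknown (far from all training data)
```

Because support abates with distance, open space risk is bounded: no point arbitrarily far from the training data can retain high confidence for any known class.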

**Part 2:** Statistical extreme value theory for visual recognition

- Extrema and visual recognition problems
- Classical extreme value theory and models
- Recognition score analysis
- Calibration models for supervised learning
- Decision boundary modeling
- Sampling and feature correspondence (Fragoso and Turk CVPR 2013)
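A recurring tool in this part is fitting an extreme value distribution to the tail of a score distribution, in the spirit of meta-recognition as implemented in libMR. The sketch below fits a Weibull to the largest non-match scores and reads off how extreme a new top score is under that fit; the synthetic data, tail size, and thresholds are illustrative assumptions, not libMR's API.

```python
import numpy as np
from scipy import stats

# Hedged sketch of EVT-based recognition score analysis: by extreme
# value theory, the tail of the non-match score distribution is well
# modeled by a Weibull, regardless of the full distribution's shape.

rng = np.random.default_rng(0)
non_match_scores = rng.normal(loc=0.2, scale=0.1, size=1000)  # synthetic

# Keep only the tail: the largest non-match scores.
tail = np.sort(non_match_scores)[-25:]

# Fit a Weibull with the location pinned just below the tail.
shape, loc, scale = stats.weibull_min.fit(tail, floc=tail.min() - 1e-6)

def prob_extreme(score):
    """CDF under the tail fit: near 1 means the score is extreme
    relative to non-matches, i.e., plausibly a genuine match."""
    return stats.weibull_min.cdf(score, shape, loc=loc, scale=scale)

print(prob_extreme(0.95))  # extreme score: close to 1
print(prob_extreme(0.25))  # typical non-match score: well below 1
```

This normalization is what turns raw, classifier-specific scores into calibrated probabilities that can be thresholded or compared across models.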

**Part 3:** Algorithms that minimize the risk of the unknown

- 1-vs-Set Machine (Scheirer et al. T-PAMI 2013)
- P_I-SVM (Jain et al. ECCV 2014)
- W-SVM (Scheirer et al. T-PAMI 2014)
- Nearest Non-Outlier Algorithm (Bendale and Boult CVPR 2015)
- The Extreme Value Machine (Rudd et al. 2015)
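To give a flavor of how these algorithms bound open space, consider the 1-vs-Set idea: rather than a single decision hyperplane that labels an entire half-space positive, the positive class is bounded by a second, parallel plane, so samples far on either side of the training positives are rejected. The linear model and margins below are toy values for illustration, not a trained 1-vs-Set Machine.

```python
# Hedged sketch of the 1-vs-Set intuition: accept only samples whose
# linear score falls inside a slab around the positive training data.
# Weights, bias, and slab bounds are toy values, not a learned model.

def linear_score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def one_vs_set_decide(w, b, x, near=0.0, far=2.0):
    """'positive' only inside the slab [near, far]; otherwise reject."""
    s = linear_score(w, b, x)
    return "positive" if near <= s <= far else "unknown"

w, b = [1.0, 0.5], -1.0
print(one_vs_set_decide(w, b, [1.0, 1.0]))    # score 0.5: inside the slab
print(one_vs_set_decide(w, b, [10.0, 10.0]))  # score 14.0: past the far plane
```

A conventional SVM would happily label the second point positive because it lies deep in the positive half-space; the second plane is what keeps the labeled region compact.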

Related open source software:

- libMR: extreme value theory fitting functions and meta-recognition capability
- libsvm-openset: patched version of LIBSVM for supervised learning that mitigates the risk of the unknown
- Open Set Deep Networks: code and data for the research paper "Towards Open Set Deep Networks" (A. Bendale and T. Boult, CVPR 2016)