The Discriminant Hypothesis for Visual Saliency

Biological vision systems rely on attention to cope with the complexity of visual perception. Rather than sequentially scanning all possible locations of a scene, attentional mechanisms make events of possible interest "pop out" from background clutter. This enables organisms to focus their limited perceptual and cognitive resources on the most pertinent subsets of the available sensory stimuli, facilitating learning and survival. The deployment of visual attention is believed to be driven by visual saliency mechanisms, which are known to exist for a number of elementary visual attributes, such as color, orientation, depth, and motion.

Saliency is also of interest for computer vision, as a means to improve computational efficiency and increase robustness to clutter. The ability to quickly identify the regions of a scene that merit further processing enables vision systems to operate in complicated environments, where multiple objects (clutter) may exist in the background, and with low-complexity hardware. Vision problems that may benefit from saliency include object recognition, tracking, and robotics. Saliency can also be of interest for image processing applications where some image regions require special emphasis, such as image compression with regions of interest, image browsers, or protection against channel transmission errors.

In this work, we propose a new computational hypothesis for saliency: that saliency is a discriminant process. This hypothesis is denoted discriminant saliency, and is rooted in a decision-theoretic view of perception. Under this view, perceptual systems evolve to produce decisions about the state of the surrounding environment that are optimal in a decision-theoretic sense, e.g., that have minimum probability of error. Discriminant saliency equates saliency to an optimal decision-making problem, defining salient locations as those that enable a visual system to make decisions about the nature of the visual stimulus (target vs. background) with the greatest confidence. To investigate the plausibility of discriminant saliency, we seek answers to the following questions:

  • How does the hypothesis translate into computational algorithms for saliency?
  • Can it be applied to both bottom-up (stimulus driven) and top-down (goal specific) saliency mechanisms?
  • Is the hypothesis biologically (neurophysiologically) plausible?
  • Can it predict and explain the fundamental psychophysical properties of human visual saliency?
  • Does it lead to saliency detectors that significantly benefit computer vision applications, such as object recognition?
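The decision-theoretic definition above (target vs. background, minimum probability of error) can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions, not the detector from the publications: it assumes a single scalar feature, histogram density estimates, and equal class priors by default. The function names `bayes_error` and `decision_confidence` are hypothetical.

```python
import numpy as np

def bayes_error(target_samples, background_samples, bins=16, prior_target=0.5):
    """Histogram estimate of the minimum probability of error when
    deciding target vs. background from one scalar feature.
    (Illustrative assumption: densities are approximated by histograms.)"""
    t = np.asarray(target_samples, dtype=float)
    b = np.asarray(background_samples, dtype=float)
    lo = min(t.min(), b.min())
    hi = max(t.max(), b.max())
    if hi == lo:                      # degenerate case: all samples identical
        hi = lo + 1.0
    edges = np.linspace(lo, hi, bins + 1)
    p_t = np.histogram(t, bins=edges)[0].astype(float)
    p_b = np.histogram(b, bins=edges)[0].astype(float)
    p_t /= p_t.sum()                  # class-conditional probability per bin
    p_b /= p_b.sum()
    # The optimal rule picks the class with the larger joint probability in
    # each bin, so the error is the total mass of the losing class.
    joint_t = prior_target * p_t
    joint_b = (1.0 - prior_target) * p_b
    return float(np.sum(np.minimum(joint_t, joint_b)))

def decision_confidence(target_samples, background_samples, **kwargs):
    """Probability of a correct decision; higher means more discriminant."""
    return 1.0 - bayes_error(target_samples, background_samples, **kwargs)
```

Under this sketch, a feature whose target and background responses do not overlap yields an error of 0, while a feature with identical responses for both classes yields the chance-level error of 0.5 under equal priors.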

Biologically plausible bottom-up discriminant saliency detection

The discriminant saliency hypothesis is combined with center-surround processing to produce a bottom-up saliency detector. It is shown that, when tuned to the statistics of natural images, the resulting optimal saliency detector 1) has a one-to-one mapping to the standard neural architecture of V1, and 2) predicts and explains the fundamental properties of the psychophysics of human saliency. Many of these predictions are shown to be quantitatively accurate, rather than merely qualitative or anecdotal.
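One way to picture the combination of discriminant saliency with center-surround processing is the sketch below. It is a simplified illustration under assumptions that this page does not specify: raw pixel intensity is the only feature, the center and surround are square windows, and discrimination is scored by a histogram estimate of the mutual information between the feature and the center/surround label. The names `mutual_information` and `center_surround_saliency`, and all parameter defaults, are hypothetical.

```python
import numpy as np

def mutual_information(center, surround, edges):
    """Mutual information (bits) between a binned feature and the
    binary label 'center' vs. 'surround'."""
    joint = np.stack([
        np.histogram(center, bins=edges)[0],
        np.histogram(surround, bins=edges)[0],
    ]).astype(float)
    joint /= joint.sum()                        # joint distribution P(label, bin)
    p_label = joint.sum(axis=1, keepdims=True)  # shape (2, 1)
    p_bin = joint.sum(axis=0, keepdims=True)    # shape (1, bins)
    indep = p_label @ p_bin                     # product of marginals
    nz = joint > 0                              # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

def center_surround_saliency(image, center_radius=1, surround_radius=3, bins=8):
    """Score each interior pixel by how well intensity discriminates its
    center window from the surrounding ring. Border pixels stay at 0."""
    image = np.asarray(image, dtype=float)
    lo, hi = image.min(), image.max()
    if hi == lo:                                # flat image: avoid zero-width bins
        hi = lo + 1.0
    edges = np.linspace(lo, hi, bins + 1)
    r, R = center_radius, surround_radius
    height, width = image.shape
    saliency = np.zeros((height, width))
    # Boolean mask marking the center pixels inside a (2R+1) x (2R+1) window.
    mask = np.zeros((2 * R + 1, 2 * R + 1), dtype=bool)
    mask[R - r:R + r + 1, R - r:R + r + 1] = True
    for i in range(R, height - R):
        for j in range(R, width - R):
            window = image[i - R:i + R + 1, j - R:j + R + 1]
            saliency[i, j] = mutual_information(window[mask], window[~mask], edges)
    return saliency
```

In this sketch, a small bright patch on a dark background receives its highest score where the center window covers the patch exactly, while a uniform image scores zero everywhere, since no location can be distinguished from its surround.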

Top-down discriminant saliency for object recognition

In the context of object recognition, discriminant saliency is defined with respect to a one-vs-all classification problem, opposing the object class of interest to all other object classes that compose the recognition problem. The optimal discriminant saliency detector is derived, and applied to the problem of learning from weakly supervised (unsegmented) examples. It is shown that discriminant saliency achieves highly accurate localization of objects of interest (when these are embedded in cluttered backgrounds), and high classification rates for object categorization. Salient points are also shown to be robustly repeatable under various geometric and photometric transformations. Finally, it is shown that, under discriminant saliency, a rich set of visual attributes can be considered salient.
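The one-vs-all formulation can be illustrated with a short feature-ranking sketch. This is an assumption-laden illustration rather than the detector derived in the publications: it assumes precomputed scalar feature responses per example, scores each feature by a histogram estimate of the mutual information between the response and the binary label "class of interest vs. all other classes", and treats the top-ranked features as the salient ones. The names `feature_label_mi` and `rank_salient_features` are hypothetical.

```python
import numpy as np

def feature_label_mi(values, is_target, bins=8):
    """Mutual information (bits) between one binned feature and the
    binary one-vs-all label."""
    values = np.asarray(values, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    lo, hi = values.min(), values.max()
    if hi == lo:                                # constant feature
        hi = lo + 1.0
    edges = np.linspace(lo, hi, bins + 1)
    joint = np.stack([
        np.histogram(values[is_target], bins=edges)[0],
        np.histogram(values[~is_target], bins=edges)[0],
    ]).astype(float)
    joint /= joint.sum()
    indep = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    nz = joint > 0                              # skip empty cells
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))

def rank_salient_features(features, labels, target_class, bins=8):
    """Rank feature columns by how well they separate target_class
    from all other classes. Returns (order, scores), best feature first."""
    features = np.asarray(features, dtype=float)
    is_target = np.asarray(labels) == target_class   # one-vs-all labels
    scores = np.array([
        feature_label_mi(features[:, k], is_target, bins)
        for k in range(features.shape[1])
    ])
    order = np.argsort(scores)[::-1]
    return order, scores

# Toy usage: feature 0 fires only for "car"; feature 1 is unrelated to class.
labels = ["car", "car", "cat", "cat", "dog", "dog"]
features = [[1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [0, 1]]
order, scores = rank_salient_features(features, labels, "car")
```

Because only class labels per example are needed, a ranking of this kind does not require segmented training images, which is in the spirit of the weakly supervised setting described above.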

Publications: Bottom-up saliency and its biological plausibility

Decision-theoretic saliency: computational principles, biological plausibility, and implications for neurophysiology and psychophysics
D. Gao and N. Vasconcelos.
Neural Computation, 21, 239-271, January 2009. [pdf]

On the plausibility of the discriminant center-surround hypothesis for visual saliency
D. Gao, V. Mahadevan, and N. Vasconcelos.
Journal of Vision, 8(7):13, 1-18, 2008.

The discriminant center-surround hypothesis for bottom-up saliency
D. Gao, V. Mahadevan and N. Vasconcelos.
In Proc. Neural Information Processing Systems (NIPS),
Vancouver, Canada, 2007.
[ps] [pdf]

Bottom-up saliency is a discriminant process
D. Gao and N. Vasconcelos.
Proceedings of IEEE International Conference on Computer Vision (ICCV),
Rio de Janeiro, Brazil, 2007.
 IEEE, [ps] [pdf]

Top-down saliency for object recognition

Discriminant saliency, the detection of suspicious coincidences, and applications to visual recognition
D. Gao, S. Han, and N. Vasconcelos.
To appear in IEEE Trans. on Pattern Analysis and Machine Intelligence,

Discriminant Interest Points are Stable
D. Gao and N. Vasconcelos.
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
Minneapolis, MN, 2007.
 IEEE, [ps][pdf]

Integrated learning of saliency, complex features, and object detectors from cluttered scenes
D. Gao and N. Vasconcelos.
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
San Diego, 2005.  IEEE, [ps][pdf] (A longer version is available [ps][pdf])

An Experimental Comparison of Three Guiding Principles for the Detection of Salient Image Locations: Stability, Complexity, and Discrimination
D. Gao and N. Vasconcelos.
The 3rd International Workshop on Attention and Performance in Computational Vision (WAPCV), San Diego, 2005. [ps] [pdf]

Discriminant Saliency for Visual Recognition from Cluttered Scenes
D. Gao and N. Vasconcelos.
Proceedings of Neural Information Processing Systems (NIPS),
Vancouver, Canada, 2004. [ps][pdf]

Contact: Dashan Gao, Nuno Vasconcelos