Code for Discriminant Saliency Detector
============================================

by Dashan Gao (dgao@ucsd.edu)
version 1.0, January 3, 2005

This directory contains compiled binary programs for selecting discriminant
salient features, generating saliency maps, and searching for salient
locations. The programs should run under Linux on Intel-compatible processors.

See the web page at http://www.svcl.ucsd.edu/projects/saliency/ for references
to the relevant papers describing this approach.

Installation
==============

Part of this code is based on ImageMagick, so ImageMagick must be installed
before using the code. ImageMagick is available free at
http://www.imagemagick.org/

Procedures
============

As described in the NIPS2004 paper, the program runs in two steps.

1. Feature Selection
   Salient features are selected using maximum marginal diversity.
   (See section 3.1 of the paper for details.)

2. Saliency maps and locations
   Saliency maps are generated by a biologically inspired model, and salient
   locations are searched for over the map.
   (See section 3.2 of the paper for details.)

The usage of the programs for the two steps is described below.

Salient feature selection
========================================

The binary program for selecting salient features is named "MMD", which has
the following syntax:

  MMD [options] training_filelist pyr_depth salientfeaturepath

options:
  -fast: use the pre-computed statistics of DCT features. Otherwise, the
      statistics are computed from the input data, which makes the results
      more precise but takes more time.

where,
  training_filelist -- an ASCII file containing a list of filenames (one name
      per line). Each name stands for one class of images and is itself the
      name of an ASCII file listing the names of all training images in that
      class. (See the file "FileLists_Faces" in the directory "samplefiles"
      for an example.)
  pyr_depth -- the number of Gaussian pyramid levels used for the feature
      selection.
      pyr_depth = 1 means features are selected only at the original scale.
      The image is down-sampled by a factor of two between successive
      pyramid levels. At each pyramid level, features are obtained by
      projecting onto the set of 8x8 DCT basis functions.
  salientfeaturepath -- the path to the output directory where results are
      stored. Note that this directory must be created before running the
      code. (See description below.)

NOTE
=======

This program supports all image formats that are supported by ImageMagick,
including popular formats like TIFF, JPEG, PNG, PDF, PhotoCD, and GIF. Visit
http://www.imagemagick.org/www/formats.html for detailed information.

OUTPUT files for salient features
=================================

One ASCII file is generated for each image class, MMD_class_xx, where xx is
the index (line number starting from 0) of the class in the
"training_filelist". MMD_class_xx lists all features sorted by their feature
saliency. The four entries in each line are:

  feature_saliency marginal_diversity pyramid_level DCT_feature_index

feature_saliency is zero for features that are not salient for the class of
interest; for the remaining features, it is the marginal diversity normalized
so that the smallest value is one. A pyramid_level of 0 means the original
image scale. DCT_feature_index ranges from 0 to 63. For illustration, images
of the basis functions associated with each feature are provided in the
directory "samplefiles/DCTbasis".

Another output ASCII file, "salientfeaturenum", lists the number of salient
features (features with non-zero feature saliency) for every image class.

Generating Saliency Map and Salient Locations
=============================================

The binary program for generating the saliency map and salient locations is
named "SalMap", which has the following syntax:

  SalMap [options] salientfeaturepath queryclassidx query_filelist outpath

options:
  -d drawcirclenum: set the number of salient locations drawn on the image
      (default 5).
      If 0, no circles are drawn.
  -f featurenum: set the number of salient features used to generate
      saliency maps (default 5).
  -nosm: no saliency map will be generated.

where,
  salientfeaturepath -- the output path of "MMD".
  queryclassidx -- the index in training_filelist (i.e. the "xx" in file
      MMD_class_xx, starting from 0) of the class for which saliency is
      being computed. For example, the index for the face class in
      "FileLists_Faces" is 1.
  query_filelist -- an ASCII file with the names (one per line) of the images
      for which a saliency map will be generated. (See "QueryFaces_test" in
      the directory "samplefiles" for an example.)
  outpath -- the path to the directory where the saliency maps and salient
      locations are saved. Note that this directory must be created before
      running the code.

OUTPUT files for saliency maps and salient locations
====================================================

  _sm.jpg -- saliency map (normalized to [0, 1]) for the input image. This
      file will not be generated if the option "-nosm" is chosen.
  _salloc -- ASCII file containing the salient locations detected in the
      query image. Each line is a salient location and contains the
      following four entries:
        1. row of the center of the salient location
        2. column of the center of the salient location
        3. radius of the salient location
        4. saliency value of the salient location
  _salloc.jpg -- original image with salient locations circled. This file
      will not be generated if the option "-d 0" is chosen.

NOTES
=====

1. The original saliency map (before normalization) can be recovered by
   multiplying the normalized saliency map by the saliency value of the first
   salient location.

2. To determine the best number of features for saliency detection, a
   cross-validation stage is required, based on a presence/absence
   classification test. The classifier can be of any form that works on the
   output of saliency maps and/or salient locations. Please refer to the
   paper for details. (No code for classification is included here.)
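Example: reading the output files
=================================

The following Python sketch is not part of the distributed binaries. It
illustrates how a _salloc file could be read and how note 1 above could be
applied. It assumes that the four entries on each line are separated by
whitespace, which this README does not state. The names SalientLocation,
parse_salloc and recover_original_map are hypothetical helpers introduced for
illustration, and the sample values are made up. Loading the _sm.jpg image
and scaling its pixel values to [0, 1] is left out; the map is passed in as a
plain list of rows.

```python
from typing import List, NamedTuple, Sequence


class SalientLocation(NamedTuple):
    """One line of a _salloc file (hypothetical container for illustration)."""
    row: float       # row of the center of the salient location
    col: float       # column of the center of the salient location
    radius: float    # radius of the salient location
    saliency: float  # saliency value of the salient location


def parse_salloc(text: str) -> List[SalientLocation]:
    """Parse the contents of a _salloc file.

    Assumption: entries are whitespace-separated, four per line.
    """
    locations = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 entries, got {len(fields)}: {line!r}")
        row, col, radius, saliency = (float(f) for f in fields)
        locations.append(SalientLocation(row, col, radius, saliency))
    return locations


def recover_original_map(
    normalized: Sequence[Sequence[float]],
    locations: Sequence[SalientLocation],
) -> List[List[float]]:
    """Apply note 1: multiply the normalized map (values in [0, 1]) by the
    saliency value of the first salient location."""
    if not locations:
        raise ValueError("no salient locations; the scale cannot be recovered")
    scale = locations[0].saliency
    return [[value * scale for value in row] for row in normalized]


if __name__ == "__main__":
    # Made-up file contents, for illustration only.
    sample = "40 52 8 12.5\n10 20 4 3.0\n"
    locs = parse_salloc(sample)
    print(locs[0])
    print(recover_original_map([[1.0, 0.5], [0.0, 0.25]], locs))
```

In practice the text would come from reading a _salloc file in the outpath
directory given to SalMap.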
Reporting Bugs
==============

Send email to dgao@ucsd.edu to report bugs and problems with the software.

License Conditions
==================================

This software is being made available for research purposes only. See the
file LICENSE in this directory for conditions of use.