Human Behavior Studies Linking Tracking and Saliency
Psychophysics Experiment 1: Saliency affects Tracking Performance
Subjects viewed displays
The figures below present the rate of successful tracking in the two versions of the experiment. In both cases, this rate was much higher in the salient than in the non-salient condition. In the latter, tracking performance was almost at the chance level of 1/3, suggesting complete tracking failure. Overall, tracking performance was vastly improved for salient targets even when they did not pop out. In fact, the similarity of the detection rates in the two versions suggests that pop-out plays no role in tracking performance. It suffices for the target to be locally salient.
Psychophysics Experiment 2: Tracking vs. saliency as a function of feature contrast
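The comparison against the chance level of 1/3 can be sketched as an exact one-sided binomial test. This is only an illustration: the trial outcomes below are hypothetical placeholders, not data from the experiment, and the function names are assumptions introduced here.

```python
from math import comb


def success_rate(outcomes):
    """Fraction of trials in which the target was correctly identified."""
    return sum(outcomes) / len(outcomes)


def binomial_tail(successes, trials, p_chance):
    """P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


CHANCE = 1 / 3  # chance level stated in the text

# Hypothetical outcomes (True = correct), NOT data from the study.
salient = [True] * 27 + [False] * 3
non_salient = [True] * 11 + [False] * 19

for name, outcomes in (("salient", salient), ("non-salient", non_salient)):
    rate = success_rate(outcomes)
    p = binomial_tail(sum(outcomes), len(outcomes), CHANCE)
    print(f"{name}: rate={rate:.2f}, P(at least this many by chance)={p:.4f}")
```

A small tail probability indicates performance above chance; a large one is consistent with the tracking failure described for the non-salient condition.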
The results of the first experiment show that tracking is related to saliency.
While a salient target is tracked reliably, non-salient targets are
close to non-trackable.
Experiment 2 aimed to investigate the connection between the two phenomena
in greater detail, namely to quantify how tracking reliability
depends on target saliency. Since saliency is not an independent variable,
it can only be controlled indirectly. This is usually done by manipulating
At the start of a trial, one of the ellipses was designated as the target
(cued with a white bounding box). Subjects were asked to track the
target covertly, while fixating on a white square at the center
of the screen. At the end of the trial, all ellipses were completely
occluded by larger white disks,
and the subjects were asked to click on the disk corresponding to the target.
Each subject performed 30 trials under 7 conditions, for a total of 210
trials. The seven conditions corresponded to different levels of orientation
contrast between target and distractor ellipses. Distractor orientation,
defined by the major axis of the distractor ellipses, was always 0°.
Target orientation, determined by the major axis of the target ellipse, was
selected from 7 values: 0°, 10°, 20°, 30°, 40°,
60° or 80°. This made orientation contrast equal to the
target orientation. To keep all other variables (e.g. distance between
items, motion patterns, distance from target to fixation square) identical,
a trial was first created for one condition (target orientation 0°).
The trials of all other conditions were obtained by applying a transformation
to each frame of this video clip. This consisted of an affine transformation
of the grid of ellipse centers, followed by the desired change in target
As shown in the figure below, the curves of tracking reliability vs. orientation contrast, obtained in all three versions of the experiment, were remarkably similar to the saliency vs. orientation contrast curves of Nothdurft. As is the case for saliency, 1) distinct threshold and saturation effects were observed for tracking, with tracking reliability saturating when orientation contrast increased beyond 40°, and 2) increased distractor heterogeneity caused a decrease in tracking accuracy.
The near perfect correlation (r=0.97) between tracking reliability and saliency is evident from the scatter plot shown below. Each point in this plot corresponds to a different combination of heterogeneity and orientation contrast. In summary, tracking has a dependence on orientation contrast remarkably similar to that of saliency. This is strong evidence for the hypothesis that tracking performance is determined by the saliency of the target, and that tracking and saliency share common neural mechanisms.
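For reference, a correlation coefficient of this kind is the Pearson coefficient computed over paired (saliency, tracking reliability) values, one pair per combination of heterogeneity and orientation contrast. The sketch below shows the computation; the numbers are hypothetical placeholders, not the measurements behind r=0.97.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical paired values, one per (heterogeneity, contrast) combination.
saliency = [0.05, 0.20, 0.45, 0.70, 0.85, 0.90]
tracking_reliability = [0.35, 0.42, 0.60, 0.78, 0.88, 0.91]

print(f"r = {pearson_r(saliency, tracking_reliability):.2f}")
```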
Psychophysics Experiment 3: Effect of background on tracking performance
The results of Experiments 1 and 2 establish a strong connection between saliency and tracking. In relating the two, the saliency hypothesis proposes that tracking uses center-surround mechanisms to identify salient features that make the target distinct from its background. The involvement of a center-surround mechanism in tracking is consistent with the results of Experiment 2, where tracking performance was seen to depend on distractor heterogeneity: if the surround were not involved in the tracking process, performance would not depend on the number of distractors similar to the target in the surround.
To test the involvement of a center-surround mechanism in tracking further, we designed another experiment.
In this experiment the distance between the target and the closest similar distractor (i.e.
one with the same orientation as the target) is controlled so that a region of fixed radius around
the target is devoid of any similar distractors. By varying this target-similar distractor distance,
t_tsd, and observing the tracking performance, three possible scenarios can be evaluated:
(1) a localized surround region is involved in the tracking process: in this case, when t_tsd is varied, there should be a distance, which we shall denote as 't_critical', beyond which all similar distractors are outside the surround region relevant for tracking. So for large enough values of t_tsd, i.e. t_tsd > t_critical, distractor heterogeneity should not affect tracking performance.
(2) the entire visual field is involved: in this case, no such distance t_critical should exist, and distractor heterogeneity should affect tracking performance for all values of t_tsd.
(3) no surround region is included in the tracking process: in this case, the success rate of tracking should be identical in all versions regardless of the distractor heterogeneity.
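The three scenarios make different predictions about how the accuracy gap between two heterogeneity levels behaves as the distance grows, which can be expressed as a simple decision rule. The sketch below is an illustration only: the function name, the tolerance `tol`, and the example accuracies are assumptions, and the rule can only speak to the range of distances actually tested.

```python
def classify_surround(distances, acc_low_het, acc_high_het, tol=0.05):
    """Decide which scenario the data favor.

    distances    -- tested distances to the nearest similar distractor, ascending
    acc_low_het  -- tracking accuracy per distance, lower heterogeneity version
    acc_high_het -- tracking accuracy per distance, higher heterogeneity version
    tol          -- accuracy gap treated as "no effect" (assumed threshold)

    Returns (scenario_number, t_critical_estimate_or_None).
    """
    gaps = [abs(a - b) for a, b in zip(acc_low_het, acc_high_het)]

    # Scenario 3: heterogeneity never matters, so no surround is involved.
    if all(g <= tol for g in gaps):
        return 3, None

    # Scenario 1: beyond some distance the gap vanishes and stays negligible.
    for i, d in enumerate(distances):
        if all(g <= tol for g in gaps[i:]):
            return 1, d

    # Scenario 2: heterogeneity still matters at the largest tested distance.
    return 2, None


# Hypothetical accuracies, not data from the study.
scenario, t_critical = classify_surround(
    [1, 2, 3, 4],
    [0.50, 0.60, 0.80, 0.90],
    [0.30, 0.45, 0.78, 0.90],
)
print(scenario, t_critical)
```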
The results of Experiment 2 already showed that conjecture (3) does not hold. Experiment 3 was designed to determine which among conjectures (1) and (2) holds.
The experimental setting, stimuli and procedure were identical to those in Experiment 2. The target orientation for all stimuli was fixed at 40°. Two versions of the experiment were conducted with different numbers of ellipses in the target orientation, corresponding to two values of distractor heterogeneity. As in Experiment 2, in the first version, 18 of the 23 ellipses were in the distractor orientation and the remaining 5 in the target orientation, one of the latter being the actual target. In the second version, 13 ellipses were in the distractor orientation and 10 in the target orientation. In each version, the stimulus sequence could be in one of four conditions depending on the average value of t_tsd, i.e. the average, over all frames in the sequence, of the distance between the target and the nearest similar distractor. In each condition, the sequences were designed such that this quantity was in the range 1.67° to 5.01° (about 45 pixels to 135 pixels).
The figure on the left below presents the rate of successful tracking in the two versions as a function of the average distance to the nearest similar distractor. Also shown in the figure is the tracking accuracy from Experiment 2 for the version with no similar distractors, at a target orientation of 40°. As there are no distractors similar to the target in this case, a flat line is used to denote the tracking accuracy over all values of the abscissa.
The results show that, in both versions with non-zero distractor heterogeneity, tracking performance improves as the average distance to the nearest similar distractor increases. Further, for large enough values of the distance, tracking accuracy in the two versions is nearly the same as in the version with no distractor heterogeneity. This shows that conjecture (1) holds, i.e. a localized surround region of limited size is involved in the tracking task, and t_critical is about 4°. When the similar distractors are kept out of this region, adding more such distractors does not impact tracking performance.
The prediction of the saliency model for the same data is shown in the figure on the right above. The model reproduces the trend seen in the psychophysics experiment.
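The per-sequence quantity described here, the frame-averaged distance from the target to its nearest similar distractor, can be sketched as follows. The frame representation and function names are assumptions made for illustration; the pixel-to-degree factor is taken from the stated correspondence of 1.67° to about 45 pixels.

```python
from math import hypot

# Derived from the text's "1.67 degrees is about 45 pixels".
PIXELS_PER_DEGREE = 45 / 1.67


def nearest_similar_distance(target, similar_distractors):
    """Distance in pixels from the target to the closest similar distractor."""
    tx, ty = target
    return min(hypot(x - tx, y - ty) for x, y in similar_distractors)


def average_t_tsd(frames):
    """Average, over frames, of the nearest-similar-distractor distance.

    frames -- list of (target_xy, [similar_distractor_xy, ...]) in pixels
              (hypothetical representation of a stimulus sequence)
    Returns the average distance in degrees of visual angle.
    """
    per_frame = [nearest_similar_distance(t, ds) for t, ds in frames]
    return (sum(per_frame) / len(per_frame)) / PIXELS_PER_DEGREE


# Hypothetical two-frame sequence: nearest distances are 45 px and 90 px.
frames = [
    ((0, 0), [(45, 0), (100, 100)]),
    ((0, 0), [(0, 90), (200, 0)]),
]
print(f"average t_tsd = {average_t_tsd(frames):.3f} degrees")
```

With this factor, 135 pixels maps to 5.01°, matching the upper end of the range quoted above.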