Using eye movements to reveal the spatio-temporal dynamics of visual search
How do you identify and study a specific visual routine, perhaps one of many, embedded in a complex real-world task? We believe that the answer to this question depends on the coarseness of the behavioral dependent measure. Consider, for example, the relatively simple task of searching for a particular target pattern in a scene. The dependent measure most often used to study a visual search task is the time taken by subjects to manually press a button in response to detecting a target, commonly referred to as a reaction time or RT. A manual RT, however, indicates only the completion of the search task; it provides no directly observable measure of the processes enlisted during search or of how these processes give rise to target detection. Without this critical information, piecing together the visual routine responsible for search becomes a daunting task. With David Sheinberg, an early project set out to supplement this RT measure with a measure of how people move their eyes as they search for a target. In contrast to the RT measure, the sequence of saccades and fixations available from an oculomotor measure provides a rich and readily observable source of information about the spatio-temporal evolution of the search process: it enables you to see the search process as it is happening, not simply after it has finished. Using this eye movement methodology, we found evidence for distinct and separable oculomotor signatures corresponding to parallel and serial processes in a search task (Zelinsky & Sheinberg, 1992, 1993, 1995, 1997).
With Rajesh Rao, Mary Hayhoe, and Dana Ballard, this investigation was extended to a visual search task that used more realistic stimuli. Motivating this change was a concern that search theory had become overly dominated by visually impoverished stimuli and simple displays, and that key findings in this literature may not generalize to more realistic objects and scenes.
To address this concern, we created a database of real-world objects and backgrounds enabling the creation of a large number of search scenes, each complex enough to tap the mid-level base representation under investigation yet simple enough to allow for rigorous experimental control. Combining an eye movement search paradigm with these more realistic stimuli yielded evidence for an initially broad distribution of search in a scene, followed by a narrowing of this distribution with time until gaze and the target were spatially aligned (Zelinsky et al., 1997). The findings from this investigation have recently motivated a study comparing the search behavior of normal observers to that of patients with damage to their left parietal cortex (Shimozaki et al., 2003).
The unique capacity of eye movements to track the unfolding search process also enables this dependent measure to be used in the evaluation of search theory. Given the widespread use of RTs in the search literature, most search theories attempt to explain only the time needed to find a target, not where and when each search movement is directed during the process of target localization. This focus is problematic because RTs alone do not sufficiently constrain search theory, meaning that there are many different ways to go about explaining the same set of RT data (Townsend, 1976). However, one prominent search theory (Guided Search Theory, GST), in addition to making predictions about the time required by search, also made strong claims about how the search process should be directed in space (Wolfe, 1994). Drawing upon relationships between eye movements and spatially directed attention (Hoffman & Subramaniam, 1995; Kowler et al., 1995), an oculomotor test of this theory was devised, which showed that GST failed to account for the guidance of gaze to a search target on a significant percentage of trials (Zelinsky, 1996).
This work sparked a fiery debate in the literature regarding the evidence for and against guidance during search (Findlay, 1997; Hooge & Erkelens, 1999; Motter & Belky, 1998; Scialfa & Joffe, 1998; Williams & Reingold, 2001).
Research Philosophy
Each time we engage in a moderately complex task, we likely enlist the help of an untold number of simpler visuo-motor operations that exist largely outside of our conscious awareness. Consider, for instance, the steps involved in preparing a cup of coffee. For the sake of simplicity, assume that the coffee has already been brewed and is waiting in the pot, and that all of the essential accessories, an empty cup, a spoon, a carton of cream, and a tin of sugar, are sitting on a countertop in front of you. What is your first step toward accomplishing this goal? The very first thing that you might do is to move your eyes to the handle of the coffee pot, followed shortly thereafter by the much slower movement of your preferred hand to the same target. Because the coffee pot is hot and the handle is relatively small, this change in fixation is needed to guide your hand to a safe and useful place in which to grasp the object. After lifting the pot, your eyes may then dart over to the cup. This action is needed not only to again guide the pot to a very specific point in space directly over the cup, but also to provide feedback to the pouring operation so as to avoid a spill. After setting the pot back on the counter (an act that may or may not require another eye movement), your gaze will likely shift to the spoon. Lagging shortly behind this behavior may be simultaneous movements of your hands, with your dominant hand moving toward the sugar tin and your non-preferred hand moving to the spoon. The spoon is a relatively small and slender object that again requires assistance from foveal vision for grasping; the tin is a rather bulky and indelicate object that does not require precise visual information to inform the grasping operation. Once the spoon is in hand and the lid to the tin is lifted, gaze can then be directed to the tin in order to help scoop out the correct measure of sugar. To ensure that the spoon is kept level, a tracking operation may be used to keep your gaze on the loaded spoon as it moves slowly to the cup. After receiving the sugar, and following a few quick turns of the spoon, your coffee would finally be ready to drink (see Land et al., 1998, for a similarly framed example).