• Scene viewing is used to study attentional selection in complex but still controlled environments. One of the main observations on eye movements during scene viewing is the inhomogeneous distribution of fixation locations: While some parts of an image are fixated by almost all observers and are inspected repeatedly by the same observer, other image parts remain unfixated by observers even after long exploration intervals. Here, we apply spatial point process methods to investigate the relationship between pairs of fixations. More precisely, we use the pair correlation function (PCF), a powerful statistical tool, to evaluate dependencies between fixation locations along individual scanpaths. We demonstrate that aggregation of fixation locations within four degrees is stronger than expected from chance. Furthermore, the PCF reveals stronger aggregation of fixations when the same image is presented a second time. We use simulations of a dynamical model to show that a narrower spatial attentional span may explain differences in pair correlations between the first and the second inspection of the same image.
  • Top-down and bottom-up, as well as low-level and high-level factors influence where we fixate when viewing natural scenes. However, the importance of each of these factors and how they interact remains a matter of debate. Here, we disentangle these factors by analysing their influence over time. For this purpose we develop a simple saliency model, which uses the internal representation of a recent early spatial vision model as input and thus measures the influence of the low-level bottom-up factor. To measure the influence of high-level bottom-up features, we use DeepGaze II, a recent DNN-based saliency model. To vary top-down influences, we evaluate the models on a large Corpus dataset with a memorization task and on a large visual search dataset. We can separate the exploration of a natural scene into three phases: the first saccade, which follows a different, narrower distribution; an initial guided exploration characterized by a gradual broadening of the fixation density; and an equilibrium state, which is reached after roughly 10 fixations. The initial exploration and the equilibrium state target similar areas which are determined largely top-down in the search dataset and are better predicted on the Corpus dataset when including high-level features. In contrast, the first fixation targets a different fixation density and contains a strong central fixation bias, but also the strongest guidance by the image. Nonetheless, the first fixations are better predicted by including high-level information as early as 200 ms after image onset. We conclude that any low-level bottom-up factors are restricted to the first saccade, possibly caused by the image onset. Later eye movements are better explained by including high-level features, but this high-level bottom-up control can be overruled by top-down influences.
  • When searching for a target in a natural scene, both the target's visual properties and similarity to the background influence whether (and how fast) humans are able to find it. However, thus far it has been unclear whether searchers adjust the dynamics of their eye movements (e.g., fixation durations, saccade amplitudes) to the target they search for. In our experiment participants searched natural scenes for six artificial targets with different spatial frequencies throughout eight consecutive sessions. High-spatial-frequency targets led to smaller saccade amplitudes and shorter fixation durations than low-spatial-frequency targets if target identity was known before the trial. If a saccade was programmed in the same direction as the previous saccade (saccadic momentum), fixation durations and successive saccade amplitudes were not influenced by target type. Visual saliency and empirical density at the endpoints of saccadic momentum saccades were comparatively low, indicating that these saccades were less selective. Our results demonstrate that searchers adjust their eye movement dynamics to the search target in a sensible fashion, since low spatial frequencies are visible farther into the periphery than high spatial frequencies. Additionally, the saccade direction specificity of our effects suggests a separation of saccades into a default scanning mechanism and a selective, target-dependent mechanism.
  • When watching the image of a natural scene on a computer screen, observers initially move their eyes towards the center of the image, a reliable experimental finding termed the central fixation bias. This systematic tendency in eye guidance likely masks attentional selection driven by image properties and top-down cognitive processes. Here we show that the central fixation bias can be reduced by delaying the initial saccade relative to image onset. In four scene-viewing experiments we manipulated observers' initial gaze position and delayed their first saccade by a specific time interval relative to the onset of an image. We analyzed the distance to image center over time and show that the central fixation bias of initial fixations was significantly reduced after delayed saccade onsets. We additionally show that selection of the initial saccade target strongly depended on the first saccade latency. Processes influencing the time course of the central fixation bias were investigated by comparing simulations of several dynamic and statistical models. Model comparisons suggest that the central fixation bias is generated by a default activation as a response to the sudden image onset and that this default activation pattern decreases over time. Our results suggest that it may often be preferable to use a modified version of the scene viewing paradigm that decouples image onset from the start signal for scene exploration and explicitly controls the central fixation bias. In general, the initial fixation location and the latency of the first saccade need to be taken into consideration when investigating eye movements during scene viewing.
  • Dynamical models of cognition play an increasingly important role in driving theoretical and experimental research in psychology. Therefore, parameter estimation, model analysis and comparison of dynamical models are of essential importance. Here we propose a maximum-likelihood approach for model analysis in a fully dynamical framework that includes time-ordered experimental data. Our methods can be applied to dynamical models for the prediction of discrete behavior (e.g., movement onsets); in particular, we use a dynamical model of saccade generation in scene viewing as a case study for our approach. For this model, the likelihood function can be computed directly by numerical simulation, which enables more efficient parameter estimation including Bayesian inference to obtain reliable estimates and corresponding credible intervals. Using hierarchical models, inference is even possible for individual observers. Furthermore, our likelihood approach can be used to compare different models. In our example, the dynamical framework is shown to outperform non-dynamical statistical models. Additionally, the likelihood-based evaluation differentiates model variants that produced indistinguishable predictions on hitherto used statistics. Our results indicate that the likelihood approach is a promising framework for dynamical cognitive models.
  • Visuospatial attention and gaze control depend on the interaction of foveal and peripheral processing. The foveal and peripheral regions of the visual field are differentially sensitive to parts of the spatial-frequency spectrum. In two experiments, we investigated how the selective attenuation of spatial frequencies in the central or the peripheral visual field affects eye-movement behavior during real-world scene viewing. Gaze-contingent low-pass or high-pass filters with varying filter levels (i.e., cutoff frequencies; Experiment 1) or filter sizes (Experiment 2) were applied. Compared to unfiltered control conditions, mean fixation durations increased most with central high-pass and peripheral low-pass filtering. Increasing filter size prolonged fixation durations with peripheral filtering, but not with central filtering. Increasing filter level prolonged fixation durations with low-pass filtering, but not with high-pass filtering. These effects indicate that fixation durations are not always longer under conditions of increased processing difficulty. Saccade amplitudes largely adapted to processing difficulty: amplitudes increased with central filtering and decreased with peripheral filtering; the effects strengthened with increasing filter size and filter level. In addition, we observed a trade-off between saccade timing and saccadic selection, since saccade amplitudes were modulated when fixation durations were unaffected by the experimental manipulations. We conclude that interactions of perception and gaze control are highly sensitive to experimental manipulations of input images as long as the residual information can still be accessed for gaze control.
  • During scene perception our eyes generate complex sequences of fixations. Predictors of fixation locations are bottom-up factors like luminance contrast, top-down factors like viewing instruction, and systematic biases like the tendency to place fixations near the center of an image. However, comparatively little is known about the dynamics of scanpaths after experimental manipulation of specific fixation locations. Here we investigate the influence of initial fixation position on subsequent eye-movement behavior on an image. We presented 64 colored photographs to participants who started their scanpaths from one of two experimentally controlled positions in the right or left part of an image. Additionally, we computed the images' saliency maps and classified them as balanced images or images with high saliency values on either the left or right side of a picture. As a result of the starting point manipulation, we found long transients of mean fixation position and a tendency to overshoot to the image side opposite to the starting position. Possible mechanisms for the generation of this overshoot were investigated using numerical simulations of statistical and dynamical models. We conclude that inhibitory tagging is a viable mechanism for dynamical planning of scanpaths.
  • Even when we look at stationary objects, involuntarily our eyes perform miniature movements and do not stand perfectly still. Such fixational eye movements (FEM) can be decomposed into at least two components: rapid microsaccades and slow (physiological) drift. Despite the general agreement that microsaccades have a central generating mechanism, the origin of drift is less clear. A direct approach to investigate whether drift is also centrally controlled or merely represents peripheral uncorrelated oculomotor noise is to quantify the statistical dependence between the velocity components of the FEM. Here we investigate the dependence between horizontal and vertical velocity components across the eyes during a visual fixation task with human observers. The results are compared with computer-generated surrogate time series containing only drift or only microsaccades. Our analyses show a binocular dependence between FEM velocity components predominantly due to drift. This result supports the existence of a central generating mechanism that modulates not only microsaccades but also drift and helps to explain the neuronal mechanism generating FEM.
  • In humans and in foveated animals visual acuity is highly concentrated at the center of gaze, so that choosing where to look next is an important example of online, rapid decision making. Computational neuroscientists have developed biologically inspired models of visual attention, termed saliency maps, which successfully predict where people fixate on average. However, using point process theory for spatial statistics, we show that scanpaths contain important statistical structure, such as spatial clustering, on top of distributions of gaze positions. Here we develop a dynamical model of saccadic selection that accurately predicts the distribution of gaze positions as well as spatial clustering along individual scanpaths. Our model relies on, first, activation dynamics via spatially limited (foveated) access to saliency information, and, second, a leaky memory process controlling the re-inspection of target regions. This theoretical framework models a form of context-dependent decision-making, linking neural dynamics of attention to behavioral gaze data.
  • Whenever eye movements are measured, a central part of the analysis has to do with where subjects fixate, and why they fixated where they fixated. To a first approximation, a set of fixations can be viewed as a set of points in space: this implies that fixations are spatial data and that the analysis of fixation locations can be beneficially thought of as a spatial statistics problem. We argue that thinking of fixation locations as arising from point processes is a very fruitful framework for eye movement data, helping turn qualitative questions into quantitative ones. We provide a tutorial introduction to some of the main ideas of the field of spatial statistics, focusing especially on spatial Poisson processes. We show how point processes help relate image properties to fixation locations. In particular we show how point processes naturally express the idea that image features' predictability for fixations may vary from one image to another. We review other methods of analysis used in the literature, show how they relate to point process theory, and argue that thinking in terms of point processes substantially extends the range of analyses that can be performed and clarifies their interpretation.
  • Eye movements during fixation of a stationary target prevent the adaptation of the photoreceptors to continuous illumination and inhibit fading of the image. These random, involuntary, small movements are restricted at long time scales so as to keep the target at the center of the field of view. Here we use Detrended Fluctuation Analysis (DFA) to study the properties of fixational eye movements at different time scales. Results show different scaling behavior between horizontal and vertical movements. When the small ballistic movements, i.e., micro-saccades, are removed, the scaling exponents in both directions become similar. Our findings suggest that micro-saccades enhance the persistence at short time scales mostly in the horizontal component and much less in the vertical component. This difference may be due to the need to continuously move the eyes in the horizontal plane in order to match the stereoscopic image for different viewing distances.
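
The Detrended Fluctuation Analysis mentioned in the last abstract can be illustrated with a minimal Python sketch. It implements the textbook form of DFA (integrate the signal, remove a polynomial trend in non-overlapping windows, and measure how the residual fluctuation grows with window size); it is not the authors' actual pipeline. The function names `dfa_fluctuations` and `dfa_exponent`, the linear detrending order, and the window sizes are assumptions chosen for illustration.

```python
import numpy as np


def dfa_fluctuations(signal, window_sizes, order=1):
    """Return the DFA fluctuation F(n) for each window size n (hypothetical helper)."""
    x = np.asarray(signal, dtype=float)
    # Step 1: integrate the mean-subtracted signal to obtain the profile.
    profile = np.cumsum(x - x.mean())
    fluctuations = []
    for n in window_sizes:
        n_windows = len(profile) // n
        # Step 2: cut the profile into non-overlapping windows of length n
        # (samples left over at the end are discarded).
        segments = profile[: n_windows * n].reshape(n_windows, n).T  # (n, n_windows)
        t = np.arange(n)
        # Step 3: least-squares polynomial trend, fitted per window in one call.
        coeffs = np.polyfit(t, segments, order)        # (order + 1, n_windows)
        trends = np.vander(t, order + 1) @ coeffs      # (n, n_windows)
        # Step 4: root-mean-square of the detrended profile.
        fluctuations.append(np.sqrt(np.mean((segments - trends) ** 2)))
    return np.array(fluctuations)


def dfa_exponent(signal, window_sizes, order=1):
    """Estimate the scaling exponent as the slope of log F(n) against log n."""
    f = dfa_fluctuations(signal, window_sizes, order)
    slope, _intercept = np.polyfit(np.log(window_sizes), np.log(f), 1)
    return slope


if __name__ == "__main__":
    # Synthetic data only; these are not eye-movement recordings.
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(20000)
    sizes = [16, 32, 64, 128, 256]
    print("white noise exponent:", dfa_exponent(noise, sizes))
    print("random walk exponent:", dfa_exponent(np.cumsum(noise), sizes))
```

For uncorrelated noise the estimated exponent should lie near 0.5, and for its running sum (a random walk) near 1.5. Comparing exponents for horizontal and vertical components, with and without micro-saccades, would follow the same pattern, provided the recordings are supplied as one-dimensional arrays.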