Supplemental Material

How can we interpret the spatio-temporal patterns observed in the human and model classification images (CIs), and why did we find a strong tendency for position cues to correlate with behavior in regions located at extreme points along the shape templates (e.g. the feet of the walker or the corners of the squares)? In general, the classification image technique is well suited for exploratory analysis and has the strength of requiring few a priori assumptions about the outcome of the experiment or about how the experimental variables might affect perception and behavior. However, the technique can also reveal complex patterns and relationships in the data that must be interpreted a posteriori to derive meaning from the resulting images (van Boxtel & Lu, 2015). To help explain the specific pattern of results in the current study, we derived spatial and orientation distinctness maps, which represent the local difference in these features between a given template and its counterpart (e.g. the template moving in the opposite direction), because this comparison was the basis for the discrimination task performed in the actual experiment. An example of this analysis is illustrated in Supplemental Figure 1.

Supplemental Figure 1. Example frame illustrating the creation of feature distinctness maps. On the left, a reference template is shown (blue) with an overlay of the opposing frame from the same time point (red). On the right, smoothed distinctness maps are shown for easy comparison to the smoothed classification image data from Experiment 1.

The rationale behind this analysis is that positions are sampled uniformly and randomly from the underlying shape contour on each frame. Sometimes these locations will overlap with the competing shape contour (e.g. a walker facing the opposite direction), in which case position would not provide distinguishing information for discriminating the templates. In other cases, the sampled locations are distinctly specified by a particular template, thus providing distinguishing positional information that could influence the globally perceived direction of the hybrid stimulus. The same reasoning applies to the orientation features of the underlying templates. On each trial, the subtle but random pattern of relative sampling from distinct and indistinct contour regions could explain why perception was nudged toward one direction or the other in the face of generally ambiguous stimulus information.

When comparing a single template frame to its competing template frame, some spatial regions are distinct and belong confidently to one shape or the other, while other regions have distinct orientation information that is not shared with nearby regions of the opposing template (Supplemental Figure 2). On the basis of these local spatial differences, we computed spatial distinctness maps, predicting that distinct spatial regions would correlate positively with decisions consistent with position cues, and that indistinct regions would potentially correlate with orientation, since orientation could provide relatively more distinguishing information in these locations. For instance, imagine a case where only indistinct or overlapping regions are sampled in each stimulus frame. Discrimination between the two templates would be impossible on the basis of position information alone because there would be no features to distinguish the opposing shapes. Successful discrimination in such a case would necessarily require additional information, such as that provided by element orientation.
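The following is a minimal sketch of how a spatial distinctness map of this kind could be computed: the distance from each contour point on the reference template to the nearest point on the opposing template, rendered as an image and smoothed for comparison with the classification images. It is illustrative only; the array names (ref_pts, opp_pts, img_shape) and the point-based representation of the template frames are assumptions, not the exact implementation used in the study.

```python
# Sketch: spatial distinctness map from two template frames, each given as an
# (N, 2) array of contour point coordinates in pixel units (x, y).
# All names here are hypothetical; only the general logic follows the text.
import numpy as np
from scipy.spatial import cKDTree
from scipy.ndimage import gaussian_filter

def spatial_distinctness_map(ref_pts, opp_pts, img_shape, sigma=8):
    """Distance from each reference contour point to the nearest point on the
    opposing template, rendered as an image and smoothed with a 2-D Gaussian
    (sigma = 8 in the supplemental figures)."""
    tree = cKDTree(opp_pts)              # nearest-neighbour search on the opposing contour
    dist, _ = tree.query(ref_pts)        # distinctness = distance to the competing shape
    dmap = np.zeros(img_shape)
    rows = np.clip(np.round(ref_pts[:, 1]).astype(int), 0, img_shape[0] - 1)
    cols = np.clip(np.round(ref_pts[:, 0]).astype(int), 0, img_shape[1] - 1)
    dmap[rows, cols] = dist              # large values mark regions unique to the reference template
    return gaussian_filter(dmap, sigma=sigma)
```

An orientation distinctness map would follow the same scheme, with the local difference in contour orientation between the two templates taking the place of the point-to-point distance.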
Supplemental Figure 2. Examples of spatial and orientation distinctness maps for each selected frame, matching the results reported in Figure 3 of the manuscript. For easier comparison to the human CIs, we spatially smoothed the spatial distinctness maps with a two-dimensional Gaussian filter (sigma = 8). Regions that are more distinct in terms of position or orientation, and hence belong with more certainty to a particular object, are represented by positive values (e.g. orange-red) in the maps, indicating larger distances between the templates.

Comparing the distinctness maps to the behavioral classification images revealed many qualitative similarities. To estimate the relationship quantitatively, we computed the Pearson correlation coefficient between CIs and distinctness maps across all pixels in the images, excluding background pixels that never contained a signal or sample during the trials. Due to the large number of degrees of freedom (262,978 pixels for biological stimuli and 136,660 for non-biological stimuli), we used a random permutation test to assess the statistical significance of the correlation coefficients. In the permutation test, we randomly scrambled the mapping between subject responses and stimulus data, and processed the permuted data through the same pipeline as the experimental data to derive permuted group CIs. This procedure simulates a sample of observers viewing the same stimulus trial data but responding at random. We computed the correlation coefficient for each of 100 randomly permuted CIs and used this null distribution to convert the experimental correlations to z-scores. We found that the relationship between the human CI and the spatial distinctness map was significant for both the biological stimulus (z = 2.68, p < 0.05) and the non-biological stimulus (z = 4.91, p < 0.05).

To examine the contribution of orientation cues to the responses, we performed a similar analysis based on maps that quantify the distinctness of orientation cues by computing the difference in orientation between the two templates. We found that orientation distinctness on its own provided a poor fit to the human data overall, as revealed by weaker similarity between human CIs and the orientation distinctness map for the biological (z = 0.4, p = 0.65) and non-biological stimulus (z = 0.73, p = 0.31). This result demonstrates the primacy of positional information in guiding perceptual discriminations on this task. In other words, if positions are sampled from distinct regions, this is the primary factor determining which global stimulus direction will be perceived by the observer. Only when indistinct regions are sampled at a relatively high rate does orientation have the opportunity to take precedence and reverse the perceived direction. This interpretation makes intuitive sense because orientation can never exist in isolation; it is always anchored to a particular location specified by the center position of the Gabor window itself. In contrast, position information can be estimated independently of orientation features. For instance, in our prior study we found that random and noisy orientation information could be easily discounted with no cost to discrimination performance in the same task (Thurman & Lu, 2014b).
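As a concrete illustration of the significance test described above, the sketch below computes the pixelwise correlation over non-background pixels and converts it to a z-score against a permutation null distribution. The array names (ci, dmap, mask, permuted_cis) are hypothetical placeholders, and the sketch assumes the permuted CIs have already been generated by rerunning the CI pipeline on response-shuffled data.

```python
# Sketch: permutation-based z-score for the correlation between a classification
# image (CI) and a distinctness map. Names are hypothetical; only the logic
# follows the analysis described in the text.
import numpy as np

def masked_corr(a, b, mask):
    """Pearson correlation restricted to pixels that ever contained a sample."""
    return np.corrcoef(a[mask], b[mask])[0, 1]

def permutation_zscore(ci, dmap, mask, permuted_cis):
    observed = masked_corr(ci, dmap, mask)
    null = np.array([masked_corr(p, dmap, mask) for p in permuted_cis])
    return (observed - null.mean()) / null.std()   # z-score relative to the null distribution
```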
The spatial distinctness maps provide an intuitive and quantitative post-hoc explanation for why the human and model CIs turned out the way they did. Although the stimuli were highly ambiguous by design, perception of the global movement direction was apparently pushed one way or the other by low-level feature differences in the underlying shape templates. We view these results as complementary to those in the main paper, which compared human performance to the Bayesian observer model.