Comparing Thompson’s Thatcher effect

Comparing Thompson’s Thatcher effect with faces and non-face objects
Elyssa L. Twedt
Thesis completed in partial fulfillment of the requirements of the Honors Program in Psychological Sciences
Under the Direction of Prof. Isabel Gauthier
Vanderbilt University
April 6, 2007
Approved ________________________ Date _______________________

Acknowledgements

I would first like to thank my advisor, Isabel Gauthier, for her commitment to this project. Her advice and guidance throughout every stage have been invaluable, and her knowledge and willingness to teach provided a wonderful learning experience. I would also like to thank the entire Object Perception Lab, as well as Tom Palmeri’s lab, for all of their helpful advice along the way. A special thank you is owed to the graduate students who offered many hours of their time in order to assist, teach, and reassure me throughout the process. I also wish to thank David Sheinberg, at Brown University, for his ideas and valuable contributions to this project.

Abstract

The classical Thatcher effect (TE) is experienced when global inversion of a face makes it difficult to notice the local inversion of its parts (Thompson, 1980). The TE can be quantified by comparing the ease with which observers discriminate a normal image from a locally transformed one when both images are shown upright versus inverted. Here we compared the classical TE for images of adult faces to a wide variety of other categories, including grimacing faces, baby faces, animal faces, buildings, scenes, and various types of letter-strings. If the TE reflects a special form of configural processing for faces, faces should show a much larger TE than all other categories. Error rates revealed larger TEs for letter-strings than for all other categories.
Within the letter-string categories, words showed a larger TE than non-words, and low-frequency words showed a larger TE than high-frequency words. Within the face categories, adult, grimacing, and baby faces showed comparable TEs, whereas animal faces showed the largest TE. For objects, we observed TEs for all categories, but at smaller magnitudes. Our results suggest that the TE is not exclusive to faces; it does not appear to depend uniquely on factors such as expertise or the grotesque appearance of the transformation.

Introduction

The classical Thatcher illusion, created by Peter Thompson (1980), has long been used to demonstrate the importance of configural processing for upright faces, a strategy that does not seem to be available for inverted faces. The Thatcher effect is experienced when local inversion of the eyes and mouth is more difficult to detect when the face is globally inverted than when the face is upright (see Figure 1). Thompson chose to locally invert the eyes and mouth because these features are known to convey the most about a person’s expression (Thompson, 1980). He predicted that with global inversion, facial expression would be better preserved in the Thatcherized face than the normal face. However, he found that local inversion of the eyes and mouth did not make a significant difference in expression when compared to the normal face. When both images were presented in the upright position, the Thatcherized face appeared grotesque. Thus, two effects are observed with the Thatcher effect: First, global inversion makes local changes difficult to detect. Second, the upright Thatcherized face appears grotesque. This effect has been explained in a number of ways, most suggesting that the effect is face-specific. The most prominent explanation is the use of different processing strategies when viewing upright and inverted faces.
Processing of upright faces is based on configural encoding of individual features (Boutsen & Humphreys, 2003). Configural encoding means that a person looks at the entire face while processing the spatial relationship between individual features to create a holistic percept. When a face is Thatcherized, configural information is disrupted, making the face appear grotesque (Bartlett & Searcy, 1993; Boutsen & Humphreys, 2003). In contrast, processing of inverted faces is based on componential information, which refers to encoding individual facial features (Boutsen & Humphreys, 2003). Because Thatcherization disrupts configural but not componential processing, inverted face processing is not impaired. When a face is globally inverted, it is more difficult to recognize because configural processing is disrupted, making the relationship between internal and external features less salient and local changes more difficult to detect (Rock, 1988). This is also known as the face inversion effect (Bartlett & Searcy, 1993; Boutsen & Humphreys, 2003; Yin, 1969).

The Thatcher effect has also been explained in terms of encoding facial expression (Bartlett & Searcy, 1993). In an upright face, Thatcherization creates a grotesque expression that is easily noticed as we focus on the image holistically. During inversion, it is more difficult to recognize a face and encode facial expression, so global inversion of a Thatcherized face reduces the perception of a grotesque expression (Bartlett & Searcy, 1993; Muskat, 1997).

Others suggest that a frame of reference is an important aspect of experiencing the Thatcher effect (Parks, 1983; Rock, 1988; Valentine & Bruce, 1985). More specifically, an object-centered frame of reference incorporates information about the spatial relationship of internal parts of an object. A person gathers information from that spatial organization to assign direction, or “topness,” to individual features (Parks, 1983).
When internal parts are locally inverted, such as eyes and mouth, the orientation of those parts conflicts with the orientation of the rest of the object. This orientation mismatch is only noticed when the object is viewed in its usual orientation. This is because, when inverted, a frame of reference becomes less powerful, and assignment of “topness” is less clear, making local changes less salient. Valentine & Bruce (1985) proposed that the facial frame of reference includes the relative position of the eyes and mouth, although Parks (1983) argued that external features are also included. In addition to the object-centered frame, Rock (1988) proposed a retinal factor which assigns direction to an object relative to the environment, based on a person’s own perception of “up” and “down”. When a Thatcherized face is upright, both frames of reference are in agreement, so inversion of the mouth and eyes is easily noticed. When inverted, the object-centered frame of reference is opposite from the environmental frame, so internal featural changes are less salient. Certain images, such as faces and words, are more often viewed from one orientation (i.e., they are mono-oriented). Inversion may disrupt frames of reference for mono-oriented stimuli more severely than for objects that we have more experience viewing from multiple angles, because the latter can more easily be corrected with mental rotation (Rock, 1988). The more familiar we are with an object at a specific orientation, the more inversion will disrupt object processing.

Based on the previous literature, the classical Thatcher effect has been attributed to disruption of encoding processes when faces are locally and holistically inverted, familiarity with faces and with their canonical orientation, a powerful frame of reference, and encoding of facial expression.
These explanations may help explain why we experience the Thatcher effect for faces, but there is little empirical research to assess whether the Thatcher effect can be experienced with non-face objects. We did encounter a demonstration of the Thatcher effect using words. Parks (1983) locally inverted several letters of a word and then inverted the entire word to obtain the same general effect observed with faces (see Figure 2). Parks speculated that the Thatcher effect may not be specific to faces based on processing mechanisms, familiarity and/or expression, but rather, the effect may depend on a powerful frame of reference, which becomes less effective when inverted. Parks’ theory was based on a demonstration and not experimental data, but his observation compelled us to further explore the Thatcher effect in other objects, including various types of letter-strings.

The main goal of the present study was to determine whether the Thatcher effect can be experienced in non-face objects. If so, we wanted to measure whether the Thatcher effect is larger for faces than non-face categories. In order to approach these questions, we chose to locally manipulate images from three broad categories: faces, words, and objects/scenes. Within these three groups, we created subgroups such as animal faces, baby faces, cars, buildings, non-words and low-frequency words. Because, to our knowledge, a study of this scope had not been done, we chose categories that covered a wide range of individual object differences that we could compare to each other, in hopes of discovering certain trends, biases, and explanations for experiencing a Thatcher effect. We measured the degree to which “Thatcherization” (i.e., local inversion of internal features) affected various object categories.
By quantifying the Thatcher effect, we were able to determine where faces lie relative to other object categories and test hypotheses about what factors most influence the Thatcher effect. This helped us determine whether faces are really in their “own category” (i.e., Thatcherization has a significantly larger effect on faces than on non-face object categories) or if non-face categories elicit similar or even larger Thatcher effects.

In addition to our general question of whether the Thatcher effect is face-specific, we hoped to better understand what factors influence the Thatcher effect. By reviewing the literature on inversion effects in general, we were able to hypothesize possible reasons for experiencing a Thatcher effect in some categories, but not others. One possibility is that familiarity with an object category is important for detecting local changes. If the Thatcher effect depends on familiarity with an object, we would expect that objects with which we are least experienced in discriminating would elicit smaller Thatcher effects. For example, most people have less experience discriminating animal and baby faces than adult faces, so we would expect a larger Thatcher effect for adult faces than animal faces. A second factor may be perception of grotesqueness or bizarreness in an upright Thatcherized face. We would expect the size of a Thatcher effect to increase as Thatcherized images appear more bizarre. (Note: Because grotesqueness is a form of facial expression that can only be applied to faces, we will instead use the term ‘bizarre’ to describe an image’s unusual appearance.) That is, differences between a very bizarre Thatcherized image and a normal image will be more easily detected, leading to a more robust Thatcher effect.
A third factor is that we may have more experience seeing non-face objects (e.g., shoes, cups) in rotated orientations that deviate from their canonical orientation, so that the frames of reference for non-face objects are less powerful and less impaired than those for faces (Yin, 1969; Rock, 1973). If an object’s frame of reference is a good indicator for the Thatcher effect, then faces and words, which are thought to have more stable frames of reference, will elicit larger Thatcher effects than scenes or buildings. That is, we may have more experience seeing local parts of scenes (e.g., a picture) from different angles, so local inversion may not disrupt the overall appearance of the upright scene.

If the Thatcher effect is face-specific, then saliency of facial expression (i.e., grotesque appearance), a special form of configural processing, and familiarity with faces may be important criteria for experiencing the Thatcher effect. However, if non-face categories show significant Thatcher effects, we may need to rethink why the Thatcher effect is experienced and reevaluate the claim that faces are special with respect to this effect.

Methods

Apparatus and Stimuli

All experiments were run on a Power Mac G3 using Matlab under Mac OS 9. Stimuli consisted of images from twelve different object categories: adult faces, grimacing faces, baby faces, animal faces, buildings, cars, close-up scenes, large scenes, high-frequency (HF) words, low-frequency (LF) words, HF non-words, and LF non-words. Figure 1 shows examples of normal and Thatcherized images for each object category, except letter-strings. There were 10 images in each category, and images were collected from various Internet and image bank sources. Eight-letter words were chosen from the MRC Psycholinguistic Database, which generates a list of words based on a set of criteria (i.e., word length and Kucera-Francis written frequency).
LF words had a frequency of 1 and HF words had a frequency between 204 and 392 (see Coltheart, 1981 for user guidelines). For non-words, we transposed internal adjacent letters while keeping the first and last letters in the same position. That is, we switched the position of the second and third letters, the fourth and fifth letters, etc. Figure 2 shows letter-string examples and includes a complete list of the words used in all letter-string categories to illustrate how non-words were derived. Each object was centered within a white area 250 pixels wide x 250 pixels high. Stimuli were presented at a screen resolution of 1280 x 950 pixels.

Objects were manipulated using Adobe Photoshop 7.0 to produce “Thatcherized” images by locally inverting parts of each object (e.g., inverting two letters of a word). We created two levels of transformation for each image. The first level, Thatcherized 1, was created by inverting only one part of each object. The second level, Thatcherized 2, was created by inverting two parts of each object. We created two levels of transformation in case we obtained a ceiling on the TE for accuracy with which changes were detected in inverted pairs. We could then match conditions in terms of performance with inverted pairs. Thus, each image had three versions: normal, Thatcherized 1, and Thatcherized 2. For each image category, we tried to locally manipulate the same parts for each object. For example, we always inverted the eyes for ‘Thatcherized 1’ faces and we inverted the eyes and mouth for ‘Thatcherized 2’ faces. Buildings, close-up scenes, and large scenes contained a lot of variability between images, making it difficult to make uniform changes across the entire category. Although we tried to manipulate similar features in each image (e.g., building windows and doors; cups), these features may have varied in location.
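The non-word construction described in this section (swapping adjacent internal letters pairwise while leaving the first and last letters in place) can be illustrated with a minimal sketch. The function name `make_nonword` and the example string are hypothetical; the string is a placeholder and is not taken from the stimulus list in Figure 2.

```python
def make_nonword(word: str) -> str:
    """Swap adjacent internal letters pairwise; keep first and last letters fixed.

    For an 8-letter string this swaps letters 2-3, 4-5, and 6-7
    (1-based positions), as described in the text.
    """
    letters = list(word)
    # Internal letters occupy indices 1 .. len-2 (0-based).
    # Step through them two at a time and swap each adjacent pair.
    for i in range(1, len(letters) - 2, 2):
        letters[i], letters[i + 1] = letters[i + 1], letters[i]
    return "".join(letters)


# Hypothetical placeholder string, chosen so each swap is easy to see.
print(make_nonword("abcdefgh"))  # acbedgfh
```

With an odd number of internal letters, the final internal letter has no partner and is left in place; the text does not say how such cases would be handled, so this is an assumption of the sketch.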
Experiment 1: Pilot Study

This pilot experiment was designed to measure the average size of a Thatcher effect for each category and compare those effects across categories. Image pairs were presented simultaneously and participants made a same/different judgment. We measured the size of the Thatcher effect by finding the difference in accuracy and reaction time for determining that an image pair was different when upright compared to when the pair was inverted.

Participants

Twenty undergraduate students from Vanderbilt University (7 women and 13 men) volunteered to participate in the experiment in exchange for course credit. All participants reported normal or corrected-to-normal vision.

Design

Four factors were manipulated within-subjects: orientation (upright/inverted), level of transformation (normal, Thatcherized 1, Thatcherized 2), category, and trial type (same/different). Both images in each pair had the same identity. That is, both images represented the same person, word or object and were presented in the same global orientation. On same trials, either both images were normal or both images were Thatcherized at the same level of transformation. On different trials, a normal image was paired with either a Thatcherized 1 image or a Thatcherized 2 image. For different trials, the position of each image was counterbalanced, so that on half of the trials the normal image appeared on the left and on half the trials the normal image appeared on the right. All image pairs were presented in both the upright and inverted orientations.

Procedure

Participants judged whether two images in a pair were the same (‘1’) or different (‘2’). Participants were told that two images of the same identity would appear side-by-side in the same orientation. Image pairs remained on the screen until a response was made and no feedback was given. We stressed the importance of responding as quickly and accurately as possible.
There were a total of 1200 trials broken into 6 blocks of 200 trials each. Participants were offered a short break in between each block. Reaction time and accuracy were measured. Each session took about 90 minutes to complete.

Results and Discussion

Results are shown in Figure 3. We computed mean delta (upright minus inverted) accuracy and mean delta reaction time for correct trials across all object categories and used these measures to operationally define the size of a Thatcher effect. We focused our analysis on different trials. Both analyses revealed significant Thatcher effects in all non-face categories, for at least one level of transformation, suggesting that the Thatcher effect is not unique to faces. For accuracy, adult faces did not have the most robust Thatcher effect, but rather, LF words and LF non-words at the Thatcherized 1 level had the largest Thatcher effect.

In this task, we found a lot of variability within both reaction time and accuracy results across object categories. For example, LF words (level 1) and LF non-words (level 1) showed comparable Thatcher effects for accuracy, but LF words showed a much larger Thatcher effect than LF non-words for reaction time. It was difficult to equate all categories using two different dependent measures, especially since we found a lot of variability among categories. Because error rates were very high, it did not make sense to focus analysis on reaction time. Thus, we decided to revise and improve this task in order to make our results easier to interpret and reduce any speed-accuracy trade-offs.

Experiment 2: Same-Different Task 2

In this experiment, we focused our analysis on accuracy and altered the design of Experiment 1 to reduce reaction time variability. We also changed images in the letter-string categories so that all letters were lowercase, in contrast to Experiment 1, where each letter-string was capitalized.
The ultimate goals of this experiment were to determine whether the Thatcher effect could be experienced in non-face objects, determine which categories yielded the most robust Thatcher effects, and compare Thatcher effects between categories to gain insight on what influences the Thatcher effect.

Participants

Twenty-one undergraduate students from Vanderbilt University (15 women and 6 men) volunteered to participate in the experiment in exchange for course credit or $18 cash payment. All participants had normal or corrected-to-normal vision and had not participated in Experiment 1.

Design

The design was similar to Experiment 1 except that images in a pair were presented sequentially. For different trials, the presentation order of images was counterbalanced, so that on half of the trials the normal image was presented first, and on half the trials the normal image was presented second. All image pairs were presented in both the upright and inverted position.

Procedure

Participants judged whether two images in a pair were the same (‘1’) or different (‘2’). We stressed the importance of responding as quickly and accurately as possible. Each trial consisted of a fixation cross, presentation of Image 1 (750 ms), a mask (300 ms), presentation of Image 2 (750 ms), and a white screen (2250 ms). Participants were told that they could make a response from the onset of Image 2 up until three seconds. Participants heard a tone if they responded incorrectly or if they did not respond within the three-second time constraint. Trials were broken into 12 blocks of 100 trials each and participants were offered a short break in between each block. Each session took about 75 minutes to complete.

Results and Discussion

After initial data analysis, we noticed a bias to respond ‘same’ across object categories. Thus, we calculated d’ and took a difference of d’ scores in order to correct for this bias in our results.
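As a minimal sketch of the sensitivity measure just described, and not the analysis code used in this study, d’ can be computed from hit and false-alarm rates using the standard signal-detection formula d’ = z(H) - z(F), treating a ‘different’ response on a different trial as a hit and a ‘different’ response on a same trial as a false alarm. The function names, the log-linear correction for extreme rates, and the trial counts below are all illustrative assumptions; the text does not state how rates of 0 or 1 were handled.

```python
from statistics import NormalDist


def d_prime(hits: int, n_different: int, false_alarms: int, n_same: int) -> float:
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count and 1 to each total)
    keeps the rates away from 0 and 1. This correction is an assumption
    of the sketch, not something reported in the text.
    """
    hit_rate = (hits + 0.5) / (n_different + 1)
    fa_rate = (false_alarms + 0.5) / (n_same + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def thatcher_effect(upright: tuple, inverted: tuple) -> float:
    """Delta d': d' for upright trials minus d' for inverted trials."""
    return d_prime(*upright) - d_prime(*inverted)


# Hypothetical counts for one observer and one category:
# (hits, different trials, false alarms, same trials)
upright_counts = (18, 20, 2, 20)
inverted_counts = (10, 20, 8, 20)
print(round(thatcher_effect(upright_counts, inverted_counts), 2))
```

Because d’ separates sensitivity from response bias, an overall tendency to answer ‘same’ lowers hit and false-alarm rates together and largely cancels out of the difference of z-scores.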
Delta d’ (upright trials minus inverted trials) served as our measure for size of Thatcher effect. We focused our analysis on sensitivity results and did not include an analysis of reaction time due to high error rates, which would make interpretation difficult.

Figure 4 shows the difference in sensitivity for deciding that two images in a pair are different during upright trials versus inverted trials. Our results indicated significant (i.e., relative to zero) Thatcher effects for all object categories for at least one level of transformation. An ANOVA comparing all object categories revealed a significant main effect for category, F(11, 120) = 13.201, p = 0.0001, and a significant interaction between category and level of transformation, F(11, 220) = 3.2128, p = 0.0004. LF words had the largest Thatcher effect (M = 1.68) of all categories, followed by HF words (M = 1.376) and animal faces (M = 0.947). Because it was difficult to directly compare all of our categories, we decided not to look at this interaction further, but instead to look at differences within each of the three subgroups (i.e., faces, objects/scenes, letter-strings). This would allow us to better understand why the Thatcher effect is experienced in general, and in particular, why the Thatcher effect is experienced to a larger extent in some objects than others.

Face categories. Because prior literature focused on the Thatcher effect in faces, we manipulated different categories of faces that varied in age, species, familiarity and facial expression (i.e., pleasant or grimacing). If the Thatcher effect depends on familiarity, we would expect adult faces to have the largest Thatcher effect. Categories for which we are least experienced in discriminating, such as animal and baby faces, would have the smallest Thatcher effect.
If perception of a bizarre expression is important, then grimacing faces, which are fairly bizarre without Thatcherization, should show a smaller Thatcher effect because the Thatcherized and normal versions are more similar, making local changes less salient.

Thatcherized 2 animal faces showed the largest Thatcher effect over all face categories. For level 1, adult faces and grimacing faces revealed comparable Thatcher effects, which were larger than those for baby and animal faces. For level 2, animal faces had the largest Thatcher effect over all other face categories, whereas adult faces elicited the smallest Thatcher effect. An ANOVA revealed a significant interaction between level of transformation and category, F(3, 60) = 7.5493, p = 0.0002. Post hoc tests showed that the Thatcher effect was larger for animal and baby faces at level 2 transformation, whereas adult faces and grimacing faces had a larger Thatcher effect at level 1 transformation. Main effects for level and category were not significant.

These results suggest that familiarity is not a primary predictor of the Thatcher effect. We do see that transformation is more salient in adult faces and grimacing faces, since only the eyes need to be locally inverted to yield the same size Thatcher effect as baby faces when two parts are inverted. Also, the Thatcher effect for animal faces is much smaller when only the eyes are inverted, but very large when both the eyes and mouth are inverted. It is interesting that the Thatcher effect for grimacing faces was comparable to that for adult and baby faces, which casts doubt on the idea that a bizarre expression in a Thatcherized face is important to detecting changes from an unaltered face.

Word categories. In analyzing the word categories, we were interested in whether word frequency and/or lexical status of a word influenced the Thatcher effect.
If familiarity is important, we would expect larger Thatcher effects for high-frequency than for low-frequency letter-strings and for words than for non-words. An ANOVA revealed a main effect for word type, F(1, 20) = 30.72, p = 0.0001, and frequency, F(1, 20) = 9.464, p = 0.006. Words showed a significantly larger Thatcher effect than non-words and LF words showed a significantly larger Thatcher effect than HF words. The interaction between level of transformation and frequency was not significant, F(1, 20) = 2.9525, p = 0.1012, but did show a trend for larger Thatcher effects in Thatcherized 1 words over Thatcherized 2 words. The main effect for words suggests that familiarity may be important, but the main effect for frequency contradicts this idea.

To further explore the letter-string results, it is beneficial to look at d’ results separately for words and non-words at both the upright and inverted orientations. Figure 5 shows d’ results for all letter-string categories and delta d’ results for size of Thatcher effect. Words and non-words are different when upright but are almost identical when inverted. This suggests that certain cues or processing strategies are being used when viewing inverted words, which makes them more similar to non-words or upright words. We also observe a frequency effect for inverted letter-strings but not for upright letter-strings. Exploring this issue, we found a ceiling effect for accuracy on upright word trials.

Object/Scene categories. By choosing non-face, non-word objects, we hoped to better understand if the Thatcher effect is restricted to certain categories. That is, if the Thatcher effect depends on a powerful frame of reference or familiarity, objects and scenes may not show as large a Thatcher effect as faces and words. We may have more experience in seeing these objects from different angles and rotations so that Thatcherization does not impair object processing.
An ANOVA revealed significant Thatcher effects (for at least one level of transformation) for all categories, but at lower magnitudes than for the face and letter-string categories. Buildings, cars and close-up scenes did not show significant Thatcher effects at level 1, but did at level 2. The main effect for level of transformation was significant, F(1, 20) = 4.3885, p = 0.0491, but the main effect for category did not reach significance. Object categories showed larger Thatcher effects for Thatcherized 2 images than Thatcherized 1 images, except in scenes.

These results suggest that the Thatcher effect can be experienced with objects/scenes, although the effect is not as robust as with faces or letter-strings. This could be due either to greater difficulty with detecting changes in upright objects/scenes than in faces or to a smaller inversion effect for objects/scenes than faces. To assess these possibilities, it is useful to look at d’ values separately for upright and inverted objects/scenes (see Appendix A). For upright cars, d’ is similar to that for upright animal faces, but the difference between upright and inverted is much greater for animal faces than for cars. We find a similar comparison between close-up scenes and animal faces. This suggests that changes are not more difficult to detect in objects/scenes, but there is a smaller inversion effect, which leads to a smaller Thatcher effect.

Image Ratings

One aspect of the Thatcher effect is that a Thatcherized face appears grotesque when upright but not inverted. This has been attributed to disruption of configural processing due to local inversion, the grotesque appearance of an upright face, and failure to encode facial expression in an inverted face. One of our goals was to explain why the Thatcher effect is more powerful for certain image categories.
One hypothesis was that perceived bizarreness facilitates the size of the Thatcher effect so that the more bizarre an image appears when upright, the larger the Thatcher effect. That is, two upright images will be much easier to discriminate during a same-different judgment if one is very bizarre, so fewer errors will occur for upright judgments. When inverted, this discrimination will be more difficult, and we will see a larger difference between upright and inverted images for that category, compared to images that are not rated highly bizarre. To address this idea, we asked 26 Vanderbilt undergraduate students (20 women and 6 men) to rate how bizarre each upright image appeared on a scale of 1 to 7 (1 = normal; 7 = very bizarre). Participants had not completed the previous experiments. We obtained separate ratings for image sets used in Experiment 1 and Experiment 2 because some images in Experiment 2 were altered. Because we focused our analysis on the results from Experiment 2, we will report the ratings that correspond to that task.

Design

Participants were in one of three possible conditions where each condition rated a different set of images. That is, all participants rated each image but at only one level of transformation. There were 9 subjects in each of Conditions 1 and 2 and 8 subjects in Condition 3. Image version was randomized and counterbalanced so that each condition rated approximately the same number of images at each level of transformation. Images were randomly presented in isolation in only the upright position for a total of 120 images in each condition.

Procedure

Participants were shown a series of images that varied in bizarreness. Participants were asked to rate, on a scale of 1-7, how bizarre each image appeared, relative to the object’s natural appearance (1 = normal, 7 = high level of bizarreness). Images remained on the screen until a response was made.
Each session took about 10 minutes to complete.

Results and Discussion

Figure 6 shows average bizarreness ratings for each object category at both Thatcherized 1 and Thatcherized 2 levels of transformation. As expected, Thatcherized 2 images were rated as more bizarre for all categories. Adult faces, grimacing faces, and baby faces were rated as most bizarre (M = 5.87, M = 5.83, and M = 5.68, respectively) for level 2 transformation. Words were also rated as highly bizarre, especially HF and LF non-words (M = 4.81, M = 5.00, respectively), which was expected.

To estimate the influence of perceived bizarreness on experience of the Thatcher effect, we correlated mean bizarreness ratings with the Thatcher effect for each category. Our correlation results (see Figure 7) revealed that bizarreness rating and Thatcher effect magnitude were positively correlated (r = 0.81) if we do not include letter-string categories in the analysis. We excluded letter-strings because it was expected that non-words would always be rated as more bizarre than words and other objects. At first glance, there seems to be a very strong correlation so that as images are rated as more bizarre, we obtain a larger Thatcher effect. However, we must note that if we analyze face categories separately from object/scene categories, each analysis reveals a correlation in the opposite direction. That is, for face categories, as bizarreness ratings decrease, the Thatcher effect increases. For object/scene categories, as bizarreness ratings increase, the Thatcher effect increases, although these categories do not follow a very definitive pattern. Therefore, it is difficult to determine the extent to which bizarreness influences the Thatcher effect based on this correlation. Recall that grimacing faces showed a comparable Thatcher effect to adult faces.
This was contrary to our prediction that grimacing faces should show a smaller Thatcher effect than adult faces because the Thatcherized and normal versions of a grimacing face are more similar, making local changes difficult to detect. This result suggests that bizarreness is not a necessary predictor of the Thatcher effect.

Orientation Judgment Task

A second proposed explanation for the Thatcher effect is greater experience or familiarity with an object at a given orientation. To test this hypothesis, we measured familiarity with a given orientation for each category as the speed with which an observer can determine an object’s orientation. If familiarity is important, we would expect faster orientation judgments to correlate with larger Thatcher effects for that category. Thirty-one undergraduate students from Vanderbilt University (17 women and 14 men) judged whether each image presented was upright (‘1’) or inverted (‘2’). All images, at each level of transformation (normal, Thatcherized 1, Thatcherized 2), were presented in both the upright and inverted orientations. Participants had not completed the previous experiments.

Results and Discussion

We focused our analysis on normal trials and computed the average reaction time (RT) to make an orientation judgment for each category. Figure 8 shows the correlation between orientation judgment RT and size of the Thatcher effect for each category. There was no significant correlation between orientation RT and Thatcher effects (r = -0.113). Our measure of familiarity may need improvement and could be explored in future research. However, recall that our results from Experiment 2 showed that animal faces, with which we are assumed to be less familiar than with adult faces, have a larger Thatcher effect than adult faces. In addition, LF letter-strings had a larger Thatcher effect than HF letter-strings, contrary to our predictions based on familiarity.
Therefore, familiarity with an object category is not a necessary predictor of the Thatcher effect.

General Discussion

Our results from Experiment 1 suggest that the Thatcher effect is not exclusive to faces. It does not appear to uniquely depend on factors such as expertise or the grotesque appearance of local transformations. We obtained significant Thatcher effects for all object categories and found that faces did not show the largest Thatcher effect. Therefore, we must reassess the reasons for experiencing the Thatcher effect, as it cannot be based solely on configural processing, grotesque expression, or expertise. Our bizarreness ratings indicate that although images rated as highly bizarre led to larger Thatcher effects (e.g., faces), a bizarre appearance was not necessary to obtain a significant Thatcher effect, as evidenced by cars, buildings, and scenes. Perception of bizarreness may help explain why face categories have larger Thatcher effects than object/scene categories in general, but we can obtain a Thatcher effect without the Thatcherized image looking bizarre. Bizarreness may therefore strengthen the Thatcher effect rather than explain it. Our orientation judgment results were not significant, and we may need to find a better measure of object familiarity. For example, we could test experts for certain categories (e.g., cars), assuming they are very familiar with that category, and see whether the Thatcher effect for that category increases with expertise. In general, face and letter-string categories had larger Thatcher effects than objects/scenes. This may be due to greater variability in the object/scene manipulations, but it could also reflect the fact that faces and words are encountered more often and are usually viewed upright, so that their frame of reference has a greater influence on the Thatcher effect.
Our results for the letter-string categories deserve extended attention. An ANOVA revealed main effects of both word type and word frequency, yet the two findings seem to contradict each other. On one hand, since words yield a larger Thatcher effect than non-words, we could claim that familiarity plays a role. On the other hand, the Thatcher effect for LF letter-strings is larger than for HF letter-strings, which argues against this idea. Figure 5 helps clarify these findings. First, non-words and words yielded very similar d’ values when inverted for both HF and LF strings, but differed when upright. This result led us to question whether inverted letter-strings are processed more like words or non-words. An experiment by Navon and Raveh (2004) on inverted word processing sheds some light on this issue. They suggested that there is one identification process for upright and inverted words that uses a set of cues such as letter direction and the spatial relationship between the reflected and adjacent letters. These cues may be disrupted during inversion and letter reflection. They proposed that inverted words are processed using a rectification strategy, or mental rotation, which normalizes the inverted word so it can be processed more like an upright word. They do not conclude whether this is a global or letter-by-letter process, but we can speculate on this idea. If inverted words undergo a letter-by-letter rectification process, and non-words are processed letter-by-letter whether upright or inverted, then this could lead to similar error rates, as evidenced by similar d’ values. When upright, non-words are still processed letter-by-letter, but words are processed in a more global manner, leading to differences in d’. The frequency effect occurs for both words and non-words, and this may be explained by how our non-words were created.
Recall that we created the non-words by transposing adjacent interior letters in words. Our non-words were therefore quite similar to their base words: all letters were the same, just rearranged, so orthographic information was preserved. Participants could be accessing information from the base word to make non-word judgments, which may also explain why we do not observe a frequency effect for non-words when upright. Perea and Rosa (2005) suggest that HF non-words generate more activation for lexical decisions during early word processing than LF non-words, making it easier to maintain grapheme information. Since there is less activation of orthographic information in LF non-words, more mistakes are made.

In summary, the present study went beyond previous demonstrations of the classical Thatcher illusion by quantifying the Thatcher effect and comparing it across a wide variety of object categories. Our results led us to conclude that the Thatcher effect is not face-specific and cannot be uniquely explained by familiarity or bizarreness. We speculate that the frame of reference hypothesis helps to explain why faces and letter-strings show a larger Thatcher effect than objects and scenes. While the Thatcher effect does not seem to depend on expertise with a specific category, as evidenced by a larger Thatcher effect for animals than adult faces, it may be that the frame of reference learned for faces generalizes to similar stimuli such as baby and animal faces. Likewise, the learned frame of reference for words may generalize to non-words. Future studies could explore this possibility in particular.

References

Bartlett, J. C., & Searcy, J. (1993). Inversion and configuration of faces. Cognitive Psychology, 25, 281-316.
Boutsen, L., & Humphreys, G. W. (2003). The effect of inversion on the encoding of normal and “Thatcherized” faces. Quarterly Journal of Experimental Psychology Section A - Human Experimental Psychology, 56, 955-975.
Coltheart, M. (1981). The MRC Psycholinguistic Database. Quarterly Journal of Experimental Psychology Section A - Human Experimental Psychology, 33, 497-505.
Muskat, J. A., & Sjoberg, W. G. (1997). Inversion and the Thatcher illusion in recognition of emotional expression. Perceptual and Motor Skills, 85, 1262.
Navon, D., & Raveh, O. (2004). On the process of recognizing inverted words: Does it rely only on orientation-invariant cues? Memory and Cognition, 32, 1103-1117.
Parks, T. E. (1983). Letter to the Editor. Perception, 12, 88.
Perea, M., et al. (2005). The frequency effect for pseudowords in the lexical decision task. Perception & Psychophysics, 67, 301-314.
Rock, I. (1988). On Thompson’s inverted-face phenomenon. Perception, 17, 815-817.
Thompson, P. (1980). Margaret Thatcher: A new illusion. Perception, 9, 483-484.
Valentine, T., & Bruce, V. (1985). What’s up? The Margaret Thatcher illusion revisited. Perception, 14, 515-516.
Yin, R. K. (1969). Looking at upside-down faces. Journal of Experimental Psychology, 81, 141-145.

Figure Captions

Figure 1. Examples of Thatcherization for each object category (except words). In all image groups, the left-hand image is normal, the middle image has been Thatcherized by locally inverting one part (level 1), and the right-hand image has been Thatcherized by locally inverting two parts (level 2). The first group represents the classical Thatcher illusion, where the eyes and mouth have been locally inverted. These changes are more difficult to detect when the image is globally inverted, but rotating the page 180 degrees will make these changes more obvious.

Figure 2. Example of Thatcherization for an LF word (top) and an HF non-word (bottom), followed by the full list of word and non-word stimuli. The list includes 40 letter-strings, with an equal number in each letter-string category.
Letters in bold were inverted to create Thatcherized images. The letter that was inverted for Thatcherized level 1 is underlined.

Figure 3. Accuracy and reaction time results for Experiment 1. The top graph shows mean delta (upright minus inverted) accuracy for all object categories. The bottom graph shows mean delta (upright minus inverted) correct reaction times for all object categories. Results are for “different” trials only.

Figure 4. Delta d’ results for Experiment 2. Size of the Thatcher effect across all object categories at both level 1 and level 2 transformations. Delta d’ was calculated by subtracting d’ for inverted trials from d’ for upright trials. An ‘x’ marks a non-significant TE.

Figure 5. d’ results for letter-string categories. The graph compares d’ values separately for inverted and upright letter-strings and also includes delta d’ for HF and LF letter-strings.

Figure 6. Mean bizarreness ratings for each object category at level 1 and level 2 transformations. Images were rated on a scale of 1 (normal) to 7 (very bizarre), and ratings were averaged across objects.

Figure 7. Correlation between level of perceived bizarreness and Thatcher effect, averaged across individual images. Correlation results exclude letter-strings. r = 0.81.

Figure 8. Correlation between mean RT for orientation judgment and Thatcher effect at both Thatcherized 1 and 2 levels. r = -0.113.

Figure 1. [Image panels: Face, Grimacing, Baby, Animal, Building, Car, Close-up Scene, Scene]

Figure 2.
Words, High Frequency     Non-words, High Frequency
evidence                  eivedcne
military                  mlitiray
question                  qeutsoin
together                  tgoteehr
anything                  aynhtnig
although                  atlohguh
interest                  itnreset
children                  cihdlern
national                  ntaoianl
position                  psotioin

Words, Low Frequency      Non-words, Low Frequency
primates                  piramets
blatancy                  balatcny
overfeed                  oevfreed
marathon                  mrataohn
fracture                  fartcrue
engraver                  egnarevr
beautify                  baetufiy
artifice                  atrficie
navigate                  nvagitae
wanderer                  wnaederr

Figure 3. Experiment 1 Results. [Bar graphs: top panel, mean delta accuracy (upright minus inverted); bottom panel, mean delta correct RTs (ms). Bars show Thatcherized Level 1 and Thatcherized Level 2 for Face, Grimacing, Baby, Animal, Building, Car, Close-up scene, Scene, Word_HF, Word_LF, Nonword_HF, and Nonword_LF.]

Figure 4. Size of Thatcher Effect. [Bar graph: y-axis, Thatcher effect (delta d’); bars show Thatcherized Level 1 and Thatcherized Level 2 for each category; ‘x’ marks non-significant effects.]

Figure 5. Letter-Strings. [Bar graph: y-axis, d’ (or delta d’ for TE); bars show Non-Words and Words for HF Upright, HF Inverted, HF Thatcher Effect, LF Upright, LF Inverted, and LF Thatcher Effect.]

Figure 6. Mean Bizarreness Ratings. [Bar graph: y-axis, mean bizarreness rating on the 7-point scale; bars show Thatcherized 1 and Thatcherized 2 for each category.]

Figure 7. Correlation between bizarreness rating and Thatcher effect magnitude, r = 0.813. [Scatter plot: x-axis, Thatcher effect magnitude; y-axis, mean bizarreness rating; points are labeled by category and transformation level (e.g., Face1, Face2) for the face, object, and scene categories.]

Figure 8.
Correlation between RT for Orientation Judgment and Thatcher effect magnitude, r = -0.113. [Scatter plot: x-axis, Thatcher effect magnitude; y-axis, mean RT (ms).]

Appendix A: d’ for upright and inverted (Thatcherized 2 level)

[Bar graph: d’ for upright (Dprime Upright) and inverted (Dprime Inverted) trials for each category.]

Appendix B: d’ values for upright and inverted (Thatcherized 1 Level)

Category                  Count  HITS  SE    FA    SE    Dprime  SE
Face, Inverted            21     0.86  0.02  0.27  0.04  1.81    0.13
Face, Upright             21     0.93  0.01  0.11  0.02  2.83    0.12
Grimacing, Inverted       21     0.90  0.01  0.48  0.04  1.38    0.13
Grimacing, Upright        21     0.92  0.01  0.19  0.03  2.44    0.11
Baby, Inverted            21     0.86  0.02  0.30  0.05  1.80    0.19
Baby, Upright             21     0.92  0.01  0.15  0.02  2.67    0.14
Animal, Inverted          21     0.90  0.01  0.64  0.04  0.96    0.15
Animal, Upright           21     0.90  0.01  0.44  0.05  1.56    0.18
Building, Inverted        21     0.88  0.02  0.58  0.04  1.01    0.11
Building, Upright         21     0.87  0.01  0.46  0.03  1.24    0.09
Car, Inverted             21     0.87  0.02  0.55  0.05  1.08    0.20
Car, Upright              21     0.87  0.02  0.50  0.06  1.21    0.21
Close-Up Scene, Inverted  21     0.91  0.01  0.38  0.04  1.76    0.14
Close-Up Scene, Upright   21     0.94  0.01  0.42  0.04  1.89    0.14
Scene, Inverted           21     0.85  0.02  0.76  0.03  0.30    0.11
Scene, Upright            21     0.86  0.02  0.71  0.03  0.62    0.10
Word_HF, Inverted         21     0.87  0.02  0.59  0.04  0.98    0.11
Word_HF, Upright          21     0.91  0.01  0.17  0.02  2.45    0.12
Word_LF, Inverted         21     0.87  0.02  0.71  0.03  0.70    0.12
Word_LF, Upright          21     0.90  0.01  0.18  0.02  2.30    0.12
Nonword_HF, Inverted      21     0.90  0.02  0.61  0.03  1.06    0.11
Nonword_HF, Upright       21     0.83  0.02  0.25  0.03  1.76    0.11
Nonword_LF, Inverted      21     0.87  0.02  0.70  0.03  0.61    0.14
Nonword_LF, Upright       21     0.85  0.02  0.31  0.03  1.65    0.15

SE = Standard Error; FA = False Alarm

Appendix C: d’ values for upright and inverted (Thatcherized 2 Level)

Category                  Count  HITS  SE    FA    SE    Dprime  SE
Face, Inverted            21     0.86  0.02  0.18  0.03  2.19    0.13
Face, Upright             21     0.93  0.01  0.09  0.01  2.88    0.08
Grimacing, Inverted       21     0.90  0.01  0.28  0.03  1.97    0.12
Grimacing, Upright        21     0.92  0.01  0.09  0.01  2.81    0.09
Baby, Inverted            21     0.86  0.02  0.28  0.04  1.84    0.16
Baby, Upright             21     0.92  0.01  0.11  0.02  2.84    0.16
Animal, Inverted          21     0.90  0.01  0.59  0.05  1.06    0.15
Animal, Upright           21     0.90  0.01  0.21  0.04  2.35    0.17
Building, Inverted        21     0.88  0.02  0.53  0.04  1.15    0.12
Building, Upright         21     0.87  0.01  0.34  0.04  1.62    0.12
Car, Inverted             21     0.87  0.02  0.30  0.03  1.85    0.16
Car, Upright              21     0.87  0.02  0.16  0.02  2.30    0.15
Close-Up Scene, Inverted  21     0.91  0.01  0.29  0.03  2.02    0.11
Close-Up Scene, Upright   21     0.94  0.01  0.28  0.03  2.26    0.10
Scene, Inverted           21     0.85  0.02  0.63  0.04  0.69    0.10
Scene, Upright            21     0.86  0.02  0.58  0.04  0.98    0.13
Word_HF, Inverted         21     0.87  0.02  0.45  0.04  1.34    0.11
Word_HF, Upright          21     0.91  0.01  0.13  0.02  2.63    0.14
Word_LF, Inverted         21     0.87  0.02  0.59  0.04  1.02    0.11
Word_LF, Upright          21     0.90  0.01  0.08  0.01  2.79    0.13
Nonword_HF, Inverted      21     0.90  0.02  0.45  0.04  1.52    0.13
Nonword_HF, Upright       21     0.83  0.02  0.20  0.03  1.97    0.15
Nonword_LF, Inverted      21     0.87  0.02  0.56  0.04  1.03    0.14
Nonword_LF, Upright       21     0.85  0.02  0.20  0.03  2.03    0.17

SE = Standard Error; FA = False Alarm
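
For readers who wish to work with the appendix values, the following is a minimal sketch of how d’ and the Thatcher effect (delta d’) can be computed. It assumes the standard equal-variance signal detection formula, d’ = z(hit rate) - z(false alarm rate), which this section does not state explicitly. The function names d_prime and thatcher_effect are illustrative and are not taken from the original analysis. Because the tabled d’ values appear to be averaged across participants, applying the formula to the group mean rates gives only an approximation of the tabled values.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity under the equal-variance model (assumed): z(H) - z(FA).

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def thatcher_effect(d_upright: float, d_inverted: float) -> float:
    """Delta d': d' for upright trials minus d' for inverted trials."""
    return d_upright - d_inverted


# d' values copied from Appendix C (Thatcherized 2 level): (upright, inverted)
appendix_c = {
    "Face": (2.88, 2.19),
    "Animal": (2.35, 1.06),
    "Scene": (0.98, 0.69),
    "Word_LF": (2.79, 1.02),
}

effects = {
    category: round(thatcher_effect(upright, inverted), 2)
    for category, (upright, inverted) in appendix_c.items()
}

if __name__ == "__main__":
    # Approximate d' for upright faces from group mean rates (Appendix C).
    print(round(d_prime(0.93, 0.09), 2))
    for category, effect in effects.items():
        print(category, effect)
```

With these values, the delta d’ for LF words (1.77) and animal faces (1.29) exceeds that for adult faces (0.69), in line with the pattern described in the text.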