Zhou, Z., & Firestone, C. (in press). Humans can decipher adversarial images. Nature Communications. [preprint]
Lowet, A. S., Firestone, C., & Scholl, B. J. (2018). Seeing structure: Shape skeletons modulate perceived similarity. Attention, Perception, & Psychophysics, 80, 1278-1289. [pdf]
Firestone, C., & Scholl, B. J. (2017). Seeing and thinking in studies of embodied "perception": How (not) to integrate vision science and social psychology. Perspectives on Psychological Science, 12, 341-343. [pdf] - This is a reply to a target article by Schnall. See another reply by Durgin and a reply to those replies by Schnall in turn.
Firestone, C., & Scholl, B. J. (2016). Cognition does not affect perception: Evaluating the evidence for 'top-down' effects. Behavioral & Brain Sciences, e229, 1-77. [pdf]
Firestone, C., & Keil, F. C. (2016). Seeing the tipping point: Balance perception and visual shape. Journal of Experimental Psychology: General, 145, 872-881. [pdf]
Firestone, C. (2016). Embodiment in perception: Will we know it when we see it? In H. Kornblith & B. McLaughlin (eds.), Alvin Goldman and his Critics (pp. 318-334). Wiley Blackwell. [pdf] - Reply by Goldman.
Firestone, C., & Scholl, B. J. (2016). 'Moral perception' reflects neither morality nor perception. Trends in Cognitive Sciences, 20, 75-76. [pdf] - Response (by Gantman & Van Bavel), and a rejoinder (by us) to that response.
Firestone, C., & Scholl, B. J. (2015). When do ratings implicate perception vs. judgment? The "overgeneralization test" for top-down effects. Visual Cognition, 23, 1217-1226. [pdf]
Firestone, C., & Scholl, B. J. (2015). Enhanced visual awareness for morality and pajamas? Perception vs. memory in top-down effects. Cognition, 136, 409-416. [pdf]
Firestone, C., & Scholl, B. J. (2015). Can you experience top-down effects on perception? The case of race categories and perceived lightness. Psychonomic Bulletin & Review, 22, 694-700. [pdf]
Firestone, C., & Scholl, B. J. (2014). "Please tap the shape, anywhere you like": Shape skeletons in human vision revealed by an exceedingly simple measure. Psychological Science, 25, 377-386. [pdf]
Firestone, C., & Scholl, B. J. (2014). "Top-down" effects where none should be found: The El Greco fallacy in perception research. Psychological Science, 25, 38-46. [pdf]
Firestone, C. (2013). How 'paternalistic' is spatial perception? Why wearing a heavy backpack doesn't and couldn't make hills look steeper. Perspectives on Psychological Science, 8, 455-473. [pdf] Responses: - Proffitt (2013) - Witt (2015)
How do prior assumptions about uncertain data inform our inferences about those data? Increasingly, such inferences are thought to work in the mind the way they should work in principle — with our interpretations of uncertain evidence being nudged towards our prior hypotheses in a “rational” manner approximating Bayesian inference. By contrast, here we explore a class of phenomena that appear to defy such normative principles of inference: Whereas inferences about new data are typically attracted toward prior expectations, we demonstrate how inferences may also be repelled away from prior expectations. In seven experiments, subjects briefly saw arrays of two spatially intermixed sets of objects (e.g. several dozen squares and circles). Over the course of the session, subjects learned that one set was typically more numerous than the other — for example, that there are typically more squares than circles. Surprisingly, upon forming the expectation that they would continue to see more squares, subjects who were then shown an equal number of squares and circles (such that it was unclear exactly which had more) judged the circles to be more numerous, seemingly adjusting their inferences away from their prior hypothesis about what they would see. Six follow-up experiments show how this effect is not explained by low-level sensory adaptation (occurring even when various sensory dimensions are equated), generalizes to many kinds of stimuli (including colors, and configurally-defined letters), and is robust to different measures (not only forced-choice [“which has more?”] but also precise enumeration [“how many are there?”]). We discuss how this “expectation contrast” effect is a genuine case of adjusting “away” from our priors, in seeming defiance of normative principles of inference. We also point to a broader class of phenomena that may behave in this way, and explore their consequences for Bayesian models of perception and cognition.
Some properties of objects are intrinsic to the objects themselves, whereas other properties encompass that object’s relationship to other objects or events in a scene. For example, when completing a jigsaw puzzle, we might notice not only the singular properties of an individual piece (e.g., its particular shape), but also its relationship to other pieces — including its ability to combine with another piece to form a new object. Here, we explore how the visual system represents the potential for two discrete objects to create something new. Our experiments were inspired by the puzzle game Tetris, in which players combine various shapes to build larger composite objects. Subjects saw a stream of images presented individually, and simply had to respond whenever they saw a certain target image (such as a complete square), and not at any other time. The stream also included distractor images consisting of object-pairs (shaped like the “tetrominoes” of Tetris) that either could or could not combine to produce the subject’s target. Accuracy was very high, but subjects occasionally false-alarmed to the distractor images. Remarkably, subjects were more likely to false-alarm to tetromino-pairs that could create their target than to tetromino-pairs that could not, even though both kinds of images were visually dissimilar to the target. We also observed a priming effect, whereby target responses were faster when the previous trial showed tetrominoes that could create the target vs. tetrominoes that could not. Follow-up experiments revealed that these effects were not simply due to a general response bias favoring matching shapes, nor were the results explained simply by representational momentum due to perceived “gravity” (since the effects generalized to 90-degree rotations of the tetromino-pair images). These results suggest that the mind automatically and rapidly evaluates discrete objects for their potential to combine into something new.
We can readily appreciate whether a tower of blocks will topple or a stack of dishes will collapse. How? Recent work suggests that such physical properties of scenes are extracted rapidly and efficiently as part of automatic visual processing (Firestone & Scholl, VSS2016, VSS2017). However, physical reasoning can also operate in ways that seemingly differ from visual processing. For example, subjects who are explicitly told that some blocks within a tower are heavier than others can rapidly update their judgments of that tower’s stability (Battaglia et al., 2013); by contrast, automatic visual processing is typically resistant to such explicit higher-level influence (Firestone & Scholl, 2016). Here, we resolve this apparent conflict by revealing how distinct flexible and inflexible processes support physical understanding. We showed subjects towers with differently-colored blocks, where one color indicated a 10x-increase in mass. Subjects successfully incorporated this information into their judgments of stability, accurately identifying which towers would stand or fall by moving their cursors to corresponding buttons. However, analyses of these cursor trajectories revealed that some towers were processed differently than others. Specifically, towers that were “stable” but that would have been unstable had the blocks been equally heavy (i.e. towers with unstable geometries) yielded meandering cursor trajectories that drifted toward the incorrect stability judgment (“fall”) before eventually arriving at the correct judgment (“stand”). By contrast, towers that were “stable” both in terms of their differentially heavy blocks and in terms of their superficial geometries produced considerably less drift. In other words, even when subjects accurately understood how a tower would behave given new information about mass, their behaviors revealed an influence of more basic visual (geometric) cues to stability. 
We suggest that physical understanding may not be a single process, but rather one involving separable stages: a fast, reflexive, “perceptual” stage, and a slower, flexible “cognitive” stage.
A notoriously tricky “bar bet” proceeds as follows: One patron wagers another that the distance around the rim of a standard pint glass is about twice the glass’s height. Surprisingly, this patron is usually correct, owing to a powerful (but, to our knowledge, unexplained) visual illusion wherein we severely underestimate the circumferences of circles. Here, we characterize this illusion and test an explanation of it: We suggest that the difficulty in properly estimating the perimeters of circles and other shapes stems in part from the visual system’s representation of such shapes as closed objects, rather than as open contours which might be easier to ‘mentally unravel’. Subjects who saw circles of various sizes and adjusted a line to match the circles’ circumferences greatly underestimated circumference — initially by a magnitude of over 35%. (Care was taken to exclude subjects who conflated circumference with diameter.) Estimates for these closed circles were then compared to estimates of the perimeter of a circle that was missing a continuous 18-degree segment of arc. We predicted that removing a portion of the circle’s perimeter would, paradoxically, cause the circle’s perimeter to appear longer, since this violation of closure would bias the visual system to process the stimulus as an open contour. Results revealed that, indeed, this manipulation very reliably reduced the magnitude of this “pint glass illusion” by as much as 30%, such that a circle missing a portion of its circumference was judged to have a greater perimeter than a complete, closed circle of the same diameter. We suggest that the property of closure not only influences whether a stimulus is processed as an object, but also constrains how easily such a stimulus can be manipulated in the mind.
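The arithmetic behind the bar bet can be sketched in a few lines. The glass dimensions below are illustrative assumptions (they are not taken from the study); the only facts used are that circumference equals pi times diameter and that an 18-degree gap removes 18/360 of the contour.

```python
import math

def circumference(diameter):
    """Circumference of a circle with the given diameter."""
    return math.pi * diameter

# Assumed, illustrative dimensions for a pint glass (not from the study).
rim_diameter_cm = 8.9
height_cm = 15.0

# How many glass-heights fit around the rim.
ratio = circumference(rim_diameter_cm) / height_cm

# Fraction of the contour removed by an 18-degree gap.
gap_fraction = 18 / 360

print(round(ratio, 2), gap_fraction)
```

With these assumed dimensions the rim is a little under twice the height, even though the open-contour stimulus is physically shorter than the closed circle by 5% of its circumference.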
Objects in the world frequently strike us as being complex (and informationally rich), or simple (and informationally sparse). For example, a crenulate and richly-organized leaf might look more complex than a plain stone. What is the nature of our experience of complexity — and why do we have this experience in the first place? We algorithmically generated hundreds of smoothed-edge shapes, and determined their complexity by computing the cumulative surprisal of their internal skeletal structure — essentially quantifying the amount of information in the object. Subjects then completed a visual search task in which a single complex target appeared among identical simple distractors, or a single simple target appeared among identical complex distractors. Not only was search for complex targets highly efficient (8ms/item), but it also exhibited a search asymmetry: a complex target among simple distractors was found faster than a simple target among complex distractors — suggesting that visual complexity is extracted ‘preattentively’. (These results held over and above low-level properties that may correlate with complexity, including area, number of sides, spatial frequency, angular magnitudes, etc.). Next, we explored the function of complexity; why do we experience simplicity and complexity in the first place? We investigated the possibility that visual complexity is an attention-grabbing signal indicating that a stimulus contains something worth learning. Subjects who had to memorize and later recall serially presented objects recalled complex objects better than simple objects — but only when such objects appeared within a set of other objects, and not when they were presented one-at-a-time (suggesting that the effect is not driven simply by increased distinguishability of complex shapes). 
We suggest not only that object complexity is extracted efficiently and preattentively, but also that complexity arouses a kind of 'visual curiosity' about objects that improves subsequent learning and memory.
Does a gray banana look yellow? Does a heart look redder than a square? A line of research stretching back nearly a century suggests that knowing an object’s canonical color can alter its visual appearance. Are such effects truly perceptual, or might they instead reflect biased responses without altering online color perception? Here, we replicate such classical and contemporary “memory-color effects”, but then extend them to include conditions with counterintuitive hypotheses that would be difficult for subjects to grasp; across multiple case studies, we find that such conditions eliminate or even reverse memory-color effects in ways unaccounted-for by their underlying theories. We first replicated the classic finding that hearts are judged as redder than squares, as measured by matching a color-adjustable background to a central stimulus. But when we varied the shape of the background itself (to be either square or heart-shaped), subjects who estimated a square’s color by adjusting a heart-shaped background made the background redder than when adjusting a square-shaped background — whereas a memory-color theory would predict the opposite pattern. Next, we successfully replicated the more recent finding that gray disks and blueish bananas are judged as more purely gray than are gray bananas (which purportedly appear yellow); however, we also found that a blueish disk is judged to be more gray than a blueish banana, exactly opposite the prediction of memory-color theories. Moreover, when asked to identify the “odd color out” from an array of three objects (e.g., gray disk, gray banana, and blueish banana) subjects easily identified the blueish banana as the odd color out, even though memory-color theories predict that subjects should pick the gray banana. We suggest that memory color effects may not be truly perceptual, and we discuss the utility of this general approach for separating perception from cognition.
Is working memory simply the reactivation of perceptual representations? Decoding experiments with fMRI suggest that perceptual areas maintain information about what we have seen in working memory. But is this activity the basis of visual working memory itself? If it is, then perceptual interference during maintenance should impair our ability to remember. We tested this prediction by measuring visual working memory performance with and without interfering mask gratings, presented during the memory delay at the same location as the to-be-remembered stimulus. Participants memorized the orientations of 1-4 sample gratings, which appeared for 800ms. After a 5-second pause, the participants were exposed to a target grating in the same location as one of the sample gratings, and the target grating was rotated either clockwise or counterclockwise relative to the original. The task was to identify the direction of change. The key manipulation was that during the 5-second maintenance period, participants were exposed either to a blank screen, or to a rapidly changing stream of mask gratings in each of the previously occupied positions. We reasoned that if visual working memory relies on early perceptual substrates then exposure to conflicting masks that putatively activate the same substrates should impair performance (relative to no-mask trials). In other words, there should be interference between the rapidly changing perceptual inputs and the perceptually maintained memory representations at the same retinal location. Contrary to this prediction, there was no difference in performance between the masked and unmasked conditions. We did, however, observe significantly reduced accuracy as a function of set size (the number of sample gratings in a trial). This evidence suggests that representations in early perceptual brain regions may not play a functional role in maintaining visual features.
How do prior assumptions about uncertain data inform our inferences about those data? Increasingly, such inferences are thought to work in the mind the way they *should* work in principle — with our interpretations of uncertain evidence being nudged toward our prior hypotheses in a “rational” manner approximating Bayesian inference. This approach has taken the mind and brain sciences by storm, being successfully applied to perception, learning, memory, decision-making, language, and development — leading psychologists, neuroscientists, and philosophers to argue that “humans act as rational Bayesian estimators” (Clark, 2013) and that we have a fundamentally “Bayesian brain” (Knill & Pouget, 2004).
Do any mental phenomena resist such a rational analysis? Whereas some researchers have suggested so by pointing to cases where people reason poorly about various kinds of evidence (as in, e.g., base-rate neglect or the conjunction fallacy), we focus here on a more specific — and perhaps more puzzling — sort of interaction between prior hypotheses and new evidence. In particular, whereas inferences about new data are typically *attracted toward* prior expectations, we show here how inferences may also be *repelled away* from prior expectations, in seeming defiance of normative statistical inference. We do this both by reporting new experiments that investigate these phenomena, and also by reevaluating previously under-emphasized findings. We call such inferences “antirational” (to distinguish them from mere *irrationality*), because they appear to proceed exactly opposite the recommendation of a rational analysis. We conclude by discussing the consequences of such phenomena for foundational issues in philosophy and psychology.
What is this class of phenomena? Consider the classic *size-weight illusion* (Charpentier, 1891), wherein subjects are shown a large object and a small object that are in fact objectively equal in mass, and the subject is asked to lift them both up. Which object should feel heavier? The straightforward “Bayesian” prediction is that the *larger* object should feel heavier, since the ambiguous evidence (two objects giving approximately equal resistance) should be resolved in favor of the strong prior (that larger = heavier). However, the surprising result of the size-weight illusion, replicated hundreds of times over the last century, is that the *smaller* object feels heavier! For this reason, the size-weight illusion is sometimes considered a “problem case” for larger-scale theories of a Bayesian mind/brain (Clark, 2013).
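The "straightforward Bayesian prediction" described above can be made concrete with a minimal sketch. It assumes a Gaussian prior and Gaussian sensory evidence, which is an illustrative modeling choice and not something the abstract specifies; all numbers are hypothetical.

```python
def gaussian_posterior(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with one Gaussian observation (conjugate update)."""
    # Each source is weighted by its precision (inverse variance).
    prior_precision = 1.0 / prior_var
    obs_precision = 1.0 / obs_var
    w_prior = prior_precision / (prior_precision + obs_precision)
    post_mean = w_prior * prior_mean + (1.0 - w_prior) * obs
    post_var = 1.0 / (prior_precision + obs_precision)
    return post_mean, post_var

# Hypothetical numbers: both objects yield the same felt-weight evidence (1.0),
# but the prior expects the large object to be heavier than the small one.
large_mean, _ = gaussian_posterior(prior_mean=2.0, prior_var=1.0, obs=1.0, obs_var=1.0)
small_mean, _ = gaussian_posterior(prior_mean=0.5, prior_var=1.0, obs=1.0, obs_var=1.0)

print(large_mean, small_mean)  # 1.5 0.75
```

Under these assumptions the estimate for each object is pulled toward its prior, so the larger object should feel heavier. The size-weight illusion goes the other way, which is what makes it a problem case.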
At the same time, this single illusion is a somewhat ‘fringe’ phenomenon, involving many factors that are poorly understood in their own right. Our goal is thus to demonstrate that the very same logic that makes the size-weight illusion so puzzling is actually highly generalizable, and can be exploited to produce other kinds of antirational updating, including in other areas of cognitive science where it may be easier to rule out alternative explanations (cf. Peters et al., 2016).
We demonstrated this by studying the perception of numerosity. Across nine experiments, subjects briefly saw arrays of two spatially intermixed sets of objects (e.g. several dozen squares and circles). Over the course of the session, subjects learned that one set was typically more numerous than the other — for example, that there are typically more squares than circles. Surprisingly, however, subjects who were then shown an *equal* number of squares and circles on a subsequent trial (such that it was unclear exactly which had more) judged the *circles* to be more numerous. In other words, just as in the size-weight illusion, subjects adjusted their inferences *away* from their prior hypotheses about what they would see — seemingly doing exactly the opposite of what a rational model would dictate.
Follow-up experiments (1) generalized this phenomenon to many other kinds of stimuli, including not only shapes but also colors (blue vs. yellow dots) and configurally-defined letters (Ts vs. Ls); (2) ruled out low-level sensory adaptation, since the effects also occur (a) even when various sensory dimensions are equated; and (b) even at very short exposure durations (100ms) and very long intertrial intervals (1 minute) — conditions that do not reliably produce adaptation in other contexts; (3) extended the effect beyond one specific judgment made by subjects, since the results obtain both with two-alternative forced-choice (“which has more?”) and precise enumeration (“how many are there?”).
This work points to a new and general sort of phenomenon in the mind: A kind of contrast effect between hypotheses and evidence that consists in adjusting away from our priors. The existence of such an “antirational” class of mental phenomena is both a discovery to be explained by cognitive science, and a challenge to notions of mental processes as rational inferences championed by psychologists and philosophers alike.
When assembling furniture or completing a jigsaw puzzle, we appreciate not only the particular shapes of individual objects, but also their potential to *combine* into new objects. How does the mind extract this property? In 5 experiments inspired by Tetris, subjects had to respond to a particular target within a stream of “tetrominoes”; however, subjects false-alarmed more often to pairs of tetrominoes that could create their target than to tetromino-pairs that couldn’t—essentially confusing ‘potential’ objects for real ones. We suggest that the mind automatically represents not only what objects *are*, but also what they *could become*.
Human vision is increasingly well-approximated by cutting-edge Convolutional Neural Networks. However, such models are “fooled” by so-called adversarial examples — carefully-crafted images that appear as nonsense to humans but as objects to CNNs. Surprisingly, however, little work has investigated human performance on such stimuli; could humans “crack” adversarial images by predicting the machine’s classifications? In four experiments on three prominent adversarial imagesets, subjects reliably identified the machine’s chosen label over relevant foils — even for images previously considered “totally unrecognizable to human eyes”. Computer object-representation may resemble a human’s more than recent challenges suggest.