
I believe that music sounds like people, moving.

Yes, the idea may sound a bit crazy, but it’s an old idea, much discussed in the 20th century, and going all the way back to the Greeks. There are lots of things going for the theory, including that it helps us explain (1) why our brains are so good at absorbing music (…because we evolved to possess human-movement-detecting auditory mechanisms), (2) why music emotionally moves us (…because human movement is often expressive of the mover’s mood or state), and (3) why music gets us moving (…because we’re a social species prone to social contagion).

And as I describe in detail in my upcoming book – “Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man” – music has the signature auditory patterns of human movement (something I hint at here: http://www.science20.com/mark_changizi/music_sounds_moving_people ).

Here I’d like to describe a novel way of thinking about what the meaning of music might be.

Rather than dwelling on the sound of music, I’d like to focus on the look of music.

In particular, what does our brain think music looks like?

It is natural to assume that the visual information streaming into our eyes determines the visual perceptions we end up with, and that the auditory information entering our ears determines the events we hear. But the brain is more complicated than this. Visual and auditory information interact in the brain, and the brain utilizes both to guess the single scene to render a perception of. For example, research by Ladan Shams, Yukiyasu Kamitani and Shinsuke Shimojo at Caltech has shown that we perceive a single flash as a double flash if it is paired with a double beep. And Robert Sekuler and others at Brandeis University have shown that if a sound occurs at the moment when two balls pass through each other on screen, the balls are instead perceived to have collided and reversed direction. These and other results of this kind demonstrate the interconnectedness of visual and auditory information in our brain. Visual ambiguity can be reduced with auditory information, and vice versa. And, generally, both are brought to bear in the brain’s attempt to infer the best guess about what’s out there.

Your brain does not, then, consist of independent visual and auditory systems, with separate troves of visual and auditory “knowledge” about the world. Instead, vision and audition talk to one another, and there are regions of cortex responsible for making vision and audition fit one another. These regions know about the sounds of looks and the looks of sounds. Because of this, when your brain hears something but cannot see it, your brain does not just sit by and refrain from guessing what it might have looked like. When your auditory system makes sense of something, it will have a tendency to activate visual areas, eliciting imagery of its best guess as to the appearance of the stuff making the sound. For example, the sound of your neighbor’s rustling tree may bring to mind an image of its swaying lanky branches. The whine of your cat heard far away may evoke an image of it stuck up high in that tree. And the pumping of your neighbor’s kid’s BB gun can bring forth an image of the gun being pointed at Foofy way up there.

Your visual system has, then, strong opinions about the proper look of the things it hears. And, bringing ourselves back to music, we can use the visual system’s strong opinions as a means for gauging music’s meaning. In particular, we can ask your visual system what it thinks the appropriate visual is for music. If, for example, the visual system responds to music with images of beating hearts, then it would suggest, to my disbelief, that music mimics the sounds of heartbeats. If, instead, the visual system responds with images of pornography, then it would suggest that music sounds like sex. You get the idea.

But in order to get the visual system to act like an oracle, we need to get it to speak. How are we to know what the visual system thinks music looks like? One approach is simply to ask which visuals are, in fact, associated with music. For example, when people create imagery of musical notes, what does it look like? One cheap way to look into this is simply to do a Google (or any search engine) image search on the term “musical notes.” You might think such a search would merely return images of simple notes on the page. However, that is not what one finds. To my surprise, actually, most of the images are like the one in the nearby figure, with notes drawn in such a way that they appear to be moving through space. Notes in musical notation never actually look anything like this, and real musical notes have no look at all (because they are sounds). And yet we humans seem to be prone to visually depicting notes as moving all about.

Could these images of notes in motion be due to a more mundane association? Music is played by people, and people have to move in order to play their instrument. Could this be the source of the movement-music association? I don’t think so, because the movement suggested in these images of notes doesn’t look like an instrument being played. In fact, it is common to show images of an instrument with the notes beginning their movement through space from the instrument: these notes are on their way somewhere, not an indication of the musician’s key-pressing or back-and-forth movements.

Could it be that the musical notes are depicted as moving through space because sound waves move through space? The difficulty with this hypothesis is that all sound moves through space. All sound would, if this were the case, be visually rendered as moving through space, but that’s not the case. For example, speech is not usually visually rendered as moving through space. Another difficulty is that the musical notes are usually meandering in these images, but sound waves are not meandering – sound waves go straight. A third problem with sound waves underlying the visual metaphor is that we never see sound waves in the first place.

Another possible counter-hypothesis is that the depiction of visual movement in the images of musical notes is because all auditory stimuli are caused by underlying events with movement of some kind. The first difficulty, as was the case for sound waves, is that it is not the case that all sound is visually rendered in motion. The second difficulty is that, while it is true that sounds typically require movement of some kind, it need not be movement of the entire object through space. Moving parts within the object may make the noise, without the object going anywhere. In fact, the three examples I gave earlier – leaves rustling, Foofy whining, and the BB gun pumping – are noises without any bulk movement of the object (the tree, Foofy, and the BB gun, respectively).  The musical notes in imagery, on the other hand, really do seem to be moving, in bulk, across space.

Music is like tree-rustling, Foofy, BB guns and human speech in that it is not made via bulk movement through space. And yet music appears to be unique in its tendency to be visually depicted as moving through space. In addition, not only are musical notes rendered as in motion, they tend to be depicted as meandering.

When visually rendered, music looks alive and in motion (often along the ground), just what one might expect if music’s secret is that it sounds like people moving.

A Google Image search on “musical notes” is one means by which one may attempt to discern what the visual system thinks music looks like, but another is to simply ask ourselves what is the most common visual display shown during music. That is, if people were to put videos to music, what would the videos tend to look like?

Lucky for us, people do put videos to music! They’re called music videos, of course. And what do they look like? The answer is so obvious that it hardly seems worth noting: music videos tend to show people moving about, usually in a time-locked fashion to the music, very often dancing.

As obvious as it is that music videos typically show people moving, we must remember to ask ourselves why music isn’t typically visually associated with something very different. Why aren’t music videos mostly of rivers, avalanches, car races, wind-blown grass, lion hunts, fire, or bouncing balls? It is because, I am suggesting, our brain thinks that humans moving about is what music should look like…because it thinks that humans moving about is what music sounds like.

Musical notes are rendered as meandering through space. Music videos are built largely from people moving, and in a time-locked manner to the music. That’s beginning to suggest that the visual system is under the impression that music sounds like human movement. But if that’s really what the visual system thinks, then it should have more opinions than simply that music sounds like movement. It should have opinions about what, more exactly, the movement should look like. Do our visual systems have opinions this precise? Are we picky about the mover that’s put to music?

You bet we are! That’s choreography. It’s not enough to play a video of the Nutcracker ballet during Beatles music, nor will it suffice to play a video of the Nutcracker to the music of Nutcracker, but with a small time lag between them. The video of human movement has to have all the right moves at the right time to be the right fit for the music.

These strong opinions about what music looks like make perfect sense if music mimics human movement sounds. In real life, when people carry out complex behaviors, their visual movements are tightly choreographed with the sounds – because the sight and sound are due to the same event. When you hear movement, you expect to see that same movement. Music sounds to your brain like human movement, which is why when your brain hears music, it expects that any visual of it should be consistent with it.

~~~

This was adapted from Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man (Benbella Books, 2011).

~~~

This first appeared July 28, 2010, at Science 2.0.

Mark Changizi is Professor of Human Cognition at 2AI, and the author of The Vision Revolution (Benbella Books) and the upcoming book Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man (Benbella Books).


Migraine sufferers have long complained about how their headaches worsen with bright light, and in case you ever doubted their complaints, Rami Burstein and other researchers from Harvard Medical School and the Moran Eye Center at the University of Utah recently took a giant step toward understanding the light-to-headache mechanism, reported in Nature Neuroscience. They found neurons in the rat thalamus sensitive both to light and to signals from the dura (the membrane surrounding the brain).

More intriguing than the “how” of light and headaches is the “why”. Why should light be linked with pain mechanisms at all? Why should the retina be in the business of eliciting pain in your brain?

Upon reflection, however, we all know of occasions where looking hurts. The most obvious case is when we look at the sun. And another obvious case is when someone shines a flashlight in our eyes in the dark. In each case we are likely to respond, “Ouch!” From these real-world links between light and pain can we discern what the link may be for?

light aggravates headaches

The example of the sun may coax us into suggesting that it is the retina-scorching amount of light that hurts. However, the fact that the same kind of discomfort occurs when someone shines a flashlight in our eyes shows it is not the intrinsic amount of light that is the source of the pain. A flashlight can be so dim that we can hardly see it in daytime, and yet hurt when shone in our eyes at night. The flashlight’s beam is not scorching anything, although the pain it elicits is every bit as real.

Instead, I suggest that these light/pain phenomena are similar to pain in other domains of our life. The general role of pain is not merely to tell us that something has been damaged, but to motivate us to modify our behavior toward safer or smarter action (and to do so without our having to consciously think about it). For example, subtle pain signals are constantly causing me to shift my weight as I sit here and type this, leading to healthier blood flow in my lower extremities. Our eye fixations are like fingertips, reaching out and touching things in the world; just as fingertips need a pain sense to help optimally guide their behavior, so do our eye fixations.

In our normal viewing experiences there are very often wild fluctuations in brightness in our visual field, often due to the sun or to reflections of the sun. We are typically not interested in looking at objects having this full breadth of brightnesses, but, instead, at a range of “interesting objects” at a narrower range of brightnesses. To help us best see the objects of current interest, our visual system adapts to the brightness levels among them. If we were to fixate on a part of the scene that is much brighter than these interesting objects (perhaps a spot of glare), then our eyes would begin to adapt to the new brightness level, and when we look back at the objects of interest, we will be unable to see them well.
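If it helps, here is a toy sketch of that adaptation story in Python. Everything in it – the units, the drift rate, and the exponential-drift rule itself – is an illustrative assumption of mine, not a model from the vision literature:

```python
def adapt(level, fixated_luminance, rate=0.5):
    # One time step of light adaptation: the visual system's operating
    # point drifts toward whatever luminance is currently fixated.
    return level + rate * (fixated_luminance - level)

objects = 10.0   # luminance of the interesting objects (arbitrary units)
glare = 100.0    # a much brighter spot of sun glare in the same scene

level = objects            # start well adapted to the objects of interest
for _ in range(3):         # linger on the glare for a few moments...
    level = adapt(level, glare)

# ...and the operating point is now far from the objects of interest,
# so on looking back at them we see them poorly until we re-adapt.
print(level)  # 88.75
```

On this sketch, the “eye pain” I am proposing is simply a nudge that keeps `level` parked near `objects` by making the glare costly to fixate.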

“Eye pain” of this kind may be the principal unconscious mechanism that keeps us fixating in a smart fashion within our visual field; it is what keeps our eyes performing at their best given our interests at the time. Although this kind of mechanism is unconscious, it by no means needs to be stupid. Instead, it may be able to infer where the brightest parts of the scene are on the basis of global cues in the scene.

For example, look at the earlier photograph of the glaring sun. It feels somewhat discomforting to look at this photograph, and our eyes want to steer clear of the sun. Yet the brightest spot at the center of the sun in the photograph is no brighter than the white elsewhere on this web page, which causes us no discomfort to look at. Our brain seems to be able to recognize the sun-glare-like cues in the photograph, and elicits the glare-avoidance pain mechanisms for it but not for the white elsewhere on screen.

In light of these ideas for the role of light in pain, could it be that migraine-like headaches are the normal kind of pain elicited for these light/pain mechanisms, and that the trouble for migraine sufferers is the overactivation of these usually-functional mechanisms?

This first appeared on February 26, 2010, as a feature at ScientificBlogging.com.

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).


Scientists are prone to going on and on about how strikingly early in life we are able to comprehend speech. Our children’s aptitude for reading, however, doesn’t cause much excitement. At first glance this seems sensible: children comprehend speech fairly well by two, whereas they typically can’t read until about five. This is because, the standard story goes, we evolved to comprehend speech but did not evolve to read. And while one might debate whether we have evolved to comprehend speech, no one believes we evolved to read. Writing is only several thousand years old, far too short a time to have crafted reading mechanisms in our brain. And for many of us, our ancestors only started reading one or several generations back.

But are children really so clunky at learning to read? At five years old, most children can’t be trusted to pour a pint of beer without spilling it, and most can’t even do stereotypical ape behaviors like somersaults and the monkey bars. And yet these same wee ones are reading. That’s quite an accomplishment for an ape, especially one who gets read to so infrequently compared to getting talked to.


Children are, in fact, quick learners of reading, and our brains become fantastically capable readers. How can we come to be so good at reading if we don’t have a brain for it? Is it because our visual system can handle any writing one may throw at it? No. Our children would be hopeless if writing looked like bar codes or fractal patterns. How, then, did apes like us come to read?

Gifted neuroscientist Stanislas Dehaene argues in his new book, Reading in the Brain (Viking), that we read not because we have a reading instinct, and also not because our visual brain is a particularly pliable learner. Rather, we read because culture “neuronally recycles” our visual system. Culture over time has seen to it that the letter shapes of our writing systems have the shapes our visual system is good at processing. In particular, the brain is competent at processing the contour combinations that occur in natural scenes, and writing systems have come to disproportionately use these shapes.

For example, below are four configurations each having three contours and two Ts. Three of the four can happen in natural scenes, but one of these cannot, and it turns out that only this oddball is rare across human writing systems. It is not so much that the brain has a reading instinct, but that writing has a brain instinct. In fact, to the extent that writing has come to be shaped like nature (in order to get into the brain), writing has a nature instinct.

Picture 3

More generally, Dehaene’s line of thinking suggests that much of what makes humans stand so far apart from the other apes is a result of neuronal recycling – not a result of natural selection at all.

Other pieces about the origins of writing are here, and also play prominently in my book, The Vision Revolution.

This first appeared on January 12, 2010, as a feature at the Telegraph.

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).


“But I thought you were blue!” That’s what Jake Sully, the main character in Avatar, says when he wakes up as an alien at the very end of the movie. Actually, the movie ends just before he has a chance to say anything, but that’s my guess about what he would have said. The question is: Why would he say that? (Or, if you’re a stickler: Why would I say he would say that?)

For those who haven’t seen the film, what you need to know is that Sully is a human who, from the safety of his brain-interface chamber back at the lab, can remotely control a “soul”-less alien body. And, at the end of the movie, and through the miracle of human suspension of disbelief, his human self gets literally uploaded into the alien’s body. Sully thereby becomes a bona fide eight-foot-tall blue alien, and in the final frame of the movie we see his alien eyes open.

Question is: What does Sully see when he opens his eyes? And, more to the point: Does his new alien wife still appear blue to him?

By way of answering, let’s back up and remember what visual systems are for. When we look out at our world through our eyes, we implicitly believe we are seeing it as it truly is. Our eyes and visual systems are to us objective scientific measuring devices. But evolution does not select for objective scientific equipment – evolution selects for visual systems that best serve the animal and its reproduction. Although often the best perception is one that veridically reflects the truth, sometimes the best solution is a “useful fiction,” a little-white-lie perception that serves us better than an accurate accounting of the actual.

Consider the accent of your own voice. To you, you have no accent. It is other people who have an accent. And they think it is you who has one. Perceived accent, or “accentness”, is not a quality of the speech stream itself. Accentness is not an objective perception of anything out there. It depends on the baseline accent one has become adapted to, namely your own. When one adapts to a stimulus and it becomes baseline, the “qualitative feel” of that stimulus diminishes, i.e., it begins to feel like nothing to perceive it. The benefit of this is that even very tiny deviations away from the baseline feel perceptually highly salient. Your own accent doesn’t sound accented because it is your baseline accent, allowing you to be trigger-sensitive to the modulations around it from the voices of different people and their different emotional inflections. This is also why you don’t notice any smell to your own nose, any taste to your own tongue, or any temperature to your own skin.

And, in addition, it is why you don’t perceive your own skin to be colorey. People perceive the color of their own skin (or that of the most common one in their experience) as uncolorey and difficult to name. This perceptual adaptation to the baseline skin color allows people to be highly sensitive to the subtle color modulations happening on the skin of others as a function of mood or state.
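The adaptation logic of the last two paragraphs fits in a two-line toy model. The numbers here are arbitrary units I made up, and `salience` is only a stand-in for whatever the brain actually computes:

```python
def salience(stimulus, baseline):
    # Toy model of perceptual adaptation: the 'qualitative feel' of a
    # stimulus is its deviation from the adapted baseline, not its raw value.
    return abs(stimulus - baseline)

baseline_skin = 60   # the skin tone I am adapted to (arbitrary units)
resting_face = 60    # a face at my baseline: feels 'uncolorey'
blushing_face = 65   # the same face with a subtle flush

print(salience(resting_face, baseline_skin))   # 0 -> no color feel at all
print(salience(blushing_face, baseline_skin))  # 5 -> small, but all signal
```

The same arithmetic covers the accent case: your own accent sits at baseline and registers as nothing, which is precisely what leaves you sensitive to everyone else’s.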

Now we can circle back to our earlier question. When Sully opens his eyes as an alien, does his alien wife still appear blue to him? And the answer is no. He’s one of them now, and will perceive his wife’s skin, and his own skin, as peculiarly uncolorey, no longer blue at all. He will also not notice the taste of his own alien saliva, something you can be sure he would have noticed were he to have tasted alien saliva as a human. And now we see why alien Sully exclaimed, “But I thought you were blue!” Let’s just hope he wasn’t into her only because she was blue!

Other pieces about color vision and skin are here, and also play prominently in my book, The Vision Revolution.

This first appeared on January 6, 2010, as a feature at the Telegraph.

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).


It’s nearing the end of American football season, with the Super Bowl fast approaching. These games involve displays of tremendous strength, agility and heart. What you may not have known is that some of the most talented players out on the field are doing it all with their eyes closed. Literally. The American football player Larry Fitzgerald of the Arizona Cardinals made news last year when photographers captured him catching the ball with his eyes closed. He apparently does this all the time. And it is not just Fitzgerald: after just five minutes searching online I found evidence that acclaimed college wide receiver Austin Pettis of Boise State, this year’s Fiesta Bowl champions, closes his eyes when catching, as seen in the photo here.

Austin Pettis Boise State
How can these athletes be the best in the world, and yet close their eyes at what would appear to be the most important moment? It is less surprising than it first seems.

Our brains are slow: it takes about a tenth of a second between the time that light lands on your eye and the time that the resultant perception occurs. That is a long time. A receiver running at 10 meters per second (or about 20 mph) moves one meter in a tenth of a second. If the receiver’s brain were to take the information at the eye and turn it directly into a perception of what the world was like, then by the time the perception occurs a tenth of a second later, that perception would be tenth-of-a-second-old news.

The receiver would be perceiving the world as it was a tenth of a second before. And because he may move a meter in that amount of time, anything that he perceives to be within one meter of passing him will have already passed him – or collided into him – by the time he perceives it. The ball may be moving faster still, maybe 30 meters per second (about 70 mph) or more, in which case it can move 3 meters in a tenth of a second.

Seeing the world a tenth of a second late is a big deal. That’s why our brains evolved strategies for overcoming this delay. Rather than attempting to build a perception of what the world was like when light hit the eye, the brain tries to figure out what the world will probably look like a tenth of a second after that time, and build a perception of that. By the time that perception (of the guessed-near-future) is generated in the brain, it is a perception of the present, because the near-future has then arrived. A lot of evidence exists suggesting that our brains have such “perceiving-the-present” mechanisms. And I have argued in my research that a great many of the famous illusions are due to these mechanisms – the brain anticipates a certain kind of dynamic change that never ends up happening (because it is just a drawing in a book, say), so one gets a misperception.
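To put numbers on the last few paragraphs, here is a toy calculation. The tenth-of-a-second delay and the speeds are the figures quoted above; the extrapolation function is only my illustrative guess at what a “perceiving-the-present” mechanism computes:

```python
NEURAL_DELAY_S = 0.1  # ~a tenth of a second from eye to perception

def stale_error_m(speed_m_per_s, delay_s=NEURAL_DELAY_S):
    # How far something travels while the brain is building its percept.
    return speed_m_per_s * delay_s

def perceive_the_present(position_m, speed_m_per_s, delay_s=NEURAL_DELAY_S):
    # Instead of rendering the stale snapshot, predict where the object
    # will be by the time the percept is ready.
    return position_m + speed_m_per_s * delay_s

print(stale_error_m(10))             # 1.0 m  (receiver at ~20 mph)
print(stale_error_m(30))             # 3.0 m  (ball at ~70 mph)
print(perceive_the_present(5, 30))   # 8.0 m: the percept is built for
                                     # where the ball is about to be
```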

Back to catching with your eyes closed. Consider now that the perception you have at time t is actually a construction of your brain: the brain constructs that perception on the basis of evidence the eye got a tenth of a second earlier. So, to accurately perceive the world at time t, one need not actually have any light coming into the eye at time t – so long as one had light coming in a tenth of a second earlier. Perhaps Pettis can get away with his eyes closed at the catch because his brain has already rendered the appropriate perception by that time.

Of course, when his eyes are closed at time t (the time of the catch), it means he won’t have a perception of the world a tenth of a second after the catch; but by then he’s being tackled and would only see stars anyway.

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).

This first appeared on February 1, 2010, as a feature at ScientificBlogging.com.


This first appeared on January 2, 2010, as a feature at the Telegraph.

You know what I love about going to see plays or musicals at the theater? Sure, the dialog can be hilarious or touching, the songs a hoot, the action and suspense thrilling. But I go for another reason: the 3D stereo experience. Long before movies were shot and viewed in 3D, people were putting on real live performances, which have the benefit of a 3D experience for all the two-eyeds watching. And theater performances don’t simply approximate the 3D experience – they are the genuine article.

“But,” you might respond, “One goes to the theater for the dance, the dialog, the humans – for the art. No one goes to live performances for the ‘3D feel’! What kind of low-brow rube are you? And, at any rate, most audiences sit too far away to get much of a stereo 3D effect.”

“Ah,” I respond, “but that’s why I sit right up front, or go to very small theater houses. I just love that 3D popping out feeling, I tell ya!”

At this point you walk out, muttering something about the gene pool. And you’d be right. That would be a rube-like thing for me to say. We see in 3D all the time. I just saw the waitress here at the coffee shop walk by. Wow, she was in 3D! Now I’m looking at my coffee, and my mug’s handle appears directed toward me. Whoa, it’s in 3D! And the pen I’m writing with. 3D!

No. We don’t go to the live theater for the 3D experience. We get plenty of 3D thrown at us every waking moment. But this leaves us with a mystery. Why do people like 3D movies? If people are all 3D-ed out in their regular lives, why do we jump at the chance to wear funny glasses at the movie house to see Avatar? Part of the attraction surely is that movies can show us places we’ve never been, whether real or imaginary, and so with 3D we can more fully experience what it is like to have a Tyrannosaurus rex make a snout-reaching grab for us.

But there is more to it. Even when the movie is showing everyday things, there is considerable extra excitement when it is in 3D. Watching a live performance in a tiny theater is still not the same as watching a “3D movie” version of that same performance. But what is the difference?

Have you ever been to one of those shows where actors come out into the audience? Specific audience members are sometimes targeted, or maybe even pulled up on stage. In such circumstances, if you’re not the person the actors target, you might find yourself thinking, “Oh, that person is having a blast!” If you’re the shy type, however, you might be thinking, “Thank God they didn’t target me because I’d have been terrified!” If you are the target, then whether you liked it or not, your experience of the evening’s performance will be very different from that of everyone else in the audience. The show reached out into your space and grabbed you. While everyone else merely watched the show, you were part of it.

The key to understanding the “3D movie” experience can be found in these targets. 3D movies differ from their real-life versions because everyone in the audience is a target, and all at the same time. This is simply because the 3D technology (sending up left and right eye images to the screen, with glasses designed to let each eye see only the image intended for it) gives everyone in the audience the same 3D effect. If the dragon’s flames appear to me to nearly singe my hair but spare everyone else’s, your experience at the other side of the theater is that the dragon’s flames nearly singe your hair and spare everyone else’s, including mine. If I experience a golf ball shooting over the audience to my left, then the audience to my left also experiences the golf ball going over their left. 3D movies put on a show that is inextricably tied to each viewer, and invades each viewer’s space. Everyone’s experience is identical in the sense that they’re all treated to the same visual and auditory vantage point. But everyone’s experience is unique because each experiences themselves as the target – each believes they have a special targeted vantage point.

The difference, then, between a live show seen up close and a 3D movie of the same show is that the former pulls just one or several audience members into the thick of the story, whereas 3D movies have this effect on everyone. Part of the fun of 3D movies is not, then, that they are 3D at all. We can have the same fun when we happen to be the target in a real-live show. The fun is in being targeted. When the show doesn’t merely leap off the screen, but leaps near you, it fundamentally alters the emotional experience. It no longer feels like a story about others, but becomes a story that invades your space, perhaps threateningly, perhaps provocatively, perhaps joyously. No, we don’t suffer the indignity of 3D glasses for the “popping out feeling”. We enjoy 3D movies because when we watch them we are no longer mere audience members at all.

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).


Click on each slide to see it in higher resolution.

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).

For more information about my theory, see LiveScience, New York Times, BoingBoing, SciAm. The best introduction is chapter 3 of The Vision Revolution. And the journal article is here.

This first appeared on December 24, 2009, as a feature at ScientificBlogging.com.


I was on the Lionel Show / Air America this morning, which was a blast!  Got to talk about my recent book, and about evolution, autistic savants, intelligent design, color, forward-facing eyes, illusions, and more. I really must get off the elliptical machine next time I do a radio show. Here’s the segment with me (or mp3 on your computer).

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).


This first appeared on October 26, 2009, as a feature at ScientificBlogging.com

Later this evening I’ll be giving a talk to a group of astronomers on what it’s like to see like an alien. The beauty of this is that I can speculate until the cows come home without fear of any counterexamples being brought to my attention. And even if an alien were to be among the audience members and were to loudly object that he sees differently than I claim, I can always just say that the jury is out until we get more data, and then advise him not to let the door slam into his proboscis on the way out.

E.T. the extra-terrestrial

E.T.'s forward-facing eyes suggest its ancestors evolved in forests

Although it may seem wild-eyed to discuss the eyes of aliens, if we understand why our vision is as it is, then we may be able to intelligently guess whether aliens will have vision like ours.

And in addition to the fun of chatting about whether little green men would see green, there are human implications. In particular, it can help us address the question, How peculiar is our human vision? Are we likely to see eye to eye with the typical alien invader? Or does our view of the world differ so profoundly that any alien visual mind would remain forever inscrutable?

Let’s walk through four cases of vision that I discuss in my book The Vision Revolution and ask if aliens are like us.

Do aliens see in color like us?

Let’s begin with color. I have argued in my research that our primate variety of color vision evolved in order to sense the skin color signals on the faces, rumps, and other naked spots of us primates. Not only are the naked primates the ones with color vision, but our color vision is at the sweet spot in design space allowing it to act like an oximeter and thereby see changes in the spectrum of blood in the skin as it oxygenates and deoxygenates. (See the journal article.)

Aliens may be interested in eating our brains, but they have no interest whatsoever in sensing the subtle spectral modulations of our blood under our skin. Aliens will not see color as we do, and will have no idea what we’re referring to when we refer to “little green men.”

Little green men may not think they look green

This can take the wind out of many people’s sails, namely those who feel that their senses give them an objective view of the world around them. But evolution doesn’t care about objective views of the world per se. Evolution cares about useful views of the world, and although veridical perceptions do tend to be useful, little-white-lie perceptions can also be useful. We primates end up with colors painted all over the world we view, but our color vision (in particular the red-green dimension) is really only meaningful when on the bodies of others. Although we feel as if the objects in our world “really” have this or that color, no alien would carve the world at the color-joints we do.

Do aliens have forward-facing eyes?

How about our forward-facing eyes we’re so proud of? I have argued and presented evidence that forward-facing eyes evolved as an adaptation to see more of one’s surroundings when one is large and living in leafy habitats. Animals outside of leafy cluttered habitats are predicted to have sideways-facing eyes no matter their body size, but forest animals are predicted to have more forward-facing eyes as they get larger. That is, in fact, what I found. (See the article.)

So, would aliens have forward-facing eyes? It depends on how likely it is that they evolved in a forest-like habitat (with leaf-like occlusions) and were themselves large (with eye-separation as large as or larger than the typical occlusion width). My first reaction would be to expect that such habitats would be rare. But, then again, if plant-like life can be expected anywhere, then perhaps there will always be some that grow upward, and want to catch the local starlight. If so, a tree-like structure would be as efficient a solution as it is for plants here on Earth. The short answer, then, is that it depends. But that means that forward-facing eyes are fundamentally less peculiar than our variety of color vision. Aliens could well have forward-facing eyes, but it would not appear to be a sure thing.

Do aliens suffer from illusions?

One of the more peculiar things our brain does to us is see illusions. I have provided evidence that these illusions are not some arcane mistake, but a solution to a problem any brain must contend with if it is in a body that moves forward. When light hits our eye, we would like our perception to occur immediately. But it can’t. Perception takes time to compute, namely about a tenth of a second. Although a tenth of a second may not sound like much, if you are walking at two meters per second, then you have moved 20 cm in that time, and anything perceived to be within 20 cm of passing you would have just passed you – or bumped into you – by the time you perceive it. To deal with this, our brains have evolved to generate a perception not of the world as it was when light hit the eye, but of how the world will be a tenth of a second later. That way, the constructed perception will be of the present. Although there is no room in this piece to describe the details, I have argued that a very large swathe of illusions occur because the visual system is carrying out such mechanisms. (See the paper.)
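The back-of-the-envelope arithmetic above can be made explicit. Here is a minimal sketch (the figures, a ~0.1 s perceptual latency and a 2 m/s walking speed, are the ones given in the text; the function name is my own):

```python
# Displacement covered during the brain's perceptual latency.
# Figures from the text: ~0.1 s to compute a percept, walking at 2 m/s.

def displacement_during_latency(speed_m_per_s: float, latency_s: float) -> float:
    """Distance (in meters) the body travels while a percept is computed."""
    return speed_m_per_s * latency_s

# A walker at 2 m/s moves 0.2 m (20 cm) before the percept is ready,
# which is why the visual system must extrapolate the scene ~0.1 s ahead.
drift = displacement_during_latency(2.0, 0.1)
print(f"{drift * 100:.0f} cm")  # → 20 cm
```

Anything perceived to lie within that 20 cm has, by the time the percept arrives, already reached you — hence the pressure to perceive the predicted present rather than the stale past.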

Are aliens buying books of illusions and “ooh”ing and “ah”ing at them like we are? If they are moving forward (and have non-instantaneous brains), then they probably are buying these books. This is because the optic flow characteristics that underlie the explanation of the illusions are highly robust, holding in any environment where one moves forward. Aliens are, then, likely to suffer from illusions. The illusions we humans suffer from, then, may not be due to some arcane quirk or mistake in our visual system software, but, instead, a consequence of running the efficient software for dealing with neural delays.

Is alien writing shaped like ours?

I have provided evidence that our human, Earthly writing systems “look like nature,” in particular so that words have object-like structure. And I have shown that for writing like ours, where letters stand for speech sounds, letters look like sub-objects, namely object junctions. Certain contour-combinations happen commonly in natural scenes, and certain combinations happen rarely. I have shown that the common ones in such environments are the common letter shapes found in human writing systems. Culture has selected writing to have the visual shapes our illiterate brains can see, which is why we’re such capable readers. (See the paper, a popular piece, and an excerpt from The Vision Revolution on this.)

Would alien writing look like this?

In this light, would alien writing look like nature as well? It depends on how specific one is when one says “like nature.” If, say, our human writing looks specifically like a savanna – i.e., if our writing mimicked signature visual features of the savanna – then it would appear very unlikely that aliens would have our kind of writing. But what if human writing looks like a very general notion of nature, so general it is likely to apply to most conceivable aliens? In my research I have provided evidence that the “nature” that appears relevant for understanding the shape of human writing is, indeed, highly general: namely, “3D environments with opaque objects strewn about.” General as that is, aliens could still float in a soup of cloudy transparent blobs, a kind of “nature” radically different from the one that human writing looks like. But it does seem plausible that most aliens will be roaming around opaque objects in 3D, and if that’s the case, then (so long as their culture has selected their writing to harness their visual object recognition system) their writing may look similar to human writing. Alien writing, if thrown into a pile of samples of our human writing, might just fit right in!

—-

So let’s take stock.

Would aliens have our color vision? No. Definitely not. Ours is due to our peculiar hemoglobin.

Would aliens have forward-facing eyes? Maybe. If they evolved in leafy habitats and were large.

Would aliens see our illusions? Probably. If they move forward.

Would aliens have writing that looks like ours? Probably. If they live in a 3D world with opaque objects.



This is based on an excerpt from “Spirit-Reading”, the fourth chapter of The Vision Revolution.

The topic: How illiterate apes like us came to read.

===========

Super Reading Medium

Communicating with the dead is a standard job requirement for a psychic such as the infamous medium John Edward of the television show Crossing Over, who claims to be able to listen to what the deceased family members of his studio audience have to say. Hearing the thoughts of the dead would appear to be one superpower we certainly do not possess. Surely this superpower must remain firmly in the realm of fiction (Edward included). However, a little thought reveals that we in fact do this all the time. …by simply reading. With the invention of writing, the ability of the dead to speak to the living suddenly became real. (Progress in communicating in the other direction has been slower going.) For all you know, I’m dead, and you’re exercising your spirit-reading skills right now. Good for you!


Before the advent of writing, in order to have our thoughts live on after we had gone we had to invent a great story or catchy tune and hope that they’d still be singing it by the fire for generations. Only a few would be lucky enough to have a song or story with such legs (e.g., Homer’s Iliad), and at any rate, if our ancestors were anything like us, their greatest hits probably tended to include “ooh-la-la” and “my baby left me” much more often than “here’s my unsolicited advice” and “beware of milk-colored berries.” Getting your children to be your audio tape in this fashion is probably futile (and aren’t they just as likely to purposely say the opposite?), but at least it relies on spoken words, something readily understandable by future generations. The problem is getting your voice to last. Voices are just too light and insubstantial, like a quarterback finding an open receiver and throwing him a marshmallow. Marshmallows are great to hold, but impossible to throw far. I suppose if you were to speak loudly enough during a heavy volcanic ash storm, ripples on the rapidly accumulating layers of ash might record your spoken words, one day to be recovered by clever archeologist decoders. However, much of what you’re likely to say in such circumstances will be unrepeatable in polite company.

What prehistoric people did successfully leave behind for us to read tended to be solid and sturdy, like Stonehenge or the moai statues of Easter Island. These were quarterback passes that got to the receiver all right, except that now the quarterback is throwing something that is uncatchable, like porcupines or anvils. Massive monuments are great if your goal is to impress the neighboring tribes or to brag to posterity. But if your goal is to actually say something that can be understood, this tack is worse than writing abstruse poetry, and literally much heavier. The only thing we’re sure of about such communications is that their builders had too much free time on their hands. Not the most informative spirit-reading.

The invention of writing changed spirit-reading forever. It also changed the world. Reading now pervades every aspect of our daily lives, so much so that one would be hard-pressed to find a room in a modern house without words written somewhere inside. Lots of them. Many of us now read more sentences in a day than we listen to. And when we read we must process thousands of tiny shapes in a short period of time. A typical book may have more than 300,000 strokes, and many long novels will have well over one million. Not only are we highly competent readers, but our brains even appear to have regions devoted to recognizing words. Considering all this, a Martian just beginning to study us humans might be excused for concluding that we had evolved to read. But, of course, we haven’t. Reading and writing are a recent human invention, going back only several thousand years, and much more recently for many parts of the world. We are reading using the eyes and brains of our illiterate ancestors. And this brings us to a deep mystery: Why are we so good at such an unnatural act? We read as if we were designed to read, but we have not been designed to read. How did we come to have this superpower?

Reading as a superpower? Isn’t this, you might ask, a bit of an exaggeration? No, it really is super. To better appreciate it, when you next have the illiterate caveman neighbors over to the house—the ones who always bring the delicious cave-made bundt cake—wow them with how you can transmit information between you and your spouse without speaking to one another. …by writing and reading. They’ll certainly be impressed. It’s not your use of symbols that will impress them, however, because they leave symbols for one another all the time, like a shrunken head in front of the cave to mean the other is at the witchdoctor’s. And they have spoken language, after all, and realize that the sounds they utter are symbols. What will amaze them about your parlor trick is how freakishly efficient you are at it. How did your spouse read out the words from the page so fast? Although they appreciate that there’s nothing spooky in principle about leaving marks on a page that mean something, and someone reading them later, they conclude that you are just way too good at it, and that, despite your protestations, you must be magical shamans of some kind. They also don’t fail to notice that your special power would work even if the writer were far away. Or long dead. Their hair stands on end, the conversation becomes forced, they skip dessert, and you notice that their cavekids don’t come around to throw spears at your kids any more. As the saying goes, one generation’s maelstrom is a later generation’s hot tub. We’re just too experienced with writing to appreciate how super it is, but not so for your cave neighbors.

We have the superpower of reading not because we evolved to read—and certainly not because we’re magical in any way—but because culture evolved writing to be good for the eye. Just as Captain Kirk’s technology was sometimes interpreted as magic by some of the galaxy locals, your neighbors are falsely giving you credit for the power when the real credit should go to the technology. The technology of writing. And not simply some new untested technology, but one that has been honed over many centuries, even millennia, by cultural evolution. Writing systems and visual signs tended to change over time, the better variants surviving, the worse ones being given up. The resultant technology we have today allows meanings to flow almost effortlessly off the page and straight into our minds. Instead of seeing a morass of squiggles we see the thoughts of the writer, almost as if he or she were whispering directly into our ears.

The special trick behind the technology is that human visual signs have evolved to look like nature. Why? Because that is what we have evolved over millions of years to be good at seeing. We are amazingly good at reading the words of the dead (and, of course, the living) not because we evolved to be spirit-readers. Rather, it is because we evolved for millions of years to be good at quickly visually processing nature, and culture has evolved to tap into this ability by making letters look like nature. Our power to quickly process thousands of tiny shapes on paper is our greatest power of all, changing our lives more than our other powers. Literacy is power, and it’s all because our eyes evolved to see well the natural shapes around us and we, in turn, put those shapes to paper.

Good Listening

“How did your date go?” I asked.

“Great. Wow. What a guy!” she replied. “He listened so attentively the entire dinner, just nodding and never interrupting, and—”

“Never interrupting?” I interjected.

“That’s right. So supportive and interested. And so in tune with me, always getting me without even needing to ask me questions, and his—”

“He asked you no questions?” I interrupted, both eyebrows now raised.

“Yes! That’s how close the emotional connection was!”

It struck me that any emotional connection she felt was a misreading of his eyes glazing over, because her date was clearly not listening. At least not to her! I didn’t mention to her that the big game was last night during her dinner, and I wondered whether her date might have been wearing a tiny ear phone.

Good listeners don’t just sit back and listen. Instead, they are dynamically engaged in the conversation. I’m a good listener in the fictional conversation above. I’m interrupting, but in ways that show I’m hearing what she’s saying. I am also able to get greater details of the story where I might need them. In this case about her date’s conversational style. That’s what good listeners do. They rewind the story if needed, or forward it to parts they haven’t heard, or ask for greater detail about parts. And good communicators tend to be those who are able to be interacted with while talking. If you bulldoze past all attempts by your listener to interrupt you, your listener will probably soon not be listening. Perhaps he’s heard that part before and is tuning out now. Or perhaps he was confused by something you said fifteen minutes earlier, and gave up trying to make sense of what you’re saying. Good listeners require good communicators. My fictional friend above appears to be a good communicator because she dynamically reacts to my queries midstream. The problem lies in her date, not her, and I politely suggest he may not be the right one for her.

Even though we evolved to speak and listen, and didn’t evolve to read, there is a sense in which writing has made us much better listeners than speech ever did. That’s because readers can easily interact with the writer, no matter how non-present the writer may be. Readers can pause the communication, skim ahead, rewind back to something not understood, and delve deeper into certain parts. We listeners can, when reading, manipulate the speaker’s stream of communication far beyond what the speaker would let us get away with in conversation—“Sorry, can you repeat the part that started with, ‘The first day of my trip around the world began without incident’?”—making us super-listeners, and making the writer a super-communicator.

We don’t always prefer reading to listening. For example, we listen to books on tape, lectures, and talk radio, and in each case the speakers are difficult to interrupt. However, even these cases help illustrate our preference for reading. Although people do sometimes listen to books on tape, they tend to be used when reading is not possible, like when driving. When one’s eyes are free, people prefer to read stories rather than hear them on tape, and the market for books on tape is minuscule compared to that for hard copy books. We humans have brains that may have evolved to comprehend speech, and yet we prefer to listen with our eyes, despite our eyes not having been designed for this! Television and movies have an audio stream that is not easily interruptible, and we do like that, but there the visual modality helps keep our attention. And although students have been listening for centuries to the speech of their professors, until recently with relatively little visual component, anyone who has sat through years of these lectures knows how often one’s mind wanders. …how often one is not actually listening. Talk radio has some popularity, and tends to be more engaging than traditional lectures, but notice that such shows go to great lengths to be conversational, typically having conversations with callers, and often having a pair of hosts (or a sidekick), to elicit the helpful interruptions found in good listening.

Canned speech, then, tends to be difficult to listen to, whereas genuine, dynamic, interactive conversation enables good listening. There is, however, one kind of audio stream our brains can’t get enough of, where interruption is not needed for good listening, and where we’re quite happy not seeing anything. Music. Audio tapes that give up on communication and aim only for aesthetics are suddenly easy listening. The rarity of books on tape, and the difficulty with listening to canned speech more generally, is not due to some intrinsic difficulty with hearing per se. The problem is that speech requires comprehension—music doesn’t—and comprehension can occur most easily when the listener is able to grab the conversation by the scruff of the neck and manipulate it as needed so he can fit it into his head. Good conversation with the speaker can go a long way toward this, but even better listening can be achieved by reading because then you can literally pick up the communication with your hands and interact with it to your heart’s content.

Working Hands

Having a conversation is not like passing notes in class. Although in each case two people are communicating back and forth in turn, when passing notes you tend to do little reading and lots of wiggling—either wiggling your hand in the act of writing a note, or twiddling your thumbs while waiting for your friend to write his. Note-writing takes time, so much time that passing notes back and forth is dominated by the writing, interspersed with short bouts of reading. All the work’s in the writing, not the reading. Conversation—i.e., people speaking to one another—is totally unlike this. Speaking flows out of us effortlessly, and comes out nearly at the speed of our internal thoughts. That is, whereas writing is much more difficult than reading, speaking is not much more difficult than listening.

The reason for this has to do with how many people we’re communicating with. When we speak there are typically only a small number of people listening, and most often there’s just one person listening (and often less than that when I speak in my household). For this reason spoken language has evolved to be a compromise between the mouth and ear: somewhat easy for the speaker to utter, and somewhat easy for the listener to hear. In contrast, a single writer can have arbitrarily many readers, or “visual listeners.” If cultural evolution has shaped writing to minimize the overall efforts of the community, then it is the readers’ efforts that will drive the evolution of writing because there are so many of them. That’s why, as amazing as writing may be, it is a gift to the eye more than a gift to the hand. For example, a book may take six months to write, but it may take only six hours to read. That’s a good solution because there are usually many readers of any given book.

Is writing really for the eye, at the expense of the hands? One of the strongest arguments that this is the case is that writing has been culturally selected to look like nature, something we’ll see later. That’s a good thing for the eye, not the hand, because the eye has evolved to see nature—the hand has not evolved to draw it. Not only does writing tend to look like nature, but I have found that even visual symbols like trademark logos—which are typically never written by hand, and are selected to be easy on the eyes—have the fundamental structural shapes found in nature. And note that for some decades now much of human writing has not been done by hand, but instead has been done by keyboard. If the structures of letters were for the hand, we might expect that now that our hands tend to be out of the picture, the structures of letters might change somewhat. However, although there are now hundreds of varying fonts available on computers, the fundamental structural shapes have stayed the same. Shorthands, however, have been explicitly selected for the hand at the expense of the eye, and shorthands look radically different from normal writing, and I have shown that they have shapes that are not like nature. I have also taken data from children’s scribbles and shown how the fundamental structures occurring in scribbles are unlike that found in writing and in nature. Finally, one can estimate how easy a letter is to write by the number of distinct hand sweeps required to produce it (this counts sweeps resulting in strokes on the page, and also sweeps of the hand between touchings of the paper), and such estimates of “motor ease” do not help to explain the kinds of shapes we find in writing.
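The “motor ease” estimate at the end of this passage can be sketched as a simple count. The encoding below is my own illustrative guess, not the study’s actual data: each letter is assigned a number of inking strokes, and the cost adds one pen-up hand sweep between each consecutive pair of strokes, matching the counting rule described in the text.

```python
# Hypothetical sketch of the "motor ease" estimate described above:
# cost = sweeps that leave strokes on the page, plus sweeps of the
# hand between touchings of the paper. Letter stroke counts here
# are illustrative guesses, not measured data.

def motor_ease(strokes: int) -> int:
    """Total hand sweeps: one per inking stroke, plus one pen-up
    move between each consecutive pair of strokes."""
    if strokes <= 0:
        return 0
    return strokes + (strokes - 1)

# Plausible stroke counts per printed letter (illustrative only).
letters = {"O": 1, "L": 2, "T": 2, "E": 4}
costs = {ch: motor_ease(n) for ch, n in letters.items()}
print(costs)  # {'O': 1, 'L': 3, 'T': 3, 'E': 7}
```

The point of the passage is that rankings like these do not predict which shapes writing systems actually use, which is evidence that letter shapes were selected for the eye rather than the hand.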

Could culture really have given no thought whatsoever to the tribulations of the hand? Although selection would have favored the eye, it clearly would have done the eye no good to have writing be so difficult that no hands were willing to make the effort. Surely the hand must have been thrown a bone, and it probably was. The strokes in the letters you’re reading, and in line drawings more generally, are quite a bit like contours in being thin, but there is an important difference. Real world contours occur when one surface stops and another starts, like the edge between two walls, or the edge of your table. Usually there is no line or stroke at all (although sometimes there can be), but a sudden change in the nature of the color or texture from one region to the next. The visual system would therefore probably prefer contours, not strokes. But strokes are still fairly easy to see by the visual system, and are much easier for the hand to produce. After all, to draw true contours rather than strokes would require drawing one color or texture on one side of the border of the intended contour and another color or texture on the other side. I just tried to use my pen to create a vertical contour by coloring lightly to the right and more darkly to the left, but after a dozen tries I’ve given up. I won’t even bother trying to do this for an “S”! It’s just too hard, which is why when we try to draw realistic scenes we often start with lines as contours, and only later add color and texture between the lines. And that’s probably why writing tends to use strokes. That we use strokes and not true contours is for the benefit of the hand, but the shapes of our symbols are for the eye.

Harness the Wild Eye

You’d be surprised to see a rhinoceros with a rider on its back. In fact, a rider would seem outlandish on most large animals, whether giraffe, bison, wildebeest, bear, lion, or gorilla. But a rider on a horse seems natural. Unless you grew up on a farm and regularly saw horses in the meadows, a large fraction of your experiences with horses were likely from books, television, and film, where the horses typically had riders. Because of your “city-folk” experiences with horses, a horse without a rider can seem downright unnatural! In fact, if aliens were observing the relationship between humans and horses back when horses were our main mode of transportation, they may have falsely concluded that horses were designed to carry humans on their backs. But, of course, horses aren’t born with bridles and saddles attached, and they didn’t evolve to be ridden. They evolved over tens of millions of years in savannas and prairies, and it is only recently that one of the primates had the crazy idea to get on one. How is it that horses could have become so well adapted as “automobiles” in a human world?

Horses didn’t simply get pulled out of nature and plugged into society. Instead, culture had to evolve to wrap around horses, making the fit a good one. Horses had to be sired, raised, fed, housed, steered, and scooped up after. Countless artifacts were invented to deal with the new tasks required to accommodate the entry of horses into society, and entire markets emerged for selling them. Diverse riding techniques were developed and taught, each having certain advantages for controlling these beasts. The shapes of our homes and cities themselves had to change. Water troughs in front of every saloon, stables stationed through towns and cities, streets wide enough for carriages, parking spaces for horses, and so on. That horses appear designed for riders is an illusion due to culture having designed itself so well to fit horses.

Just as horses didn’t evolve to be ridden, eyes didn’t evolve for the written. Your eyes reading these words are wild eyes, the same eyes and visual systems of our ancient preliterate ancestors. And yet, despite being born without a “bridle,” your visual system is now saddled with reading. We have, then, the same mystery as we find in horses: how do our ancient visual systems fit so well in modern reading-intensive society?

Eyes may seem like a natural choice for pulling information stored on material, and indeed vision probably has inherent superiorities over touch or taste, just as horses are inherently better rides than rhinos. But just as horses don’t fit efficiently into culture without culture evolving to fit horses, the visual system couldn’t be harnessed for reading until culture evolved writing to fit the requirements of the visual system. We didn’t evolve to read, but culture has gone out of its way to create the illusion that we did. We turn next to the question of what exactly cultural evolution has done to help our visual systems read so well.

From the Hands of Babes

You might presume that a two-and-a-half-year-old girl couldn’t have much to say. If I were struck on the head and reduced to infant-level intelligence for two and a half years, I’m fairly sure I wouldn’t thereby have a flood of stories to tell you about. None, at least, that are not considerably degrading. But there she is, as you well know if you’ve seen these creatures, talking up a storm. A little about the few things that have happened to her, but mostly about things that never have, and never will: princesses, dragons, Spongebob, Stegosauruses. She’s five now and there’s been no let-up. She’s talking to me as I write this!

I just gave her a piece of paper and crayons, and although she’s just begun trying her hand at writing—“cat,” “dug,” “saac” (snake), “flar” (flower)—she’s been putting her thoughts and words to the page for a long time now. By drawing. Children are instructive for the invention of writing because they invent their own writing through pictures. Through the work of Rhoda Kellogg in the mid-twentieth century we know that children world-wide draw very similar shapes, and follow a similar developmental schedule. Since they are not designed to draw, these similarities are, in a sense, parallel discoveries about how to ably communicate on paper—on how to write. Sir Herbert E. Read, an early-20th-century professor of literature and arts, encountered Rhoda Kellogg’s work late in his life, and wrote the following:

It has been shown by several investigators, but most effectively by Mrs. Rhoda Kellogg of San Francisco, that the expressive gestures of the infant, from the moment that they can be recorded by a crayon or pencil, evolve from certain basic scribbles towards consistent symbols. Over several years of development such basic patterns gradually become the conscious representation of objects perceived: the substitutive sign becomes a visual image. … According to this hypothesis every child, in its discovery of a mode of symbolization, follows the same graphic evolution. … I merely want you to observe that it is universal and is found not only in the scribblings of children but everywhere the making of signs has had a symbolizing purpose—which is from the Neolithic age onwards. [from Herbert Read, “Presidential Address to the Fourth General Assembly of the International Society for Education through Art.” Montreal: Aug 19, 1963.]

Aren’t children’s drawings just that, drawings? It’s certainly true that sometimes children are just trying to depict what they see. Those are “mere” drawings. But often their drawings are primarily aimed to say something—to tell a story. When my daughter brings me her latest drawing, she usually doesn’t brag about how real it looks (nor does she tell me about its composition and balance). Sure, sometimes she asks me to count how many legs her spider has, but usually I get a story. A long story. For example, here is a Cliffs Notes version of the story behind her drawing in Figure 1: A house with arms and eyes; the windows have faces; it is a magic house; there is a girl holding a plate of cream puffs; two people are playing with toys at the table but a tomato exploded all over the toy; there are butterflies in the house. Her drawing is intended to communicate a story, and that sounds an awful lot like writing.

But if she’s truly writing, then she’d have to be using symbols. Is it really plausible that small children are putting symbols on the page before they learn formal writing, as Rhoda Kellogg and Herbert Read believe? I think so, for consider that most of their drawings have only the barest resemblance to the objects they are intended to denote. Look again at nearly any of the objects in my daughter’s drawing in Figure 1. An attempt at realism? Hardly. We find similar kinds of symbols when even adults draw cartoons—adults who could draw realistically if they wished. These cartoon symbols, like those in the first row of Figure 2a, are ridiculously poor renderings of objects. You have surely seen similar visual signs out and about in culture. Although you’ll probably have no trouble knowing what animals the drawings are intended to symbolize, your dog would have no idea what those (or my daughter’s) drawings are supposed to be. They get their meaning by convention more than by resemblance. We’re so used to these conventions that we have the illusion that they look like the animals they refer to, but other cultures often have somewhat different conventions for their animals. For example, I find it difficult to tell what kind of animal I’m looking at in many of today’s Japanese cartoons for kids, some of them shown in the second row of Figure 2a.

The same is true for sound. We in the United States say “ribbit” to refer to the call made by a frog, and after growing up with that as the symbol for frog calls it can be hard to appreciate that frogs don’t sound at all like that. In fact, people from different cultures use different sound symbols to refer to frog calls, and each person is initially convinced that their sound resembles frogs. Algerians say “gar gar,” Chinese say “guo guo,” the English say “croak,” the French say “coa-coa,” Koreans say “gae-gool-gae-gool,” Argentinians say “berp,” Turks say “vrak vrak,” and so on. Just as in children’s drawings, the sound “ribbit” is a symbol for the call of the frog, not a real attempt to resemble or mimic it.

Children’s drawings communicate stories with symbols. That sure sounds like writing to me. Or at least the barest beginnings. If these little whippersnappers are so smart that they can spontaneously invent writing largely on their own, perhaps it couldn’t hurt to look into the kinds of symbols they choose for their writing. And the answer is so obvious that it may be difficult to notice: children draw object-like symbols for the objects in their writing. Their drawings may not look much like the objects they stand for, but they look like objects, not like fractal patterns, not like footprints, not like scribbles, not like textures, and so on. The same is true for the cartoons drawn by adults, as in Figure 2a. And we find the same so-obvious-it-is-hard-to-notice phenomenon for animal calls: although there are lots of different sounds used for frog calls, they are all animal-call-like. All those frog calls sound like some possible kind of animal. What might this phenomenon mean for writing?

Word and Object

Is the strategy of object-like drawings for objects mere child's play? Apparently not, because it's not just in kids' drawings and cartoons that you find this, but among human visual signs generally. Most non-linguistic visual signs throughout history have been object-like, such as those found in pottery, body art, religion, politics, folklore, medicine, music, architecture, trademarks and traffic (see Figure 2b for a small variety). And computer desktop icons are not only object-like in appearance, but can even be moved around like objects. Much of formal writing itself has historically been of this objects-for-words form, such as Egyptian hieroglyphs, Sumerian cuneiform, Chinese, and Mesoamerican writing. Modern Chinese writing is still like this, and is used by well over a billion people. In these writing systems we find drawings with the complexity of simple objects, used as symbols to refer to objects, and also to refer to adjectives, adverbs, verbs and so on. (See Figure 2c for several examples.) Object-like symbols for objects—that trick's not just for kids.

Is there something beneficial about drawing objects for the words in writing? I suspect so, and I suspect that it is the same reason that animal-call symbols tend to be animal-call-like: we probably possess innate circuitry that responds specifically to animal-call-like sounds, and so our brain is better able to efficiently process a spoken word that means an animal call if the word itself sounds animal-call-like. Similarly, we possess a visual system designed to recognize objects and efficiently react to the information. If a word’s meaning is that of an object (even an abstract object), then our visual system will be better able to process and react to the written symbol for that object if the written symbol is itself object-like. Figure 3b shows a fictional case of writing with object-like symbols for words (and single strokes are shown for “function words” like ‘the’ and ‘in’). To begin to grasp why this strategy might be good, consider two alternative strategies besides the objects-for-words one.

First, rather than drawing objects for words, we could be lazy and just draw a single contour for each spoken word. Writing “The rain in Spain stays mainly in the plain” would then look something like that shown in Figure 3a. Shorthand is somewhat akin to the lazy approach, with some words having single stroke notations. Shorthand is great for writers with fast-talking bosses, but is notoriously hard to read and has not caught on for writing. Kids also don’t think it’s a good idea—there’s not even a single lone contour in my daughter’s drawing in Figure 1. One reason it’s not a good idea is that there are just not enough distinguishable stroke types for all the words we speak. Coming up with even 100 easily distinguishable stroke types would be tricky, and that would still be far below the tens of thousands that would be needed for writing.

There is also a more fundamental difficulty, and it has to do with the fact that the part of your brain doing the visual computations is arrayed in a hierarchy. The earlier stages of the hierarchy deal with simple parts like contours, higher areas deal with combinations of contours, and eventually, at the highest regions of the hierarchy, full objects are recognized and perceived. The problem with using single strokes to represent spoken words, as in Figure 3a, is that the visual system finishes processing the strokes far too early in the hierarchy. The visual system is not accustomed to word-like (e.g., object-like) interpretations of single strokes. Single strokes are typically not perceived at all, at least not in the sense that they make the list of things we see out there. For example, when you look at Figure 4 you perceive a cube in front of a pyramid. That's what you consciously notice and make judgments about. You don't see the dozen contours in quite the same sense. Nor do you see the many object corners and junctions (intersections of contours). You don't say, "Hey, look at all those contours and corners in the scene." Our brains evolved to perceive objects, not object-parts, because objects are the clumps of matter that stay connected over time and are crucial to parsing and making sense of the world. Our brains naturally look for objects and want to interpret stimuli out there as objects, so using a single stroke for a word (or using a junction for a word) is not something our brains are happy about. Instead, when seeing the stroke-word sentence in Figure 3a the brain will desperately try to see objects in the jumble of strokes, and if it can find one, it will interpret that jumble of strokes in an object-like fashion. But if it did this, it would be interpreting a phrase or whole sentence as an object, something that is not helpful for understanding a sentence: the meaning of a sentence is a whole proposition, something that can be true or false, not the meaning of any single word.
Using single strokes as words is, then, a bad idea because the brain is not designed to treat single contours as meaningful. Nor is it designed to treat object junctions as meaningful. That’s why spoken words tend to be written with symbols having a complexity no smaller than visual objects.

How about, instead, letting spoken words be visually symbolized by whole scenes, i.e., via multiple objects rather than just a single one? Figure 3c shows what "The rain in Spain…" might look like with this "scene-ogram" strategy. Quite an eyeful. These are akin to the drawings found in some furniture assembly manuals. The problems now are the opposite of those before. First, the natural meaning of a scene-ogram image is more like that of a sentence, like "Take the nail that looks like this, and pound it into the wooden frame that looks like that." Second, the fact that there are objects inside these complex symbols is itself a problem, because now the brain wants to inappropriately assign meanings to them, and yet these objects are just the building blocks of a written word, having no meaning at all.

In sum, the visual system possesses innate mechanisms for interpreting object-like visual stimuli as objects. Because spoken words are the smallest meaningful entities in spoken language, and often have meanings that are at the object level (either meaning objects, or properties of objects, or actions of objects), it is only natural to have visual representations of them that the visual system has been designed to interpret, and to interpret as objects. By drawing objects for spoken words—and not smaller-than-object visual structures like contours or junctions, and not larger-than-object visual structures like scenes—the visual system is able to be best harnessed for a task it never evolved to do. (See Figure 5.)

Object-like symbols might, then, be a good idea for representing words, but are the object-like symbols we find in culture a result of cultural evolution having selected for this, or might it instead be that they are just a left-over due to the first symbols having been object-like? After all, the first symbols tended to be object-like pictograms, even more object-like than the symbols in Figures 2b and 2c. Perhaps our symbols are still object-like merely because of inheritance, and not because culture has designed them to be easy on the eye. The problem with this argument is that writing tended to change quickly over time, especially as cultures split. If there were no cultural selection pressure to keep symbols looking object-like, then the symbol shapes would have randomly changed over the centuries, and the object-likeness would have tended to become obliterated. But that's not what we find. Culture has seen to it that our symbols retain their object-likeness, because that's what makes us such good readers. It is interesting, though, that even the first symbols were on the right track, before cultural evolution had time to do any shaping of its own. Then again, given that even small children cotton on to this, it's perhaps not too surprising that the first scribes appreciated the benefits of object-like drawings for words.

The Trouble with Speech Writers

The brain prefers to see objects as the symbols for words, and kids and much of the world have complied. Such writing is "logographic" (symbols for words), and it doesn't tell the reader how to pronounce the words. That is itself a great benefit, for then even people who speak different languages can utilize the same writing system and communicate via it. That is, logographic writing systems can serve as universal writing systems, bringing a variety of spoken languages into harmony and friendship, Tower-of-Babel style. Japanese speakers, for example, have no idea what a Chinese speaker is saying, but can understand written Chinese fairly well because Japanese speakers also use Chinese writing (which is of the objects-for-words kind).

Brotherhood and peace may be nice, but there er jus some thangs ya cayant do when writin' with objects. For one thing, you can't communicate how to say those words, including putting a person's accent down on the page. A Japanese person may be glad to be able to read Chinese content, but he will be totally unprepared to actually speak to anyone in China. The kind of writing you're reading at the moment is entirely different. Rather than symbols for spoken words, the basic symbols are letters saying how to speak the words. You're reading "speech-writing." Speech-writing allows us to put Tom Sawyer's accent on paper, and it allows non-speakers of our language to obtain a significant amount of knowledge about how to speak among us by reading at home. Such a learner would have an atrocious accent, of course, but would nevertheless have a great start. A second important advantage of speech-writing is that one can get away with far fewer symbols. Rather than one object-like symbol for each of the tens of thousands of spoken words, one only needs a symbol for each of the dozens of speech sounds, or phonemes, we make. That's roughly a thousand-fold reduction in the number of written symbols we have to learn.
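The arithmetic behind that thousand-fold reduction is easy to sketch. The counts below are illustrative assumptions (a vocabulary on the order of 50,000 words, a phoneme inventory of about 40), not figures from the text:

```python
# Illustrative symbol-inventory comparison; the counts are round assumptions.
words_in_language = 50_000    # order of tens of thousands of spoken words
phonemes_in_language = 40     # order of dozens of speech sounds

logographic_inventory = words_in_language   # one symbol per spoken word
phonemic_inventory = phonemes_in_language   # one letter per phoneme

reduction_factor = logographic_inventory / phonemic_inventory
print(f"about a {reduction_factor:.0f}-fold reduction in symbols to learn")
# prints: about a 1250-fold reduction in symbols to learn
```

With any counts in those ranges, the ratio lands in the vicinity of a thousand, which is the point of the comparison.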

I have no idea whether the merits of speech-writing outweigh the benefits of logographic (symbols-for-words) writing, but there have been hundreds of speech-writing systems over history, many in use today by about half the world’s population. And when culture decided to go the speech-writing route rather than the logographic route, it created for itself a big dilemma. As we’ve discussed, the best way to harness the natural object-recognition powers of the visual system is to have spoken words look object-like on paper. But in speech-writing the symbols are for speech sounds, and written words will consist of multiple speech sound symbols. How can our written words look like objects if written words no longer have fundamental symbols associated with them? If symbols are for fundamental speech sounds, then the look of a written word will depend upon the letters in it. That is, the word’s look will be due to the vagaries of how the word sounds when spoken. Had it been spoken differently, the written word would look different. If the look of a word depends on how speakers say the word, it would seem that all hope is lost in trying to make written words look object-like in speech-writing.

There is a way out of the dilemma, however, and although no individual may have conceived of the idea, culture nevertheless eventually evolved to utilize this solution. The solution is this: If written words must be built out of multiple symbols, then to make words look object-like, make the symbols look like object parts. That’s what culture did. Culture dealt with the speech-writer dilemma by designing letters that look like the object parts found in nature, object junctions, in particular. That way written words will typically be object-like, so that again our visual system can be best harnessed for reading.

Because the geometrical shapes of letters vary considerably across fonts (and across individuals) but do not typically change much in their topology (see Figure 6a), a topological notion of shape is the apt one for studying letter shape. It is also apt because the geometrical shape of a conglomeration of contours in a scene changes with the observer's viewpoint, whereas the topological shape is highly robust to changes of viewpoint. Figure 6b shows three simple kinds of topological shape, or configuration: L, T and X. Each stands for an infinite class of geometrical shapes having the same topology. Two smoothly curved contours make an L if they meet at their tips, a T if one's tip meets anywhere along the other (except at the tip), and an X if the two contours cross each other. Whereas Ls and Ts commonly occur in the world—as corners and at partial-occlusion boundaries, as displayed in Figure 6b—Xs do not. And, indeed, Ls and Ts are common, but Xs rare, across the history of human visual signs and nearly a hundred writing systems (see the red squares in Figure 6e).

Figure 6c shows four configuration types that are similar in that each has three contours and two T-junctions. Despite these similarities, they are not all equally common in nature. Three of them can be caused by partial occlusions and are thus fairly common; one of them cannot, and is thus rare in nature. Their commonness over the history of writing shares this asymmetry: the rare-in-nature configuration also rarely occurs among human visual signs (see the green diamonds in Figure 6e). Finally, Figure 6d shows five configurations of three strokes that all meet at a single point, or junction, and one can see that some of these require greater coincidental alignments in the world to occur, and are accordingly expected to be rarer in nature. And measurements show that writing over history mimics this relative frequency distribution (see the blue circles in Figure 6e).
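To make the L/T/X distinction concrete, here is a minimal, hypothetical sketch (not from the article) that classifies the junction two straight contour segments form at a meeting point. A segment's endpoints are its "tips"; tips meeting make an L, a tip against a mid-contour point makes a T, and two mid-contour points crossing make an X:

```python
def meets_at(p, seg, tol=1e-9):
    """Return 'tip', 'interior', or None for where point p lies on segment seg."""
    (x1, y1), (x2, y2) = seg
    px, py = p
    # p must be collinear with the segment (zero cross product)
    if abs((x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)) > tol:
        return None
    # parameter t in [0, 1] locates p along the segment
    dx, dy = x2 - x1, y2 - y1
    t = ((px - x1) * dx + (py - y1) * dy) / (dx * dx + dy * dy)
    if t < -tol or t > 1 + tol:
        return None
    return "tip" if t < tol or t > 1 - tol else "interior"

def classify_junction(p, seg_a, seg_b):
    """Classify the configuration two contours form at point p as L, T, or X."""
    a, b = meets_at(p, seg_a), meets_at(p, seg_b)
    if a == "tip" and b == "tip":
        return "L"   # two tips meet: an object corner
    if "tip" in (a, b) and "interior" in (a, b):
        return "T"   # a tip meets mid-contour: a partial-occlusion boundary
    if a == "interior" and b == "interior":
        return "X"   # the contours cross: rare in natural scenes
    return None      # the segments do not actually meet at p

# Corner, occlusion boundary, and crossing, respectively:
print(classify_junction((0, 0),     ((0, 0), (1, 0)), ((0, 0), (0, 1))))      # L
print(classify_junction((0.5, 0),   ((0, 0), (1, 0)), ((0.5, 0), (0.5, 1))))  # T
print(classify_junction((0.5, 0.5), ((0, 0), (1, 1)), ((0, 1), (1, 0))))      # X
```

The asymmetry in the article then amounts to a frequency claim: tallying junctions in natural scenes would yield many Ls and Ts but few Xs, and tallies over writing systems mirror that distribution.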

Commonness in the world drives commonness in writing. Culture appears to have, over centuries, selected for written words that look object-like, thereby harnessing the natural powers of our visual system, allowing us to read with remarkable efficiency.

This excerpt also appeared at Semiotix.

Mark Changizi is a professor of cognitive science at Rensselaer Polytechnic Institute, and the author of The Vision Revolution (Benbella Books).
