
Seminar in Advanced Topics (HCI 530)

Human Computer Interaction

Page 1 of 14

12/04/2020

Human Vision – Introduction

Introduction

Even if you wear spectacles, your eyes are functioning at a level which allows you to recognise the typed letters on this page.

Text in most books and on most computer screens is about 3 millimeters tall and 2 mm wide.

As you read this one sentence, you are probably oblivious to the thousands of pieces of visual information that your eyes are gathering each second. Just in the retina alone, there are millions of cells at work right now acting as photoreceptors reacting to light, similar to how a camera works to capture images on film 1 .

“At first it appears that nothing could be easier than seeing. We just point our eyes where we want them to go, and gather in whatever there is to see. Nothing could be less in need of explanation. The world is flooded with light, and everything is available to be seen. We see people, pictures, landscapes, and whatever else we need to see, and with the help of science we can see galaxies and viruses and the insides of our own bodies. Seeing does not interfere with the world or take anything from it, and it does not hurt or damage anything. Seeing is detached and efficient and rational. Unlike the stomach or the heart, eyes are our own to command: they obey every desire and thought.”

Each one of those ideas is completely wrong. The truth is more difficult: seeing is irrational, inconsistent, and undependable. It is immensely troubled, cousin to blindness and sexuality, and caught up in the threads of the unconscious. Our eyes are not ours to command: they roam where they will and tell us they have only been where we have sent them. No matter how hard we look, we see very little of what we look at….

Elkins, J., The Object Stares Back, Simon & Schuster: New York, 1996

Biological vision is the process of using light reflected from the surrounding world as a way of modifying behavior. Generally, with humans, we say that the surrounding environment is interpreted by visual input.

This usually implies some form of conscious understanding of the three-dimensional (3D) world from the two-dimensional (2D) projection that it forms on the retina of the eye (Figure 1).
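The loss of depth in that 2D projection can be illustrated with a minimal sketch. It assumes an idealised pinhole model, which is a simplification and not an accurate description of the eye's optics; the function name project and the focal_length parameter are hypothetical and chosen only for illustration.

```python
def project(point, focal_length=1.0):
    """Perspective-project a 3D point (x, y, z) onto a 2D image plane.

    Assumes an idealised pinhole at the origin looking along +z, with the
    image plane at distance focal_length (an illustrative parameter).
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the pinhole (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # twice as far away, with twice the sideways offset

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

Two different points in the world land on the same image point, so depth cannot be read directly from the projection and has to be inferred from other cues.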

… Even when I am not thinking of the use of objects, they remind me of use. And there is a curious thing here that easily passes unnoticed:

I do not focus on anything that is not connected in some way with my own desires and actions. I fail to see the stretches of wall between the lamp and the coffee cup, or the manila paper of the file folders, or the black plastic calendar holder. My eyes can understand only desire and possession. Anything else is meaningless and therefore invisible.

Elkins, J., The Object Stares Back, Simon & Schuster: New York, 1996

Solving the problem of converting light into ideas, of visually understanding features and objects in the world, is a complex task far beyond the abilities of the world's most powerful computers. Much of our visual computation is carried out unconsciously and often our interpretations can be fallacious. Vision requires distilling foreground from background, recognising objects presented in a wide range of orientations, and accurately interpreting spatial cues 2.

In this part of the module we will briefly overview the human visual system and try to understand the ways in which this system interprets its input. Although not strictly correct, the analogy between machine vision and biological vision is currently the best model available.

Moreover, the models interact in an ever increasing fashion: we use the human visual system as an existence proof that visual interpretation is even possible in the first place, and its response to optical illusions as a way to guide our development of algorithms that replicate the human system; and we use our understanding of machine vision and our ability to generate ever more complex computer images as a way of modifying, or evolving, our visual system in its efforts to interpret the visual world.

Figure 1: Projection of 3D World to 2D Image

1 Bonsor, K., How Artificial Vision Will Work, Web Page at URL: http://health.howstuffworks.com/artificial-vision.htm

2 Brain Connection, How Vision Works, Web Page at URL: http://www.brainconnection.com/topics/printindex.php3?main=anat/vision-work

Damian Schofield, Room 117, Syngg Hall, Department of Computer Science, SUNY Oswego, Oswego 13126

1 (410) 504 3178 – schofield@cs.oswgo.edu – www.cs.oswego.edu/~schofield


The Human Eye

The eye is considered by most neuroscientists as part of the brain. It consists of a small spherical globe of approximately 2cm in diameter, which is free to rotate under the control of six extrinsic (or extraocular) muscles.

The tough, outermost layer of the eye is called the sclera. It maintains the shape of the eye. The front sixth of this layer is clear and is called the cornea. All light must first pass through the cornea when it enters the eye.

Figure 2: Extrinsic Muscles

The choroid (or uveal tract) is the second layer of the eye. It contains the blood vessels that supply blood to structures of the eye. The front part of the choroid contains two structures:

• The ciliary body - The ciliary body is a muscular area that is attached to the lens. It contracts and relaxes to control the shape of the lens for focusing.

• The iris - The iris is the colored part of the eye. The iris is an adjustable diaphragm around an opening called the pupil.

Inside the eyeball there are two fluid-filled sections separated by the lens. The larger, back section contains a clear, gel-like material called vitreous humor. The smaller, front section contains a clear, watery material called aqueous humor.

Interesting Facts about Pupils

Pupil size can change from 2 millimeters to 8 millimeters. Because the light admitted scales with the area of the pupil, this fourfold change in diameter changes the amount of light that enters the eye by a factor of about 16.

Pupil size between 6 and 8 mm may indicate the use of cocaine, crack, meth, hallucinogens, crystal, ecstasy or other stimulants. Pupil size between 1 and 2 mm may indicate the use of heroin, opiates or other depressants.

We subconsciously pick up clues from others’ pupil sizes and use them to help us form opinions about people. Eckhard Hess *, a Chicago biopsychologist, performed a study in which men were shown retouched photographs of women. In half the photographs, the pupils were made to appear larger than normal, and in the other half they were smaller. The men in the study invariably perceived the women with larger pupils as being more attractive and friendlier than the very same women when shown with smaller pupils.

* Hess, E., Attitude and Pupil Size, Scientific American, 212: 46-54, 1965.

The lens is a clear, bi-convex structure about 10 mm (0.4 inches) in diameter. The lens changes shape because it is attached to muscles in the ciliary body. The lens is used to fine-tune vision.

The eye is unique in that it is able to move in many directions to maximise the field of vision, yet is protected from injury by a bony cavity called the orbital cavity. The eye is embedded in fat, which provides some cushioning.

The eyelids protect the eye by blinking. This also keeps the surface of the eye moist by spreading tears over the eyes. Eyelashes and eyebrows protect the eye from particles that may injure it.

Tears are produced in the lacrimal glands, which are located above the outer segment of each eye. The tears eventually drain into the inner corner of the eye, into the lacrimal sac, then through the nasal duct and into the nose 3. That is why your nose runs when you cry.

Figure 3: Ophthalmoscope View of Eye

3 Bianco, C., How Vision Works – Basic Anatomy, Web Page at URL: http://science.howstuffworks.com/eye1.htm


Human Vision – The Eye

Introduction

The human eye is nature's optical masterpiece, whose sensitivity and performance characteristics approach the absolute limits set by quantum physics. The eye is able to detect as little as a single photon as input, and is capable of adjusting to ranges in light that span many orders of magnitude. No camera has been built that even partially matches this performance.

The eye can focus on a wide range of objects, bright and dull, large and small, far and near, all at dizzying speeds and with uncanny accuracy. We rely on our eyes to provide most of the information we perceive about the world, so much so that a significant portion of the brain is devoted entirely to visual processing. The anatomy of the visual system provides important clues about how the brain is structured in general, and about how we humans solve the complex problem of sight in particular 4.

This is what happens when you look at an object:

• Scattered light from the object enters through the cornea.

• When light enters the eye, it first passes through the cornea, then the aqueous humor, lens and vitreous humor.

• The light is then projected onto the retina.

• The retina sends messages to the brain through the optic nerve.

• The brain interprets what the object is.

The Retina

The retina is complex in itself. This thin membrane at the back of the eye is a vital part of our ability to see. Its main function is to receive and transmit images to the brain, by converting light into neural signals that can be relayed to the cortex through the optic nerve.

The retina consists of a team of five types of cells whose role it is to collect light, extract basic information about color, form, and motion, and pass the pre-processed image on to centers in the brain. These cell types are photoreceptors, bipolar cells, horizontal cells, amacrine cells, and ganglion cells. They are arranged within the retina in three layers, from the back to the front.

The Layers of the Retina

Light passes all the way through the retina (from left to right in the figure to the right) before reaching the photoreceptor cells at the back.

Photoreceptors convert light signals into neural impulses that are relayed to a variety of other cell types in the retina for processing. The ganglion cells at the front of the retina are the final relay station in the eye, and they pass signals into the brain via the optic nerve.

The location of the optic nerve on the retina obviously prohibits the existence of photoreceptors at this point. This point is known as the blind spot and any light that falls upon it is not perceived by the viewer. Most people are unaware of their blind spot, although it is easy to demonstrate that it exists.

4 Brain Connection, The Anatomy of Vision, Web Page at URL: http://www.brainconnection.com/topics/?main=anat/vision-anat


The Fovea

The fovea is the region of the retina that allows us to see detail. At the fovea, the top two layers of the retina thin out, allowing light to fall directly onto the photoreceptor cells.

Foveal photoreceptors are mostly cone cells, meaning that the fovea is particularly good at seeing color in daylight.

At night, the activity of color-insensitive rod cells in the periphery of the retina dominates vision.

Rods and Cones

An image shining upon the retina traverses these layers to reach the photoreceptor cells, which absorb the incoming light and transform it into electrochemical signals. Photoreceptors are divided into two subtypes, rods and cones. Generally, the outer segments of rods are long and thin, whereas the outer segments of cones are more cone shaped.

Rod cells are very sensitive to changes in contrast even at low light levels (they can detect a single photon) and can create black and white images without much light, but consequently are imprecise in detecting position (due to light scatter) and insensitive to color. Rods are generally located in the periphery of the retina and used for night vision. Rods function well below 3 cd/m² and cones function well at 3 cd/m² and above 5.

Once enough light is available (for example, daylight or artificial light in a room), cones give us the ability to see the colour and detail of objects. These are the high-precision cells that are specialised to detect red, green, or blue light. They are generally located in the center of the retina in a region of high spatial acuity (the fovea). Cones are responsible for high-acuity tasks like reading (they allow you to read this page): they allow us to see at a high resolution.

There are about 125 million rods and cones within the retina that act as the eye's photoreceptors. Rods are the more numerous of the two photoreceptors, outnumbering cones 18 to 1. When light contacts these two types of cells, a series of complex chemical reactions occurs. The chemical that is formed (activated rhodopsin) creates electrical impulses in the optic nerve.

The photoreceptors are in a continuous turnover, and the outer segment ‘discs’ have a turnover of around 9 to 13 days. The rod disc shedding is maximal in the morning, and cone shedding maximal at dusk. These are probably mediated through melatonin. The information received by the rods and cones is then transmitted to the nearly 1 million ganglion cells in the retina. These ganglion cells interpret the messages from the rods and cones and send the information on to the brain by way of the optic nerve.
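As a rough check on these figures, the totals quoted above can be turned into approximate numbers of rods and cones, and into the average number of photoreceptors feeding each ganglion cell. The sketch below is only arithmetic on the numbers in the text; it assumes the 18 to 1 ratio and the totals hold exactly, and the function name split_by_ratio is hypothetical. Real convergence varies strongly across the retina, so the average is indicative only.

```python
TOTAL_PHOTORECEPTORS = 125_000_000  # "about 125 million rods and cones"
ROD_TO_CONE_RATIO = 18              # rods outnumber cones "18 to 1"
GANGLION_CELLS = 1_000_000          # "nearly 1 million ganglion cells"


def split_by_ratio(total, rods_per_cone):
    """Split a total into (rods, cones) given the number of rods per cone."""
    cones = total / (rods_per_cone + 1)
    rods = total - cones
    return rods, cones


rods, cones = split_by_ratio(TOTAL_PHOTORECEPTORS, ROD_TO_CONE_RATIO)
convergence = TOTAL_PHOTORECEPTORS / GANGLION_CELLS

print(f"rods:  about {rods / 1e6:.1f} million")
print(f"cones: about {cones / 1e6:.1f} million")
print(f"average photoreceptors per ganglion cell: {convergence:.0f}")
```

On these assumptions there are roughly 118 million rods, under 7 million cones, and about 125 photoreceptors per ganglion cell on average, which is why the retina must pre-process the image before it reaches the optic nerve.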

5 Candelas per meter squared, or cd/m², is a unit of measure that used to be called "nits." Candela is a term that originated in the days when candles were used in theaters. Candelas per meter squared measures the light properties radiating from a one-meter-square surface. 3 cd/m² is the light level around sunset.


The Visual Cortex

A small group of fibers in the optic nerve splits off and travels down to brainstem nuclei, which are groups of cells governing reflex actions. Those fibers mediate automatic responses, such as adjusting the size of the pupil, blinking, and coordinating the movement of the eyes. The majority of fibers in the optic nerve take another path that leads to the very back of the brain, to a part of the occipital lobe called primary visual cortex, or V1.

On the way to V1, these fibers enter a nucleus in the center of the brain called the thalamus. The thalamus acts as a central depot for information coming into and going out of the cortex, and it has centers specialised for different types of information. The center that deals with vision is called the Lateral Geniculate Nucleus (LGN), a layered structure with cells that respond to form, motion, and color. Fibers from the optic nerve enter the LGN, where streams of information about the visual image are further separated and then sent on to the primary visual cortex.

The cerebral cortex is the extensive outer layer of gray matter of the cerebral hemispheres, and is involved in higher brain functions, including sensation, voluntary muscle movement, thought, reasoning, and memory. The grooves between the gyri on the brain's surface result in much of the cortex being buried; over 60% of the cerebral cortex in primates is buried and not visible from the surface. The cerebral cortex is typically 2-4 mm thick and folded, but if unfolded it has the same area as a 21” television.

The primary visual cortex (V1) is part of the cerebral cortex. V1, also called striate cortex because of the distinctive stripe it bears, is responsible for creating the basis of a three-dimensional map of visual space, and extracting features about the form and orientation of objects. Once basic processing has occurred in V1, the visual signal enters the secondary visual cortex, V2, which surrounds V1.

Visual Acuity

Visual acuity is a measure of the spatial resolving power of the visual system; it indicates the angular size of the smallest detail that can be resolved. Visual acuity is measured for various purposes. When determining the appropriate eyeglasses, the corrective lens power that permits the best visual acuity is prescribed. In the diagnosis and monitoring of eye diseases that may affect vision, changes of visual acuity are often taken to indicate the presence and magnitude of change in the eye condition 6 .

Visual acuity measurements are also used by some licensing authorities and employers as eligibility criteria for some occupations (e.g., airline pilot, police officer) and activities (e.g. driving). Visual acuity has traditionally been used as the primary indicator of the magnitude of functional impairment due to vision loss.

6 Agingeye Times, What is 20/20 Vision, Web Page at URL: http://www.agingeye.net/visionbasics/healthyvision.php#


Good visual acuity is important for a variety of everyday tasks in the workplace, but probably is most important for reading text and interpreting symbols. The visual acuity demand for a given task depends on the minimum size of the detail in the task and the observation distance. For example, a person with good visual acuity might be expected to recognise faces at about 20 meters. To recognize the same faces, a person with poor visual acuity would have to get significantly closer.

A person's "uncorrected" vision refers to the visual acuity when no glasses or contact lenses are used. The "best corrected" vision is the visual acuity with the best glasses or contact lens prescription for that person. Each eye is usually tested separately, although the vision may be slightly better when both eyes are tested together.

The notation of visual acuity is written as a fraction, with normal vision being 20/20 (twenty twenty vision). At a 20 foot distance, (the top number in the fraction, or testing distance), a person with normal vision should be able to read the small 20/20 line on an eye chart. The smallest line that you can read on the chart is your visual acuity.

If larger lines than the 20/20 line are all that can be read, the visual acuity may be 20/30, 20/60, etc. The larger the second number, the worse the vision. A person with 20/200 vision would have to come up to 20 feet to see a letter that a person with normal vision could see at 200 feet!

Similarly, if the vision is 20/10, it means that the vision is better than normal.

A person with 20/10 vision can read a letter at 20 feet that a person with normal vision would have to come up to 10 feet to read. Eye charts in offices are calibrated for different test distances, so that rooms do not have to be 20 feet long.
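The fraction can also be read as an angle, which ties it back to the definition of acuity as the angular size of the smallest resolvable detail. The sketch below is illustrative only: it relies on the usual convention, not stated in these notes, that the 20/20 line corresponds to detail subtending 1 minute of arc, and the function names are hypothetical.

```python
def decimal_acuity(test_distance, letter_distance):
    """Snellen fraction expressed as a single number (20/20 -> 1.0)."""
    return test_distance / letter_distance


def min_angle_arcmin(test_distance, letter_distance):
    """Smallest resolvable detail in minutes of arc.

    Assumes the conventional calibration 20/20 = 1 arcminute.
    """
    return letter_distance / test_distance


for top, bottom in [(20, 10), (20, 20), (20, 40), (20, 200)]:
    print(f"{top}/{bottom}: decimal acuity {decimal_acuity(top, bottom):.2f}, "
          f"detail of about {min_angle_arcmin(top, bottom):.1f} arcmin")
```

Under that convention a person with 20/200 vision needs detail ten times larger, in angular terms, than a person with 20/20 vision, while 20/10 vision resolves detail half that size.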

Certain visual acuities have special significance. Some of these are 7 :

• 20/20 vision is considered normal vision

• 20/40 vision in at least one eye is the vision required to pass the driving test

• 20/200 vision or worse is the legal definition of blindness

Dark Adaptation

Our eyes have to dark-adapt before we can see in dark or dim light. Usually, this requires fifteen to twenty minutes (or more) in an environment as dark as the environment you will be viewing.

First of all, the pupil needs to dilate to its maximum aperture in order to collect the most light.

Another component of night vision is contained in the biochemistry of the eye.

Dark adaptation is how the visual system adapts when going from a bright environment to a dark one. Light adaptation happens when we go from a dark environment to a bright one. For example, if you spend the afternoon at the movies and the sun is still shining when you leave, your eyes may hurt when you get outdoors. Likewise, on a snowy winter's day, when you go outside from indoors, where the lighting is moderate, you may find it difficult to see for a while, perhaps as long as a minute.

One of the major differences between dark adaptation and light adaptation is their time course.

Whereas dark adaptation takes about 30 minutes to be complete, light adaptation happens very quickly, usually in less than a minute. Another difference between these two types of adaptation is that when you are light adapted and then go into a very dark room, you may not see anything at all for a while. As you dark adapt, more and more things become visible. When you go from a darker area to a very bright one you usually are not temporarily blind. It is just that your vision, temporarily, is not very good.

In technical jargon, your contrast sensitivity is poor until you become light adapted. By that we mean that you will have difficulty in perceiving areas of low contrast. It is like everything is all washed out. But as you quickly light adapt the darker areas become darker and the lighter areas become easier to see.

7 Agingeye Times, What is 20/20 Vision, Web Page at URL: http://www.agingeye.net/visionbasics/healthyvision.php#


Human Vision – Illusions

Introduction to Illusion

Amongst the senses, Plato gave primacy to sight. When he decided that we had five senses, Aristotle ranked sight over hearing: 'Of all the senses, trust only the sense of sight'. Plato and Aristotle closely associated vision and reason.

This has been a persistent bias in Western culture. Thinking is associated with visual metaphors: 'observation' privileges visual data; phenomenon (Greek: 'exposing to sight'); definition (from definire, to draw a line around); insight, illuminate, shedding light, enlighten, vision, reflection, clarity, survey, perspective, point of view, overview, farsighted.

Other words associated with thinking also have visual roots: intelligent, idea, theory, contemplate, speculate, bright, brilliant and dull. And there is no shortage of commonly-used phrases which emphasise the primacy of the visual: Seeing is believing, Let me see, I see, I'll believe it when I see it with my own eyes, Seeing eye to eye, The mind's eye, See what I mean?

It is likely that the spread of literacy in modern times has helped to privilege sight.

At this stage, it is useful to note that theories about perception tend to emphasise the role of either sensory data or knowledge in the process. Some theorists adopt a data-driven or 'bottom-up' stance, according to which perception is 'direct': visual data is immediately structured in the optical array prior to any selectivity on the part of the perceiver. Others adopt a 'constructivist' or 'top-down' stance emphasising the importance of prior knowledge and hypotheses. Both processes are important: if we were purely data-driven we would be mindless automatons; if we were purely theory-driven we would be disembodied dreamers 8.

One thing we do know is that the human visual system is a mixture of distributed and central processing. There are at least two levels of vision, autonomous and perceptual. Shapes are perceived primarily by contours, edges and light level changes. Vision without contours (edge detection) can cause physical and psychological disorientation. However, the edges we see may not be real.

Mach Banding

We will start with a simple vision problem/illusion, Mach bands. This illusion describes a particular vision property which many optical illusions exploit.

In the image above, Mach bands appear at the boundary between squares of differing gray levels. Along the boundary the dark side looks darker and the light side looks lighter. The Mach band effect exaggerates the change in intensity at any boundary where there is a discontinuity in magnitude or slope of intensity.

8 Chandler, D., Visual Perception – http://www.aber.ac.uk/media/Modules/MC10220/visindex.html


These bars, known as Mach Bands after their discoverer, the physicist Ernst Mach, are illusory. Mach banding is caused by lateral inhibition of the receptors in the eye. As receptors receive light they draw light-sensitive chemical compounds from adjacent regions, thus inhibiting the response of receptors in those regions. Receptors directly on the lighter side of the boundary can pull in unused chemicals from the darker side, and thus produce a stronger response. Receptors on the darker side of the boundary, however, produce a weaker effect because of that same migration.


The best way to explain this is with an example. In the graph above, the solid black curve represents the amount of light being reflected from the figure on the left. The dashed red curve represents the brightness of this figure as it is usually perceived. To the left of the point where the figure just starts to get lighter, people usually see a dark bar that is slightly darker than the area to the left of it. At the point where the brightness just stops increasing, people usually perceive a bright bar.

The receptive fields are represented as a disk (+) and annulus (-) in the figure above. The center disk is an excitatory area and the annulus an inhibitory area. The receptive fields in the uniformly white and uniformly black areas receive about the same stimulation in their excitatory centers and inhibitory surrounds. Therefore the center excitations are in balance with the surround inhibitions.

The receptive field over the bright Mach Band gives a stronger response in the center because part of the surround is in the darker area. Therefore it receives less inhibition from the surround than did the center at the extreme left and right ends. The receptive field over the dark band receives more surround inhibition because part of the surround is in the brighter area. Therefore, the excitatory response is less and results in seeing the area as darker.
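The centre and surround account can be reproduced numerically with a minimal one-dimensional sketch. This is an illustrative toy model and not a physiological one: each sample's response is its own luminance (the excitatory centre) minus a fraction of the mean luminance of its neighbours (the inhibitory surround). The function name and the radius and strength values are assumptions chosen only to make the effect visible.

```python
def lateral_inhibition(luminance, radius=3, strength=0.5):
    """Centre-minus-surround response for a 1D luminance profile.

    radius and strength are illustrative parameters, not measured values.
    """
    n = len(luminance)
    response = []
    for i in range(n):
        # Surround: neighbours within `radius` on each side, centre excluded.
        # Indices are clamped so the ends behave like a continued uniform field.
        neighbours = [
            luminance[min(max(j, 0), n - 1)]
            for j in range(i - radius, i + radius + 1)
            if j != i
        ]
        surround = sum(neighbours) / len(neighbours)
        response.append(luminance[i] - strength * surround)
    return response


# Dark plateau, linear ramp, bright plateau (as in the graph described above).
dark, bright, steps = 0.2, 0.8, 10
ramp = [dark + (bright - dark) * k / steps for k in range(1, steps)]
profile = [dark] * 10 + ramp + [bright] * 10

response = lateral_inhibition(profile)
print("dark plateau response:  ", round(response[0], 3))
print("lowest response (band): ", round(min(response), 3))
print("bright plateau response:", round(response[-1], 3))
print("highest response (band):", round(max(response), 3))
```

The response dips below the dark plateau level at the foot of the ramp and rises above the bright plateau level at its top, which corresponds to the dark and bright bands that observers report.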

Grouping and Similarity

It is natural for humans to group together similar visual objects. The image on the right is a prime example; humans are more likely to impose a particular grouping on what they see.

People tend to refer to five pairs of lines which are close together with fairly broad gaps between them. They are less likely to group together the lines which are further apart, perhaps partly because this would leave lonely lines on each side of the image, but also, perhaps, because we seem to have a predisposition to associate things which are close together.
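Grouping by proximity can be mimicked with a very small sketch: neighbouring lines are placed in the same group whenever the gap between them is below a threshold. This is only an illustration of the idea, not a model of how the visual system does it; the line positions, the max_gap threshold and the function name are all hypothetical.

```python
def group_by_proximity(positions, max_gap):
    """Group sorted positions, starting a new group when a gap exceeds max_gap."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)
        else:
            groups.append([x])
    return groups


# Ten vertical lines: narrow gaps of 1 unit alternate with broad gaps of 4 units.
lines = [0, 1, 5, 6, 10, 11, 15, 16, 20, 21]
print(group_by_proximity(lines, max_gap=2))
# [[0, 1], [5, 6], [10, 11], [15, 16], [20, 21]]
```

With this threshold the ten lines fall into five pairs, matching the grouping people tend to report.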

Groupings can be triggered by position or similarity. In the images below the way we group the shapes in the image to the left will probably be different to the way we group the shapes in the image to the right.

Shapes are also seen in a way that maintains continuity: humans fill in the gaps. The context in which a shape is seen can also determine its perception.

Categories of Illusions

A picture assembled on the basis of partial information must be expected to occasionally be in error. The mind will always try to match stimulus and memory to create a picture. It will make what seems to be the most likely choice, and present that to the consciousness. An illusion occurs when the choice is incorrect. If a picture is created solely from memory, without visual stimulus (or with only a minimal visual stimulus) the result is hallucination, with which we shall not be concerned here, since it is a disorder of perception, not a normal or intended part of it. Things that are not there can also appear in illusion, it must be emphasised, but here it is normal 9.

An illusion can arise in any of the three links of visual perception. The mirage is an example of an external illusion, created in the first, physical link of light rays. It is visually interpreted as an actual scene, though we consciously recognise it as an illusion, and understand its cause. When we stare at a brightly illuminated red disk for a time, then transfer our attention to a white paper, we see a green disk as a result of what is called rather inaccurately fatigue. The green disk is an illusion created in the second partly physical, partly mental link. When the full moon is seen at the horizon, it seems much larger than when riding high in the sky, though physically it subtends exactly the same angle at the eye. This familiar illusion occurs in the third, mental link of vision, and a satisfying explanation of it is unknown.

Optical Illusions

Illusions occurring in the third link are those most generally recognised as optical illusions. Their scientific study began in 1854 10; much work was done later in the century, but it tapered off after 1900, although the subject is still actively researched by psychologists. Recent work deals largely with color and motion illusions, not with the static, black and white illusions that dominated earlier work. Popular interest in optical illusions has been sustained 11.

9 Calvert, J. B., Illusions, Web Page at URL: http://www.du.edu/~jcalvert/optics/illusion.htm

10 Oppel, J. in Jahresberichte des physikalisches Vereins zu Frankfurt, p. 138, 1854.

11 The books by M. Luckiesh (Visual Illusions, 1920), S. Tolansky (Optical Illusions, 1964), and M. Fineman (The Nature of Visual Illusion, 1981) are evidence of the continuing fascination.


Sometimes a phenomenon is called an illusion when it really is not, but is simply a true picture of an unexpected observation. An example is the searchlight illusion described by Luckiesh. The beam of a bright searchlight is visible because of scattering by dust and fog in its path, so that it seems practically a physical object. When the beam is projected up into the sky, it seems to vanish abruptly while still in full glory. When you look at this apparent end of the beam, you are looking in the direction in which the beam is pointed. If the beam were parallel (as your mind expects) it would, by perspective, narrow to a point. However, a searchlight beam is actually more or less divergent, fooling this expectation. It is only one's mental interpretation that is an illusion in this case, not the observation. Stars can be pointed out to others by means of a strong laser using this effect. If you view the searchlight beam from a distance, you see it diverge and become attenuated, and perhaps penetrate the layer of dusty air.

Camouflage

Tricking the eye into recognising one thing while observing another is often very useful to living things.

There are three different ways to do this. First, one might mimic something dangerous or nasty-tasting, as does the fly that resembles a wasp, or a brightly-colored butterfly with large eyes on its wings. A second way is to merge with the background, as do moths, stick insects, tabby cats etc. An interesting way to do this is to break up a familiar outline by a contrasting pattern. Warships were painted in bold, zig-zag patterns in the First World War for this purpose. The patterns did indeed break up the outline when you were close enough to see that they were ships, but at large distances aerial perspective (blue haze) smoothed the pattern, revealing again that they were ships. The third way is to look like something else. Cylindrical snakes and lizards are dark on top and light on the bottom, contrary to the normal modelling of a cylinder, so they resemble flat objects containing no meat.

Classic Illusions

A picture drawn on a flat background is an attempt to trick the eye into perceiving a three-dimensional scene. This is very effective, since the eye must do something similar in its normal functioning, because the retina is two-dimensional.

The skill of perceiving depth and perspective in a painting is learned, not innate. In moving pictures, the mind interprets the succession of static frames as continuous motion, again something it must do in its normal functioning.

Let's look at some classic static illusions created by black-and-white figures. All are third-link illusions resulting from a failure of estimation, or from the faulty comparison of distances or objects. In the bisection illusion, the vertical line is the same length as the horizontal line it bisects, though it seems about 25% longer. The illusion persists if the figure is rotated 90°, so it is not due to asymmetry of the retina, as one witless psychologist asserted. In the Müller-Lyer illusion, the line is bisected by the center arrowhead. The segment with the diverging wings appears longer, but it is not. In the annulus illusion, the area of the central disk is equal to the area of the annulus surrounding it, though it appears greater. Distance b-c in the lozenge illusion is equal to distance a-b, though appearing significantly longer. In the curvature illusion, all three arcs have exactly the same radius of curvature. Poggendorff's illusion is very famous. Line 2 is actually the continuation of the line on the left, although line 1 appears to be.
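For the annulus illusion, the equal-area condition fixes the proportions of the figure. The short derivation below is a sketch in which r (radius of the central disk) and R (outer radius of the annulus) are symbols introduced here for illustration; they do not appear in the original figure.

```latex
\pi r^{2} = \pi\left(R^{2} - r^{2}\right)
\quad\Longrightarrow\quad R^{2} = 2r^{2}
\quad\Longrightarrow\quad R = \sqrt{2}\,r \approx 1.414\,r
```

Under this condition the ring is only about 0.414 r wide, which may be part of why its area is so easily underestimated next to the disk.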

Greek temples were designed with deliberate distortions to make the building appear correct. Columns were given entasis, a slight swelling in the middle, so they would look straight, and architraves were cambered up slightly in the center so they would appear straight. Modern buildings are not so sensitively designed.

There is no satisfying explanation for any of these illusions, or even of the reasons why they should exist. Neither depth clues (at least not obviously) nor ambiguous or incomplete information is involved in any of them. They can, however, be recognised and classified, and have some practical application.

Ambiguous Figures

Sometimes a view may not contain enough information for the mind to make a conclusive interpretation. Where there are only two reasonable interpretations, the mind may alternate them, as if unable to make up its mind. The rate of alternation gives some idea of the length of time between reconsiderations of input data by the visual system, or of the operation of the short-term memory that is so necessary to avoid overload in the face of the flood of information bombarding the mind.

In the ambiguous figures shown, the one on the left can be interpreted either as an open book, or as a folded card with the fold towards you. The cube can be interpreted either with the diagonal line in the lower left-hand corner out of the page, or behind it. Vision is not really fooled here; there is simply insufficient depth information for a conclusive choice. Modifying the figures to give better depth clues, as shown, makes the interpretation unique. In one case, the figure was made to resemble a definite object, an open book, and in the other hidden lines were removed to make the cube appear solid.

Other Illusions

Illusions can also arise from contrast of brightness, as the perception strives to maintain line and shade. The well-known illusion shown at the right is an example. There are gray patches at every crossing, except for the one you are looking at directly.

Illusions of motion and color are difficult to illustrate in text, and are so extensive as to require individual study. Color can be perceived in a rotating black-and-white disc of suitable pattern, which is probably due to different fatigue characteristics of the color-sensitive proteins in the cone cells of the retina.

Many color illusions are due to physical causes, because of the poor spectral resolution of the eye, and differences in illuminants and pigments. Adaptation, where the ambient illumination comes to appear as white as possible, and color constancy, where colors are interpreted similarly under different conditions of illumination, are fundamental and useful properties of the color sense, not illusions.


Eye Movements

There are four basic types of eye movements: saccades, smooth pursuit movements, vergence movements, and vestibulo-ocular movements. The functions of each type of eye movement are introduced here: 12

Saccades are rapid, ballistic movements of the eyes that abruptly change the point of fixation. They range in amplitude from the small movements made while reading, for example, to the much larger movements made while gazing around a room. Saccades can be elicited voluntarily, but occur reflexively whenever the eyes are open, even when fixated on a target. The rapid eye movements that occur during an important phase of sleep are also saccades.

Smooth pursuit movements are much slower tracking movements of the eyes designed to keep a moving stimulus on the fovea. Such movements are under voluntary control in the sense that the observer can choose whether or not to track a moving stimulus. (Saccades can also be voluntary, but are also made unconsciously.) Surprisingly, however, only highly trained observers can make a smooth pursuit movement in the absence of a moving target. Most people who try to move their eyes in a smooth fashion without a moving target simply make a saccade.

Vergence movements align the fovea of each eye with targets located at different distances from the observer. Unlike other types of eye movements in which the two eyes move in the same direction (conjugate eye movements), vergence movements are disconjugate (or disjunctive); they involve either a convergence or divergence of the lines of sight of each eye to see an object that is nearer or farther away.
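The geometry behind vergence can be sketched numerically. The example below assumes a target straight ahead of the observer and an eye separation of 63 mm; both the value and the function name `vergence_angle_deg` are illustrative assumptions, not figures from this handout.

```python
import math


def vergence_angle_deg(distance_m, eye_separation_m=0.063):
    """Angle (in degrees) between the two lines of sight for a target straight ahead.

    eye_separation_m is an assumed typical value, used only for illustration.
    """
    return math.degrees(2 * math.atan((eye_separation_m / 2) / distance_m))


# The eyes converge strongly for near targets and are almost parallel for far ones.
for d in (0.25, 0.5, 1.0, 5.0, 20.0):
    print(f"{d:5.2f} m -> {vergence_angle_deg(d):6.3f} deg")
```

The angle shrinks roughly in proportion to 1/distance, so most of the convergence or divergence happens for targets within a metre or two.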

Vestibulo-ocular movements stabilise the eyes relative to the external world, thus compensating for head movements. These reflex responses prevent visual images from "slipping" on the surface of the retina as head position varies. The action of vestibulo-ocular movements can be appreciated by fixating an object and moving the head from side to side; the eyes automatically compensate for the head movement by moving the same distance but in the opposite direction, thus keeping the image of the object at more or less the same place on the retina. The vestibular system detects brief, transient changes in head position and produces rapid corrective eye movements.

Reading is not performed through continuous eye movement but through sudden changes of fixation: the eye fixates a given point in space and takes in the surrounding letters. The rate at which the eye can shift from one fixation to the next is limited by the time that the brain takes to process the incoming information. Faster reading is achieved not by quicker eye movements but by an expansion of the visual field taken in at each fixation.

Depth Perception

Depth perception is the ability to see the world in three dimensions and to perceive distance. Although this ability may seem simple, depth perception is remarkable when you consider that the images projected on each retina are two-dimensional. From these flat images, we construct a vivid three-dimensional world. To perceive depth, we depend on two main sources of information: binocular disparity, a depth cue that requires both eyes; and monocular cues, which allow us to perceive depth with just one eye.

12 Purves, D. et al., Neuroscience, Second Edition, Sinauer, 2001.


Binocular Disparity

Perhaps the most important perceptual cues of distance and depth depend on so-called binocular disparity. Because our eyes are spaced apart, the left and right retinas receive slightly different images. This difference in the left and right images is called binocular disparity. The brain integrates these two images into a single three-dimensional image, allowing us to perceive depth and distance. The phenomenon of binocular disparity functions primarily in near space because with objects at considerable distances from the viewer the angular difference between the two retinal images diminishes.
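The claim that disparity works mainly in near space can be checked with a small calculation. This is a minimal sketch under stated assumptions: both points lie straight ahead, the eye separation is taken to be 63 mm (an assumed value), and disparity is approximated as the difference between the vergence angles of the two points. The function names are hypothetical.

```python
import math

EYE_SEPARATION_M = 0.063  # assumed typical value, not taken from the handout


def vergence_rad(distance_m):
    """Angle between the two lines of sight for a point straight ahead."""
    return 2 * math.atan((EYE_SEPARATION_M / 2) / distance_m)


def disparity_arcmin(near_m, far_m):
    """Angular disparity (arc minutes) between two points at different depths."""
    return math.degrees(vergence_rad(near_m) - vergence_rad(far_m)) * 60


# The same 10 cm depth step produces less and less disparity farther away.
for near in (0.5, 2.0, 5.0, 50.0):
    print(f"{near:5.1f} m vs {near + 0.1:5.1f} m: "
          f"{disparity_arcmin(near, near + 0.1):9.4f} arcmin")
```

Because the disparity falls off roughly with the square of the distance, the same depth step that is obvious at arm's length produces almost no difference between the two retinal images at 50 m.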

Monocular Cues

Monocular cues are cues to depth that are effective when viewed with only one eye. Although there are many kinds of monocular cues, the most important are interposition, atmospheric perspective, texture gradient, linear perspective, size cues, height cues, and motion parallax.

Interposition: Probably the most important monocular cue is interposition, or overlap. When one object overlaps or partly blocks our view of another object, we judge the covered object as being farther away from us.

Atmospheric Perspective: The air contains microscopic particles of dust and moisture that make distant objects look hazy or blurry. This effect is called atmospheric perspective, and we use it to judge distance.

Texture Gradient: A texture gradient arises whenever we view a surface from a slant, rather than directly from above. The texture becomes denser and less detailed as the surface recedes into the background, and this information helps us to judge depth.

Linear Perspective: Linear perspective refers to the fact that parallel lines, such as railroad tracks, appear to converge with distance, eventually reaching a vanishing point at the horizon. The more the lines converge, the farther away they appear.
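The convergence of parallel lines follows from simple projection. The sketch below uses an idealised pinhole model (an assumption standing in for the optics of the eye), with a hypothetical `project_x` helper and an assumed rail spacing of 1.435 m.

```python
def project_x(x_m, depth_m, focal_length=1.0):
    """Pinhole projection: image position of a point x_m to the side, depth_m ahead."""
    return focal_length * x_m / depth_m


HALF_GAUGE_M = 0.7175  # assumed: half of a 1.435 m track gauge

# The two rails stay the same distance apart in the world,
# but their images move together as depth increases.
for depth in (2, 5, 10, 50, 500):
    left = project_x(-HALF_GAUGE_M, depth)
    right = project_x(HALF_GAUGE_M, depth)
    print(f"depth {depth:4d} m: image separation {right - left:.4f}")
```

As depth grows without limit the separation tends to zero, which corresponds to the vanishing point at the horizon.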

Size Cues: Another visual cue to apparent depth is closely related to size constancy. If we assume that two objects are the same size, we perceive the object that casts a smaller retinal image as farther away than the object that casts a larger retinal image. This depth cue is known as relative size, because we consider the size of an object's retinal image relative to other objects when estimating its distance.

Another depth cue involves the familiar size of objects. Through experience, we become familiar with the standard size of certain objects. Knowing the size of these objects helps us judge our distance from them and from objects around them.
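Both size cues can be expressed through the visual angle an object subtends. The following sketch assumes a simple visual-angle model; the function names and the 1.7 m person height are illustrative assumptions.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Visual angle (degrees) subtended by an object of a given size and distance."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


def distance_from_familiar_size(size_m, angle_deg):
    """Invert the relation: known size plus observed angle gives the distance."""
    return size_m / (2 * math.tan(math.radians(angle_deg) / 2))


PERSON_HEIGHT_M = 1.7  # assumed familiar size

near = visual_angle_deg(PERSON_HEIGHT_M, 5.0)
far = visual_angle_deg(PERSON_HEIGHT_M, 20.0)
print(f"at  5 m: {near:.2f} deg")
print(f"at 20 m: {far:.2f} deg")  # smaller retinal image, so judged farther away
print(f"distance recovered from the smaller angle: "
      f"{distance_from_familiar_size(PERSON_HEIGHT_M, far):.1f} m")
```

The first function corresponds to relative size (equal objects, smaller angle, greater distance); the second corresponds to familiar size, where a known size turns the observed angle into a distance estimate.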

Height Cues: We perceive points nearer to the horizon as more distant than points that are farther away from the horizon. This means that below the horizon, objects higher in the visual field appear farther away than those that are lower. Above the horizon, objects lower in the visual field appear farther away than those that are higher. This depth cue is called relative height, because when judging an object's distance, we consider its height in our visual field relative to other objects.


Motion Parallax: Motion parallax arises when you are in motion: objects at different distances from you appear to move at different rates. The rate of an object's apparent movement provides a cue to its distance. More distant objects appear to move at a slower pace.
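The relation between distance and apparent speed can be illustrated with a small calculation. This sketch assumes an observer moving in a straight line and a stationary object at the moment it is directly to the side, where its angular speed is the observer's speed divided by the distance. The function name and the example speed are assumptions.

```python
import math


def angular_speed_deg_per_s(observer_speed_m_s, distance_m):
    """Apparent angular speed of a stationary object directly to the observer's side."""
    return math.degrees(observer_speed_m_s / distance_m)


SPEED_M_S = 10.0  # assumed observer speed, roughly a slow-moving car

for distance in (5, 50, 500, 5000):
    print(f"{distance:5d} m away: {angular_speed_deg_per_s(SPEED_M_S, distance):8.3f} deg/s")
```

A fence post a few metres away sweeps past quickly, while a hill several kilometres away hardly seems to move at all.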

Movement Detection

Movement detection is the oldest and most important visual process. All seeing animals can detect movement, though they may vary widely in their other perceptual abilities. If one regards vision as an "early warning" system, aiding in the detection of predators, prey and other biologically salient events, then the important thing to detect is environmental change, and movement always accompanies this. Movement helps in encoding the third dimension (by means of motion parallax) and aids image segmentation and hence object recognition.

Research suggests that retinal ganglion cells signal the presence and direction of movement by adding up the excitation and inhibition acting upon them. The output of one group of receptors (R1) is excitatory and goes directly to the ganglion cell. The output from a second set of receptors (R2) is inhibitory and reaches the ganglion cell after a short delay.

Consequently, if the stimulus moves in one direction, it stimulates first R1 and then R2; the excitation from R1 reaches the ganglion cell before the inhibition from R2, and the ganglion cell fires. If the stimulus moves in the opposite direction, it stimulates first R2 and then R1; because the signal from R2 is delayed, the inhibition from R2 reaches the ganglion cell at the same time as the excitation from R1; the excitation and inhibition cancel out; and the ganglion cell does not fire. Organisation of these cells into "opponent pairs" would stop the system being fooled by stationary stimuli.
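The delay-and-cancel scheme described above can be simulated in a few lines. This is a toy sketch, not a physiological model: time is divided into discrete steps, the inhibitory delay is assumed to equal the time the stimulus takes to travel between R1 and R2, and an "opponent pair" is assumed to mean two mirror-image detectors whose outputs are subtracted. All function names are hypothetical.

```python
def detector_fires(excite_time, inhibit_time, delay=1, steps=6):
    """Return True if excitation ever arrives without simultaneous inhibition.

    excite_time:  time step at which the excitatory receptors are stimulated
    inhibit_time: time step at which the inhibitory receptors are stimulated
    delay:        extra steps the inhibitory signal needs to reach the cell
    """
    for t in range(steps):
        excitation = 1 if t == excite_time else 0
        inhibition = 1 if t == inhibit_time + delay else 0
        if excitation - inhibition > 0:
            return True
    return False


def opponent_pair(r1_time, r2_time):
    """Difference between two mirror-image detectors: +1, -1 or 0."""
    forward = detector_fires(excite_time=r1_time, inhibit_time=r2_time)
    backward = detector_fires(excite_time=r2_time, inhibit_time=r1_time)
    return int(forward) - int(backward)


print(detector_fires(0, 1))  # stimulus reaches R1, then R2: True
print(detector_fires(1, 0))  # stimulus reaches R2, then R1: False
print(detector_fires(0, 0))  # stationary flash on both: True (a false alarm)
print(opponent_pair(0, 1), opponent_pair(1, 0), opponent_pair(0, 0))  # 1 -1 0
```

A single detector responds to a stationary flash as if it were movement, whereas the opponent pair gives zero for the flash and a signed output for each direction of real movement.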

There are various types of apparent (i.e. illusory) movement. As with other illusions, they give us some insight into the processes underlying normal movement perception.

(a) Induced Movement: Movement perception seems to embody an "assumption" about the natural world - that larger objects are usually stationary and smaller objects usually move.

(b) Apparent motion (stroboscopic motion): If two lights are flashed on and off in succession, with an appropriate interval between them, the viewer perceives one light that appears to move from the position of the first light to the position of the second.

(c) Autokinetic movement: A small spot of light, inspected in an otherwise totally dark room, appears to wander around spontaneously and apparently randomly. The cause is still unknown.

(d) Movement after-effect (MAE), the "Waterfall" illusion: After prolonged viewing of a pattern that moves in one direction, everything in a new scene appears to move in the opposite direction when the viewer looks elsewhere. This could be explained by fatigue of the motion detectors.

Final Comment on Vision

To finish the topic of vision, here is a quote taken from “The Object Stares Back”, based on the work of a French surrealist writer. 13

There are three things that throw the eyes into such confusion that they may not be able to see at all:

The Sun, Death and Genitals.

Confronted with objects like these, vision goes out of control, and we see where we do not want to, or fail to see where we should, and our eyes no longer obey our conscious wishes.

13 Georges Bataille, Oeuvres Complètes, ed (Paris, 1970), Volume 1.
