Perception Kurt Akeley CS248 Lecture 18 29 November 2007

Perception
Kurt Akeley
CS248 Lecture 18
29 November 2007
http://graphics.stanford.edu/courses/cs248-07/
Today
This is the last for-credit lecture

Material from next week's lectures will not be tested
Emphasize perception

Pull together and re-emphasize ideas from earlier lectures

Introduce some new ideas

Tie everything back to performance
CS248 Lecture 18
Kurt Akeley, Fall 2007
Optical quality of the eye
Range of focus:
• 5” to infinity (you)
• 40” to infinity (me, corrected)
Fovea
What is the image of this (ideal) line?
Image from www.wikipedia.com
Retinal image of an ideal line
Eye image from www.wikipedia.com
Line spread function
Retinal image of a sine wave grating
Lower contrast
Eye image from www.wikipedia.com
Modulation transfer function
Ricco’s Law
Area and intensity are indistinguishable for objects that subtend
less than (roughly) 6 arc min.
This allows antialiasing to work

Especially fractional-width points and lines
Antialiased pixels should subtend less than 6 arc min
Ricco’s Law and line spread (a coincidence?)
6 arc min
Spatial resolution of the eye
Cone spacing in the fovea:

L and M cones: 0.5 arc min

S cones: 10 arc min
Thus the lower spectral
response seen in the color
theory lecture
Nyquist frequency for foveal photopic vision is 60 cpd

Half the 120 cone/deg density
Nyquist frequency is much lower outside the fovea

Effective receptor density falls to 1/20th that of the fovea (?)

Rendering can take advantage of this

E.g., insets in flight-simulation graphics accelerators
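The Nyquist figures above follow directly from the cone spacing. A minimal sketch of that arithmetic, assuming evenly spaced receptors (the function name is illustrative, not from the lecture):

```python
ARCMIN_PER_DEGREE = 60.0

def nyquist_cpd(cone_spacing_arcmin):
    """Nyquist frequency in cycles per degree for a given receptor spacing."""
    cones_per_degree = ARCMIN_PER_DEGREE / cone_spacing_arcmin
    # Two samples are needed per cycle, so Nyquist is half the sample density.
    return cones_per_degree / 2.0

foveal_lm = nyquist_cpd(0.5)   # L and M cones, 0.5 arc min spacing
foveal_s = nyquist_cpd(10.0)   # S cones, 10 arc min spacing
print(foveal_lm, foveal_s)
```

With the spacings quoted on the slide, this gives 60 cpd for L and M cones and 3 cpd for S cones.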
No aliasing in foveal vision
Peripheral Nyquist frequency (approximate)
Foveal Nyquist frequency
No aliasing in foveal S cones either
Optics of the eye are substantially worse for 400 nm light
MTF did not show this (it is an aggregate)
Vernier acuity
Can detect an offset of 5 arc sec
But sensor spacing is 30 arc sec
How does this work?

Not due to random sensor locations (works with very short lines)
5 arc sec
How vernier acuity (probably) works
Cone spacing
Display resolution
h = screen height
d = viewing distance
p = pixel count
θ = pixel angle
$\theta = \tan^{-1}\left(\frac{h}{p\,d}\right)$
$0.028^\circ = \tan^{-1}\left(\frac{12''}{1024 \times 24''}\right) = 1.68$ arc min
Satisfies Ricco’s Law (less than 6 arc min)
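The pixel-angle formula can be checked numerically. This is a small sketch using the slide's numbers (12" screen height, 24" viewing distance, 1024 pixels); the function name is an assumption made for illustration:

```python
import math

def pixel_angle_arcmin(screen_height, viewing_distance, pixel_count):
    """Angle subtended by one pixel, in arc minutes: atan(h / (p * d))."""
    theta_rad = math.atan(screen_height / (pixel_count * viewing_distance))
    return math.degrees(theta_rad) * 60.0

# Height and distance only need to share a unit (inches here).
angle = pixel_angle_arcmin(12.0, 24.0, 1024)
print(f"{angle:.2f} arc min")
```

The result is well under the 6 arc min limit from Ricco's Law.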
Matching foveal resolution
h = screen height
d = viewing distance
p = pixel count
θ = pixel angle
$\theta = \tan^{-1}\left(\frac{h}{p\,d}\right)$
Foveal resolution:
$0.00833^\circ = \tan^{-1}\left(\frac{12''}{3438 \times 24''}\right) = 0.5$ arc min
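Solving the same formula for the pixel count shows where 3438 comes from. A minimal sketch, assuming the same 12" screen height and 24" viewing distance (the function name is illustrative):

```python
import math

def pixels_for_angle(screen_height, viewing_distance, target_arcmin):
    """Pixel count p such that each pixel subtends target_arcmin: p = h / (d * tan(theta))."""
    theta_rad = math.radians(target_arcmin / 60.0)
    return screen_height / (viewing_distance * math.tan(theta_rad))

count = pixels_for_angle(12.0, 24.0, 0.5)
print(round(count))
```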
Flicker
Flicker fusion threshold

Statistically 16 Hz
Increases

In peripheral vision

With brighter scenes

With viewer fatigue
Hence “jumping” numeric or CRT displays, when you aren’t looking directly at them
Flicker rates:

Movies: 48 Hz (typical), 72 Hz (using computer displays)

Video: 60 Hz (US NTSC), 50 Hz (Europe and Asia, PAL)

Computer displays: 60-100 Hz (CRT), no flicker (LCD)

Fluorescent lights: 120 Hz (US), 100 Hz (Europe, Asia)
Frame rate vs. flicker rate
Increasing flicker rate above frame rate:

Avoids flicker-rate problems

But introduces visual artifacts

Image doubling (2x) or even tripling (3x)
Media               Frame rate (Hz)   Flicker rate (Hz)
Movie               24                48 or 72
Television          25 or 30*         50 or 60
Visual simulation   60                60
Interlaced displays
Two fields per “frame”

Display odd lines in the first field

Display even lines in the second field
“Frame” is misleading:


True interlaced sampling is “flying spot”

Each pixel is sampled and displayed at proportional times

Motion artifacts are avoided
Interlaced frames (e.g., video display of a movie)

All pixels are sampled at the same moment

But display is sequential, causing motion artifacts
Still common in video

1080i is standard

1080p is becoming more common
Big battle during definition of HDTV!
Interlacing and antialiasing
Small moving objects can disappear

Object subtends a single pixel

Fields are rendered properly (not from a single frame)
One solution is antialiasing with a large filter kernel

Rendered objects necessarily subtend more than a single pixel
Field n, Field n+1, Field n+2
Color sequential displays
Time-sequential red, green, blue (and sometimes white)
Examples:

Many digital projectors

Professional head-mounted displays
Should render each “frame” separately

Movies don’t


So time sequential projectors yield “rainbow” effects
Simulation systems do

So motion artifacts are avoided
Mach banding – slope discontinuities
Same peak intensities
Human response is not linear
Twice as many photons/sec does not appear twice as bright
Instead 5.7 times as many photons appear twice as bright
Brightness (human perception) and intensity (actual photon rate) are related by Stevens' Power Law:
$B = k \cdot I^{0.4}$
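The 5.7x figure follows from inverting the power law. A short sketch, assuming the 0.4 exponent quoted above (function names are illustrative):

```python
STEVENS_EXPONENT = 0.4  # exponent from the slide: B = k * I**0.4

def brightness_ratio(intensity_ratio, exponent=STEVENS_EXPONENT):
    """How much brighter a stimulus appears when its intensity is scaled."""
    return intensity_ratio ** exponent

def intensity_ratio_for(target_brightness_ratio, exponent=STEVENS_EXPONENT):
    """Intensity scaling needed for a given ratio of perceived brightness."""
    return target_brightness_ratio ** (1.0 / exponent)

doubled_intensity = brightness_ratio(2.0)       # about 1.32x as bright
needed_for_double = intensity_ratio_for(2.0)    # about 5.66x the photons
print(doubled_intensity, needed_for_double)
```

The constant k cancels in both ratios, so it does not appear in the code.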
Human sensitivity is not linear either
Can distinguish intensity differences of 1%

Static images

Photopic (intensities bright enough for cones to see)
This corresponds to a linear change in brightness
$\Delta B = k \cdot (1.01 \cdot I)^{0.4} - k \cdot I^{0.4} = k \cdot (0.01)^{0.4}$
Motion matters
Numeric representation
Optimal numeric representation would arrange for adjacent
intensities to be (barely) indistinguishable.
Thus optimal numeric representation is

nonlinear in intensity (relative differences of 1 percent)

but linear in brightness (absolute differences of $k \cdot (0.01)^{0.4}$)
Contrast ratio
Visible contrast:

4-5 orders of magnitude within a scene (at the same time)

6 orders of magnitude of “adaptation”

Can take up to 40 minutes, though
Bits   Delta I per step   Delta I total
8      1.0 %              12.6
8      1.8 %              100
10     0.7 %              1000
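The table entries are consistent with compounding a fixed relative step over every code value. A hedged sketch of that relationship, assuming 2^bits - 1 steps between the darkest and brightest codes (function names are illustrative):

```python
def total_contrast(bits, step_fraction):
    """Contrast ratio spanned when each code is step_fraction brighter than the last."""
    steps = 2 ** bits - 1
    return (1.0 + step_fraction) ** steps

def step_for_contrast(bits, contrast):
    """Relative step size needed to span a contrast ratio with the given bit depth."""
    steps = 2 ** bits - 1
    return contrast ** (1.0 / steps) - 1.0

print(total_contrast(8, 0.01))          # roughly 12.6
print(step_for_contrast(8, 100.0))      # roughly 1.8 %
print(step_for_contrast(10, 1000.0))    # roughly 0.7 %
```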
Solutions
Brightness-linear storage

Use linear arithmetic (get incorrect answers)

Use non-linear arithmetic (get correct answers)

Convert to intensity-linear, operate, convert back

Implement nonlinear arithmetic
Intensity-linear storage

Gamma correct (convert to brightness-linear form) when displaying
Brightness-linear storage
Intensities can be added, brightnesses cannot
Store image linear in brightness (unusual in 3-D systems)

Best use of available storage precision

256 representable levels are enough

Requires conversion for each pixel operation (e.g., blend)
Pipeline: Gamma converter -> (8 bits) -> 8-bit frame buffer -> (8 bits) -> DAC -> Display
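To illustrate why each pixel operation needs a conversion, here is a sketch of a 50% blend on brightness-linear 8-bit values. It assumes a simple power-law encoding with a gamma of 2.4 (the value used elsewhere in this lecture); the helper names are hypothetical:

```python
GAMMA = 2.4  # assumption: simple power-law encoding, not any specific standard

def decode(value, gamma=GAMMA):
    """Brightness-linear 8-bit code -> intensity in [0, 1]."""
    return (value / 255.0) ** gamma

def encode(intensity, gamma=GAMMA):
    """Intensity in [0, 1] -> brightness-linear 8-bit code."""
    return round(255.0 * intensity ** (1.0 / gamma))

def blend(a, b, alpha=0.5):
    """Blend two stored codes by converting to intensity, mixing, and converting back."""
    mixed = alpha * decode(a) + (1.0 - alpha) * decode(b)
    return encode(mixed)

naive = (0 + 255) / 2      # adding brightnesses directly: 127.5
correct = blend(0, 255)    # adding intensities, then re-encoding: 191
print(naive, correct)
```

Averaging the stored codes directly gives a visibly darker result than mixing the underlying intensities.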
Intensity-linear storage
Store image linear in intensity (typical in 3-D systems)

Native arithmetic format

Requires conversion during display

Large brightness steps at low intensities

256 DAC levels is OK, but frame buffer needs more
Pipeline: n-bit frame buffer -> (n bits) -> Gamma converter -> (8 bits) -> DAC -> Display
What is n ?
Assume

8-bit DAC

Gamma of 2.4
$\text{output} = 255 \cdot \left(\frac{\text{input}}{2^n - 1}\right)^{1/2.4}$, where $1/2.4 \approx 0.4166$

Table of outputs by frame buffer depth n:

Input     n=8   n=10   n=12   n=14   n=16
2**n-1    255   255    255    255    255
2**n-2    254   255    255    255    255
…
3         40    22     13     7      4
2         34    19     11     6      3
1         25    14     8      4      2
0         0     0      0      0      0
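The low end of the table can be reproduced with a few lines of code. This is a sketch assuming round-to-nearest on the DAC output, which the slide does not state; a few slide entries may differ by one count under a different rounding rule. Function names are illustrative:

```python
def gamma_output(value, n_bits, gamma=2.4, dac_max=255):
    """Map an n-bit intensity-linear code to an 8-bit DAC code."""
    max_input = 2 ** n_bits - 1
    return round(dac_max * (value / max_input) ** (1.0 / gamma))

def lowest_outputs(n_bits, count=4):
    """DAC codes produced by the darkest `count` frame buffer codes."""
    return [gamma_output(v, n_bits) for v in range(count)]

for n in (8, 10, 12):
    print(n, lowest_outputs(n))
```

The gaps between the first few outputs show how many DAC levels are skipped at low intensities, and how that gap shrinks as n grows.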
Display gamut
No finite set of primaries can reproduce the entire gamut. But more primaries do a better job.
Perception and Performance
(adapted from my VR2004 keynote)
Latency
For an out-the-window display

100 to 150 milliseconds
For a head-mounted display

5 to 15 milliseconds**
Total response latency, sum of

Tracking/input delay, plus

Rendering delay, plus

Display delay
A 72 Hz display refreshes every 14 ms
** source: Fred Brooks
Latency solution
Reduce system latency to 5-15 ms range
Requires 2-4 ms frame time (250-500 Hz)

Assuming 3-frame latency
Estimated cost: 5x
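The frame-time and frame-rate numbers are related by simple arithmetic. A minimal sketch, assuming total latency is just the frame time multiplied by the number of frames in flight (function names are illustrative):

```python
def frame_rate_hz(frame_time_ms):
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_time_ms

def total_latency_ms(frame_time_ms, frames_of_latency=3):
    """Total latency when the pipeline holds a fixed number of frames."""
    return frame_time_ms * frames_of_latency

for frame_time in (2.0, 4.0):
    print(frame_time, frame_rate_hz(frame_time), total_latency_ms(frame_time))

refresh_period_72hz = 1000.0 / 72.0  # about 14 ms per refresh
```

With a 3-frame pipeline, 2 ms and 4 ms frame times give 6 ms and 12 ms of total latency, inside the 5 to 15 ms target.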
Running total
Cost   Feature       Notes
5x     Low latency   Frame rate 250-500 Hz
Stereo solution
Binocular disparity is a very strong visual cue
Must render separately for each eye

Occlusion

View-dependent lighting (e.g. reflections, specularity)

Alternatives tend to be hacks
Estimated cost: 2x
Running total
Cost   Feature       Notes
5x     Low latency   Frame rate 250-500 Hz
2x     Stereo        Two independent views
Incorrect retinal cue – blur gradient
Correct
Incorrect
Focus cue solution
Multiple image plane display

Fixed relationship to viewer (e.g. head mounted)

Low resolution in depth

Non-occluding images with depth filtering

Separate left and right displays (2x cost already accounted)
Leverages 2D technology

Amounts to a 2.5D display
Cost estimate: 3x
Running total
Cost   Feature              Notes
5x     Low latency          Frame rate 250-500 Hz
2x     Stereo               Two independent views
3x     Correct focus cues   Multi-plane display
High Dynamic Range (HDR)
Human limitations (numbers from Sunnybrook Technologies)

1,000,000:1 range of sensitivity

100,000:1 contrast within scene
Current displays

CRT 300:1 contrast ratio

LCD 1000:1 contrast ratio
SIGGRAPH 2003 ET

Sunnybrook Technologies
Sunnybrook Technologies
Dual-density display

Conventional LCD panel in front (full-resolution)

White LED array used as back-light (~1/50 resolution)
Sunnybrook Technologies
Scattering masks low resolution LEDs
HDR solution
Requires 16-bit framebuffer components

Rendering

Blending

Full-scene anti-aliasing
Requires multi-resolution rendering

Full-resolution for LCD, corrected for back-lighting

Low-resolution for back-lighting
Estimated cost: 2x
Running total
Cost   Feature              Notes
5x     Low latency          Frame rate 250-500 Hz
2x     Stereo               Two independent views
3x     Correct focus cues   Multi-plane display
2x     High dynamic range   Multi-resolution rendering
Field of view
Human field of view (FOV)

Monocular: 160 deg (wide) x 135 deg (high)

Binocular: 200 deg (wide)

Binocular overlap: 120 deg (wide)
Typical screen FOV

55 deg (wide) x 41 deg (high)
Optical flow matters
“Women Go With the (Optical) Flow”, Desney S. Tan, Mary Czerwinski, George Robertson.
http://research.microsoft.com/users/marycz/chi2003flow.pdf
FOV solution
Double horizontal FOV to 110 degrees
Double vertical FOV to 80 degrees
Cleverness to distribute resolution?

e.g. cylindrical projection
Estimated cost: 5x
Pixels subtend different angles
Assumes planar display
[Plot: curves for Center pixel and Edge pixel; horizontal axis: Field of view, 0 to 140]
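For a planar display with uniformly sized pixels, the angle a pixel subtends shrinks toward the edge of the screen. A sketch of that geometry, assuming the viewer is centered and small pixels (so the angle per pixel at screen angle a scales with cos²(a)); this is an illustration of the planar-display assumption, not a reconstruction of the plotted data:

```python
import math

def center_to_edge_ratio(fov_degrees):
    """Ratio of the angle subtended by a center pixel to that of an edge pixel."""
    half_fov = math.radians(fov_degrees / 2.0)
    # A pixel at screen angle a subtends an angle proportional to cos(a)**2.
    return 1.0 / math.cos(half_fov) ** 2

for fov in (0, 55, 90, 110, 120):
    print(fov, round(center_to_edge_ratio(fov), 2))
```

The ratio grows quickly with field of view, which is why a wide planar display spends much of its resolution on edge pixels.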
Running total
Cost   Feature              Notes
5x     Low latency          Frame rate 250-500 Hz
2x     Stereo               Two independent views
3x     Correct focus cues   Multi-plane display
2x     High dynamic range   Multi-resolution rendering
5x     Full field of view   110 deg (wide) x 80 deg (high)
Foveal resolution
Foveal sampling density is ½ arc min

Display pixel should subtend ½ arc min
Typical monitor pixel subtends 2 arc min

1600 pixels at (dist = width)
IBM T221 (aka Big Bertha) LCD Display

Resolution: 3840 (wide) x 2400 (high)

Dimensions: 19” (wide) x 12” (high)
Estimated cost: 15x
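The "2 arc min" figure for a typical monitor can be checked from the 1600-pixel, distance-equals-width case. A rough sketch that averages the angle over all pixels in a row (the function name and the averaging are assumptions for illustration):

```python
import math

def average_pixel_arcmin(pixels_across, distance_over_width):
    """Average angle per pixel, in arc minutes, for a centered viewer."""
    fov_deg = math.degrees(2.0 * math.atan(0.5 / distance_over_width))
    return fov_deg * 60.0 / pixels_across

typical = average_pixel_arcmin(1600, 1.0)  # viewing distance equal to screen width
print(f"{typical:.2f} arc min")
```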
Running total
Cost   Feature              Notes
5x     Low latency          Frame rate 250-500 Hz
2x     Stereo               Two independent views
3x     Correct focus cues   Multi-plane display
2x     High dynamic range   Multi-resolution rendering
5x     Full field of view   110 deg (wide) x 80 deg (high)
15x    Foveal resolution    ½ arc min
Full-scene antialiasing
SAGE
Render

16 samples / pixel
Reconstruction

5x5 pixel filter

400 samples / pixel

~1000 FLOPs / pixel
Estimated cost: 5x
“The SAGE Graphics Architecture”, Michael Deering and David Naegle, Proceedings of SIGGRAPH 2002
Running total
Cost   Feature              Notes
5x     Low latency          Frame rate 250-500 Hz
2x     Stereo               Two independent views
3x     Correct focus cues   Multi-plane display
2x     High dynamic range   Multi-resolution rendering
5x     Full field of view   110 deg (wide) x 80 deg (high)
15x    Foveal resolution    ½ arc minute
5x     Full-scene AA        16 samples / pixel, 5x5 pixel filter
Let’s sum it all up
Cost   Feature              Notes
5x     Low latency          Frame rate 250-500 Hz
2x     Stereo               Two independent views
3x     Correct focus cues   Multi-plane display
2x     High dynamic range   Multi-resolution rendering
5x     Full field of view   110 deg (wide) x 80 deg (high)
15x    Foveal resolution    ½ arc minute
5x     Full-scene AA        16 samples / pixel, 5x5 pixel filter
Total: 22,500x
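The total is the product of the individual cost factors, since each feature multiplies the work required by the others. A short check, with the dictionary keys taken from the feature names above:

```python
import math

cost_factors = {
    "Low latency": 5,
    "Stereo": 2,
    "Correct focus cues": 3,
    "High dynamic range": 2,
    "Full field of view": 5,
    "Foveal resolution": 15,
    "Full-scene AA": 5,
}

# Costs compound multiplicatively rather than adding up.
total = math.prod(cost_factors.values())
print(f"{total:,}x")
```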
This will keep GPU vendors busy ...
Multiple   2.2 CAGR   2.0 CAGR   1.8 CAGR
1000       9 years    10 years   12 years
5000       11 years   12 years   15 years
10000      12 years   13 years   16 years
50000      15 years   16 years   18 years
22,500x (between the 10000 and 50000 rows)
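The years in the table come from compound growth: the time to reach a multiple M at an annual growth factor g is log(M) / log(g). A sketch of that calculation follows; it returns fractional years, so the slide's whole-year entries may differ by about a year depending on how they were rounded (the function name is illustrative):

```python
import math

def years_to_reach(multiple, cagr):
    """Years of compound growth at factor `cagr` per year to reach `multiple`."""
    return math.log(multiple) / math.log(cagr)

for cagr in (2.2, 2.0, 1.8):
    print(cagr, round(years_to_reach(22500, cagr), 1))
```

At a 2.0 growth factor, reaching 22,500x takes roughly 14.5 years.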
Vision research
Almost all experimental vision research is now done using
computer graphics
There are research opportunities in this area:
http://bankslab.berkeley.edu/
Purpose of computer graphics?
Communication is the purpose
Human perception is the context

Techniques leverage visual perception abilities
Fidelity is a tool, not (necessarily) the goal

Virtual reality is great, but

Don’t want to be limited to reality

Want to do super reality

Non-photorealistic rendering (NPR) is valuable
- Bill Buxton, Sketching User Experiences, 2006
No apology is required for “approximations”

Especially for interactive graphics
Summary
Rule 1: All discontinuous frame-to-frame changes correspond to discontinuous scene or visibility changes
Assignments
Next lecture: Computational Photography (Marc Levoy)
Reading assignment: none (work on your projects)
End