Perception
Kurt Akeley
CS248 Lecture 18
29 November 2007
http://graphics.stanford.edu/courses/cs248-07/

Today
• This is the last for-credit lecture
  • Material from next week's lectures will not be tested
• Emphasize perception
  • Pull together and re-emphasize ideas from earlier lectures
  • Introduce some new ideas
  • Tie everything back to performance

CS248 Lecture 18 Kurt Akeley, Fall 2007

Optical quality of the eye
• Range of focus:
  • 5” to infinity (you)
  • 40” to infinity (me, corrected)
• What is the image of this (ideal) line?
[Figure: eye, with the fovea labeled. Image from www.wikipedia.com]

Retinal image of an ideal line
[Figure. Eye image from www.wikipedia.com]

Line spread function
[Figure]

Retinal image of a sine wave grating
• Lower contrast
[Figure. Eye image from www.wikipedia.com]

Modulation transfer function
[Figure]

Ricco’s Law
• Area and intensity are indistinguishable for objects that subtend less than (roughly) 6 arc min.
• This allows antialiasing to work
  • Especially fractional-width points and lines
• Antialiased pixels should subtend less than 6 arc min

Ricco’s Law and line spread (a coincidence?)
[Figure label: 6 arc min]

Spatial resolution of the eye
• Cone spacing in the fovea:
  • L and M cones: 0.5 arc min
  • S cones: 10 arc min
    • Thus the lower spectral response seen in the color theory lecture
• Nyquist frequency for foveal photopic vision is 60 cycles per degree (cpd)
  • Half the 120 cone/deg density
• Nyquist frequency is much lower outside the fovea
  • Effective receptor density falls to 1/20th that of the fovea
  • Rendering can take advantage of this
    • E.g., insets in flight-simulation graphics accelerators

No aliasing in foveal vision
[Figure labels: Peripheral Nyquist frequency (approximate); Foveal Nyquist frequency]

No aliasing in foveal S cones either
• Optics of the eye are substantially worse for 400 nm light
• MTF did not show this (it is an aggregate)

Vernier acuity
• Can detect an offset of 5 arc sec
• But sensor spacing is 30 arc sec
• How does this work?
  • Not due to random sensor locations (works with very short lines)
[Figure label: 5 arc sec]

How vernier acuity (probably) works
[Figure label: Cone spacing]

Display resolution
h = screen height
d = viewing distance
p = pixel count
θ = pixel angle

$\theta = \tan^{-1}\left(\frac{h}{p\,d}\right)$

$0.028^\circ = \tan^{-1}\left(\frac{12''}{1024 \times 24''}\right) = 1.68\ \mathrm{arc\ min}$

• Satisfies Ricco’s Law (less than 6 arc min)
[Figure labels: h, d, θ]

Matching foveal resolution
h = screen height
d = viewing distance
p = pixel count
θ = pixel angle

$\theta = \tan^{-1}\left(\frac{h}{p\,d}\right)$

$0.00833^\circ = \tan^{-1}\left(\frac{12''}{3438 \times 24''}\right) = 0.5\ \mathrm{arc\ min}$

• Foveal resolution

Flicker
• Flicker fusion threshold
  • Statistically 16 Hz
  • Increases
    • In peripheral vision
    • With brighter scenes
    • With viewer fatigue
  • Hence “jumping” numeric or CRT displays, when you aren’t looking directly at them
• Flicker rates:
  • Movies: 48 Hz (typical), 72 Hz (using computer displays)
  • Video: 60 Hz (US NTSC), 50 Hz (Europe and Asia, PAL)
  • Computer displays: 60-100 Hz (CRT), no flicker (LCD)
  • Fluorescent lights: 120 Hz (US), 100 Hz (Europe, Asia)

Frame rate vs.
flicker rate
• Increasing flicker rate above frame rate:
  • Avoids flicker-rate problems
  • But introduces visual artifacts
    • Image doubling (2x) or even tripling (3x)

  Media               Frame rate (Hz)   Flicker rate (Hz)
  Movie               24                48 or 72
  Television          25 or 30*         50 or 60
  Visual simulation   60                60

Interlaced displays
• Two fields per “frame”
  • Display odd lines in the first field
  • Display even lines in the second field
• “Frame” is misleading:
  • True interlaced sampling is “flying spot”
    • Each pixel is sampled and displayed at proportional times
    • Motion artifacts are avoided
  • Interlaced frames (e.g., video display of a movie)
    • All pixels are sampled at the same moment
    • But display is sequential, causing motion artifacts
• Still common in video
  • 1080i is standard
  • 1080p is becoming more common
  • Big battle during definition of HDTV!

Interlacing and antialiasing
• Small moving objects can disappear
  • Object subtends a single pixel
  • Fields are rendered properly (not from a single frame)
• One solution is antialiasing with a large filter kernel
  • Rendered objects necessarily subtend more than a single pixel
[Figure panels: Field n, Field n+1, Field n+2]

Color sequential displays
• Time-sequential red, green, blue (and sometimes white)
• Examples:
  • Many digital projectors
  • Professional head-mounted displays
• Should render each “frame” separately
  • Movies don’t
    • So time-sequential projectors yield “rainbow” effects
  • Simulation systems do
    • So motion artifacts are avoided

Mach banding – slope discontinuities
[Figure label: Same peak intensities]

Human response is not linear
• Twice as many photons/sec does not appear twice as bright
• Instead 5.7 times as many photons appear twice as bright
• Brightness (human perception) and intensity (actual photon rate) are related by Stevens’ Power Law:

$B = k \cdot I^{0.4}$

Human sensitivity is not linear either
• Can
distinguish intensity differences of 1%
  • Static images
  • Photopic (intensities bright enough for cones to see)
• This corresponds to a linear change in brightness

$\Delta B = k \cdot (1.01 \cdot I)^{0.4} - k \cdot I^{0.4} = k \cdot (0.01)^{0.4}$

Motion matters

Numeric representation
• Optimal numeric representation would arrange for adjacent intensities to be (barely) indistinguishable.
• Thus optimal numeric representation is nonlinear in intensity (relative differences of 1 percent) but linear in brightness (absolute differences of $k \cdot (0.01)^{0.4}$)

Contrast ratio
• Visible contrast:
  • 4-5 orders of magnitude within a scene (at the same time)
  • 6 orders of magnitude of “adaptation”
    • Can take up to 40 minutes, though

  Bits   Delta I per step   Delta I total
  8      1.0 %              12.6
  8      1.8 %              100
  10     0.7 %              1000

Solutions
• Brightness-linear storage
  • Use linear arithmetic (get incorrect answers)
  • Use non-linear arithmetic (get correct answers)
    • Convert to intensity-linear, operate, convert back
    • Implement nonlinear arithmetic
• Intensity-linear storage
  • Gamma correct (convert to brightness-linear form) when displaying

Brightness-linear storage
• Intensities can be added, brightnesses cannot
• Store image linear in brightness (unusual in 3-D systems)
  • Best use of available storage precision
  • 256 representable levels are enough
  • Requires conversion for each pixel operation (e.g., blend)
[Diagram: n bits → Gamma converter → 8 bits → 8-bit frame buffer → 8 bits → DAC → Display]

Intensity-linear storage
• Store image linear in intensity (typical in 3-D systems)
  • Native arithmetic format
  • Requires conversion during display
  • Large brightness steps at low intensities
  • 256 DAC levels is OK, but frame buffer needs more
[Diagram: n bits → n-bit frame buffer → n bits → Gamma converter → 8 bits → DAC → Display]

What is n?
Assume
• 8-bit DAC
• Gamma of 2.4

$\mathrm{output} = 255 \left(\frac{\mathrm{input}}{2^n - 1}\right)^{1/2.4}, \quad 1/2.4 \approx 0.4166$

  Table input   Output n=8   Output n=10   Output n=12   Output n=14   Output n=16
  2**n-1        255          255           255           255           255
  2**n-2        254          255           255           255           255
  …
  3             40           22            13            7             4
  2             34           19            11            6             3
  1             25           14            8             4             2
  0             0            0             0             0             0

Display gamut
• No finite set of primaries can reproduce the entire gamut.
• But more primaries do a better job.

Perception and Performance
(adapted from my VR2004 keynote)

Latency
• For an out-the-window display
  • 100 to 150 milliseconds
• For a head-mounted display
  • 5 to 15 milliseconds**
• Total response latency, sum of
  • Tracking/input delay, plus
  • Rendering delay, plus
  • Display delay
    • A 72 Hz display refreshes every 14 ms
** source: Fred Brooks

Latency solution
• Reduce system latency to 5-15 ms range
• Requires 2-4 ms frame time (250-500 Hz)
  • Assuming 3-frame latency
• Estimated cost: 5x

Running total
  Cost   Feature       Notes
  5x     Low latency   Frame rate 250-500 Hz

Stereo solution
• Binocular disparity is a very strong visual cue
• Must render separately for each eye
  • Occlusion
  • View-dependent lighting (e.g. reflections, specularity)
  • Alternatives tend to be hacks
• Estimated cost: 2x

Running total
  Cost   Feature       Notes
  5x     Low latency   Frame rate 250-500 Hz
  2x     Stereo        Two independent views

Incorrect retinal cue – blur gradient
[Figure panels: Correct, Incorrect]

Focus cue solution
• Multiple image plane display
  • Fixed relationship to viewer (e.g.
head mounted)
  • Low resolution in depth
  • Non-occluding images with depth filtering
  • Separate left and right displays (2x cost already accounted)
• Leverages 2D technology
• Amounts to a 2.5D display
• Cost estimate: 3x

Running total
  Cost   Feature              Notes
  5x     Low latency          Frame rate 250-500 Hz
  2x     Stereo               Two independent views
  3x     Correct focus cues   Multi-plane display

High Dynamic Range (HDR)
• Human limitations
  • Numbers from Sunnybrook Technologies
  • 1,000,000:1 range of sensitivity
  • 100,000:1 contrast within scene
• Current displays
  • CRT 300:1 contrast ratio
  • LCD 1000:1 contrast ratio
[Image: SIGGRAPH 2003 ET, Sunnybrook Technologies]

Sunnybrook Technologies
• Dual-density display
  • Conventional LCD panel in front (full-resolution)
  • White LED array used as back-light (~1/50 resolution)

Sunnybrook Technologies
• Scattering masks low resolution LEDs

HDR solution
• Requires 16-bit framebuffer components
  • Rendering
  • Blending
  • Full-scene anti-aliasing
• Requires multi-resolution rendering
  • Full-resolution for LCD, corrected for back-lighting
  • Low-resolution for back-lighting
• Estimated cost: 2x

Running total
  Cost   Feature              Notes
  5x     Low latency          Frame rate 250-500 Hz
  2x     Stereo               Two independent views
  3x     Correct focus cues   Multi-plane display
  2x     High dynamic range   Multi-resolution rendering

Field of view
• Human field of view (FOV)
  • Monocular: 160 deg (wide) x 135 deg (high)
  • Binocular: 200 deg (wide)
  • Binocular overlap: 120 deg (wide)
• Typical screen FOV
  • 55 deg (wide) x 41 deg (high)
[Figure labels: d, d]

Optical flow matters
“Women Go With the (Optical) Flow”, Desney S. Tan, Mary Czerwinski, George Robertson.
http://research.microsoft.com/users/marycz/chi2003flow.pdf

FOV solution
• Double horizontal FOV to 110 degrees
• Double vertical FOV to 80 degrees
• Cleverness to distribute resolution
  • e.g. cylindrical projection
• Estimated cost: 5x

Pixels subtend different angles
• Assumes planar display
[Plot: series “Center pixel” and “Edge Pixel”; horizontal axis: Field of view, 0 to 140]

Running total
  Cost   Feature              Notes
  5x     Low latency          Frame rate 250-500 Hz
  2x     Stereo               Two independent views
  3x     Correct focus cues   Multi-plane display
  2x     High dynamic range   Multi-resolution rendering
  5x     Full field of view   110 deg (wide) x 80 deg (high)

Foveal resolution
• Foveal sampling density is ½ arc min
  • Display pixel should subtend ½ arc min
• Typical monitor pixel subtends 2 arc min
  • 1600 pixels at (dist = width)
• IBM T221 (aka Big Bertha) LCD Display
  • Resolution: 3840 (wide) x 2400 (high)
  • Dimensions: 19” (wide) x 12” (high)
• Estimated cost: 15x

Running total
  Cost   Feature              Notes
  5x     Low latency          Frame rate 250-500 Hz
  2x     Stereo               Two independent views
  3x     Correct focus cues   Multi-plane display
  2x     High dynamic range   Multi-resolution rendering
  5x     Full field of view   110 deg (wide) x 80 deg (high)
  15x    Foveal resolution    ½ arc min

Full-scene antialiasing
• SAGE
  • Render
    • 16 samples / pixel
  • Reconstruction
    • 5x5 pixel filter
    • 400 samples / pixel
    • ~1000 FLOPs / pixel
• Estimated cost: 5x
“The SAGE Graphics Architecture”, Michael Deering and David Naegle, Proceedings of SIGGRAPH 2002

Running total
Cost: 5x; 2x; 3x; 2x; 5x; 15x; 5x
Feature: Low latency; Stereo; Correct focus cues; High dynamic range; Full field of view; Foveal resolution; Full-scene AA
Notes: Frame rate 250-500 Hz; Two independent views; Multi-plane display; Multi-resolution rendering; 110 deg (wide) x
80 deg (high); ½ arc minute; 16 samples / pixel, 5x5 pixel filter

Let’s sum it all up
  Cost   Feature              Notes
  5x     Low latency          Frame rate 250-500 Hz
  2x     Stereo               Two independent views
  3x     Correct focus cues   Multi-plane display
  2x     High dynamic range   Multi-resolution rendering
  5x     Full field of view   110 deg (wide) x 80 deg (high)
  15x    Foveal resolution    ½ arc minute
  5x     Full-scene AA        16 samples / pixel, 5x5 pixel filter
  Total: 22,500x

This will keep GPU vendors busy ...
  Multiple   2.2 CAGR   2.0 CAGR   1.8 CAGR
  1000       9 years    10 years   12 years
  5000       11 years   12 years   15 years
  10000      12 years   13 years   16 years
  50000      15 years   16 years   18 years
[Marked on the table: 22,500x]

Vision research
• Almost all experimental vision research is now done using computer graphics
• There are research opportunities in this area:
  http://bankslab.berkeley.edu/

Purpose of computer graphics?
• Communication is the purpose
• Human perception is the context
  • Techniques leverage visual perception abilities
• Fidelity is a tool, not (necessarily) the goal
  • Virtual reality is great, but
    • Don’t want to be limited to reality
    • Want to do super reality
  • Non-photorealistic rendering (NPR) is valuable
    – Bill Buxton, Sketching User Experiences, 2006
• No apology is required for “approximations”
  • Especially for interactive graphics

Summary
• Rule 1: All discontinuous frame-to-frame changes correspond to discontinuous scene or visibility changes

Assignments
• Next lecture: Computational Photography (Marc Levoy)
• Reading assignment: none (work on your projects)

End
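
Supplementary sketch: the arithmetic behind the “Let’s sum it all up” and “This will keep GPU vendors busy” slides can be checked with a few lines of Python. This is a minimal sketch, not the lecturer's own calculation. The names `COST_FACTORS`, `total_cost`, and `years_to_reach` are hypothetical, and the growth model (performance multiplies by a constant factor each year, which is how “CAGR” is read here) is an assumption. The slides do not state how their year counts were rounded, so the computed values need not match the table entry for entry.

```python
import math

# Per-feature cost multipliers, copied from the "Let's sum it all up" slide.
COST_FACTORS = {
    "Low latency": 5,
    "Stereo": 2,
    "Correct focus cues": 3,
    "High dynamic range": 2,
    "Full field of view": 5,
    "Foveal resolution": 15,
    "Full-scene AA": 5,
}


def total_cost(factors):
    """Multiply the per-feature factors; the costs compound rather than add."""
    return math.prod(factors.values())


def years_to_reach(multiple, annual_growth):
    """Years of compound growth needed to reach `multiple`.

    Assumption: performance grows by the constant factor `annual_growth`
    every year, so multiple = annual_growth ** years.
    """
    return math.log(multiple) / math.log(annual_growth)


if __name__ == "__main__":
    total = total_cost(COST_FACTORS)
    print(f"combined cost: {total}x")
    for rate in (2.2, 2.0, 1.8):
        years = years_to_reach(total, rate)
        print(f"growth of {rate}x per year: {years:.1f} years")
```

Under these assumptions the combined factor matches the slide's 22,500x, and reaching it takes between about 13 and 17 years across the three growth rates in the table.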