A Young Mathematician’s Reflection on Vision: Illposedness & Regularizations Jackie (Jianhong) Shen School of Mathematics University of Minnesota, Minneapolis, MN 55455 Workshop on Regularization in Statistics Banff International Research Station, Canada; September 6-11, 2003 Abstract Vision, the perception of the 3-D world from its 2-D partial projections onto the left and right retinas, is fundamentally an illposed inverse problem. But after millions of years' of evolution, human vision has become astonishingly accurate and satisfactory. How could it have become such a remarkable inverse problem solver, and what are the hidden (or subconscious) regularization techniques it employs? This talk attempts to reflect and shed some light on these “billion-dollar” or Nobel-level questions (the works of several Nobel Laureates will be mentioned indeed), based on the limited but unique experience and philosophy of a young mathematician. In particular, we discuss the topological, geometric, statistical/Bayesian, and psychological regularization techniques of visual perception. I owe deep appreciation to my former advisors, mentors, and teachers leading me to the Door of Vision : Professors Gil Strang, Tony Chan, Stan Osher, David Mumford, Stu Geman, and Dan Kersten. 2 Agenda An Abstract View of Vision: Imaging and Perception Illposedness of Visual Perception: Root and Solutions Conscious or Subconscious Regularizations 3 Topological Regularization: Generic Viewpoint Principle Geometric Regularization: Perception & Role of Curvature Statistical Regularization: Gibbs Fields and Learning Psychological Regularization: Role of Weber’s Law. Dedication “ Dedicated to all pioneering mathematicians in Mathematical Image and Vision Analysis (Miva), on whose shoulders we the younger generations are standing, and on whose shoulders, we the younger must think deeper, speak louder, and look further beyond …” - Jackie Shen 4 An Abstract View of Vision : Passive Imaging & Active Visual Perception Biological or Digital (Passive) Imaging Process I: Illuminance or incident light G: 3-D surface geometry & topology R: Reflectance or material property Passive Imaging Process: Optical imaging process (human vision or digital camera) can be modeled as a function (or operator): u( x 2 ) F[G( x 3 ), R( x 3 ); a1, a2 ,, an ] Here, 2 and 3 indicate the dimension of the spatial variables. G and R denote the configuration (scene | geometry) and the reflectance. All the a’s are parameters such as I and q. 6 Active Visual Perception I: Illuminance or incident light G: 3-D surface geometry & topology R: Reflectance or material property Active Visual Perception: Perception is to reconstruct the 3-D world (geometry, topology, material surface properties, light source, etc.) from the observed 2-D image: [G( x 3 ), R( x 3 ); a1 , a2 ,, an ] V [u( x 2 )], [G( x 3 ),R( x 3 )] V [u( x 2 ); a1 , a2 ,, an ]. 7 or A Quick Overview of Mathematicians’ Missions Develop mathematical formulations for all the important psychological and cognitive discoveries in visual perception. As in geometry and physics (Hamiltonian, Statistical or Quantum Mechanics), unify and extract the most fundamental laws or axioms of perception, and develop their mathematical foundations. Decode the computational and information-theoretic efficiency behind human visual perception, and integrate such knowledge into digital and computational decision and optimization algorithms. Develop novel mathematical tools and theories arising from such studies, and apply them to other scientific and engineering fields, such as data visualization/mining; pattern recognition/learning/coding; multiscale modeling. 8 Is It Risky for a Young Mathematician to Study Perception? Quoting Albert Einstein: “The object of all science, whether natural science or psychology , is to co-ordinate our experiences and to bring them into a logical system.” - from “Space and Time in Pre-Relativity Physics,” May, 1921. To me, as for Special or General Relativities, being logical necessarily means being mathematical : o Cause-effect modeling and description (e.g., learning theory); o o o o 9 Formulating basic psychological/perceptual laws (e.g. Weber); Simulating brain computation using computers; Verifying existing data, and furnishing reasonable predictions. etc. Illposedness of Visual Perception: Root of Illposedness Solutions to Illposedness Vision is an Inverse Computer Graphics Problem Dan Kersten (1997) 3-D Computer Graphics: (Hollywood animated movies) G: Geometric configuration of a 3-D scene (bike, table,…) R: Material surface property: reflectance field. I : ILLUMINANCE (incident lights, light source and type). Goal: Generate a 2-D image U, which looks exactly like the image that one sees if facing such a real 3-D scene. Visual Perception is an Inverse C.G. Problem (Kersten): Given a 2-D image U, one attempts to reconstruct its 3-D world: G, R, I, viewing position & angle, etc. 11 Vision is an illposed Inverse Problem (A-1) A. Geometry is not invertible: depth or range is lost ! Mathematical Model (1): Projective Imaging P: (x1, x2, x3) (x1, x2) is not invertible! For any given 2-D curve g2, there are infinitely many 3-D curves g3, so that P(g3) = g2 . Example. g3 = (cos t, sin t, t ) g2 is just like the projection of a circle 12 Vision is an illposed Inverse Problem (A-2) Mathematical Model (2): Perspective Imaging a hyperbola a parabola an ellipse Imaging plane (or retina) All three different types of curves are imaged as a 2-D circle ! 13 Vision is an illposed Inverse Problem (B) B. The Reflectance-Illuminance entanglement. R I different 3-D scenes R I 14 identical images ! But vision still makes sense, after millions of years’ evolution , which implies that: The human vision system is a well developed system of software and hardware, which can solve this highly illposed inverse problem efficiently and robustly. Fundamental Questions: What kind of regularization techniques the human vision system employs to conquer the illposedness? What kind of features or variables to regularize ? 15 Deterministic Model: Tikhonov Regularization Tikhonov conditioning technique for inverse problems: forward data generating: U0 = F(X). backward (inverse) reconstruction of X: min d [ F(X), U0 ] + R(X), where d is a suitable discrepancy metric, and R is the regularizing/conditioning term. Example. (Cleaning and sharpening of astronomical images) U0 = h * X + n . (h : atmosphere blurring ; n: white noise) Such an inverse problem is typically solved by min X 16 2 (U 0 h X ) 2 dx | X | dx, 1. Tikhonov Meets Bayes: Perception as Inference Bayesian Perception: X = (geometry G, reflectance R, …). Prob (U 0 | X ) Prob ( X ) max Prob ( X | U 0 ) X Prob (U 0 ) Gibbs’ energy formula: Prob (y) = 1/Z exp ( - E[y] / k T ). Thus, in terms of energies, we have min E[ X | U 0 ] E[U 0 | X ] E[ X ] const , X which leads to Tikhonov Inversion ! Conclusion: It is our a priori knowledge of the world (i.e., Prob [X ] ) that regularizes our visual perception! 17 A Priori Knowledge (Common Sense) of the World Regularizes Vision G: Knowledge of curve and shape geometry; I: Knowledge of light sources & illuminance (sun, lamps, indoor or outdoor, …); R: Knowledge of materials (wood, bricks, …) and surface reflectance (metal shines and water sparkles, …) Q : Knowledge of the viewers (often standing perpendicularly to the ground, viewing more horizontally, several feet away for indoor scenes, …) …… 18 Example I: Prior Knowledge on Curves & Boundaries Straight lines: length(g), or 1-D Hausdorff. Euler’s elastica (Mumford [1994], Chan-Kang-Shen [2002]): g ( a b 2 ) ds. Piecewise elastica (Shah [2001]): e[g , S ] g \S (a b 2 )ds c # S , which allows corners or hinges (i.e. the discrete set S) along the curve. 19 Example II. Prior Knowledge on Material Reflectance Reflectance distribution R(x) in the visual field The Mumford-Shah [1989] free-boundary model: \g | R | 2 dx e[g ]. curve energy Many challenging free-boundary problems. Connections to interface motions (such as the mean curvature motion). 20 Deeper Question: What to Learn & How How does the vision system subconsciously choose what knowledge to learn and store, out of massive visual data in daily life? For such knowledge, to which degree of regularity or compression that the vision system “decides” to process, in order to achieve maximum efficiency and robustness? How to mathematically model (or quantify) such activities? Partial answer by Nobel Laureate (1972) Gerald Edelman: Neural Darwinism: (Basic Books, Inc., 1987) The vision system in each individual operates as a selective system resembling natural section in evolution, but operating by different mechanisms. 21 Topological Regularization: Principle of Generic Viewpoint vs. Theorem of Transversality in Differential Topology Transversal Phenomenon 1 3-D world 2 projection Digital or Biological 2-D Retina A B g1 C g2 g 1 and g 2 are transversal, in the sense of differential topology: g 1 and g 2 are not tangent at any point A g 1 g 2 . 23 Transversal Is Generic: Stability & Universality Suppose that (2, 3 are dimensions, and P is the projection) x 2 (t ) P[ x 3 (t )] and y 2 (t ) P[ y 3 (t )] are transversal. Then for any perturbation Pe to P , within an e distance under a suitable smoothness norm, xe2 (t ) Pe [ x 3 (t )] and ye2 (t ) Pe [ y 3 (t )] are still transversal. Stability or Openness Theorem (Milnor): Let M and N be two smooth manifolds, and P N submanifold. Then any smooth mapping f: M N can be approximated arbitrarily closely by one which is transversal to P. Universality or Denseness P M N 24 Principle of Generic Viewpoint in Vision Generic Viewpoint: (Nakayama & Shimojo, Science [1992]; Freeman, Nature [1994]) What we see is generic w.r.t. the viewpoint ! Mathematical Model: Suppose we see an image u( x 2 ) F[G( x 3 ), R( x 3 ); a1, a2 ,, an ] Then lim e 0 D{u ( x 2 ), F [G ( x 3 ), R( x 3 ); a1e , a2e ,, ane ]} 0, where ake ' s denote small perturbations of the a’s, controlled by e, and D is a vision motivated measure, metric or distance. In terms of curves, D can be based on invariants of differential topology: smoothness, corners, connected components, branching, … 25 How G.V. Principle Regularizes Perception: 3 Examples How the Generic Viewpoint Principle regularizes visual perception (conditioning the illposedness): Set up preference/biases among all possible scenes. (Uniqueness) Lead to topologically unique solutions. (Stability) Lead to the stability of perceptual solutions. 26 Example 1: Curve Perception 2 1 A pitchfork Two curves g Given the image g , which 3-D scene (curve) satisfies the G.V. Principle? Answer: 1. For 2, when the viewing angle q is perturbed a little bit, the image would be: OR which is not diffeomorphic to g (since branching degree d and number of connected components c are invariants). Take the metric D, for example, D = difference in d + difference in c + other visual metric terms. 27 Example 2: Shape Perception A painted planar plate a uniform V-wing u Given the image u, which 3-D scenario satisfies the G. V. Principle? Answer: Planar Plate. If the viewer moves a little bit (i.e. viewing angle q qe), the image of the V-wing scene would become ue The two 2-D shapes are NOT diffeomorphic since corner is an invariant. (This is because: if f is a diffeomorphism, then the Jacobian Jf is non-singular, and a corner can never be smoothed out leading to a straight line. 28 Example 3: shape perception : Simian or Human u v d d=0 For image u, we “see” two human faces. No doubt. If we move the face contours closer till touching (image v), do we still percept two human faces? or two simians? Psychologists show that most of us percept the latter. WHY? My explanation based on G. V. Principle: In 3-D, give the two faces a slight distance difference (from the viewer). Then the perception of two human faces violates the G.V. Principle, while the perception of two (100% transparent except along the outlines) simian faces satisfies the G.V. Principle, or the Stability Principle. [Classical Interpretation is based on Gestalt.] 29 Two simian faces Two human faces Dan Kersten, Vision Psychology, UMN Geometric Regularization: Perception and Roles of Curvature Why Curvature: (A) Architectural Evidence St. Louis Arch, USA Eiffel Tower, Paris MIT’s Big Dome, Cambridge, USA If human vision were blind to curvature, architects would not have bothered to build these beautiful structures ! (Disclaimer: All pictures are Internet downloads, not taken by the author.) 31 Why Curvature: (B) Anatomic Evidence The work of Nobel Laureates (1981) David Hubel & Torstern Wiesel (1962): x Visual cortical surface y In my opinion, this provides the most intimate evidence of visual processing of the curvature information. WHY ? Because 50 mm Curvature= Spatial Organization of Orientations: dq/ds ! Down to the interior z Quantized array of cortical visual neurons and their oriented receptive fields 32 Why Curvature: (C) Heuristic Mathematical Argument Taylor expansion for a curve: x( s) x(0) x (0)s x(0)s 2 / 2 . x (0) s Question: Which orders do matter to human visual perception? Answer: Like Newtonian mechanics, only up to the second order. Heuristic Mathematical Argument: First order matters: since we can detect rotation! x (0) x (0) ev is like a rotation Re x( s) if v x (0). Second order matters: since we have a strong sense of concavity/convexity/inflection. Third order? x( s) s 2 , s [1,1]. x( s) s 2 0.1s3 , s [1,1]. 33 What Do All These Mean? The proceeding evidences clearly demonstrate the significance, feasibility, legitimacy of curvature processing in visual perception. So what? It implies to: We focus on this point. Employ curvature as a deterministic regularizer for solving numerous ill-posed inverse problems in image and vision analysis ( Mumford’s proposal of Euler’s elastica regularizer (1994), also Masnou-Morel (1998, ICIP ), Chan-Kang-Shen, (2002, SJAP ), Esedoglu-Shen (2002, EJAP ) ); Encode curvature in data structure and organization (e.g., Candes-Donoho’s curvelets (1999; 2002, IEEE Trans. IP )); Develop and study curvature-based mathematical models (e.g., Chan-Shen’s CDD 3rd nonlinear PDE model (2001, JVCIR ), Esedoglu- Shen’s proposal of Mumford-Shah-Euler model (2002, EJAP ); Euclidean and affine invariant scale spaces (Alvarez-Guichard-Lions-Morel, 1993; CalabiOlver-Tannenbaum, 1996); mean curvature motions (Evans-Spruck, 1991)). 34 Perceptual Interpolation of Missing Boundaries np Occluding object nq p q np g p nq q Problem: Find a curve g (t), 0<t<1, which passes through the two endpoints p and q , and looks “natural.” Vision background: Most objects in our material world are not transparent. Occlusion is universal. Human vision must be able to integrate the dissected information (by suitable interpolation), in order to successfully perceive the world. Why regularization is needed: (1) (non-uniqueness) otherwise too many curves to choose from: Brownian paths, say; (2) to regularize is to properly model “being natural.” 35 Second Order Polynomial Regularization Modeling “Being Natural” g (0) p, g (1) q; and n p , g(0) 0, nq , g(1) 0. np g p nq q Let us try polynomial regularization: 2 3 g (t ) a b t c t dt ... But which order is generically sufficient? Count the constraints: in 2-D, totally 2+2+1+1=6 conditions. n Theorem. As long as p nq 0 , there exists a unique parabolic interpolant in the form of g (t ) qt 2b t(1 t ) p(1 t )2 . 2 This solution echoes the earlier claim of second order sufficiency 36 Second Order Geometric Regularization: Euler’s Elastica g (0) p, g (1) q; and n p , g(0) 0, nq , g(1) 0. Mumford [1994] np g nq p q Variational approach: under these constraints, to minimize g ( a b 2 ) ds, called Euler’s elastica. Energy was studied by Euler (1744) to model the steady shape of a thin and torsion free elastica rod. The equilibrium equation: ' ' ( s) a 2 with the first integral ds 2b 2d S R 2 4 which is an elliptic integral, and the solution can be expressed by elliptic functions (Mumford [1994]). 37 , Image Interpolation or Inpainting Inpainting is an artistic way of saying Image Interpolation (as first used in IP by Bertalmio et al (SIGGRAPH, 2000)); Partial image information loss is very common: Occlusion caused by non-transparent objects; Data loss in wireless transmission; Cracks in ancient paintings due to pigment aging/weather; Insufficient number of image acquisition sensors. etc Example I: Occlusion 38 Example II: Cracks Read Shen (SIAM News, 36(5), Inpainting and Fundamental Problem of IP, 2003) Chan-Shen’s Inpainting Model via BV Regularizer Chan-Shen (SIAM J. Appl. Math., 62(3), 2001) u 0 | \ D given missing on D Existence is guaranteed, but uniqueness is not. The TV inpainting model: BV regularizer min E[u | u 0 , D] | Du | u 2 \ D | Ku u 0 |2 dx, least square (for uniform Gaussian noise) The associated formal Euler-Lagrange equation on : u 0 u t 0 K D ( x )( K u u ), Fidelity Index D ( x) 1 \ D ( x). with Neumann adiabatic condition along the boundary of . 39 Formally looks very similar to Rudin-Osher-Fatemi’s denoising model ! Chan-Shen’s Model for Inpainting Noisy & Blurry Images Chan-Shen (AMS Contemporary Math., 2002) movie forever 40 Suppose K=Gt, is the Gaussian kernel. Then, the model gives a good inverting of heat diffusion. Without the BV regularization, backward diffusion is notoriously ill-posed. The BV Regularizer is Insufficient for Inpainting (Kanisza, Nitzberg-Mumford, Chan-Shen) w Long distance connection is too expensive for the TV cost ! l Cheaper to simply break it. We need curvature! When l >> w 41 Human perception BV breaks it Lifting Curve Regularity to Image Regularity Using the level-sets of an image, we can “lift” a curve model to an image model (formally; theoretical study by Bellettini, et al., 1992): E2 [u ] D 1 0 Curvature of level sets ( a b 2 ) | u | dx d g :u u | u ( a b 2 ) ds n | Connection to the mean curvature flow (Evans-Spruck, 1991): ut ( ij d dt 42 D u xi u x j | u | 2 )u xi u x j , | u | dx 2 | u | dx. D Elastica Inpainting Model: Curvature Incorporated Chan-Kang-Shen (SJAP, 2002; also ref. to Masnou-Morel, 1998 ) E[u | u 0 , D] ( ) | u |dx 2 \ D n is the curvature, ( ) a b 2 . | u u 0 |2 dx, Theorem (associated E.-L. PDE). t The gradient descent flow is given by u V D ( x)(u 0 u ), t t ( ' ( ) | u |) V ( ) n , | u | t n level sets of u where V is called the flux field , with proper boundary conditions. 43 Elastica Inpainting: Nonlinear Tranport + CDD Chan-Kang-Shen (SJAP, 2002 ) Transport along the isophotes (level sets): t ( ' ( ) | u |) Tangential component Vt , | u | t t n Curvature driven diffusion (CDD) across the isophotes: Normal component u Vn ( )n ( ) . | u | Conclusion: Elastica inpainting unifies the earlier work of Bertalmio, Sapiro, Caselles, and Ballester (SIGGRAPH, 2000) on transport based inpainting, and that of Chan and Shen (JVCIR, 2001) on CDD inpainting (motivated by human visual perception). 44 level sets Elastica Inpainting. I. Smoother Completion Effect 1: as b/a increases, connection becomes smoother. Chan-Kang-Shen (SJAP, 2002 ) 45 Euler' s elastica : ( ) a b 2 . Elastica Inpainting. II. Long Distance is Cheaper Effect 2: as b/a increases, long distance connection gets cheaper. Euler' s elastica : ( ) a b 2 . For more theoretical and computational (4th order nonlinear!) details, please see Chan-Kang-Shen (SJAP, 63(2), 2002). 46 Inpainting Regularized by Mumford-Shah Regulerizer Chan-Shen (SJAP, 2000), Tsai-Yezzi-Willsky (2001), Esedoglu-Shen (EJAP, 2002) The Mumford-Shah (1989) image model was initially designed for the segmentation application: Ems [u , ] E[u | ] E[] g 2 \ | u |2 dx length (), Mumford-Shah based inpainting is to minimize: Ems [u, | u 0 , D, K ] Ems [u, ] E[u 0 | u, , D, K ]. Inpainting domain possible blurring A free boundary optimization problem. 47 Mumford-Shah Inpainting: Algorithm Esedoglu-Shen (Europ. J. Appl. Math., 2002) For the current guess of edge completion , find u to minimize Ems [u | , u 0 , D] E[u | ] E[u 0 | u, , D] const . equivalent to solving the elliptic equation on \: gu D ( x)(u 0 u ) 0. This updated guess of dx dt u then guides the motion of : ( [ R] )n. M. C. Motion R+ R- Jump across of the roughness measure R g2 | u |2 2D (u u 0 ) 2 . [We can then benefit from the level-set implementation by Chan-Vese.] 48 M.-S. Inpainting via -Convergence Approximation Esedoglu-Shen (Europ. J. Appl. Math., 2002) The -convergence approximation of Ambrosio-Tortorelli (1990): Ems [u , ] g g 2 \ | u |2 dx length (), (1 z ) 2 2 Ee [u, z ] ( z o(e )) | u | dx e | z | 2 4e 2 2 dx. z=1 z=0 edge is approximated by a signature function z. 49 Esedoglu-Shen shows that inpainting is the perfect market for -convergence Leading to Simple Elliptic Implementation Esedoglu-Shen (2002) u 0 | \ D given missing on D The associated equilibrium PDEs are two coupled elliptic equations for u and z, with Neuman boundary conditions: D ( x)(u u 0 ) g ( z 2u ) 0, z 1 (g | u | ) z 2ez 0, 2e 2 which can be solved numerically by any efficient elliptic solver. 50 Applications: Disocclusion and Text Removal Esedoglu-Shen (2002) 51 inpainted u the edge signature z Inpainting domain inpainted u Insufficiency of Mumford-Shah Regularity for Inpainting Defect I: Artificial corners Defect II: Fail to realize the Connectivity Principle, like BV. 52 Esedoglu-Shen’s Proposal: Mumford-Shah-Euler Esedoglu-Shen (2002) Idea: change the straight-line curve model embedded in the Mumford-Shah model to Euler’s elastica: Emse [u, ] g 2 \ | u |2 dx e(), e() length () 2 ds, the elastica energy. The -convergence approximation (conjecture) of De Giorgi (1991): Ee [u , z ] g W ( z) 2 2 2 ( z o ( e )) | u | dx e | z | dx 2 4e e 2 W ' ( z) 2 e z dx, 4e W ( z ) (1 z ) 2 (1 z ) 2 is the double - well potential. For the technical and computational details, please see Esedoglu-Shen. 53 Inpainting Based on Mumford-Shah-Euler Regularity Esedoglu-Shen (2002) Issues: • Very costly: 4th order PDE • Many local minima • Without approximation, it is difficult to implement geometry 54 Conclusion for Curvature Based Regularization Curvature processing has its anatomical vision foundation Curvature processing has its cognitive/perceptual foundation Curvature based regularization is necessary for Image analysis and coding Image processing and modeling Image computation: algorithms and implementation schemes Curvature imposes both theoretical & computational challenges: non-quadratic objective functionals; nonlinear 3rd and 4th order PDEs; insufficient theory for wellposedness (existence, uniqueness, definition domains, etc.) … 55 Statistical Regularization: Gibbs Fields and Learning Stochastic View of Images: Random Fields Geman-Geman (1984), Grenander (1993), Zhu-Mumford (1997) An observed image u is treated as a randomly drawn sample. The sample space is an ensemble of images with its own (often unknown) distribution. An ensemble equipped with a probability distribution m or p naturally leads to a random field (R. F.) on the image domain. Two distinguished features of such stochastic view: We are not interested in the pixelwise details of a particular image, rather, the key statistical features of the R. F. associated. The R. F. is assumed to be ergodic : for a typical sample image u, spatial statistics converge to ensemble/field statistics. In this view, image regularization is built into the distribution m . 57 Stochastic View of Images: Examples (Shen-Jung, 2003) (Shen-Jung, 2003) Reaction-diffusion spots Reaction-diffusion stripes Wood texture (Internet download) Crops (Internet download) 58 Stochastic Regularization: Gibbs Fields Geman-Geman (1984), Grenander (1993), Zhu-Mumford (1997) Stochastic regularity is encoded or specified by the random field distribution p (u ) . If p is a uniform distribution, then there is not much tangible “information” associated to the images, or in Shannon’s language, the information content is the lowest (equivalently, no regularity is in presence as the entropy reaches highest). In Grenander’s (founder of Pattern Theory, Brown U) language, textures are built from basic building elements (“atoms”), and these “atoms,” like molecules, are bound together by local regularity energies, leading to informative images. Thus Gibbs Fields Model seems natural to characterize such stochastic regularity: Gibbs Canonical Ensemble/Formula: p(u ) 1 Z exp( E[u ] / kT ) partition function 59 “regularity energy” visual “temperature” Regularization: Visual Potentials and Their Duals What can a box of air (molecules) tell us: Microscopically never stops fluctuating (lacking regularity) Macroscopically there are a few key feature potentials: o Temperature T o Pressure p o Chemical potentials m These potentials have their dual (additive) variables o Energy E, dual to the inverse temperature 1/kT. o Volume V, dual to the pressure p. o Mole numbers N, dual to the chemical potential m. Gibbs Generalized Ensemble Model says, Prob (a micro state) = 1/Z .exp(- E[.] - p V[.] + m N[.] ). Conclusion: to apply Gibbs regularity in image analysis, one needs to properly identify visually meaningful potentials and their duals. (These feature parameters will regularize images.) 60 Example of Gibbs Canonical Images: Binary Ising Images Ising’s Model (1925, for ferromagnetic phase transition originally): Binary images on the Cartesian lattice Z2. u=u(i,j)=+1 or –1 (spin up or down). In an external (biasing) magnetic field H , the energy associated to each observed field u is: E[u ] Jk H u ( a ) u ( b ) k a ~b u(c) c o J : internal energy of magnetic dipole (neighbor pair) o a, b, c : representing general pixels, ~: neighboring. But generic (natural) images are not generated from such clean physics. Challenge is to develop good models for (generalized) energies, and properly model short-range or long-range visual interactions. 61 What Features to Regularize: Visual Filters Zhu-Mumford (1997) Treat the human vision system as a system of linear filters: T= (F1 , F2 , …, FN ). Each filter Fn is characterized by its special capability in resolving a particular orientation q with a particular spatial frequency w at a particular scale s . That is, Fn’s are parametrized by the feature vector ( q, w, s ). Basic Assumption: To human vision, the random image fields are completely describable (at least in satisfactory approximation) by the filter response T[u]. All other features are blindly filtered out, and treated visually insignificant. [ This is the hidden rule of visual regularization!! ] Justification (from Statistical Mechanics): Though the dimension of the phase space of a box of air molecules is huge, in equilibrium such a system can be accurately described by three feature parameters: temperature, pressure, and volume. 62 From Visual Filters to Potentials: Maximum Entropy Learning of Zhu-Mumford Zhu-Mumford (1997) Following the proceeding basic assumption, a Gibbs image model would be ONLY based on the statistics of the filter outputs: T[u] = ( F1 [u], F2 [u], … , FN [u] ). But how exactly? Zhu-Mumford’s MEL Model/Scheme: Each filter output vn = Fn [u] is by itself a random field. In the ideal case, suppose we do know the random field distributions: qn ( vn ), n =1:N. Then, these can be used as a set of constraints on the original Gibbs image u. That is, p=p(u) should lead to Prob( vn ) = qn ( vn ), n =1:N. Under these set of constraints, one can use Gibbs variational formulation of Statistical Mechanics to find the unique Gibbs field that maximizes the entropy, which necessarily takes the form of p(u ) 1 Z exp (- n ,v n (n, v n ) F n [ u ]v n ). Dirac’s delta In reality, the spatial structure of vn is often ignored and each is treated as a field of i.i.d.’s. Thus the joint p.d.f qn is a direct tensor product, and is replaced by its 1-D histogram. 63 Learning of Regularization Will Be Momentous … A brief conclusion: Statistical regularization is often achieved through visually meaningful filters and filtering processes. The Gibbs field model can be learned based on the empirical statistics (e.g. histograms) of these filter outputs of a typical image sample, and the ergodicity assumption is thus crucial in such learning processes. Maximum Entropy based Learning Processes (Melp) will play an increasingly important role in vision and pattern analysis. 64 Psychological Regularization: Role of Weber’s Law: A Case Study Weber’s Law Shen (Physica D, 175, 2003) Weber’s Law in Psychology: “Let u denote the mean field of a background (sound/light), and u the intensity increment just detectable by human perception (ears/eyes) [or the so-called JND: justnoticeable-difference]. Then u/u = constant.” First qualitatively described by German physiologist E. H. Weber in 1834; Later formulated quantitatively by the great experimental psychologist G. T. Fechner in 1858. Search your own experience for validating Weber’s Law: In a fully packed stadium: high bgsound u => have to cry loud to be effectively heard by other folks (i.e., u has to be high as well); “月明星稀” (An ancient Chinese idiom): translated to “In a night with a bright full moon, the stars always look scarce.” 66 Is Weber’s Law Psychological or Physiological ? James Keener (Math. Physiology, 1998) Jackie Shen’s Theorem: Any commonly shared psychological phenomenon (among most of the 6,000,000,000 people on this planet) has to be physiological. Weber’s Law expresses the light adaptive capability of the frontal end of the entire vision system – the two retinas. Without Weber’s Law, the retinas would not be able to operate over a wide range of light intensities: from several photons to bright solar light, since the membrane potential V of a neuron cell has a saturated maximum value. Weber’s Law is the result of a feedback mechanism (TranchinaPerskin, 1988) of the retina system, which is “implemented” by the biochemical physiology (ion channels) of the photoreceptors (McNaughton, 1990). 67 Weber’s Law: Regularization Can Respect Real Perception Shen (Physica D, vol. 175, 2003) What does Weber’s Law have to do with regularization? Before Shen (2003), all variational image regularizers in image and vision analysis are defined by either the Sobolev norm | u |2 dx as in the Linear Filtering Theory, and the Mumford-Shah model (1989), or the Total Variation (TV) Radon measure | Du | as in the Rudin-Osher-Fatemi model (Physica D, 1992). The fundamental assumption of such regularizers is that “human visual sensitivity to small fluctuations (or irregularities) depends on nothing else but themselves.” But this is inappropriate according to Weber’s Law ! 68 Weberized Regularization and Applications Shen (Physica D, vol. 175, 2003) Thus Shen (2003) proposed to Weberize (a word conveniently coined) the conventional Sobolev or TV regularizers to or, | u | 2 dx u2 | Du | u For example, the Weberized TV denoising and deblurring model (for additive Gaussian noise) would be to minimize the energy | Du | 0 0 2 E[u | u , K ] u 2 | Ku u | dx, where the light intensity field (or image) u is non-negative. 69 Existence and Uniqueness of Weberization Shen (Physica D, vol. 175, 2003) E[u | u 0 ] | Du | u 2 | u u 0 | 2 dx Admissible space of u: D {u 0 | u L2 (), TV (ln u ) , u 12 u 0 } Existence Theorem: Assume that 0 u 0 A and ln u 0 L1loc () . Then there exists at least one minimizer in the admissible space D. Uniqueness Theorem: 0 Assume that u = z(x) in D is a minimizer and z u ( x) / 2 at each pixel x. Then z(x) is unique. 70 Weberized TV Restoration: An Example Minimum irregularities are adaptively distributed according to Weber’s Law rougher smoother Weberized new model still maintains the old virtue of faithfully restoring sharp edges. 71 Profile of one horizontal slice from noisy image after Weberized TV restoration One-Sentence Conclusion of Today’s Talk Regularization is crucial for visual perception, and presents numerous challenges as well as opportunities for further statistical and mathematical modeling. 72 That is all, folks… Thank you for your patience! 73 Acknowledgments School of Mathematics and IMA, University of Minnesota (UMN). Tony Chan, Stan Osher, Luminita Vese, Selim Esedoglu (UCLA); S.-H. Kang (U. Kentucky); Yoon-Mo Jung (UMN). Gil Strang (my Ph.D. advisor, MIT) for his vision, guide, and support on research. David Mumford and Stu Geman (Division Appl Math, Brown U). Dan Kersten and Paul Schrater (Psychology & EECS, UMN). S. Masnou and J.-M. Morel (France); G. Sapiro and M. Bertalmio (EECS, UMN). Fadil Santosa, Peter Olver, Hans Othmer, Bob Gulliver, Willard Miller, Doug Arnold, Mitch Luskin (Colleagues at Math, UMN). National Science Foundations (NSF), Program of Applied Mathematics; Office of Navy Research (ONR). All the generous support and warm words from: Jayant Shah, Andrea Bertozzi, David Donoho, James Murray, Rachid Deriche. 74