MULTIMEDIA SIGNAL PROCESSING
BASIC PROBLEMS IN PROCESSING MEDIA INFORMATION
MMSP Irek Defée

Kinect – new media interface
• Before we proceed, we mention an important development in the progress of media interfaces.
• This is a device and system called Kinect, made by Microsoft. Kinect has been available as a product since the beginning of November 2010.
• Kinect is a part of the Microsoft Xbox game platform, but it can be bought separately!

What is Kinect?
• Kinect is a new type of hardware for interacting with people (with proper software support, of course).
• Kinect looks like this (picture).

What is inside Kinect?
There is hardware worth about 40 euro, working according to the following schematic, plus software which extracts signals and sends them to the Xbox for processing. Processing takes about 5% of Xbox power (Xenon processor).

How does Kinect work?
Kinect has FOUR microphones to capture spatial sound, attenuate noise and interference, and compensate for room acoustics.
Kinect has a small color camera with 640x480 resolution.

Most advanced aspect: Kinect "eyes"
The eyes of Kinect are made by an INFRARED MEASUREMENT SYSTEM. A laser beam is sent from the objective and received by the sensor, as can be seen above. These sensors can move to adjust for distance and height. This device produces a MAP OF DEPTH to objects in a room. The device can thus 'see' in bad light or in darkness. Before use it is TRAINED with movements of persons in the room. You can see on the right that in infrared the beam makes lots of measurement dots.

What does Kinect do?
Kinect recognizes voice IN ROOMS and can be used for voice control of applications.
Kinect recognizes persons and body movements, which is used in applications.
But before this, Kinect is TRAINED interactively, as shown in the pictures.
After the training, persons and body movements will be recognized. More than one person can be identified in a scene.

Why is Kinect revolutionary?
• It is the first practical natural interface for machines communicating with people
• It works in normal rooms
• It combines the acoustical and visual senses
• It recognizes full body movements, even complicated ones
• It recognizes persons
• It works well; it is not perfect, but one can predict there will be much more in the future

Kinect applications
• Games and interactive playing (sports, dancing)
• More applications: exercising, rehabilitation, child development
• Control of devices by voice and gestures
• Automation, robotics
• More... we do not know yet... but the public drivers are partially available

Back to the lectures
• We continue with the overview of biological systems and the principles of sensory information processing, and finish it with some conclusions.

FROM PREVIOUS LECTURES WE KNOW THAT MULTIMEDIA INFORMATION PROCESSING IS EXCELLENTLY DONE BY THE HUMAN INFORMATION PROCESSING SYSTEM

• OUR PROBLEM IS: Biological systems process audiovisual information using special "hardware" (which could be called 'wetware') and 'software', that is, algorithms. The question is: can we process audiovisual information using different hardware and software? Maybe the algorithms could be similar?

Let us take visual processing as an example
IN THE HUMAN VISUAL SYSTEM PROCESSING STARTS IMMEDIATELY IN THE RETINA, AND THERE ARE SEPARATE SYSTEMS FOR COLOR AND FOR BLACK AND WHITE LIGHT ACQUISITION AND PROCESSING

FROM THE COLOR AND BLACK & WHITE RECEPTORS, SIGNALS GO TO INITIAL PROCESSING ELEMENTS AND THEN TO OUTPUT LINKS
IT IS IMPORTANT TO NOTICE THAT THE NUMBER OF COLOR PROCESSING ELEMENTS IS MUCH LOWER THAN THE NUMBER OF BLACK AND WHITE ONES

• WHAT DO THESE PROCESSING ELEMENTS DO?
MOST RECENT MEASUREMENTS OF RETINAL NEURAL CELLS SHOW THAT THEIR RECEPTIVE FIELDS ARE QUITE IRREGULAR. IN THE FOLLOWING PAGES SOME INFORMATION ABOUT WHAT THESE CELLS ARE DOING IS GIVEN.

• A BAR OF LIGHT IS MOVED OVER THE PHOTORECEPTORS IN DIFFERENT DIRECTIONS. THE OUTPUT OF THE PHOTORECEPTORS IS SUMMED WITH POSITIVE SIGN (EXCITATION) OR NEGATIVE SIGN (INHIBITION).

DEPENDING ON THE DIRECTION OF MOTION, THE SIGNALS SUM UP STRONGLY OR NOT.

• HERE THE MEASURED SIGNALS ARE SHOWN FOR CELLS WHICH REACT STRONGLY TO A WHITE BAR ON A BLACK BACKGROUND AND TO THE OPPOSITE (off).

• HERE WE SEE THE RESPONSE MEASURED IN TIME.

• WE CAN SEE THAT INITIAL PROCESSING IN THE EYE INCLUDES DETECTION OF DIRECTIONAL CHANGES IN LIGHT INTENSITY. THIS MIGHT BE DONE FOR DIFFERENT COLORS TOO.

WE CAN NOW ASK THE FOLLOWING QUESTIONS:
• WHY IS THE PROCESSING ORGANISED IN THIS WAY? FOR THE ANSWER WE CAN ASSUME THAT THE PROCESSING IS OPTIMISED IN SOME WAY.
• WHAT MIGHT BE THE OPTIMISATION CRITERIA?
• WHAT ARE THE GENERAL PRINCIPLES OF HUMAN/BIOLOGICAL INFORMATION PROCESSING?

OVERLAPPING SQUARES OR NOT???

• WHY DO WE SEE HERE THREE SQUARES AND NOT CUT-OUT SQUARES? THIS IS BECAUSE THE VISUAL SYSTEM PRODUCES THE INTERPRETATION WHICH IS MOST PLAUSIBLE (GENERIC). BUT IT MAY BE WRONG TOO, ALTHOUGH WE WOULD BE SURPRISED IF IT REALLY WERE!!!
NOTE THAT ONLY ONE SQUARE IS FULLY VISIBLE; THE OTHERS ARE COVERED, AND IN FACT THEY MAY NOT BE SQUARES.

• THE INTERPRETATION PRODUCED IS FOR DETECTING THE MOST PROBABLE OBJECTS. THE UPPER FIGURE IS DETECTED AS AN ARCH OVERLAID ON THE SAWTOOTH; THIS IS THE MOST PROBABLE INTERPRETATION. THE INTERPRETATION OF THE BOTTOM FIGURE IS SURPRISING, BUT IT COULD ALSO BE PRODUCED IF THERE WERE MORE EVIDENCE.

• THE VISUAL SYSTEM ASSUMES THAT LIGHT IS COMING FROM THE TOP. (Figure: light direction; same picture upside down.)

• The statistics-based system normally works in an almost perfect way.
As we could see, it sometimes fails when the input signals are highly improbable and/or the most probable interpretation is not correct. This can be seen in visual illusions. We will look at them more closely, since a recent statistical approach explains them. This gives us a hint about what kind of processing is done.

Returning to the questions asked earlier (why is the processing organised in this way, and what might be the optimisation criteria?), these are the principles we can identify now:
• Statistical processing matched to the real-world signal statistics: it provides responses to the most probable signals. This is a very natural principle.
• Minimization of information processed: as much information as possible is eliminated, and only the minimum information needed to provide a response is used. This principle makes it possible to minimize energy and processing effort.

• A book which appeared in 2005, based on earlier research:

• The authors are visual psychologists; they consider vision as a system interpreting the world from images projected onto the eye. Light from an external source bounces off objects and is projected onto the eye. This projection is not unique (e.g. objects of different sizes will have the same projection, depending on their distance).

• In visual illusions the projection gives rise to an improper interpretation. (Figure: natural scene, illusion persists; stimuli change, illusion persists.)

This picture gives a strong impression of depth because of the combination of many mutually consistent cues:
- perspective
- texture gradient
- shading and shadow

• Geometry of natural scenes
Geometrical illusions represent a wrong interpretation of the real world.
To find out why, researchers took pictures together with a depth map. (Figure: laser range scanner for measuring distance; real pictures with corresponding distances marked by colors.)

• If a large number of such pictures is taken, a database can be created in which real-world objects are matched with distances, and statistics can be calculated.
Example: subjective metrics. Let's think about lines of different lengths which are seen in the real world. If all lengths had the same probability, the stimulation would be linearly related to length. But if this is not the case, some lengths will be stimulated more often. This can lead to distortions in perception.

• Example: line length illusion
Variation of apparent length as a function of orientation. In experiments, people report that the apparent length of a line changes depending on its angle.

• Why is it so?
Let's sample lines in pictures from the database. The points in the picture were compared with the laser range measurements to see if they correspond to lines in the real world. A total of 1.2x10^7 line segments were collected. (Figure: grid of templates with straight lines, to overlay on the picture; white: accepted lines; black: rejected lines; probability distribution of lines vs. length for different orientations.)

Cumulative distribution (lines shorter than x)
This shows how many lines at a certain orientation corresponded to real lines of length shorter than or equal to x.

• Prediction of apparent length based on probability
Take e.g. lines of length 7 at orientation 20 deg: their cumulative probability is 0.15, which means that 15% of lines are shorter than 7 pixels and 85% are longer. For all orientations we get this plot. This is very similar to the one measured in experiments with people!!!

• Why do such biases exist?
In nature, lines do not appear often. Horizontal lines are typically generated from horizontal flat surfaces. Vertical lines are limited by gravity and are therefore rare, and lines at 20-30 deg are even more rare; they are mostly projected from perspective.

• Visual illusions: angles
All angles in this picture are 90 deg, but when they are projected on the eye, the projections may differ by up to 60 deg.
A) Bias in angle estimation between two lines
B, C, D) Angle illusions

• To explain this, a database of angles is made, as before. (Figure: extraction of angles; probability distributions for different types of angles (bottom line) in natural scenes and in scenes with human-created objects.) We can see a bias: angles close to 90 deg are less likely to occur.

The probability distribution of angles is not uniform, so the cumulative probability is biased.

• Bias and illusions
Angles close to 90 deg are more likely to come from a planar surface which is typically larger than the surface behind lines intersecting at smaller angles. Since larger surfaces are rarer, 90 deg angles are less likely. Thus the predicted perceived angle differs from the actual one, except at 90 deg, where they are the same. (Figure: magnitude of angle misperception (lines) vs. experimentally measured values.)

• Explanation of angle illusions
Why is the vertical line tilted? We take a reference line at 60 deg (black) and check the probability of occurrence of physical sources of a second line oriented at different angles.
Since the angle between the lines is 30 deg, we look at the probability for 30 deg and then at the cumulative probability (previous page), which gives the value 0.184. Multiplied by 180 deg, this gives an angle of about 33 deg, in agreement with measurements.

• Size illusion
(Figure: various size illusions of center and surrounding.) According to the previous explanations, the reason for this illusion is that the probability distributions of the possible sources of the targets, given their different contexts, are different. To check this hypothesis, the database was searched for circular objects, and the probabilities of the sources of targets in the context were calculated.

Experimental conditions
a) The inner circle is surrounded by 4 circles with changing diameters
b) Probability of occurrence of the center circle with a specific size, for outer circles with different diameters. The dashed line shows the probability for a circle of 14 pixels diameter. (Bigger surrounding circles are much less likely to appear.)
c) Cumulative probability for the 14 pixel circle
d) Examples of scenes with large circles and small circles

Why are there statistical differences? Circles originate from planar projections; larger circles are less likely.
Why does the presence of surrounding circles change the occurrence of target central circles differently? Larger circles arise from larger planes in the world, which are flat areas, so it is more probable that the central circle will be larger too. In other words, the presence of larger surrounding circles increases the probability of occurrence of physical sources of larger central circles. As a result, the probability distribution of central circles changes according to the size of the surrounding circles.
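
The cumulative-probability reasoning used in the slides above (for angles and for circle sizes) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the tiny table of (surround, center) diameters are hypothetical and are not taken from the database described in the lecture. Only the value 0.184 and the factor of 180 deg come from the slides.

```python
from bisect import bisect_right


def cumulative_probability(samples, x):
    """Fraction of samples that are less than or equal to x (empirical CDF)."""
    ordered = sorted(samples)
    return bisect_right(ordered, x) / len(ordered)


def conditional_cumulative_probability(pairs, surround, x):
    """Empirical CDF of center diameters, restricted to one surround diameter.

    pairs: iterable of (surround_diameter, center_diameter) observations.
    """
    centers = [c for s, c in pairs if s == surround]
    if not centers:
        raise ValueError("no observations for this surround diameter")
    return cumulative_probability(centers, x)


def predicted_angle_deg(cum_prob, full_range_deg=180.0):
    """Map a cumulative probability onto the 0..180 deg angle scale."""
    return cum_prob * full_range_deg


# Hypothetical toy observations (surround diameter, center diameter) in pixels.
# Larger surrounds co-occur with larger centers, as argued in the slides.
TOY_PAIRS = [
    (20, 8), (20, 10), (20, 12), (20, 14), (20, 16),
    (40, 12), (40, 14), (40, 18), (40, 22), (40, 26),
]

if __name__ == "__main__":
    small = conditional_cumulative_probability(TOY_PAIRS, 20, 14)
    large = conditional_cumulative_probability(TOY_PAIRS, 40, 14)
    print(f"P(center <= 14 | surround 20) = {small:.2f}")
    print(f"P(center <= 14 | surround 40) = {large:.2f}")
    print(f"angle for cumulative probability 0.184: {predicted_angle_deg(0.184):.1f} deg")
```

With these toy numbers the 14 pixel center has a cumulative probability of 0.8 among centers seen with small surrounds and 0.4 among centers seen with large surrounds. This is the direction of the bias described in the slides: the same circle ranks lower, and so should appear smaller, when the surrounding circles are large.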
• Changing the interval between center and surrounding circles
(Figure: probabilities when the distance is changing, dashed line is for a circle of size 14; cumulative probability for the 14 pixel circle.)

• Comparison of inner circle with single circle
a) Probability distribution of a single circle vs. diameter
b) Probability for a single circle superimposed with the probability of a central circle surrounded by an outer circle. The dashed line is for a 24 pixel circle; the probability curve is for an outer circle of 32 pixel diameter. The cumulative probability is much higher: there is a bias.
c) When the outer circle is much bigger, the cumulative probability is smaller
The changing cumulative probability ratios and their dependence on the central and outer circle sizes are clearly seen, and the illusion depends on these parameters in exactly the same way.

• Distance illusions
a) When objects are close, the perceived distance is overestimated relative to the physical one
b) Objects which are close to each other are perceived as being at the same distance
c) The distance to close objects is overestimated; the distance to far objects is underestimated
d) Objects on the ground at a distance of about 7 m appear closer, and with increasing distance they appear more elevated

According to the methodology, the probability distribution of distances is measured, but there are several variables here. (Figure: probability of all distances from the scanner; probability of the differences in distances between objects for three different horizontal angles; probability of horizontal distances at different heights with respect to eye level.)

• Interpretation of these probabilities
a) The curve for all distances has a strong peak at a distance of 3 m.
This is in agreement with experiments in which people seeing single objects hanging in a completely dark scene report them as being at a distance of 2-4 m
b) When the angular separation between the objects is small, they tend to be seen at equal distance, but this tendency decreases as the angle increases
c) The dependency of the probability of distance on eye level has a peak at a distance of 4 m. Thus the distance of objects closer than 4 m will be overestimated and that of objects farther than 4 m will be underestimated. This agrees with experiments.

• The size illusion
The size illusion does not depend on the particular type of endings. It can be induced even without the line and even (but less strongly) with dots.
Why does this happen? Again, for the explanation the database is searched for such patterns and probabilities are calculated. Here we consider the case when both figures are in line, on the left/right. (Figure: templates used; templates overlaid on pictures.)

• Results of probability calculations
Figures are in line, extending to the left or to the right:
a) Probability of lines with specific length and arrows pointing inwards and outwards
b) Cumulative probabilities
c) Superimposed cumulative probabilities showing differences
d) Example of two lines of length 50 pixels. One can see that the cumulative probability for outward arrows is higher, which corresponds to the bigger perceived length.

• Angle illusion
The line is interrupted by a vertical occluder. It is then perceived as two shifted segments. Why does this happen?
Again, statistics of such patterns are calculated from the database of pictures.

• Templates for calculation
a) Shows the templates; for each red line there is one template corresponding to the shift
b) The templates are matched in the pictures and statistics can be calculated
c) Other templates can be used for different configurations of this illusion
d) Definition of the difference in location of the line segments

• Probability distributions measured
We can see peaks which are at nonzero shift. So the most probable interpretation from these statistics is that there is a nonzero shift.

• One can also study the effect of the angle of the line and the width of the distractor. (Figure: change of line orientation; change of width of the distractor.) As can be seen, when they are larger, the peak moves towards greater shifts, which implies that the illusion will be stronger, and it really is so.

• The processing of information in biological systems is statistical: it aims at producing the MOST PROBABLE response to signals coming from the real world. This type of processing must be based on statistics of signals and models from the real world. The result of processing is the most plausible answer for "normal conditions" and assumptions. This we have seen in the examples before, and they are repeated next.

CONCLUSION
• Statistics-based processing seems to be very strong in explaining visual illusions (many of them in the same way).
• The principle of statistical processing is powerful: the system collects information about the most likely distribution of signals and provides the most probable interpretations for them. This will work in most cases. Only when the signals are very atypical will it fail, but this is rare.

BUT...
• We have to remember that biological systems are able to deal with extreme variations of signals and still extract the right information from them.
This will be illustrated now by the example of face recognition. Faces can be distorted in many ways and still be recognized. We can guess something about the PRINCIPLES OF FACE PROCESSING.

We can recognize FAMILIAR faces from extremely low resolution pictures. How is this done? We do not have a clear idea, but it points to the minimization of processed information.

Contour information is not enough.

A face is processed somehow as a "whole" and not as composed of parts. From the combined picture on the left we see a new face; when we split it, we recognize other faces.

Eyebrows are very important for the identification of faces.

Faces can be recognized despite extreme distortions.

Faces seem to be encoded in memory in an exaggerated, caricature-like way:
A) Average face (averaged from a number of persons)
B) Some typical face
C) Face created by taking a big deviation from the average
Such faces are recognized even better than typical ones.

Newborn babies pay more attention to face-like objects (upper row) than to objects that are not face-like.

Faces and antifaces: if the face within the green circle is observed for some time, the center one will not be recognized correctly but will be seen as the one in the red circle (a greater distance from the center means more differences). This means that there is some kind of prototype encoding and tuning to it.

Impact of skin pigmentation
Row 1: Faces differ only in shape
Row 2: Faces differ only in skin pigmentation but not in shape
Row 3: Faces differ in shape and pigmentation
We see that pigmentation has a significant impact (row 2).

Color helps. Left: original. Middle: black and white. Right: color only; the eyes can be located more precisely.

From a negative picture it is impossible to identify faces.

Face recognition strongly compensates for the direction of illumination; the pictures above are easily recognized as the same person.
Response of a neural cell of a monkey in the face processing area of the brain. The response to something like a face is much stronger than to a hand. (But remember that millions and millions of cells are processing at the same time.)

Measurement from the human brain: the signal from face-like pictures is much stronger than from other objects.

The examples shown for faces indicate how sophisticated information processing in biological systems is. What is truly amazing is that correct results are obtained despite extreme distortions. For the most part, we do not know how this is done, and we have difficulty imagining how to develop algorithms which would have similar capabilities. This is a topic for future studies.