30th International Conference on INFORMATION TECHNOLOGY INTERFACES
Cavtat, Croatia, June 23-26, 2008

Visualization in science
Nataša Tepić

What is visualization?
- dictionary: Visualization is a relatively new term which describes the process of representing information or ideas by diagrams or graphs.
- expansion: maps, plots, animations, video, movies, ...

"You must never tell a thing. You must illustrate it. We learn through the eye and not the noggin."
Will Rogers (1879-1935)

"One picture is worth ten thousand words."
Frederick R. Barnard, Printers' Ink, 10 March 1927

JMP
"The greatest value of a picture is when it forces us to notice what we never expected to see."
John Tukey

History
until the 16th century: data visualization = maps
- the oldest known map (a town map), 6200 BC, Museum at Konya, Turkey
- the first world map: Anaximander of Miletus in Asia Minor (610-546 BC), in present-day Turkey (his map has been lost; Herodotus describes it in Books II and IV of "The Histories")

History
- in the 15th century: Nikolaus Krebs (Nicholas of Cusa, Nicolaus Cusanus) developed graphs of distance vs. speed, presumably of the theoretical relation
- during the 16th century: development of geometric diagrams and various maps for data exploration; the official start of data visualization
- during the 17th century: analytic geometry (René Descartes, Pierre de Fermat, ...), theories of errors of measurement and estimation, the birth of probability theory, and the beginnings of demographic statistics and "political arithmetic"

What do we see?
form, color, depth, motion

Human visual system
- human sight responds more strongly to contrast than to absolute intensity
- the colors we *see* are not completely identical to the colors in nature
- the purpose of human sight is constant object recognition regardless of angle, distance or lighting

Seeing is a Complex Process
Our brain constructs an image from:
- information from our eyes
- information stored in our brain
perception: the process of collecting information about the world through our senses and interpreting it
- perception depends on cultural heritage
- perception changes with experience

Do you ever get something like this via e-mail?

Optical illusions
Illusions trick us into perceiving something differently from how it actually exists, so what we see does not correspond to physical reality. The word illusion comes from the Latin verb illudere, meaning "to mock."
Why are optical illusions important for data visualization?
- inappropriate visual stimulation can confuse our brain
- manipulative visual stimulation can cause wrong interpretation

Optical illusions
Problems with visual perception:
- area
- angles
- perspective

How much is the area of circle B?
[Figure: circles A and B; A is labeled 10, B is labeled ??]
answer: 17

angle problem
perspective problem
co-effects: Müller-Lyer illusion
co-effects: pattern completion
co-effects: Johann Poggendorff's illusion

Color Meaning
Colors are non-verbal communication. They carry symbolism and meanings that go beyond ink.
- red: action, confidence, courage, vitality
- blue: unity, harmony, calmness, coolness, conservatism
- yellow: joy, optimism, summer, cowardice, greed
- green: spring, fertility, youth, environment, good luck
- orange: energy, heat, enthusiasm, playfulness
- purple: royalty, nobility, ceremony, magic, mystery
- pink: femininity, love, beauty
[Figure: panels labeled male / female and A / B]

Color Blindness
Ishihara Test for Color Blindness
About 8 percent of white males and a tiny fraction of females (about 0.5 percent) are red-green color blind.
Normal Color Vision
          Left   Middle   Right
Top       25     29       45
Bottom    56     6        8

Red-Green Color Blind
          Left   Middle   Right
Top       25     Spots    Spots
Bottom    56     Spots    Spots

small squares: same color or not?

theory ...
Edward R. Tufte (professor emeritus of statistics, graphic design, and political economy)
"The Leonardo da Vinci of data." New York Times
- he coined the term "chartjunk"

chartjunk
This chart shows only five hard-to-read numbers (1, 2, 4, 8 and 16), but the digital file of the image is 11216 bytes (numbers) in size.

theory ...
Tufte uses the term data-ink ratio and argues strongly against including any non-informative decoration in visual presentations of quantitative information. He claims that ink should be used only to convey significant data and to aid in its interpretation.

"Lurking behind chartjunk is contempt both for information and for the audience. Chartjunk promoters imagine that numbers and details are boring, dull, and tedious, requiring ornament to enliven. Cosmetic decoration, which frequently distorts the data, will never salvage an underlying lack of content. If the numbers are boring, then you've got the wrong numbers. Credibility vanishes in clouds of chartjunk; who would trust a chart that looks like a video game?"
Edward Tufte, "Envisioning Information", 1990

"If a picture is not worth a 1000 words, to hell with it!"
Ad Reinhardt

Maps
maps show how the value of a variable varies over a physical/geographic area
the actual value of the response variable corresponds to:
- the height of the object (block, polyhedron, prism, spike, ...)
- color

Maps
problems:
- height of the object
- perspective
- color legend

Maps
benefits: position, physical/geographic information

the best statistical graphic ever drawn
Like good writing, good graphical displays of data communicate ideas with clarity, precision, and efficiency. The French engineer Charles Minard (1781-1870) illustrated the disastrous result of Napoleon's failed Russian campaign of 1812.
The graph shows the size of the army by the width of the band across the map of the campaign on its outward and return legs, with temperature on the retreat shown on the line graph at the bottom. Many consider Minard's original the best statistical graphic ever drawn.

Why?

the best statistical graphic ever drawn
He took a two-dimensional space and managed to depict accurately five data variables: size of the invading army, size of the retreating army, geographic location, temperature, and, of course, time. The multivariate data are presented in such a way as to provide an intriguing narrative of the fate of Napoleon's army.
- new version (SAS/IML)

cholera epidemic in London, 1854
Dr. John Snow (1855) observed that cholera occurred almost entirely among those who lived near (and drank from) the Broad Street water pump. He had the handle of the contaminated pump removed, ending the neighborhood epidemic, which had taken more than 500 lives.

Florence Nightingale - mother of modern nursing
After witnessing deplorable sanitary conditions in the Crimea, she wrote Notes on Matters Affecting the Health, Efficiency and Hospital Administration of the British Army (1858), an influential text including several graphs which she called "Coxcombs". This figure (reproduced with SAS/Graph) makes it abundantly clear that far more deaths were attributable to non-battle causes ("preventable causes") than to battle-related causes.

Escaping the 2D plane: The Stereogram
By the end of the 19th century, as more statistical data became available, the limitations of the two dimensions of the plane for representing data were becoming more apparent. Several systems for representing 3D data were developed between 1869 and 1880.
- author: Luigi Perozzo
- Annali di Statistica, 1880
- this figure shows the population of Sweden from 1750 to 1875 by age group

Chemical examples: 1.
periodic table of chemical elements
Dmitri Mendeleev (1834-1907, Russian chemist)
He predicted the chemical and physical properties of then-unknown elements (e.g. Ga, Ge) and left spaces open for them in his periodic table.
- elements in his periodic table are ordered by atomic weight

Chemical examples: 2. wavelengths of the X-ray emissions of the elements
Henry Moseley (1887-1915, British physicist, Rutherford's student)
In 1913 Moseley published the results of his measurements of the wavelengths of the X-ray spectral lines of a number of elements, which showed that the ordering of the wavelengths of the X-ray emissions of the elements coincided with the ordering of the elements by atomic number. It became apparent that atomic weight was not the significant player in the periodic law, as Mendeleev, Meyer and others had proposed, but rather that the properties of the elements vary periodically with atomic number.

Moseley's graph represents an outstanding piece of numerical and graphical detective work. He noted that there were slight departures from linearity which he could not explain; nor could he explain the multiple lines at the top and bottom of the figure. The explanation came later with the discovery of the spin of the electron.
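The near-linear relation behind Moseley's graph can be sketched numerically. The example below is only an illustration: it assumes the usual textbook form of Moseley's law for the K-alpha line, E ≈ (3/4) · Ry · (Z − 1)², which is not stated on the slides, and the function names, the screening value of 1 and the selection of elements are hypothetical choices made for this sketch.

```python
import math

# Physical constants (rounded standard values).
RYDBERG_EV = 13.605693          # Rydberg energy in eV
PLANCK_EV_S = 4.135667696e-15   # Planck constant in eV*s
HC_EV_ANGSTROM = 12398.42       # h*c in eV*angstrom


def k_alpha_energy_ev(z, screening=1.0):
    """Approximate K-alpha photon energy in eV.

    Assumption: textbook form of Moseley's law, E = 3/4 * Ry * (Z - s)^2.
    """
    return 0.75 * RYDBERG_EV * (z - screening) ** 2


def k_alpha_wavelength_angstrom(z):
    """Wavelength in angstrom for the approximate K-alpha energy."""
    return HC_EV_ANGSTROM / k_alpha_energy_ev(z)


def sqrt_frequency(z):
    """Square root of the K-alpha frequency, which this model makes linear in Z."""
    return math.sqrt(k_alpha_energy_ev(z) / PLANCK_EV_S)


# Illustrative selection of elements (symbol -> atomic number Z).
elements = {"Ca": 20, "Ti": 22, "Fe": 26, "Cu": 29, "Zn": 30}

for symbol, z in elements.items():
    print(f"{symbol:>2}  Z={z:2d}  "
          f"lambda={k_alpha_wavelength_angstrom(z):.3f} A  "
          f"sqrt(nu)={sqrt_frequency(z):.4e}")
```

In this idealized model the wavelength falls steadily as Z grows and the step in sqrt(nu) between neighboring elements is constant. The slight departures from linearity that Moseley noted are not reproduced by such a simple formula.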