Interactive Graphics
Stat 579

Outline
• (Time Series)
• Interactive Approaches
• iplots
• rggobi

NASA Meteorological Data
• 24 x 24 grid across Central America
• satellite captured data: temperature, near surface temperature (surftemp), pressure, ozone, cloud coverage: low (cloudlow), medium (cloudmid), high (cloudhigh)
• for each location monthly averages for Jan 1995 to Dec 2000
[Figure: map of the grid, Gridx 1 to 24 by Gridy 1 to 24]

What is a Time Series?
• for each location, multiple measurements
    qplot(time, temperature, geom="point", data=subset(nasa, (x==1) & (y==1)))
[Figure: temperature (275 to 305) against TimeIndx, drawn as points]

• connected by a line
    qplot(time, temperature, geom="line", data=subset(nasa, (x==1) & (y==1)))
[Figure: temperature (275 to 305) against TimeIndx, drawn as a line]

• but only connect the right points
    qplot(time, temperature, geom="line", data=subset(nasa, (x==1) & (y %in% c(1,15))), group=y)
[Figure: temperature (275 to 305) against TimeIndx, one line per location]

Practice
• For each location, draw a time series for pressure. What do you expect? Are there surprising values? Which are they?
• Plot near surface temperatures for each location. Which locations show the highest range in temperatures? Which locations show the highest overall increase in temperatures?
Use ddply to get these summaries.

Interactive Graphics
• Based on Linked Graphics
• Most important common tools: Selection & Highlighting, Identifying Points
• Plot-specific interactive tools

Install iplots
• For Windows: install.packages("iplots")
• For Macs: download & install JGR from http://rosuda.org/JGR/down.shtml (includes iplots and the packages it depends on)
• Help files & documentation: http://www.rosuda.org/iplots/

iplots Graphics
help(package="iplots")
• iplot: Scatterplot
• ihist: Histogram
• ibar: Barchart
• ibox: Boxplot
• imosaic: Mosaicplot
• ipcp: Parallel Coordinate Plot

Select & Identify
• iset.selected() gives the index vector of the current selection
• iset.select(indices) highlights the specified values in the current data set

Practice
• Load the iplots package
• Plot near surface temperatures for each location. Which locations show the highest range in temperatures? Which locations show the highest overall increase in temperatures?

package "rggobi"
• developed by Duncan Temple Lang, Debby Swayne, Michael Lawrence, Hadley Wickham
• Gtk based plots
• allows interactive link between R and ggobi

    gd <- ggobi(nasa.wide[,1:5])[1]
    clust9 <- cutree(nasa.dend, k=9)
    glyph_type(gd) <- 4
    glyph_size(gd) <- 3
    glyph_color(gd) <- clust9+1
    which(selected.GGobiData(gd))
    summary(gd[,1:5])
    gd$clusters <- clust9

Practice
• Use a different clustering algorithm (e.g. try method="complete" or "average")
• Visualize the spatial distribution of the clustering results using ggobi
• Compare the clustering results. What is your interpretation?

Mosaicplots
• Area representation of a contingency table
• Interactive tools: arrows re-organize the order of variables and the number of dimensions shown

Your turn: Whodunnit
• Dataset whodunnit contains "story"
• Load data set
• Use interactive tools to figure out what happened
• Once you know (or suspect) the story behind this data, collect graphical evidence for your theory.
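
Returning to the practice questions on near surface temperature: the per-location summaries that ddply is suggested for could be sketched roughly as follows. This is only an illustration under assumptions. The data frame `toy` is a made-up stand-in (not the real nasa values) that borrows the column names `x`, `y`, `time` and `surftemp` from the slides, and the names `temp_range`, `increase` and `loc_summary` are hypothetical. "Overall increase" is taken here to mean last value minus first value, which is one possible reading.

```r
library(plyr)

# Toy stand-in for the nasa data (made-up values, not real measurements):
# a 2 x 2 grid of locations observed at 6 time points.
toy <- expand.grid(x = 1:2, y = 1:2, time = 1:6)
toy$surftemp <- 290 + 0.1 * toy$time * toy$x * toy$y

# One row per location: spread of the series, and last-minus-first change.
loc_summary <- ddply(toy, .(x, y), summarise,
  temp_range = max(surftemp) - min(surftemp),
  increase   = surftemp[which.max(time)] - surftemp[which.min(time)]
)

# Locations ordered by temperature range, largest first.
loc_summary[order(-loc_summary$temp_range), ]
```

With the real data, replacing `toy` by `nasa` should give one summary row per grid cell, which can then be sorted or plotted to answer the two questions.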