Data Analysis
UNLOCKING THE SECRETS HIDDEN IN YOUR DATA

Why Do Data Analysis?
  Avoids incorrect assumptions
  Does the data make sense?
  Which one is better?

  Data:
    time        0       10      20      30
    elevation   400000  399000  393000  384000

  Quadratic fit: y = -20x^2 + 60x + 400100, R^2 = 0.9988
  Cubic fit:     y = 0.3333x^3 - 35x^2 + 216.67x + 400000, R^2 = 1
  [Plots: the data with each fitted curve, for x from 0 to 35]

Why Do Data Analysis? (continued)
  Are your assumptions correct?
  Did you collect enough data?
  If this is a model of a falling body, which fit is better?
  Be careful: what is better mathematically is not always better scientifically.
  [Plot: both fits extended to x = 90. The cubic (R^2 = 1) levels off and turns back upward near x = 67, while the quadratic (R^2 = 0.9988) keeps falling.]

Ways to Analyze Data
  Plotting data
    Ways to visually understand data
  Statistics
    Make it easier to compare data: Mean, Median, Mode
    Make it clear if you have NOISY data: Range, Variance, Standard Deviation
  [Plot: Pink and Blue data series with Mean Pink and Mean Blue lines]

Ways to Analyze Data (continued)
  Derivatives (slopes)
    Tell if changes in parameters affect data
    Parameter 2 has a greater effect than Parameter 1
    Get more information from data
  [Plot: Base Case, Parameter 1, and Parameter 2 series, with labeled slopes of 0.39, 0.16, and 0.08 and the annotation "Great Derivative"]

Plotting Data – Extracting from NetLogo
  Two ways
  1st way: Write code to extract the data you want (see File Output Example in the Code Examples)
    Open the file in the setup procedure
    Create a write-to-file procedure

Plotting Data – Extracting from NetLogo (continued)
  2nd way: Extract data from NetLogo graphs
    Have NetLogo generate the graph on the Interface page (example on a later slide)
    Create a setup-plot procedure and a do-plot procedure
    Call the setup-plot procedure in the setup procedure
    Call the do-plot procedure in the go procedure

Plotting Data – Extracting from NetLogo (continued)
  Run the model until sufficient data is obtained
  Right-click on the graph (PC) or Control-click (Mac), then select Export
  Choose a location and file name, then select Save
  An Excel file is created (see next slide)
    Contains all the information in the plot and the input parameters used
    Contains excess information about the plot (color, pen down, mode, interval…)
  LET’S DO IT: Open Rabbits Grass Weeds

Plotting Data – Extracting from NetLogo (continued)
  [Screenshot of the exported file, with the label: This is what you need]

Plotting Data – Different Types of Plots
  All plots from http://www.statcan.ca
  Bar charts: preferred snacks
  Pie charts: music preference
  Pets purchased at pet store

Plotting Data – Different Types of Plots (continued)
  Line graphs: cell phone use (http://www.statcan.ca)
  Scatter plots (http://en.wikipedia.org/wiki/Scatterplot)

Plotting Data – Activity in Excel
  LET’S DO IT
  Open the file Car Data
  Insert Chart
  Select the type of chart: XY Scatter
  Select the data range: highlight the data to be plotted

Plotting Data – Activity in Excel (continued)
  Label each data series
  Label the graph and axes
  Select where you want the graph to be (on the same worksheet or on another worksheet in the same file)

Statistics
  Statistics help you
    Summarize data
    Describe data
    Analyze data
  Without statistics, it is hard to describe the difference between the two data sets.
  [Plot: Noisy and Noisier data series]
  Now it is easy to summarize, describe, and analyze the data…
  The blue and the pink data have the same AVERAGE value (mean), but the blue data is “NOISIER” (greater standard deviation). Therefore…
  [Plot: the same data with lines for Mean (both), Noisy + 2SD, Noisy - 2SD, Noisier + 2SD, and Noisier - 2SD]

Statistics – How to Calculate in Excel
  +, -, *, / are used for addition, subtraction, multiplication, and division.
  Each cell has a label based on its column and row.
  Use cells instead of numbers to perform calculations. Example: =(A4+B4)/C4
  To perform a calculation on an entire column, copy and paste the equation.
  Warning: this changes the cell numbers on each line.
  To fix a specific cell, use the $ symbol. Example: =(A4+B4)/$C$1
  Excel has many built-in statistical functions. Makes life easy!

Statistics – Measurements of Central Tendency
  Mean (Average), Median, and Mode
  Definitions
    Mean (Average): sum divided by the number of data points
    Median: middle data point when arranged from highest to lowest
    Mode: most frequent value
  LET’S DO IT
    StarLogo TNG: Fish and Plankton
    NetLogo: Rabbits and Grass
  Use the data set to calculate Mean (Average), Median, Mode, Max, and Min
    Select the cell where you want the value of the function to appear
    Select Insert, then Function
    Select Statistical
    Select the function wanted (AVERAGE, MEDIAN, or MODE), then hit OK
    Select the range of data you want to analyze by clicking on the range symbol and highlighting the range
    Hit Enter or OK

Statistics – Measurements of Data Spread
  Range, Variance, and Standard Deviation
  Definitions
    Range = maximum - minimum
    Variance measures the noise of the data around the mean value.
    Standard Deviation (S) is the square root of the variance. It is the most commonly used measure of spread (same units as the data).
  Another reason to use S (for roughly bell-shaped data):
    ~68% of the data are in the interval Mean - S to Mean + S
    ~95% of the data are in the interval Mean - 2 S to Mean + 2 S
    ~99.7% of the data are in the interval Mean - 3 S to Mean + 3 S
  EXCEL does it for you!!!
  [Plot: Rabbit Population. y-axis: Number of Rabbits; x-axis: Ticks; series: Rabbits, Mean, Mean - 2 S, Mean + 2 S]
  LET’S DO IT
    StarLogo TNG: Fish and Plankton
    NetLogo: Rabbits and Grass

Derivatives
  What are derivatives?
    A simple calculation using data
    Instantaneous rate of change = SLOPE
  Why use derivatives?
    Get more information from data
    More ways to compare data
  Car moving down a road
    Data = the distance traveled
    Velocity = the 1st derivative of distance
    Acceleration = the 2nd derivative of distance = the 1st derivative of velocity
  [Plots against Time: Distance; Slope of distance (Velocity); Slope of velocity (Acceleration)]

How to Calculate a Derivative
  Mathematically (you don’t have to use this): dx/dt, where x = position and t = time
  Use this in Excel: Δx/Δt = (x2 - x1) / (t2 - t1)
  In Excel: =(B3-B2)/(A3-A2)
  LET’S DO IT
    StarLogo TNG: Fish and Plankton
    NetLogo: Rabbits and Grass
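
The slope calculation above, (x2 - x1) / (t2 - t1), can also be sketched outside Excel. The short Python example below is only an illustration: the function name finite_differences and the time and distance values are made up for this sketch and do not come from the Car Data file or from any model output. It applies the slope formula once to get velocity from distance, then again to get acceleration from velocity.

```python
def finite_differences(t, x):
    """Slope between consecutive data points: (x2 - x1) / (t2 - t1)."""
    if len(t) != len(x):
        raise ValueError("t and x must have the same length")
    return [(x[i + 1] - x[i]) / (t[i + 1] - t[i]) for i in range(len(t) - 1)]


# Hypothetical data for a car moving down a road (not from the slides):
# time in seconds, distance traveled in meters.
time = [0, 2, 4, 6, 8]
distance = [0, 4, 12, 24, 40]

# 1st derivative of distance = velocity (one value per interval).
velocity = finite_differences(time, distance)

# 1st derivative of velocity = acceleration. Each velocity value is paired
# with the time at the end of its interval, which is an assumption of this sketch.
acceleration = finite_differences(time[1:], velocity)

print("velocity:", velocity)
print("acceleration:", acceleration)
```

Each derivative has one fewer value than the data it came from, just as the Excel slope column starts one row below the first data row.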