Why histograms and scatter plots? (visualization) Be ready to plot simple graphs. Why do we need visualization in business life, and what are its important aspects? Why are we using maps, for example? The midterm is mostly about visualization: visualization mistakes, why a plot is correct or wrong, the bin size of the histogram, and why it is important to plot the same data for different bin sizes.

READ THE PROJECT DOCUMENT. What kind of visualization will you choose at the beginning?

Why is 3D visualization not precise? Because we don't have volumetric perception. Stevens' power law (for senses).

For data exploration, check the web page. Understanding anomalies and trends, and how to deal with them. Checklist: skewness (left, right), dynamics, missing data. Imputation is for missing data. For outliers you can use transformations.

Feature engineering, mining, and transformation: not included. Feature engineering comes at the end of the course with the case studies. Not on the midterm.

XML, web standards, HTML, JSON: look into how we organize data with them. XML is flexible, human and computer readable, with infinite possibilities. JSON is also human readable and also used for machine-to-machine interaction. Storing data in binary uses less memory but is not human readable.

How the data is stored using the schema. Not the language itself, but how do we operate relational databases? Why does pandas make our lives easier? Pushing data to a data server, distributed data sets, ETL? Enabling technologies are very important because the services are complex.

Prepare a one-page cheat sheet.

Single variable: histogram. Relationship between two variables: scatter plot.

Kurtosis? Skewness? Correlation, covariance, and variance are important. Z-test important, t-test important, chi-square (not that important).

Most important to least: 1. scatter, 2. bar, 3. line, 4. pie, 5. color. For very few elements a pie chart is okay, but not for many elements. We are not very good at understanding area.

The second midterm is more about your personal view on how to explore data and handle missing information (about your project). Start thinking about your project.
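A quick sketch for the bin-size point above, to show why the same data should be plotted with different bin sizes. It is not from the lecture: the sample data, the helper names `histogram_counts` and `text_histogram`, and the bin counts 2, 10, and 40 are all made up for illustration, and it prints text bars using only the standard library instead of drawing a real plot.

```python
import random


def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: put everything in the first bin.
        return [len(values)] + [0] * (n_bins - 1)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def text_histogram(values, n_bins, max_width=40):
    """Render the histogram as one row of '#' characters per bin."""
    counts = histogram_counts(values, n_bins)
    scale = max_width / max(counts)
    return "\n".join("#" * round(c * scale) for c in counts)


# Hypothetical sample: two groups centered at 0 and 5 (assumed data).
random.seed(0)
data = [random.gauss(0, 1) for _ in range(500)]
data += [random.gauss(5, 1) for _ in range(500)]

for n_bins in (2, 10, 40):
    print(f"bins = {n_bins}")
    print(text_histogram(data, n_bins))
    print()
```

With very few bins the two groups collapse into two flat bars and the shape inside each group is lost; with many bins the bars get noisy. Comparing several bin sizes helps decide which features belong to the data and which come from the binning.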