Uploaded by Marcos J Preciado

AQ Data visualization – Tables and Charts

Data preparation
1. Raw data
2. Structured data
3. Data processing
4. Exploratory data analysis (EDA)
5. Insight reports, visual graphs
Data visualization involves
● Creating a summary table for the data
● Generating charts to help interpret and learn from the data
Purposes:
● Summarize data information by highlighting important
relationships and trends
● Identify data errors, if any
Tables: Usually have a title, headers, observations, and variables.
When to use tables?
● Refer to specific numerical values
● Make precise comparisons between different values, not just
relative comparisons
● Variables have different units or very different magnitudes
LESS IS BETTER: remove non-essential elements (such as lines
separating data)
Data-ink ratio = ink used to convey the meaning of the data / total ink
used in a table or chart
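As a rough worked example of the ratio, assume ink can be measured in a common unit such as pixels. The function name and the ink amounts below are hypothetical, chosen only to show that deleting non-data ink raises the ratio.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of the total ink that conveys the meaning of the data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical table: 600 units of ink for the numbers and labels,
# 400 units for gridlines and borders.
with_gridlines = data_ink_ratio(600, 1000)

# Same table after deleting most gridlines (only 50 units of them remain).
without_gridlines = data_ink_ratio(600, 650)

print(with_gridlines, round(without_gridlines, 2))
```

The data ink is unchanged in both cases; only the total ink shrinks, so the ratio moves closer to 1.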
Table design principles
● Avoid using vertical lines unless they are necessary for clarity
● Horizontal lines are necessary only for
○ Separating column titles from data values or
○ When indicating a calculation has taken place
● Use shading to separate columns
● Right-align numbers so that differences in magnitude stand out
● Show the same number of decimal places for every value (if one
value has a decimal, give all numbers decimals)
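A minimal sketch of these principles in plain text, using only the Python standard library. The function `format_table` and the sales figures are hypothetical; the sketch right-aligns numeric columns, pads every number to the same number of decimals, and draws a single horizontal line under the column titles.

```python
def format_table(headers, rows, decimals=2):
    """Render rows as plain text: numbers right-aligned with equal decimals."""
    # Convert every cell to text; numbers get the same number of decimals.
    text_rows = [
        [f"{cell:.{decimals}f}" if isinstance(cell, (int, float)) else str(cell)
         for cell in row]
        for row in rows
    ]
    # Column width = widest entry in that column, header included.
    widths = [
        max([len(headers[i])] + [len(r[i]) for r in text_rows])
        for i in range(len(headers))
    ]
    # A column is numeric if every data cell in it is a number.
    numeric = [
        all(isinstance(row[i], (int, float)) for row in rows)
        for i in range(len(headers))
    ]

    def render(cells):
        return "  ".join(
            c.rjust(w) if is_num else c.ljust(w)
            for c, w, is_num in zip(cells, widths, numeric)
        ).rstrip()

    # One horizontal line, separating column titles from data values.
    rule = "-" * (sum(widths) + 2 * (len(widths) - 1))
    return "\n".join([render(headers), rule] + [render(r) for r in text_rows])


# Hypothetical monthly sales, invented for illustration.
table = format_table(
    ["Month", "Sales"],
    [["Jan", 1250.5], ["Feb", 980], ["Mar", 1104.25]],
)
print(table)
```

No vertical lines are drawn; the column alignment alone separates the columns.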
Tables
● Crosstabulation: a table describing the relationship between two
variables
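A crosstabulation can be sketched by counting how often each pair of values occurs. The `crosstab` helper and the survey records below are hypothetical, and only the standard library is used.

```python
from collections import Counter


def crosstab(records, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    counts = Counter((rec[row_key], rec[col_key]) for rec in records)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # Nested dict: table[row][col] -> count (0 when the pair never occurs).
    return {
        r: {c: counts[(r, c)] for c in col_labels}
        for r in row_labels
    }


# Hypothetical survey records, invented for illustration.
records = [
    {"region": "East", "rating": "Good"},
    {"region": "East", "rating": "Poor"},
    {"region": "West", "rating": "Good"},
    {"region": "West", "rating": "Good"},
    {"region": "East", "rating": "Good"},
]
table = crosstab(records, "region", "rating")
print(table)
```

Each row of the result is one value of the first variable and each column is one value of the second, so the cell counts show how the two variables relate.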
Scatter charts
● Relationship between two numerical variables
Line charts:
● Similar to scatter charts with the dots connected
● Useful when one variable is time
Bar charts:
● Require at least one categorical variable (like how many are in
Jan or in Feb or Mar). Should the axis include 0? Always include 0
in bar charts.
● When there are two categorical variables, a clustered column
chart is usually the best choice
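A small arithmetic sketch of why the 0 matters: the reader compares bar lengths, and a bar's drawn length is its value minus the axis baseline. The function name and the monthly counts below are hypothetical.

```python
def drawn_length_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar lengths for values a and b, given the axis baseline."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Hypothetical monthly counts, invented for illustration.
jan, feb = 100, 95

honest = drawn_length_ratio(jan, feb)                   # axis starts at 0
truncated = drawn_length_ratio(jan, feb, baseline=90)   # axis starts at 90

print(round(honest, 2), truncated)
```

With the axis starting at 0, the Jan bar is about 5% longer than the Feb bar, matching the data. With the axis starting at 90, the Jan bar is drawn twice as long as the Feb bar.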
Pie charts:
● Suggestion: avoid pie charts.
● Frequently used to compare categorical data.
● Compare information using areas and angles, which are very
hard for humans to judge.
● Most loathed graph of all time
Bubble charts:
● Visualize three variables in a two-dimensional chart
Heat Maps:
● Use colors to convey information
Scatter-Chart Matrix:
● Study the relationships across many pairs of (numerical)
variables
Distribution plots:
● Histograms: Distribution of one numerical variable. Make sure
the bin widths are proportional and chosen so that the chart
"tells a story"
● Box plots: (side-by-side) Useful for comparing subgroups. Can
be split by a categorical variable. Use this chart to identify
outliers and explore whether they are due to experimental errors.
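Both ideas can be sketched with the standard library. Two assumptions are not from these notes: the histogram uses equal-width bins, and outliers are flagged with the common 1.5 × IQR convention that many box plot implementations use. The function names and the data are hypothetical.

```python
import statistics


def equal_width_counts(values, n_bins):
    """Count the values in n_bins equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value goes into the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def iqr_outliers(values, k=1.5):
    """Values more than k * IQR outside the quartiles (assumed convention)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Hypothetical measurements; 48 looks like a possible recording error.
data = [10, 12, 12, 13, 14, 15, 15, 16, 18, 48]

print(equal_width_counts(data, 4))
print(iqr_outliers(data))
```

A flagged value is only a candidate for investigation; whether it is an experimental error still has to be checked against how the data was collected.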
Jittering:
● Uncrowds the data by allowing more markers to be seen
● Moves each marker by a small random amount
● Tells a better story: when observations are stacked on top of
each other, jittering spreads them out to show how many there are.
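A minimal sketch of jittering, assuming a uniform random offset; the function `jitter`, the `amount` parameter, and the ratings are hypothetical. Only the plotted positions change, and the original values are left untouched.

```python
import random


def jitter(values, amount=0.1, seed=None):
    """Return a copy of values, each shifted by a random offset in [-amount, amount]."""
    rng = random.Random(seed)
    return [v + rng.uniform(-amount, amount) for v in values]


# Hypothetical survey answers on a 1-5 scale: many markers share a position.
ratings = [3, 3, 3, 4, 4, 5]
jittered = jitter(ratings, amount=0.2, seed=42)

print(jittered)
```

The offset should stay small relative to the spacing between real values, so that a jittered 3 is never mistaken for a 4.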
Map Chart: Comparing countries
Geographic information systems (GIS): A system that merges maps
and statistics to present data over different geographic areas.
Data Dashboard: Data visualization tool that illustrates multiple
metrics and automatically updates these metrics as new data become
available.
Key performance indicators (KPIs) in dashboards, for example:
● Automobile dashboards: Current speed, fuel level and oil
pressure.
● Business dashboard: Financial position, inventory on hand,
customer service metrics.
Principles of effective data dashboards
● Present all KPIs on a single screen that users can quickly access
● Provide timely summary information on important KPIs
● Call attention to unusual measures that may require attention
● Color should be used carefully
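One way to sketch "call attention to unusual measures" is to compare each KPI with an expected range and flag the ones outside it. The function `flag_unusual`, the KPI names, and the ranges below are all hypothetical.

```python
def flag_unusual(kpis, expected_ranges):
    """Return the names of KPIs whose current value falls outside its expected range."""
    flagged = []
    for name, value in kpis.items():
        low, high = expected_ranges[name]
        if not low <= value <= high:
            flagged.append(name)
    return flagged


# Hypothetical automobile dashboard readings and acceptable ranges.
kpis = {"speed_kmh": 88, "fuel_level_pct": 6, "oil_pressure_psi": 40}
expected_ranges = {
    "speed_kmh": (0, 120),
    "fuel_level_pct": (10, 100),
    "oil_pressure_psi": (25, 65),
}
needs_attention = flag_unusual(kpis, expected_ranges)

print(needs_attention)
```

A dashboard could then reserve color for the flagged KPIs only, which fits the advice to use color carefully.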
Tableau benefits:
● Quick, interactive visualization; easy to use (drag and drop, a
couple of menus, no coding); merges different datasets; handles
more data than Excel
Tableau disadvantages:
● Limited for data preparation: use Excel to clean data, fill in
missing values, or create new variables, and then use Tableau for
data visualization
Dimension: Categorical data
Measure: Numerical data