Uploaded by Abdul Rehman Abbasi

Data Visualization: Concepts, Techniques, Best Practices

Data Visualization: Key Concepts, Techniques, and Best Practices
by Dr. Abdul Rehman Abbasi
What is Data Visualization?
Data visualization is the graphical representation of information and data.
By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to observe trends, outliers, and patterns in data.
Importance of Data Visualization
Simplifies Complex Data: Makes large datasets easier to understand.
Reveals Insights: Highlights trends, correlations, and anomalies.
Improves Decision-Making: Provides a clearer basis for informed decisions.
Enhances Communication: Makes presentations more engaging and impactful.
Complex versus Simple
Source: NEPRA State of Industry Report 2023
Source: IAEA Annual Report 2023
Source: IAEA Annual Report 2023
Key Elements of Effective Data Visualization
Clarity: The visualization should be easy to interpret.
Accuracy: Data must be presented without distortion or bias.
Relevance: Focus on what’s important for the target audience.
Design: Use colors, labels, and layouts effectively to avoid clutter.
Common Data Visualization Techniques
1. Charts: Line Chart, Bar Chart, Pie Chart, Stacked Bar Chart
2. Graphs: Scatter Plot, Histogram, Box Plot
3. Maps: Choropleth Map, Heat Map
4. Specialized Visualizations: Tree Map, Network Graph, Gantt Chart
Common Data Visualization Techniques
1. Charts
• Line Chart: Shows trends over time.
• Bar Chart: Compares categories or groups.
• Pie Chart: Displays proportions or percentages.
• Stacked Bar Chart: Represents cumulative values across categories.
2. Graphs
• Scatter Plot: Shows relationships or correlations between two variables.
• Histogram: Displays frequency distributions.
• Box Plot: Summarizes distributions and outliers.
3. Maps
• Choropleth Map: Uses color shading to show data values across geographical regions.
• Heat Map: Highlights density or intensity of data points.
4. Specialized Visualizations
• Tree Map: Represents hierarchical data using nested rectangles.
• Network Graph: Shows relationships and connections.
• Gantt Chart: Tracks project schedules and timelines.
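Four of the techniques listed above can be sketched side by side with Matplotlib. This is a minimal illustration, not part of the original slides: the function name draw_basic_charts and all of the data (monthly sales, category totals, random samples) are hypothetical and generated only to give each chart something to show.

```python
import random

import matplotlib.pyplot as plt


def draw_basic_charts(seed=0):
    """Draw a line chart, bar chart, scatter plot, and histogram in one figure."""
    rng = random.Random(seed)

    # Hypothetical data, invented purely for illustration
    months = list(range(1, 13))
    sales = [100 + 5 * m + rng.randint(-10, 10) for m in months]
    categories = ["A", "B", "C", "D"]
    totals = [23, 45, 12, 36]
    xs = [rng.uniform(0, 10) for _ in range(50)]
    ys = [2 * x + rng.gauss(0, 2) for x in xs]
    samples = [rng.gauss(50, 10) for _ in range(500)]

    fig, axes = plt.subplots(2, 2, figsize=(10, 7))

    # Line chart: a trend over time
    axes[0, 0].plot(months, sales, marker="o")
    axes[0, 0].set_title("Line Chart: trend over time")
    axes[0, 0].set_xlabel("Month")
    axes[0, 0].set_ylabel("Sales")

    # Bar chart: comparison across categories
    axes[0, 1].bar(categories, totals)
    axes[0, 1].set_title("Bar Chart: comparison across categories")
    axes[0, 1].set_xlabel("Category")
    axes[0, 1].set_ylabel("Total")

    # Scatter plot: relationship between two variables
    axes[1, 0].scatter(xs, ys, alpha=0.7)
    axes[1, 0].set_title("Scatter Plot: relationship between x and y")
    axes[1, 0].set_xlabel("x")
    axes[1, 0].set_ylabel("y")

    # Histogram: frequency distribution of one variable
    axes[1, 1].hist(samples, bins=20)
    axes[1, 1].set_title("Histogram: frequency distribution")
    axes[1, 1].set_xlabel("Value")
    axes[1, 1].set_ylabel("Frequency")

    fig.tight_layout()
    return fig


if __name__ == "__main__":
    draw_basic_charts()
    plt.show()
```

The same data could be drawn with any of the other tools covered in this deck; Matplotlib is used here only because the slides already rely on it.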
Source: NEPRA State of Industry Report 2023
Source: NEPRA State of Industry Report 2023
Best Practices in Data Visualization
• Choose the Right Chart: Match the type of data with an appropriate visualization.
• Prioritize Simplicity: Eliminate unnecessary elements (e.g., 3D effects, excessive labels).
• Focus on the Audience: Tailor visuals to the knowledge level and interests of the audience.
• Use Colors Strategically: Use consistent color schemes to avoid confusion.
• Include Labels and Legends: Clearly label axes, data points, and categories.
• Maintain Proportionality: Ensure that the visual representation accurately reflects the underlying data.
Avoid overcomplicated visuals for simple datasets.
Use clean, uncluttered designs to focus on the data.
Provide context with titles, legends, and annotations.
Highlight key data points with contrasting colors.
Provide a legend for any symbols or color codes.
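Several of these practices can be combined in one small chart. The sketch below is an illustration under assumptions, not the author's method: the function name draw_highlighted_bars, the region labels, and the values are hypothetical. It keeps the value axis anchored at zero (proportionality), labels the axes and title (context), removes unneeded frame lines (simplicity), and uses one contrasting color plus a legend to highlight the key data point.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Patch


def draw_highlighted_bars(labels, values):
    """Bar chart that highlights the largest value with a contrasting color."""
    top = max(range(len(values)), key=values.__getitem__)
    colors = ["tab:orange" if i == top else "lightgray" for i in range(len(values))]

    fig, ax = plt.subplots(figsize=(7, 4))
    ax.bar(labels, values, color=colors)

    # Proportionality: start the value axis at zero so bar heights are comparable
    ax.set_ylim(bottom=0)

    # Context: title and axis labels
    ax.set_title("Units Sold by Region")
    ax.set_xlabel("Region")
    ax.set_ylabel("Units Sold")

    # Simplicity: drop the top and right frame lines
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)

    # Legend explaining the color code
    ax.legend(handles=[
        Patch(color="tab:orange", label="Highest value"),
        Patch(color="lightgray", label="Other regions"),
    ])
    fig.tight_layout()
    return fig, ax


if __name__ == "__main__":
    # Hypothetical values for illustration only
    draw_highlighted_bars(["North", "South", "East", "West"], [12, 30, 18, 9])
    plt.show()
```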
Tools for Data Visualization
• General Tools
  • Microsoft Excel, Google Sheets: Basic charting tools for small datasets.
• Advanced Tools
  • Tableau: Professional-grade tool for interactive dashboards.
  • Power BI: Integrates well with Microsoft tools for business analytics.
  • Google Data Studio: For creating interactive reports from Google data sources.
• Programming Libraries
  • Python:
    • Matplotlib: Simple plots and charts.
    • Seaborn: Statistical data visualization.
    • Plotly: Interactive and dynamic visuals.
  • R:
    • ggplot2: Sophisticated visualizations.
Matplotlib
    import matplotlib.pyplot as plt

    # Data
    # Numbers from 1 to 8 (the upper bound of range() is exclusive)
    x = range(1, 9)
    # Square of each number
    y = [num ** 2 for num in x]

    # Create the plot
    plt.figure(figsize=(8, 5))
    plt.plot(x, y, marker='o', linestyle='-', color='blue', label='y = x^2')

    # Add labels, title, grid, and legend
    plt.title('Square of Numbers', fontsize=14)
    plt.xlabel('X Label (x)', fontsize=12)
    plt.ylabel('Y Label (y)', fontsize=12)
    plt.grid(True, linestyle='--', alpha=0.6)
    plt.legend(fontsize=12)

    # Show the plot
    plt.show()
Seaborn
• Seaborn is built on top of Matplotlib and provides a high-level interface for drawing attractive and informative statistical graphics. It is particularly useful for visualizing univariate and bivariate data. Some of its advantages include:
  • Built-in Themes: Provides built-in themes for better aesthetics.
  • Statistical Functions: Includes statistical functions for better data insights.
  • Ease of Use: Simplifies complex visualizations with fewer lines of code.
  • Integration: Works seamlessly with pandas DataFrames.
Assignment-3(a)
• The following information is available for two alternative machines for a given service. Compare the true equivalent uniform annual cost of the two machines and indicate which one to choose. Draw a visualization to support the decision.

                                         Machine A   Machine B
Capital cost (in Million Rs.)            3,000       10,000
Life (in years)                          4           10
Salvage value (in Million Rs.)           None        2,000
Annual operating cost (in Million Rs.)   1,400       700
Min Return on Investment (ROI) in %      10          15
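One possible way to approach this comparison is sketched below. It assumes the standard engineering-economics definition, which the slides do not spell out: equivalent uniform annual cost = capital cost × capital recovery factor − salvage value × sinking fund factor + annual operating cost. It also assumes that each machine is evaluated at the ROI listed in its own column and that a salvage value of "None" means zero. The function names are hypothetical.

```python
import matplotlib.pyplot as plt


def capital_recovery_factor(rate, years):
    """(A/P, i, n): converts a present amount into a uniform annual series."""
    growth = (1 + rate) ** years
    return rate * growth / (growth - 1)


def sinking_fund_factor(rate, years):
    """(A/F, i, n): converts a future amount into a uniform annual series."""
    return rate / ((1 + rate) ** years - 1)


def euac(capital, salvage, years, rate, annual_operating):
    """Equivalent uniform annual cost under the assumptions stated above."""
    return (
        capital * capital_recovery_factor(rate, years)
        - salvage * sinking_fund_factor(rate, years)
        + annual_operating
    )


# Values from the table (Million Rs.); rates as fractions.
# Assumptions: salvage "None" is treated as 0, and each machine uses its own ROI.
machines = {
    "Machine A": dict(capital=3000, salvage=0, years=4, rate=0.10, annual_operating=1400),
    "Machine B": dict(capital=10000, salvage=2000, years=10, rate=0.15, annual_operating=700),
}


def draw_euac_comparison(machines):
    """Bar chart comparing the computed annual cost of each machine."""
    names = list(machines)
    costs = [euac(**machines[name]) for name in names]

    fig, ax = plt.subplots(figsize=(6, 4))
    bars = ax.bar(names, costs)
    # Label each bar with its value so the chart can be read without the table
    for bar, cost in zip(bars, costs):
        ax.text(bar.get_x() + bar.get_width() / 2, cost, f"{cost:,.0f}",
                ha="center", va="bottom")
    ax.set_ylim(bottom=0)
    ax.set_ylabel("Equivalent uniform annual cost (Million Rs.)")
    ax.set_title("Annual cost comparison")
    fig.tight_layout()
    return fig, costs


if __name__ == "__main__":
    draw_euac_comparison(machines)
    plt.show()
```

If the assignment intends both machines to be compared at a common rate, the rate entries in the machines dictionary can be changed accordingly; the rest of the sketch stays the same.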
Assignment-3(b)
Assignment-3(b): Draw the following:
Bar Chart:
• Compare Units Sold or Revenue for each Region or Product.
Line Chart:
• Show trends in Revenue or Profit Margin over Months for different Products or Regions.
Stacked Bar Chart:
• Represent cumulative Revenue by Region or Month.
Pie Chart:
• Show the percentage share of Revenue by Product or Region.
Heatmap:
• Visualize the Profit Margin or Units Sold across Regions and Products.
Scatter Plot:
• Display the relationship between Units Sold and Profit Margin for different Products.
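As a starting point, three of the requested charts might be produced with pandas and Matplotlib as sketched below. The assignment's dataset is not reproduced in this text, so the rows here are hypothetical placeholders; only the column names (Month, Region, Product, Units Sold, Revenue, Profit Margin) are taken from the task descriptions above. The function name draw_sales_charts is also an assumption.

```python
import matplotlib.pyplot as plt
import pandas as pd

MONTH_ORDER = ["Jan", "Feb", "Mar"]

# Hypothetical rows for illustration only; replace with the assignment data
sales = pd.DataFrame({
    "Month": ["Jan"] * 4 + ["Feb"] * 4 + ["Mar"] * 4,
    "Region": ["North", "North", "South", "South"] * 3,
    "Product": ["A", "B", "A", "B"] * 3,
    "Units Sold": [120, 80, 95, 60, 130, 85, 100, 70, 150, 90, 110, 75],
    "Revenue": [1200, 1600, 950, 1200, 1300, 1700, 1000, 1400, 1500, 1800, 1100, 1500],
    "Profit Margin": [0.20, 0.35, 0.18, 0.30, 0.22, 0.36, 0.19, 0.31, 0.25, 0.38, 0.21, 0.33],
})


def draw_sales_charts(df):
    """Draw a bar chart, a stacked bar chart, and a heatmap from the sales table."""
    fig, (ax_bar, ax_stack, ax_heat) = plt.subplots(1, 3, figsize=(15, 4))

    # Bar chart: total Revenue for each Region
    by_region = df.groupby("Region")["Revenue"].sum()
    ax_bar.bar(list(by_region.index), list(by_region.values))
    ax_bar.set_title("Revenue by Region")
    ax_bar.set_xlabel("Region")
    ax_bar.set_ylabel("Revenue")

    # Stacked bar chart: Revenue by Month, stacked by Region
    by_month = df.pivot_table(index="Month", columns="Region",
                              values="Revenue", aggfunc="sum").reindex(MONTH_ORDER)
    bottom = [0.0] * len(by_month)
    for region in by_month.columns:
        heights = list(by_month[region])
        ax_stack.bar(list(by_month.index), heights, bottom=bottom, label=region)
        bottom = [b + h for b, h in zip(bottom, heights)]
    ax_stack.set_title("Revenue by Month (stacked by Region)")
    ax_stack.set_xlabel("Month")
    ax_stack.set_ylabel("Revenue")
    ax_stack.legend(title="Region")

    # Heatmap: mean Profit Margin across Regions and Products
    margin = df.pivot_table(index="Region", columns="Product",
                            values="Profit Margin", aggfunc="mean")
    image = ax_heat.imshow(margin.values, cmap="viridis")
    ax_heat.set_xticks(range(len(margin.columns)))
    ax_heat.set_xticklabels(margin.columns)
    ax_heat.set_yticks(range(len(margin.index)))
    ax_heat.set_yticklabels(margin.index)
    ax_heat.set_title("Mean Profit Margin")
    ax_heat.set_xlabel("Product")
    ax_heat.set_ylabel("Region")
    fig.colorbar(image, ax=ax_heat, label="Profit Margin")

    fig.tight_layout()
    return fig


if __name__ == "__main__":
    draw_sales_charts(sales)
    plt.show()
```

The line, pie, and scatter charts follow the same pattern: aggregate with groupby or pivot_table, then pass the result to ax.plot, ax.pie, or ax.scatter.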
Assignment 3 Due Date:
Tuesday, January 7th, 2025, before 3 P.M.