Course Script

IMA Data Analytics & Visualization
Fundamentals Certificate™ – Course Script
Welcome
Introduction
Welcome to the IMA Data Analytics & Visualization Fundamentals Certificate course.
We are at a dynamic time in the profession as new technologies are emerging and creating a
lasting impact on the way we do business. The Digital Age is upon us, bringing with it challenges but also opportunities. As finance and accounting professionals, we are in a unique position to harness the power of these technological advancements and expand our roles from value stewards to value creators.
When you're ready, click to get started.
Course Overview
This course is based on IMA’s thought leadership found in its research and publications, along
with contributions from various leaders across the organization. You’ll hear straight from industry
experts who share insights into their proficiency and perspectives on topics such as the future of
the profession, emerging technologies, and data analytics and visualization.
Through four modules, you’ll be introduced to the impact of technology and analytics on the
management accounting profession and delve into understanding and applying data analytics and
visualization through a case-based scenario.
The four modules are:
Module 1 - Becoming Data-Driven will set the stage by introducing some of the changes that are
happening in finance and accounting, how business processes will be impacted, and what skills
you will need to succeed. You’ll also be introduced to concepts of data science, data analytics, and
data governance as an overall foundation of the course.
Module 2 - Visualizing the Present & Predicting the Future will discuss how to harness the power
of data through analytics and effectively communicate the data with visualizations. All of this will
be depicted using a fictitious case scenario that describes a dilemma that data analytics and
effective data visualization are needed to resolve. Throughout the module, as the
case progresses, a video series will elaborate on key visualization concepts and
techniques.
In Module 3 - Applying Data Analytics and Visualization, you’ll learn how to apply some data
analytics and visualization concepts covered in prior modules to solve issues presented in
the case-study scenario. After that, you’ll be in the driver’s seat as you take your newly
acquired knowledge on a test drive. You’ll navigate through various exercises relating to the
scenario
described in the Huskie Motor case study and apply the concepts to solve problems. This
practical approach to data analytics and visualization will prepare you for applying this
knowledge to your own organization.
In Module 4 - Conclusion and Final Assessment, we’ll wrap up the course with a summary and
provide you with some valuable resources that will enable you to expand your learnings and
dig deeper into these topics. You will then be presented with a final assessment that will test
your knowledge of all topics discussed throughout the course.
Data analytics can be performed using a multitude of software available in the market. In this
course, you’ll be introduced to many of these software tools. However, in order to perform the
requisite analytics, you are required to have access to Microsoft Excel. It will be necessary for
you to have working knowledge of basic functions, including pivot tables, in Microsoft Excel.
Acknowledgements
IMA would like to give thanks and recognition to Ann Dzuranin, Dean’s Distinguished Professor
of Analytics at Northern Illinois University and co-author of Huskie Motor Corporation:
Visualizing the Present and Predicting the Future case study, and Daniel Smith, the head of
innovation and founder of TheoryLane LLC, for their expertise and contributions to this course.
IMA would also like to acknowledge the other valued speakers and authors of the publications
and resources used in this course. For more information on all of the contributors, refer to the
biographies document within the Resources link above.
Course Requirements
Knowledge check questions are included throughout each module. The purpose of
these knowledge check questions is to assess your grasp of the material. You’ll have
two attempts for each question. You must complete all knowledge check questions
before taking the final assessment.
To earn your IMA Data Analytics & Visualization Fundamentals Certificate and NASBA CPE credit, you are required to pass the Final Assessment in Module 4 with a score of 70% or higher; the assessment is based on content from all preceding modules. After passing the final assessment, you can access and print your professional certificate, NASBA CPE certificate, and digital badge. The digital badge can be uploaded to your LinkedIn profile as well as shared on social media.
You have one year from the time you enroll in the course to complete it. You do not have to
complete the course in one sitting. If you exit the course before you finish, your current location
will be remembered, and you can pick up from where you left off when you return to the course.
Course Navigation
You can use the Main Menu as your guide as you navigate through the course. You may use the
Script tab to view the full script to follow along with the narration.
This course is designed to allow you to work at your own pace, either in one sitting, or by
completing some of the course, returning later and continuing right from where you left off.
By clicking on the three horizontal line icon in the upper left corner of the screen, you will be able
to open and close the course outline. Here, you may jump to any aspect of the course outline by
clicking on it. To close this outline, simply click the icon again.
On the bottom right-hand corner of your screen, there are two arrows. In this course, we’ll refer
to the arrow pointing to the right as the “next” button and the arrow pointing to the left as the “back” button.
Now, click the next button to continue.
Course Resources
The Resources link, located at the top left of the screen, includes many of the materials you will need for this course, including:
A course script
The course contributors’ bios
Spreadsheets to download
And other helpful materials
You may print the course script provided so that you can take notes as you proceed through the
course.
The spreadsheets found in the Resources link are to be used to complete and answer the
exercises in Module 3. You’ll need to download these files before you begin Module 3.
When you’re ready, click the NEXT button to learn more about this course.
Course Learning Objectives
Let’s take a moment to review the learning objectives for the IMA Data Analytics & Visualization Fundamentals Certificate™ course. At the conclusion of the course, you should be able to:
Recognize the impact of technology and analytics on the accounting profession.
Demonstrate how data analytics can influence organizational strategy.
Identify ways data visualizations effectively enable appropriate business decisions.
These are the overall learning objectives for this course. Each module will begin with a set
of unique goals that align with these overall course learning objectives.
Course Roadmap
As we previously outlined, this course consists of four modules. The beginning of each module
starts with the module goals. These goals outline the topics that will be discussed in that
module.
Each module is divided into lessons. You will see a menu screen as you begin each lesson.
The menu screen will list the sections that are included in that lesson.
We highly recommend that you go through this course in sequential order for an optimal learning
experience. Click Module 1 to proceed to Module 1 Lesson 1.
Module 1: Becoming Data‐Driven
Introduction
Welcome to Module 1, Becoming Data-Driven!
This module will introduce the changes to the accounting and finance profession being driven by advancements in technology and discuss the potential lasting impact of these emerging technologies.
When you're ready, click to get started.
Module 1 Goals
Upon completing Module 1, learners will be able to:
Recognize the impact of technological advancements on the finance and
accounting profession.
Describe the typical life cycle of data.
Explain the impact data governance has on business and its stakeholders.
Define types of data analytics and how each progressively creates value within
an organization.
Module Menu
Module 1 consists of two lessons, Lesson 1: Changes to the Profession and Lesson 2:
Making Sense of Data. After completing the lessons, we’ll wrap up the module with some
concluding thoughts.
Click Lesson 1 to proceed.
Lesson 1: Changes to the Profession
Section 1: Technology is Changing
As technology transforms the management accounting profession at an ever-increasing rate, are
you concerned about having the necessary skills to further your career? Do you want to enhance
your skill set but don’t know where to start? Through this lesson, we will define the essential
knowledge and skills needed to adapt to the modern workplace and set the stage for how you
will be able to develop these skills.
Video: The Future of the Profession
Now, let’s hear from IMA’s President and CEO, Jeff Thomson, as he discusses his thoughts on
the future of the finance and accounting profession.
Video Transcript (Est. time: 13:28 min):
Hello, I’m Jeff Thomson, president and CEO of IMA, the Institute of Management Accountants.
And I’m proud and honored to be speaking with you about the future of the accountant in
business, especially in a digital age. When you think about the challenge of digitalization of the
value chain, it really is a challenge, but it’s also an opportunity. So think robotics process
automation, think automation in general. Think about machine learning, cognitive computing.
The challenge, of course, is how to harness all of these technologies to add value. But a
challenge also relates specifically to our profession. Will this automation result in fewer jobs,
displaced jobs? I would argue that it will result in different types of jobs. Jobs that will add more
value of foresight and insight to the organization, but only if we commit to upskilling in areas
such as data analytics and data visualization.
So let’s talk briefly about the CFO team today and how it’s evolved over time. And as many of
you know, I was a CFO in industry during this evolution of the role expansion of the CFO and the
CFO team. So in the past, the CFO team had a very, very critical fiduciary responsibility and that
was safeguarding of assets, what we would call value stewardship, protecting value, internal
controls, accurately and fairly reporting on the financial condition of the corporation. But over
time, that
role has expanded to include those table stakes plus additional responsibilities in the area of
value creation, merger and acquisition activity, financial planning and analysis, and much more.
So the evolution has resulted in greater expectations for the CFO and his or her team. In fact, the
CFO is often referred to as the chief futures officer or the chief value officer. In many cases, the
CFO of large and small companies have responsibility for operations, for strategy, for IT, and for
more. So with this expectation, with this challenge comes the opportunity, to add greater
influence and greater relevance as individuals and as professionals in accounting.
So technology and analytics is sweeping us, and moving at such an incredible pace that you
almost can’t keep up with all of the technology, but really we must all embrace technology.
Millennials are digital natives. It comes very, very naturally. But even in the middle of our careers,
as we see some jobs being lost or displaced, it’s important that we learn new skills and new
capabilities, especially in data analytics, data visualization, data governance, and even storytelling.
When we’re trying to influence the organization on a business case or a new merger or a new
product or service, we have to tell a story. We have to tell a story beyond the numbers that’s
futuristic, inspiring, and
leads to great outcomes.
But the other aspect of learning and embracing technology is not just the Millennials and those
mid-career in terms of upskilling and staying current, and harnessing technology, but it’s also
tone at the top. It’s critically important that those at the top embrace analytics as the new science
of winning in the market, of being a competitive differentiator, to know your consumers even
better than they know themselves in terms of needs and wants and desires of future purchases.
So it’s very important that senior leaders embrace technology, not necessarily to become experts
in programming and coding, but to understand the value that analytics and technology can bring
to
a differentiated value proposition to support training, courseware, education, and more so,
many would say that it’s crunch time for finance and accounting. You know, finance talent
models are evolving quickly with a premium placed on data scientists, business analysts, and
storytellers. Everyone’s talking about it, the big four, Downmarket, McKinsey, Accenture,
etcetera, etcetera.
And so this is the trend we’re seeing to embrace technology, but to also step up and upscale in
the area of analytics so that we can add more foresight and insight. And the softer skills of us as
humans, such as empathy, active listening, professional judgment, and professional skepticism, really are not replaceable by a robot.
So some key messages, automation’s been around for a long time. What’s different now? Look,
we’ve been through multiple industrial revolutions. So why is automation the talk of the town, so
to speak, whether you’re a consumer or a professional. Well simple, the reason it’s different this
time is because we’ve programmed robots to be much smarter to think, to synthesize, to react,
to adapt. However, keep in mind that it’s human beings doing the programming and harnessing
the computing and cognitive power of the machine and not vice versa. Second key message,
look, we have a choice here. We’re at a turning point in our profession. We could choose to be
more
relevant and influential or less relevant and influential. Think about it this way. As automation
over time replaces or displaces more routine, repetitive tasks, it is a reality that some of our jobs,
some of our tasks will go away, especially in auditing and transaction processing. That’s at one
end of
the spectrum. At the other end of the spectrum, if we say, you know what? This advanced
analytics and providing insight and foresight is not what I was trained to do as an accountant. It’s
really not for me. I’m going to outsource the analytics and the deeper thinking, the deeper
judgmental capabilities to a consultant, to a machine, to whatever. Well, my question to you is
what’s left? What’s left of our great profession. So this is a call to action for us to upskill in
strategy management, data science, data analytics, and more. It’s a business imperative critical to
the
future of our profession. So nothing less than our relevance and influence is at stake.
And when you think about the rate of change of technology, I mean just two years ago, who knew
what blockchain is or was. Today we’re talking about more and more use cases of blockchain in
financial services and insurance. And more, five years ago we weren’t even aware of what
robotics process automation is all about. And now there’s hundreds of use cases in organizations
large and
small to create greater operational efficiencies. And then the final point, closer to home as
your CEO, CEO of IMA, is we are absolutely committed to being leaders in preparing you for a
great future. By the way, that future is now.
So I’ve referred to data science several times, and it could be a little bit intimidating. Change is
hard. Upskilling is necessary, but data science is really the overlap, the intersection of three
broad domains that I’d like to describe to you in turn. The first is business context. You know,
whether you are an FP&A professional (financial planning and analysis), a statistician, an
econometrician, regardless of your level of analytical horsepower, you have to understand the
business question being asked. What is the problem we’re trying to solve? What new discoveries
are we trying to seek? As we mine through data and understand data, what are the opportunities
to learn something else to put us at a great competitive position? So business context,
understanding how customers behave, understand how value and cash flows in the organization
within your industry vertical is very, very important. To ask the right questions and to seek
answers in the data as a data explorer, business context, critically important. The second element
of data science is the technical modeling, some level of statistics, some level of knowledge of
forecasting tools and techniques. And this can be a little bit intimidating and overwhelming, but it
also is very manageable. If you focus on the problem at hand and the opportunities to be seized,
I’m not suggesting that every finance and accounting professional be a Ph.D. statistician. But I am
suggesting that the more data science, the more statistics you know and can apply, the stronger
you will be in this brave new world of digitalization. There was actually a book called Exploratory Data Analysis, written by a famous statistician. And that very much is what statistics is
all about. It’s not just reporting averages and standard deviations, it’s to kind of seek new
possibilities by mining through large sets of data and doing forward-looking analysis. In fact,
according to Glassdoor, data scientist is the No. 1-ranked job in terms of satisfaction and entry-level pay. So we need to infuse more and more of that at a reasonable pace, at a reasonable level
into our profession. And then the third component of data science is the actual applications. It
could be Excel, advanced analytics and data mining in Excel. It could be open source software like R, visualization like Tableau, Python, Power BI. So learning these tools and application
capabilities is also critically important.
I made reference earlier to a book that while it’s over 12 years old is timeless and it’s called
Competing on Analytics: The New Science of Winning. It’s one of my favorite books because the title
speaks volumes for what we must be doing in a competitive age. When you think about winning
in the market, you think about apps and things that face consumers, not things that are kind of in
the background, but make no mistake about it, in an incredibly treacherous competitive environment with geopolitics at play and nontraditional competition, analytics is and can be the new
science of winning. But you know, analytics should not be equated with information technology.
It is the human and organizational aspects of analytical competition that truly differentiates. You
know, there is a language barrier though when we think about analytics, data scientists, IT
professionals, finance and accounting. We’ve got to get over it. We’ve got to solve it. We’ve got to
become data translators. Finance and accounting professionals need to learn and understand
more about IT. IT and information systems professionals need to learn and understand more
about the language of finance and accounting. Because guess what? At the end of the supply
chain, consumers and shareholders are expecting that we’re doing that translation. So there’s
plenty of opportunities to upskill in these areas and to speak the language of the machine and
speak the language of business and who better than management accountants, CMAs, CFO
teams to do that translation to create great, great outcomes. How to prepare for the future of
work.
In summary, look, there are no simple or prescriptive solutions. The environment is moving too
quickly and too rapidly. Technology is evolving very, very quickly and rapidly, and we must not
only keep pace, we must stay ahead. So as we embrace this new future, which is here today, we
need a combination of technological know-how, problem solving, and critical thinking as well as
the human skills, the soft skills such as perseverance, collaboration, and empathy. We are absolutely committed to preparing you today and in the near and far future, whether it’s five years out or 10
years out, that’s our obligation to you. Thank you.
Innovation and Change
These emerging technologies continue to drive innovation and change throughout the business
landscape. While these specific trends can seem abstract, accounting and finance professionals
do need to have an understanding of what these technologies are.
Arguably more important than any specific technical knowledge, however, is the ability to
leverage these innovations to become strategic partners.
Section 3: The Changing Role of the Management Accountant
As management accountants adapt to new technologies, they are well-suited to
increasingly assume the role of facilitator, using the breadth and depth of their knowledge
to find opportunities for improvement and craft a shared vision for product stakeholders.
Management Accountants' Skills
Management accountants already possess most of the skills needed to implement advanced
analytics: They have a holistic view of business, they intuitively understand the interrelation
between financial and strategic business decisions, and they have strong written and verbal
communication skills. When management accountants can combine these skills with
technological knowledge and increased aptitude, they have the power to add further value to
their organizations.
In this course, we’ll explore how to build on these skills by understanding some of
the current technology trends. These trends lead to the creation of data that can be analyzed
for value-added, strategic decision-making.
Section 2: Keeping Up With the Trends
As a management accountant in the digital age, it’s important to have a fundamental
working knowledge of technological trends. Those include robotic process automation,
artificial intelligence, and blockchain.
Important Technology Trends Defined
Let’s take a moment to define each of these.
Robotic Process Automation (RPA) is an advanced and intelligent form of process
automation leveraging software tools (or software robots) to effectively and efficiently
perform tasks traditionally performed by humans.
Artificial Intelligence (AI) is a field of computer science concerned with empowering machines
to think, behave, and act like human beings to draw conclusions and make judgements.
Examples include speech recognition, machine learning, and natural language processing.
Blockchain is a decentralized and distributed ledger of encrypted information. Uses for
blockchain include cryptocurrencies, smart contracts, and transparent proof of work.
Management accountants can leverage emerging technologies such as AI and RPA to streamline
processes and decrease the amount of time spent on repetitive tasks. They can instead focus
their time on strategy and decision making.
Leveraging Technology For Leadership
Once accounting and finance professionals are armed with the knowledge of which technology
trends are currently generating attention and attracting investment, there are several
approaches and tactics they should embrace as they leverage technology to elevate themselves
as strategic thinkers and leaders. Click each icon to learn more.
Communicate Implications
Be able to communicate the implications of technology for the organization and its strategic
priorities. Not every new technology platform or tool is going to be equally helpful for every
organization. Regardless of whether it’s RPA, AI, or some other emerging technology, every
tool has its own set of pros and cons. Taking an objective perspective on technology and
articulating the pros and cons of these tools are critical steps for effective leaders to take in an
increasingly digital world.
Link Stakeholders
Link different stakeholders across the company into the conversation. One of the most common
pitfalls that can trip up even the most sophisticated and well-prepared organizations is when
important dialogue occurs in silos. Discussing technology tools and options with only the
accounting and finance team isn’t going to drive change and might very well lead to project
failure. True leaders and strategic management teams understand that involving as many
participants from different departments as possible creates a more diverse team, better
solutions, and more robust use cases.
Identify and Present Easy Wins
Identify and present easy wins. Having a conceptual understanding of, and a vision for, how
technologies will eventually transform a profession, subset, or field is a great start but
doesn’t usually generate actionable ideas for professionals to implement. After
communicating the potential of a technology or process and linking in a variety of
stakeholder groups, presenting
goals that are achievable in the short term and documenting first steps toward achieving them
will get the ball rolling within the organization, increase the likelihood of stakeholder buy-in, and
help to create enthusiasm for future projects that are likely to be more complicated.
Section 4: The Big Picture of Data
The best way to begin is to learn the terminology surrounding data. The first term is “architecture,” and just like when designing a building, the architecture is the overall structure of whatever it is we’re trying to create.
The next term is “solutions” because the only 100% consistent aspect of any analytics project is
that we are trying to solve some problem or answer a question. “Solution architecture” may
take many forms from a simple report performing calculations once a week to an automated
artificial intelligence (AI) ultra-high-frequency trading application.
“Patterns” are templates or guidelines that solution architects will reuse frequently.
Architect a Solution
In our new data-focused terminology, we hope to architect a solution using an established
pattern. There are different types of solutions that may use different patterns and/or expertise.
At the highest level (a.k.a. “least detailed” or “10,000-foot view”), we typically group solutions into
domains.
Just as a building architecture may require domain solutions in landscaping, interior design,
plumbing, and more, an analytics solution architecture may include domains such as data
storage, data creation, and different business specializations. To architect effective analytics
solutions, we require knowledge and skills in multiple domains of business data.
High‐Level Domain Design Pattern
Next, we use the following high-level domain design pattern: data creation, data storage,
and business domain(s). The business domain may include a single business domain, such
as marketing, or many, such as marketing, supply chain, and finance.
Data Governance, a set of processes and policies that help an organization manage and secure
its data, which we will discuss in later lessons, is an integral part of every domain.
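To make this pattern a little more concrete, here is a minimal, hypothetical sketch (not part of the official course materials) that expresses the high-level domain design pattern as a simple Python data structure. The domain names and responsibilities are illustrative assumptions only.

```python
from dataclasses import dataclass, field

@dataclass
class Domain:
    """One domain within a high-level analytics solution architecture."""
    name: str
    responsibilities: list = field(default_factory=list)
    governed: bool = True  # data governance is integral to every domain

# A hypothetical solution architecture following the pattern:
# data creation, data storage, and one or more business domains.
solution_architecture = [
    Domain("data creation", ["capture customer activity", "ingest connected-device events"]),
    Domain("data storage", ["raw transaction store", "aggregate data store"]),
    Domain("marketing", ["campaign analytics"]),
    Domain("finance", ["forecasting", "variance analysis"]),
]

for domain in solution_architecture:
    print(f"{domain.name}: governed={domain.governed}, responsibilities={domain.responsibilities}")
```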
Section 6: Data-Driven Culture
There are four essential elements in establishing a data-driven organization. These include data-savvy people, quality data, appropriate tools, and processes and incentives that support
analytical decision making. Organizations attempting to adopt leading-edge analytics often face
challenges in each of these dimensions. The result is an inability to effectively support managerial
decision making through the use of analytical technologies.
Much of the focus on implementation of advanced analytics has been on the tangible elements of
a successful data-driven organization (people, data, and tools). Less attention has been paid to
the fourth factor: organizational intent. An organization committed to the goal of being data-driven will work to develop the people, data, and tools needed to accomplish that objective.
Resolve to Be Data‐Driven
Becoming a data-driven organization requires creating structures, processes, and
incentives to support analytical decision making. It requires the organization to resolve
to be data-driven and define what it hopes to accomplish through the use of Big Data
and analytics. The top leadership of the organization needs to describe how analytics will
shape the business’s performance.
Six Factors For a Data‐Driven Culture
While many organizations are striving to implement a data-driven culture, success isn’t
assured. Achieving this goal requires that certain elements be present. There are six key
factors for successfully establishing a data-driven organizational culture. Click each icon to
learn more.
Having the right tone at the top. Setting the right tone at the top is critical for most
organizational initiatives, and this includes developing a data-driven culture. In most
organizations, executives are championing the use of leading-edge analytics, although in some
companies the initiative is being led from the bottom up, with various departments being first to
embrace it.
In addition to championing the use of leading-edge analytics, executives in a data-driven
organization must consider ethics in the context of technology and analytics. While ethics is
not covered in this course, it is important for the executive team to incorporate ethics policies
for managing technologies.
Having strategies for the effective use of technology. The ability to use leading-edge
analytic techniques effectively is important for a variety of reasons. Companies whose decision
making is reactive to the competition are less likely to have developed strategies for the
effective use of techniques and technologies. Being reactive instead of proactive implies that
these organizations lack the ability to predict trends or to turn customer data into useful
insights that can be used to enhance the organization’s business.
Having a commitment to collecting and using data from both internal and external
sources to support analytics efforts. To harness the potential of leading-edge analytics,
organizations need to utilize a wide variety of data sources. This is especially true when it comes
to strategy development and execution. In this regard, about half of organizations use data from
both internal and external sources. Of concern is that the other half of organizations are only
using
internal data, only using data to validate strategy post-execution, or (in a few cases) not using
data at all! Using a wide variety of data sources yields better insights. Organizations that truly
want to derive value from their data must be comfortable with complexity and remain flexible
enough to respond to what the data tells them.
Using both monetary and nonmonetary rewards to promote analytical decision
making. Slightly more than half of organizations use incentives to promote analytical
decision making. These can be monetary, nonmonetary, or both. Yet nearly half of
organizations aren’t doing so. This may be a mistake: The use of incentives is key to conveying
the importance of developing
enhanced analytics capabilities throughout an organization. Those that do believe in the
importance of developing such capabilities are more likely to create the appropriate culture
by providing incentives to their employees.
Having a willingness to adequately provide resources to the analytics efforts. Organizations
often are facing resource challenges concerning the development of enhanced analytics
capabilities. By far, the most frequently cited challenge is the ability to find staff with the
necessary skill set. The next most common resource challenge is budget. A third challenge, related
to the previous two, is a lack of staffing resources and competing priorities. Clearly, these four
essential elements needed for companies to develop advanced analytical capabilities are
interrelated: data-savvy people, quality data, state-of-the-art tools, and organizational intent.
Alignment of analytics efforts throughout the organization. Responsibility for analytics can
reside in various parts of an organization. It has been argued that CFOs should “own” analytics
as they are regarded as impartial “guardians of the truth.” Most companies seem to agree, with
finance being an owner (although often not the sole owner) of analytics. Other popular options
include analytics being owned by IT, a dedicated analytics group, or operations, or having each department independently maintain its own analytics capabilities. Of course, these options
aren’t mutually exclusive, with a variety of possible combinations, the most popular being
analytics jointly owned by finance and IT.
The Benefits are Clear
The benefits of implementing a data-driven culture are clear. Organizations possessing such
cultures more effectively perform key business processes such as strategy formulation and
performance evaluation. In implementing such a culture, establishing processes and
incentives that support analytical decision making (i.e., organizational intent) is critical.
When deciding to venture along the path of implementing leading-edge analytics, evaluate the
extent to which the six factors discussed are present in your organization. By ensuring that
they are, you can improve the chances of successful implementation and achieving the
competitive benefits that come with being data driven.
Section 5: Business Domain Pattern
What does the interrelation of data creation, storage, and business domains look like?
Consider a traditional business selling widgets directly to customers.
Data is created by customer activity such as purchasing widgets or browsing websites. Data
integration moves data from the generating application into data storage. Raw data transactions are
cleaned and shaped with other data into an aggregate data store. Aggregate data is used for
descriptive, diagnostic, predictive, and prescriptive analytics by various business domains. The
business domains use the analysis to make business decisions on how to sell more widgets.
Business activity resulting from business decisions influences customer activity.
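As an illustration only (not part of the course exercises), the following minimal Python sketch, using the pandas library, mimics one turn of this cycle for the widget seller: hypothetical raw transactions are cleaned and shaped into an aggregate data store, and a simple descriptive summary supports a business decision. All table and column names are assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw data created by customer activity (widget purchases).
raw = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "widget": ["A", "A", "B", "B"],
    "quantity": [2, 1, 5, 3],
    "unit_price": [10.0, 10.0, 7.5, 7.5],
})

# Data integration and transformation: clean and shape raw transactions
# into an aggregate data store.
raw["revenue"] = raw["quantity"] * raw["unit_price"]
aggregate_store = (
    raw.groupby("widget", as_index=False)
       .agg(units_sold=("quantity", "sum"), revenue=("revenue", "sum"))
)

# Descriptive analytics drawn from the aggregate store informs a business
# decision (for example, which widget to promote), which in turn influences
# future customer activity and restarts the cycle.
print(aggregate_store.sort_values("revenue", ascending=False))
```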
The entire point of this cycle is for the business to better influence the customer. This could be
through product design, marketing, support, warranty, shipping, inventory, etc. The business
exists to make decisions that result in value to its customers translated into profit. Therefore, the
goal of an analytics solution architect is to accelerate the cycle of customer activity to business
activity.
https://sfmagazine.com/post-entry/january-2019-leverage-technology-to-become-a-strategic-partner/
Data Science
This is data science: the intersection of data, analytics, and business decision making. The current
state of data science is to accelerate the data-to-decision process.
The management accountant’s understanding of data science better facilitates the data-to-decision process for their organization. In this course, we will focus on the Statistics & Analytics component of data science.
Enhancing Analytical Capabilities Is Critical to Success
The Digital Age is upon us, and it brings with it challenges and opportunities for businesses.
Management accountants have the opportunity, and need, to develop their data and
technological skills so they can use advanced analytics and glean new insights from their data for
their organizations.
The collection, assessment, interpretation, and use of data are enabling companies to create new
business models and make existing ones more efficient. Most organizations now believe that
enhancing their digital and analytical capabilities is critical to their continued success and survival
and are leaning on management accountants to take the lead in facilitating this strategic
initiative.
As management accountants develop their skills and complete their repertoire of strategic
business competencies, they can share their holistic view of business and the interrelation
between financial and strategic business decisions to communicate the importance of
creating and establishing a data-driven organization.
Lesson 2: Making Sense of Data
Section 1: Introduction to Data
Big Data is top of mind for many finance leaders. The question is: How can we leverage all
this data to drive business success? How can we turn data into meaningful action?
The progression includes identifying key information from the data, turning that information
into knowledge, creating valuable insight from that knowledge, and taking action.
In this lesson, we will further define the progression and life cycle of data and introduce how
and where data analytics fits in.
Phases of the Data Life Cycle
Data is an essential part of any enterprise. An enterprise that’s agile and innovative requires an
understanding of data as it flows through the organization, interacts within various
departments, and transforms itself. Though an international standard for the data life cycle
doesn’t exist, the following phases, in order, are identified as typical during data life cycle
management. Click each image in the order shown to learn more.
Capture. Business is surrounded by data, but an enterprise needs to capture it in order to
make use of it. Data capture occurs in three major distinct ways:
· Data Entry. Manual or automated entry of data into the data warehouse to create new
data values.
· Data Acquisition. Acquiring or transferring data from an already existing data source
or data warehouse.
· Connected Devices. Internet of Things (or IoT), or the interconnection of computing
devices enabling them to send and receive data, has transformed, and will continue to transform, the way data
is captured by making it real time and continuous as devices listen to and interact with the
environment and each other. These devices capture and transmit the data so it can be stored.
Qualify. Have you ever wondered why month-/year-end close processes are prone to errors or
why reconciliations take such a long time? Inaccurate or incomplete data may lead to major
problems later in the data life cycle. These problems may include critical business processes
being held up, bad decision making, or final reports running afoul of compliance because of
erroneous data values. In this phase, data is assessed for its quality and completeness using a set
of predefined rules. (A brief illustration of such rules appears at the end of this set of phases below.)
The Capture and Qualify phases are traditionally seen as under the purview of the IT team,
which sets up the system architecture. But management accountants, with their knowledge of
accounting processes and the way data will be utilized in later phases, have the ability and
responsibility to envision the framework of the system in partnership with IT.
Transform. The advent of Big Data has led to enterprises being able to capture a seemingly
infinite amount of data. Couple this situation with IoT, and soon you can be drowning in data.
Thus, enterprises have to transform, synthesize, and simplify the data so it can be utilized by
functional departments. This phase is commonly called “analytical modeling” in the financial
world. A certain
level of functional expertise is required at this phase as data from different sources is
linked together to find the intrinsic value hidden beneath.
Utilize. The final aim of data is to help enterprises make good business decisions, and, in some
cases, data itself is the final product or service of the enterprise. Either way, the true value of
data is unlocked in this phase, and the previous efforts made in data capture, qualification, and
transformation finally bear fruit. Management accountants act as business partners during the
Transform and Utilize phases. As business partners, we need to translate the data values into
business stories that help enterprise leadership understand the magnitude of their decisions
and their long-term impact.
Report. This phase relates to external reporting. Internal management reporting for decision
making is realized during the Utilize phase in the data life cycle. External reporting could involve
quarterly/yearly financial reports, financial data sent to other vendors for bids, and other
compliance reports. Reporting of data is a key phase of the data life cycle that is ripe for
automation. Since rules, definitions, and requirements for these reports either rarely change or
have slight changes year after year, automated processes designed by management
accountants help to create and publish these reports in a more efficient manner.
Archive. This is the beginning of the end for the data that the enterprise has spent a considerable
amount of time and resources on to unlock its value. Data archiving is the transfer of data from
an active stage to a passive stage so that it can be retrieved and reutilized as needed.
Purge. The final phase of the data life cycle is the removal of the data (and any copies) from the
enterprise. It occurs in the data archive and is sometimes accompanied by a communication
both inside and outside the enterprise.
Financial compliance rules within an enterprise or those imposed by regulatory bodies
normally drive the Archive and Purge phases. And management accountants act as custodians
to ensure that these compliance rules are followed within the enterprise for financial data.
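The Qualify phase above assesses data against predefined rules. As a brief illustration only (the field names, values, and rules are assumptions made for this sketch, not part of the course data), the following Python snippet applies a few such completeness, validity, and uniqueness rules using the pandas library.

```python
import pandas as pd

# Hypothetical records entering the Qualify phase.
records = pd.DataFrame({
    "invoice_id": [1001, 1002, 1003, 1003],
    "amount": [250.00, -40.00, None, 125.50],
    "currency": ["USD", "USD", "EUR", "USD"],
})

# Predefined data-quality rules: completeness, validity, and uniqueness.
rules = {
    "amount is present": records["amount"].notna(),
    "amount is non-negative": records["amount"].fillna(0) >= 0,
    "currency is an allowed code": records["currency"].isin(["USD", "EUR"]),
    "invoice_id is unique": ~records["invoice_id"].duplicated(keep=False),
}

# Flag records that fail any rule so they can be corrected before the
# Transform and Utilize phases rely on them.
for rule_name, passed in rules.items():
    if passed.all():
        print(f"{rule_name}: OK")
    else:
        failing_ids = records.loc[~passed, "invoice_id"].tolist()
        print(f"{rule_name}: failed for invoice_id {failing_ids}")
```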
Section 2: Introduction to Data Governance
As users of systems, technology, and data, finance professionals are invested in the integrity
of our information and its sources. IT and data governance can provide the structure and
rules to ensure data accuracy and availability while managing the associated risks.
Every organization with shared data is concerned with data integrity. Data governance, a
specific sub-element of IT governance, parallels the capabilities of corporate governance and IT
governance at the data level. This can range from as simple as a set of rules specifying what data (for example, system fields) is to be entered by whom, when, and from what source, to as complex as you want (for example, multiple levels of data entry, audit, and control structures).
In this lesson, we will identify how organizations can build trust in data by implementing a
data-driven culture with effective and efficient data governance.
Data Governance and the Data Life Cycle
Data governance helps an enterprise administer the data as it flows through the various phases
of the data life cycle discussed in section 1 of this lesson.
During the Capture phase, enterprises need to identify the capture points for the data and
define the data that will be captured.
As data enters the Qualify phase, the rules of data governance act as a check to ensure
that inaccurate data is identified, assessed for completeness, and secured.
At the Transform and Utilize phase, focus shifts toward adherence to transformation rules and
the legal utilization of the data according to regulatory standards for decision-making purposes.
As
the Reporting phase is all about showcasing data to external parties, data governance lists
the steps to take when inaccurate data is reported outside the enterprise.
Archiving data relies on a set of rules that define what occurs, as well as when and how. And in the
Purge phase, it’s critical to set a purge schedule for the data as per the retention period
requirements.
Why Data Governance is Needed
Effective and efficient data governance can facilitate powerful analytics and decision making
across an organization. Demand for advanced analytics is increasing, and, as previously
discussed in the course, so, too, is the expectation that management accountants will perform
the analytics.
But how often have we heard of an analysis resulting in a business decision that later proved
problematic because of faulty data? As technology makes it easier for employees to access data,
write reports, and conduct their own analysis, data governance becomes an even more
important safeguard to ensure the integrity of underlying data.
As businesses become more data-driven, data governance provides the foundation for
growth into predictive modeling and automation.
Data‐Driven Culture
When asked, 99 percent of leaders of large organizations say they want a data-driven culture to
maximize the value of data through analytics. As we also previously identified, they aim to
make business decisions faster and more accurately through automation.
Why emphasize culture? The limitation for achieving analytics maturity isn’t usually related to
data or technology but, rather, people’s reluctance to use data and technology to answer
business questions; in other words, using data analytics rather than intuition as a driver for
business decisions. A shift is needed toward a culture that trusts that these data-driven decisions
will be effective.
Governance Attributes
For this analytics process to function effectively, the data inputs (“raw materials”) must be
consistent and reliable for the information outputs (“finished goods”) to be relevant and
comparable. Relevant, reliable, comparable, and consistent are the four desired attributes of
accounting information. Effective governance means that data used in decision making is of
consistent quality and from reliable sources. Efficient governance leverages connectivity and
technology to enable the comparison of data from many different sources and to deliver
relevant analysis.
Governance Problems
Unfortunately, only about one-third of data-driven culture initiatives succeed in larger firms.
Often the reasons for failure stem from insufficient data governance. If an organization suggests
its data quality is insufficient to use for decision making, that signals ineffective data governance.
If the data can’t be accessed, that’s a symptom of inefficient data governance. Click each icon to
learn more.
Ineffectiveness. “I think you forgot to adjust your dates for time zone…”; “I’ve heard ‘churn’ defined
three different ways today…”; “We can’t do that analysis; we don’t have good data….” All common
phrases heard in emerging data-driven cultures that struggle with effective data governance. Issues
with data and analyses, from little mistakes in calculations to instability in data sets, all build to
create a culture of mistrust in data and, by extension, mistrust in any decision based on that
data. The root cause of these issues is often ineffective data governance.
Inefficiency. Certain phrases signal inefficient data governance: “Will you email me that database
extract?”; “Why can’t I access that reporting database? There’s no way for me to get approval?”;
“It’s going to take more than a week to access?”; “I’ll just get someone to build out a new data
platform for my department.” Inefficient data governance is usually more difficult to resolve than
ineffectiveness. Sometimes it’s necessary due to regulations, e.g., open-source language
restrictions or the EU General Data Protection Regulation (GDPR); in other cases, it’s a symptom
of an organization failing to commit sufficient resources to data governance.
Blocked Access. Often the easiest way to govern a data set is to block access. But blocking access
can create inefficiencies. While no access means no one can compromise the data, it also means
no one can use it to improve the business.
Nonconnected Data Sharing is also inefficient. If users are constantly getting data from File
Transfer Protocol (FTP) or email, then it will be difficult for them to create a report that updates
automatically and impossible to automate decisions. Commitment to analytics governance
means giving people access to data in a way that facilitates analytics maturity to create further
value, even though that takes time.
Principles for Implementation
Efficient and effective data governance provides clear ownership and standards for data and
data processes to ensure data quality. Although many approaches exist for data governance
implementation, most share the following principles. Click each icon to learn more.
Accountability. There must be clearly defined ownership of, and accountability for, different
types of data. Interconnected data managed throughout the organization requires consistent
practices in order to maintain its effectiveness and value. In most organizations, data oversight
doesn’t reside within one department. Human resources is the keeper of employee-related
data, for example, while accounting maintains financial data. Shared governance brings
consistency by establishing organization-wide policies and procedures.
Standardization. Data is an asset and must be protected like one. Clear policies on access,
definitions, privacy, and security standards are needed. A data governance committee must define the policies,
and each department head must ensure adherence. This approach will ensure that the
organization is in compliance with regulations, such as GDPR (General Data Protection
Regulation).
Quality. Analysis is a critical tool for decision making and is only as good as the data upon which
it relies. The quality of data should be managed from the time it’s captured. Good data
governance includes defining one set of data-quality standards for the organization and
establishing consistency in how that data quality is measured and recorded.
We live in a fast-paced world where management accountants are asked to provide insight and
foresight through analytics, often on short notice. Data governance helps ensure that our data is
readily accessible and accurate. That’s especially true in situations where data is spread
throughout disparate systems and departments. Often the analysis we’re asked to perform
relies on data that we don’t oversee. By coordinating with other data owners in the organization,
we can
protect the integrity of data and spend more time on value-added analysis than on scrubbing
data.
The Committee of Sponsoring Organizations (COSO)
Standards for data and data processes to ensure data quality are outlined in various frameworks.
The Committee of Sponsoring Organizations’ (or COSO’s) mission is to provide thought leadership
through the development of comprehensive frameworks and guidance on enterprise risk
management, internal control and fraud deterrence designed to improve organizational
performance and governance and to reduce the extent of fraud in organizations.
The COSO Internal Control – Integrated Framework is a globally recognized framework for developing and assessing internal controls.
The purpose of the COSO Internal Control – Integrated Framework is to help management
better control the organization and to provide a board of directors with an added ability to
oversee internal control. Internal control enables an organization to deal more effectively with
changing economic and competitive environments, leadership, priorities, and evolving
business models. The framework identifies five components that comprise 17
principles. For more information, you can download the framework from the Resources link
above.
The COSO Internal Control – Integrated Framework
As a high-level overview of the COSO Internal Control – Integrated Framework, we’ll refer to the
COSO cube and two of its principles that can be applied to data governance. These principles
reside within the Control Activities and Information & Communication components of the cube.
Click each icon to learn more about these principles and how they apply to data governance.
Principle 11 states “The organization selects and develops general control activities over
technology to support the achievement of objectives.”
Ensuring an acceptable system of internal control over systems and processes will protect
stakeholders and the data integrity itself. Management must carefully select appropriate control
activities over the technology infrastructure that help ensure data completeness, accuracy, and
availability of technology processing.
Principle 13 states “The organization obtains or generates and uses relevant, quality information
to support the functioning of internal control.”
To achieve this, information systems should capture internal and external sources of data, and
process and transform relevant data into information that is timely, current, accurate, complete,
accessible, protected, verifiable and retained. Information must be reviewed to assess its
relevance in supporting the internal control components.
Although we pointed out two principles from two different components within this COSO Internal
Control – Integrated Framework, factors relating to technology must be considered throughout all five
components and seventeen principles to provide reasonable assurance of appropriate controls. An
effective system of internal control will demonstrate that all components are present and functioning in support of the achievement of objectives throughout the entire entity structure.
Section 3: Introduction to Data Analytics
Data analytics is the science of examining data with the purpose of creating actionable insight.
It gives you the ability to react quickly to an increasingly complex, volatile, and competitive
environment. Most organizations know that enhancing their analytical capabilities is critical to
their success and survival, helping them gain a competitive advantage or maintain their current market position by enabling managers and organizational leaders to make better
business decisions.
Data Analytics and Strategic Management
Data analytics must be anchored in the entire strategic management process: strategic
analysis, strategy formulation, strategy execution, and strategy evaluation.
It is all about quantifying business issues and making decisions with more accurate and fact-based data.
Business analytics and business intelligence use data mining (examining large databases to generate information and extract patterns), statistics, and modeling software to support data-driven business decision-making.
Data Analysis Models
There are many data analytics models available including regression analysis, classification
analysis, customer segmentation, market basket analysis, and others. Let’s take a moment
to define some of these models.
Regression analysis is a statistical model used for obtaining an equation that best fits a set of
data and is used to show relationships between variables.
Classification analysis attempts to find variables that are related to a categorical (often binary)
variable.
Customer segmentation, also known as clustering, groups customers into similar clusters,
based on the values of their variables. This method is similar to classification except that there
are not fixed groups. The purpose of clustering is to discover the number of groups and their
characteristics, based entirely on data.
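A rough sketch of clustering in Python, assuming the scikit-learn library is available; note that k-means asks for the number of clusters up front, so it only approximates the idea of discovering the groups entirely from the data. The customer attributes below are hypothetical.

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers described by annual spend ($000) and number of orders
customers = np.array([[5, 2], [6, 3], [25, 12], [27, 15], [50, 30], [52, 28]])

# Group the customers into three clusters based on their attribute values
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print(model.labels_)           # cluster assignment for each customer
print(model.cluster_centers_)  # average attribute values for each cluster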
Market basket analysis tries to find products that customers purchase together in the same “market basket.” In a supermarket setting, this knowledge can help a manager position or price various products in the store. In banking or other retail settings, it can help managers to cross-sell (sell a product to a customer already purchasing a related product) or up-sell (sell a more expensive product than a customer originally intended to purchase).
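A bare-bones illustration of the idea in Python, simply counting how often pairs of products appear in the same (hypothetical) transaction; a real analysis would typically use a dedicated association-rules library and measures such as support and confidence.

from collections import Counter
from itertools import combinations

# Hypothetical transactions; each inner list is one market basket
baskets = [
    ["television", "soundbar"],
    ["television", "soundbar", "hdmi cable"],
    ["refrigerator", "water filter"],
    ["television", "hdmi cable"],
]

# Count how often each pair of products is bought together
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(set(basket)), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(3))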
It is important to note that Artificial Intelligence and machine learning also play an important
role in performing data analytics.
Now you will be presented with an overview of how data analytics can be used to create value in an organization.
Video: Value Creation Through Data Analytics
Video Transcript (Est. time: 3:44 min):
Technological advances in gathering and processing data are increasing at an
unprecedented speed. Data analytics includes the extraction and analysis of data using
quantitative and qualitative techniques to gain insights, improve predictions, and support
decision making.
For management accountants, the ability to exploit this data through meaningful data analytics
is a critical component of the accounting and finance profession. In their evolving roles,
management accountants will be required to acquire enhanced skills in data mining, analysis,
and effective communication through data visualization. As business partners, management
accountants become storytellers, providing relevant hindsight, insight, and foresight to their
organization, effectively turning information into intelligence through data analytics.
Let’s take a few moments to review the key aspects of data analytics and their contribution to
the value creation within an organization. Value is added to an organization as the sophistication
level of the analytics being performed becomes more robust.
We begin by leveraging historical data of an organization to provide hindsight into what
happened. This type of analytics is known as descriptive analytics. Many accountants leverage tools in Excel, such as pivot tables and graphing, to perform descriptive analytics. There are also many other tools in the market being used to perform this type of analytics.
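For readers who work in Python rather than Excel, the kind of descriptive summary a pivot table provides can be sketched with the pandas library; the sales records below are invented for illustration.

import pandas as pd

# Hypothetical sales records
sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "East", "West"],
    "product": ["TV", "Soundbar", "TV", "TV", "TV", "Soundbar"],
    "revenue": [1200, 300, 1500, 900, 1100, 350],
})

# Descriptive analytics: total revenue by region and product (hindsight into what happened)
summary = sales.pivot_table(index="region", columns="product",
                            values="revenue", aggfunc="sum", fill_value=0)
print(summary)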
Although understanding what happened is very important, further hindsight explaining why
things
happened is even more valuable. The type of analytics that help us understand why things
happened is referred to as diagnostic analytics. Tools often used to perform diagnostic
analytics may include Excel’s customer segmentation, what-if analysis, and multi-variable
regression. Here again, other software products in the market are also being used.
Up to now, we’ve defined analytics that provide hindsight. What if we could gain insight into what
is likely to happen going forward? Predictive analytics adds value to an organization by
attaining this type of insight into what is likely to happen. Accountants can access Excel tools like exponential trend smoothing and Solver to perform predictive analytics. As software
sophistication continuously improves, the market will continue to expand upon product offerings
that provide this type of insight.
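As a rough Python analogue of the exponential smoothing idea mentioned above (not the Excel feature itself), pandas can compute an exponentially weighted average of a short, invented series of monthly unit sales and use the latest smoothed value as a naive next-period forecast.

import pandas as pd

# Hypothetical monthly unit sales for the last seven months
units = pd.Series([120, 118, 125, 122, 119, 115, 112])

# Exponentially weighted average: recent months carry more weight than older ones
smoothed = units.ewm(alpha=0.4, adjust=False).mean()

# A naive forecast for the next period is simply the latest smoothed value
print(f"Naive next-period forecast: {smoothed.iloc[-1]:.1f} units")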
Taking the value added through analytics even further, prescriptive analytics helps organizations attain the foresight needed to decide what an organization should do or what actions should be taken to create added value. Applying some of the predictive analytic tools from Excel may also help in performing prescriptive analytics.
Finally, adaptive analytics helps organizations gain further value by providing additional foresight into how machine learning can help. As organizations continue to innovate with new technologies such as artificial intelligence, more adaptive analytics will be applied.
Advances in technology and analytics are rapidly changing the role of the management
accountant, how their work is done, and which types of accounting functions are becoming
obsolete.
Embracing the changes brought on by artificial intelligence and further technological achievements will enable management accountants to play a more proactive role in
providing insight, transparency, and foresight as valued business partners within their
organization.
Understanding what it takes to perform relevant data mining and value-added data analytics, and to communicate effectively through data visualization, is a critical competency associated with the future of the accounting and finance profession.
ABC Electronics Data Analytics
You’ve just heard about how management accountants use data analytics to make informed
strategic decisions and increase company value. Data analytics provide hindsight, insight,
and foresight for helping companies achieve their goals.
Let’s learn about how the five types of data analytics can be used to tackle a declining
sales problem using a fictional scenario.
Let’s find out:
How the sales decline was identified and why it was problematic.
What they learned about reasons for the sales decline.
What options they had to choose from for addressing the problem.
Which options they chose to meet their goals.
ABC Electronics – Sales Decline Data Mining
ABC Electronics is a retailer of household electronics products including televisions,
refrigerators, washers and dryers, and more.
Through data mining, they discovered a decline in Brand C television sales.
ABC Electronics has an interactive dashboard that provides insightful visualization for analysis
and decision making. Take a moment to review ABC’s sales dashboard for their line of televisions.
Then click next to continue.
Data Analytics Phases
Use the interactive analytics value chart to learn more about how ABC Electronics made the
most of data mining and modeling to address the declining sales.
Working from hindsight to foresight, click on each analytics type to discover ABC Electronics
data revelations and decisions.
Descriptive Analytics
Descriptive analytics define a business problem. Raw data was cleaned, transformed,
and summarized. ABC then used this data to create their television sales dashboard.
The dashboard’s drill-down capabilities for descriptive information are how the Brand C sales decline was identified.
The dashboard revealed overall television sales were up 5%, but Brand C sales decreased 2%.
Diagnostic Analytics
Diagnostic analytics provides possible reasons for the business problem. It tries to find
correlations of the product’s activity using data.
In the case of ABC Electronics, Inc., diagnostic analytics tries to find possible reasons for the
sudden decline in the sales of Brand C televisions by searching for relationships between data
attributes and product activity. For ABC Electronics, attributes that could have an impact on sales include:
Time of year (seasonality)
New comparable TV from competitor
Product reviews
Product price
Social media mentions
Statistical modeling looks at the relationship between the attributes and sales. Visually, the more linear the relationship, the stronger the association between an attribute and sales.
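The relationship check described above could be sketched in Python with a simple correlation matrix; the weekly figures below are invented and are not taken from the case.

import pandas as pd

# Hypothetical weekly data for Brand C televisions
data = pd.DataFrame({
    "units_sold":        [80, 75, 70, 66, 60, 58],
    "competitor_price":  [479, 469, 459, 459, 449, 449],
    "avg_review_rating": [3.6, 3.5, 3.5, 3.4, 3.4, 3.3],
    "social_mentions":   [210, 190, 185, 170, 160, 150],
})

# Correlation of each attribute with unit sales; values near +1 or -1 suggest a strong linear relationship
print(data.corr()["units_sold"])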
What did ABC’s analysis find?
Reasons related to the attributes are:
1. The decline happened in July and August, a typical slow sales period.
2. The competitor’s product sells for less.
3. Customer reviews gave average ratings.
Predictive Analytics
Predictive data analytics can provide insight about what might happen given the current
circumstances. This is done through building analytical models, such as regression or market
basket analysis, to compare with actual results.
Based on their analysis, ABC Electronics’s prediction is that Brand C sales will continue to decline
due to the attributes of new competitor product, pricing, and customer reviews. This predicted
decline will result in leaving ABC with twice as much inventory as desired.
Is that what happened during the next sales cycle?
The results from actuals in the next reporting period matched the prediction. Brand C sales
declined 2%.
With the prediction proven correct, the model will be adjusted by assigning higher weighted
scores to the attributes associated with the decline in sales.
Prescriptive Analytics
As we just learned, predictive analytics are a company’s attempt to foresee how changes they
make might address a business need. Taking it a step further, prescriptive analytics uses complex
modeling to suggest actions to take in order to achieve possible outcomes.
The accuracy of possible outcomes depends on two important factors: data quality and model
quality.
With the prediction of twice as much Brand C inventory as needed, what do you think ABC’s best options are?
Two possible actions were identified:
1. Offer a 30% discount, which is predicted to improve sales by 10%.
2. Offer a package discount for purchasing a soundbar with the television, because analytics showed customers frequently buy a soundbar when buying a television.
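Purely to illustrate how such options might be weighed, the short Python sketch below compares option 1 against the status quo using baseline figures that are invented and not supplied by the case.

# Hypothetical baseline figures (not from the case)
baseline_units = 1000     # Brand C units expected to sell next period
list_price = 500.0        # list price per television, in dollars

# Option 1: a 30% discount predicted to lift unit sales by 10%
discounted_price = list_price * (1 - 0.30)
option1_units = baseline_units * 1.10
option1_revenue = option1_units * discounted_price

baseline_revenue = baseline_units * list_price
print(f"Baseline revenue: ${baseline_revenue:,.0f}")
print(f"Option 1 revenue: ${option1_revenue:,.0f}")
print(f"Extra units moved under option 1: {option1_units - baseline_units:.0f}")

A sketch like this makes the trade-off visible: the discount moves more inventory but at a lower revenue per unit, which is exactly the kind of comparison prescriptive modeling formalizes.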
Adaptive Analytics
Adaptive analytics is another resource to provide foresight about what might happen given what
has happened so far. Machine learning models incorporate actual results and continuously adjust
based on new data received.
Adaptive analytics collect Big Data into one central repository. Included in Big Data is information
related to sales, marketing, email, websites, and content management systems. Machine Learning
models use this data to make more accurate predictions.
Analytics at this stage are useful in validating the accuracy of predictions. For example, if a
prediction recommends a change in marketing strategy, adaptive data analytics tells whether the
change is working and how to adjust the model if it isn’t.
Adaptive analytics can also help find seemingly unrelated trends, such as the finding that customers who recently purchased homes and bought a television are also likely to buy five additional electronic products.
What did ABC Electronics do?
ABC Electronics used adaptive modeling to build customer profiles based on customer behavior data collected from web ads, emails, text messages, etc. Using the information from these specific customer profiles, ABC Electronics plans to offer personalized promotions such as pricing
discounts and package deals to increase sales and effectively reduce inventory.
ABC will continue to adjust their analytical models as new actuals data is received. With
continuous monitoring and adjusting, ABC will be equipped to make the best use of the data
available for making effective strategic decisions.
Analytics Value at Your Organization
Time and more data will tell if ABC Electronics is able to reduce the excess Brand C inventory.
While thinking about your organization and data analytics, what information is already gathered
that can tell a story about a business need? Or what information is not gathered that could be
helpful? How can the types of data analytics help drive decisions in your organization?
Data Analytics Competencies
According to IMA’s Management Accounting Competency Framework, within the “Technology &
Analytics” domain, data analytics competencies encompass extracting, transforming, and
analyzing data to gain insights, improve predictions, and support decision making. While the
required level of competency may vary, at a minimum it must include knowing what types of
analytical models are available and to what business problems they can be applied.
Beyond that, important, perhaps essential, skills include the ability to transform raw, unstructured
data into a form more appropriate for analysis (data wrangling), the ability to mine large data sets
to reveal patterns and provide insights, and the ability to interpret results, draw insights, and
make recommendations based on analysis. At the conclusion of this module, we will begin to work
with data sets and apply analytical models to begin making business decisions.
For more information on the IMA Management Accounting Competency Framework, refer to
“Management Accounting Competencies” under Resources on your player.
Concluding Thoughts
No Longer a Human Endeavor
Accounting is no longer solely a human endeavor. Similar to what has already occurred in
manufacturing, transportation, and medical industries, robots are now performing previously
people-driven accounting and finance tasks, including transaction matching, variance analysis, and
reconciliations.
Exponential Growth
Technology is transforming the management accounting profession at an accelerated rate, but
management accountants should not be concerned about job security or possessing the skills to
adapt to their new roles in the digital age. RPA, AI, blockchain, and any other technological
advancement that contributes to the exponential growth of available data can and should be
properly leveraged to make strategic business decisions.
Strategic Business Partner
When financial data is available within minutes, management accountants have the time to serve
as true strategic business partners and analyze and visualize the data so organizations can
respond more quickly to the marketplace, capitalize on innovation opportunities, ensure
continuous integrity, and, most importantly, uphold stakeholder and consumer confidence.
In the next module, we will focus on the development and practical application of data analytics
and data visualization for accounting and finance professionals.
Module 1 Wrap‐up
You have completed Module 1 of this course and should now be able to:
Recognize the impact of technological advancements on the finance and accounting
profession.
Describe the typical life cycle of data.
Explain the impact data governance has on business and its stakeholders.
Define the various types of data analytics and how each progressively creates value within an
organization.
Module 2: Visualizing the Present & Predicting the Future
Course Roadmap
Click Next to proceed to Module 2, Visualizing the Present & Predicting the Future.
Introduction
Welcome to Module 2, Visualizing the Present & Predicting the Future.
The goal of this module is to gain a deeper understanding of data analytics and data visualization
through the fictitious scenario depicted in the case study, Huskie Motors Corporation: Visualizing
the Present & Predicting the Future.
The case study is available in the Resources link of this course. You are not required to read the
case study ahead of time, however, it is available to download if you prefer to print it and follow
along.
When you're ready click to get started.
Module 2 Goals
Upon completing these lessons, learners will be able to:
Define data visualization.
Describe how data visualization can impact the way data is communicated.
Identify various data visualization tools and their different uses.
Recognize the importance of choosing the right visualizations based on your audience.
In Module 2, Knowledge Check questions are dispersed intermittently throughout the lessons. There
are a total of six knowledge check questions among the three lessons. Be sure to answer all six questions in this module before moving ahead to the next module.
Module Menu
Module 2 consists of three lessons:
Lesson 1: Case Study Introduction
Lesson 2: Huskie Motors Operations, and
Lesson 3: The Data Dilemma.
Click Lesson 1 to proceed.
Lesson 1: Case Study Introduction
Section 1: Introduction to the Case Study
As we have made clear, solid data governance is vital to the success of an organization’s strategy if it is to effectively work with, analyze, and make business decisions from the abundance of data made available through automation.
To establish a practical understanding of data analytics and data visualization, we will now
examine the opportunities and challenges created by Big Data and demonstrate how
management accountants can apply new competencies to the various data processes of an
organization.
We’ll now begin the case study Huskie Motor Corporation: Visualizing the Present and Predicting
the Future. Before we begin, it’s important to note that Huskie Motor Corporation and all of the
characters that are introduced are fictional and have no relationship to an actual organization.
Big Data
Organizations create and collect massive amounts of data as a result of their day-to-day
operations. Frequently referred to as Big Data, it represents an important asset for the
organization.
Big Data presents both opportunities and challenges for accounting professionals, who are
expected to know how this data is created, collected, stored, and accessed. As the custodians of
the organization’s assets, accountants are expected to understand and implement controls over
the storage and use of the organization’s data.
Big Data and the Management Accountant
As business professionals, accountants are expected to know how to use this vast source of data
to make better business decisions and identify potential risks. Understanding how to use Big Data
to formulate and solve business problems provides an opportunity for the accounting professional
to become a forward-thinking strategic partner in the organization. The challenge for accountants
is to develop the skill set needed to extract value from Big Data through advanced analytics.
Data Visualization
One skill that is becoming increasingly important for analysis of large data sets is
data visualization.
Data visualization is the process of displaying data to provide insights that will support better
decisions. Gartner’s 2017 Magic Quadrant for Business Intelligence and Analytics Platforms states
that “the visual-based exploration paradigm has become mainstream.”
Gartner identifies three platforms as leaders in visualization software: Tableau, Microsoft, and
Qlik. All three products provide relatively easy-to-use data visualization tools. In this lesson, you’ll
also learn about some of the other platforms that are available in the market. As new software is
introduced to the market, there may be other data visualization tools that are leading the industry.
Throughout this module, as you progress through the Huskie Motors case study, you’ll be taken
through a series of videos that discuss data visualization topics such as knowing your audience,
selecting the best visualization in context of the business question, various visualization
software, and much more. Dan Smith, accounting professional turned data scientist and leader
in data analytics, will take you through these important concepts on data visualization.
Video 1: Data Visualization ‐ Why Visualization
Now let’s hear from Dan Smith where he further introduces the concept of data
visualization, discusses why visualizations are important, and emphasizes the role and
function of data visualization in analyzing Big Data.
Video Transcript (Est. time: 9:34 min):
In this series of videos, we’ll review the tools often associated with creating data visualizations and provide a little context for when you would use which tool or which visualization in which scenario. As the case study progresses, you’ll see situations where you need to understand for whom you are creating the visualization. That is understanding your audience, and understanding your audience means asking: does this
audience need to explore the data and find their own answers to their own business questions, or does your audience simply need to be informed of the answer, or do they want a report only, providing a repeatable answer to a well-defined question? We’ll use these concepts of exploratory, informative, and reporting audience needs as a framework to talk about all the visualization concepts associated with those visualizations. We’ll also
dive into the source of the data and where your data visualization tools live: the data environment concepts. Again, that covers both the data visualization tools and how the audience receives the information provided by those tools, the visualizations that you create.
Finally, at the end of the case study, we’ll discuss the future of data analytics and the role of the management accountant in data analysis. With that out of the way, let’s talk about why visualizations are important. Visual analytics is formally defined as the creation and study of the visual representation of information, or the process of displaying data to support decision making, but what does that mean? What “the creation and study of a visual representation of information” is really saying is that we’re simply taking raw data and using it to create information through a visualization. Now, there’s an important distinction between data and information. Data are the raw materials for the creation of information: numbers, rows in a table, transactions before anyone sees them. Information is processed data, data which has been processed in such a way as to support some type of decision or to gain knowledge about the data or what that data represents.
And what about the need for visualizations? Why are they so important? In modern
analysis, visualizations are closely associated with how humans consume information,
how humans take data and process it as information. You see, humans naturally use
visual indicators to understand their environment. Way back at the beginning of humanity, we would use visual cues to know if a plant was ripe or a saber-tooth tiger was about to attack us. Well, understanding a bar chart or a line chart is not exactly the same as understanding the threat of a wild animal, but visibility into the state of our environment, specifically the business environment, is nonetheless very important for the modern human. And because humans are so well adapted to processing massive amounts of information visually, we can use visualizations to communicate complex ideas and interactions between groups of data using visual representations of those ideas and interactions.
Using visualizations is much more efficient than if we had to communicate that
information through text or speech alone. For example, imagine explaining double-entry accounting using only your words, no pictures, T tables, ledgers, or whiteboards. I remember my first accounting professor talked about debits and credits being on the right-hand and left-hand sides of a T table for the general ledger. He even stood on the table and had debit and credit written on his shoes to reinforce that visual
concept. You see, he was being a great professor because he understood that we had to
have a shared visual context of what was being explained. Without visualization, without
something tangible, you cannot know what the other person is visualizing in their mind.
It’s very difficult to be on the same page to share a vision if you don’t have something to
see in the first place.
So visualizations, in other words, create shared context between the audience receiving
the information and you the individual that’s communicating this context to them.
Speaking of shared context, let’s do a quick overview of the visualizations that we’re going
to be covering. Now, you may be unfamiliar with some of the terms that I use in this quick summary (factors, groups, aggregates, distributions, trends, etcetera), or unfamiliar with the visualizations themselves. Don’t worry too much about that. Just make a note. We’ll define
all of these later, either in the videos, in the course content, or through reference material
which I’ll provide to you.
So let’s get started on this quick summary. Bar charts are for comparing single values or aggregate values between a small number of groups, a small number of groups that all share a common numeric baseline. When comparing the composition of groups or factors within a larger group, a pie chart (for a single larger group) or a stacked bar chart (if you have many larger groups) are the visualizations typically used. When we display trends over time, we use lines or line graphs. When we have two separate fields, two separate sets of values, and we want to explore the relationship between them, that’s facilitated by scatterplots or principal coordinates plots. When we look for the relationship between many groups within those two values, so multiple factors within a set of two values, we use a colored, sized, or shaped scatterplot, or we may use something like a heat map to indicate high vs. low values within a table. It is known as a heat map just because it looks like there are hot and cold areas in your table; it’s a way of getting a quick visual indicator of hot spots while at the same time providing raw numerical context.
Going on to more advanced visualizations: when we want to look at
traditional statistical measures like average, mean, or median values (these are known as measures of central tendency), we need to understand how the data is distributed around that central tendency. So we use visualizations like histograms to see the distribution of data around the mean or median, so that we can see how spread out the data may be. If we want to compare multiple groups and their central tendencies and distributions, we would use a box-whisker or a candlestick plot. When we have a lot of information that we want to communicate, and perhaps we need to add an interactive component to it, allowing people to explore that information, we would use a dashboard. A dashboard is a combination of visualizations, oftentimes with either filters, which allow us to modify, that is filter, the underlying data that populates the visualizations, or a drill-down component where one visualization interactively modifies another visualization: I would be able to click on one so that I could see more details about that visualization.
That was a very fast summary. We defined visualization as the creation and study of the
visual representation of information. We learned visualization is important because it
assists with informing an audience and the exploration of data. Finally, we introduced
many visualization techniques. Again, don’t worry if you’re unfamiliar with some of the
visuals or the language I mentioned. We’ll go into more detail later and also there may be
some visualizations that you really like or that you’re really interested in that I didn’t
mention, like a Sankey diagram, waterfall chart, or ribbon chart. We’ll dive into the core
visualization concepts used across all of these visualizations, so the underlying shapes
and lines, the mechanisms of how they’re communicating information. This will allow us
to understand the fundamentals of all visualizations, visual analytics, everything from
simple bar charts to complex infographics that you find online. I look forward to this journey into visual analytics with you.
Next up, we’ll be talking about the tools used to create visualizations.
Video 2: Data Visualization ‐ Tools of the Trade
While the case study identifies three platforms as leaders in visualization software,
there are many others to be aware of. Dan will now discuss the various visualization
tools, and the strengths and weaknesses of each.
Video Transcript (Est. time: 13:37 min):
Let’s begin our discussion on visual analytics by discussing visual analytics tools. In this
video we’re going to talk about what, exactly, a visualization tool is and why these tools are becoming more and more prevalent in business. We’ll cover the general concepts applied
by visualization tools so you can begin to feel more comfortable using a wide range of tools
as opposed to feeling like you have to specialize in just one tool over another. And finally
we’ll outline the general categories of visualization tools: the spreadsheet-focused
ones, the ones that concentrate on visualizations first, the workflow-centric tools, and the
code-based tools. In the case study, they talk about how the visual-based exploration
paradigm has become mainstream. Now that’s a very consultanty way of saying that people
now expect to be able to connect and explore their data in a visual platform. Well, what do I
mean by visual platform?
Well, literally anything that can connect to a data source and modify the data in a way that
displays a visualization can be called a visual platform. Okay, fine. So what do I mean by
data source, and what do I mean by modify data? Well, we’ll get to all of that towards the
end of the case study when we’re talking about the data and data environment concepts.
But for now, understand that particularly when it comes to Big Data, you’re not going to
be able to pull all the data into a single spreadsheet because the data is simply too large.
But more on that later. Let’s focus for now on the tools themselves. So what, what do I
mean when I say tool? Because any of the tools, Tableau, Power BI, Qlik, Excel, Jupyter Notebooks, RStudio, RapidMiner, KNIME, the list goes on and on.
Any of these tools are able to create pretty much the same visual analytics. They just have
their own way of doing it. And that’s because all these tools apply the same fundamental
concepts. And I’m not talking about just visualization concepts, I’m talking about the
highest-level analytics concepts. You see we tend to use words like tools, concepts, and
technology interchangeably, but technology is the application of concepts using a tool.
Visual analytics tools apply analytics concepts to create useful information. Competency in
a tool is knowing how to build a bar chart in Tableau or in Power BI. Competency in the
concept is knowing when to use a bar chart or should you use a line chart or should you
use something else instead? How does the data need to be shaped so that you can do
that? How do I connect to the data? There’s a bunch of underlying concepts that make
these tools easy. And if you think about it, when you hire a contractor to oversee the
construction of your house, for example, would you care what kind of hammer they used?
Probably not. The tool isn’t as important as the person’s capabilities. Just like when you
first learned to be a management accountant, did you have to learn Excel first? I doubt it. I
don’t remember there being questions on Excel when I passed the CMA. Building an
analytics solution, visualization or otherwise, can be viewed through the same lens. Our
concern is how we are applying the visualization concepts so that they can be understood.
Now, this may seem like an unimportant distinction. Why am I going on and on about
this? Because it’s so important that you understand the difference between a tool and the
concept the tool is applying. That’s how you can see me jumping from Power BI to Tableau
to RapidMiner to all these different tools. It’s not because I’m a genius. No, I’m no smarter
than anyone else. It’s just I know the underlying concepts so I can easily jump from one to
another vs. having to learn specifically how to do something in one tool.
We’ve talked in depth about why concepts are important. What are the concepts? They are the same analytics concepts used in data science. You connect to the data, you transform that data into information, and you represent the data in an understandable way. The specific concepts will be covered in the visualization concepts video, and the data concepts and connecting-to-data concepts will be covered in the data and data environments concepts video. But at a high level we connect, we transform, we represent. So what are the categories of these tools? Well, and this is purely my personal experience, I feel there are roughly four general categories into which visualization tools fall. The categories that I have found are: spreadsheet-focused, things like Excel and Domo; visualization-first, the ones that focus on you being able to create a visualization, those are the famous ones like Tableau and Power BI.
Then you have ones that are centered more on the workflow, workflow process management utilities like RapidMiner and KNIME. Finally, there are code-centric tools like Jupyter Notebooks, Zeppelin, and even RStudio notebooks. Let’s dive into each one of these. First are our spreadsheet-focused tools, and Excel is the bread and butter of accounting. Many visualization features have been added to Excel. Excel itself is not a terrible visualization platform, particularly if you’re using pivot tables and pivot charts instead of making new tabs and new data sets for each visualization. Pivot charts actually work a lot like the other visualization tools, so if you’re using those, you’re probably comfortable with the visualization-focused tools already. Because of Excel’s market dominance, many apps have tried to mimic or extend Excel’s functionality. After all, if you can’t beat them, join them. Apps like Domo pitch the idea that you can do advanced analytics within Excel, but there are a lot of operations happening on the back end.
I’ll put up a little information in the course content with links to some of the platforms. But
I’d wait until you’re finished with all of these videos before you start exploring them so that you can better understand the data and data environment concepts. I hear people say “I need to see my data” as a common complaint about the other tool categories, particularly from people that started with Excel. I mentioned before, though, that sometimes you have terabytes of data. You have so much data, you can’t bring it into your computer. It’s just not practical to see all of the data. It won’t fit onto your computer. And even if it could, there’s so much of it that seeing each individual row isn’t going to provide that much value to you. There are plenty of ways where you can click on a visualization, a component of a visualization, and see what the details are about that point.
That’s called drilling down, which is a component in dashboards. And we’ll talk about that in depth in a future video. For now, sticking to the tools themselves, one of the primary ways of creating a dashboard, particularly interactive dashboards, is the visualization-focused tools. These visualization-focused tools might more accurately be called visualization-focused platforms, since they tend to encompass both the transformation of data, the data workflows, and the visualizations themselves; however, they’re visualization first, and they tend to abstract a lot of those data transformations. If I say abstract, it means that the tool is determining how to perform an operation, not the user, which may not always be ideal, particularly if you need to communicate how that tool is doing the transformation, since you can’t see the back-end operation with visualization tools. Though, as their name implies, they can look gorgeous. Those gorgeous visuals abstract a lot of complex operations down to relatively simple steps, so good-looking visuals can be created quickly. Because of all that abstraction, you can create interactive, data-connected dashboards.
Those dashboards can be made into recurring, data-connected reports. They can extract large data source queries from applications like Hive, Spark, BigQuery, Redshift, etcetera, for those exploratory operations. The major players in visualization-focused platforms are Tableau, Power BI, and Qlik, with lesser-known players like Birst, Pentaho, and Spotfire, and if you don’t have access to one through your workplace and you want to learn, both Tableau and Power BI have free versions, Tableau Public and the Power BI free license, respectively. In terms of the capabilities of these platforms, there are pros and cons to all of them. If you want to use R or create some of the more advanced visualizations, Power BI is considered by some to be a bit better; however, it uses an underlying language called DAX, D-A-X, in order to create functions. So if you need to create custom functions or custom operations within the platform, there’s a pretty steep learning curve.
Building custom calculations in Tableau, on the other hand, is generally easier because it has a more abstracted, SQL-like language that makes it easier to create custom calculations. But that level of abstraction can make it hard to extend the platform beyond making a custom calculation. So, due to the difficulty in exploring the workflow and the different operations used to create a visualization or an analysis, there are tools that focus more on the workflow, on the transformations that are done to your data set. If you look at things like RapidMiner and KNIME, they’re great for exploratory data mining and for creating visuals of the data exploration. All you really need to do is search the name of the concept you want to apply and you’ll see a module for it. They can be a little confusing for beginners and they often require a lot of local processing, but these are a good middle ground between the abstraction present in visualization tools and the more detailed operations found in code-focused tools. Speaking of the code-focused tools, things like Jupyter Notebooks, R notebooks, and Zeppelin: these are visualization tools second, code management tools first. They are fantastic for exploratory data analysis because you have a clear record of every transformation that happened to that data. You have a clear record of everything that’s happened because it’s right there in code. In the notebooks, you can write your research notes and your findings directly in the notebook and have the code recompile and be completely reproducible on demand. However, because you need to know the fundamentals of coding, and that’s beyond the scope of this course, we won’t be covering code-focused tools in these videos. However, I know you are a data scientist, and I know you already know the basics of coding even if you don’t think so.
I highly recommend you watch some introductory videos on basic Python or R to learn the concepts. So, in conclusion, in this video we covered what a visualization tool is and how the tools all apply the same analytics concepts but focus on different aspects of the analytics workflow and/or user experience. The analytics concepts they apply are connecting to data, transforming the data into information, and representing the information in a usable format. Visualization tools may focus on spreadsheet data, the visualizations themselves, the workflow to connect, transform, and represent data and information, or the underlying code behind all those data operations. The different tools all have advantages and disadvantages, and you’ll be able to accomplish much the same in all of them with varying degrees of difficulty and effectiveness. What’s more important is how you use the analytics concepts to provide information that solves business questions, as well as to explore data to find new business questions to solve. We’ll introduce concepts around communicating results to your various audiences for
visualizations in our next video.
Section 2: Huskie Motor Corporation: Background
Huskie Motor Corporation (HMC) is an automobile manufacturing company with production and
sales throughout the world. Automobile manufacturing and sales is a complex and highly
competitive business. Let’s explore the Huskie Motors background in detail to learn key business
information about the organization.
Huskie Motor Corporation
Although the automotive industry has a broad global reach, only 15 companies produce 88% of
the world’s vehicles. HMC is a new and smaller player in the automotive manufacturing market. If it is to survive, the company must fully understand its markets, customer base, and costs to keep profit margins positive. It has some very popular brands and high customer satisfaction; both are critical assets at this stage of the game.
HMC: Miranda Albany
Miranda Albany was hired at HMC as a senior cost analyst three years ago, when the company
first began operations after a spin-off from Blue Diamond Automotive, a large auto manufacturing
company. Recently promoted to assistant controller, Miranda is anxious to make a good
impression on her boss.
Communication with HMC Executives
Although Miranda is sure that the data she has collected can help her management team make
better decisions, she does not have the time, or expertise, to figure out how to organize or use the
data effectively.
Miranda communicates with HMC’s executive team on a weekly basis to convey vital information
regarding marketing strategies, sales targets, and production needs. Yet she feels that her
information is often “lost in translation,” as the executive team struggles to digest the numbers.
Miranda believes that data visualization may be a crucial component in helping her effectively
connect with HMC executives.
Video 3: Data Visualization – Know Your Audience
Miranda’s struggle to effectively communicate to her executive team highlights the importance of
understanding the audience’s needs, and how visualizations answer business questions. Dan will
discuss how to select visualizations based on your audience.
Video Transcript (Est. time: 13:34 min):
In this video we introduce the concept of understanding your audience’s interest, that is, the business question, and how it determines the types of visualizations you should use. We may create visualizations for an audience interested in exploring data, or perhaps for informing the audience, or simply for reporting a value. In our previous video, we introduced high-level analytics concepts. In this video we overlay those concepts of connecting to data, transforming data, and representing the information created by transforming the data with the data analytics process introduced in the case study, as well as the audience needs. Furthermore, the audience needs, the audience interest for visualizations, also overlap with analytical maturity concepts. The analytical maturity stages are descriptive, diagnostic, and predictive. Finally, we spend a little time on what exactly we mean when we say business question and what that has to do with the audience interest. So first let’s talk about the audience. It’s easy to assume that
visualizations and the concepts associated with visualizations are one size fits all.
We will soon learn in an upcoming video that there are rules for what visualizations work
best in a given scenario. So it’s only natural to assume that those scenarios are based on
the underlying data or information only. Who’s receiving that visualization should be
irrelevant, right? And sometimes there are boilerplate best practices. Sometimes there are
universal best practices for visualizations. However in most cases, context matters. The
audience matters.
For example, if the visualization explains information we discovered in the data, that is an informative visualization. The visualization would be very different from the numerous visualizations we would have used to explore the data in order to discover that new information that we are now informing someone else about. An informative visualization is simple. It's to the point. An exploratory visualization is usually quite complex.
Understanding which visualization to use and for whom is an important concept when it comes to visual storytelling. Visual storytelling is what you’re trying to get across. The idea of visual storytelling means that your audience is receiving the information that they need in order to best use that information. Somebody who would need a more exploratory type of visualization, or even an interactive dashboard, would be someone like an analyst, a manager, or a consultant, somebody who really understands that data and needs to understand the entire process of how some insight came about in order to really have buy-in and to really understand what’s happening. Alternatively, someone like an executive, a VP, or a director wouldn’t really care that much about how you reached some conclusion. They would just want to know what that conclusion was. You would still need the underlying evidence of how you reached the conclusion, of course, but it’s just the facts: tell me what matters. Those are simple informative visualizations. And when the exploration has informed people so much that they now are repeatedly asking the same question, that becomes something more along the lines of a report, very similar to informative, but now we’re talking about something that’s recurring, something that’s about stability. This could even be an external audience, a reporting-focused audience, but often it would still be the same people as exploratory and informative, just with a very, very simple type of visualization.
And if you notice, that progression, that general process of taking data and creating information for an audience, is similar to the data analytics process that we outlined in the case study. You’ll see all these steps: identifying the business question, connecting to data, cleaning it, etcetera. All of those involve exploratory visualizations. Once we’ve gotten the data, explored it, cleaned it, explored it again, and believe we have an indication of the answer to the business question, well, then we use more informative visuals, informative visuals that clearly explain the answer that we’ve determined through our exploratory effort. Many times the information that we’ve obtained isn’t a one-off answer. The information can provide an ongoing solution. For example, consider many of the financial metrics which we’ve all learned in earning our CMAs or in our basic finance courses when we first started our jobs as financial analysts: metrics and measures like the price-to-earnings ratio, CAPM, earnings before interest and taxes, even the debt-to-equity ratio.
These are not metrics or KPIs that have always existed. Somebody originally discovered that they are an indicator of business performance. So when informative visualizations answer a question that is asked repeatedly, often that informative
visualization will become a report. A report, again, is a visualization, a reporting visualization (even a table is a visualization); it needs to be extremely precise and to communicate the answer to a business question over and over again. Reporting, like financial reporting, is very rules-driven and very data-driven, and visual reporting requires more capability around consistency and the infrastructure of the data environment than around how effective the visualization may be at communicating the report. The effectiveness of the visualization has already been demonstrated in the exploratory and informative steps. Conceptually, the idea of reporting, informative, and exploratory also aligns with the steps of analytics maturity. Analytics maturity is a business theory indicating that as an organization becomes more capable of using its data to make informed business decisions, i.e., analytics, it will gain more value from its data.
The steps in analytics maturity are descriptive, diagnostic, and predictive. Many of you will
have also seen this before with prescriptive at the end; that’s beyond the scope of this video, but it’s out there when you look it up. Going back to descriptive analytics: descriptive analytics merely communicates some historical event and leaves interpretation wholly up to the end user, just like a financial report. Consistency and speed are typically the most important aspects of descriptive analytics. Diagnostic analytics goes a step further than descriptive. It’s not just describing some attribute of the data; it informs the audience with additional information, either to help the audience reach a more informed conclusion or to inform the audience about the conclusion of the analysis. As you can see, this is similar to informative visualizations. You will come across this in financial analytics, when they’re taking a metric found in a diagnostic measure and adding additional information to provide context as to why that metric may have gone up or down.
Predictive analytics uses statistical models to forecast new information. Given historical
information such as changes in a KPI and the additional data around it, it’s using that to
provide a forecast of what could be a future value.
Obviously, financial forecasting is a good proxy or is a good example of predictive
analytics. Now, let’s talk about business questions. A few times I’ve said understanding the
business question or defining the business question, what does that mean exactly? Pretty
much what it sounds like. It’s the question we need to answer to help achieve our objectives, the objectives usually being to solve some problem or fix a situation. But as we
know, a simple definition, a simple question, can hide a great deal of complexity. A
business question can be interpreted as what problem are we trying to solve? And in
many cases, though, if we can define the problem, somebody has already figured out how
to solve it. Identifying the business question is by far the hardest part of this entire
process. Case in point: I once had a client who, in the course of six months, rejected two different visual analytics contractors I had placed at their firm.
I personally went to that client at my company's expense and shadowed the third contractor for a week, and during that time I found that while the client had asked for a highly technical resource, somebody who was proficient in the visualization platform, they weren’t really defining their business questions. The client wasn’t defining what they needed out of the visualization platform. Their process of defining their business questions was just to have a kind of roundtable discussion with the contractor about what their problems were in general. The meeting was more of an airing of grievances than a proper requirements-gathering session. This was my mistake. I was young
and I thought if somebody was asking for specific skills, for example, a Tableau developer,
then they may only need those skills. I looked only at the problem as it was defined, not
the root problem itself. The lesson here is that just because somebody communicates their problem does not mean they are communicating the problem, the question that needs answering. This is the real skill of an analyst: identifying what question needs answering, not just the problem as it’s stated. Trying to solve a problem vs.
determining the actual problem which needs solving is the difference between a junior
developer and a senior analyst. It’s the difference between a purely technical skill and a
broader awareness of the business. It’s the difference between being reactive and
proactive. That may seem obvious to you but that’s because you’re probably one of those
senior resources, you’ve already accomplished the hard part. In this course it’s assumed
the root business question is already defined. It’s like how introductory physics assumes
everything is in a vacuum. Assuming the root business question is defined is a necessary
assumption for you to learn the technical competencies to answer said business question.
And many of you are taking this course because you feel the need to increase your
technical analytic skills.
Those skills are important. They are the price for entry. Take it from me, management
accountants are probably the best people I’ve ever met at understanding the real problem,
at defining the actual question. So use my hard-earned experience, if you’ve already
mastered the hard part of this process, and that’s understanding the bigger picture, the
bigger needs of the business, you’re already well on your way to being a data scientist, a
data analyst, whatever you want to call yourself. In conclusion, in this video we’re
beginning to categorize our visualizations based on the intended audience for said
visualizations. We may create visualizations for exploring data, informing our audience, or
simply reporting a value. The audience for visualizations overlaps with analytical maturity
concepts. Exploratory visuals are often used in predictive and diagnostic analytics, while
informative visualizations are used primarily in descriptive and diagnostic analytics. Finally, because the goal of analytics is ultimately to either answer a business question for your
audience or identify new business questions to answer, in which case you are often the
audience, we discussed what a business question actually means. In our next video, we’ll
dive deeper into those visualization concepts we keep talking about. Stay tuned.
Miranda Hires a Consulting Firm
To help her utilize the massive amounts of data at her disposal, Miranda has interviewed
consulting firms that specialize in information technology (IT) and data engineering.
Ultimately, Miranda chose D & A Consulting because of its automotive industry expertise and its
focus on data analytics and visualization.
Video 4: Data Visualization – Visualization Concepts
D&A Consulting’s expertise in the automotive industry will enable them to choose the best
visualizations to effectively communicate their findings. Now we’ll hear from Dan as he discusses
some basic visualization concepts and the various ways data can be visualized.
Video Transcript (Est. time: 6:32 min):
Now that we have some awareness around how our visualizations will be received by
different audiences, let’s move our discussion towards the visualizations themselves. This
video concentrates on visualization theory, namely how shapes, lines, and points are used
to display information. Shapes represent a single value or groups of values. Lines
represent change or connection. Points represent the intersection of two or more values.
Just as even the most advanced statistics are still based on addition, multiplication, and
division, even the most complex visualization is formed through some combination of
shapes, lines, and points. In this video I’ll be whiteboarding, because whiteboarding is how we often start defining questions or creating visualizations. In practice, it’s good for us to see some of those real-world principles in action, and yes, that means that we’ll basically be drawing shapes in this video. That may make visual theory seem a little childish at first, but I assure you we are not in elementary school again, because we will
quickly move from basic shapes to much more advanced concepts.
Let’s get started. We’ll begin our discussion on visualizations with a little theory, followed
by more practical examples in the upcoming videos. Generally we have shapes like bars,
blocks, or circles, sometimes separated into segments. These shapes tend to represent a single value or a single aggregate value, aggregate value as in a sum or total. For shapes, the pattern of data within the category, or the relationship between the categories, is not important. We’ll
talk more about what that means when we cover distributions and relationships
respectively. Next, we have lines. Lines tend to represent a change or difference between
values most often associated with a change over time. Lines are visually effective at
showing that one point is lower or higher than another when on the same axis.
Lines are also useful to reinforce size differences or position differences when shapes alone may
be ambiguous due to their ability to indicate direction or trend. Lines are useful to simplify
complex patterns as seen in forecasting and regression. Due to their ability to show
difference and changes between shapes, they’re useful for things like data distributions as
we see in histograms. For points, the position of a point in a visualization represents the
intersection of two values, which is why points are most often used in scatterplots. That
level of granularity is useful for recognizing larger patterns within the category or group,
as well as between the categories or values. As I mentioned, you’ll see lines used to
simplify trends or relationship within a collection of points like with a scatterplot and a
regression line. In fact, you may see shapes or colors used to distinguish patterns
between groups of points.
You can see that we mix the concepts of different shapes and lines and colors within
visualizations, so as a gentle introduction to some of the more complex visualizations, let’s
look at how points, lines, and shapes are used together to visualize more complex
patterns within and between groups of data. Visualizing patterns in the data itself is called looking at the distribution of your data. Distribution visualizations such as histograms and
box whisker plots combine both lines and shapes. There are even points showing outliers
at times, and when I say distribution of your data or frequency of data, it means how often
a given value appears within a given data element, or a group, or category, or column. It
can be the sale price of cars, age of customers, time between purchases, etcetera. The
frequency distribution or shape of data gives us more insight into the underlying data
which would not be apparent if we were only given the average or median data value.
The boxes in a box whisker plot represent the middle 50 percent of your data, with a line
that represents the median or the middle value of your data. Now you’ll also notice that
there’s a line above and below that isn’t a line so much as it is a T shape or whisker. Those
whiskers represent typically the top and bottom 25% of your data, but you may also see
lines between box whisker plots to show how different one box whisker is to another, so
you get a sense of how much overlap there is between the two. And keep in mind,
histograms and box whisker plots represent the same thing, the shape or distribution of
your data. You can look at box whisker plots as a histogram simply turned on its side.
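For those who want to see this for themselves outside the course tools, here is a minimal sketch in Python (the course itself only requires Excel); the sale-price figures and variable names are invented for illustration.

```python
# A minimal sketch (invented data) showing the same distribution as a
# histogram and as a box-and-whisker plot.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
sale_prices = rng.normal(loc=30_000, scale=4_000, size=500)  # hypothetical car sale prices

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: frequency of values within each bin
ax_hist.hist(sale_prices, bins=20)
ax_hist.set_title("Histogram of sale prices")
ax_hist.set_xlabel("Sale price")
ax_hist.set_ylabel("Frequency")

# Box-and-whisker plot: the same distribution "turned on its side"
ax_box.boxplot(sale_prices, vert=False)
ax_box.set_title("Box-and-whisker plot of sale prices")
ax_box.set_xlabel("Sale price")

plt.tight_layout()
plt.show()
```

The histogram counts how many values fall in each bin, while the box plot summarizes the same distribution with the median, the middle 50 percent, and the whiskers.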
Alright, in this video we covered how simple geometric objects can be combined to represent complex information. Shapes indicate some aggregation of values, lines show change over time and/or trends, and points show the intersection of values; shapes, lines, and points combined show the distribution of values within a data set. Moving forward, we’ll see how data visualization tools apply these concepts within the context of your case study and look in detail at specific visualization types, both informative and exploratory.
HMC Staff
Miranda asked Megan Martinez, a senior staff accountant at HMC, and Adam Green, a staff
accountant at HMC, to work with D & A on the project.
HMC: Megan Martinez
Megan has been with HMC for two years and has recently relocated to the corporate
headquarters in Dearborn, Mich. Megan’s corporate transition is almost complete, and she is
anxious to move forward in her current position.
HMC: Adam Green
Adam is a young, aggressive employee who joined the company eight months ago, shortly after he graduated from college. He has a good sense of judgment and is eager to make a good impression on
upper management. Miranda believes the two employees will provide a good mix of experience,
dedication, and teamwork.
Section 3: D&A Consulting Group: Background
D & A Consulting was started by Doug Chan and Arlo Paxton five years ago. Doug and Arlo have
been friends since college, having graduated with accounting degrees from the same university 15
years ago.
Although they initially went to work at different accounting firms, they both followed similar career
paths. After becoming managers at their respective firms, Doug and Arlo decided that their real
passion was in teaching clients how to use data to make better business decisions.
They started their own consulting firm with one primary focus: helping clients better understand
their businesses via the use of data analytics and data visualization.
Video 5: Data Visualization – Visualization Selection
D&A Consulting’s primary focus is to help clients understand their business better by using data
analytics and visualization. Dan Smith will now discuss how to determine the best visualizations in
the context of the business question.
Video Transcript (Est. time: 5:20 min):
In this video, we’re going to look at the different types of visualizations and how to select a
given visualization for a given business problem, and provide a simple flowchart to aid
you in visualization selection. All visualizations have a function. You may simply be
comparing two different collections of objects such as total revenue vs. total expenses.
You may be looking at the pattern of a single value over time, such as profit over time, or
you may be exploring a complex relationship between many different elements such as
total revenue vs. expenses by country by month. Regardless, all visualization functions are
ultimately used to answer some form of business question, and you can use a simple
decision tree once you know what types of questions you need to solve.
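As a rough illustration of that decision tree, here is a small Python sketch (not from the course materials); the function name and category labels are invented, and the mapping is deliberately simplified.

```python
# A minimal sketch of a visualization-selection flowchart.
def suggest_chart(function: str, over_time: bool = False) -> str:
    """Suggest a chart type for a business question.

    function: one of "comparison", "composition", "relationship", "distribution"
    over_time: True if the question involves change over time
    """
    if function == "comparison":
        return "line chart" if over_time else "bar chart"
    if function == "composition":
        return "stacked bar or line chart" if over_time else "bar chart (or pie for a few groups)"
    if function == "relationship":
        return "scatterplot"
    if function == "distribution":
        return "histogram or box-and-whisker plot"
    raise ValueError(f"Unknown visualization function: {function}")

print(suggest_chart("comparison", over_time=True))   # line chart
print(suggest_chart("composition"))                  # bar chart (or pie for a few groups)
```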
So let’s dive deeper into defining these patterns in visualization functions. The pattern that we see in most, if not all, visualization decision tree diagrams starts with a question: are we trying to visualize a comparison, composition, relationship, or distribution,
and are we looking at a single point in time or a change over time? The more informative visualizations are the comparative and composition visuals. Comparison is when we want
to compare differences between our groups as a whole, so the aggregates of profit
between countries or between even sales channels, something along those lines. The
standard bar charts are typically associated with comparisons and are probably the most
common visualization you’ll see. So many of you may already have experience using them
when comparing categories, factors, or groups. Again, those terms tend to be used
interchangeably. When comparing those categories over time, particularly when you
have many categories, it’s almost always a good idea to use a line chart. We do often see
bar charts used when it’s a small number of time periods like in financial statements from
year to year, but if we have a lot like weeks or days, a line chart is far more effective at
representing those more granular changes.
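A minimal sketch of that rule of thumb, assuming invented revenue figures, might look like this in Python; matplotlib is used purely for illustration.

```python
# Invented data: a few annual values read fine as bars; many monthly values
# are clearer as a line.
import numpy as np
import matplotlib.pyplot as plt

years = [2021, 2022, 2023]
annual_revenue = [4.2, 4.8, 5.1]                  # few periods: bar chart

months = np.arange(1, 37)                          # many periods: line chart
monthly_revenue = 0.4 + 0.01 * months + 0.05 * np.sin(months / 2)

fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(10, 4))
ax_bar.bar([str(y) for y in years], annual_revenue)
ax_bar.set_title("Few periods: bar chart")
ax_line.plot(months, monthly_revenue)
ax_line.set_title("Many periods: line chart")
plt.tight_layout()
plt.show()
```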
Composition visuals, on the other hand, are typically shapes or combinations of shapes as well. They show the categories within a larger group; think population per city within a state, or sales per region within a company. You’ll also see composition visuals
associated with pie charts. We can talk about those. They have their use, but we’ll dive into
that more in the details of our informative visualizations.
Let’s move on to the relationship and distribution visuals. Relationship visualizations tend
to focus on how the individual values of one group change given the values of another.
The last visualization family is distributions. These are the scatterplots, histograms, and box
whisker plots we already mentioned. We’ll dive a lot deeper into the relationship and
distribution visualizations when we talk about exploratory data, but let’s talk about theory
again. Why do we use these visualizations when we do? Who decided these rules?
Well, we did. It’s simply the way that humans process information. Decades ago, a famous
study by Cleveland et al tested many, many people with a series of visualizations including
pie charts, bar charts, line charts, etcetera, all the things we just talked about. The study
found a few general trends in the way people interpret information visually. Visualizations that share a common baseline and ask the viewer to judge only a single distance or direction, for example a bar chart, which ultimately represents the distance of each bar from a baseline, tend to be the least confusing visuals. People struggle to interpret differences in area, such as whether one circle is bigger than another, and they struggle even more with interpreting differences in color, shades of color in particular. So Cleveland has a hierarchy of which visualization to use when. We
will dive deeper into that in the upcoming videos.
In conclusion, in this video we saw how to take the basic concepts of visualizations and use them, with a simple flowchart, to find the best visualization for most business questions, and we reviewed how research established that using these visualizations results in more interpretability and less human error. In our next two videos, we’ll take an in-depth look at the informative visualizations, bar, pie, and line charts, as well as the exploratory visualizations, scatter, tree, bubble, and histograms. Stay tuned.
D&A's Business Purpose
Though data analytics is not a new concept in the business world, the amount of data available and
the number of sources from which it can be captured have skyrocketed. Driven by lower storage costs and more “user-friendly” analysis software, businesses have vastly increased the amount of
data they collect and store. Yet finding the talent needed to transform that data into useful insights
is what businesses find challenging. D&A Consulting helps companies fill that void.
D&A Consulting
Doug and Arlo are excited about the opportunity to work with HMC. They have assigned their
automotive industry expert, Kevin Lydon, as the project lead, along with a D & A new hire, Jan
Morrison.
Kevin has been with D & A nearly as long as the company has been in existence. He and Doug had
met each other on a consulting project where Kevin was working as an IT engineer.
As project lead, Kevin is enthusiastic about mentoring Jan on her first assignment. He is equally
enthusiastic about the potential for improvement at HMC. He reassures Jan that this client will be a
great opportunity for her to test her data analytics and data visualization skills.
HMC: Miranda Albany, Continued
She has advocated a “data-driven” strategy for decision making at the company by capturing a
vast number of product-specific details relevant for both production and marketing.
The problem is that the company has grown so quickly that Miranda is having a difficult time
keeping up with the massive amounts of data that continue to accumulate.
To further complicate matters, there is a growing need for reporting detail and in-depth analysis
of product lines given the availability of additional data.
Video 6: Data Visualization – Informative Visualizations
Next, to further understand how to communicate ideas and discover business opportunities
through data visualization within analytics, Dan will now go through informative visualizations and
how they can ultimately tell a new story for the business.
Video Transcript (Est. time: 9:46 min):
This video covers visualizations more focused on informative and reporting functions,
namely bar, pie, and line graphs. In our visualization selection flowchart, bar, pie, and line focused visuals are associated with comparative, composition, and time series visualizations,
respectively. Before we begin exploring these informative and reporting visualizations,
recall that different audiences require different types of information communication, so our
first thought with visualizations should be to consider if our task is exploratory, informative,
or repeated reporting. In your case study, they mentioned the data analytics process.
Determining the correct visualization begins the same way. We need to first understand the
business question before we can determine the correct visualization. However, you may
find there is no clear business question or you are the only one who is supposed to
determine the business question, but for now let’s start with the assumption that we have
discovered a clear objective and the question we need answered is well defined.
A clear, well-defined question with quantifiable answers means that we should approach
our visualization from an informative or a reporting perspective. And most well-defined
business questions involve comparing the performance of one group to another. So let’s
look very quickly at the theory of why we use the visualizations we do for comparative
visualization. We talked about Cleveland before. In a further analysis using the concepts
that Cleveland uncovered, researchers found clear differences in people’s ability to
interpret comparative visualizations. The diagram on your screen shows distributions of
errors based on the perceived differences in comparative visuals. Looking at distance from a common baseline, that is the first grouping of perceived errors, with the lowest amount. Next we saw differences in length or radius, such as we would see with a pie chart. And the final group, with the most error, were visualizations that had a difference in shape without a common baseline.
So these are things like bubble plots or tree graphs. This reinforces the concept that we should use bar charts for comparative visuals whenever possible, to reduce errors, whereas bubble and tree map visuals may look cool, but they’re not ideal for business-critical reports. It helps us to define some rules of thumb for our visualizations. So
now we can finally move into some examples of visualizations, such as if our business
question is what percent each region contributed to total sales. Now, because there are
only three regions, we could use a pie or bar chart; we could pick either one. But if the question was how much each country contributed, because there are more than four groups, only a bar chart is really appropriate here unless we do some type of additional grouping.
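To make the pie-versus-bar comparison concrete, here is a small hedged sketch in Python using invented sales figures for the three regions from the case.

```python
# Invented regional sales figures: the same composition as a pie chart and as
# a bar chart (the bar chart also shows the actual values).
import matplotlib.pyplot as plt

regions = ["North America", "South America", "Europe"]
sales = [420, 180, 300]  # hypothetical units sold

fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))
ax_pie.pie(sales, labels=regions, autopct="%1.0f%%")
ax_pie.set_title("Percent of total sales by region")

ax_bar.bar(regions, sales)
ax_bar.set_title("Sales by region (actual values)")
ax_bar.set_ylabel("Units sold")

plt.tight_layout()
plt.show()
```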
Bar charts and pie charts are used when comparing single or cumulative values between
groups. As a general rule, use a bar chart over a pie chart, particularly when you’re talking
about actual values. We probably want to know the absolute value or actual value of a given
country in these examples rather than the percentage of total. If we were
comparing one country to another, then potentially the pie chart would be useful, but we
want to know the actual values, so a bar chart makes more sense. So some rules of thumb: always use a bar chart if you have more than five groups, and if you have more than 10 groups, consider combining some of the groups or creating a detail visualization that we’ll talk about in dashboarding. So now let’s look at when we are focused not just on comparison but on change from one group to another.
If the business question is “How much did each region’s or country’s revenue or contribution margin change from year to year?”, it is possible to use a stacked bar chart or
a hundred percent stacked bar if it’s percent contribution. However, given the time series
nature of the business question and just the gross quantity of them, it becomes very
difficult. It’s been shown in all the research that we’ve listed that lines connecting points
over time are more effective at communicating change. Therefore, a visualization with
connecting lines is preferred. You could also suggest an area chart; I’ve heard some people say this as well. But area charts don’t work very well with negative values, so you’d see a much better representation of the change over time with a line. If the
business question is comparing group contribution to a whole over time, because this is a
compound question, that is, what is the total amount contributed by group and how
much did that amount change over time?
My preference here is to use two separate visuals: a bar chart to show the total amount and a line chart showing percent change of the total. In this situation it’s okay to use a stacked bar, but more as a visual cue to our audience that we need to consider the groups making up our total, so they have a better frame of reference for the associated percent-of-total visualization. Also notice I placed the first visual above the second. In this case, the second visual is difficult to interpret without context from the first; when visual interpretation depends on another visual, keep in mind that while some cultures may read left to right and others read right to left, most if not all read top to bottom. So put your simplest, highest-level visual on top and the details on the bottom.
Moving on to more reporting-scoped business questions. What if you need to know how
much of this year’s variable cost budget has been used year to date? More generally, how is
a given KPI performing against its target value? Now, for the visualization alone, some would only want to report the value using something complicated like a gauge; as soon as it gets into a KPI, people want to stick a gauge on there. However, in my opinion, the next
question when we’re talking about any KPI with a performance target is almost always, well
what was it doing last period? What about last year? So why not just start with a time series
graph and insert a line for the target budget. This way you know both how
it’s performing now and if it’s above or below that target. Whatever data manipulation you
have to do to insert a line will be exactly the same as with a gauge or other visualization,
plus you’re already answering questions about historical performance.
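A minimal sketch of that suggestion, with invented year-to-date spend and budget figures, could look like the following.

```python
# Plot the KPI over time and add a horizontal line for the target, instead of
# reporting a single gauge value. Figures are invented for illustration.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
cumulative_variable_cost = [1.1, 2.3, 3.2, 4.6, 5.4, 6.1]  # year-to-date, $ millions
annual_budget = 12.0                                        # full-year target

fig, ax = plt.subplots(figsize=(7, 4))
ax.plot(months, cumulative_variable_cost, marker="o", label="Variable cost YTD")
ax.axhline(annual_budget, linestyle="--", label="Annual budget")
ax.set_ylabel("$ millions")
ax.set_title("Variable cost vs. budget")
ax.legend()
plt.show()
```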
Now as a final piece of advice when talking about informative vs. reporting visualizations,
when the business question is established and requires repeated refreshing (a refresh just means updating the data behind the visualization), this is almost certainly a report rather than an informative analysis. Look for keywords like KPI, SLA, metric, and defined business rules when the needs of the visualization are being discussed. Your organization may have a dedicated reporting team, and this reporting team may specialize in maintaining such reports.
If you are not part of that group, consider reaching out to them to maintain the informative
analysis that you now want to turn into a report. Personally, I would leave that up to the
professionals.
All right. We covered a lot in a brief period of time. Some key takeaways from this video.
Bar graphs work well when comparing values between a few groups on the same scale. Pie
charts may be okay when showing percent of total for a few groups; however, a bar chart can show the same information, serving as both a comparative and a composition visual.
Use line charts for change over time. Don’t be afraid to combine visualizations if you need
to tell a more complicated story. And finally, your organization may have a data and
reporting specialist to help standardize your information into a more consistent report. In our next video, we’ll move into more complicated exploratory analysis and the associated relationship and distribution visualizations.
Video 7: Data Visualization – Exploratory Visuals
As a next step, we’ll now hear from Dan as he discusses best practices and tips for using
exploratory visuals, relating back to the business question, so one can learn about the data and
find new stories to tell.
Video Transcript (Est. time: 13:04 min):
Now let’s move on to exploratory visualizations. In this section, we will look at more
advanced visualizations such as scatterplots, tree maps, bubble charts, principal coordinates plots, histograms, and box whisker plots. Let’s start with some financial
and sales metrics. What if we want to know what is the relationship of material costs to
after-tax profits?
Well, as a starting point, even the most advanced analyst will use scatterplots to start
visualizing patterns. Here you can see the material cost vs. the after-tax profits with a point
for each VIN number, so that’s what did it cost to make the car in terms of raw materials
and how much were we able to sell that car for. This demonstrates how scatterplots show
us the intersection of two related series of values. Think about two values in the same row,
but different columns. In Excel, you have a row of data, you have your column, and then
you have another column. That’s the same way you create a scatterplot. In a pivot chart, we
may be able to observe a general upward or downward trend in data points between the
two factors, or we may be able to observe groups in the data set. In the example on your screen, you can see the relationship between material costs and after-tax profits, as you’d expect. Generally, we see profits decrease as material costs increase, that big downward slope in the middle. However, there are two major exceptions. There is a group
in the upper right with really high material cost, but also high after-tax profits. And then you
see a group in the lower left with negative profits but also unusually, extremely low material
costs. So how do we examine this? Well, within the scatterplot we can start adding colors or
other shapes to start trying to pick out patterns. We can look at the model and see that the CHARE model, C-H-A-R-E, constitutes the entirety of that upper-right-hand cluster. So we can imagine that this is the luxury brand with high margins and high material costs. Unfortunately, the lower left is a little more ambiguous. It looks like there’s a bunch of other models within that grouping. So you see a scatterplot may not always tell enough of a story. In
fact, it rarely does.
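For readers who want to experiment with this kind of exploratory plot, here is a hedged sketch in Python using synthetic data (not HMC’s); the model names other than CHARE are invented, and the cost-profit relationship is simulated.

```python
# Synthetic scatterplot of material cost vs. after-tax profit, colored by model,
# to tease out clusters like the ones described in the video.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)
models = {"Model X": (20_000, 3_000), "Model Y": (25_000, 4_000), "CHARE": (45_000, 9_000)}

fig, ax = plt.subplots(figsize=(7, 5))
for name, (cost_center, profit_center) in models.items():
    material_cost = rng.normal(cost_center, 2_000, size=100)
    # Profit generally falls as material cost rises, plus noise per model
    after_tax_profit = (profit_center
                        - 0.15 * (material_cost - cost_center)
                        + rng.normal(0, 800, size=100))
    ax.scatter(material_cost, after_tax_profit, s=12, label=name)

ax.set_xlabel("Material cost per vehicle")
ax.set_ylabel("After-tax profit per vehicle")
ax.legend()
plt.show()
```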
So let’s look at some other factors that might be influencing these unusual patterns and groupings of cost vs. profit. We saw there were business questions related to channel, so let’s explore viewing multiple channels against models against their profits. Let’s explore viewing groups in tree maps. These are similar to pie charts but with squares instead of slices, so I guess it’s more like a brownie or a lasagna chart, I don’t know.
If you look at after-tax earnings by channels one, two, and three, and by model, that’s a lot of information. There are some groups here and there that may stand out, but it’s not that descriptive. It’s similar to a bubble chart, in which a bubble is sized by the value. And then
colors delineate the different channels or models. So for each slice you get a bubble. And
the value is the size of that. I’ll be honest with you. These charts aren’t very useful by
themselves. They’re good for exploring things in a dashboard because you can mark them.
However, by themselves they don’t typically tell much of a story. It’s often more effective
to use a bar chart for this type of stand-alone visualization. But when exploring in a
dashboard, they can be rather effective. More on that later.
So, as strange as it sounds coming from me, for this scenario, and often in many scenarios, I feel a table with a heat map is a better visualization, one that is exploratory but can also be very effectively moved from exploratory to informative. In the example on your screen, you just see a simple, well, basically a pivot table: it’s your sales channel by model, by the average after-tax profits. This represents the same thing you saw on all those other diagrams, just more condensed. You can still see the same information based on the color, but you also see a value, a raw number, which is tangible.
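A minimal sketch of that table-with-heat-map idea, using an invented data frame and the seaborn library, might look like this.

```python
# Pivot table of average after-tax profit by model and sales channel,
# shaded as a heat map. Data and column names are invented.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

sales = pd.DataFrame({
    "channel": ["Channel 1", "Channel 1", "Channel 2", "Channel 2", "Channel 3", "Channel 3"],
    "model":   ["CHARE",     "Model X",   "CHARE",     "Model X",   "CHARE",     "Model X"],
    "after_tax_profit": [9200, 3100, 8800, 2900, 9500, 2700],
})

pivot = sales.pivot_table(values="after_tax_profit", index="model",
                          columns="channel", aggfunc="mean")

sns.heatmap(pivot, annot=True, fmt=".0f", cmap="Blues")
plt.title("Average after-tax profit by model and channel")
plt.show()
```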
In fact, speaking of exploratory visualizations, as an example of how you would combine these so that you can explore but also inform, you can create a simple dashboard with a drill down, and we’ll describe what a drill down means in the dashboarding section, where you can look at your scatterplot and see how the different clusters or values within that scatterplot relate to another visualization, like, in this case, the table diagram that we created. Enough about dashboarding and groupings in scatterplots. Let’s move on to a business question: does an increase in sales for one sales channel take away sales in another? This is a common question in the real world, and it’s an extremely complex question.
This question is probably better served by machine learning and some complex modeling, but for the sake of visualization, we only want to compare the percentage of sales in one channel vs. the percentage of sales in another and how that changes over the course of months. We want to look at it month to month in this example because we are exploring the relationship between two groups with multiple occurrences, in this case the differential of percentage sales in one channel vs. another for a given month. We can use a special variation of a line chart, a principal components plot. The specific criteria: the two groups have to be on the same scale and have some relation, and this tells us how one moves vs. the other. In fact, there’s some additional research on this showing that a principal components plot is better interpreted in highly ambiguous scenarios, where you’re trying to show there is not much of a relationship, vs. places where there’s a strong relationship. A scatterplot is well interpreted on the fringes. A principal components plot helps people not make false assumptions of relationships. With scatterplots it’s very easy to do
that: if you draw a line through it, you kind of draw a picture in your head. It’s very hard to make a false assumption of a relationship when there isn’t one with a principal components plot.
Off my soapbox for principal components plots and now onto my soapbox for measures
of central tendency. Now, in many cases I’ve seen people use bar charts or even pie charts
to communicate average value between groups. But an average isn’t just a single value; it isn’t a value that means a single thing. Unlike, say, total sales, where I can sum up my sales and there’s no misinterpretation about what that value means, it’s a bunch of numbers put together as a total, an average or arithmetic mean is a measure of central tendency. It describes a group of values; an average only suggests the central value, but it doesn’t tell the whole story. Averages can be skewed, meaning the same average can represent vastly
different underlying values. Look at a normal distribution vs. a skewed distribution.
I’m going to show you my process of looking at a distribution for a few different factors as
they relate to contribution margin. And when we look at the frequency of our contribution
margin, we see a lot of hills. It looks like a mountain range. When we see a multi-peaked distribution like this, we refer to it as multimodal. Multimodal means that there are some values that occur very frequently.
Multimodal suggests there are multiple categories within this distribution, that there are groups, so we may see effects similar to what we saw in that scatterplot. We may see clusters or groupings of things that vary by country, region, sales channel, etcetera. Our distribution may not be comprised of a single group of data or information. There may be multiple blobs of information, of groups, that we need to tease out. You can see that
some of these patterns emerge when we overlay different categories in contribution
margin distribution, but first I need to take a step back and talk about what is a
distribution.
What do I mean by distribution? You can see a histogram, but what does that really talk
about? Let’s simplify the scenario and say we have two car models, A and B. Let’s say for model A we have contribution margins of 100, 95, 90, 87, 85, or whatever is on your screen, and for model B we have similar contribution margins. In this vastly simplified example, we can see the average for model A is around 83 and the average for model B is around 73. However, if we add some additional values to model A, say something that’s well below the existing average, while for model B we pick just two values that are similar to the average, our new averages are completely different. If we just look at the average for model A vs. model B in this scenario, we would think that model B had a far higher contribution margin than model A, whereas it’s only these outlying values that are driving things, driving
things down for model A, which is one of the primary reasons why histograms and box
whisker plots are so important.
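Here is a small worked example of that effect. The contribution margin figures are invented and chosen only to roughly match the averages mentioned in the video.

```python
# Invented contribution margins: a couple of low outliers drag model A's
# average below model B's, which a histogram or box plot would expose.
model_a = [95, 88, 85, 80, 75, 75]
model_b = [78, 75, 73, 71, 68]

print(sum(model_a) / len(model_a))   # 83.0
print(sum(model_b) / len(model_b))   # 73.0

# Add two extreme low contribution margins to model A only
model_a += [10, 5]
print(sum(model_a) / len(model_a))   # about 64.1 -- now A looks worse than B
```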
What we’re looking at when we see a histogram are not the numbers themselves. It’s a frequency distribution, a count of how often those numbers occur within given groups. We call those groups bins. In this example, let’s say we have bins of 100, 90, 80, and so on, and we can look at the frequency count in model A and model B. We can see that there is an outlier, two outliers in this case, way down at the bottom for those values of 10 and 5. Histograms are so important for an exploratory analysis because we see these patterns. We can see when there are outliers, and we can see when there are different patterns in the data. And in the example I gave, that outlying value could have been from, I [don’t
know], a different country, it could have been from some new sales initiative. Uh, it could
have been from who knows where, but there’s definitely something going on that wasn’t
represented with just the data that we saw. And finally, when you see yourself clicking
through a bunch of different histograms or a bunch of different scatterplots or a bunch of
different stuff and just exploring and exploring, there are other visualizations that you can use, visualizations specifically designed to tease out these patterns. One of the more powerful visualizations for this, in fact an analytics technique, is a decision or regression tree.
However, that is a little beyond the scope of this course. Exploratory visuals can be a lot of
fun. You know, I love them personally. They help you see your data in a brand-new way.
We learned a lot; we talked about histograms and scatterplots, understanding how these exploratory visualizations are about looking at the data itself to find new patterns, and we even did a little dashboarding. In our next session, we’ll cover the underlying dashboarding concepts to help you become the best you can at data exploration and informative visualizations.
Lesson 2: Huskie Motors Operations
Section 1: Huskie Motor Corporation: Operations
HMC is currently selling in 15 countries in three regions: North America, South America, and
Europe. Table 1 provides a breakdown of the countries within each region.
Huskie Motor Corporation: Operations, Continued
Automobile manufacturing and sales is a complex and highly competitive business. Though the
automotive industry has a broad global reach, only 15 countries produce 88% of the world’s
vehicles.
HMC Models
HMC is a fairly new and small player in the automotive manufacturing market. HMC currently
offers three brands: Apechete, Jackson, and Tatra. Each brand has several models as detailed in
Table 2. The models available fall within seven segments of vehicle types: compact, sub-compact,
full-size, mid-size, luxury, minivan, and sports utility.
HMC's Manufacturing Options
HMC offers several series for each model for a total of 34 different series. A breakdown of the
available series offered by model is provided. See Table 3. Each model is available in various body
styles, engines, drive configurations, transmissions, trim, color, and seat types.
Since various engine and transmission builds, see Table 4, come from one division and finishing is
done in another division, these options are described in different tables.
Click each icon to learn more.
Packages and Options
Like many automotive manufacturers, HMC offers a variety of packages and options that buyers can
add to their vehicles. Packages include a specific set of bundled options, or buyers can choose
individual options separately. Table 6 provides a list of the six packages along with the detail of
products and services contained in each package. Table 7 provides information regarding options
that may be purchased à la carte. At least one option, however, is contained in each available
package.
Click each icon to learn more.
HMC Data Movement
Recall the data life cycle from the previous module and consider how Huskie Motor Corporation’s
data ages.
To again explain how HMC gets and uses all their data, let’s take a look at a short video to further
illustrate how data is created and stored in business.
Video 7: Movement of Data
To conclude the topic of data movement, we will now move to a presentation from Dan Smith
where he further illustrates how data is created and stored in business.
Video Transcript (Est. time: 3:00 min):
First, we’re going to look at data creation and storage. And I’ll start way back in 1967 when
Dr. Melvin Conway identified that any system designed by an organization will match the
communication structures of that organization. What that means in plain English is that a
business will create processes and solutions based around how they are structured: marketing, supply chain, accounting. They’ll all produce their own applications and typically their own data as well. And who were the first people to start storing and recording data, all the way back in Sumeria? The accountants. In fact, I would argue that our current data architecture is largely founded on accounting practices. Accountants use transactional data created by customers and business operations to produce reports that are then used to inform stakeholder decisions.
So what does this look like? Well, there are three primary domains: business data, the creation and storage of data, and the business decisions. Now, where does this data live? Well, customer activity generates transactional or raw data, that data is stored in a transactional database, and it is then shaped into a reporting data set. The reporting data is used to make business decisions, and the resulting activity from the business decisions creates more transactional data.
created, let’s look at how it gets from one place to another. You can see the data
movement cycle over here on the right of your slide. When that data is first created, the
information messages are typically transferred through a bunch of layers in applications,
so load balancers, temporary caches, firewalls, etcetera. The process of getting activity
data into your data system is typically called integration and it moves data into a
transactional data store. Now, once it’s in its raw form or sometimes as part of integration,
the data is shaped or cleaned into a form usable by people in analytic applications. This is
generally called ETL, extract, transform, load, and the data output is stored in the reporting database, data warehouse, or data lake. The transformation of data into business-usable information is called analytics. We will talk a lot about this in the next module. But simply put, analytics is what facilitates data-informed decision making.
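As a rough illustration of the integration-and-ETL flow described above, here is a hedged Python sketch; the file name, column names, and reporting table are hypothetical.

```python
# A minimal extract-transform-load (ETL) sketch using pandas and SQLite.
import pandas as pd
import sqlite3

# Extract: pull raw transactional data (e.g., exported from the ERP system)
raw = pd.read_csv("vehicle_sales_raw.csv")

# Transform: clean and shape the data into a reporting-friendly form
raw["sale_date"] = pd.to_datetime(raw["sale_date"])
monthly = (raw.groupby([raw["sale_date"].dt.to_period("M"), "model"])["sale_price"]
              .sum()
              .reset_index()
              .rename(columns={"sale_price": "total_revenue"}))
monthly["sale_date"] = monthly["sale_date"].astype(str)

# Load: store the shaped data in a reporting database for visualization tools
with sqlite3.connect("reporting.db") as conn:
    monthly.to_sql("monthly_revenue_by_model", conn, if_exists="replace", index=False)
```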
And finally, business decisions attempt to influence customer activity, making this one little arrow right here, in the upper-left-hand corner of the business data creation and movement cycle, the entire reason that the business exists. And the entire reason that the data analytics cycle exists is to help the business make more informed business decisions. So let’s look at this process in action.
Lesson 3: The Data Dilemma
Section 1: Data Dilemma
Prior to calling Miranda at HMC, Kevin meets with Jan to go over some basics regarding the project.
Kevin uses a five-step model as a data analytics framework. As a first step, he explains this model
to Jan, because he believes it will help her understand his process.
Data Analytics Process
Kevin: “So, Jan, this should be an interesting client and project. I know you are a bit nervous about
your lack of experience with data analytics design, but I have been studying this stuff for a long
time, both from an applied perspective as well as a theoretical perspective. There is a great book
on this subject, Data Analytics, by Warren Stippich and Bradley Preber. In this book, the authors
describe a five-step approach to the data analytics process:
Step 1: Define the question,
Step 2: Obtain the data,
Step 3: Clean and normalize the data,
Step 4: Analyze the data and understand the results, and
Step 5: Communicate the results.
I think that understanding and following these steps will help you tremendously as you find your
footing on this engagement. Our client, HMC, would like to have the project finished within a four- to six-week time frame. How do you feel about that?”
The Data Dilemma
Jan: “That seems like a pretty short window, Kevin. Do you think we can finish within that time
frame?”
Kevin: “Well, I do not think we can set a realistic time frame without knowing exactly what our
challenges are and what the data looks like. Remember, we have to be able to trust our numbers.
We do not know yet whether the data is clean. We will have to do some validity testing first.
Remember, follow the steps, think it through, and keep a calm head.”
The Data Dilemma, Continued
Jan: “What exactly is our goal for HMC?”
Kevin: “Basically, HMC captures very detailed transaction-specific data. For example, HMC has
extensive cost data, marketing information, and plan data for each car sold. The amount of data it
has is overwhelming for both the HMC management team and the executive board. Our job is to
help Miranda and her team figure out how to use the data to better understand costs and
profitability by vehicle model. They also need detailed, relevant feedback regarding sales volume
and sales location for planning purposes. Miranda would like to be able to predict sales at least
three quarters out so that the management team can better plan production schedules.”
The Data Dilemma, Continued
Kevin: “An equally important goal is to help them understand the benefits of data visualization
and give them some ideas about how to present the data to their executive board. As you know,
Jan, data visualization software, such as Tableau and Qlik, allows us to turn large volumes of raw
data from various sources into easily comprehensible graphical representations of information.
Data can be accessed live or extracted from some other source. Data can also be presented in
summary fashion while preserving the underlying detail, which can be instantaneously viewed as
desired. This type of technology could be instrumental in helping HMC improve and maintain its
competitive advantage.”
The Data Dilemma, Continued
Jan: “How does the management team store the data, and how will we be able to integrate it with
the visualization software?”
Kevin: “I believe that the plan and forecast data are on Excel spreadsheets. The actual sales data
will likely come from its enterprise resource planning (ERP) system. We will need to confirm that at
our initial client meeting. We should be able to get transaction-specific information at the vehicle
identification number (VIN) level, since the VIN is a unique identifier. I am not sure how much
information is being collected at the individual vehicle level, but that could be a rich data source
for us if we can get it.”
The Data Dilemma, Continued
Jan: “Is the data they have ready to be used, or will we need to verify and organize it?”
Kevin: “We will see just how clean it is once we start pulling it in. We will initially build the analysis
using a representative sample of HMC’s data. The sample represents about 25% of the total
company data. The four- to six-week time frame should be feasible if we receive clean data. Our
contact at HMC, Miranda, is assigning two accountants from the controller’s department to pull
the sample and help clean the data. Once the client approves of the proposed analytics and the
dashboard, we can roll it out using all the data.”
Data Sampling and Analysis
Jan: “Will using a sample be enough to convince the client of the value of the analysis?”
Kevin: “Absolutely. The sample allows us to determine which type of analyses we can provide, by
looking not only at the data, but at the format of the data as well. We can use the sample data to
create demonstration dashboards for the client. The analyses we perform with the sample can
then be recreated for the full data set.”
The Data Dilemma, Continued
Jan: “I am not sure I understand what you mean by a ‘dashboard.’”
Kevin: “Basically, a dashboard is a screen that consolidates visualizations, graphs, charts, and so
on to concisely display the metrics and key performance indicators for a business. Summarized
data can come from a variety of sources and can even be presented in real time.”
Jan: “I cannot wait to get started! This is going to be a tremendous learning opportunity for me.”
Video 8: Data Visualization – Dashboarding
As Kevin helped Jan understand the purpose of a dashboard, we will again turn to Dan Smith as he
further defines dashboarding and introduces concepts and best practices when using dashboards.
Video Transcript (Est. time: 18:22 min):
With the popularity of data-connected visualization tools like Tableau and Power BI, data
visualization is becoming almost synonymous with data dashboarding. Well, some, myself
included, would dispute that association. It’s still very important that we include
dashboarding when learning about data visualization. In this video, we will define
dashboarding, explore dashboarding concepts, and recommend some dashboarding best
practices.
But first, what is dashboarding? When someone thinks of a dashboard, they usually think of a car or other device full of gauges, numbers, and meters, things that tell us the immediate state and the relevant information required to understand how something is doing. Related to data and information, dashboarding can be defined as an information management tool used to organize and display information, typically with an interactive element and
connected to an underlying data source for rapid updating. And let’s think about why
in your case study a dashboard seemed like such an appealing prospect for Huskie
Motors and consider also if it’s the right choice for the overarching problem statement
and the requirements of our stakeholders. In our case study we’re asked to determine
how data will be represented for several different stakeholders. Let’s explore how we
could use a dashboard to display this information, the dashboarding concepts in each,
and if we should use a dashboard to display this information.
Our first task is related to the dashboard concept of displaying KPIs and metrics. We’re tasked with displaying the overall performance analytics, questions such as how Huskie Motor Corp. is performing globally, how the various brands are performing in various channels, and what the least profitable models are. Here we can apply a core concept of dashboarding, displaying KPIs and metrics. I mentioned before that lines are very effective
for displaying KPIs. Presumably there are a set of KPIs, key performance indicators, that
they can use to define performance. I would assume these were defined by somebody,
and we can include these in a dashboard that was connected to a data source that had
these KPIs defined. And let’s think of our audience for these high-level performance
metrics. Anyone who is interested in metrics at this level is likely going to be an executive, a senior vice president, maybe a senior director, depending on the organizational structure and title responsibility. In other words, metrics at this level are for an audience unlikely to have the time or inclination to go digging around their own data. They have trusted advisors to give them guidance on the pertinent pieces of information. In other words, these are informative visualizations, probably more reporting than anything, so a dashboard may be appealing, and it is a dashboard in the purest definition of being data connected with immediate results, but it’s not going to be an interactive dashboard, or if it is interactive, it’ll be for someone reporting results with the dashboard, not for exploration. By reporting results, I mean sitting in a presentation and clicking on things in a predefined script, so the dashboard will probably be for a presentation, not
something that somebody is going to be regularly accessing online.
Our next concept is filtering and drilling down. For the next series of questions in your case study, the financial analytics, we’re looking at prices per model and how they
changed over time or which model has the most variability in variable cost, etcetera. While
these are well-defined questions, there is a degree of slicing and dicing exploration
available. If someone wants to look at different packages or regions or country or models,
sometimes there will be complicated questions which may be hard to represent in a single
visualization. Therefore, the financial analytics questions have more opportunity for
exploration. So a more interactive dashboard is probably useful in this scenario. And by
the way, by slicing and dicing, I mean a dashboard where a user may wish to view the data
in a different way than represented by a static dashboard or table. For example, they may
wish to highlight an area of one visualization to learn more about the data point
contained in that visual. That would be a drilling-down operation that’s often used in
scatterplots and other complicated visuals like a map and tree charts to give more
information about a given data point or points. That would be like a dicing of data.
And filters are often used to change the dashboard. In contrast to the drill down, filters
typically exclude all the data for portions of the dashboard. They exclude the underlying
data. A filter is exactly the same as a filter in a pivot chart or pivot table. And a pivot table
is also an excellent example of slicing and dicing data. You just pull in different elements and you see data in a different aggregation or context than you would in its original form. And fundamentally, if there is a demand for data exploration, just a bit of caution: you will never be able to satisfy all the data exploration demand with simple visualizations;
you have to include some interactivity. So a dashboard makes a lot of sense for the questions from the finance team. As for the operational analytics questions, those are much more complex. They’re asking what the top values are overall and within multiple groups. Now, asking what the top sellers are overall is easy; you can just sort it. But when you start comparing the top sellers from one group to the top sellers in another group, your analysis becomes much more difficult, because you’re not just ordering a column in a table. You have to select the top individuals or regions or models or whatever it is from a group, and adding to that complexity the calculation for the number of model options, or how many days the various models are on the lot before sale, you have the potential for a very complicated dashboard. The questions asked for by the operational analytics folks will likely require custom calculations and/or data transformations. We’ll talk about those concepts shortly. But when people start asking those types of questions, this is something where you would probably want to call in a data expert to help you create that type of dashboard and those visualizations. Forecasting is our final set of questions, where we look at the forecast.
I know I'm giving a lot of warnings here, primarily because this is an introductory visualizations course, so I point out cautions for engagements where people think they’re building a visualization but it’s actually a mathematics and data problem. When we start talking about forecasting, this is another one of those yellow flags. So as a caveat, I would not suggest a business, particularly one the size of Huskie Motor Corporation, make a significant business decision based solely on the forecast that is built into a utility of a visualization tool. Not to get into the details, but
there’s a lot of additional information that can be included in that type of mathematical
model. That being said, the built-in forecasting utilities can still be informative for
identifying patterns over time in your data and creating a degree of confidence in those
types of pattern predictions. In a forecast, we are able to show confidence intervals above
and below the predicted values. Since this is a visualization, we can visualize our confidence intervals, and that’s similar to a box whisker plot, where you have the range expected to contain the values 80% of the time vs. the actual values. The high/low markers are equivalent to the whiskers in a box whisker plot, if you will.
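Here is a minimal sketch, on synthetic data, of plotting a simple trend forecast with a confidence band; it is a crude linear-trend estimate for illustration, not the method any particular visualization tool uses.

```python
# Synthetic sales history, a linear-trend forecast, and a rough 80% band.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
months = np.arange(1, 25)
sales = 500 + 12 * months + rng.normal(0, 40, size=months.size)  # invented history

# Fit a linear trend and project six months ahead
slope, intercept = np.polyfit(months, sales, 1)
future = np.arange(25, 31)
forecast = intercept + slope * future

# Use the spread of the residuals as a crude confidence band
resid_std = np.std(sales - (intercept + slope * months))
band = 1.28 * resid_std  # roughly an 80% interval under normality assumptions

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(months, sales, label="Actual sales")
ax.plot(future, forecast, linestyle="--", label="Forecast")
ax.fill_between(future, forecast - band, forecast + band, alpha=0.3, label="~80% band")
ax.legend()
plt.show()
```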
Moving on from forecasts, let’s talk about connecting to data. Well, you always hear the term data-connected dashboard. Let’s expand the dashboard conversation to introduce what a dashboard is doing under the hood. The first concept is connecting to data. A dashboard works best when the dashboard can update as its
underlying data set also updates. We’ll cover this more in our data platform video, but for now understand that a visualization tool does not, and in most cases should not, change the underlying data. So here we can connect to an online Google sheet. Here we can connect to an OData online data source or a JSON data source. These are web data sources; they’re separate.
Contrast that to something like Excel, where Excel houses the data. In many cases, especially the cases we’re accustomed to working with, Excel is the data; you can type in and modify the data, and in many cases you have to modify the data to create a visualization, so you end up with a lot of disconnected data tables. Pivot tables and pivot charts are more like other visualization tools, as long as you don’t touch and edit the underlying data. In fact, I would say if you can use a pivot table effectively, you can probably use most of the other visualization tools with minimal training.

Aggregates and values. This concept is probably the most important concept when building a dashboard, or at least when using a visualization-focused dashboarding tool, and that is the concept of categorical values and numerical values. Tableau calls them dimensions and measures; Power BI calls them columns and measures or aggregates; in a pivot table they are called rows or columns vs. values, whatever you want to call them. Your categories are what determine the number of slices in a pie chart, whereas the measure determines the size of each slice. The category determines the number of bars; the measure is the height of the bar. Again, categories or factors are the same concept as when you use an Excel table column to make separate rows in a pivot table; measures are the same concept as values in a pivot table, with some mathematical operation applied.
Typically, a sum is applied to the values corresponding to the pivot table row values. In the data video, I’ll talk more about this concept, even including the SQL (structured query language) aggregation code that’s being created by the visualization platform. That sounds a little scary, but I’ll walk you through it. It’s not going to be as bad as it sounds.
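A small sketch of the category-versus-measure idea, with invented data, is shown below, together with (in a comment) the rough shape of the SQL a dashboarding tool might generate; the exact SQL varies by platform.

```python
# Category (dimension) determines the groups; the measure is the value that
# gets aggregated. Data and column names are invented.
import pandas as pd

sales = pd.DataFrame({
    "region":  ["North America", "Europe", "North America", "Europe", "South America"],
    "revenue": [120, 90, 150, 110, 60],
})

# Category: region; measure: sum of revenue (one bar per region, height = sum)
by_region = sales.groupby("region", as_index=False)["revenue"].sum()
print(by_region)

# A dashboarding tool generates roughly equivalent SQL behind the scenes:
# SELECT region, SUM(revenue) AS revenue FROM sales GROUP BY region;
```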
Calculated values. All dashboarding tools let you create custom calculations for aggregating your values. These can get pretty complicated. In this example, mine is relatively simple: I’m creating a new column for an address by merging a bunch of other columns, concatenating them, as that’s called, to create an address value for geocoding. Yeah, that’s a simple example. On the other hand, calculated values are one of the hardest aspects of learning a dashboard tool, because calculated values, in particular calculated values like rolling averages or cumulative totals, tend to either be obscure or be very specific to that platform, like the exact calculation I used to create the value you see on the screen. And honestly, calculated values, especially the nuances of them, aren’t
something you need to understand for this class. But as you learn dashboarding and
because it’s an important element of visualization, keep in mind that the measures you
see visualized in the dashboarding tool are calculated not only using the underlying data
but they will also change based on the visualization used and the categories used in the visualization. Yeah, I know, I’m sure that didn’t make any sense. It takes a whole class, so you don’t need to worry too much about this concept. I’m telling it to you so you’re aware of it. This is a potential stumbling block, and it may be something that you struggle with, because I struggled with it when I started. So as you’re learning, and honestly in general, it’s a good idea to have your data team validate your calculations or provide predefined calculations, or even a data set. If you have a particularly complex calculation you want to put into a visualization tool, have some data people create a new data set for the analysis instead of relying on the built-in dashboard capabilities.
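As an illustration of the kind of calculated values being discussed, here is a minimal pandas sketch with invented monthly figures; computing a cumulative total and a rolling average outside the dashboard keeps the calculation explicit and easy to validate.

```python
# Two common calculated values: a cumulative total and a 3-month rolling average.
import pandas as pd

monthly = pd.DataFrame({
    "month":   pd.period_range("2023-01", periods=6, freq="M").astype(str),
    "revenue": [100, 120, 90, 150, 130, 160],
})

monthly["cumulative_revenue"] = monthly["revenue"].cumsum()
monthly["rolling_3mo_avg"] = monthly["revenue"].rolling(window=3).mean()
print(monthly)
```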
Alright. Moving on to dashboarding best practices. The first best practice is to know you do not always need a dashboard. Yeah, think about it. How many of you have had a boss who would not look at a presentation unless it was shown to them at a meeting? You could write an email saying next week’s winning lottery numbers were in the attachment on that email, and they still wouldn’t open it. It’s reasonable to assume that if they will not click an attachment to open a presentation, they are not going to click a link in your email, log into your reporting server, assuming they remember the username, and then navigate to the dashboard themselves. In a dashboard, you want to keep your informative visuals at two or three major points at most. You’ll be tempted to cram everything onto one dashboard. Look at that information density, you’ll say, and no one will know what you mean. Or maybe somebody will be confused, and you just go up to them and say, oh, you just need to, you know, click this, then this, then this, highlight that, pick this filter, and voilà, here’s your answer. No. An informative dashboard allows exploration, but it should not require it. The line chart dashboards are great examples. They may seem simple, and they answer a question very simply and succinctly, but there’s little room for misinterpretation and you can still drill down and filter. Compare that to, say, the geographic dashboard that we saw earlier. It looks cool, don’t get me wrong, but it doesn’t say much without digging around in it, and it doesn’t even follow some best practices about high-level visualizations. Basically, that’s form over function; we should be doing function over form. My next best practice:
Make your own darn dashboard. I’ve seen people spend weeks working with consultants
to get a dashboard, and they're still not happy. I've seen that time and time again, and it's
because there's always another question; that's the whole point of exploration. You want
to find new questions. So if you find yourself working hours and hours with a consultant
or contractor to build the dashboard, consider spending that time watching
free training videos instead (all the platforms provide them), so that you can build the dashboard
you want yourself. And if there are still complex data operations, like joining tables or rolling time-series
aggregations, work with a data expert or reporting expert to create a new data set
or fix that one calculation, rather than asking a consultant to read your mind or predict
which questions you're going to ask, because you don't even know what they are yet.
Dashboards really aren't that hard to make; that's what these tools are designed for.
Next best practice: keep data operations to a minimum. Just because you can do complex
calculations doesn't mean you should. In dashboards and tools like Tableau, Power BI, and even
Excel, it can be unclear how a value is calculated, as I mentioned before when discussing
calculated values. Then, when you throw in different terminology, such as partition,
window, or table calculations, which are calculations aggregated over
several categories or dimensions within the visualization, it gets confusing and isn't
transparent. So again, leave the data to the experts. A data team-approved data set with
verified calculations ensures proper governance, ensures procedures are
followed, and gives you the peace of mind that you're not making a small, opaque
calculation error that could lead to a million-dollar incorrect decision.
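To illustrate why these window or partition calculations are easy to get wrong, here is a
minimal sketch of a three-month rolling average in standard SQL; the monthly_sales table
and its columns are hypothetical placeholders, not part of the case data.

    -- Hypothetical rolling average of the three most recent months, per brand.
    SELECT
        brand,
        sale_month,
        AVG(net_revenue) OVER (
            PARTITION BY brand
            ORDER BY sale_month
            ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
        ) AS rolling_avg_revenue
    FROM monthly_sales;

Small changes to the PARTITION BY clause or the window frame change the result, which
is exactly why a data team-verified calculation is worth the effort.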
In conclusion, in this video we presented dashboarding as an information management tool used
to organize and display information. Typically, a dashboard has an interactive element, and
it’s connected to an underlying data source for rapid updating. In our next video, we’ll give
you a little more visibility into the underlying data environment so you’re able to
communicate more effectively with your data stakeholders and you are better able to
understand what’s happening under the hood of popular visualization tools.
2.64 Section 2: The Data Dilemma, Continued
After reviewing the project proposal and meeting with Jan, Kevin calls Miranda to introduce
himself and confirm their initial meeting the following week.
Kevin: “Hi, Miranda. This is Kevin from D & A Consulting. I just wanted to touch base with you to
set up our initial meeting.”
Miranda: “Hi, Kevin. I am really looking forward to your team’s visit. We need help as soon as
possible. I have so much data coming in that I barely have time to look at it, much less analyze it.
Also, our data is located in various databases, spreadsheets, and our ERP system, making it
difficult to integrate and fully utilize. We have data visualization software available on our server,
but we have not really tapped into its capabilities just yet. I am told that we can easily pull together
large amounts of data from various sources with user-friendly, adaptable output using
dashboards. Is that something you can help us with?”
The Data Dilemma, Continued
Kevin: “Absolutely. What kind of data visualization software do you have?”
Miranda: “Tableau. It is supposed to be very user-friendly. I am hoping you can get a dashboard
up and running in a couple weeks.”
HMC Dashboard Needs
Kevin: “Do you know what you want on the dashboard?”
Miranda: “I know that we want to be able to see profitability by brand and model, since solid
profit margins are crucial if we want to stay in business. We also want to keep a pulse on sales by
country and region. Ultimately, we want to do a better job of planning our production schedule,
but as you know, this requires up-to-date information on many moving parts.”
The Data Dilemma, Continued
Kevin: “I cannot say whether a couple of weeks will be enough time until we take a look at the
data and map out the specific decisions you hope the dashboard will enable you to make. I will
have a better idea after our initial meeting next week.”
Miranda: “Okay, fair enough. We’ll see you on Monday at 1 p.m.”
HMC and D&A Meeting
The following week, Kevin and Jan meet with Miranda and her team for their initial meeting.
Kevin: “So, let’s get down to business. Let’s talk about the dashboard. Help me understand what
questions you are trying to answer? Or what story are you trying to tell? Who are the users going
to be? Will different users need different dashboards?”
The Data Dilemma, Continued
Miranda: “I have a list of questions that I would like to be able to answer using our data. There are
several areas where I believe we can gain greater insight from looking at our data. Specifically, we
need overall performance analytics to tell us how we are performing globally. We need to know
which of our models is profitable, where we are selling well, and, perhaps, how sales channels are
driving sales volume. It is also important that we are able to see information from a top-level
perspective with the ability to drill down into the detail. The profitability information is crucial for
our executive team: chief executive officer (CEO), chief financial officer (CFO), chief operations
officer (COO), and chief marketing officer (CMO). This team is ultimately responsible for the
direction of HMC and profitability for shareholders.
“I would like for our management accounting group and financial reporting group, including the
CFO, to have a financial analytics dashboard that would give them information about contribution
margins, total costs, and sales volume. Both groups will also need to monitor changes in costs and
contribution margins for all of the models we offer. These are the most important metrics and,
therefore, the ones we need to work on first. We will expand our analysis for the management
accounting group to include other efficiency measures as soon as we can get these initial
dashboards up and running.
“Our CMO and the sales team will need operations analytics to help them understand turnover
and demand. They will need to know which models are selling and how long it takes to sell them.
They will also need to understand how packages and options impact sales. They will need to have
a handle on which of our package/option offerings are popular and which ones are profitable.
“Finally, our budgeting and production teams will need to utilize forecast analytics to predict sales
and margins at least three quarters in advance.
“Do you think it is possible to make the dashboards ‘real time’?”
HMC Data
Kevin: “I think it is possible, but first we need to understand the data that you are currently
collecting. What can you tell me about the extent and magnitude of your data?”
Miranda: “Well, we track details down to the VIN level in our ERP system. So, we have a lot of data.
For each vehicle sold, we know the sales amount, marketing expense, and all variable and fixed
costs. We also track a lot of nonfinancial data.”
2.71 The Data Dilemma, Continued
Kevin: “What kind of nonfinancial data?”
Miranda: “Well, there is vehicle data such as brand, model, model year, series, segment, body
style, drive configuration, engine type, and transmission type. We also have the detail for each
vehicle sold as far as color, trim, and so on, as well as any package or options purchased.”
Jan: “Wow! I can see why the data set is so large!”
Miranda: “That is just the vehicle information. We also capture the region and country of sale, the
number of days that any given car was on the lot prior to sale, and the type of marketing
campaign in place at the time of sale. In addition to that, we track sales channel information.”
The Data Dilemma, Continued
Jan: “Can you give me a little more detail about what you mean by sales channel information?”
Miranda: “Sure. We identify sales using three sales channel dimensions. Sales channel 1 identifies
whether the sale was made through our dealers, fleet, or retail operations. Sales channel 2
identifies the type of customer account. We have commercial accounts, employee/partner
programs, government accounts, nonemployee accounts, and rental accounts. The third sales
channel identifies whether the sale was cash, financing, or lease.”
Kevin: “Okay, so ideally we will want to use both the financial and nonfinancial data in our
analyses.”
Miranda: “Exactly! Other potentially informative data sources that we have are social media
platforms like Facebook, Twitter, and Instagram. We have just started collecting data from these
sources about our vehicles, but we are still trying to figure out the best way to analyze it.”
The Data Dilemma, Continued
Kevin: “Eventually we can bring that data into the analysis and dashboard as well, but for now let’s
stick to the data we already have.”
“So, the actual sales data is in your ERP system. Where is the budget data?”
Miranda: “We keep track of the budget data in an Excel spreadsheet we call ‘plan data.’”
Video 9: Data Visualization – Data Ecosystems
In the case, it is evident that HMC data comes from and is stored in a variety of places, making it
difficult to organize and analyze it. In Dan’s next video, he will go through the concept of data
ecosystems and walk us through the process of aggregating data to see how data is connected
and transformed so it can ultimately be visually presented.
Video Transcript (Est. time: 10:22 min):
As you become more familiar with creating data visualizations, you will quickly understand,
if you haven't already, that data is just as important as the visualizations themselves. In
fact, data is probably more important than the visualizations. Having taught
hundreds of people how to use visualization tools and even more advanced
analytics, I know that understanding how data operations and data connections are
enabled by those tools provides a more conceptual understanding. Understanding the
data concepts allows students to use their skills across multiple tools rather than
just the one they're currently learning. So I'm going to let you in on a big secret: those
complicated calculated values and dimensions that I described in the previous video
are just SQL (Structured Query Language, or "sequel") aggregations.
They feel so complicated because the tools have to abstract relatively simple
code; you are still using a visualization to create code. This is a slide that I use when I teach
people data concepts. For many of you, it may look too technical because it shows a little
code. That's a common reaction; please just bear with me. I hope to show you that much
of the SQL generated by dashboarding and visualization tools is similar in function to an
ordinary pivot table, and by extension that a pivot chart is just like a dashboarding tool.
So the example that you see on your screen is just a table from a database. I'm glad that it
is a different data set, because it shows you that these concepts extend to any data.
To be consistent with the language in the case study, the two tables here show
the difference between a normalized table and an aggregate table: in the normalized table there is a
unique row for each individual record, so we see the same types of coverage and
education repeated multiple times. In this example, it doesn't really matter what the data
pieces are when you're talking about code; here we're just looking at the coverage and
education associated with different insurance premiums. If our task is to show the sum of
claims for each type of coverage, excluding records where education equals graduate,
then that's the business question, and we would create an aggregate
table with one row for each type of coverage and its associated total claim amount. If this
were in Excel, the simplest solution would be a pivot table. We put coverage in the rows
and move our claim amount to the aggregate values area, which even defaults to summing that value.
For the education filter, we put education into our filters and select all except graduate.
Our SQL statement does exactly the same thing as the pivot table. It reads like a sentence:
select coverage and the sum of claim amount from the normalized table, where education
does not equal graduate, grouped by coverage.
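Written out, the statement described here would look roughly like the following; the exact
table and column names (normalized_table, coverage, claim_amount, education) are
assumptions based on the narration rather than the slide itself.

    -- Sum of claims for each coverage type, excluding graduate-level education.
    SELECT
        coverage,
        SUM(claim_amount) AS total_claim_amount
    FROM normalized_table
    WHERE education <> 'Graduate'
    GROUP BY coverage;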
Looking at the corresponding colored squares between our pivot table and the SQL
statement, we can see that selecting a field or column and placing it in the GROUP BY
statement is the same as putting it in a row: we get a group, or row, for each of the values
within our coverage column, one row for basic, one for extended, and one for premium.
The green box in our SQL statement, SUM(claim amount), makes the same calculation as
our aggregate value, the sum of claim amount in the pivot table. And finally, "where
education does not equal graduate" in the purple box is the same as the filter in the pivot
table, although in the pivot table you can't see it. Seeing the side-by-side comparison like
this, I would argue that the SQL statement is actually clearer. It's more transparent and
more understandable in terms of what the data transformations and filtering are actually
doing. With the pivot table, the filtering and the source are uncertain: we don't know the
source for that pivot table just by looking at it, we have to click and sort through all sorts
of check boxes for the filter, and we may not even know which exact calculation is being
performed on the value.
So remember that I have said multiple times that abstraction is both a strength and a
weakness of visualization tools; this is a common example. A hidden filter, in this case the
education filter in the pivot table, may lead to mistakes. If someone is unaware of that
filtering logic, it's easy to assume that the aggregate table includes all education levels,
whereas with the SQL statement you see the filter right away. And to reinforce the point,
if the pivot table you see there were connected to a database, say through Power Query or
an ODBC (open database connectivity) connection, this is exactly what it would be doing:
it would create this SQL statement. That's all these tools are doing; they're abstracting the
creation of SQL code. When the logic is very simple, there's not a lot of harm, and it's much
faster to use the visualization tools. But when it gets complicated, it's typically faster and
better governed to have the SQL statement. Either way, understanding these basic SQL
concepts makes it much easier to understand how to use the different tools.
So that's the transformation side. Let's talk about the data connection side and the
separation of concerns. Why is it so important that we pass this code to a data
source anyway? We've used Excel for decades, where we just transform the data in the
spreadsheet; why all of a sudden is that not okay? Well, there are a few reasons. I already
mentioned that the data is often too big to fit in a single spreadsheet, but memory
is pretty big now and most data sets that we use still aren't more than a few gigabytes. So
why is everyone saying we shouldn't pull our data into Excel? First, it boils down to
governance. Accountants are, at their core, data managers. We respect financial data, and
as you know, in any general ledger you don't erase information: you don't erase a
transaction, you reverse a transaction. You want to keep the underlying data unchanged. If you're
creating a financial statement in a spreadsheet and you just backspace a number,
you've changed it; you've deleted information. What we need is the ability to easily
audit all the transformations to the data as we create information from it, and the
opportunity to share that information rapidly and securely with a wide audience.
That's why you keep the data separate from the visualization tool and the transformations:
we need to be able to audit it, we need a consistent mechanism for transforming it,
and we need to leave the data itself unchanged. With Excel, it's extremely hard to ensure that
data pulled into a spreadsheet remains unchanged. Even with all its cells locked down, it's
still possible to copy that information to a new sheet, make a new report, and then
distribute it. If the data also remains in the original sheet, there's no
assurance that somebody will update the original report as well as the other
report, and you end up with two versions of the truth. When all the data stays in one
database instead of getting shipped around through email attachments, any update to the
database will also update every analysis that connects to it, and if there's an error, we can
fix it either in the data or in that single analysis that everyone else is connecting to.
So see, that wasn't that hard, right? Okay, yes, it is a little difficult; I know it's a
lot to take in. Unfortunately, it's just impossible to talk about data visualization without
talking about data. However, it's my strong belief that a simple conceptual understanding
is enough. If you watch this video a few more times and really get your head around those
basic SQL statements and the concept of keeping data separate from the transformations
to that data, you are going to have a much easier time picking up all the different data
visualization tools. It's the light switch that gives you the skill set to understand what these
tools are trying to accomplish, as opposed to feeling like you're clicking something and
memorizing a process rather than learning a solution. If you can connect to a data source
in Tableau, you can do it in Power BI. If you understand that the dimensions in Tableau are
the same as a GROUP
BY statement in SQL, and that the measures in Tableau are the same as a SUM or AVG in
SQL, then you're going to understand that in Power BI as well; you can understand it in
any platform.
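As a quick sketch of that mapping, using the hypothetical insurance table from earlier: the
dimension becomes the GROUP BY column, and each measure becomes an aggregate in
the SELECT list.

    -- Dimension (coverage) = GROUP BY column; measures = aggregates in the SELECT list.
    SELECT
        coverage,
        SUM(claim_amount) AS total_claims,
        AVG(claim_amount) AS average_claim
    FROM normalized_table
    GROUP BY coverage;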
The Data Dilemma, Continued
Kevin: “Okay, if you give Jan and me a sample, meaning at least 25% of the most recent data, we
will make sure that the data is ‘clean’ by running some data validity tests, and then we can start
putting together some analytics. Once we have a better feel for the data, we can give you a better
estimate of how long it will take to develop the dashboards.”
Miranda: “Great! My team will make sure you get all the data you need, and they will be available
to help as well.”
Video 10: Data Visualization – Data Wrap‐up
We’ll now move to the last video in Dan Smith’s Data Visualization video series where he
summarizes key takeaways pertaining to data, data analytics, and data visualizations.
Video Transcript (Est. time: 3:12 min):
Congratulations! If you're watching this, it means you've completed all the videos on data
analytics and visualizations. We covered a lot of material over the course of these videos. We
started with why visualizations are important for analytics. Following that, we looked at
some of the tools of the trade, the leading software and platforms for creating
data visualizations.
Next, we looked at the different types of audiences and how visual analytics answers the
different types of business questions that members of each audience may have:
exploratory visuals for audiences looking to get a better understanding of
data and find new ways to use that data to help the business; informative
visualizations for the higher-level questions, or for demonstrable proof of how those
exploratory visualizations can help the business; and finally,
reporting visualizations to keep track of ongoing activities and answer the repeated
business questions which we’ve already demonstrated can help the business.
We looked at the low-level concepts of visualizations, such as how shapes can be used to
represent an aggregate value such as a sum, how a line represents changes over time, and
how a point represents the intersection of two or more values, as in a Cartesian plot or
scatterplot. We saw how combining those low-level concepts of shapes, lines, and points
can create much more complex visualizations such as histograms, tree charts, bubble
charts, ribbon charts, and so on. We walked through a flowchart that allowed us to select the
correct visualization based on our business question: Are we looking at a
comparison of groups or a composition of the whole? Are we looking at a time series or a
single point in time? Do we have multiple factors that we want to include in our
visualization?
From there, we took a closer look at the types of visualizations that we use for exploratory,
informative, and reporting business questions and audiences. Then we looked at some
best practices on how a dashboard can provide our end user, our audience, with data-connected
visualizations and add an exploratory component to those visualizations. And
finally, we explored the data component of data analytics and visualizations, looking at some of
the data operations performed by data visualization tools and the data environment in
which those tools live.
I hope you all enjoyed learning these data visualization concepts as much as I have
enjoyed teaching them. Again, thank you so much for watching. I'm Daniel Smith.
Goodbye.
HMC Conclusion
You have now completed the case study, Huskie Motor Corporation: Visualizing the Present and
Predicting the Future, and finished the series of 10 videos with Dan Smith.
Module 2 Wrap‐up
You have completed Module 2 of this course and should now be able to:
Define data visualization.
Describe how data visualization can impact the way data is communicated.
Identify various data visualization tools and their different uses.
Recognize the importance of choosing the right visualizations based on your audience.
Module 3: Applying Data Analytics and Visualization
Course Roadmap
Click Next to proceed to Module 3, Applying Data Analytics and Visualization.
Introduction
Welcome to Module 3, Applying Data Analytics and Visualization. The goals of this module are:
To explore the data analytics and visualizations discussed in the Huskie Motors case study,
and
To apply acquired knowledge of data analytics and data visualization to complete scenario-based exercises.
In the previous module, you were taken through a scenario describing Huskie Motor Corporation's
dilemma: an abundance of available data and the need to use it to make business
decisions. In this module, you'll begin in Lesson 1 with a tutorial on how some of the Huskie
Motors analytics can be performed. Through a series of five videos, you will be walked through
how the data can be loaded, transformed, analyzed, and ultimately visualized for effective
communication and decision-making. Following each video, there will be a set of knowledge
check questions that you'll be required to answer before moving on.
After that, in Lesson 2, you will have the opportunity to apply the knowledge gained by
performing in-depth, scenario-based exercises on data analytics and data visualization. Before
you begin Lesson 2, you’ll need to access the data files that correspond to Module 3 from the
Resources link. You can download them directly onto your computer. Read the question
presented in each exercise, use the data files to figure out the best answer, and then come
back to the question and select your answer. You are allowed 2 attempts per question. If you
still do not get it correct, you will be able to click the ‘Show Answer’ button to view the correct
answer. All exercises in this lesson are based on the Huskie Motors case study reviewed in
Module 2.
When you are ready, click next to continue.
Module Menu
Module 3 consists of two lessons,
Lesson 1: The Data Analytics of Huskie Motors Corporation, and
Lesson 2: Data Analytics & Visualization Practicum
Click Lesson 1 to proceed.
Lesson 1: The Data Analytics of Huskie Motors Corporation
Huskie Motors Data Analytics
Dr. Ann Dzuranin, co-author of the Huskie Motor Corporation case study, will present a series of
videos showing how to perform some of the data analytics required by Miranda and her team in
the case study.
Dr. Dzuranin will take you step by step, using Microsoft Excel, through a demonstration of how to
work with the large amounts of financial and non-financial data that the Huskie Motor
Corporation is tasked with analyzing.
In the video series, you will see how to:
Identify, extract, load, and verify the data;
Analyze the data to identify the most profitable brands;
Calculate profit margins, contribution margins, cost trends over time and more.
Huskie Motors Data Analytics: Video 1
Let’s now take a look at a video on identifying the issue and the data, extracting and loading the
data, and then verifying the data.
Video Transcript (Est. time: 16:58 min):
Hello, welcome to the Huskie Motors Data Analytics Video 1, "Extract, Load, and Verify
Data: Evaluating Source Data." I am Ann Dzuranin, and I will be discussing the material in
this video.
In Video 1, the issue that we have identified is that we need to load Huskie Motor Corporation's
2015 sales transaction data and confirm gross sales by region. The data that we've
identified we need is the 2015 sales transaction data. To extract that data, we contacted
the client and asked them to provide it. The client has provided us with a
file in .TXT format, that is, a text file. This is not an unusual format for data that is
downloaded, or I should say extracted, from databases. We
need to take that data and import it into our analytic software, which in this case means
importing that text file into Microsoft Excel. Once we have it in Excel, we're
going to have to verify that all the data the client told us was in that file is actually
there and has loaded correctly into our Excel file. We'll confirm that by comparing
what we've loaded into our Excel file to check figures that were provided by the client.
So let's get started. The first thing we need to do is take a look at the text file that the
client has sent us. Notice here that we have what's called the HMC client data file.
When you try to open it, it opens in Notepad, and in this format it's not very
useful to work with, so we're going to close that. Instead, we're going to open Excel
and open a new workbook. Once I'm in Excel, I go to the Data tab,
and all the way to the left there is a button that says Get Data. Click on
Get Data, and notice that you have a lot of choices for bringing data into
Excel. We know that we have a text file, so we are going to click on "From Text/CSV."
This takes us to our computer, where we need to find where we saved that client data
and import it.
Now, if you have Excel 2016 or a newer version, you will have the option to
transform the data before you load it into your Excel spreadsheet. This can be a really
useful tool, and I'll show you how it works. If I look at my data right now, I
can page through it; it only shows me the first 200 rows, just so I get
a feel for what's come over. If I click on Transform Data, Excel will look at each of the data
points and identify what type of data it is. For each
column, for example, it identifies the VIN number as text; the ABC icon stands for a text field,
and that makes sense, right? Because it's not a continuous number. The same thing goes for
brand and model. Notice that model year is just a number, and that makes sense also, because
it's really a category; it's not the year when the car was sold. The same goes for
series, segment, body style, etcetera. As we go through
these, look and see if everything makes sense, and it does make sense
to me: month, day, and year don't need to be date fields because we
have a sold date field here. Again, as we go through, it makes sense that sales volume is a
number, and it makes sense that the options and similar fields are categorical or text fields.
We do want to be sure, though, that all of our financial information is accurate.
If we haven't been through this file before: this file is
essentially all of the 2015 sales for Huskie Motor Car Company. It starts with VIN
numbers, and each row is a specific transaction; it represents the sale of a unique
vehicle to an individual. All of the information in here
essentially describes everything about the vehicle that was sold. If we scroll all the way to
the right, we will eventually get to our financial data. I wanted to point out a couple of
things that happen when we click on Transform while bringing this text file into
Excel. We see that gross sales, marketing, net sales, labor, tooling, and materials come over
as currency, but look at option costs: we can
see that Excel has identified this as a text field, which initially doesn't make a lot of sense.
The same thing happens for package cost. Notice what's happening: when the data set
was downloaded from the client's database, wherever the value would be zero, it put dashes
instead of zeros. When a currency field comes over with a dash in it instead of a zero,
Excel doesn't know what to do with that, so Excel says this is not
currency, because there are clearly text values within the column. In this transform
mode, we can change the data type to currency, and we have the choice either to
add another column, essentially duplicating this column with the new type, or to
replace it.
So I’m going to go ahead and click replace so you can see what happens.
Notice now that wherever there had been blanks or dashes, the word
"Error" comes up. If I do the same thing for package cost and change it to currency, I
get the same issue, and we'll talk about that in just a minute. Quickly
cruising through the rest of these, it looks like tariffs was also identified as something other
than currency, so we're going to change that data type to currency as well and say
OK; everything else looks all right. Now that I've done this, I can go up here and
click Close & Load. All of those places where it said Error will trigger
a notification over here, a query that says you've got 72 errors.
What you would do at this point is click on that notification and look to see
what the errors are. It brings over just the rows where there are errors, and if you
scroll across, you will eventually find where each error is. Remember, it was in option
costs and package costs as well as tariffs. If the buyer didn't add an option, the value should be zero,
and if they didn't add a special package, it would also be zero.
At this point you would actually go talk to the client and confirm whether these should be
zeros or whether they should actually have dollar amounts in them. For our purposes today,
we are just going to go ahead and load this anyway; instead of an error, each of those
values will simply show up as a blank when it loads into my Excel spreadsheet.
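If the client confirmed that the dashes really do represent zeros, one way to handle them
before the extract ever reaches Excel would be a small cleanup in the source query; the
table and column names below (sales_2015, option_cost, package_cost, tariffs) are
assumptions, not taken from the case files.

    -- Hypothetical source-side cleanup: treat dash placeholders as zero.
    SELECT
        vin,
        CASE WHEN option_cost  = '-' THEN 0 ELSE CAST(option_cost  AS DECIMAL(12,2)) END AS option_cost,
        CASE WHEN package_cost = '-' THEN 0 ELSE CAST(package_cost AS DECIMAL(12,2)) END AS package_cost,
        CASE WHEN tariffs      = '-' THEN 0 ELSE CAST(tariffs      AS DECIMAL(12,2)) END AS tariffs
    FROM sales_2015;

Inside Power Query itself, the equivalent fix would be replacing the dash values with zeros
before changing the column type to currency.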
Okay. So we have our data in Excel now, and the next thing we want to do, taking you
quickly back to the PowerPoint, is check whether our file came over correctly. We want to
confirm that the number of transactions is 1,308, that the gross sales total is $38,675,192,
that the total number of blue cars sold was 340, that our first sale happened on January 17, 2015,
and that our last sale happened on December 20, 2015. We can use pivot tables as a quick and
convenient way to do a lot of these verifications. For example, if I click anywhere in my
worksheet and click on PivotTable, Excel will highlight my entire table range. I'm
going to go ahead and put this in a new worksheet, since this sheet has a lot in it already.
When I get to this worksheet, the first thing I want to do, remember, is
confirm the total number of transactions. A really quick way to do that is to
count, in the values area, the number of VIN numbers in that column,
because each transaction has a unique VIN number. When I do that, I get
a count of 1,308, which, if we glance quickly back at the check figures, is exactly what we want.
So it looks like the number of transactions is correct.
The next thing I want to do is confirm sales. I could continue to work in this same pivot table
if I wanted to, but for an audit trail I'm going to do each confirmation on a different sheet,
and I'm going to call this one "count of transactions." Then I'll go back to my first sheet and
rename it so that we know exactly what it is: "2015 sales." Now I'm
going to insert another pivot table on a new worksheet, and for this one I
want to look at total gross sales. So I find my gross sales field and get my sum of
gross sales. I can change the format of that by going into the values area, clicking on the down
arrow, picking Value Field Settings, and then Number Format. I'm going to
change that to currency with zero decimal places so it's easier to read, and click OK. So
that's $38,675,192, and if we look back at our check figures, that is accurate.
The next thing we can check using a pivot table is how many blue cars were sold. We're going
to go back into our sales transactions, click anywhere in there, and insert a pivot table.
Then I'm going to find the color field. These are the various colors that they offer
for their cars, and remember, I want to count how many cars I had of each color.
I accidentally dropped color into rows, so I'm going to drag it over to values, and notice
that I get a count of the VIN numbers for all of the cars. I can also bring color in as a
filter and then choose only blue, and you can see that my count of blue cars sold is 340.
I can go back to my PowerPoint and confirm that that is accurate.
Now, to confirm the first and last days of sale: probably the easiest way to do
that in the sales table is to make sure that your sold date column is sorted from oldest to
newest, meaning that the first row should be the first sale of the year and the last row
should be the last sale of the year. So 12/20/2015 was our last sale and 1/17/2015 was our
first, and that checks out with what we have over here. So what we've done is gone through
and confirmed the total number of transactions by counting the
number of VINs that we sold. We confirmed the total gross sales of $38,675,192 by creating a
pivot table that summed gross sales. We confirmed that the number of blue cars sold was 340
by creating a pivot table with color as the rows and then filtering it for only
blue. And finally, we looked at our data, made sure that our sold date column
was sorted from oldest to newest, and confirmed that the first sale date was the 17th of
January and that the last sale was made on the 20th of December.
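For readers who like the SQL framing from Dan's videos, the same verification checks could
be expressed roughly as the queries below; the table and column names (sales_2015, vin,
gross_sales, color, sold_date) are assumptions, since the case provides only an Excel extract.

    -- Hypothetical verification queries mirroring the pivot table checks.
    SELECT COUNT(vin) FROM sales_2015;                        -- expect 1,308 transactions
    SELECT SUM(gross_sales) FROM sales_2015;                  -- expect $38,675,192
    SELECT COUNT(vin) FROM sales_2015 WHERE color = 'Blue';   -- expect 340 blue cars
    SELECT MIN(sold_date), MAX(sold_date) FROM sales_2015;    -- expect 1/17/2015 and 12/20/2015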
Remember, when we first talked about our issue today, one of the things we wanted to
identify was total gross sales by region, and after we verified the data, we needed to
provide those results. So we verified our data, we feel comfortable that the data the client
gave us is complete and accurate, and we went ahead and totaled gross sales. I did this by
using, again, another pivot table. If we pop quickly back into the Excel spreadsheet that we
just created and insert a new pivot table here, I can pick region. Another way that you can
quickly find what you're looking for is to start typing its name; I know I'm looking for gross
sales, so I just typed in G-R and it took me right to gross sales. Again I click that down
arrow, change my field settings so that my number format is currency with zero decimal
places, and click OK. Then we can see Europe, North America, and
South America and our total, which agrees with what we have in our visual here.
So, the key takeaways. First, we had to identify our issue, which we did: we were looking
to verify our data and then show the results for sales by region. We identified that we
were using 2015 sales data. We extracted it via the client and loaded it into Excel; we were
given a text file by the client, and we imported it into Excel. We looked at the
transformation options available in Excel 2016 and higher. We talked about the issues that
arise when you have blank or missing data. We then verified that we had the data we were
supposed to have, confirmed our results, and provided the summary of sales by region.
So thank you for watching Video 1, “Extract, Load, and Verify Data.”
Huskie Motors Data Analytics: Video 2
In the next video, we’ll see how to transform and analyze the data in order to evaluate Huskie
Motor Corporation’s financial performance by brand.
Video Transcript (Est. time: 12:31 min):
Hello, welcome to the Huskie Motors Data Analytics Video 2: "Transform and Analyze:
Evaluating Brand Performance." In this video, we are going to identify a business issue
facing Huskie Motors. We will identify the data we need to evaluate that issue. We're going
to transform the data by creating a measure that's going to help us address the issue. We
will then analyze the data and then communicate our results.
In this video we’re going to be evaluating the HMC financial performance by brand, and
the data that we’re going to use for that are the 2015 through 2016 sales transactions. The
transformation that we’re going to do is to create a measure that’s going to help us
evaluate financial brand performance, and we’re going to use the profit margin ratio
calculation as that measure. Once we create that ratio, we’re going to go ahead and
analyze the data by preparing a pivot table to summarize profit margin by brand, and that
will allow us to communicate the results as to which brand has the highest profit margin
ratio for 2015 to 2016.
As a review, I'd like to cover what the profit margin ratio calculation is. For the profit
margin ratio, we take operating income and divide it by sales. There are
probably other measures that you might think would also be good measures of
financial brand performance, but the reason we're choosing the profit margin ratio is that it
helps us eliminate any volume differences between the brands. The profit
margin ratio allows us to see, brand by brand, how profitable each brand is regardless
of how many cars are sold of each brand.
So let's go ahead and open up the Excel spreadsheet that we're going to be working with
in this video. This is the Huskie Motors sales transaction data from 2015 to 2016, and if you're
not already familiar with the data, I'll give you a brief overview of it. Each row in this data
set represents a single sale of a vehicle, so the VIN number should
be unique in each row of this data. There are 2,674 transactions that we're
going to be working with. The other information we have covers the typical things that you would
think about if you were buying a car: the brand name of the car, the model, the
model year, perhaps a series number for that vehicle, whether it is a full-size car or a
compact, the body style, and so on. Then we have sold date information.
We also have sales volume, which will be one for each transaction since you can only sell a unique
vehicle once, plus drive configuration, engine, etcetera.
If we continue to scroll over, we will also see that we have a
lot of information about options that can be added to the vehicle. We have the region in which the
car was sold, the country in which it was sold, how many days it was on the lot, etcetera.
If we continue to scroll over, we eventually get to the financial information, which
starts with the gross sales column. I do want to point out that there is also a
column called net sales, which represents the difference between the
gross sale amount and any variable marketing expenses that were incurred. These
variable marketing expenses are based on specials and deals that may be in effect at
the time the car is sold, such as cash rebates, low-interest financing, and special lease terms;
those are the types of variable marketing campaigns being used, and this number
represents what that cost was.
We're going to keep scrolling to the end of our financial data, and you can see I have already
created a column for profit margin ratio. We're going to recreate it so that I can walk
you through how I did it. It's fairly straightforward, so I'm just going to type in "profit
margin ratio." We said that the calculation is operating income divided by sales. The
equivalent of operating income in this data set is going to be our net revenue, and we're
going to use net revenue rather than after-tax revenue. The reason is that after-tax
revenue is driven by the tax rate in the specific country in which each car is
sold, and we don't want to muddy the waters by also having to account for
differences in tax rates. So we're going to go ahead with net revenue, and we're going to
divide that by gross sales. The reason we're using gross sales instead of net sales is
that gross sales is a better measure of sales, since net sales here is really just sales net of a
marketing expense. If this were a retail company and net sales represented the
difference between gross sales and sales returns, then you would want net sales,
but that's not the case here. So we take our net revenue divided by our
gross sales, and we get our profit margin. I would recommend changing the format so
that it is a little easier to read; you can change it to a percentage, and I would
probably add another decimal place. Now we can use the autofill feature by
double-clicking the corner of the cell, and that will autofill the profit margin ratio all the way
down to the end of our data. I do want to offer a word of caution on
autofill: it will only work if there is a corresponding data point in the column
immediately to the left of where you're filling. If there were a blank somewhere in that
adjacent column, the autofill would stop there. So what's
a good way to tell whether you've got everything? If you notice, down here after I did the
autofill, it gave me a count of how many rows it autofilled that
formula to, and we know from looking at the number of rows right at the
beginning that we have 2,674, so it looks like it autofilled correctly.
All right, so now we have our profit margin ratio, and what we want to do now is create a
pivot table so that we can evaluate which brand has the highest profit margin ratio for
sales from '15 to '16. If you click anywhere in your worksheet and go up
to Insert PivotTable, notice that Excel will automatically find the whole perimeter of
your data; you just have to be clicked within a cell for that to work. We're going to go
ahead and put this in a new worksheet, because we already have a lot in this worksheet,
so click OK. Now notice that you have your fields here, and
you've pretty much got your canvas where you can start to drop in what you want
in your pivot table. We're interested in profit margin ratio by brand, so if I click on
brand over here, you can see that it drops in the Apechete, the Jackson, and the Tatra. We
also want profit margin. If we look down toward the bottom of the field list (the fields appear
in the order of the columns in your worksheet), notice I have two profit
margin ratios, because remember, I had already created one and then we created a second
one as a demonstration. It doesn't matter which one we use; I'll take the
second one we just created. Now notice what we get: it defaulted to giving me the sum of
profit margin, and that doesn't really make a lot of sense, right? We don't want to add up
all the profit margin ratios; that doesn't give us any kind of number that makes
sense to evaluate. What we really want to look at is the average. So I click on
the down arrow and go to Value Field Settings, and you can see here the choices of
how you want to summarize the field. In this case we're going to look at the
average of the profit margin, and I’m just going to click OK.
Now, if I want to change this to a nicer format with percentages instead of
the way it looks right now, I can go back to Value Field Settings and click Number
Format. If I pick percentage, use one decimal place, click OK, and click OK again, I now
have a much neater-looking table. The grand total doesn't necessarily make a lot of sense
either, so if you want to take it out, you can just click on it and choose to remove the
grand total. We can also rename the field if we want to call it something different.
So now you can see that we have created a pivot table of the average profit margin, and as
we can see, the Apechete is performing the best, followed by the Jackson and then by the Tatra.
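In the SQL terms from Dan's earlier videos, this pivot table corresponds roughly to the
query below; the table and column names (sales_2015_2016, brand, net_revenue,
gross_sales) are assumptions, and the row-level ratio mirrors the net revenue divided by
gross sales column added in Excel.

    -- Hypothetical equivalent of the pivot table: average profit margin ratio by brand.
    SELECT
        brand,
        AVG(net_revenue / gross_sales) AS avg_profit_margin_ratio
    FROM sales_2015_2016
    GROUP BY brand;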
Now, before we call it a day, let's think about what we discussed at the very beginning:
there might be other measures you may have been tempted to use to evaluate
performance. Let's say, for the sake of argument, that we also want to look at net revenue.
If I click on net revenue, I'm now looking at the sum of net revenue, and I'm going to
change the format on that so it's a little easier to read. Putting in net revenue, we can see
that the Apechete also has the highest sum of net revenue, the highest total net revenue,
followed by the Jackson and then the Tatra. But notice that the Tatra's revenue is very close
to the Jackson's, and yet if we look at which one is more profitable, we can see that of
those two it's the Jackson, and that the Tatra actually has a negative profit margin ratio. If
we go back to our slides, this slide shows the results that we just calculated in the pivot
table, where the Apechete has the highest profit margin, followed by the Jackson and then
the Tatra.
So what are our key takeaways? In this video we identified the issue: what is the financial
performance of the brands, and can we identify which brand is performing best in the
years '15 to '16? We took the data, which was the sales data from 2015 and 2016, and we
transformed that data by creating a measure, the profit margin ratio. We then took that
profit margin ratio and conducted a pivot table analysis, which helped us identify that the
Apechete was in fact the best-performing brand.
So thank you for watching Video 2: "Transform and Analyze."
Huskie Motors Data Analytics: Video 3
Now that you've seen how some data analytics are performed using Microsoft Excel, we'll move on to
some exploratory data visualization using Tableau to understand Huskie Motor Corporation's
variable costs.
Video Transcript (Est. time: 22:20 min):
Hello, welcome to Huskie Motors Data Analytics Video 3, "Exploratory Data Visualization:
Understanding Variable Costs." I am Ann Dzuranin, and I will be discussing the material in
this video.
What we're going to cover in Video 3 is, first, identifying the issue that we're going to
examine, which for this video is evaluating Huskie Motor Corporation's variable costs.
We have to identify our key questions: what are we going
to explore in regard to Huskie Motor Corporation's variable costs? Specifically, we're going to be
examining which models have the highest variable costs, which costs comprise the largest
percentage of total variable costs, and how variable costs have changed over time.
The data that we’re going to be using is the data we’ve been working with in the other two
videos, the 2015 to 2016 sales transaction data for Huskie Motors. We’re going to explore
the data in two ways. We’re going to use Excel, and we’re going to use pivot tables and
charts to help us investigate the identified questions. And then we’re also going to use a
visualization software called Tableau to investigate the identified questions.
So let's get started. The first thing we're going to do is open up our Excel file. This is the
sales data for Huskie Motors, the same data we worked with
in the last couple of videos. We're going to be using this data to create three visualizations,
exploring it with both pivot tables and pivot charts. What we want to
do first is identify variable cost by model; that is our goal, to create this pivot table and
this visualization. So let's go ahead and do that. We want to know what total variable costs
are by model. To do that, we go back to our sales data, click anywhere in the data sheet, and go
to Insert PivotTable. Excel will automatically highlight the whole area and create the table
that you're going to be using. We're going to say to put it in a new
worksheet. So we get to our new worksheet. And now we have our canvas here to work
with to develop our pivot table. And from that pivot table we will develop a pivot chart.
We know that we want to look at variable costs by model, so let's bring
model down to rows and then bring years sold into columns. We've pretty much set
up how we want our rows and columns to look: each model listed as an
individual row, and the costs shown for 2015 and 2016. Now what I
need to do is grab the cost information and bring it over to values. I'm going to
scroll down until I come to total variable cost and bring that down
to values. It automatically creates a sum, and that's okay; we're interested in the sum this
time, but we do want to improve the number format so we
can read these numbers more easily.
I'm going to choose currency, and we don't need decimal places for this
analysis, so I'll hit OK. Once I hit OK again, it formats my numbers, and the table is
much easier to understand. Pivot tables are really useful: we've summarized all of our
total costs by model for each year. But it does get a little confusing to look through here
and try to make sense of which one is highest, which one is
lowest, and which ones changed the most. Often, visualizing the data will get you to those
answers a lot more quickly.
So we can work with what we have and create a pivot chart
if we would like to. Notice I'm in PivotTable Analyze and I have this choice over here for
PivotChart. If I click on it, it opens a box for me to decide what type of
chart I want. I'm okay with the column chart for now, so we'll click OK, and now we have
our column chart. I'm going to close our field list over here. As you can see, we can now
much more quickly identify which model had the highest total variable costs in 2015 just
by looking: it's the Jespie, and if we hover on that point, we can see how much that is in
total for the year. If we want to see which one had the highest for 2016, we can again
pick that out visually pretty quickly, and it is the model called Pebble. We can also see
visually which ones are going up or down and which ones have stayed the same. So, very
quickly, we can get to the information that we want to understand better just by
taking that pivot table and turning it into a pivot chart. That's the power of using data
visualization for exploratory analysis.
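In the same SQL framing used earlier in the course, this first pivot table is roughly the
aggregation below; the table and column names (sales_2015_2016, model, year_sold,
total_variable_cost) are assumptions.

    -- Hypothetical equivalent of the pivot table: total variable cost by model and year.
    SELECT
        model,
        year_sold,
        SUM(total_variable_cost) AS total_variable_cost
    FROM sales_2015_2016
    GROUP BY model, year_sold
    ORDER BY model, year_sold;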
The other question that we wanted to answer was which costs comprise the largest
percentage of total variable cost. For that, we still want to look at variable costs,
but broken down by the type of cost instead of by model. We know from the data set
that the costs that go into total variable costs include labor, materials, overhead, freight,
warranty, package cost, and option cost. So let's create a pivot table with those, and then
we can use it to create this pivot chart. Again, I'm going to go back to the sales data set,
click anywhere within my worksheet, and go to Insert PivotTable. Now, you might be
thinking, why don't I just start with a pivot chart? You could do that; you could go
right to a pivot chart. But it's often good to have the numbers in one place so you
can manipulate the chart from those numbers, versus continually having to change the chart.
I’m going to go ahead and click OK to put this on a new worksheet. All right, so remember
now we want to look at the breakdown, by year, of the components of variable costs. So what I would like to do is find each of my components and bring those
over. So let’s start with those: we said labor, materials, option cost, package cost, freight, overhead, and warranty all go into my variable cost. Notice that it's just putting a sum of each one here. I also want this broken out by year, right? So what I want to do is put these as rows instead: I just drag the Values field from the columns area down to rows. That Values entry represents all of the value fields I've brought over here. And then I'm going to take years sold, which I believe is up here somewhere, and bring that over to my columns. So now I have all of my variable cost components, how much they are for each of the two years, and how much they
comprise of the grand total. Now we could go into each one of our value fields here and change the number format, or we could format the cells directly on the worksheet. The benefit of doing it within the value field settings is that the format stays with the field: if you rearrange the pivot table, add things or move things, the formatting carries over. If we just format the cells the way we would in any spreadsheet, from the number section on the home ribbon, that will work, but if we change or add a column to this pivot table, that column would not be formatted the same way. Okay.
So we’ve got all of our components. And again, now how do we want to visualize this to see what comprises most of the variable costs? I think even just from the grand total, we can see that it's going to be materials, which makes sense in the production of cars. But let's go ahead and click anywhere in your pivot table, and on your top ribbon pick pivot table analyze, which should be highlighted once you're in that pivot table. Then go ahead and pick pivot chart. And again, the clustered column is the default
that Excel is suggesting for us. We could choose another one if we would like; it doesn't always give us the best choices, so typically what I would do is stick with the clustered column. And so then you can see what we've done: I'll bring this over here and close our fields. So again, you know, we can quickly confirm that
materials make up the majority of our variable cost, followed fairly closely by labor, option costs, and warranty costs. We can also see a bit of a trend in which ones are going up and which ones are going down.
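The same component breakdown can be sketched in pandas as well. Again, the file and the column names (for example "Labor", "Materials", "Package Cost") are assumptions for illustration.

import pandas as pd
import matplotlib.pyplot as plt

sales = pd.read_excel("2015-2016_sales_data.xlsx")   # hypothetical file name
components = ["Labor", "Materials", "Overhead", "Freight",
              "Warranty", "Package Cost", "Option Cost"]   # assumed column names

# Sum of each variable-cost component per year, mirroring the pivot table above.
component_totals = sales.groupby("Year Sold")[components].sum().T
print(component_totals)                 # rows = component, columns = year

# Clustered columns, like the Excel pivot chart.
component_totals.plot(kind="bar", title="Variable cost components by year")
plt.tight_layout()
plt.show()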
Okay. So now that we've taken a look at this in Excel, why don't we think about how we might look at this differently. Actually, before we even move from Excel, let's go ahead and do that comparison, the year-to-year comparison. Our third question was: how have variable costs changed over time? There's a couple of ways you could do that, right? We can look at it in total. So let's just go ahead and create a new pivot table that looks at how variable costs have changed from ’15 to ’16. Again, click anywhere in our data, go to insert pivot table, and put it on a new worksheet. So what do we want to know here? We want to know, by years sold (not summing the years sold), what our total variable cost is.
Okay. So, if I wanted to go ahead and change this so that it was formatted in currency, I
can go ahead and do that. I’ll do zero decimal places. So now we have basically our total
variable costs and we can see what it is each year. Go ahead and click OK, so it changes
the format. What if I wanted to create a pivot chart of this so that I could compare year to year? So again, click anywhere in your pivot table and then click on pivot chart and you will
get your option here for your pivot chart. And so we can see that, very quickly that total
variable costs have gone up from 2015 to 2016. I want to point out a couple of things in
the default visualizations for Excel. Notice that we’re not starting at zero. That can always
be a very confusing type of visualization to create. So, this would not be considered best
practice for visualization. But again, we’re just using this for exploratory. We’re not using it
to explain or communicate our results yet. We’re just using this as a way for us to better
understand what’s happening. So it’s okay for that reason. So we can see that we’ve got an
increase in total variable costs. But what drives variable costs? Sales volume, right? So we
really need to compare this to sales volume so we can go ahead and create another
visualization that gives us sales volumes.
So let’s go ahead, back to our data again, click anywhere in the data, click on insert pivot
table. Now in this case, I don’t want to put it in a new worksheet because I want it in the
worksheet I was just working in so I can compare the two. So I’m going to click on existing
worksheet location and then click in that location box. What you need to do there is go back to the sheet that we were on, the one that had that new visualization, click on a cell where you want the pivot table to go, and say OK. All right, so now we've got another
option here for another pivot table. What I would like to do here again is look at sales
volume now by year. I’m going to go ahead and pick my year sold. I want a sum of my
years sold. And let’s go ahead and look at sales volume.
Now we had a sales volume number, but there’s another way that we can do that. We can
actually take the VIN number into values and it’ll automatically do a count for us. Okay, so
it’s counting the VIN numbers for us. So now we have the sales volume. I’m going to go
ahead and click within that pivot table and create a pivot chart and click OK. So just briefly
looking at this, and again, our scales are different, right? But we can at least see that sales volume went up, in addition to our total variable costs. Wouldn't it be nice if we could see all of this in one visualization? I'm going to show you how to use visualization software to create an image like that and also to get a better understanding of what your total variable costs might be.
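For reference, here is a minimal pandas sketch of the same two exploratory summaries: total variable cost per year, and sales volume per year obtained by counting VINs. The file and column names are assumed for illustration.

import pandas as pd

sales = pd.read_excel("2015-2016_sales_data.xlsx")   # hypothetical file name

# Total variable cost and unit volume per year; counting VINs gives the
# number of vehicles sold, just like Count of VIN in the pivot table.
yearly = sales.groupby("Year Sold").agg(
    total_variable_cost=("Total Variable Cost", "sum"),
    sales_volume=("VIN", "count"),
)
print(yearly)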
So I’m going to go ahead and use a program called Tableau. Now I’m going to open up the
Tableau file that I’ve already started for this. So in this Tableau worksheet this looks very
similar as you can tell to a pivot table, right? We’ve got areas where we can drag our rows,
areas where we can drag our columns to create our visualization. The power of visualization software is that it generally has many more visualizations to choose from. And in addition, you can manipulate things a lot more easily. So let me show you an example. One of the things we wanted to see was total variable costs, a breakdown of total variable costs, right? So if I were to take my individual costs, I could do
that by again just clicking on each of those items. Now in this particular program, it will
alphabetize the data vs. having it in the order it was in the worksheet. There’s pros and
cons to that I suppose.
So we know that we want freight, and if you double-click, it'll bring it over. We want labor, materials, option cost, overhead, package cost, and warranty. Now notice we get something, but we're not really sure this is exactly the visual we would like to look at. We also want to be able to look at it by year as well, right? So if I drag my sold
date up to my filter and pick years, then I can choose which years I want. In this case I
want both. So I'm going to hit OK. I can also drag that over to be in columns, let’s say. So
that then I’ve got, all of my information in here that I would like, now I can just go to the
show me button and pick a visualization that I would like to see. So what I’ve done is I’ve
picked this column chart and I can see in 2015 which items made up the majority of my
costs. And in 2016 which items made up the majority of my costs. We knew from our
previous analysis that it was going to be materials, but again, very quickly, visually we can
see that. Now, back to our question: what if I wanted to see 2015 total variable costs compared to volume? So let's go ahead and start a new sheet and think
about how we would want to do that. Well, we know that we would want to look at total
variable cost. We also know we would like to look at it by year. And we also know, that we
would be interested in seeing what our total sales volume was as well. So we can see I
have total variable costs and I have sales volume, and it’s not really all that much more
meaningful to me except I can see they're both going up. But if I wanted to, I could take my sales volume number and use that as a dual axis. So if I click within this second visualization, I can pick dual axis. What this does is allow me to look at sales volume and variable costs together. These are two different measurement scales, but it puts them
on the same plane so that I can see if my variable costs are rising at the same rate as my
volume.
And so after doing that, I can feel pretty comfortable, right? That they are moving in sync,
which is what we would expect variable costs to do. So again, just to summarize: this was our total variable cost by model and year. This was the one that we prepared in Excel, again confirming that the Jespie was the highest in 2015 and the Pebble was the highest in 2016. We then looked at a breakdown of variable cost in Excel so that we could see which of the variable costs were contributing the most to the total, and that clearly came out as material costs. We also split it by year so that we could see which ones are increasing or decreasing. And then we wanted to look at the
change over time and we did it initially in Excel, but then decided that really to evaluate
whether or not your variable costs are rising at the same rate as your volume, you need to
see both together. And so we used the visualization software Tableau to take both sales volume and variable costs for both years, and using a dual axis, we were able to plot
them both together in the same graph.
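For readers who want to reproduce the dual-axis idea outside of Tableau, here is a minimal matplotlib sketch using twinx(). The figures in it are placeholders for illustration only, not values taken from the Huskie Motors data.

import pandas as pd
import matplotlib.pyplot as plt

# Per-year summary as built in the earlier sketch; the numbers below are
# placeholders, not actual Huskie Motors figures.
yearly = pd.DataFrame(
    {"total_variable_cost": [40_000_000, 46_000_000], "sales_volume": [1_300, 1_374]},
    index=[2015, 2016],
)

fig, ax_cost = plt.subplots()
ax_volume = ax_cost.twinx()   # second y-axis sharing the same x-axis

ax_cost.bar(yearly.index, yearly["total_variable_cost"], width=0.4, color="tab:blue")
ax_volume.plot(yearly.index, yearly["sales_volume"], marker="o", color="tab:orange")

ax_cost.set_ylabel("Total variable cost")
ax_volume.set_ylabel("Sales volume")
ax_cost.set_xticks(yearly.index)
plt.title("Variable cost vs. sales volume (dual axis)")
plt.show()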
For our key takeaways today, we identified our issue, which was to evaluate Huskie Motors' variable cost. We identified specific questions. Which models had the highest variable cost? It was the Jespie in ’15 and the Pebble in ’16. Which costs
comprise the largest percentage of total variable costs? That was materials for both years. How have variable costs changed over time? Although they've increased from ’15 to ’16, they've done so in sync with sales volume. And then we explored the data in both Excel and Tableau.
So thank you for watching Video 3, “Exploratory Data Visualization.”
Huskie Motors Data Analytics: Video 4
In the next video, we’ll perform some statistical analysis to evaluate total variable costs and
prepare a histogram.
Video Transcript (Est. time: 23:02 min):
Hello, welcome to Huskie Motors data analytics video for statistical analysis,
“Understanding Variable Costs.” I am Ann Dzuranin, and I will be discussing the material in
this video. So let’s go over what we’re going to cover in the video. The first thing we’ll do is
identify the issue. We’ll then identify the data we need, we’ll analyze the data and then
evaluate our results.
So the issue that we’re tackling today is the use of descriptive statistics to evaluate our
total variable cost. The data that we’re going to use is the 2015-2016 sales transactions.
We’re going to prepare descriptive statistics and prepare a histogram of variable cost as
well. And then we’ll summarize our key findings. So let’s go ahead and get started. We’ll
look at our Excel file that we’re going to be working with today and that’s the sales data
from 2015 to 2016. And what we’re trying to understand better are our total variable cost.
And we’re going to do that by preparing some descriptive statistics and a histogram of the
distribution as well. So this is what our goal is going to be to create these descriptives and
to create a histogram that gives us an idea of what our distribution looks like.
So what we're gonna do first is create these descriptive statistics. Each of these statistics can tell us something really important about our data. So let's just go ahead and get started and recreate this. If we go to the 2015-’16 sales data, I'm going to scroll
over until I get to my total variable cost, that’s what I’m analyzing in this video, total
variable cost. So I’m going to go ahead and copy this column of data and put it in a
separate worksheet so that I don't have it all in one large spreadsheet. Go ahead and
copy that and put it into a new sheet. So this is the data that we're going to be working with in this video: total variable cost. Each row represents a specific car that was sold and the variable costs associated with that vehicle. Now descriptive statistics can be done
using the data analysis tool in Excel. So if I click on data in the toolbar, I can see in my
ribbon up here that in my data ribbon, that there is a section called analysis and I have a
tool called data analysis. It’s possible that you don’t have that in your Excel. It’s possible
you need to add it. So if you don’t have it and you need to add it, what you’re going to do
is click on file and then go all the way down to options. Once you’re in the options screen,
you want to go down to add-ins. Once you’re in add-ins you want to go to the bottom
where it says manage Excel add-ins and click go. Now you should see a pop-up box here that gives you the ability to add some analysis tools to your data ribbon. So make sure you have Analysis ToolPak checked; that's the first box. And then click OK. It should pop
up right in your data tab. If it doesn’t, then just refresh. Go out of your data tab and go
back to your data tab and it should be there. So once you have the data analysis tool, go
ahead and I’m going to click on that and show you what’s in here.
So this data analysis tool essentially summarizes all of the statistical functions that Excel has available for us to use. We're going to work with two of these today: the descriptive statistics tool and the histogram tool. So go ahead and
click on descriptive statistics and click okay. And now you get, a box here for you to fill in
the information to calculate the descriptive statistics. So your input range is going to be
the range of data that you would like to, get the descriptives on. So I always like to take
the column name because then it returns that in my results. I'm going to click on the first cell and highlight all the data in that column. Because I picked the label, I'm going to make sure I check labels in first row. If you don't do that, then Excel is going to tell you there's non-numeric data in the range, because there are words at the top. We're
going to go ahead and keep that in this worksheet. So I’m going to click on output range
and then click in that box and I’m just going to go ahead and scroll back up to the top here
where I want to put my results. Then I’m gonna click summary statistics. Otherwise I won’t
get any printout at all. Once I’ve done all that, I can click okay. And now what you see is a
report of variable cost. One thing that I like to do, because it can get confusing with many decimal places and no commas, is go to my number formatting here and click comma style. It just makes it easier to see the statistics and read the numbers. Okay.
So, what I'd like to do is point out which results in this variable
cost output are our most important descriptives. So if we look at these, I'm going to go ahead and
highlight them, as I talk about them. So the mean is the first one I want to talk about. We
all understand what a mean is, right? It’s a sum of all the observations divided by the
number of observations that we have. But one thing to keep in mind with a mean is that it can be influenced by outliers. So if you have, you know, one or a couple of very high numbers, it's going to pull that mean toward those high numbers. Conversely, if you have a few extremely low numbers compared to the rest of your data, it's going to pull the mean
down. So you always want to keep that in mind. The other measure of location that we’re
going to talk about is the median. The median, essentially, is determined by arranging all of the values from smallest to largest; the median is then that middle value. So the median is not influenced by outliers, and that's an important thing to keep in mind. If there are outliers in your data, then you probably want to go with the median instead of the mean. The mode is also in here. I wouldn't highlight it as something that's important for this data set. The mode is essentially the observation that occurs most frequently. It's really best used with small data sets that have a small number of unique values, and it's not very useful if you have a data set with few repeating values. So those are the measures of location that we're going to be using or interpreting in our data. Measures of dispersion are also important. Measures of dispersion include the standard deviation, the variance, and the range.
Of these three the one that’s going to be most frequently used is going to be the standard
deviation. The standard deviation is an indication of how the observations in our data set
are spread out from the mean. So if we have a low standard deviation, then that would
imply that the observations are all close to the mean. If we have a large or high standard
deviation, that would indicate that the observations are more spread out. We get to the standard deviation by calculating the variance. The variance is the average of the squared deviations from the mean, but the reason we generally use the standard deviation is that it's more practical: it's in the same unit of measure as the data. So, in other words, 5,573.89 is 5,573 dollars and 89 cents of variable cost, whereas the sample variance really is not that interpretable. The range is also an indication of dispersion. Obviously the larger the range, the more possibility you have for your data being dispersed. However, it's not as good a measure as the standard deviation.
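If you want to reproduce the Data Analysis ToolPak output outside of Excel, a minimal pandas sketch looks like the following. The file and column names are assumed; pandas, like Excel, uses the sample standard deviation and sample variance here.

import pandas as pd

sales = pd.read_excel("2015-2016_sales_data.xlsx")   # hypothetical file name
tvc = sales["Total Variable Cost"]                   # assumed column name

descriptives = pd.Series({
    "Mean": tvc.mean(),
    "Median": tvc.median(),
    "Mode": tvc.mode().iloc[0],
    "Standard Deviation": tvc.std(),      # sample standard deviation
    "Sample Variance": tvc.var(),
    "Kurtosis": tvc.kurt(),               # excess kurtosis, as in Excel's KURT
    "Skewness": tvc.skew(),
    "Range": tvc.max() - tvc.min(),
    "Minimum": tvc.min(),
    "Maximum": tvc.max(),
    "Count": tvc.count(),
})
print(descriptives.round(2))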
The other things that we can glean from our descriptive statistics are the coefficients for skewness and kurtosis. I'm going to
highlight them in different colors. What skewness and kurtosis tell us is the shape of our data. Skewness is an indication of how symmetrical our data is, the symmetry or lack of symmetry of the data. If we have data that is symmetrical, then the data is spread evenly around the mean. If we have data that is not symmetrical, it's either going to be positively skewed or negatively skewed, and what that suggests is that there's more data at one end or the other of the distribution. Kurtosis is a measure of how peaked our data is. What we mean by that is, when we look at the histogram, it gives us an idea of whether we have one area of highly peaked data with the rest spread out from there. So it gives us an indication of whether there's a somewhat flat, wide dispersion compared to a peaked distribution. Okay.
So, the other interesting things to look at here are the minimum and the maximum. These are going to be useful when we're trying to determine what our bins, or groupings, are going to be for our histogram. The minimum is the smallest observation in the data, and the maximum is the largest observation. And then finally, count is an important thing to look at because it tells us very quickly whether or not we have all of the data observations that we are supposed to have. We know that we are supposed to have 2,674 observations, and we do have that. So it gives us some confidence that we've got all of our data. All right, so now we have these
descriptives. What we can tell is that the average or the mean variable cost for a vehicle is
$17,841, and the median is $15,516. We can also tell that there's a standard deviation from that mean of about $5,573. Whether or not a standard deviation is large or small is relative to the size of your mean. So I would say that if we've got a mean of $17,841 and our standard deviation is $5,573, there's a pretty good amount of deviation
from the mean. So what we want to do next is we want to go ahead and do a histogram of
our total variable cost to get a feel for where we have buckets of data, of observations.
When you do a histogram, generally you don't want to have more than eight bins. We're going to go ahead and create our bins over here, which are essentially the columns of data that we're going to have on the column chart. To create the bins, you generally want to start close to where the minimum is and then increase by equal amounts until you get to the maximum. So that's where the
minimum/maximum come in handy.
I’m going to go ahead and start my bins at 9,000 and then I’m going to increase those by
5,000 for each bin. So if I put in 9,000 and then 14,000, which reflects the increase, I can just drag that down until I get up to 39,000. These are the bins that I'm going to use to create my
histogram and the groupings of variable cost observations. So let’s go ahead and create
that histogram. If we go back to our data tab, and we click on data analysis, and then we
click on histogram and click OK. What Excel is asking us to do is tell it, okay, what is it you want to put in your histogram? So that's our variable costs. Again, I'm going to pick the
label and I'm going to pick all the data in there. The Bin range is going to be those bins
that I just established. So I want to see my groupings. I’m starting at 9,000 and going up by
5,000. And I’m going to put the output range here on this worksheet so that we can see it
right next to what we've created. Now down here is where you're gonna want to check chart output so you actually do get the histogram. Go ahead and click OK. Now you can see what I get: my bins repeated and then the frequency, how often I see an observation that falls in that range. So the 9,000 bin essentially captures that minimum of 8,912. Notice that I've got the highest amount in the bin between 14,000 and 19,000, which makes sense because our mean is about 17,000 and our median is about 15,000. So it makes sense that this would be the bin with the highest number of observations. So it's basically saying that variable costs
between 14,000 and 19,000 occurred 889 times. So it gives us a really good indication of where most of our observations are distributed. It also gives us a hint that perhaps this is an outlier; we might want to go back and check and be sure that data is accurate. And when we've got only three observations above 39,000, we might want to make sure that data is accurate as well. So I'm going to go ahead and call this sheet
descriptives.
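The same bucketed frequency table can be sketched in pandas with pd.cut, using bins that start at 9,000 and rise by 5,000 to 39,000, with an open-ended bucket above that, like Excel's "More" bin. The file and column names are assumed for illustration.

import pandas as pd

sales = pd.read_excel("2015-2016_sales_data.xlsx")   # hypothetical file name
tvc = sales["Total Variable Cost"]                   # assumed column name

# Bin edges: everything up to 9,000 in the first bucket, then steps of 5,000
# up to 39,000, then an open-ended bucket for anything larger ("More").
edges = [0] + list(range(9_000, 40_000, 5_000)) + [float("inf")]
frequencies = pd.cut(tvc, bins=edges).value_counts().sort_index()
print(frequencies)   # counts per bucket, like the Excel histogram table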
So our goal, though, is to understand what total variable costs are per model. We know now that we've got a pretty significant standard deviation, and we know what our overall mean is for all of the vehicles that we sold, but we really want to get a little bit more detailed. We want to understand better what's happening by each model. So to do
that, we can create a pivot table. And what we’re going to do is go back to our data and
again, as we did and in other videos, we’re going to click anywhere in that data, go up to
insert pivot table, and we’re going to go ahead and put that on a new worksheet. And then
click OK. So what I’m interested in is looking at by model what my total variable costs are.
But I want to think of it in the same way that I just did my other variable cost analysis. I
want to look at the average or the mean and the standard deviation. So if I click on total
variable cost, it brings that down into my values as a sum. And I’m not interested in sum
right now, I really want to understand better what’s happening by looking at the average.
I'm going to click on average, and I'm going to change my number format to currency with
no decimal places so it's easier to read, and click OK. So now I have my average. I also
want to look at the standard deviation. So I’m gonna drag that variable costs down again.
Again, it’s going to give me a sum, and I don’t want it to be a sum, I want to change that to
be the standard deviation. So I have that option there. Again, change the number format
to currency with no decimal places. And click OK and click OK again. So now I can see by
model what the average variable cost is and what the standard deviation is. This is helpful
information because it can help me determine if I have a specific model, or a group of models, with a much higher standard deviation than others, indicating that there's a very large variance in what's happening with my total variable cost. That would help me
take that to that next step, where I want to really start diving into that model and looking
at the individual variable costs that are going into those models to see what’s happening
there. Perhaps there’s a correlation between the types of options, that the buyer puts in
and the variable cost. So there’s still a lot of research to be done, but this will help me at
least narrow it down to where I should start my investigation.
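The by-model averages and standard deviations from this pivot table can also be sketched in pandas with a groupby. The file and column names are assumed; std() here is the sample standard deviation, matching Excel's StdDev option.

import pandas as pd

sales = pd.read_excel("2015-2016_sales_data.xlsx")   # hypothetical file name

by_model = sales.groupby("Model")["Total Variable Cost"].agg(
    average_variable_cost="mean",
    std_dev="std",                    # sample standard deviation
)
# Models with the largest spread in variable cost first.
print(by_model.sort_values("std_dev", ascending=False).round(0))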
So although it's all great in a table, it really would be useful to see this in a visualization. So again, if I click anywhere in my pivot table, go to pivot table analyze, and click on pivot chart, I will get the option to create a chart out of that data. Notice that Excel is going to give me the option to do a clustered column, which doesn't really help me all that much. Obviously I can see by looking which models have a higher standard deviation, but they might also have a higher average total variable cost. Another useful way to look at this would be to use a combo chart. The default that it's giving me is bars for average total variable cost and a line for standard deviation. Let's look at this as a stacked column instead and then click
OK. So I'm going to make this bigger so we can see it a little bit better. Now with this visual we can right away see the total average variable costs and then the standard deviation on top of that. We could see this when they were side by side as well, but here we can clearly see that the Chare has the highest variable costs as well as the highest variance, the highest standard deviation. So we may want to go back and look at the Chare model, look at the Island model, and perhaps look a little more deeply into either the Pebble or the Jespie. This is just one step in our analysis journey,
but it definitely helps us to narrow it down.
To summarize what we've done: we identified our issue, we identified our data, and we analyzed it. This was the analysis of our descriptive statistics, and this was our histogram, where we were able to see the distribution. In our descriptives, we see that we have a mean of about 17,000 in variable costs per vehicle. The median was about 15,000, but we have a pretty large standard
deviation as well; it's roughly a third of the mean. So we then took that information and wanted to see what our distribution looks like, right? Let's see what our dispersion looks like. And we can see the majority of the observations are between 14,000 and 24,000, which coincides with what our descriptive statistics told us.
And then we took it one step further and we wanted to see by model what the
descriptives looked like. And, so we really wanted to get an understanding of which
models seem to have the highest variable cost and combined with the highest standard
deviation. And we can see that the models that stand out to us are the Chare, the Island,
and the Pebble. The ones with the highest standard deviation is definitely the Chare. So
our next step will likely to be to go back and look at the models that seem to have the
highest variable cost, average variable cost and the highest standard deviation, and see if
we can dig down a little bit deeper to see what costs within those variable costs are
perhaps driving that or maybe some correlations with what options the buyer bought.
So our key takeaways, we identified that we were looking at trying to better understand
total variable cost using the 2015 to ’16 sales data. We did descriptive statistics on total
variable costs, and then we broke that down by model, so that we could evaluate which
specific model had the highest average variable costs and the largest standard deviation.
So thank you for watching this video for statistical analysis.
Huskie Motors Data Analytics: Video 5
In the final video of this data analytics video series, Dr. Dzuranin will show you how to use Tableau
to bring it all together to create a dashboard with key performance indicators (KPIs) of the Huskie
Motor Corporation.
Video Transcript (Est. time: 24:41 min):
Hello, welcome to Huskie Motors data analytics Video 5, “Explanatory Data Visualization,”
creating decision useful dashboards with key performance indicators. I’m Ann Dzuranin,
and I will be discussing the material in this video. What we will cover in this video is we’ll
first identify what the issue is. And the issue is that we want to create decision-useful
dashboards with key performance indicators. So what we need to do is identify the
dashboards that are going to be needed and then the relevant key performance
indicators that we would like to have on those dashboards. So when reading through the
Huskie Motors case, we’ve identified that the Huskie Motors management is interested in
understanding and being able to quickly evaluate region performance, brand
performance, model performance, and overall performance.
As we discussed in Video 2, there can be a variety of performance measurements to choose from, but we've determined that the KPIs we're most interested in are going to be gross sales, sales volume, net revenue, and profit margin. The data we're going to use to create the dashboards is going to be the 2015 to
2016 sales transactions for Huskie Motors. We’ll use data visualization software. In this
video I’ll be demonstrating the software Tableau to analyze the data and evaluate that
with the visualizations that we create. We’ll then communicate our results by preparing
interactive dashboards, allowing management to be able to filter the dashboards to see
down to the most granular level of information that they’re interested in.
So let’s go ahead and get started. As I mentioned, the software I’m going to use to
demonstrate the visualizations in this video is called Tableau. Any visualization software
can be used to do something like this. But we’re going to work with Tableau today. I’ve
already loaded the data into the Tableau file, and that's the 2015-’16 actual sales data.
So we can go right to the first worksheet and start to create visualizations that we then
use in our dashboards. So we identified that one of the things we’re interested in is gross
sales. So let's just look at gross sales right now by region. So I can go over and quickly
pull region into my columns and then I’ll find gross sales and drag that into rows. So now
you can see I’ve got Europe, North America, and South America, my three regions, and I
have gross sales, and this represents total gross sales, but I actually want to look at gross sales separately for 2015 and 2016. So the other thing that I need to bring over is going to be the sold date. When I bring the sold date over into columns, I'm going to get the 2015 and 2016 data.
Tableau, like a lot of visualization software packages, will give you a default that they think
is best. That's not always the case. I want to look at this in column bars so that I can look and see visually from one year to the next what's happening with my data. I can quickly change the name of my sheets so that I remember what I'm working with, and we're going to go ahead and call this gross sales by region. The other thing I want to think about is how I want to be able to view this data. Am I interested in region and year, or just years? Right now I have a color scheme that identifies the years as two different colors.
So ’15 is blue and ’16 is this orange color. That may be good if I’m just concerned about
year-to-year comparisons. I’m actually concerned about region to region and then within
that year. So I actually would prefer to have region as my color. So what I'm going to do is take region and drag that onto the color mark. Now I have colored my regions. So now in any of the visualizations that I use, if I choose to use region as my color scheme, then I can quickly identify, you know, orange is North America, red is South America, blue is Europe. If I wanted to see the exact number of sales instead of having to look across at the axis, I could also just drag gross sales into the label and that'll put it on top of all of my
bars. Now that’s great, except it’s a little bit large. So I can format that number first of all,
so that it’s currency. And then I can also change the display unit. So then I just have it say
in millions. So now we can see very quickly what our sales are. We could actually even remove this gross sales axis over here if we wanted to, to take away some of the additional ink on the page that we may not need. I also might be interested in seeing this with the highest gross sales region first and then sorted from there. So what I can do is go to region and sort this based on the field, which would be the sum of gross sales, and I'll do it in descending order so that it's from highest to lowest. So now I can quickly see that North America is first; even though I could tell that from
the other version of it, at least this way I can see them in order as to which is contributing
the most to sales, gross sales by region.
So the power of doing visualizations is not only to be able to change the type of chart that
you’re using, but also to be able to make it interactive. And the best way to make
something interactive is to be sure that you have the information in here down to the most granular level. So, for example, right now I have gross sales by region. I know
eventually I’m going to want to look at gross sales by brand and by model. So what I can
do is be sure that I have those in here as filters. So if I drag brand to filters right now, I
want all of them in there and I'm going to click OK. If I drag model to filters, again I'll choose all for now and click OK. What this allows me to do is, if I only wanted to see how the brand Tatra was doing in all the regions, I could just pick Tatra and click OK. And now my visualization has changed and I can just see how the Tatra brand is doing. So that can help to make something really powerful. This way, if I want to quickly use my visualization in a dashboard, we'll be able to filter all the way down to the most granular level, which for what we're looking at right now is model. So this is a demonstration
of how we actually work with visualizations.
Let's go ahead and look at a completed Tableau file where we've put together
the dashboards for the areas that we identified, ready to help us make decisions. So I'm going to switch over to the completed Tableau file
and we'll start again with region performance. So what I'm going to do is put this in view mode, a display mode. This is a visualization that was created for region performance. Remember we said we're interested in net revenue, we're interested in sales volume, we're interested in profit margin. So the visualizations that were created for this particular dashboard show us net revenue by region, the increase and decrease in sales volume by region, and then our profit margin by region. Very quickly in this dashboard I can see that profit margin in North America is decreasing. That could be, and probably is, largely driven by my decrease in sales volume. I have a slight decrease in South America from ’15 to ’16. For that one I can see that although I have an increase in net revenue, my profit margin is going down a bit. My sales volume lines up, so this could be somewhat due to increased expenses driving my profit margin down. And
then when I look at my Europe, I can see that Europe actually is improving. So my profit
margin went from 10% to 13.6% from ’15 to ’16. And I could see my net revenue reflects
that, a large increase as well. And I can see that my sales volume is up there as well. Keep in mind, you can also use this to filter based on one aspect of it. So I just clicked on Europe, and then my other visualizations changed to only show me Europe. So if I don't want to look at what else is happening, if I'm just interested in looking at Europe today, I just go in, click on Europe, and that's the only region that pops up for me.
If I click on that again, it’ll bring them all back. And the same is true if I just clicked on
North America or I just clicked on South America.
So this is our dashboard that we’re going to use for region performance. Now we also said
another dashboard we’re interested in is brand performance. So I’m going to show you a
brand performance dashboard. So this dashboard, again, we’re looking at profit margin,
net revenue, and sales volume. I could also have brought gross sales over into the visualization, but you probably want to keep your dashboard to three, maybe four, visuals. It makes it easier for the decision maker to zero in on what they're interested in. So now I'm looking at it by brand, and I can see my brand Apechete is decreasing on profit margin, Jackson has increased, and my Tatra, now this is a problem, ’cause I can see that that brand actually has a negative profit margin. It's increased a little bit from ’15 to ’16, but something is going on here that I really need to
look further into. If I look at brand performance, net revenue, again, I can see that the
Apechete, even though net revenue has gone up a bit, our profit margin is going down.
When I look at my Jackson, my net revenue has gone up, as has my profit margin. And
when I look at the Tatra, my net revenue is going up a bit, but you know, that's still in the negative area here. So I really need to start thinking about what's going on with the Tatra. Another thing that you can do in visualizations is create what's called a
dual axis. So I’m interested in looking at how sales volume has changed from one year to
the next within my brands. And I can see, I can look at sales volume and I could have
another chart that just looks at the sales volume change, or I can put the two together. So
that gives me kind of an indication of the magnitude of the increase or decrease in my
sales volume. So if we look at the Apechete, for example, we had a sales volume in 2015
of 371 and that increased to 395. That represents a 6.47% increase. If I look at what
happened in Jackson, Jackson went up just a bit at 2.54%, and then I can see that sales volume for Tatra is pretty high overall. It did have an increase, and that increase was about 4.2%. That may have helped a little bit with what's happening over here, but I still have some things to really think about with that Tatra. The other dashboard that we said we're interested in, the other decision-making tool we're interested in using, is something that can help us understand what's happening with the model itself. And so we created a dashboard here that combines both brand and model. What's interesting about this is that there are three types of visualizations that we've used here. Again, looking at
profit margin only. And so I’ve looked at profit margin ratio by brand and model. And what
I've done here is created a chart where, if I look at the darkest blue, that's the highest median profit margin ratio, and the darkest red is going to be the lowest. So if I look at the Apechete right away, I see that I'm all in the blue, right? So I've got all positive profit margins. When I look at the Jackson, I can see I've got a couple of models that I'm a little bit concerned about because they are showing a negative profit
margin, right? So they’re in losses. And when I look at the Tatra, I can see that I’ve got
some really strong models here, but I’ve got two models that are really dragging down
what's happening overall at the Tatra. What if I want to see just one model and see how that profit margin is impacted in each region, for example? So let's say I really want to see what's happening with the Jespie. If I click on 2015 for the Jespie, I can see really quickly how that's affected in my regions, right? So even though my profit margin is negative in 2015, and, you know what, let's just look at 2016 instead, even though my profit margin is negative overall, I'm still doing okay in South America with that one. But I'm barely breaking a positive margin in Europe, and I've got a huge negative profit margin ratio in North America. If I wanted to see which models were doing best, that would be the Advantage. The Advantage, at 36.3% profit margin, is clearly driving a lot of North America. It's doing well in Europe, although not nearly as
well. And it’s doing well in South America as well, although again, definitely performing
best in North America. So what’s interesting about being able to do this is, again, you can
quickly filter, and see how everything changes. Right now, I’m looking at both years. I do
have a filter up here where I can change that so I only look at ’16. That allows me to very quickly look at what's happening in one year. And if I look at just that one year, then I can see that the Mortimer is actually doing worse than the Jespie in that year. And I can also quickly see that the one that's doing best is still the Advantage.
So this is a nice way to bring together both brand and model performance, to be able to
understand better what’s happening within our regions, our brands, and our models, and
from year to year.
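The brand-and-model profit margin view behind this dashboard can also be approximated in pandas. This sketch assumes a per-vehicle "Profit Margin" column along with the other assumed column names used earlier; the median per brand, model, and year mirrors the color-coded dashboard table.

import pandas as pd

sales = pd.read_excel("2015-2016_sales_data.xlsx")          # hypothetical file name
sales["Year"] = pd.to_datetime(sales["Sold Date"]).dt.year  # assumed column name

# Median profit-margin ratio per brand and model for each year,
# analogous to the color-coded dashboard view.
margin = sales.pivot_table(
    index=["Brand", "Model"],
    columns="Year",
    values="Profit Margin",     # assumed per-vehicle margin column
    aggfunc="median",
)
print(margin.round(3))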
Now, the last dashboard that I’m going to illustrate for you is that overall performance
dashboard. So this dashboard, what we want to do is really be able to make this very
interactive for the user. So they’re able to look at overall performance and then drill that
down by model or by brand. So what we've brought into this visualization is net revenue by region, profit margin by region, and then gross sales by country so we can get a
better understanding of what’s happening within those regions. So the regions are still
color-coded as they were in my example, where we've got North America as this orange color, Europe is blue, and the red is South America. If we look at this visualization on the bottom, this is what's called a tree map. The larger the area, the higher the amount of sales, and the color represents the region. And then I have added the detail here where you can see what that number is in millions. So we can see just by looking at this that North America comprises the largest portion of all of our sales, about half of our sales. And within
North America, the U.S. is by far the largest. If we look in South America, we’re actually
fairly close between Colombia, Venezuela, and Brazil, followed by Chile, Argentina, and Bolivia. And when we look in Europe, we can see that the U.K. and France are equal, then Germany, followed by Spain and then Poland and Sweden, which are the same.
So this is all well and good. I can see how this is all happening by country. If I just wanted
to click on the U.S. and see what the U.S. net revenue is and what the U.S. profit margins
are, I just click in U.S. and then I can just look at what’s happening there. If I’m interested,
for example, in just how France is performing, I click on France. I can see what my gross
sales are. I can see what my net revenue is year to year, and I can see what my profit
margin is as well. If I wanted to drill down to an even more granular level,
remember we were concerned about what was happening with the Tatra. So if I picked
just Tatra, now my visualizations all change and it just shows me the results for just that
brand. So it’s that brand, but all the models within that brand and I can see that that
brand does not do very well in Europe or in South America. And although the revenue is high and the sales are high in the U.S., we are running at a loss as far as profit margin. So we really need to think about what's happening there as well.
So, then if I just click on the U.S. and that will again show me what’s happening just within
the U.S. for the Tatra. Click back on there and bring it all back. We know that the Tatra has some really high-performing models and some really poorly performing models. So if I go and look at, for example, the Jespie, we knew that the Jespie was not doing so well.
We can see that this Jespie is what’s driving a lot of the negative profit margin, right?
Because we can see we’ve got a loss year to year for that model and we can see also
what’s happening over here in North America, right, with the loss, although we can see
that it is performing better in Europe in ’15, but declining in ’16 and improving in South
America. So if I look down and wanted to see, you know, well, hey, where is it doing well in
South America, I go down here and I can see that really Argentina is where I have sales.
And again, this is all relative, this profit margin is relative to what’s happening in Argentina
in South America. If I just clicked on Argentina, I can see that Argentina has had an increase in sales of the Tatra, in net revenue I should say, from year to year. And I can see that it's had a very large increase in profit margin as well. So that's the value of being able to make this dashboard interactive. I'll go ahead and change this back now to all brands and all models. When I have everything in here, it gives me that overall performance evaluation.
And then if I want to really dig into what’s driving some of my more concerning issues, for
example, my declining profit margin, then I can dig down deeper by looking at what’s
happening with each of the brands. And that automatically changes all my visualizations,
making it a very powerful way to explain what’s happening with your operations.
So we’re going to go back and kind of summarize what we’ve done here. And what we’ve
done is gone through and created dashboards for each of the areas that we want to
evaluate performance and identify the KPIs using the 2015-16 data and using those
visualizations to prepare that interactive dashboard. We put together first region
performance, where we summarized what was happening in our regions by looking at net revenue, the increase or decrease in sales volume, and then the median profit margin by region. We then delved a little bit deeper by looking at what was
happening with our brands. We have three brands in our business. We’ve got the
Apechete, the Jackson, and the Tatra. And we looked again at what was happening with
profit margin, revenue, and then sales volume. And for this one, instead of just looking at
the increase or decrease, we looked at volume as well as what the percentage change was
from year to year in each of the brands. We took that one step further and looked at
brand and model performance. There we created a visualization where we could look at profit margin by brand and model. And we made this interactive so that we could look at it by both years, which is what this static view shows, or we could have changed it and looked at one year at a time. We could look and see very quickly which profit margins are somewhat concerning. So we've got the Jespie, the Mortimer, and the Crux, and these were for both years. And then we can look at it by region over here. We saw that we could look at just ’16, or we could look at just one particular brand or model. And this was color-coded so that if the profit margin was negative, it was in the red, and if it was positive, it was in the blue, and the darker the color, the more extreme either way. And then we wrapped it up
by looking at overall performance, where we made this interactive so that we could, if we wanted to, look at it just overall. So this is all brands and all models for all areas. We looked a little bit deeper to see how the countries within the regions are performing. And
then we showed how we could change all of the visualizations at the same time by simply
going through and picking different models and brand combinations so that we could see
how those were doing.
So the takeaways, again, we identified our issue, which was to create useful dashboards
for decision making. We identified the KPIs we were interested in for that. We used the data from 2015 and ’16 to analyze that data and communicated our results in a variety of
interactive dashboards that would allow the decision maker to change those dashboards
to help identify whatever areas they were interested in examining further.
So thank you for watching Video 5, “Explanatory Data Visualization.”
Huskie Motors Conclusion
This concludes the Huskie Motors data analytics video series. Up next, you will have the
opportunity to perform some data analytics yourself, as you attempt the exercises based on
the Huskie Motors case study. You’ll now need to access the data files from the Resources
tab and save them to your computer. You’ll use these files to perform the analytics required
for the exercises in the following lesson.
Please note: To perform the exercises, you will be asked to load raw data into Microsoft Excel.
Based on the version of Excel you are using, please refer to the following links for guidance
on loading raw data:
For Office 365: https://support.office.com/en-us/article/data-import-and-analysis-options-3ea52160-08bc-45ac-acd9-bc4a11bcc2a2
For Excel 2010 – 2016: https://support.office.com/en-us/article/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857#ID0EAAEAAA=Office_2010_-_Office_2016
Module 3 Wrap‐up
You have now completed Module 3, Applying Data Analytics and Visualization. In this module, you
were able to
Explore the data analytics and visualizations described in the Huskie Motors case study, and
Apply this knowledge to complete the exercises.
You will now advance to the final module of this course, Module 4, Conclusion and Final Assessment.
Module 4: Conclusion and Final Assessment
Course Roadmap
Click Next to proceed to Module 4, Conclusion and Final Assessment.
Summary
Welcome to Module 4, Conclusion and Final Assessment! In this module, we will conclude the Data
Analytics & Visualization Fundamentals Certificate course by reviewing the overall course learning
objectives, and then moving to the Final Assessment.
Course Objectives Review
Let’s revisit the learning objectives for this course. In the preceding modules, you learned to:
Recognize the impact of technology and analytics on the accounting profession.
Demonstrate how data analytics can influence organizational strategy.
Identify ways data visualizations effectively enable appropriate business decisions.
If you are unsure of your understanding of any of these objectives, please review the modules
before attempting the final assessment.
We Are Here to Support You
IMA is here to support you as advancements in technology and analytics transform the finance
and accounting profession. As mentioned by Jeff Thomson, IMA President and CEO, in Module 1,
IMA is committed to preparing you today for changes that are happening now as well as those
that are coming in the near future.
We offer many online courses, podcasts, webinars, and live events with relevant topics in the
realm of technology and analytics, and you can expect to see more added. This compilation of
resources is curated by IMA to help you navigate through this dynamic time and prepare you for
success.
Click below to navigate to IMA’s Technology & Analytics Center to access the resources and
recommendations to further educate yourself on these topics.
Final Assessment Introduction
Next, you will progress to the Final Assessment. Remember to click the “Submit” button, located
at the top-right of your screen, after choosing your answer. If you pass the Final Assessment with
a score of 70% or higher, you will earn your Data Analytics and Visualization Fundamentals
Certificate, CPE, as well as a digital badge that you can upload to your professional profile on
social media.
Good luck!
Course Wrap‐Up
To access your professional certificate, digital badge, and NASBA CPE certificate, go to the “My
Transcript” tab in IMA’s Learning Management System. To download the certificates, click on
“Certificates.” To download your Digital Badge, click on “Digital Badge.”
We need your feedback! Please click the link displayed to complete a brief survey on your learning
experience. If you haven’t already taken the survey, please click the link below.
Thank You
Thank you for taking the IMA Data Analytics & Visualization Fundamentals CertificateTM course.
Please contact us if you have questions about this course or if you are interested in other IMA
courses.