Quantitative Techniques – STUDY GUIDE
Damelin©
BACHELORS OF COMMERCE
(GENERIC)
MODULE: QUANTITATIVE TECHNIQUES
STUDY GUIDE
2021
Copyright © Educor 2020
All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any
form or by any means, including photocopying, recording, or other electronic or mechanical
methods, without the prior written permission of Educor Holdings. Individual’s found guilty of
copywriting will be prosecuted and will be held liable for damages.
Table of Contents

1 About DAMELIN
2 Our Teaching and Learning Methodology
  2.1 Icons
3 Introduction to the Module
  3.1 Module Information
  3.2 Module Purpose
  3.3 Outcomes
  3.4 Assessment
  3.5 Planning Your Studies / Resources Required for this Module
4 Prescribed Reading
  4.1 Prescribed Book
  4.2 Recommended Articles
  4.3 Recommended Multimedia
5 Module Pacing
  5.1 WEEK 1: STATISTICS IN MANAGEMENT
    5.1.1 Introduction
    5.1.2 Statistics in Management
    5.1.3 The Terminology of Statistics
    5.1.4 Components of Statistics
    5.1.5 Statistical Applications in Management
    5.1.6 Statistics and Computers
    5.1.7 Data and Data Quality
    5.1.8 Data Types
    5.1.9 Data Sources
    5.1.10 Self-Assessment
  5.2 WEEK 2: SUMMARISING DATA: SUMMARY TABLES AND GRAPHS
    5.2.1 Introduction
    5.2.2 Summarising Categorical Data
    5.2.3 Summarising Numeric Data
    5.2.4 Self-Assessment
  5.3 WEEK 3: DESCRIBING DATA: NUMERIC DESCRIPTIVE STATISTICS
    5.3.1 Introduction
    5.3.2 Non-central Location Measures
    5.3.3 Measures of Dispersion
    5.3.4 Measure of Skewness
    5.3.5 The Box Plot
    5.3.6 Self-Assessment
  5.4 WEEK 4: BASIC PROBABILITY CONCEPTS
    5.4.1 Introduction
    5.4.2 Types of Probability
    5.4.3 Properties of a Probability
    5.4.4 Basic Probability Concepts
    5.4.5 Calculating Objective Probabilities
    5.4.6 Probability Rules
    5.4.7 Probability Trees
    5.4.8 Permutations and Combinations
    5.4.9 Self-Assessment
  5.5 WEEK 5: PROBABILITY DISTRIBUTIONS
    5.5.1 Introduction
    5.5.2 Types of Probability Distribution
    5.5.3 Discrete Probability Distributions
    5.5.4 Binomial Probability Distribution
    5.5.5 Poisson Probability Distribution
    5.5.6 Continuous Probability Distribution
    5.5.7 Normal Probability Distribution
    5.5.8 Standard Normal (z) Probability Distribution
    5.5.9 Self-Assessment
  5.6 WEEK 6: CONFIDENCE INTERVAL ESTIMATION
    5.6.1 Introduction
    5.6.2 Point Estimation
    5.6.3 Confidence Interval Estimation
    5.6.4 Confidence Interval for a Single Population Mean: Sample Standard Deviation is Known, n Large (n > 30)
    5.6.5 The Precision of a Confidence Interval
    5.6.6 The Student t-distribution
    5.6.7 Confidence Interval for a Single Population Mean (μ) when the Population Standard Deviation (σ) is Unknown
    5.6.8 Confidence Interval for the Population Proportion (π)
    5.6.9 Self-Assessment
  5.7 WEEK 7: HYPOTHESES TESTS – SINGLE POPULATION (PROPORTIONS & MEANS)
    5.7.1 Introduction
    5.7.2 The Process of Hypothesis Testing
    5.7.3 Hypothesis Test for a Single Population Mean (μ) – Population Standard Deviation (σ) is Known
    5.7.4 Hypothesis Test for a Single Population Mean (μ) – Population Standard Deviation (σ) is Unknown
    5.7.5 Hypothesis Test for a Single Population Proportion (π)
    5.7.6 The p-value Approach to Hypothesis Testing
    5.7.7 Self-Assessment
  5.8 WEEK 8: SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS
    5.8.1 Introduction
    5.8.2 Simple Linear Regression
    5.8.3 Scatter Plot
    5.8.4 Correlation Analysis
    5.8.5 The Coefficient of Determination (r²)
    5.8.6 Self-Assessment
  5.9 WEEK 9: TIME SERIES ANALYSIS: A FORECASTING TOOL
    5.9.1 Introduction
    5.9.2 The Components of a Time Series
    5.9.3 Decomposition of a Time Series
    5.9.4 Trend Analysis
    5.9.5 Seasonal Analysis
    5.9.6 Uses of Time Series Indicators
    5.9.7 Self-Assessment
6 REFERENCES
1 About DAMELIN
Damelin knows that you have dreams and ambitions. You’re thinking about the future, and how the
next chapter of your life is going to play out. Living the career you’ve always dreamed of takes some
planning and a little bit of elbow grease, but the good news is that Damelin will be there with you
every step of the way.
We’ve been helping young people to turn their dreams into reality for over 70 years, so rest assured,
you have our support.
As South Africa’s premier education institution, we’re dedicated to giving you the education
experience you need and have proven our commitment in this regard with a legacy of academic
excellence that’s produced over 500 000 world-class graduates! Damelin alumni are redefining
industry in fields ranging from Media to Accounting and Business, from Community Service to Sound
Engineering. We invite you to join this storied legacy and write your own chapter in Damelin’s history
of excellence in achievement.
A Higher Education and Training (HET) qualification provides you with the necessary step in the right
direction towards excellence in education and professional development.
2 Our Teaching and Learning Methodology
Damelin strives to promote a learning-centred and knowledge-based teaching and learning
environment. Teaching and learning activities primarily take place within academic programmes and
guide students to attain specific outcomes.
• A learning-centred approach is one in which not only lecturers and students, but all
sections and activities of the institution work together in establishing a learning
community that promotes a deepening of insight and a broadening of perspective with
regard to learning and the application thereof.
• An outcomes-oriented approach implies that the following categories of outcomes are
embodied in the academic programmes:
• Culminating outcomes that are generic, with specific reference to the critical cross-field
outcomes, including problem identification and problem-solving, co-operation,
self-organisation and self-management, research skills, communication skills,
entrepreneurship and the application of science and technology.
• Empowering outcomes that are specific, i.e. the context-specific competencies students
must master within specific learning areas and at specific levels before they exit or move
to a next level.
• Discrete outcomes of community service learning to cultivate discipline-appropriate
competencies.
Damelin actively strives to promote a research culture within which a critical-analytical approach and
competencies can be developed in students at undergraduate level. Damelin accepts that students’
learning is influenced by a number of factors, including their previous educational experience, their
cultural background, their perceptions of particular learning tasks and assessments, as well as
discipline contexts.
Students learn better when they are actively engaged in their learning rather than when they are
passive recipients of transmitted information and/or knowledge. A learning-oriented culture that
acknowledges individual student learning styles and diversity and focuses on active learning and
student engagement, with the objective of achieving deep learning outcomes and preparing students
for lifelong learning, is seen as the ideal. These principles are supported through the use of an engaged
learning approach that involves interactive, reflective, cooperative, experiential, creative or
constructive learning, as well as conceptual learning via online-based tools.
Effective teaching-learning approaches are supported by:
• Well-designed and active learning tasks or opportunities to encourage a deep rather than
a surface approach to learning.
• Content integration that entails the construction, contextualisation and application of
knowledge, principles and theories rather than the memorisation and reproduction of
information.
• Learning that involves students building knowledge by constructing meaning for
themselves.
• The ability to apply what has been learnt in one context to another context or problem.
• Knowledge acquisition at a higher level that requires self-insight, self-regulation and
self-evaluation during the learning process.
• Collaborative learning in which students work together to reach a shared goal and
contribute to one another’s learning at a distance.
• Community service learning that leads to collaborative and mutual acquisition of
competencies in order to ensure cross-cultural interaction and societal development.
• Provision of resources such as information technology and digital library facilities of a high
quality to support an engaged teaching-learning approach.
• A commitment to give effect to teaching-learning in innovative ways and the fostering of
digital literacy.
• Establishing a culture of learning as an overarching and cohesive factor within institutional
diversity.
• Teaching and learning that reflect the reality of diversity.
• Taking multiculturalism into account in a responsible manner that seeks to foster an
appreciation of diversity, build mutual respect and promote cross-cultural learning
experiences that encourage students to display insight into and appreciation of
differences.
2.1 Icons
The icons below act as markers that will help you make your way through the study guide.
Additional Information
All supplementary and recommended learning resources
Announcements
Important announcements made via myClass
Assessments
Continuous and Summative Assessments
Audio Material
Audio recordings and podcasts
Calculator
Activities that require calculation and equation base solutions
Case Study
Working examples of concepts and practices
Chat
A live chat with your Online Academic Tutor
Discussion Forum
Topic to be explored in the weekly discussion forum
Glossary
Learning activity centered on building a module glossary
Group Assignment
Assignments to be completed with peers
Help
Instructions on how to receive academic support and guidance
Individual Assignment
Assignments to be completed individually
Lesson Material
Learning content in myClass as per the units below
Module Information
Important information regarding your module like outcomes, credits,
assessment, and textbooks
Module Welcome
A welcome to the module in myClass to introduce you to the module and
important module information
Outcomes
Learning outcomes you will meet at the end of a section or module
Survey
A poll, feedback form or survey to complete
Practice
Indicates an activity for you to practice what you’ve learnt
Lesson/Virtual Class
Virtual Class links available via myClass
Quote
A thought, quote or important statement from a thought leader in the
specialist field
Reading
Prescribed reading material and module textbooks
Revision
Questions and activities that will support your module revision
Self-Assessment Quiz
Weekly quizzes to complete so you can check whether you have a complete
understanding of the lesson material
Shout Out | Example
Examples and highlights to contextualise the learning material, critical
concepts and processes
Lesson Material
Indicates sections of learning material in myClass
Thinking Point
A question, problem or example posed to you for deeper thinking,
interrogation, and reflection
Time
The allocated time required per week, unit and module related to the module
credit structure as per your factsheet
Video
Additional videos, video tutorials, desktop capture/screen recording and
other audiovisual supplementary material
Vocabulary
Important words and their definitions that aid the development of your
specialist vocabulary
3 Introduction to the Module
Welcome to the QUANTITATIVE TECHNIQUES module.
This course is Business Statistics in nature and is part of every management education programme
offered today by academic institutions and business schools. Statistics provides evidence-based
information which makes it an important decision support tool in management.
Although students are encouraged to use this guide, it must be used in conjunction with other
prescribed and recommended texts.
3.1 Module Information
Qualification title: Bachelor of Commerce (Generic)
Module Title: Quantitative Techniques
NQF Level: 7
Credits: 10
Notional hours: 100
3.2 Module Purpose
The purpose of this module is to instil a critical-thinking and analytical mindset for decision making in
business and management settings. This will give learners the ability to logically analyse sets of related
issues and come up with an informed decision.
3.3 Outcomes
At the end of this module, you should be able to:
• Describe the role of Statistics in management decision making and the importance of data in
statistical analysis.
• Describe the meaning of and be able to calculate the mean, confidence intervals for the mean,
standard deviation, standard error, median, interquartile range, and mode.
• Summarise tables (pivot tables) and graphs providing a broad overview of the profile of random
variables, identifying the location, spread, and shape of the data.
• Understand the basic concepts of probability to help a manager to understand and use
probabilities in decision making.
• Describe and make use of the probability distributions that occur most often in management
situations, which describe patterns of outcomes for both discrete and continuous events.
• Review the different methods of sampling and the concept of the sampling distribution.
• Describe the concept of interval estimation
• Describe hypothesis testing and construct the null and an alternate hypothesis.
• Describe the normal distribution, Student’s t-distribution, binomial distribution, Poisson
distribution, and the F distribution, and apply them in testing hypotheses.
• Describe the concepts of multiple correlation and regression.
• Discuss the necessity of including statistical planning in research design.
• Test the difference between correlated and uncorrelated sample means using a t-test for two
means and analysis of variance of several means.
• Express the relationship between two variables by regression and calculating their correlation.
• Analyse the dependence of one variable upon another by regression.
• Solve well-defined but unfamiliar problems using correct procedures and appropriate evidence.
• Describe the time series analysis using a statistical approach to quantify the factors that influence
and shape time series data and apply it to making forecasts of future levels of activity of the time
series variables.
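As a purely illustrative aside (this sketch is not part of the prescribed, Excel-based module material, and the sample marks below are invented), several of the measures named in these outcomes can be computed with Python's standard library:

```python
# Illustrative only: a few descriptive statistics from the outcomes above,
# computed with Python's standard library. The marks are hypothetical data.
import math
import statistics

marks = [54, 61, 58, 72, 49, 66, 70, 63, 57, 68]  # hypothetical sample

mean = statistics.mean(marks)              # arithmetic mean
median = statistics.median(marks)          # middle value of the ordered data
stdev = statistics.stdev(marks)            # sample standard deviation
std_error = stdev / math.sqrt(len(marks))  # standard error of the mean

# Approximate 95% confidence interval for the mean using z = 1.96
# (strictly, a t-value would be more appropriate for a sample this small).
lower = mean - 1.96 * std_error
upper = mean + 1.96 * std_error

print(f"mean={mean:.1f}, median={median}, stdev={stdev:.2f}")
print(f"approximate 95% CI for the mean: ({lower:.1f}, {upper:.1f})")
```

The prescribed book develops these same measures in Excel (e.g. AVERAGE, MEDIAN and STDEV.S); the sketch above is simply another way to see the formulas in action.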
3.4 Assessment
You will be required to complete both formative and summative assessment activities.
Formative assessment:
These are activities you will do as you make your way through the course. They are designed to help
you learn about the concepts, theories, and models in this module. This could be through case studies,
practice activities, self-check activities, study group / online forum discussions and think points.
You may also be asked to blog / post your responses online.
Summative assessment:
You are required to do two individual assignments, online multiple-choice questions, and an online
exam.
Mark allocation
The marks are derived as follows for this module:
Individual Assignment 1: 20%
Individual Assignment 2: 20%
Online Multiple-Choice Questions: 10%
Online Exam: 50%
TOTAL: 100%
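The weighting above combines into a final module mark as a simple weighted sum. The following sketch is illustrative only; the function name and sample marks are hypothetical, not part of the official guide:

```python
# Hypothetical helper showing how the assessment weights above combine
# into a final module mark. All inputs are percentages (0-100).
def final_mark(assignment1, assignment2, mcq, exam):
    return (0.20 * assignment1    # Individual Assignment 1: 20%
            + 0.20 * assignment2  # Individual Assignment 2: 20%
            + 0.10 * mcq          # Online Multiple-Choice Questions: 10%
            + 0.50 * exam)        # Online Exam: 50%

print(final_mark(65, 70, 80, 60))  # → 65.0
```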
3.5 Planning Your Studies / Resources Required for this Module:
What equipment will I need?
• Access to a personal computer and the internet.
• A scientific calculator: Casio FX-82ZA Plus Scientific Calculator, or Sharp WriteView Scientific Calculator (EL-W506).
4 Prescribed Reading
4.1 Prescribed Book
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based Applications. Cape Town,
South Africa: Juta. ISBN 978-1-48511-193-1.
4.2 Recommended Articles
Please refer to the additional resources that are mentioned throughout the various weeks.
4.3 Recommended Multimedia
Please refer to the video resources that are mentioned throughout the various weeks.
5 Module Pacing
Week | Topic | Study Guide Unit Number | Textbook Chapter Number
1 | STATISTICS IN MANAGEMENT | 1 | 1
2 | SUMMARISING DATA: SUMMARY TABLES AND GRAPHS | 2 | 2
3 | DESCRIBING DATA: NUMERIC DESCRIPTIVE STATISTICS | 3 | 3
4 | BASIC PROBABILITY CONCEPTS | 4 | 4
5 | PROBABILITY DISTRIBUTIONS | 5 | 5
6 | CONFIDENCE INTERVAL ESTIMATION | 6 | 7
7 | HYPOTHESES TESTS – SINGLE POPULATION (PROPORTIONS & MEANS) | 7 | 8
8 | SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS | 8 | 9
9 | TIME SERIES ANALYSIS: A FORECASTING TOOL | 9 | 10
10 | REVISION 1 | – | 1 to 2
11 | REVISION 2 | – | 3 to 5
12 | REVISION 3 | – | 6 & 8
13 | REVISION 4 | – | 9 & 10
NAME OF TOPIC FOR THE WEEK AS PER THIS GUIDE AND REFLECTIVE OF LMS
WEEKLY TOPICS FOR THE SEMESTER (2020)
Week 1: Statistics in Management
Week 2: Summarising Data: Summary Tables and Graphs
Week 3: Describing Data: Numeric Descriptive Statistics
Week 4: Basic Probability Concepts
Week 5: Probability Distributions
Week 6: Confidence Interval Estimation
Week 7: Hypothesis Tests – Single Population (Proportions & Means)
Week 8: Simple Linear Regression and Correlation Analysis
Week 9: Time Series Analysis: A Forecasting Tool
Week 10: Revision 1
Week 11: Revision 2
Week 12: Revision 3
Week 13: Revision 4
Exam Week
Each unit should be thought of as a “week of content”. If the unit is larger, it can be split over two
weeks, but a week should remain a capsule or episode of learning that can have “consolidating”
learning activities. As such, each unit is required to have a prescribed amount of learning
activities and engagements. These are to be embedded within each week and not to be listed at the
end of the unit. PLEASE SEE THE EXAMPLE OF UNIT CONTENT.
Prescribed Learning Activities and Engagements:
Video Content
At least ONE video resource is to be in each subsection of content. This is to be embedded
within the content at the appropriate time as per the learning design.

Podcast
At least ONE podcast is to be in each unit. The podcast must be seen as supplementary to
the learning content, and if a podcast is not available on the specific topics at hand, an
adjunct concept/topic can be used that will broaden the general area knowledge of the
subject matter for the student. Podcasts should not be selected that are only available on
streaming websites that require a subscription.

Thinking Point
At least ONE thinking point should be used within each subsection of content as a way to
pause the movement through content and to provide the chance for the student to think
and concretise their learning or what they have just read. A thinking point may be a
hypothetical, a personal reflection or a question regarding the content within a different
context (application). A thinking point must be thorough and engaging enough to draw
pause and focus from the student.

Case Studies
A case study should be within each unit and can be used in any relevant subsection of
content. The case study should be robust enough for the student to understand how to
apply something or to see how a function/tool/theory or practice may work in a real-world
environment. A case study should be seen as a way for the student to be reflected in the
learning experience and, as such, it is advised that case studies are selected from
local/afrocentric contexts and illustrate our commitment to intersectionality within our
teaching and learning approach and philosophy.

Discussion Forum
Each unit of study/each week will require at least ONE discussion forum topic. This can
either be embedded within a certain section of content or placed at the end of the unit
content, depending on the requirements of the module as per the subject matter. The
discussion forum topic/question should be robust and dense enough for the student to be
engaged, and a reference must be made to the fact that the Discussion Forum topic is live
and available within the module page on myClass.

Example/Practice
These are to be used within each section that deals with applied learning – the application
of a process, technique, equation or function. The example is to be used when an example
of a problem and a solution is provided, and the practice is to be used when a problem is
provided for the student to solve.

Vocabulary
Vocabulary is to be used within each subsection of content where an important word, term
or definition is provided that students are to take note of.

Glossary
The glossary is an LMS activity function and can be inserted into a guide where the
development of a glossary is required and necessary for the module. This is to be used
mainly within NQF 5 modules as it speaks to the specific level descriptors of that module.

Additional Resource
Each subsection of content must have at least THREE additional resources. These can be
supplementary articles and journals, or mixed/multimedia content such as a respected
blog, social media account, news site, music video or audio recording. The additional
resource must be provided by the study guide author if it is an “attachment” that will
require loading into the LMS.

Prescribed Reading
Each subsection must refer to a page, section or chapter in the prescribed reading for the
module. The prescribed reading should indicate to the student where to locate the texts
from which the subsection has been summarised or written. This may be placed at the
start of the subsection, or at the appropriate point where a student must leave the study
guide/LMS and read through a text section in the prescribed reading.

Quote
Each subsection of content should have at least ONE quote that is from a thought leader
in the field, or contextualises a section of learning for the student. The quote must not be
inserted as a graphic but as plain text with the appropriate graphic alongside it.

Self-Assessment Quiz
Each unit/week will have a self-assessment quiz for the student. Within the study guide,
the author can refer to the self-assessment as per the below, but must stress that the
self-assessment will be live in the module myClass page for completion.
Referencing
5.1 WEEK 1: STATISTICS IN MANAGEMENT
Purpose
The purpose of this unit is to introduce the common terms, notations and concepts in statistical analysis.

Learning Outcomes
By the end of this week, you will be able to:
• Define the term ‘management decision support system’
• Explain the difference between data and information
• Explain the basic terms and concepts of Statistics and provide examples
• Recognise the different symbols used to describe statistical concepts
• Explain the different components of Statistics
• Identify some applications of statistical analysis in business practice
• Distinguish between qualitative and quantitative random variables
• Explain and illustrate the different types of data
• Identify the different sources of data
• Discuss the advantages and disadvantages of each form of primary data collection
• Explain how to prepare data for statistical analysis.

Time
It will take you 5 hours to make your way through this study week.

Reading
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based Applications. Juta: Cape Town, South Africa. ISBN 978-1-48511-193-1. Chapter 1.
5.1.1 Introduction
This week focuses on describing what statistics is and the role it plays in management and decision
making. The importance of data in statistics is also discussed, and some basic statistical terms and
concepts are explained.
Let us Watch!
5.1.2 Statistics in Management
In all academic institutions, business schools and management colleges worldwide, business statistics
is part of every programme being offered today. The term statistics can take on a variety of meanings.
It's frequently used to describe data of any sort, mass, pressure, height, weight, stock prices, batting
average, GPA, temperature, etc. Other people may connect the term to the results of surveys, polls,
and questionnaires. Particularly in our study, we will use the term statistics primarily to designate a
specific academic discipline focused on methods of data collection, analysis, and presentation.
In virtually all cases, statistics is concerned with the transformation of data into information (Black, 2011).
Management Decision Making
For a manager, decision making is one of the most crucial aspects of the job. Decisions are made on all
business activities, such as what to sell, how to sell, how much to buy, which market to target, which
equipment to buy; whether certain types of goods are of acceptable quality; where to locate stores
so as to maximise profits; and whether girls buy more of a particular product than boys. Therefore,
well-informed decisions based on quality information need to be made (Wegner, 2016).
Information
In order to make sound and viable business decisions, managers need high-quality information.
Information must be relevant, adequate, timeous, accurate and easily accessible. Information is
organised (collected, collated, summarised, analysed and presented) data values that are meaningful
and can be used to make business decisions. Most often the information is not readily available in the
formats required by the decision makers (Wegner, 2016).
Data
Data consists of individual values, for instance observations or measurements on an issue, e.g.
R400.50, 5 days, 70 meters, strongly agree, etc. Data is readily available but carries little useful and
usable information for decision makers (Wegner, 2016).
Statistics
It is a set of mathematically based methods and techniques which transform small or large sets of raw
(unprocessed) data into a few meaningful summary measures. These measures may exhibit relationships
and show patterns and trends, and so contain very useful and usable information to support sound
decision making – whether we are sorting out the day’s stock quotations to make a more informed
investment decision, or unscrambling the latest market research data so we can better respond to
customer needs and wants. The understanding and use of statistics equips managers with confident
quantitative reasoning skills that enhance decision-making capabilities and provide an
advantage over colleagues who do not possess them (Black, 2013).
Transformation process from data to information:

INPUT (Data) → PROCESS (Statistical Analysis) → OUTPUT (Information) → BENEFIT (Management decision making)

Source: Wegner (2016)
Statistics supports the decision process by strengthening the quantifiable basis from which a well-informed decision can be made.
Figure 1.1 Key Statistical Elements
Source: Black (2013)
Let us watch
Watch the YouTube link below and write a brief essay on why
you think statistics is important.
https://www.youtube.com/watch?v=yxXsPc0bphQ
5.1.3 The terminology of Statistics
Some essential terms and concepts are:
Random Variable is any attribute or characteristic being measured or observed. It takes different
values at each point of measurement e.g. Years of experience of an employee.
Data, these are real values or outcomes drawn from a random variable e.g. Years of experience of an
employee might be (2, 4, 1, 3, 2, 6).
See below some examples of random variables and related data:
• Travel distances of delivery vehicles (data: 22 km, 18 km, and 29 km)
• Daily occupancy rates of hotels in Pretoria (data: 34%, 48%, and 34%)
• Duration a machine spends working (data: 13 min, 21 min, and 18 min)
• Brand of washing powder preferred (data: Sunlight, OMO, and Ariel).
Sampling Unit: this is the item being measured, observed or counted with respect to the random
variable under study, e.g. employees.
Population represents every possible item that contains a data value (measurement or observation)
of the random variable under study. The sampling units should possess the characteristics that are
relevant to the problem. e.g. all employees of Damelin Pretoria City.
Population Parameter, the actual value of a random variable in a population. It’s derived from all data
values on the random variable in the population. It is constant. e.g. about 57% of MTN employees
have more than 5 years’ experience.
The sample is a subset of items drawn from a population. e.g. employees in the Finance department
of MTN.
Sample Statistic: a value of a random variable derived from sample data. It is NOT constant, as its
value depends on the values included in each sample drawn.
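The difference in behaviour between the two can be seen in a minimal Python sketch (the population values below are invented for illustration): the population parameter is one fixed number, while the sample statistic changes with every sample drawn.

```python
import random
import statistics

# Hypothetical population: years of experience of all 20 employees at a firm
population = [2, 4, 1, 3, 2, 6, 5, 7, 3, 2, 8, 4, 6, 1, 2, 9, 3, 5, 4, 7]

# The population parameter (mu) is computed from ALL values and is constant
mu = statistics.mean(population)
print(f"Population mean (parameter): {mu}")   # 4.2

# A sample statistic (x-bar) varies from sample to sample,
# because it depends on which units happen to be drawn
random.seed(1)
for _ in range(3):
    sample = random.sample(population, 5)
    print(f"Sample {sample} -> sample mean (statistic): {statistics.mean(sample)}")
```

Running the loop several times shows a different sample mean each time, while the population mean never changes.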
Table 1.1 Examples of populations and associated samples

Random variable                          | Population                                                   | Sampling unit                           | Sample
Size of bank overdraft                   | All current accounts with Absa                               | An Absa client with a current account   | 400 randomly selected clients’ current accounts
Mode of daily commuter transport to work | All commuters to Cape Town’s central business district (CBD) | A commuter to Cape Town’s CBD           | 600 randomly selected commuters to Cape Town’s CBD
TV programme preferences                 | All TV viewers in Gauteng                                    | A TV viewer in Gauteng                  | 2000 randomly selected TV viewers in Gauteng
Age of students at a college             | All students at Damelin College                              | A registered student at Damelin College | 1000 randomly selected registered Damelin students

Source: Wegner (2016)

Table 1.2 Symbolic notation for sample and population measures

Statistical Measure | Sample Statistic | Population Parameter
Mean                | x̄                | μ
Standard deviation  | s                | σ
Variance            | s²               | σ²
Size                | n                | N
Proportion          | p                | π
Correlation         | r                | ρ

Source: Wegner (2016)
5.1.4 Components of Statistics
Statistics has three components namely:
• Descriptive Statistics – condense large volumes of data into summary measures. It seeks
to paint a picture of a management problem scenario.
• Inferential Statistics – sample findings can be generalized to the broader population. It
extends the sample findings to the actual population.
• Statistical Modelling – builds relationships between variables to make predictions. It uses
equations to explain variables and to estimate or predict values of one or more of the
variables under different management scenarios.
5.1.5 Statistical Applications in Management
Recent examples that show the importance of statistics in decision making:
• According to an Electronics wholesale survey, the average amount spent by a shopper on
computer accessories in a two-month period is R 7280 at the Game store, R5040 at Makro,
R2460 at PnP hyper, R6720 at the Incredible store, and R4200 at President Hyper.
• A Job Mail survey of 1275 workers reports that 45% of workers believe that the quality
of their work is perceived the same when they work remotely as when they are physically
in the office.
• A KPMG Retail “Blue” survey of 1860 adults revealed that 44% agreed that plastic, non-compostable shopping bags should be banned.
From these few examples, it is clear that there is a wide variety of uses and applications of statistics in
business, including:
• Finance
• Marketing
• Human Resources
• Operations/Logistics
• Economics
5.1.6 Statistics and Computers
The invention of computers has opened many new opportunities for statistical analysis. A computer
allows for storage, retrieval, and transfer of large data sets. Some widely used statistical techniques,
such as multiple regression, are so tedious and cumbersome to compute manually that they were of
little practical use to researchers before computers were developed. Widely used statistical software
packages include R, Minitab, SAS, and SPSS (Wegner, 2016).
Case Study
The link below takes you through various statistical packages that
are used in industry. Identify three statistical packages that are of
interest to you and carry out brief research into the features they
offer.
https://www.youtube.com/watch?v=hHywVkLwLzg
5.1.7 Data and Data Quality
To this point, we've used the term data pretty loosely. In statistics, the term refers specifically to facts
or figures that are subject to summarization, analysis, and presentation. A data set is a collection of
data having some common connection. Data can be either numeric or non-numeric. Numeric data are
data expressed as numbers. Non- numeric data are represented in other ways, often with words or
letters. Telephone numbers and golf scores are examples of numeric data. Nationalities and
nicknames are non-numeric. Data is the raw material of statistical analysis. If the quality of data is
poor, the quality of information derived from statistical analysis of this data will also be poor.
Consequently, user confidence in the statistical findings will be low. A useful acronym to keep in mind
is GIGO, which stands for ‘garbage in, garbage out’. It is therefore important to understand the
influences on the quality of data needed to produce meaningful and reliable statistical results. Data is
used to plan business activities and to make business decisions (Black, 2013). Data should be of good
quality, and the quality of data depends on three aspects:
• Type of data
• Source of data
• Data collection method
5.1.8 Data Types
Classification 1: categorical versus numeric
Table 1.3 Categorical Data

Random variable   | Categories | Codes
Gender            | Female     | 1
                  | Male       | 2
Country of origin | Angola     | 1
                  | Botswana   | 2

Source: Developer’s compilation
• Categorical data (qualitative): refers to data representing categories of outcomes of a random
variable, e.g. gender (female or male).
• Numeric data (quantitative data): refers to real numbers that can be manipulated using arithmetic
operations to produce meaningful results.
Classification 2: Nominal, Ordinal, Interval and Ratio Scales of Measurement
Enormous amounts of numerical data are gathered in businesses every day, representing myriad items. For
instance, numbers represent rand costs of items produced, geographical locations of retail outlets,
weights of shipments, and rankings of subordinates at yearly reviews. All this data should not be
analysed the same way statistically because the entities represented by the numbers are different. In
such cases, the business researchers need to know the level of data measurement represented by the
numbers being analysed Black (2013). Four common levels of data measurement follow.
1. Nominal
2. Ordinal
3. Interval
4. Ratio

Nominal is the lowest level of data measurement, followed by ordinal, interval, and ratio. Ratio is the
highest level of data measurement.
Figure 1.2 Hierarchy of Levels of Data
Source: Black (2013)
• Nominal-scaled data: used on categorical data of equal importance e.g. gender – male or female
• Ordinal-scaled data: used on categorical data where ranking is implied e.g. Shirt size – small,
medium, large.
• Interval-scaled data: It is mainly from rating scales, which are used in questionnaires to measure
respondents’ attitudes, motivations, preferences, and perceptions e.g. attitudes – poor, unsure,
good etc.
• Ratio-scaled data: used on numeric data involving direct measurement where there is an absolute
origin of zero. E.g. length of service – 27 months, 45 months.
Classification 3: Discrete versus Continuous Data
• Discrete data: consists of whole numbers only. E.g. 1, 2, 3, 4
• Continuous data: is numeric data that can take any value in an interval (both whole number and
fractional values) e.g. 4, 4.6, 10.7, 34.2
Case Study / Online Forum discussion
Data types are a very important concept in the world of statistics.
See the video below for additional information on data types and
discuss in a group their relevance to data analysis.
https://www.youtube.com/watch?v=hZxnzfnt5v8
5.1.9 Data Sources
Data, of course, can be collected from a variety of sources—from vast government agencies charged
with maintaining public records to surveys conducted among a small group of customers or
prospective clients.
The Internet
Over the last 20 years, the Internet has become an almost limitless source of business and economic data.
With the help of powerful search engines, even the casual user has instant access to data that once
would have required hours, if not weeks, of painstaking research (Bergquist et al., 2013).
Government Agencies and Private-Sector
Both governmental agencies and private-sector companies gather and make available a wealth of
business and economic data.
Thinking Point
Name both governmental agencies and private sector companies that gather and make
available business and economic data.
Internal versus External Sources
• Internal data refers to data available from within an organization, for example financial,
production and human resources data.
• External data refers to data available from outside an organization, for example employee
associations, research institutions and government bodies.
Primary versus Secondary Sources
• Primary data is data which is captured at the point at which it is generated, e.g. surveys.
• Secondary data is data collected and processed by others for purposes other than the
problem at hand, e.g. publications.
Data Collection Methods
• Observation Methods
Direct observation – data is collected by directly observing the respondent or object in action, e.g. a
vehicle traffic survey.
Desk Research (Abstraction) – extracting secondary data from a variety of source documents. E.g.
books, publications, newspapers etc.
• Survey Methods:
Primary data is gathered through the direct questioning of respondents.
Personal interviews: involve a face-to-face session with a respondent during which a questionnaire is
completed.
Postal surveys: involve posting questionnaires to respondents for completion
Let us watch
Watch the video at the link below about ideas on data collection
and list some instances for which you think a particular data
collection method is suited.
https://www.youtube.com/watch?v=8SHnJfPQ9qc
REVISION QUESTIONS
1. Why is it necessary to differentiate between different types of data?
2. What is the difference between:
   • Quantitative and qualitative data?
   • Discrete and continuous data?
3. What types of information would be included in quantitative data?
4. In what ways is qualitative data critical to the success of a business?
Before we progress to the following learning unit, make sure you are able
to understand and talk through the following concepts:
• Nominal data
• Ordinal data
• Interval data
• Ratio data
• Scale of measurement
5.1.10 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment.
The Self-Assessment for this unit is embedded within your Quantitative
Techniques in myClass. Head on to the quiz to see how you have fared with
this section of content!
Be sure to complete the self-assessment quiz before you move on to the next
section!
5.2 WEEK 2: SUMMARISING DATA: SUMMARY TABLES AND GRAPHS
Purpose
The purpose of this unit is to explain how to summarise data into table format and how to display the results in an appropriate graph or chart.

Learning Outcomes
By the end of this unit, you will be able to:
• Summarise categorical data into a frequency table and cross-tabulation table.
• Interpret the findings from a categorical frequency table and cross-tabulation table.
• Construct and interpret appropriate bar and pie charts.
• Summarise numeric data into a frequency distribution and cumulative frequency distribution (ogive).
• Construct and interpret a histogram and a cumulative frequency polygon.
• Construct and interpret a scatter plot of two numeric measures.
• Display time series data as line graphs and interpret trends.

Time
It will take you 10 hours to make your way through this study week.

Reading
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based Applications. Juta: Cape Town, South Africa. ISBN 978-1-48511-193-1. Chapter 2.
5.2.1 Introduction
This week aims to give students an understanding of the common ways in which statistical findings
are conveyed. The most commonly used means of displaying statistical results are summary tables
and graphs. Summary tables can be used to summarise single random variables as well as to
examine the relationship between two random variables. The choice of summary table and graphic
depends on the type of data to be displayed. Managers benefit from statistical findings only when
the information is easily interpreted and communicated effectively to them. Tables and graphs
convey information much more efficiently and quickly than a written report; with graphs and tables
there is much truth in the old adage ‘a picture is worth a thousand words’. In practice, analysts
should, most if not all of the time, consider using summary tables and graphical displays rather than
written text alone. The profile of a single random variable (e.g. the most-preferred TV channel of
viewers, or the pattern of delivery times), or the relationship between two random variables (e.g.
between gender and newspaper readership), can easily be summarised by summary tables and graphs.
5.2.2 Summarising Categorical Data
Quantitative data graphs are plotted along a numerical scale, while qualitative graphs are plotted using
non-numerical categories. In this section of the study unit, the aim is to examine two types of
qualitative data graphs: pie charts and bar charts.
A categorical summary table (one-way pivot table) shows the count/percentage of responses that
belong to each category of a categorical variable. If given as a count it is called the absolute frequency,
and if given as a percentage or fraction it is referred to as the relative frequency (Wegner, 2016).
Table 2.1 Types of Cars at Company X

Car Type | Absolute Frequency | Relative Frequency (%)
Mazda    | 6                  | 40
Toyota   | 3                  | 20
Nissan   | 2                  | 13
Isuzu    | 4                  | 27
Total    | 15                 | 100

Source: Developer’s own compilation (2021)
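The counts and percentages in a one-way frequency table such as Table 2.1 can be computed with a short standard-library Python sketch; the raw list of 15 car records below is a hypothetical recreation consistent with the table.

```python
from collections import Counter

# Hypothetical raw data: car type of each of the 15 vehicles at Company X
cars = ["Mazda"] * 6 + ["Toyota"] * 3 + ["Nissan"] * 2 + ["Isuzu"] * 4

counts = Counter(cars)              # absolute frequencies per category
n = len(cars)

for car_type, freq in counts.items():
    rel = round(100 * freq / n)     # relative frequency as a percentage
    print(f"{car_type:7s} {freq:3d} {rel:4d}%")
```

The relative frequencies reproduce the 40/20/13/27 split shown in the table (the 13% and 27% arise from rounding 2/15 and 4/15).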
Data from a categorical frequency table can be displayed as a pie chart or simple bar chart.
Simple Bar chart
Bar charts or bar graphs contain two or more categories along one axis and a series of bars, one for each
category, along the other axis. The length of each bar represents the magnitude of the measure
(frequency, percentage, amount, money, etc.) for its category. A bar graph is qualitative since the
categories are non-numerical, and the bars may be either horizontal or vertical. The same type of data
that is used to produce a bar graph is also used to construct a pie chart. The advantage of a bar graph
over a pie chart is that, for categories close in value, it is easier to see the difference between the bars
of a bar graph than to discriminate between pie slices (Black, 2011).
Construction of a simple bar chart
• The categories are exhibited on the horizontal axis.
• Frequencies are exhibited on the vertical axis.
• The height of each bar displays the frequency of each category.
• The width of the bars must be constant.
Figure 2.1 Simple chart for Table 2.1
Source: Developer’s own compilation (2021)
In the example below, consider the data in Table 2.2 on back-to-school spending by an average student.
When constructing a bar graph from the data, the categories would be Electronics, Clothing and
Accessories, Dorm Furnishings, School Supplies, and Misc. Bars for each of these categories are made
using the rand figures given in the table. Figure 2.2 is the resulting bar graph produced by Excel.
Table 2.2 Back-to-School Spending

Category         | Amount Spent (R)
Electronics      | R211.89
Clothing         | R134.40
Dorm Furnishings | R90.90
School Supplies  | R68.47
Misc.            | R93.72

Source: Black (2013)
Figure 2.2 Bar Graph of Back to School Spending
Source: Black (2013)
Pie Chart
A circular display of data where the area of the whole circle represents 100% of the data and slices of
the circle represents a percentage breakdown of the different sublevels is known as a pie chart. Pie
charts are widely used in business, mostly in showing things such as budget, market share, ethnic
groups and time/resource allocation categories. Since pie charts can lead to less accuracy than is
possible with other types of graphs, however, their use is minimised in the sciences and technology.
In general, it is harder for the viewer to interpret the relative sizes of angles in a pie chart
than to judge the lengths of rectangles in a bar chart (Black, 2013).
Construction of a Pie chart
• Divide a circle into segments.
• Size of each segment should be proportional to the frequency count/ percentage of its category.
• The sum of the segment frequencies must equal the whole.
Figure 2.3 Pie Chart for Table 2.1
(Segments: Mazda 40%, Toyota 20%, Nissan 13%, Isuzu 27%)

Source: Black (2013)
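The construction steps above amount to converting each category's relative frequency into a proportional share of the 360 degrees of the circle. A minimal sketch, reusing the counts from Table 2.1:

```python
counts = {"Mazda": 6, "Toyota": 3, "Nissan": 2, "Isuzu": 4}
total = sum(counts.values())

for car_type, freq in counts.items():
    share = freq / total      # fraction of the whole circle
    angle = 360 * share       # size of the segment in degrees
    print(f"{car_type:7s} {share:6.1%} {angle:6.1f} degrees")
```

The four angles (144, 72, 48 and 96 degrees) necessarily sum to 360, which is the check implied by the third construction step.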
• Profiling two categorical variables
A cross-tabulation table (two-way pivot table) shows the number/percentage of observations that
jointly belong to each combination of categories between two categorical variables. For example, car
type at Company X in the two years 2007 and 2008.
The first categorical variable is car type with four categories (Mazda, Toyota, Nissan, and Isuzu). The
second categorical variable is a year with two categories (2007 and 2008). The results can be displayed
as shown in Table 2.3:
Table 2.3 Type of Car at Company X, 2007 & 2008

Car Type | 2007 | 2008 | Total
Mazda    |    6 |    5 |    11
Toyota   |    3 |    2 |     5
Nissan   |    2 |    1 |     3
Isuzu    |    4 |    7 |    11
Total    |   15 |   15 |    30

Source: Black (2013)
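A cross-tabulation is simply a count over pairs of category values. A standard-library sketch, with hypothetical (car type, year) records chosen to reproduce the counts in Table 2.3:

```python
from collections import Counter

# Hypothetical raw records: (car type, year) for each of the 30 vehicles
records = ([("Mazda", 2007)] * 6 + [("Toyota", 2007)] * 3 +
           [("Nissan", 2007)] * 2 + [("Isuzu", 2007)] * 4 +
           [("Mazda", 2008)] * 5 + [("Toyota", 2008)] * 2 +
           [("Nissan", 2008)] * 1 + [("Isuzu", 2008)] * 7)

crosstab = Counter(records)        # joint count per (car type, year) pair

for car in ["Mazda", "Toyota", "Nissan", "Isuzu"]:
    row = [crosstab[(car, year)] for year in (2007, 2008)]
    print(f"{car:7s} {row[0]:5d} {row[1]:5d} {sum(row):6d}")
```

Each printed row matches a row of Table 2.3, with the row total as the last column; column totals could be added the same way by summing over car types.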
Data from a cross-tabulation table can be displayed as a component bar chart or a multiple bar chart.
Figure 2.4 Component (Stacked) Bar Chart for Table 2.3
Source: Developer’s own compilation
5.2.3 Summarising Numeric Data
Raw data, or data that have not been summarized in any way, are sometimes referred to as
ungrouped data. Data that have been organized into a frequency distribution are called
grouped data. The distinction between ungrouped and grouped data is important because
the calculation of statistics differs between the two types of data. Several of the charts and
graphs presented in this section are constructed from grouped data. One particularly useful
tool for grouping data is the frequency distribution, which is a summary of data presented in
the form of class intervals and frequencies (Black, 2013).
Profiling a single numeric variable
A numeric frequency table (distribution) is a summary table which groups numeric data into
intervals and reports the frequency count of numbers assigned to each interval.
Construction of a frequency table:
• Determine the data range, often defined as the difference between the largest and smallest
values
• Decide on a number of classes, one rule of thumb is to select between 5 and 15 classes. If the
frequency distribution contains too few classes, the data summary may be too general to be useful.
Too many classes may result in a frequency distribution that does not aggregate the data enough
to be helpful. The final number of classes is arbitrary. The business researcher arrives at a number
by examining the range and determining the number of classes that will span the range adequately
and also be meaningful to the user
• Determine class width, an approximation of the class width can be calculated by dividing the range
by the number of classes
• Determine class limits, which are selected so that no value of the data can fit into more than one class
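The four construction steps above can be sketched in a few lines of Python. The raw observations below are invented for illustration, and the class width is rounded up so that the classes span the full range:

```python
import math

# Hypothetical raw observations
data = [23, 45, 12, 67, 34, 56, 41, 29, 60, 18, 52, 38, 47, 25, 63]

# Step 1: the range
rng = max(data) - min(data)           # 67 - 12 = 55

# Step 2: choose a number of classes (rule of thumb: between 5 and 15)
k = 5

# Step 3: approximate class width, rounded up so the top class covers the maximum
width = math.ceil((rng + 1) / k)      # ceil(56 / 5) = 12

# Step 4: class limits chosen so each value fits into exactly one class
limits = [min(data) + i * width for i in range(k + 1)]   # [12, 24, 36, 48, 60, 72]
freq = [sum(1 for x in data if a <= x < b) for a, b in zip(limits, limits[1:])]

for (a, b), f in zip(zip(limits, limits[1:]), freq):
    print(f"{a} -<{b}: {f}")
```

Each observation is counted in one half-open interval `a -<b`, matching the `115 -<130` style of interval used in Table 2.4.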
Table 2.4 Frequency Table of Office Data

Class Interval | Absolute Frequency | Relative Frequency (%) | Less-than Cumulative (%) | More-than Cumulative (%) | Class Mid-point
115 -<130      |  5                 | 12.5                   | 12.5                     | 100                      | 122.5
130 -<145      |  7                 | 17.5                   | 30                       | 87.5                     | 137.5
145 -<160      |  6                 | 15                     | 45                       | 70                       | 152.5
160 -<175      | 12                 | 30                     | 75                       | 55                       | 167.5
175 -<190      |  8                 | 20                     | 95                       | 25                       | 182.5
190 -<205      |  2                 | 5                      | 100                      | 5                        | 197.5
Total          | 40                 | 100                    |                          |                          |

Source: Black (2013)
Histogram
A Histogram is a graphic display of numeric frequency distribution. One of the more widely
used types of graphs for quantitative data is the histogram. A histogram is a series of
contiguous rectangles that represent the frequency of data in given class intervals. If the class
intervals used along the horizontal axis are equal, then the heights of the rectangles represent
the frequency of values in a given class interval. If the class intervals are unequal, then the
areas of the rectangles can be used for relative comparisons of class frequencies (Black, 2013).
Figure 2.6 Example of a Histogram
Source: Black (2011)
Ogive – the cumulative frequency table
An ogive (o-jive) is a cumulative frequency polygon. Construction begins by labelling the x-axis with the class endpoints and the y-axis with the frequencies. However, the use of
cumulative frequency values requires that the scale along the y-axis be great enough to
include the frequency total. A dot of zero frequency is plotted at the beginning of the first
class, and construction proceeds by marking a dot at the end of each class interval for the
cumulative value. Connecting the dots then completes the ogive. Ogives are most useful when
the decision maker wants to see running totals. For example, if a comptroller is interested in
controlling costs, an ogive could depict cumulative costs over a fiscal year. Steep slopes in an
ogive can be used to identify sharp increases in frequencies (Black, 2013).
Figure 2.7 Ogive of the Unemployment Data
Source: Black (2013)
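The running totals plotted by an ogive can be produced directly with itertools.accumulate. A sketch using the class endpoints and relative frequencies of Table 2.4, including the zero dot at the start of the first class:

```python
from itertools import accumulate

endpoints = [115, 130, 145, 160, 175, 190, 205]   # class limits from Table 2.4
rel_freq  = [12.5, 17.5, 15, 30, 20, 5]           # relative frequencies (%) per class

# A dot of zero frequency is plotted at the beginning of the first class,
# then the cumulative total at the end of each class interval
cumulative = [0] + list(accumulate(rel_freq))

for x, y in zip(endpoints, cumulative):
    print(f"x = {x:3d}, cumulative % = {y}")
```

Joining these (x, y) points with straight lines gives the ogive; the final point always reaches 100% (or the frequency total, if absolute frequencies are used).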
REVISION QUESTIONS
1. Complete the sentence: ‘A picture is worth a .......................’
2. What name is given to the chart that displays:
a) The summarised data of a single categorical variable?
b) The summarised data of two categorical variables simultaneously?
3. What is the name given to the table that summarises the data of
two categorical variables?
4. Explain at least three differences between a bar chart and a
histogram.
5. What is the name of the chart that is used to display time series
data?
5.2.4 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment.
The Self-Assessment for this unit is embedded within your Quantitative
Techniques module in myClass. Head on to the quiz to see how you have fared
with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next
section!
5.3 WEEK 3: DESCRIBING DATA: NUMERIC DESCRIPTIVE STATISTICS

Purpose
The purpose of this week is to explore descriptive statistics; these statistics help to identify the location, spread and shape of the data.

Learning Outcomes
By the end of this week, you will be able to:
• Describe the various central and non-central location measures.
• Calculate and interpret each of these location measures.
• Describe the appropriate central location measure for different data types.
• Describe the various measures of spread (or dispersion).
• Calculate and interpret each measure of dispersion.
• Describe the concept of skewness.
• Calculate and interpret the coefficient of skewness.
• Calculate the five-number summary table and construct its box plot.
• Explain how outliers influence the choice of valid descriptive statistical measures.

Time
It will take you 10 hours to make your way through this study week.
Reading
Wegner, T (2016). Applied business statistics methods and Excel-based applications.
Juta. Cape Town South Africa. 978-1-48511-193-1. Chapter 3
5.3.1 Introduction
Summary tables and their graphical displays, described in study week 2, are used to communicate broad
overviews of the profiles of random variables. Managers sometimes need numerical measures
(statistics) to convey more precise information about the behaviour of random variables. This precise
communication of data is the purpose of descriptive statistical measures.
5.3.1.1 Central Location Measures
Descriptive Statistics – Location Measures
Definition of Measures of Central Tendency
One type of measure that is used to describe a set of data is the measure of central tendency. Measures of central tendency yield information about the centre, or middle part, of a group of numbers: observations of a random variable tend to group about some central value. The statistical measures, which quantify where the majority of observations are concentrated, are referred to as measures of central location. A central location statistic represents a typical value or middle data point of a set of observations and is useful for comparing data sets (Wegner, 2016).
There are three main measures of central location:
1. Arithmetic mean
2. Median
3. Mode
Measures of Central Tendency for Ungrouped Data
1. Mean
The arithmetic mean is the average of a group of numbers and is computed by summing all numbers
and dividing by the number of numbers. Because the arithmetic mean is so widely used, most
statisticians refer to it simply as the mean. The population mean is represented by the Greek letter
mu (μ). The sample mean is represented by x̄.
Formula:

Population Mean: μ = ∑xᵢ / N, where N = population size

Sample Mean: x̄ = ∑xᵢ / n, where n = sample size

2. The Mode
The mode is the most frequently occurring value in a set of data. Organising the data into an ordered array (an ordering of the numbers from smallest to largest) helps to locate the mode, i.e. the value with the highest frequency. It is determined by observation (Wegner, 2016).
3. The Median
The median is the middle value in an ordered array of numbers. For an array with an odd number of
terms, the median is the middle number. For an array with an even number of terms, the median is
the average of the two middle numbers (Wegner, 2016).
The following steps are used to determine the median.
STEP 1. Arrange the observations in an ordered data array.
STEP 2. For an odd number of terms, find the middle term of the ordered array. It is the median.
STEP 3. For an even number of terms, find the average of the middle two terms. This average is the
median.
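The three ungrouped measures can be sketched with Python's standard `statistics` module; the sample below is hypothetical data invented for illustration:

```python
from statistics import mean, median, mode

# Hypothetical ungrouped sample, e.g. books read by 7 students
data = [4, 2, 7, 4, 3, 5, 4]

x_bar = mean(data)   # sum of all values divided by the number of values
med = median(data)   # middle value of the ordered array [2, 3, 4, 4, 4, 5, 7]
mo = mode(data)      # most frequently occurring value

print(x_bar, med, mo)
```

With an odd number of terms the median is the single middle term; `statistics.median` automatically averages the two middle terms when the count is even.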
Measures of Central Tendency for Grouped data
1. Mean
For ungrouped data, the mean is computed by summing the data values and dividing by the number
of values. With grouped data, the specific values are unknown. What can be used to represent the
data values? The midpoint of each class interval is used to represent all the values in a class interval.
This midpoint is weighted by the frequency of values in that class interval. The mean for grouped data
is then computed by summing the products of the class midpoint and the class frequency for each
class and dividing that sum by the total number of frequencies. The formula for the mean of grouped
data follows.
x̄ = ∑(fx) / n

Where:
x = midpoint of each class interval
f = frequency of each class interval
n = total frequency
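The grouped mean can be sketched directly from a frequency table; the midpoints and frequencies below are hypothetical:

```python
# Hypothetical frequency table: class midpoints and class frequencies
midpoints = [5, 15, 25, 35]    # midpoint x of each class interval
frequencies = [2, 6, 8, 4]     # frequency f of each class

n = sum(frequencies)  # total frequency
# Weight each midpoint by its class frequency, sum, then divide by n
grouped_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
print(grouped_mean)
```

Each midpoint stands in for all the (unknown) values in its interval, which is why the grouped mean is only an approximation of the true mean.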
2. The Mode
The mode for grouped data is the class midpoint of the modal class. The modal class is the class interval
with the greatest frequency. The formula for the mode of grouped data follows
Mo = O๐‘š๐‘œ +
๐‘(⌊๐‘“๐‘š −๐‘“๐‘š−1 ⌋)
2๐‘“๐‘š − ๐‘“๐‘š−1 − ๐‘“๐‘š+1
Where: O๐‘š๐‘œ = lower limit of the modal interval
๐‘ = width of the modal interval
๐‘“๐‘š = frequency of the modal interval
๐‘“๐‘š−1 = frequency of the interval preceding the modal interval
๐‘“๐‘š+1 = frequency of the interval following the modal interval
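The grouped-mode formula translates directly into code; the interval limits and frequencies below are a hypothetical example:

```python
def grouped_mode(lower_limit, width, f_m, f_before, f_after):
    """Mode of grouped data: Mo = O_mo + c(f_m - f_(m-1)) / (2f_m - f_(m-1) - f_(m+1))."""
    return lower_limit + width * (f_m - f_before) / (2 * f_m - f_before - f_after)

# Hypothetical modal interval [20, 30) with frequency 8,
# preceded by a class of frequency 6 and followed by one of frequency 4
mo_value = grouped_mode(20, 10, 8, 6, 4)
print(mo_value)
```

The result lies inside the modal interval, pulled toward whichever neighbouring class has the higher frequency.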
3. The Median
The median for ungrouped or raw data is the middle value of an ordered array of numbers. For
grouped data, solving for the median is considerably more complicated. The calculation of the median
for grouped data is done by using the following formula.
Median Formula:

Me = O_me + c(n/2 − f(<)) / f_me

Where:
O_me = lower limit of the median interval
c = class width
n = sample size (number of observations)
f_me = frequency count of the median interval
f(<) = cumulative frequency count of all intervals before the median interval
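The grouped-median formula can likewise be sketched in Python; the values below describe a hypothetical table of n = 20 observations:

```python
def grouped_median(lower_limit, width, n, f_me, cum_before):
    """Median of grouped data: Me = O_me + c(n/2 - f(<)) / f_me."""
    return lower_limit + width * (n / 2 - cum_before) / f_me

# Hypothetical table with n = 20: the median interval is [20, 30),
# its frequency is 8, and 8 observations lie in earlier intervals
me_value = grouped_median(20, 10, 20, 8, 8)
print(me_value)
```

The formula interpolates linearly within the median interval to locate the n/2-th observation.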
5.3.2 Non-central Location Measures
Non-central location measures identify the position of specific data values other than the centre of a data set, the most common being the quartiles, which divide an ordered data set into four equal quarters (Wegner, 2016). Two measures based on the quartiles are:
• Interquartile range
• Quartile deviation
5.3.2.1 Measures of Dispersion for Ungrouped data
Interquartile range
It is the difference between the upper quartile and the lower quartile, i.e. IQR = Q3 − Q1.

Quartile deviation
It is a measure of the spread of the data values about the median, i.e. QD = (Q3 − Q1) / 2.
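Assuming a small hypothetical sample, the quartile-based measures can be sketched with `statistics.quantiles`. Note that `method="inclusive"` interpolates between observations; textbook quartile conventions can differ slightly at the margins:

```python
from statistics import quantiles

# Hypothetical ordered sample
data = [12, 15, 17, 19, 21, 24, 28, 30, 35, 41]

# quantiles() with n=4 returns [Q1, Q2, Q3]
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

iqr = q3 - q1       # interquartile range: spread of the middle 50% of the data
qd = (q3 - q1) / 2  # quartile deviation: spread about the median
print(iqr, qd)
```

Because they ignore the extreme quarter at each end, both measures are far less sensitive to outliers than the range.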
5.3.3 Measures of Dispersion
These are statistical measures that quantify the spread of the data set about their central location
value. The main measures of dispersion include:
• Range
• Variance
• Standard deviation
• Coefficient of variation
• Coefficient of skewness
Range
For our student survey data, 1, 3, 2, 2, 5, 4, 3, 3, 4, 3, we could report a range of 5 − 1 = 4. Unfortunately, although the range is obviously a simple measure to compute and interpret, its ability to effectively measure data dispersion is fairly limited. The problem is that only two values in the data set, the smallest and the largest, are actively involved in the calculation. None of the other values in between has any influence at all. Consider also the data set consisting of the values 3, 7, 5, 2, 4, 5, 1000. The range would be 998, which would give a misleading sense of the dispersion involved, since all the values but one are clustered within 5 units of each other. The measures described next are intended to correct for this shortcoming (Black, 2013).
Variance
A variance is a measure of average squared deviation. It is calculated using all the data values in the
dataset.
For ungrouped data:

s² = ∑(xᵢ − x̄)² / (n − 1)

i.e. Variance = Sum of squared deviations / (Sample size − 1)

For grouped data:

s² = (∑fᵢxᵢ² − n x̄²) / (n − 1)
Standard deviation
It is the square root of the variance.
For ungrouped data:

s = √( ∑(xᵢ − x̄)² / (n − 1) )

For grouped data:

s = √( (∑fᵢxᵢ² − n x̄²) / (n − 1) )

Coefficient of variation
It is used to compare variability where data sets are given in different units. The coefficient of variation essentially is a relative comparison of a standard deviation to its mean. The coefficient of variation can be useful in comparing standard deviations that have been computed from data with different means.

Coefficient of variation: CV = (s / x̄) × 100
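The variance, standard deviation and coefficient of variation can be sketched for the student survey data mentioned earlier (the library `stdev` call is included only as a cross-check on the hand formula):

```python
from statistics import mean, stdev

# Student survey data from the range example above
data = [1, 3, 2, 2, 5, 4, 3, 3, 4, 3]

x_bar = mean(data)
# Sample variance: sum of squared deviations over (n - 1)
s2 = sum((x - x_bar) ** 2 for x in data) / (len(data) - 1)
s = s2 ** 0.5             # standard deviation: square root of the variance
cv = s / x_bar * 100      # coefficient of variation, as a percentage

print(s2, s, cv)
assert abs(s - stdev(data)) < 1e-12  # agrees with the library implementation
```

Unlike the range, every value in the data set contributes to the variance through its squared deviation from the mean.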
5.3.4 Measure of Skewness
Measures of shape are tools used to describe the shape of a distribution of data. This section examines skewness. Skewness is a measure of the shape of a uni-modal distribution of numeric data values. A distribution of data in which the right half is a mirror image of the left half is said to be symmetrical. Skewness occurs when a distribution is asymmetrical or lacks symmetry.
Figure 3.1 Relationships of Mean, Median and Mode
Source: Black (2013)
5.3.5 The Box Plot
The box-and-whisker plot is another way to describe a distribution of data. Sometimes referred to as a box plot, it is a depiction of the upper and lower quartiles together with the median and the two extreme values, used to show a distribution graphically. The median is enclosed by the box. The box extends outward from the median along a band to the lower and upper quartiles, encompassing not only the median but also the middle 50% of the data. From the lower and upper quartiles, lines referred to as whiskers are stretched out from the box toward the outermost data values. The box-and-whisker plot is determined from five specific numbers sometimes referred to as the five-number summary (Black, 2013).
Figure 3.2 Box-and-Whisker Plot
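The five numbers that determine a box plot can be computed in one pass; the data set is the same hypothetical sample used for the quartile sketch, and `method="inclusive"` is one of several quartile conventions:

```python
from statistics import quantiles

# Hypothetical sample; the five-number summary drives the box-and-whisker plot
data = [12, 15, 17, 19, 21, 24, 28, 30, 35, 41]

q1, med, q3 = quantiles(data, n=4, method="inclusive")
# (minimum, lower quartile, median, upper quartile, maximum)
five_number_summary = (min(data), q1, med, q3, max(data))
print(five_number_summary)
```

The box spans Q1 to Q3 with a line at the median; the whiskers run out toward the minimum and maximum.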
Think point
Most of the statistics presented here are drawn from studies or surveys. Let’s say a study of laundry usage is done in 50 South African households that have washers and dryers. Water measurements are taken for the amount of water used by each washing machine in completing a cycle. The data presented are the number of litres used by each washing machine during the washing cycle. Summarise the data so that the study findings can be reported.

1. Calculate the mean, mode, median and standard deviation.
Revision Questions
1. Select the appropriate central location measure (mean, median, mode) referred to in each of the following statements.
(a) A quarter of our lecturers have more than 10 years’ work experience.
(b) The wealthiest city in South Africa is Johannesburg.
(c) The average time taken by a runner to finish the 200 m race is 17 seconds.
2. For which of the following statements would the arithmetic mean be inappropriate as a measure of central location? (Give a reason.) State which measure of central location would be more appropriate, if necessary.
(a) The ages of children at a playschool
(b) The number of cars using a parking garage daily
(c) The brand of cereal preferred by consumers
(d) The value of transactions in a clothing store
5.3.6 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment.
The Self-Assessment for this unit is embedded within your Quantitative Techniques module in myClass. Head on to the quiz to see how you have fared with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next
section!
5.4 WEEK 4: BASIC PROBABILITY CONCEPTS

Purpose
The purpose of this week is to provide a brief overview of the basic concepts of probability to help a manager understand and use probabilities in decision making.

Learning Outcomes
By the end of this week, you will be able to:
• Understand the importance of probability in statistical analysis.
• Define the different types of probability.
• Describe the properties and concepts of probabilities.
• Apply the rules of probability to empirical data.
• Construct and interpret probabilities from joint probability tables.
• Understand the use of counting rules (permutations and combinations).

Time
It will take you 12 hours to make your way through this study week.
Reading
Wegner, T (2016). Applied business statistics methods and Excel-based applications.
Juta. Cape Town South Africa. 978-1-48511-193-1. Chapter 4
5.4.1 Introduction
Most business decisions are made under conditions of uncertainty. Probability theory provides the underpinning for quantifying and assessing uncertainty. It is used to estimate the reliability of making inferences from samples to populations, as well as to quantify the uncertainty of future occurrences.
5.4.2 Types of Probability
Uncertainty surrounds most aspects of the business situations. Frequently business people make
decisions based on chance. Probability theory provides a logical way of quantifying and evaluating
uncertainty.
Probability is the chance or possibility of a particular outcome out of a number of conceivable
outcomes occurring for a given event.
There are two types of probability:
• Subjective Probability
• Objective Probability
Subjective Probability
The subjective method of assigning probability is based on the feelings or insights of the person determining the probability. Subjective probability comes from the person's intuition or intellect. Although not a scientific approach to probability, the subjective method often is based on the accrual of knowledge, understanding, and experience stored and processed in the human mind. At times it is merely a supposition. At other times, subjective probability can yield accurate probabilities. Subjective probability can be used to exploit the background of experienced workers and managers in decision making. It is based on an educated guess, expert belief or value judgement. This type of probability cannot be confirmed statistically, hence it has limited use, e.g. the probability that it will rain in Cape Town tomorrow is 0.15 (Wegner, 2016).
Objective Probability
It is founded on empirical observations or theoretical properties of an object, e.g. the probability of getting a head after tossing a coin is 0.5. With this method, the probability of an event occurring is equal to the number of times the event has occurred in the past divided by the total number of opportunities for the event to have occurred.
Formula:

P(A) = r / n

Where:
A = event of a specific type
r = number of outcomes of event A
n = total number of all possible outcomes (sample space)
P(A) = probability of event A occurring
e.g. A container contains 3 red balls and 2 black balls. If a ball is picked at random from the bag, what
is the probability that it is: (i) red, (ii) black?
Solution:
(i) P(Red) = 3/5
(ii) P(Black) = 2/5
Experiment
An experiment is a procedure that produces outcomes. Examples of business-oriented experiments
with outcomes that can be statistically analysed might include the following.
• Interviewing 10 randomly selected consumers and asking them which brand of washing powder they prefer
• Sampling every 100th bottle of KOO beans from an assembly line and weighing the contents
• Testing new antibiotic drugs on samples of HIV patients and measuring the patients' improvement
• Auditing every 5th account to detect any errors
• Recording the S&P 500 index on the first Monday of every month for 5 years
Event
Since an event is an outcome of an experiment, the experiment expresses the possibilities of the
event. If the experiment is to sample five bottles coming off a production line, an event could be to
get one defective and four good bottles. In an experiment to roll a die, one event could be to roll an
even number and another event could be to roll a number greater than two.
5.4.3 Properties of a Probability
• The probability of an event A is the likelihood of the occurrence of that event. The probability of event A (denoted by P(A)) is a number between 0 and 1 inclusive (i.e. 0 ≤ P(A) ≤ 1).
• If P(A) = 0, then event A will certainly not occur.
• If P(A) = 1, then event A is certain to occur.
• The sum of the probabilities of all possible events (i.e. the collectively exhaustive set of events) equals one, i.e. P(A1) + P(A2) + P(A3) + … + P(Ak) = 1, for k possible events.
• If P(A) is the probability of event A occurring, then the probability of event A not occurring is defined as P(Ā) = 1 − P(A). This is called the complementary probability.
5.4.4 Basic Probability Concepts
Concept 1: Intersection of Two Events (A∩B)
The intersection of events A and B is the set of outcomes that belong to both A and B altogether. The
key word is “AND” Wegner (2016).
Concept 2: Union of Two Events (A∪B)
The union of events A and B is the set of outcomes that belong to either event A or B or both. The key
word is “OR” Wegner (2016).
Concept 3: Mutually Exclusive Events
Events are mutually exclusive if they cannot happen together on a single trial of a random experiment.
Two or more events are mutually exclusive events if the happening of one event precludes the
occurrence of the other event(s). This characteristic means that mutually exclusive events cannot
occur at the same time and therefore can have no intersection Wegner (2016).
NB* The probability of two mutually exclusive events taking place at the same time is zero.
Concept 4: Collectively Exhaustive Events
Events are collectively exhaustive when the union of all possible events is equal to the sample space.
i.e. at least one of the events is certain to occur in a randomly drawn object from the sample space
Wegner (2016).
Concept 5: Statistically Independent Events
Two events A and B are statistically independent if the happening of event A has no effect on the
outcome of event B and vice-versa. Two or more events are independent events if the occurrence or
non-occurrence of one of the events does not affect the occurrence or non-occurrence of the other
event(s). Certain experiments, such as rolling dice, yield independent events; each die is independent of the other. Whether a 6 is rolled on the first die has no effect on whether a 6 is rolled on the second die. Coin tosses are always independent of each other. The event of getting a head on the first toss of a coin is independent of getting a head on the second toss. It is generally believed that certain human characteristics are independent of other events (Wegner, 2016).
5.4.5 Calculating Objective Probabilities
Components of Objective Probabilities
Empirically derived objective probabilities can be classified into three categories:
• Marginal Probability
• Joint Probability
• Conditional Probability
Marginal Probability
A marginal probability is the probability of only a single event A occurring. i.e. the outcome of only
one random variable Wegner (2016).
Joint Probability
A joint probability is the probability of both event A and event B occurring simultaneously on a given
trial of a random experiment Wegner (2016).
Conditional Probability
A conditional probability is the probability of one event A occurring, given information about the
occurrence of a prior event Wegner (2016).
Figure 4.1 Marginal, Union, Joint and Conditional Probabilities
Source: Black (2013)
5.4.6 Probability Rules
There are basically two probability rules:
• Addition Rule
• Multiplication Rule
Addition Rule
For non-mutually exclusive events:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
For mutually exclusive events:
P(A ∪ B) = P(A) + P(B)

Multiplication Rule
For statistically dependent events:
P(A ∩ B) = P(A|B) × P(B)
For statistically independent events:
P(A ∩ B) = P(A) × P(B)
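The rules above can be sketched numerically. The probabilities below reuse the figures from revision question 4 later in this week (P(A) = 0.26, P(B) = 0.35, P(A and B) = 0.14):

```python
# Marginal and joint probabilities for two events A and B
p_a, p_b = 0.26, 0.35
p_a_and_b = 0.14

# Addition rule for non-mutually exclusive events
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability, rearranged from the multiplication rule:
# P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

# A and B are statistically independent only if P(A and B) = P(A) x P(B)
independent = abs(p_a_and_b - p_a * p_b) < 1e-12

print(p_a_or_b, p_a_given_b, independent)
```

Here P(A and B) = 0.14 differs from P(A) × P(B) = 0.091, so the two events are dependent and the general forms of both rules must be used.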
5.4.7 Probability Trees
A probability tree is a graphical way to apply probability rules where there are multiple events that
happen in sequence and these events can be represented by branches (similar to a tree).
See example on page 114 of the prescribed textbook.
5.4.8 Permutations and Combinations
Most probability questions involve counting large numbers of favourable event outcomes (r) and the total number of possible outcomes (n). Counting rules assist in finding the values of r and n.

Multiplication Rule of Counting

Factorial Notation
It is used to find the total number of different ways in which the n objects of a single event can be arranged (ordered).
n! = n factorial = n(n − 1)(n − 2)(n − 3) … 3 × 2 × 1
Permutations
A permutation is the number of distinct ways of arranging a subset of r objects selected from a group
of n objects where the order is important. Each possible arrangement (ordering) is called a
permutation.
Formula
nPr = n! / (n − r)!

Combinations Rule
A combination is the number of different ways of selecting a subset of r objects from a group of n objects where the order is not important. Each separate grouping is called a combination.

Formula:

nCr = n! / (r!(n − r)!)
Revision Questions
1. If an event has a probability equal to 0.2, what does this mean?
2. What term is used to describe two events that cannot occur simultaneously in a single trial of a random experiment?
3. What is meant when two events are said to be ‘statistically independent’?
4. If P(A) = 0.26, P(B) = 0.35 and P(A and B) = 0.14, what is the value of P(A or B)?
5. If P(X) = 0.54, P(Y) = 0.36 and P(X and Y) = 0.27, what is the value of P(X|Y)? Is it the same as P(Y|X)?
6. Economic sectors
In a survey of companies, it was found that 45 were in the mining sector, 72 were in the financial sector, 32 were in the IT sector and 101 were in the production sector.
a) Show the data as a percentage frequency table.
b) What is the probability that a randomly selected company is in the financial sector?
c) If a company is selected at random, what is the probability that this company is not in the production sector?
d) What is the likelihood that a randomly selected company is either a mining company or an IT company?
e) Name the probability types or rules used in questions b, c and d.
5.4.9 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment.
The Self-Assessment for this unit is embedded within your Quantitative Techniques module in myClass. Head on to the quiz to see how you have fared with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next
section!
5.5 WEEK 5: PROBABILITY DISTRIBUTIONS

Purpose
The purpose of this week is to introduce a few important probability distributions that occur most often in management situations, and to describe patterns of outcomes for both discrete and continuous events.

Learning Outcomes
By the end of this week, you will be able to:
• Understand the concept of a probability distribution.
• Describe three common probability distributions used in management practice.
• Identify applications of each probability distribution in management.
• Calculate and interpret probabilities associated with each of these distributions.

Time
It will take you 12 hours to make your way through this study week.
Reading
Wegner, T (2016). Applied business statistics methods and Excel-based applications. Juta.
Cape Town South Africa. 978-1-48511-193-1. Chapter 5
5.5.1 Introduction
This week introduces probability distributions. Probabilities can also be derived using mathematical functions known as probability distributions. Probability distributions quantify the uncertain behaviour of many random variables in business practice, and they can define patterns of outcomes for both discrete and continuous events.
5.5.2 Types of Probability Distribution
A probability distribution is a list of all the conceivable outcomes of a random variable and their
associated probabilities of occurrence. Probability distributions can be classified into two groups:
• Discrete Probability Distributions
• Continuous Probability Distributions
5.5.3 Discrete Probability Distributions
These are used to model random variables that take whole number values only. i.e. specific values.
e.g. 0, 1, 2, 3, 4, etc. A random variable is a discrete random variable if the set of all possible values is
at most a finite or a countably infinite number of possible values. In most statistical situations, discrete
random variables produce values that are nonnegative whole numbers.
The two common discrete probability distributions are:
• The Binomial Probability Distribution
• The Poisson Probability Distribution
5.5.4 Binomial Probability Distribution
The word binomial indicates any single trial of a binomial experiment consists of only two possible
outcomes. These two outcomes are categorized as success or failure. Usually, the outcome of interest
to the analyst is labelled a success.
Example
1. If a quality analyst is looking for defective products, he would consider finding a defective product a success even though the company would not consider a defective product a success.
2. If analysts are studying HIV patients, the outcome of getting an HIV person in a trial of an
experiment is a success.
The other possible outcome of a trial in a binomial experiment is called a failure. The word failure is
used only in opposition to success.
Characteristics:
• There are only two, mutually exclusive and collectively exhaustive, outcomes of the random variable: success and failure.
• Each outcome has an associated probability: probability of success = p and probability of failure = q, with p + q = 1 (always).
• The random variable is observed over n trials. Each trial generates either a success or a failure.
• The n trials are independent of each other, i.e. p and q are constant.
The Binomial question: What is the probability that r successes will occur in n trials of the process
under study?
The Binomial formula:
๐‘ƒ(๐‘ฅ) = n.๐ถx ๐‘ ๐‘ฅ (1 − ๐‘)๐‘›−๐‘ฅ
๐‘“๐‘œ๐‘Ÿ ๐‘ฅ =0, 1, 2, 3, … n
Where n = sample size
r = the number of successes in the n independent trials
p = probability of a success outcome
q = probability of a failure outcome
Descriptive Statistical Measures of the Binomial Distribution
A measure of central location and a measure of dispersion can be calculated for any random variable
that follows a binomial distribution using the following formulae:
Mean: μ = np
Standard deviation: σ = √(np(1 − p))
How to select p
The success outcome is always associated with the probability p. The outcome that must be labelled as the success outcome is identified from the binomial question.
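The binomial formula and its descriptive measures can be sketched as follows; the coin-toss example (n = 4, p = 0.5) is hypothetical and chosen so the answer is easy to verify by hand:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical check: probability of exactly 2 heads in 4 fair coin tosses
n, p = 4, 0.5
p_two = binomial_pmf(2, n, p)
print(p_two)  # 6/16 = 0.375

# Descriptive measures of the distribution
mu = n * p                      # mean
sigma = (n * p * (1 - p)) ** 0.5  # standard deviation
```

Cumulative questions ("fewer than", "at most") are answered by summing `binomial_pmf` over the relevant values of x.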
Useful Pointers on Calculating Probabilities
• Key words such as at least, no more than, at most, no less than, smaller than, larger than, greater than and no greater than always imply cumulative probabilities (i.e. the summing of individual marginal probabilities).
• The complementary rule should be considered whenever practical to reduce the number of probability calculations.
Revision question
One study by CNNMoney reported that 60% of workers have less
than $25,000 in total savings and investments (excluding the value
of their home). If this is true and if a random sample of 20 workers is
selected, what is the probability that fewer than 10 have less than
$25,000 in total savings and investments?
5.5.5 Poisson Probability Distribution
The Poisson distribution describes the occurrence of infrequent events. In fact, the Poisson formula has been referred to as the law of improbable events.
Example
1. Serious accidents at a chemical plant are rare, and the number per month might be described by
the Poisson distribution.
2. The number of random customer arrivals per five-minute interval at a small boutique on weekday
mornings.
The Poisson distribution is often used to describe the number of random arrivals per time interval. If the number of arrivals per interval is too frequent, the time interval can be reduced
enough so that a rare number of occurrences is expected. In the field of management science, models
used in queuing theory are usually based on the assumption that the Poisson distribution is the proper
distribution to describe random arrival rates over a period of time. In statistical quality control, the
Poisson distribution is the basis for the c control chart used to track the number of non-conformances
per item or unit.
Characteristics:
• It measures the number of occurrences of a particular event of a discrete random variable.
• There is a pre-determined time, space or volume interval.
• The average number of occurrences of the event is known or can be determined.
The Poisson question: What is the probability of x occurrences of a given event being observed in a
predetermined time, space or volume interval?
The Poisson formula:

P(x) = (e^(−λ) λ^x) / x!   for x = 0, 1, 2, 3, …

Where:
λ = the mean number of occurrences of a given event of the random variable for a predetermined time, space or volume interval
e = a mathematical constant (the base of natural logarithms, ≈ 2.71828)
x = the number of occurrences for which the probability is required
Descriptive Statistical Measures of the Poisson distribution
A measure of central location and a measure of dispersion can be calculated for any random variable
that follows a Poisson process using the following formulae:
๐‘€๐‘’๐‘Ž๐‘›: ๐œ‡ = ๐œ†
Standard deviation: ๐œŽ = √๐œ†
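The Poisson formula can be sketched the same way; the arrival rate λ = 2 per five-minute interval is a hypothetical value echoing the boutique example above:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

# Hypothetical example: on average 2 customer arrivals per five-minute interval
lam = 2
p_zero = poisson_pmf(0, lam)  # probability of no arrivals in an interval
# Cumulative probability of at most 2 arrivals: sum over x = 0, 1, 2
p_at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))
print(p_zero, p_at_most_2)
```

As with the binomial distribution, cumulative questions are handled by summing the individual marginal probabilities.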
5.5.6 Continuous Probability Distribution
These are used to model random variables that can take both fractional and whole-number values, i.e. intervals of x-values. Continuous random variables take on values at every point over a given interval; thus they have no gaps or unassumed values. It could be said that continuous random variables are generated from experiments in which things are "measured", not "counted".
Example:
If a worker is assembling a product component, the time it takes to accomplish that task could be any value within a reasonable range, such as 3 minutes 36.4218 seconds or 5 minutes 17.5169 seconds.
A list of measures for which continuous random variables might be generated would include time,
height, weight, and volume.
The following are examples of experiments that could produce continuous random variables:
1. Sampling the volume of liquid nitrogen in a storage tank
2. Determining the time between customer arrivals at a retail outlet
3. Determining the lengths of newly designed automobiles
4. Determining the weight of grain in a grain elevator at different points of time
NB: The most important continuous probability distribution is the normal distribution.
5.5.7 Normal Probability Distribution
Probably the most extensively known and used of all distributions is the normal distribution. It fits
many human characteristics, such as height, weight, length, speed, IQ, scholastic achievement, and
years of life expectancy, and many others. Like their human counterparts, living things in nature, such
as trees, animals, insects, and others, have many characteristics that are normally distributed. Many
variables in business and industry also are normally distributed.
Examples
1. The annual cost of household insurance.
2. The cost per square foot of renting warehouse space.
3. Managers' satisfaction with support from ownership on a five-point scale.
In addition, most items produced or filled by machines are normally distributed.
Characteristics:
• It is a smooth, bell-shaped curve.
• It is symmetrical about the central mean value.
• The tails of the curve are asymptotic.
• The distribution is always described by two parameters: the mean and the standard deviation.
• The total area under the curve will always equal one.
• The probability associated with a particular range of x-values is described by the area under the curve between the limits of the given x range (x₁ < x < x₂).
Finding Probabilities using the normal distribution
Special statistical tables are used to obtain probabilities for a range of values of x.
5.5.8 Standard Normal (z) Probability Distribution
The normal distribution is described by two parameters, the mean, μ, and the standard deviation, σ. That is, every unique pair of values of μ and σ defines a different normal distribution. This characteristic of the normal curve (a family of curves) could make analysis by the normal distribution tedious, because volumes of normal curve tables, one for each different combination of μ and σ, would be required. Fortunately, a method was established by which all normal distributions can be transformed into a single distribution: the z distribution. This process produces the standardised normal distribution (or curve). The conversion formula for any x value of a given normal distribution is as follows.
๐’›=
๐’™−๐
๐ˆ
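The transformation, and the table lookup that follows it, can be done with Python's standard library. This is a minimal sketch; the values x = 600, μ = 494 and σ = 100 are assumed illustrative figures, not taken from the text.

```python
from statistics import NormalDist

# Assumed illustrative values: x = 600, mu = 494, sigma = 100.
x, mu, sigma = 600, 494, 100
z = (x - mu) / sigma                # standardise: z = (x - mu) / sigma
p_below = NormalDist().cdf(z)       # area under the standard normal curve left of z
print(round(z, 2), round(p_below, 4))
```

`NormalDist().cdf(z)` returns the same cumulative area that the printed standard normal tables give, so P(x < 600) = P(z < 1.06) here.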
Revision Questions
1. Name two regularly used discrete probability distributions.
2. Specify whether each of the following random variables is discrete or continuous:
a) The mass of cans coming off a production line
b) The number of employees in a company
c) The number of households in Gauteng that have solar heating panels
d) The distance travelled daily by a courier service truck.
3. Use the binomial formula to find each of the following probabilities:
(i) n = 7, p = 0.2 and x = 3
(ii) n = 10, p = 0.2 and x = 4
(iii) n = 12, p = 0.3 and x ≤ 4
(iv) n = 10, p = 0.05 and x = 2 or 3
(v) n = 8, p = 0.25 and x ≥ 3
4. Once a week a merchandiser restocks a particular product brand in six stores for which she is responsible. Experience has shown that there is a one-in-five chance that a given store will run out of stock before the merchandiser's weekly visit.
a) Which probability distribution is appropriate in this problem? Why?
b) What is the probability that, on a given weekly round, the merchandiser will find exactly one store out of stock?
c) What is the probability that, at most, two stores will be out of stock?
d) What is the probability that no stores will be out of stock?
e) What is the mean number of stores out of stock each week?
Note: Calculate the probabilities in (b)–(d) using the binomial formula and then the Excel function BINOMDIST.
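As a check on the revision exercises above, the binomial formula can be evaluated directly. This is a minimal sketch; `binomial_pmf` is a hypothetical helper name, and the two calls mirror question 3(i) and question 4(b).

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(x) = nCx * p^x * (1-p)^(n-x), the binomial formula."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Question 3(i): n = 7, p = 0.2, x = 3
q3_i = binomial_pmf(3, 7, 0.2)    # -> 0.1147 to 4 d.p.
# Question 4(b): six stores, a one-in-five chance each -> n = 6, p = 0.2, x = 1
q4_b = binomial_pmf(1, 6, 0.2)    # -> 0.3932 to 4 d.p.
print(round(q3_i, 4), round(q4_b, 4))
```

Each call is equivalent to Excel's BINOMDIST(x, n, p, FALSE), which returns the probability of exactly x successes.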
5.5.9 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment. The Self-Assessment for this unit is embedded within your Quantitative Techniques module in myClass. Head on to the quiz to see how you have fared with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next section!
5.6 WEEK 6: CONFIDENCE INTERVAL ESTIMATION
Purpose
The purpose of this week is to explain the process of confidence interval estimation.
Learning Outcomes
• Understand and explain the concept of a confidence interval
• Calculate a confidence interval for a population mean and a population proportion
• Interpret a confidence interval in a management context
• Identify factors that affect the precision and reliability of confidence intervals
• Determine sample sizes for desired levels of statistical precision.
Time
It will take you 15 hours to make your way through this study week.
Reading
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based Applications. Juta: Cape Town, South Africa. ISBN 978-1-48511-193-1. Chapter 7.
5.6.1 Introduction
The role of inferential statistics is to use sample evidence to estimate population parameters. An important and reliable procedure for estimating a population measure is to use the sample statistic as a reference point and to construct an interval of values around it that is likely to cover the true population parameter with a stated level of confidence. This procedure is called confidence interval estimation.
5.6.2 Point Estimation
A point estimate is a statistic drawn from a sample and used to estimate a population measure; it is only as good as the sample it represents. If other random samples are drawn from the same population, the point estimates derived from those samples will vary. Because of this variation in the sample statistic, estimating a population parameter with an interval estimate is preferable to using a point estimate alone. The point estimate is defined as the value of a single sample statistic used to represent the true, but unknown, value of a population parameter. For instance, the sample mean is used to estimate the population mean, and the sample proportion is used to estimate the population proportion (Wegner, 2016).
5.6.3 Confidence Interval Estimation
An interval estimate (confidence interval) is a range of values constructed around the value of a sample statistic, within which the population parameter is expected to lie with a certain level of confidence. A confidence interval has upper and lower bounds between which the analyst may declare, with some confidence, that the population parameter is thought to lie. Interval estimates may be two-sided or one-sided. Due to the central limit theorem, the following z formula for the
sample mean can be used when the population standard deviation is known, regardless of the shape of the population, provided the sample size is large; it can also be used for small sample sizes if the population is normally distributed:

z = (x̄ − μ) / (σ/√n)

Rearranging this formula algebraically to solve for μ gives

μ = x̄ − z(σ/√n)

Because a sample mean can be greater than or less than the population mean, z can be positive or negative. Thus the preceding expression takes the following form:

μ = x̄ ± z(σ/√n)

When this expression is rewritten, it yields the confidence interval formula for estimating μ with large sample sizes when the population standard deviation is known.

100(1 − α)% confidence interval to estimate μ, when σ is known:

x̄ ± z(α/2) · σ/√n
Alpha (α) is the area under the normal curve in the tails of the distribution, outside the area defined by the confidence interval. Here we use α to locate the z value when constructing the confidence interval. For instance, to build a 95% confidence interval, the level of confidence is 95%, or 0.95. If 100 such intervals are constructed by taking random samples from the population, it is likely that 95 of the intervals would include the population mean and 5 would not. As the level of confidence is increased, the interval gets wider, provided the sample size and standard deviation remain constant.
For 95% confidence, α = 0.05 and α/2 = 0.025. The value of z(α/2), or z(0.025), is found by looking in the standard normal table under 0.5000 − 0.0250 = 0.4750. This area in the table is associated with a z value of 1.96. Another way to locate the table z value: because the distribution is symmetric and the intervals are equal on each side of the population mean, ½(95%), or 0.4750, of the area lies on each side of the mean. This yields a z value of 1.96 for this portion of the normal curve. Thus the z value for a 95% confidence interval is always 1.96. In other words, of all the possible values along the horizontal axis, 95% of them should be within a z score of 1.96 of the population mean.
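The interval x̄ ± z(α/2)·σ/√n can be computed directly with the standard library. This is a sketch; the figures x̄ = 85, σ = 8 and n = 64 are assumed illustrative values.

```python
from math import sqrt
from statistics import NormalDist

def ci_mean_known_sigma(xbar: float, sigma: float, n: int, conf: float = 0.95):
    """Return the (lower, upper) 100*conf% confidence interval for mu, sigma known."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. 1.96 when conf = 0.95
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Assumed illustrative values: x-bar = 85, sigma = 8, n = 64, 95% confidence.
lower, upper = ci_mean_known_sigma(85, 8, 64)
print(round(lower, 2), round(upper, 2))
```

`inv_cdf(0.975)` recovers the familiar z = 1.96 rather than reading it from a table; changing `conf` to 0.90 or 0.99 gives the other common intervals.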
Figure 7.1 Z Scores for Confidence Intervals in Relation to α
Source: Black (2013)
Figure 7.2 Distribution of Sample Means for 95% Confidence
Source: Black (2013)
Think Point
A survey was taken of South African companies that do business with firms in Nigeria. One of the questions on the survey was: Approximately how many years has your company been trading with firms in Nigeria? A random sample of 44 responses to this question yielded a mean of 10.455 years. Suppose the population standard deviation for this question is 7.7 years. Using this information, construct a 90% confidence interval for the mean number of years trading in Nigeria for the population of South African companies that trade with firms in Nigeria.
5.6.4 Confidence Interval for a Single Population Mean: Population Standard Deviation Known, n Large (n > 30)
Confidence interval = x̄ ± z(α/2) · σ/√n
Where:
x̄ = sample mean
z = value from the standard normal tables
σ = population standard deviation
n = sample size
5.6.5 The Precision of a Confidence Interval
The width of a confidence interval is a measure of its precision: the narrower the confidence interval, the more precise the interval estimate, and vice versa.
The width of the confidence interval is influenced by:
• the specified confidence level
• the sample size
• the population standard deviation
The most commonly used confidence levels are shown in the table below.

Confidence Level    z-limits
90%                 ±1.645
95%                 ±1.96
99%                 ±2.58

5.6.6 The Student t-distribution
In the formulas and problems discussed so far in this unit, the sample size was assumed to be large (n ≥ 30). In the business world, however, sample sizes are often small. While the central limit theorem applies only when the sample size is large, the distribution of sample means is approximately normal even for small samples if the population is normally distributed. Thus, if it is known that the population
from which the sample is being drawn is normally distributed and if σ is known, the z formulas presented previously can still be used to estimate a population mean even if the sample size is small (n < 30).
Example
Suppose a South African car rental firm wants to estimate the average number of kilometres travelled per day by each of its cars rented in Cape Town. A random sample of 20 cars rented in Cape Town reveals that the sample mean travel distance per day is 85.5 km, with a population standard deviation of 19.3 km. Compute a 99% confidence interval to estimate μ.
Here, n = 20, x̄ = 85.5, and σ = 19.3. For a 99% level of confidence, a z value of 2.575 is obtained. Assume that the number of kilometres travelled per day is normally distributed in the population. The confidence interval is

85.5 ± 2.575 × (19.3/√20) = 85.5 ± 11.1, i.e. 74.4 ≤ μ ≤ 96.6

The point estimate indicates that the average number of kilometres travelled per day by a rental car in Cape Town is 85.5, with a margin of error of 11.1 km. With 99% confidence, we estimate that the population mean is somewhere between 74.4 and 96.6 km per day.
5.6.7 Confidence Interval for a Single Population Mean (μ) when the Population Standard Deviation (σ) is Unknown
The t distribution is used instead of the z distribution for performing inferential statistics on the population mean when the population standard deviation is unknown and the population is normally distributed. The formula for the t statistic is:

t = (x̄ − μ) / (s/√n)

This formula is essentially the same as the z formula, but the distribution table values are different.
Confidence interval to estimate μ (population standard deviation unknown, population normally distributed):

x̄ ± t(α/2, n−1) · s/√n
Example
In the aerospace industry, some companies allow their employees to accumulate extra working hours beyond their 40-hour week. These extra hours are sometimes referred to as green time or comp time. Many managers work longer than the eight-hour workday preparing proposals, overseeing crucial tasks, and taking care of paperwork. Recognition of such overtime is important. Most managers are usually not paid extra for this work, but a record is kept of this time and occasionally the manager is allowed to use some of this comp time as extra leave or vacation time. Suppose a researcher wants to estimate the average amount of comp time accumulated per week by managers in the aerospace industry. He randomly samples 18 managers and measures the amount of extra time they work during a specific week, obtaining the results shown (in hours).
He constructs a 90% confidence interval to estimate the average amount of extra time per week worked by a manager in the aerospace industry, assuming that comp time is normally distributed in the population.
The sample size is 18, so df = 17. A 90% level of confidence results in α/2 = 0.05 area in each tail. The table t value is

t(0.05, 17) = 1.740

The subscripts in the t value denote to other researchers the area in the right tail of the t distribution (for confidence intervals, α/2) and the number of degrees of freedom. The sample mean is 13.56 hours, and the sample standard deviation is 7.80 hours. The confidence interval is computed from this information as

13.56 ± 1.740 × (7.80/√18) = 13.56 ± 3.20, i.e. 10.36 ≤ μ ≤ 16.76

The point estimate for this problem is 13.56 hours, with a margin of error of ±3.20 hours. The researcher is 90% confident that the average amount of comp time accumulated by a manager per week in this industry is between 10.36 and 16.76 hours. From these figures, aerospace managers could attempt to build a reward system for such extra work, or evaluate the regular 40-hour week to determine how to use the normal work hours more effectively and thus reduce comp time.
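The comp-time interval can be reproduced in a few lines. This is a sketch; the critical value t(0.05, 17) = 1.740 is read from a t table as in the text, since Python's standard library does not provide the t distribution.

```python
from math import sqrt

# Figures from the comp-time example: n = 18, x-bar = 13.56 h, s = 7.80 h.
xbar, s, n = 13.56, 7.80, 18
t_crit = 1.740                     # t(0.05, 17), read from a t table
margin = t_crit * s / sqrt(n)      # approximately 3.20 hours
lower, upper = xbar - margin, xbar + margin
print(round(lower, 2), round(upper, 2))   # -> 10.36 16.76
```

Only the critical value changes relative to the z-based interval; the structure x̄ ± (critical value) × (standard error) is the same.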
5.6.8 Confidence Interval for the Population Proportion (π)
Business decision makers and researchers often need to be able to estimate a population proportion.
Example
What proportion of the market does our company control (market
share)? What proportion of our products is defective? What
proportion of customers will call customer service with complaints?
What proportion of our customers is in the 20-to-30 age group?
What proportion of our workers speaks Xhosa as a first language?
Techniques similar to those previously discussed can be used to
estimate the population proportion.
The central limit theorem for sample proportions leads to the following formulae.
The standard error of the sample proportion p is calculated using:

σp ≈ √(p(1 − p)/n)

Thus the confidence interval for a single population proportion, π, is given by:

p − z√(p(1 − p)/n) ≤ π ≤ p + z√(p(1 − p)/n)
Example
A study of 87 randomly selected companies with a telemarketing operation revealed that 39% of the sampled companies used telemarketing to assist them in order processing. Using this information, how could an analyst estimate the population proportion of telemarketing companies that use their telemarketing operation to assist them in order processing?
For n = 87 and p = 0.39, a 95% confidence interval can be computed to determine the interval estimate of π. The z value for 95% confidence is 1.96. The confidence interval estimate is computed as follows:

0.39 − 1.96√(0.39 × 0.61/87) ≤ π ≤ 0.39 + 1.96√(0.39 × 0.61/87)
0.39 − 0.10 ≤ π ≤ 0.39 + 0.10
0.29 ≤ π ≤ 0.49
This interval suggests that the population proportion of telemarketing firms that use their operation to assist order processing is somewhere between 0.29 and 0.49, based on the point estimate of 0.39 with a margin of error of ±0.10. This result has a 95% level of confidence.
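The telemarketing interval can be verified directly, using the figures from the example above.

```python
from math import sqrt

# From the example: n = 87 sampled companies, sample proportion p = 0.39, z = 1.96.
n, p, z = 87, 0.39, 1.96
std_err = sqrt(p * (1 - p) / n)          # standard error of the sample proportion
lower, upper = p - z * std_err, p + z * std_err
print(round(lower, 2), round(upper, 2))  # -> 0.29 0.49
```

The margin of error, z × standard error, works out to roughly 0.10, matching the ±0.10 quoted in the text.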
Revision Questions
1) What is the aim of a confidence interval?
2) If x̄ = 85, σ = 8 and n = 64, set up a 95% confidence interval estimate of the population mean, μ.
3) If the population standard deviation, σ, is not known, what standardised statistic is used to construct a confidence interval?
4) If x̄ = 54, s = 6 and n = 25, set up a 90% confidence interval estimate of the population mean, μ.
5) The Department of Trade and Industry (DTI) conducted a survey to estimate the average number of employees per small and medium-sized enterprise (SME) in Gauteng. A random sample of 144 SMEs in Gauteng found that the average number was 24.4 employees. Assume that the population standard deviation is 8.
5.6.9 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment. The Self-Assessment for this unit is embedded within your Quantitative Techniques module in myClass. Head on to the quiz to see how you have fared with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next section!
5.7 WEEK 7: HYPOTHESES TESTS – SINGLE POPULATION (PROPORTIONS & MEANS)
Purpose
The purpose of this unit is to explain the process of testing the validity of a manager's claim using sample evidence. This unit covers hypothesis testing for a single population mean and a single population proportion only.
Learning Outcomes
By the end of this unit, you will be able to:
• Understand the concept of hypothesis testing
• Perform hypothesis tests for a single population mean
• Perform hypothesis tests for a single population proportion
• Distinguish when to use the z-test statistic or the t-test statistic
• Correctly interpret the results of a hypothesis test
• Correctly translate the statistical results into management conclusions.
Time
It will take you 12 hours to make your way through this study week.
Reading
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based Applications. Juta: Cape Town, South Africa. ISBN 978-1-48511-193-1. Chapter 8.
5.7.1 Introduction
In this unit, we focus on another branch of inferential statistics, in which a claim made about the true value of a population parameter is assessed for validity. Hypothesis testing is the statistical process used to test the validity of such claims using sample evidence.
5.7.2 The Process of Hypothesis Testing
Hypothesis testing is a rigorous statistical process for testing the closeness of a sample statistic to a hypothesised population parameter.
General procedure:
Step 1: Formulate the statistical hypotheses (null and alternative)
Step 2: Compute the sample test statistic
Step 3: Determine the rejection criteria
Step 4: Compare the sample test statistic to the rejection criteria and reach a decision
Step 5: Draw statistical and management conclusions
5.7.3 Hypothesis Test for a Single Population Mean (μ) – Population Standard Deviation (σ) is Known
The most basic hypothesis test is a test about a population mean. A business analyst might be interested in testing whether an established or accepted mean value for an industry is still true, or in testing a hypothesised mean value for a new theory.
Example
1. A company dealing in computer products sets up a telephone service to assist customers by providing technical support. The average wait time during weekday hours was 35 minutes. However, after more technical consultants were added to the system, managers believe that waiting time decreased, and they wish to prove it.
2. A boutique investment firm wishes to test whether the average hourly change in the JSE average over a 5-year period is +0.25.
3. A manufacturing company wishes to test whether the average thickness of a plastic bottle is 2.2 millimetres.
4. A retail store wants to test whether the average age of its customers is less than 42 years.
The formula below can be used to test hypotheses about a single population mean when σ is known: for any population if the sample size is large (n ≥ 30), and for small samples (n < 30) if x is known to be normally distributed in the population.
z test for a single mean:

z-stat = (x̄ − μ) / (σ/√n)
5.7.4 Hypothesis Test for a Single Population Mean (μ) – Population Standard Deviation (σ) is Unknown
Most of the time, when a business analyst gathers data to test hypotheses about a single population mean, the value of the population standard deviation is unknown, and the analyst must use the sample standard deviation as an estimate of it. In such cases, the z test cannot be used. In the previous study unit, the t distribution was presented, which can be used to analyse hypotheses about a single population mean when σ is unknown, provided the population is normally distributed for the
measurement being studied. In this part of the unit, the t-test for a single population mean is discussed. The t-test is applicable whenever the analyst draws a single random sample to test the value of a population mean (μ), the population standard deviation is unknown, and the population is normally distributed for the measurement of interest.
The formula for testing such hypotheses follows:

t-stat = (x̄ − μ) / (s/√n)
5.7.5 Hypothesis Test for a Single Population Proportion (π)
Data used in decision making often involve proportions describing aspects such as consumer makeup, quality defects, market share, on-time delivery rates, profitable stocks, etc. Business surveys often produce information expressed in proportion form, such as 0.35 of all businesses offering flexible hours to their employees, or 0.78 of all businesses having social networks for customers. The business analyst can conduct hypothesis tests about such proportions to determine whether they have changed in some way.
Example
1. Suppose a company has held a 36% (0.36) share of the market for several years. As a result of a massive marketing effort and improved product quality, company officials believe that the market share has increased, and they want to prove it.
2. A market research analyst wishes to test whether the proportion of used-car purchasers who are female has increased.
3. A financial analyst wants to test whether the proportion of companies that were profitable last year in the average investment officer's portfolio is 0.50.
4. A quality assurance manager for a large manufacturing firm wishes to test whether the proportion of defective items in a batch is less than 0.04.
The formula below makes possible the testing of hypotheses about the population proportion in a manner similar to that of the formula used to test sample means:

z-stat = (p − π) / √(π(1 − π)/n)
5.7.6 The p-value Approach to Hypothesis Testing
The p-value method is another way to reach a statistical conclusion in hypothesis testing problems. The p-value technique tests hypotheses without a preset level or critical value of the test
statistic. Decisions to reject or fail to reject the null hypothesis are made using the p-value, which is the probability of getting a test statistic at least as extreme as the observed test statistic, computed under the assumption that the null hypothesis is true. The p-value is sometimes referred to as the observed significance level. The p-value technique has grown in importance with the increasing use of statistical computer packages to test hypotheses; most packages report a p-value for every analysis. The p-value defines the smallest value of alpha (α) for which the null hypothesis can be rejected.
Example
Suppose the p-value of a test is 0.038. The null hypothesis cannot be rejected at α = 0.01, because 0.038 is the smallest value of alpha for which the null hypothesis can be rejected, and it is larger than 0.01. However, the null hypothesis can be rejected at α = 0.05, because the p-value of 0.038 is smaller than 0.05.
Manually solving for a p-value
Suppose an analyst is conducting a one-tailed test with a rejection region in the upper tail, and the
analyst obtains an observed test statistic of z = 2.04 from the sample data. Using the standard normal
table, we find that the probability of randomly obtaining a z value this great or greater by chance is
0.5000− 0.4793 = 0.0207. Thus, the p-value for this problem is 0.0207. Using this information, the
analyst would reject the null hypothesis for α = 0.05 or 0.10 or any value larger than 0.0207. The
analyst would not reject the null hypothesis for any alpha value less than 0.0207 (in particular, α =
0.01, 0.001, etc.). When conducting two-tailed tests, remember that alpha is split to determine the
critical value of the test statistic. For the two-tailed test, the p-value can be compared to α/2 to reach
a statistical conclusion. If the p-value is less than α/2, the decision would be to reject the null
hypothesis.
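The manual table lookup above can be checked with the standard library, using the observed z = 2.04 from the example.

```python
from statistics import NormalDist

z_obs = 2.04                               # observed test statistic from the example
p_value = 1 - NormalDist().cdf(z_obs)      # upper-tail area beyond z = 2.04
print(round(p_value, 4))                   # -> 0.0207, matching the table method
```

This reproduces 0.5000 − 0.4793 = 0.0207 without the table, since the cdf already gives the full area to the left of z.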
Revision Questions
1. What is meant by the term ‘hypothesis testing’?
2. What determines whether a claim about a population parameter value is accepted as probably
true or rejected as probably false?
3. Name the five steps of hypothesis testing.
4. What information is required to determine the critical limits for the region of acceptance of a null
hypothesis?
5. If −1.96 ≤ z ≤ 1.96 defines the limits for the region of acceptance of a two-tailed hypothesis test
and z-stat = 2.44, what statistical conclusion can be drawn from these findings?
5.7.7 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment. The Self-Assessment for this unit is embedded within your Quantitative Techniques module in myClass. Head on to the quiz to see how you have fared with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next section!
5.8 WEEK 8: SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS
Purpose
The purpose of this unit is to explain the technique used to quantify the relationship between variables, identify the strength of that relationship, and point out the significant variables in the prediction.
Learning Outcomes
By the end of this unit, you will be able to:
• Explain the meaning of regression analysis
• Identify practical examples where regression analysis can be used
• Construct a simple linear regression model
• Use the regression line for prediction purposes
• Calculate and interpret the correlation coefficient
• Calculate and interpret the coefficient of determination
Time
It will take you 12 hours to make your way through this study week.
Reading
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based Applications. Juta: Cape Town, South Africa. ISBN 978-1-48511-193-1. Chapter 12.
5.8.1 Introduction
Business decisions are often made by predicting the unknown values of numeric variables using other numeric variables that may be related to them and whose values are known. A statistical method that quantifies the relationship between a single response variable and one or more predictor variables is called regression analysis. This relationship, referred to as a statistical model, is used for prediction purposes. Correlation analysis, on the other hand, determines the strength of the relationship and identifies which variables are useful in predicting the response variable.
5.8.2 Simple Linear Regression
Regression analysis is the process of developing a mathematical model or function that can be used to predict or determine one variable from another variable or variables. The simplest regression model, called simple regression or bivariate regression, involves two variables, in which one variable is predicted by just one other variable. In simple linear regression, the variable to be predicted
is referred to as the dependent variable (y). The predictor is referred to as the independent variable, or explanatory variable (x). In simple linear regression analysis, only a straight-line relationship between two variables is assessed. Nonlinear relationships, and regression models with more than one independent variable, can be explored using multiple regression models, which are beyond the scope of this module.
Independent variable (x): influences the outcome of the other variable
Dependent variable (y): influenced by the independent variable
5.8.3 Scatter Plot
Usually, the first step in simple linear regression analysis is to develop a scatter plot (scatter diagram). Graphing the data in this way yields preliminary information about the spread and shape of the data. Figures 8.1 and 8.2 are Excel scatter plots of some data. Looking at the scatter diagrams below, try to imagine a line passing through the points. Is a linear fit possible? Would a curve fit the data better? The scatter plot gives a rough idea of how well a regression line fits the data.
Figure 8.1 Scatter Plot of Airline Cost Data
Source: Black (2013)
Figure 8.2 Scatter Plot of Airline Cost Data
Source: Black (2013)
Determining the equation of a straight line
The first step in determining the equation of the regression line that passes through the sample data
is to establish the equation's form. In regression analysis, analysts use the slope-intercept equation of
a line. In statistics, the slope-intercept form of the equation of the regression line through the
population points is:
ŷ = b0 + b1x
Where:
x = values of the independent variable
ŷ = estimated values of the dependent variable
b0 = y-intercept coefficient (where the regression line cuts the y-axis)
b1 = slope (gradient) coefficient of the regression line
To construct the equation of the regression line for a sample of data, the analyst must determine the
values for b0 and b1. This procedure is sometimes referred to as least squares analysis. Least squares
analysis is a procedure whereby a regression model is constructed by obtaining the minimum sum of
the squared errors. On the basis of this premise and calculus, a particular set of equations has been
developed to produce components of the regression model.
Figure 8.3 Regression Line
Source: Black (2013)
Method of least squares:
A mathematical technique that determines the values b0 and b1 such that the sum of the squared
deviations of the data points from the fitted line is minimised.
The coefficients b0 and b1 that result from the method of least squares are calculated as
follows:
๐‘1 =
Interpreting the ๐’ƒ๐Ÿ Coefficient
๐‘› ∑ ๐‘ฅ๐‘ฆ − ∑ ๐‘ฅ ∑ ๐‘ฆ
๐‘› ∑ ๐‘ฅ 2 − (∑ ๐‘ฅ)2
๐‘0 =
∑ ๐‘ฆ − ๐‘1 ∑ ๐‘ฅ
๐‘›
The ๐‘1 regression coefficient is the slope of the regression line. It is a marginal rate of change measure.
It is interpreted as follows: for a unit change in ๐‘ฅ, ๐‘ฆ will change by the value of ๐‘1 .
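As a sketch of how the least-squares formulas might be applied, the following Python snippet computes b0 and b1 directly from the sums. The data values are made up purely for illustration; the prescribed workflow in this module uses Excel.

```python
# Hypothetical data purely for illustration: x = predictor, y = response
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)

# Slope: b1 = (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept: b0 = (sum(y) - b1*sum(x)) / n
b0 = (sum_y - b1 * sum_x) / n

print(f"y-hat = {b0:.2f} + {b1:.2f}x")
```

For these illustrative data the fitted line is approximately ŷ = 0.05 + 1.99x, so each unit increase in x is associated with an increase of about 1.99 in y.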
Extrapolation
Extrapolation occurs when x-values chosen from outside the domain of the sample data are
substituted into the regression equation to estimate y. Such estimates should be treated with caution,
since the fitted relationship may not hold beyond the observed data.
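A minimal sketch of the distinction, using a hypothetical regression line fitted to x-values between 1 and 5 only:

```python
# Hypothetical regression line fitted on x-values in the domain [1, 5]
b0, b1 = 0.05, 1.99

def predict(x):
    return b0 + b1 * x

print(predict(4))   # x = 4 lies inside the fitted domain: interpolation
print(predict(50))  # x = 50 lies far outside it: extrapolation, treat with caution
```

The arithmetic works identically in both cases; the danger of extrapolation is statistical, not computational, because there is no evidence that the straight-line relationship holds at x = 50.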
5.8.4 Correlation Analysis
In our case of examining two related variables, correlation measures the degree of relatedness of
these variables. It can help business analysts determine whether, and how strongly, two variables are
related. Several measures of correlation are available; the selection depends mostly on the level of
data being analysed. Ideally, analysts would like to solve for ρ, the population coefficient of
correlation. However, because analysts virtually always deal with sample data, this unit introduces a
widely used sample coefficient of correlation, r.
The term r is a measure of the linear correlation of two variables. It ranges between −1 and +1,
representing the strength of the linear association between the variables. An r-value of +1 denotes a
perfect positive relationship between two variables. An r-value of −1 denotes a perfect negative
correlation, indicating an inverse relationship between two variables: as one variable increases,
the other decreases. An r-value of 0 means no linear relationship is present between the two
variables. Pearson’s Correlation Coefficient (r) measures the correlation between two ratio-scaled
random variables.
Formula
r = (n∑xy − ∑x∑y) / √{[n∑x² − (∑x)²] × [n∑y² − (∑y)²]}
Where: ๐‘Ÿ = ๐‘กโ„Ž๐‘’ ๐‘ ๐‘Ž๐‘š๐‘๐‘™๐‘’ ๐‘๐‘œ๐‘Ÿ๐‘Ÿ๐‘’๐‘™๐‘Ž๐‘ก๐‘–๐‘œ๐‘› ๐‘๐‘œ๐‘’๐‘“๐‘“๐‘–๐‘๐‘–๐‘’๐‘›๐‘ก
๐‘ฅ = ๐‘กโ„Ž๐‘’ ๐‘ฃ๐‘Ž๐‘™๐‘ข๐‘’๐‘  ๐‘œ๐‘“ ๐‘–๐‘›๐‘‘๐‘’๐‘๐‘’๐‘›๐‘‘๐‘’๐‘›๐‘ก ๐‘ฃ๐‘Ž๐‘Ÿ๐‘–๐‘Ž๐‘๐‘™๐‘’
๐‘ฆ = ๐‘กโ„Ž๐‘’ ๐‘ฃ๐‘Ž๐‘™๐‘ข๐‘’๐‘  ๐‘œ๐‘“ ๐‘กโ„Ž๐‘’ ๐‘‘๐‘’๐‘๐‘’๐‘›๐‘‘๐‘’๐‘›๐‘ก ๐‘ฃ๐‘Ž๐‘Ÿ๐‘–๐‘Ž๐‘๐‘™๐‘’
๐‘› = ๐‘กโ„Ž๐‘’ ๐‘›๐‘ข๐‘š๐‘๐‘’๐‘Ÿ ๐‘œ๐‘“ ๐‘œ๐‘๐‘ ๐‘’๐‘Ÿ๐‘ฃ๐‘Ž๐‘ก๐‘–๐‘œ๐‘›
5.8.5 The Coefficient of Determination (r²)
If the sample correlation coefficient, r, is squared, the resulting measure (r²) is called the
coefficient of determination. The coefficient of determination measures the proportion (or
percentage) of variation in the dependent variable, y, that is explained by the independent variable,
x. Its value ranges between 0 and 1 (or 0% and 100%): 0 ≤ r² ≤ 1, or 0% ≤ r² ≤ 100%.
r² is an important indicator of the usefulness of the regression equation because it
measures how strongly x and y are associated. The closer r² is to 1 (or 100%), the stronger the
association between x and y; alternatively, the closer r² is to 0, the weaker the association between x
and y.
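A small illustration of the r-to-r² relationship, using the r = −0.78 value that appears in the revision questions below:

```python
r = -0.78  # a sample correlation coefficient (strong negative association)
r_squared = r ** 2

# About 61% of the variation in y is explained by x.
# Note that squaring discards the sign: r² says nothing about the
# direction of the relationship, only about its strength.
print(f"r^2 = {r_squared:.4f}")
```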
Revision Questions
1. What is regression analysis? What is correlation analysis?
2. What name is given to the variable that is being estimated in a
regression equation?
3. What is the purpose of an independent variable in regression
analysis?
4. What is the name of the graph that is used to display the
relationship between the dependent variable and the independent
variable?
5. What is the name given to the method used to find the
regression coefficients?
6. Explain the strength and direction of the association between
two variables, x and y, that have a correlation coefficient of −0.78.
5.8.6 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment.
The Self-Assessment for this unit is embedded within your Quantitative
Techniques module in myClass. Head on to the quiz to see how you have fared
with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next
section!
5.9 WEEK 9: TIME SERIES ANALYSIS: A FORECASTING TOOL
Purpose
The purpose of this unit is to explain how to treat time series data and
how to prepare forecasts of future levels of activities.
Learning Outcomes
By the end of this unit, you will be able to:
• Explain the difference between cross-sectional (survey) and time series data
• Explain the purpose of time series analysis
• Identify and explain the components in time series analysis
• Calculate and interpret the trend values in a time series
• Calculate and interpret the seasonal influence in a time series
• De-seasonalise a time series and explain its value
• Prepare seasonally adjusted forecast values of a time series
Time
It will take you 12 hours to make your way through this study week.
Reading
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based
Applications. Juta: Cape Town, South Africa. ISBN 978-1-48511-193-1. Chapter 15.
5.9.1 Introduction
Data collected on a given phenomenon over a period of time at systematic intervals is known as
time-series data. Time-series forecasting methods endeavour to account for changes over time by
studying patterns, trends, or cycles, or by making use of information about previous time periods to
predict the outcome for a future time period. Time-series methods include naïve methods,
averaging, smoothing, regression trend analysis, and the decomposition of the possible time-series
factors. Most data used in statistical analysis are cross-sectional data, meaning that they are
gathered from sample surveys at one point in time. Conversely, data can also be collected over time.
For instance, when a business records its daily, weekly or monthly gross revenue, or when a
household records its daily or monthly electricity usage, they are gathering a time series of data.
5.9.2 The Components of a Time Series
The general conviction is that time-series data is comprised of four components: trend, cycles,
seasonal effects, and irregular fluctuations. Not all time-series data have all these features.
Figure 9.1 Bond yield time series data
Source: Black (2013)
Bond yield data are portrayed in Figure 9.1. The general trend appears to move downward, and the
series comprises two cycles. Each of the cycles spans approximately 5 to 8 years. It is possible, although not
presented here, that seasonal periods of highs and lows within each year result in seasonal bond
yields. In addition, irregular daily variations of bond yield rates may occur but are unexplainable. Time-series
data that comprise no trend, cyclical or seasonal effects are said to be stationary.
Approaches used to forecast stationary data analyse only the irregular fluctuation effects.
Figure 9.2 Time Series Effects
Source: Black (2013)
Figure 9.2 shows the effects of these time-series elements on data over a period of 13 years.
The long-term general direction of the data is referred to as a trend. Notice that even though the data
move through upward and downward periods, the general direction or trend is increasing. Cycles are
patterns of highs and lows through which data move over time periods usually of more than a year.
Notice that the data in Figure 9.2 apparently move through two periods or cycles of highs and lows
over a 13-year period. Time-series data that do not extend over a long period of time may not have
enough “history” to show cyclical effects. Seasonal effects, on the other hand, are shorter cycles,
which generally occur in time periods of less than one year. Seasonal effects are often
measured by the month, but they may occur by quarter, or may be measured in as small a time frame
as a week or even a day. Note the seasonal effects shown in Figure 9.2 as up-and-down cycles, many
of which occur during a one-year period. Irregular variations are rapid changes or “blips” in the data,
which occur in even shorter time frames than seasonal effects. Irregular fluctuations can happen as
often as day to day. They are subject to momentary change and are often unexplained. Note the
irregular fluctuations in the data of Figure 9.2.
5.9.3 Decomposition of a Time Series
Time series methods seek to separate the effects of each of the four components on the actual time series.
The time series model used as the basis for assessing the influence of these four components
assumes a multiplicative relationship between them. The multiplicative time series model is expressed
algebraically as:
y=T×C×S×I
Where:
T =trend
C =cycles
S =seasonal effects
I =irregular fluctuations
In this section of the unit, we examine statistical approaches to quantify trend and seasonal variations
only. These two components account for the most significant proportion of an actual value in a time
series. By isolating them, most of an actual time series value will be explained.
5.9.4 Trend Analysis
The long-term trend in a time series may be isolated by removing the medium- and short-term
fluctuations (i.e. cycles, seasonal and random) in the series. This results in either a smooth curve or a
straight line, depending on the method selected.
Two procedures for trend isolation could be:
• the moving average method, which produces a smooth curve
• regression analysis, which results in a straight-line trend.
The moving average time series is a smoother series than the original time series values. It has
removed the effect of short-term fluctuations (i.e. seasonal and irregular fluctuations) from the
original observations, y, by averaging over these short-term fluctuations. The moving average value
can be seen as reflecting mainly the combined trend and cyclical movements.
In symbol terms for the multiplicative model:
Moving average = (T × C × S × I) / (S × I) = T × C
The moving average technique computes an average that is updated or recomputed for every new time period
being considered. The most recent information is utilised in each new moving average. This advantage
is offset by two disadvantages:
1. it is difficult to choose the optimal length of time for which to compute the moving average,
and
2. moving averages do not usually adjust for such time-series effects as trend, cycles, or
seasonality.
To determine the optimal length for which to compute the moving averages, we would need
to forecast with several different average lengths and compare the errors produced by them.
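This comparison can be sketched in Python. The demand figures below are hypothetical; each k-period average is used as a naïve forecast of the following period, and the mean absolute error (MAE) is compared across two candidate lengths:

```python
def moving_average(series, k):
    """Simple k-period moving averages of a series."""
    return [sum(series[i:i + k]) / k for i in range(len(series) - k + 1)]

# Hypothetical monthly demand figures
demand = [120, 132, 128, 140, 135, 150, 145, 158, 152, 160]

for k in (3, 5):
    ma = moving_average(demand, k)
    forecasts = ma[:-1]   # ma[i] averages periods i..i+k-1 and forecasts period i+k
    actuals = demand[k:]
    mae = sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)
    print(f"{k}-period moving average, MAE = {mae:.2f}")
```

For these illustrative data the 3-period average tracks the series more closely than the 5-period average; whichever length produces the smaller error would be preferred.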
5.9.5 Seasonal Analysis
Seasonal effects are patterns in the data that occur in periods of less than one year. How can
we separate out seasonal effects? The ratio-to-moving-average method is used to measure and
quantify these seasonal effects. This method expresses the seasonal influence as an index number. It
measures the percentage deviation of the actual values of the time series, y, from a base value that
disregards the short-term seasonal effects. These base values of the time series represent the
trend/cyclical influences only.
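A sketch of the ratio-to-moving-average idea for quarterly data, using hypothetical sales figures. The centred 4-quarter moving average estimates the base values T × C; the ratio of each actual value to its base value isolates the seasonal (and irregular) effect, S × I:

```python
# Hypothetical quarterly sales figures over three years
sales = [20, 35, 28, 24, 22, 38, 30, 26, 25, 41, 33, 29]

# Step 1: 4-quarter moving averages (each averages out one year's seasonality)
ma4 = [sum(sales[i:i + 4]) / 4 for i in range(len(sales) - 3)]
# Step 2: centre them by averaging adjacent pairs (needed because 4 is even),
#         giving estimates of the trend/cyclical base values, T x C
centred = [(a + b) / 2 for a, b in zip(ma4, ma4[1:])]
# Step 3: the ratio of each actual value to its base value, x 100,
#         gives a seasonal index for that quarter (y / (T x C) = S x I)
ratios = [100 * sales[i + 2] / centred[i] for i in range(len(centred))]

print([round(v, 1) for v in ratios])
```

For these data the second quarter of each year shows an index well above 100 (a seasonal high) and the first quarter an index well below 100 (a seasonal low); in practice the ratios for like quarters are then averaged to obtain one index per quarter.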
5.9.6 Uses of Time Series Indicators
Time series indicators are important planning aids to managers in two ways:
1. To de-seasonalise a time series (i.e. to exclude seasonal influences), giving a
clearer view of the longer-term trend/cyclical movements
2. To create seasonally adjusted trend forecasts of future values of a time series.
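De-seasonalising can be sketched as follows. The seasonal indices below are assumed values for illustration (they average to 100); dividing each actual quarterly value by its index, expressed as a proportion, strips out the seasonal influence and exposes the underlying trend/cyclical level:

```python
# Assumed seasonal indices for quarters 1-4 (hypothetical; they average to 100)
seasonal_index = [80.0, 130.0, 103.0, 87.0]
actual = [22, 38, 30, 26]  # one year's actual quarterly values

# Divide each actual value by its seasonal index (as a proportion)
# to remove the seasonal influence.
deseasonalised = [y / (s / 100) for y, s in zip(actual, seasonal_index)]

print([round(v, 2) for v in deseasonalised])
```

The de-seasonalised values lie close together (roughly 27.5 to 29.9), revealing a gently rising trend that the seasonal swings had obscured. A seasonally adjusted forecast reverses the step: multiply a trend forecast for a future quarter by that quarter's seasonal index divided by 100.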
5.9.7 Self-Assessment
Let us see what you have learned so far by taking this short self-assessment.
The Self-Assessment for this unit is embedded within your Quantitative
Techniques module in myClass. Head on to the quiz to see how you have fared
with this section of content!
Be sure to complete the self-assessment quiz before you move on to the next
section!
6 REFERENCES
Bergquist, T., Jones, S. and Freed, N. (2013). Understanding Business Statistics. John Wiley & Sons.
Black, K. (2013). Business Statistics: For Contemporary Decision Making, 7th edition. John Wiley & Sons.
Wegner, T. (2016). Applied Business Statistics: Methods and Excel-based Applications, 4th edition. Juta:
Cape Town, South Africa.