quantitative data analysis

Faculty of Engineering
INTRODUCTION TO DATA ANALYSIS
IE 204
Contents
• INTRODUCTION
• LEARNING OUTCOMES
• RESEARCH PROCESS
• QUALITATIVE ANALYSIS
• QUANTITATIVE ANALYSIS
• DATA ANALYSIS
Introduction
Welcome to the data analysis presentation. The presentation covers
both qualitative and quantitative approaches to data analysis.
Data analysis is an important stage of the research process. This presentation includes a
summary of that process and explores specific areas of data analysis that might be
applicable to learners studying at undergraduate and postgraduate levels.
It aims to define qualitative and quantitative data analysis and to let learners explore
textual and numerical data analysis through worked examples, with further opportunities
to develop knowledge and skills in data analysis.
Research Process
Definition
Researchers who are attempting to answer a research question employ the research process.
Though presented in a linear format, in practice the process of research can be less
straightforward. This said, researchers attempt to follow the process and use it to present
their research findings in research reports and journal articles.
Research Question
The specific question that guides the research process.
Research Process
The process undertaken by researchers to answer research questions/hypotheses.
Learning Outcomes
The presentation aims to achieve the following learning outcomes:
• An awareness of the place of qualitative data analysis within the inductive paradigm
• An awareness of the place of quantitative data analysis within the deductive paradigm
• Skills in critically appraising the data analysis component of research studies
• An appreciation of the different approaches to qualitative data analysis
• An appreciation of the different approaches to quantitative data analysis
• Skills in undertaking basic qualitative and quantitative data analysis
Research Process Stages
Identifying research problems
Research problems need to be researchable. They can be generated from practice, but must be grounded in the
existing literature. They may be local, national or international problems that need addressing in order to
develop the existing evidence base.
Searching the existing literature base
A thorough search of the literature using databases, the internet, texts and expert sources should support the
need to research the problem. This should be broad and in depth, showing a comprehensive search of the
problem area.
Critical review of the literature
A critical review framework should be employed to review the literature in a systematic way.
Developing the questions and/or hypothesis
A more specific research question and/or hypothesis may be developed from the literature review. It
provides the direction for the research, which aims to answer the question or hypothesis posed.
Theoretical base
The research may employ a theoretical base for examining the problem; this is seen especially in master's-level
research and in many research studies.
Sampling strategies
Sampling is the method for selecting people, events or objects for study in research. Non-probability and
probability sampling strategies enable the researcher to target data collection techniques. The sample may need to
be of a specific size (sometimes determined by a power calculation) or composition.
Data collection techniques
These are the tools and approaches used to collect data to answer the research question/hypothesis. More
than one technique can be employed; the most common are questionnaires, interviews and surveys.
Approaches to qualitative and quantitative data analysis
This component involves qualitative and quantitative approaches, dependent on the type of data collected.
Interpretation of results
The results are interpreted, drawing conclusions and answering the research question/hypothesis. Implications
for practice and further research are drawn, which acknowledge the limitations of the research.
Dissemination of research
The research and results can be presented through written reports, articles, papers and conferences, both in
print and electronic forms.
Qualitative Data
Qualitative data is extremely varied in nature. It includes virtually any information that can
be captured that is not numerical in nature. Here are some of the major categories or
types:
• In-Depth Interviews
These can be individual interviews or group interviews, sometimes referred to as ‘focus
groups’. The data can be recorded in many ways. The purpose of the interview is to
understand the ideas of the interviewees.
• Direct Observation
• Written Documents
What types of questions produce qualitative data?
• Questions with textual responses (i.e. where the answers are not in numerical format)
• Open-ended questions, where the subject is expected to provide an answer in textual or
written form.
Therefore, the nature of qualitative data and quantitative data is different. Qualitative data
usually comes in the form of words whereas quantitative data consists of numbers.
However, qualitative data can be coded numerically.
Explanation
• Assign numerical values to all responses received to any given question.
• Thereafter, the qualitative data is in numerical form and can be processed as quantitative
data.
A Simple Example
Population Size: 10
Open Ended Question:
- Which topic of books is your favorite to read?
A) History
B) Sports
C) Science
D) Islam
E) Other ________
All responses received:
- History (2)
- Sports (1)
- Science (0)
- Islam (3)
‘Other’ Responses:
- Novels (2)
- I do not read any books. (2)
From a population size of 10, our responses fall into 6 different categories. The
responses are qualitative in nature. We now have to work in order to quantify the data.
Now we have to work to translate the qualitative data into numerical format, which is
easier to tabulate and interpret.
First, we have to assign each received response a label, which is easier to work with than
words or phrases.
We can refer to this as a ‘key’, as it simplifies the process of representing the data and
understanding it.
KEY DESIGN
For this example, the word ‘theme’ will be used in order to translate the data numerically.
The word itself is interchangeable, and therefore letters (A, B, C), numbers (1, 2, 3) or any
other identifying scale could be used. As the answers to our open-ended question fall into
6 different response categories, we will now have ‘6 themes’.
EXAMPLE KEY
Theme 1 = History
Theme 2 = Sports
Theme 3 = Science
Theme 4 = Islam
Theme 5 = Novels
Theme 6 = I do not read any books
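The key above can be applied mechanically. The following is a minimal sketch in Python, not part of the original slides: the names `THEME_KEY`, `TEXT_TO_THEME` and `code_responses` are illustrative assumptions, and the list of answers is rebuilt from the response counts given in the example.

```python
# Key from the example: theme number -> response text.
THEME_KEY = {
    1: "History",
    2: "Sports",
    3: "Science",
    4: "Islam",
    5: "Novels",
    6: "I do not read any books",
}

# Reverse lookup (response text -> theme number) used for coding.
TEXT_TO_THEME = {text: theme for theme, text in THEME_KEY.items()}


def code_responses(responses):
    """Translate a list of textual responses into theme numbers."""
    return [TEXT_TO_THEME[response] for response in responses]


# The ten answers, rebuilt from the response counts in the example
# (History 2, Sports 1, Science 0, Islam 3, Novels 2, no books 2).
responses = (
    ["History"] * 2
    + ["Sports"] * 1
    + ["Islam"] * 3
    + ["Novels"] * 2
    + ["I do not read any books"] * 2
)

coded = code_responses(responses)
print(coded)
```

Once the answers are stored as theme numbers they can be counted and tabulated like any other numerical data, although the numbers themselves are only labels.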
Now that we have our key in place, we can continue to produce tables on the results
received to our open ended question.
Within the cells of the table we use numbers to signify the type of response received, e.g.:
1 = positive response
2 = negative response
3 = no response
RESULTING TABLE (1 = response given, 0 = response not given)

         Provided Responses                 ‘Other’ Responses
Person   History  Sports  Science  Islam    Novels  Don’t Read
1        1        0       0        0        0       0
2        0        1       0        0        0       0
3        0        0       0        1        0       0
4        0        0       0        1        0       0
5        0        0       0        1        0       0
6        0        0       0        0        1       0
7        1        0       0        0        0       0
8        0        0       0        0        1       0
9        0        0       0        0        0       1
10       0        0       0        0        0       1
Total    2        1       0        3        2       2
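The resulting table can also be produced programmatically. This is a small sketch under the assumption that each person's single answer is known; the names `COLUMNS`, `answers` and `indicator_row` are illustrative, and the answers are read off the table row by row.

```python
COLUMNS = ["History", "Sports", "Science", "Islam", "Novels", "Don't Read"]

# Answer given by each of the ten people, in person order (1 to 10),
# as read from the resulting table.
answers = [
    "History", "Sports", "Islam", "Islam", "Islam",
    "Novels", "History", "Novels", "Don't Read", "Don't Read",
]


def indicator_row(answer):
    """Return one table row: 1 in the column that was answered, 0 elsewhere."""
    return [1 if column == answer else 0 for column in COLUMNS]


rows = [indicator_row(answer) for answer in answers]

# Column totals: add up each column across all ten rows.
totals = [sum(column) for column in zip(*rows)]

for person, row in enumerate(rows, start=1):
    print(person, row)
print("Total", totals)
```

Each row contains exactly one 1, so the totals add up to the population size of 10.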
GRAPHICAL REPRESENTATION – BAR GRAPH
Using the above table, we can now work to translate our results into graphical format.
[Figure: bar graph; vertical axis from 0 to 3.5; categories History, Sports, Science, Islam, Novels, Don't Read]
GRAPHICAL REPRESENTATION – PIE CHART
[Figure: pie chart; segments History, Sports, Science, Islam, Novels, Don't Read]
GRAPHICAL REPRESENTATION – LINE GRAPH
[Figure: line graph; vertical axis from 0 to 3.5; categories History, Sports, Science, Islam, Novels, Don't Read]
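As a rough stand-in for a charting tool, the totals from the table can be drawn as a plain-text bar chart. This is only a sketch using the Python standard library; the function name `text_bar_chart` and the `#` symbol are arbitrary choices, and a spreadsheet or plotting package would normally be used for the real graphs.

```python
def text_bar_chart(counts, symbol="#"):
    """Return one line per category: label, count, and a bar of symbols."""
    width = max(len(label) for label in counts)
    lines = []
    for label, count in counts.items():
        line = f"{label.ljust(width)} ({count}) | {symbol * count}"
        lines.append(line.rstrip())
    return lines


# Column totals from the resulting table.
totals = {
    "History": 2,
    "Sports": 1,
    "Science": 0,
    "Islam": 3,
    "Novels": 2,
    "Don't Read": 2,
}

chart = text_bar_chart(totals)
for line in chart:
    print(line)
```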
Quantitative Data
As stated earlier, quantitative data is numerical in nature. Typically, questions that produce
quantitative data are not open-ended questions, unless otherwise stated.
Quantitative questions are typically multiple choice in nature, providing the subject with
numerical answers to select from.
A Simple Example
Population Size: 10
- How many newspapers do you read daily?
• 0 – 1 reply
• 1 – 2 replies
• 2 – 3 replies
• 3 – 1 reply
• 4 – 1 reply
• 5 – 2 replies
As the data is already numeric in nature, there is no need for us to design a key. The
data, as is, can be tabulated.
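A list of replies like the one above records how many people chose each answer. The sketch below, with the illustrative names `reply_counts` and `expand`, turns those counts back into one value per respondent, which is the form that statistics such as the mean are calculated from. It assumes the counts are exactly those listed in the example.

```python
from collections import Counter

# Newspapers read daily -> number of replies, as listed in the example.
reply_counts = {0: 1, 1: 2, 2: 3, 3: 1, 4: 1, 5: 2}


def expand(counts):
    """Turn {value: number of replies} into a sorted list with one value per person."""
    values = []
    for value, count in sorted(counts.items()):
        values.extend([value] * count)
    return values


papers_read = expand(reply_counts)
print(papers_read)

# Counting the expanded values again recovers the original reply counts.
print(dict(Counter(papers_read)))
```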
         Newspapers read daily
Person   0   1   2   3   4   5
1        0   1   0   0   0   0
2        0   0   1   0   0   0
3        0   0   1   0   0   0
4        0   0   1   0   0   0
5        0   0   0   0   0   1
6        1   0   0   0   0   0
7        0   1   0   0   0   0
8        0   0   0   1   0   0
9        0   0   0   0   1   0
10       0   0   0   0   0   1
Total    1   2   3   1   1   2
[Figure: Replies (axis from 0 to 3.5) against Newspapers Read (0 to 5)]
[Figure: Newspapers Read, with legend entries 0, 1, 2, 3, 4, 5]
[Figure: Replies (axis from 0 to 3.5) against Newspapers Read (0 to 5)]
Data Analysis
Now that we have successfully created our tables and graphs, we can begin to
analyze the data.
With regard to data analysis, the main statistics we wish to consider are as follows:
• Mean = being the average of the results
• Median = being the middle value, when all the numbers are in numerical order from lowest
to highest
• Mode = being the number that is repeated the most often
• Range = being the difference between the highest number and the lowest
• Standard Deviation = showing how much variation or difference there is within the results
from the mean.
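The five statistics listed above are available in, or easily built from, Python's standard `statistics` module. The sketch below is an illustration rather than the presentation's own method: the function name `describe` is made up, and `papers_read` is assumed to hold the ten per-person answers from the newspaper table. Note that `statistics.pstdev` divides by the number of values (population standard deviation), whereas `statistics.stdev` divides by one less.

```python
import statistics

# One answer per person (persons 1 to 10), read from the newspaper table.
papers_read = [1, 2, 2, 2, 5, 0, 1, 3, 4, 5]


def describe(values):
    """Return the mean, median, mode, range and population standard deviation."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "std_dev": statistics.pstdev(values),
    }


summary = describe(papers_read)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```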
Given that our results from the quantitative question, taken person by person from the table, are as follows:
1, 2, 2, 2, 5, 0, 1, 3, 4, 5
(These are the ten individual answers. The reply totals 1, 2, 3, 1, 1, 2 only count how many people gave each answer, so they are not the values to average.)
The mean, median, mode and range can be calculated as follows:
MEAN – take the average of the results
(1 + 2 + 2 + 2 + 5 + 0 + 1 + 3 + 4 + 5)/10 = 25/10 = 2.5 papers read
MEDIAN – put the numbers in numerical order from lowest to highest and select the middle
number
0, 1, 1, 2, 2, 2, 3, 4, 5, 5 – as there is an even amount of numbers, take the average of the middle two
(2 + 2)/2 = 2 papers read
MODE – the number that appears the most, therefore MODE = 2 papers read
RANGE – 5 – 0 = 5 papers read
Standard Deviation
Consider our results from our quantitative question:
1, 2, 2, 2, 5, 0, 1, 3, 4, 5
With the mean (average) being = 2.5
To calculate the Standard Deviation, find the difference of each data point from the mean and
square the result, then find the average of all the squared results, finishing by finding the square
root:
(1 – 2.5)² = (–1.5)² = 2.25
(2 – 2.5)² = (–0.5)² = 0.25
(2 – 2.5)² = (–0.5)² = 0.25
(2 – 2.5)² = (–0.5)² = 0.25
(5 – 2.5)² = (2.5)² = 6.25
(0 – 2.5)² = (–2.5)² = 6.25
(1 – 2.5)² = (–1.5)² = 2.25
(3 – 2.5)² = (0.5)² = 0.25
(4 – 2.5)² = (1.5)² = 2.25
(5 – 2.5)² = (2.5)² = 6.25
√((2.25 + 0.25 + 0.25 + 0.25 + 6.25 + 6.25 + 2.25 + 0.25 + 2.25 + 6.25)/10) = √(26.5/10) = √2.65
Standard Deviation = 1.628
What does this result tell us?
[Figure: normal distribution curve with the area within one standard deviation of the mean shaded red]
Assuming a normal distribution, as shown above, the red area (within one standard deviation either side of the mean) accounts for approximately 68%
of the population. The standard deviation is a statistic that tells you how tightly all the various
examples are clustered around the mean in a set of data. When the examples are pretty tightly
bunched together and the bell-shaped curve is steep, the standard deviation is small. When the
examples are spread apart and the bell curve is relatively flat, that tells you that you have a relatively
large standard deviation.
Therefore, for our quantitative question, regarding how many newspapers people read a day,
and if the answers are roughly normally distributed, about 68% of our population reads between
0.872 and 4.128 papers a day (2.5 – 1.628 and 2.5 + 1.628).
THEREFORE, from the quantitative question asked, and from the tables and graphs
produced, we have the following results:
MEAN = 2.5
MEDIAN = 2
MODE = 2
RANGE = 5
STANDARD DEVIATION = 1.628
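The standard deviation procedure described above (difference from the mean, square, average, square root) can also be written out by hand. The sketch below follows those steps and then checks what share of the answers actually falls within one standard deviation of the mean. The function names are illustrative, and `papers_read` is assumed to hold the ten per-person answers from the newspaper table.

```python
import math

# One answer per person (persons 1 to 10), read from the newspaper table.
papers_read = [1, 2, 2, 2, 5, 0, 1, 3, 4, 5]


def population_std_dev(values):
    """Standard deviation that divides by the number of values (population form)."""
    mean = sum(values) / len(values)
    squared_differences = [(value - mean) ** 2 for value in values]
    variance = sum(squared_differences) / len(values)
    return math.sqrt(variance)


def share_within_one_sd(values):
    """Fraction of values lying between mean - SD and mean + SD."""
    mean = sum(values) / len(values)
    sd = population_std_dev(values)
    inside = [value for value in values if mean - sd <= value <= mean + sd]
    return len(inside) / len(values)


sd = population_std_dev(papers_read)
share = share_within_one_sd(papers_read)
print(f"standard deviation: {sd:.3f}")
print(f"share within one standard deviation: {share:.0%}")
```

With only ten answers the observed share will not match the 68% of a normal distribution exactly; the comparison is a sanity check, not a test of normality.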
NOTE: This process would be repeated for all the meaningful questions you have.
Thank You
To help you with your data analysis, you may visit
www.download.com
Where you can find the following data analysis software:
• SAS
• Minitab
• Lotus
• Others…
To download the presentation, please visit the TLSU website
http://lsueng.kau.edu.sa
Go to Files and then Forms
Download