DATA ANALYSIS AND REPORTING ON INVESTIGATIONS
To read up on data analysis and reporting on investigations, refer to pages 720–745 of Eysenck’s A2 Level
Psychology.
Ask yourself
• What are descriptive statistics?
• Why do we test the probability of any given set of data being due to chance?
• Which two criteria determine which statistical test should be used?
What you need to know
DESCRIPTIVE STATISTICS AND STATISTICAL TESTS
• Levels of measurement and measures of central tendency and dispersion
• Appropriate use of graphs, charts, and tables
• Justification of statistical tests
• Probability and significance including Type 1 and Type 2 errors
• Inferential tests: chi-squared, Mann–Whitney, Wilcoxon, Spearman’s rho
QUALITATIVE ANALYSIS
• The process of qualitative analysis
• Interpretation and evaluation of qualitative analysis
THE CONVENTIONS FOR THE REPORTING OF PSYCHOLOGICAL RESEARCH
• The structure of journal reports: title, abstract, introduction, aims, and hypothesis, method, results,
discussion, references, appendices

DESCRIPTIVE STATISTICS AND STATISTICAL TESTS
For the exam, you need to know how to select and interpret statistical tests. There are just four
tests you need to cover: Mann–Whitney, Wilcoxon, chi-squared, and Spearman’s rho tests. You will not be
asked to carry out a statistical analysis during the exam but you must understand which test would be used
and when, and the nature of the data each test generates.
Descriptive statistics give us convenient and easily understood summaries of what we have found. They
give an indication of what statistical analysis is likely to reveal.
Descriptive statistics include: graphs, tables, measures of central tendency, and measures of dispersion. The
appropriate descriptive statistics depend on the level of measurement, and so we will consider this first.
Levels of measurement
The levels of measurement progress in terms of the sophistication of the data from nominal (the least
sophisticated data) to ratio (the most sophisticated data).
Nominal: the data consist of the numbers of participants falling into various categories (e.g. fat, thin; men,
women).
Ordinal: the data can be placed in rank order, i.e. they can be ordered from lowest to highest (e.g. the
finishing positions of athletes in a race or rating scales).
Interval: at this level the data have fixed intervals and so differ from ordinal data because the units of
measurement are fixed throughout the range. For example, there is the same “distance” between a height of
1.82 metres and 1.70 metres as between a height of 1.70 metres and one of 1.58 metres.
Ratio: the data have the same characteristics as interval data, except that they have a meaningful zero point,
i.e. an absolute zero. For example, time measurements provide ratio data because the notion of zero time is
meaningful, and 10 seconds is twice as long as 5 seconds. The similarities between interval and ratio data
are so great that they are sometimes combined and referred to as interval/ratio data.
Measures of central tendency and dispersion
The measures of central tendency—the mode, the median, and the mean—are averages, and so involve the
calculation of a single number that is representative of the other numbers with which it is associated. The
appropriate measure to use depends on the level of the data.
• The mode is the number that occurs most frequently. It is quick and easy to
calculate and can be used whatever the level of data. Thus, it can be used for
the least sophisticated nominal data. However, it provides very limited
information because it does not tell us about the other values in the score
distribution. A further problem is that it is possible to have more than one
modal value; a distribution with two modal values is bimodal, and one with
more than two is multimodal.
• The median is the middle value when the scores are arranged from lowest to
highest. An advantage of the median is that it is not distorted by extreme or
anomalous values, so it can be used when you are unsure about their
reliability or when you have a skewed distribution. The fact that it is based
on ordering the data means that it can be used only for data that are ordinal
and above.
• The mean is the arithmetic average; it is calculated by adding all the values
together and dividing the total by the number of values. The mean is the best
measure of central tendency to use because it makes use of all the data in the
score distribution. However, it should be used only with data that form a
normal distribution (remember: the bell-shaped curve) and for data of
interval or ratio measurement. The mean should not be used when there are
extreme outlying values (anomalies) because, as it uses all of the data, it is
easily distorted; when there are outliers, the median should be used, not the
mean.
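The three averages can be compared on a small data set. The following is a minimal sketch using Python's standard statistics module; the scores are hypothetical and include one outlier to show how the mean is pulled away from the bulk of the data while the median is not.

```python
from statistics import mean, median, multimode

# Hypothetical scores; the final value (40) is an outlier.
scores = [2, 3, 3, 4, 5, 6, 40]

modes = multimode(scores)   # list of the most frequent value(s)
middle = median(scores)     # middle value once the scores are ordered
average = mean(scores)      # total of the scores divided by how many there are

print("mode(s):", modes)    # [3]
print("median:", middle)    # 4
print("mean:", average)     # 9
```

Here the mean (9) is higher than six of the seven scores, whereas the median (4) still sits in the middle of the distribution.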
The measures of dispersion measure the variability within the data distribution, i.e. are the scores similar to
or different from each other? Thus, they are a measure of the spread of the scores in the data distribution.
The variation ratio complements the mode because it is the proportion of non-modal scores and so is
suitable for nominal data. Advantages include the fact that it is easy to calculate and can be used on
unsophisticated data, i.e. nominal data. The limitation is the same key disadvantage as with the mode: it is
not representative of all scores in the distribution and so tends not to be used.
The range is the difference between the highest and lowest scores in a dataset; the key advantage is that it
is easy to calculate. Limitations include the fact that the two most extreme values are used to calculate the
range and so if these are outlying the calculated range will not be representative of the distribution. Also,
the range does not make use of all the data in the score distribution and can only be used for data that is
ordinal and above. The interquartile range solves the problem of outlying values by using only the middle
50% of scores in the calculation. This gives a better idea of the distribution of values around the centre.
The standard deviation is a measure of variability. It measures scores in terms of difference from the mean.
Advantages are that, as with the mean, the standard deviation uses all the scores in a set of data and so is
the best measure of dispersion to use. We can make inferences based on the relationship between the
standard deviation and a normal distribution curve. However, limitations include the fact that data need to
be of interval or ratio levels of measurement, and to be approximately normally distributed, which is not
always possible because this is the most sophisticated type of data.
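The measures of dispersion described above can be sketched in the same way. This is an illustration only, on hypothetical scores: quartiles can be calculated by several slightly different methods, so the interquartile range from statistics.quantiles may differ a little from a hand calculation, and stdev is the sample standard deviation (dividing by n - 1), which is an assumption about the version of the formula being used.

```python
from collections import Counter
from statistics import quantiles, stdev

# Hypothetical scores; the final value (40) is an outlier.
scores = [2, 3, 3, 4, 5, 6, 40]

# Variation ratio: the proportion of scores that are NOT the modal value.
modal_count = Counter(scores).most_common(1)[0][1]
variation_ratio = 1 - modal_count / len(scores)

# Range: highest score minus lowest score (distorted here by the outlier).
score_range = max(scores) - min(scores)

# Interquartile range: the spread of the middle 50% of the scores.
q1, _, q3 = quantiles(scores, n=4)
iqr = q3 - q1

# Standard deviation: based on every score's difference from the mean.
sd = stdev(scores)

print(round(variation_ratio, 2), score_range, iqr, round(sd, 2))
```

The outlier inflates the range (38) and the standard deviation, while the interquartile range (3.0) reflects only the middle of the distribution.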
Graphs and tables
Graphs and charts present the data visually. They are a useful way of summarising information as the data
is easily accessible in visual format.
Bar charts
Bar charts illustrate data measured at nominal or ordinal level. They are used for non-continuous variables,
which is why the bars are separate from each other, in contrast to the histogram, where the bars are
adjoined. Bar charts are often used to illustrate the means from different conditions.
Histograms
Histograms are used to present frequencies of continuous data. Thus, the data must be measured at interval
or ratio level for the histogram to be appropriate. The histogram represents the same information as the
frequency polygon, except one is presented as bars and the other as a line graph.
Frequency polygons
Frequency polygons show the frequencies of continuous data, i.e. data measured at interval or ratio level.
They are useful when representing results from two conditions at the same time. The x-axis shows the
scores and the y-axis shows the frequency.
Scattergraphs
Scattergraphs (or scattergrams) are used to present correlated data. It does not matter which variable goes
on which axis. Correlations range from perfect positive (+1) through no correlation (0) to perfect negative
(-1). The sign indicates the direction, and the size of the correlation coefficient (the number) indicates the
strength of the correlation. The closer the correlation coefficient is to +1 or -1, the stronger the correlation.
If scores are positively correlated, they increase and decrease together; if they are negatively correlated,
then as the scores on one variable increase, the scores on the other variable decrease. Perfect positive and
negative correlations are rare in psychological research; imperfect correlations are more common. You
must be able to interpret the direction and strength of a correlation, so remember, as a rule of thumb:
• low numbers (0.1 to 0.3) are weak correlations
• 0.4 to 0.6 are moderate
• 0.7 to 1 are strong (although it is a little more complex than this because
strength depends on the size of the sample).
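The rule of thumb can be written as a short function. This is a sketch only: the function name is hypothetical, and the cut-off points for values that fall between the bands (for example 0.35) are an assumption, since the rule of thumb does not say how to treat them.

```python
def describe_correlation(r):
    """Return a rough verbal label for a correlation coefficient r."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "no correlation"
    # The sign gives the direction; the size (ignoring the sign) gives the strength.
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"

print(describe_correlation(-0.8))  # strong negative
print(describe_correlation(0.5))   # moderate positive
```

Note that -0.8 is a stronger correlation than +0.5: strength depends on how far the coefficient is from zero, not on its sign.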
Tables
A table can be an effective way of summarising a large amount of data, for example, measures of central
tendency and dispersion can be provided in the one table. Tables can provide information very simply and
clearly but can be harder to interpret than a graph because it is more difficult to visualise the data.
Statistical tests
Test of difference, association, or correlation?
To select the appropriate statistical test you need to decide if a test of difference or a test of association or
correlation is appropriate.
One-tailed or two-tailed test?
When using a statistical test, you need to take account of the alternative hypothesis. If you predicted the
direction of any effects (e.g. loud noise will disrupt learning and memory), then you have a directional
hypothesis, and so need a one-tailed test. If you did not predict the direction of any effects (e.g. loud noise
will affect learning and memory), then you have a non-directional hypothesis, and so need a two-tailed test.
Level of precision
Another factor to consider when deciding which statistical test to use is the type of data you have obtained.
There are four types of data, of increasing levels of precision (nominal; ordinal; interval; ratio, see above).
Statistical significance
A statistical test is a way of testing the probability that the results are due to chance. We use this
probability to decide whether chance is unlikely enough for the alternative/experimental/correlational
hypothesis to be accepted. If so, the results are significant and
consequently the null hypothesis can be rejected and the alternative/experimental/correlational hypothesis
can be accepted.
The probability of the findings being due to chance is estimated from the level of statistical significance
achieved by the data.
• The conventional minimum level of significance to be accepted is p ≤ 0.05
(which is also known as the 5% level), where p = the probability of the
result if the null hypothesis is true. If this level of significance is achieved,
the probability of the results being due to chance (i.e. a fluke) is 5% or less,
and so the null hypothesis is rejected and the alternative hypothesis is
accepted. If the statistical test indicates that the findings do not reach the 5%
(i.e. the p ≤ 0.05) level of statistical significance, then we retain the null
hypothesis and reject the alternative hypothesis.
• The data sometimes indicate that the null hypothesis can be rejected with
greater confidence, say, at the 1% (i.e. one out of one hundred, shown as
p ≤ 0.01) level. If the null hypothesis can be rejected at the 1% level, it is
customary to state that the findings are “highly significant”. In general terms,
you should state the precise level of statistical significance of your findings,
whether it is the 5% level, the 1% level, or whatever.
There are two kinds of error that can occur when reaching a conclusion on the basis of the results of a
statistical test:
• Type 1 error: the null hypothesis is rejected incorrectly as the results are
actually due to chance.
• Type 2 error: the alternative hypothesis is rejected incorrectly as the results
do in fact show a real difference or relationship.
It would be possible to reduce the likelihood of a type 1 error by using a more stringent level of
significance. For example, if we used the 1% (p ≤ 0.01) level of significance, this would greatly reduce the
probability of a type 1 error. However, use of a more stringent level of significance increases the
probability of a type 2 error. Consequently, most psychologists favour the 5% (or p ≤ 0.05) level of
significance: it allows the probabilities of both type 1 and type 2 errors to remain reasonably low.
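The decision rule can be sketched as a small function. The function name and the example probability are hypothetical; the point is only that the same result can be significant at the 5% level but not at the more stringent 1% level.

```python
def decide(p_value, alpha=0.05):
    """Decide what to do with the null hypothesis at significance level alpha."""
    # Significant if the probability of the result under the null
    # hypothesis is no greater than the chosen level of significance.
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "retain the null hypothesis"

# Hypothetical result with p = 0.03.
print(decide(0.03))              # reject the null hypothesis
print(decide(0.03, alpha=0.01))  # retain the null hypothesis
```

Moving from the 5% to the 1% level makes a type 1 error less likely, but, as the second call shows, it also makes it more likely that a real effect will be missed (a type 2 error).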
Choosing the appropriate statistical test
To summarise, two key factors determine choice of statistical test:
1) Research design: whether the design is correlational or experimental determines whether a test of
association or a test of difference is needed. If the design is experimental, the choice of test depends on
whether an independent measures design or a repeated measures (or matched pairs) design has been used:
independent measures requires a test of unrelated data, whereas repeated measures and matched pairs
require a test of related data.
2) Level of data: see above.
Note that other considerations include whether the hypothesis is directional or non-directional and the
level of significance needed to avoid both Type 1 and type 2 errors. These do not affect choice of test. They
are used when looking up the critical value but are not relevant to choice of test.
Level of measurement    Difference test                Difference test                             Correlational test
                        (independent data)             (related data*)
Nominal                 Chi-squared test               Sign test                                   Chi-squared test
Ordinal and interval    Mann–Whitney U test            Wilcoxon matched pairs signed ranks test    Spearman’s rho
* Related data are obtained from repeated measures and matched pairs designs.
Note that this table deals only with the statistical tests described in the text, although
other tests do exist.
Thus, you need to justify the statistical test based on the design and level of data.
For example, a test of gender difference in reported stress uses an independent measures design and the
data are at the ordinal level of measurement. (The study tests a difference, the data are unrelated
because the design is independent measures, and the data achieve the ordinal level of measurement, so the
Mann–Whitney U test is appropriate.)
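The selection table can be expressed as a simple lookup. This is an illustrative sketch: the labels used for the design ("independent", "related", "correlational") and the function name are hypothetical, and only the tests named in the table are included.

```python
# Lookup mirroring the test-selection table: (type of data/test, level) -> test.
TEST_TABLE = {
    ("independent", "nominal"): "chi-squared test",
    ("independent", "ordinal"): "Mann-Whitney U test",
    ("independent", "interval"): "Mann-Whitney U test",
    ("related", "nominal"): "sign test",
    ("related", "ordinal"): "Wilcoxon matched pairs signed ranks test",
    ("related", "interval"): "Wilcoxon matched pairs signed ranks test",
    ("correlational", "nominal"): "chi-squared test",
    ("correlational", "ordinal"): "Spearman's rho",
    ("correlational", "interval"): "Spearman's rho",
}

def choose_test(design, level):
    """Return the test for a given design and level of measurement."""
    try:
        return TEST_TABLE[(design, level)]
    except KeyError:
        raise ValueError(f"no test listed for {design!r} and {level!r}") from None

# Gender difference in reported stress: independent measures, ordinal data.
print(choose_test("independent", "ordinal"))  # Mann-Whitney U test
```

Only the design and the level of data are needed as inputs, which reflects the two criteria used to justify the choice of test.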
Statistical tests
Please refer to A2 Level Psychology pages 729–737 for worked calculations. Remember you will not be
asked to calculate any of these in the exam but working through the calculations will give you a better
understanding of the statistical tests.
The chi-squared test
The chi-squared test is a test of association and also a test of difference. It is used when we have nominal
data in the form of frequencies, and when each and every observation is independent of all the other
observations, and so when an independent measures design has been used.
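Although you will not calculate chi-squared in the exam, seeing the calculation can help. The sketch below uses the basic formula, summing (observed - expected) squared divided by expected over every cell, with no correction for small tables; whether the textbook's worked example applies any correction is not assumed here. The frequencies are hypothetical.

```python
def chi_squared(observed):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected frequency if the two variables were unrelated.
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical frequencies: rows are two groups, columns are two categories.
observed = [[10, 20],
            [20, 10]]
chi, df = chi_squared(observed)
print(round(chi, 2), df)  # 6.67 1
```

Each participant is counted in exactly one cell, which is what the requirement for independent observations means in practice.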
Mann–Whitney U test
The Mann–Whitney U test can be used when an independent design has been used and the data are either
ordinal or interval.
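As an illustration of what the Mann–Whitney U statistic measures, the sketch below counts, across every pairing of a score from one group with a score from the other, how often the first group's score is higher (ties count as a half). This counting method gives the same U as the rank-based calculation usually done by hand; the scores are hypothetical.

```python
def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic (the smaller of the two U values)."""
    # For every A-B pairing, score 1 if the A value is higher, 0.5 for a tie.
    u_a = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    # The two U values always add up to (size of A) x (size of B).
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)

# Hypothetical stress ratings from two independent groups.
group_a = [3, 4, 2, 6]
group_b = [7, 8, 5, 9]
u = mann_whitney_u(group_a, group_b)
print(u)  # 1.0
```

A small U means the two groups barely overlap, which is why a small calculated value points towards a significant difference.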
Wilcoxon matched pairs signed ranks test
The Wilcoxon matched pairs signed ranks test can be used when a repeated measures or matched
participants design has been used and the data are at least ordinal. This test can also be used if the data are
interval/ratio.
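The Wilcoxon statistic, T, can be sketched as follows: find the difference for each pair, drop pairs with no difference, rank the differences by size ignoring their sign, then add up the ranks for positive and for negative differences separately; T is the smaller total. The function name and scores are hypothetical.

```python
def wilcoxon_t(first, second):
    """Return (T, N): T is the smaller rank total, N the number of
    pairs remaining after pairs with zero difference are dropped."""
    diffs = [b - a for a, b in zip(first, second) if b != a]
    sizes = sorted(abs(d) for d in diffs)

    def rank(size):
        # Mean of the positions this size occupies, so tied sizes share a rank.
        return (2 * sizes.index(size) + sizes.count(size) + 1) / 2

    positive = sum(rank(abs(d)) for d in diffs if d > 0)
    negative = sum(rank(abs(d)) for d in diffs if d < 0)
    return min(positive, negative), len(diffs)

# Hypothetical scores from the same participants in two conditions.
before = [10, 12, 9, 14, 11, 13]
after = [12, 15, 8, 18, 11, 18]
t, n = wilcoxon_t(before, after)
print(t, n)  # 1.0 5
```

One pair shows no change and is dropped, so N is 5 rather than 6; N, not the original number of participants, is used when looking up the critical value.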
Spearman’s rho
Suppose that we have scores on two variables from each of our participants, and we want to see whether
there is an association, or correlation, between the two sets of scores. Providing the data are at least ordinal,
this can be done using the test known as Spearman’s rho. Spearman’s rho or rs indicates the strength of the
association.
rs = +1.0: a perfect positive correlation between the two variables.
rs = –1.0: a perfect negative correlation between the two variables.
rs = 0.0: no relationship between the two variables.
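A minimal sketch of the calculation, assuming the common formula rs = 1 - 6 × (sum of squared rank differences) / (n(n² - 1)), which is exact only when there are no tied scores. The data and function names are hypothetical.

```python
def ranks(values):
    """Rank values from lowest (rank 1) upwards; tied values share a mean rank."""
    ordered = sorted(values)
    return [(2 * ordered.index(v) + ordered.count(v) + 1) / 2 for v in values]

def spearman_rho(x, y):
    """Spearman's rho from the differences between paired ranks."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: hours of revision and test mark for five participants.
hours = [2, 4, 5, 7, 9]
marks = [50, 45, 60, 70, 65]
rs = spearman_rho(hours, marks)
print(round(rs, 2))  # 0.8
```

Because the test works on ranks rather than raw scores, it needs only ordinal data.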
Calculating the observed value
The next step is to perform the calculations. The outcome of a statistical test is a
number, called the observed value. The worked examples in the boxes in A2 Level
Psychology pages 729–737 show how to calculate each statistical test.
Using a table of significance to compare the observed and critical values
To establish significance, your calculated (or observed) value must be compared with a critical value.
These can be found in critical value tables in Appendix A of Eysenck’s A2 Level Psychology. The table will
tell you if the calculated value has to be less than or more than the critical value for significance to be
achieved. For the purposes of the exam, you need to know if the observed or calculated value needs to be
lower or higher than the critical value. This differs, depending on which statistical test is used.
Comparing the calculated and critical values
Statistical test         Calculated value compared to critical value
Chi-squared test         Calculated value must be greater than or equal to the critical value
Mann–Whitney U           Calculated value must be less than or equal to the critical value
Wilcoxon signed ranks    Calculated value must be less than or equal to the critical value
Spearman’s rho           Calculated value must be greater than or equal to the critical value
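The direction of the comparison can be captured in a few lines: for chi-squared and Spearman's rho the calculated value must be greater than or equal to the critical value, whereas for Mann–Whitney U and Wilcoxon it must be less than or equal to it. This is a sketch with hypothetical names; the critical values in the usage lines are illustrative placeholders and should always be read from the appropriate table.

```python
# Tests where a LARGER calculated value is needed for significance.
GREATER_OR_EQUAL = {"chi-squared", "spearman"}
# Tests where a SMALLER calculated value is needed for significance.
LESS_OR_EQUAL = {"mann-whitney", "wilcoxon"}

def is_significant(test, calculated, critical):
    """Compare a calculated (observed) value with a critical value."""
    if test in GREATER_OR_EQUAL:
        # For Spearman's rho, pass the size of the coefficient (ignore the
        # sign) and check the direction of the correlation separately.
        return calculated >= critical
    if test in LESS_OR_EQUAL:
        return calculated <= critical
    raise ValueError(f"unknown test: {test!r}")

# Illustrative critical values only; look up the real ones in the tables.
print(is_significant("chi-squared", 6.67, 3.84))  # True
print(is_significant("mann-whitney", 1, 0))       # False
```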
Reporting the result
The final step is to record the outcome of this whole process. You should include the following information
in a statement of significance:
• details of the level of significance
• the critical and observed values
• degrees of freedom, number of participants or number of paired scores
• whether the hypothesis was directional or non-directional (one-tailed or
two-tailed)
• whether it was accepted or rejected.

See Eysenck’s A2 Level Psychology page 739 for example statements of significance.
QUALITATIVE ANALYSIS
Qualitative data consist of words rather than numbers and can take many forms:
• written records, e.g. notes or transcripts
• audio or video recordings
• direct quotations from participants.
The process of qualitative analysis
1. Data are gathered using non-experimental methods, which include
naturalistic observation, interview, questionnaire, and case study.
2. The data need to be categorised and these categories should be suggested by
participants to avoid researcher bias.
3. The researcher will look for recurrent themes and patterns in the data, which
might or might not fit with the previously constructed categories. For
example, discourse analysis would be interpreted by analysing the meanings
behind the words used.
4. Consider the research hypothesis and how this might have changed as a
result of the investigation.
5. It might be useful to make the qualitative data quantitative, e.g. content
analysis. The researcher might quantify the data by counting the number of
items that fall into each category. This can be done to summarise the
qualitative data and usually accompanies, rather than replaces, the more in-depth qualitative analysis.
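The counting step in a content analysis (step 5) can be sketched with a frequency count. The categories and coded statements below are hypothetical; in a real analysis each statement would first be assigned to a category by the researcher.

```python
from collections import Counter

# Hypothetical coding: each interview statement has been given one category.
coded_statements = [
    "workload", "relationships", "workload",
    "money", "workload", "relationships",
]

# Count how many statements fall into each category.
frequencies = Counter(coded_statements)

for category, count in frequencies.most_common():
    print(category, count)
```

The resulting frequencies are nominal data, so they could be summarised with the mode or displayed in a bar chart.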
Evaluation of qualitative analysis
Advantages
• In-depth data increases validity (truth)
• Explanatory not just descriptive
Weaknesses
• Lacks generalisability
• Subjective and open to bias
• Difficult to replicate as lacks reliability
THE CONVENTIONS FOR THE REPORTING OF PSYCHOLOGICAL RESEARCH
The research journals that psychological research is reported in follow a conventional structure, as detailed
below.
Title
This should be very specific, including the research design and the variables.
Abstract
This is a single paragraph that summarises the main points of the research study: aims, a brief description
of the background research, methods, findings, and conclusions.
Introduction, aims, and hypothesis
The introduction should summarise relevant background research; it should begin at a general level and
quickly narrow down to examine two or three particularly relevant pieces of research. The aims and
hypothesis must be a logical progression from the background research. The hypothesis must be testable
and so needs to be operationalised, which means how the variables were measured must be clear within the
hypothesis.
Method
This section of the report should provide the reader with sufficient detail to replicate the study. It is
typically subdivided into four sections:
1. Design: includes design decisions, such as choice of method (e.g. experiment
or observation), experimental design, and the key variables. Any controls of
confounding variables, sources of bias, and ethical decisions that were taken
as part of the design should also be included.
2. Participants: where, when, who, how (sample method) need to be detailed.
3. Apparatus/materials: full details of all materials should be placed in the
appendix section of a report, so just a description of the materials is included
in the write-up. This might include a description of questionnaire
construction, observation criteria, standardisation of a test, etc.
4. Standardised procedures: this should be a clear but detailed summary of
exactly how the study was implemented.
Results
There are three ways to illustrate the results of psychological research:
1. Raw data: the numbers prior to any analysis. These should be placed in the
appendices but a summary might be included in the results section.
2. Descriptive statistics, such as the use of measures of central tendency (mean,
median, and mode) and/or spread (range or standard deviation), plus
graphical representation.
3. Statistical tests: determine whether the findings are significant. This section
must state clearly which test was used, justify the choice of statistical test,
record details of the test calculations in the appendix, and state the outcome
of the statistical test and so which hypotheses are supported or rejected.
Discussion
Four areas need to be covered:
1. Explanation of findings: the findings need to be related to the original aims
and hypotheses. Any unanticipated findings can be discussed.
2. Relationship to background research: The findings must be related to the
research in the introduction in terms of whether they support or contradict
it.
3. Limitations and modifications: key limitations need to be considered, along
with improvements that would resolve them.
4. Implications and suggestions: implications include the practical use of the
research, so how the findings can be used to explain real-life behaviour. The
implications should lead logically into suggestions for future research.
References
The reason for full references is so that readers have the details of the original article or book if they wish
to research the study/theory further themselves. This is the style used in the reference section of Eysenck’s
A2 Level Psychology. An alternative acceptable style is to state the details of a textbook, and list all the
studies with page numbers that have been cited from this book. This means that anyone who would like to
follow-up one of your references can locate the exact reference and the article.
Appendices
Examples of materials and/or questionnaires, standardised instructions, raw data, and statistical tests are
included in the appendices.
So what does this mean?
In this section we have considered how data should be analysed at both a descriptive and an inferential
level. Statistical tests need to be understood but fortunately not calculated in the exam. Make sure you can
justify the different statistical tests and know for each test whether the calculated value has to be greater or
lower than the critical value. Finally, make sure you are familiar with how psychological research needs to
be written up.
Over to you
Please see the Psychological research and scientific methods: specimen question in chapter 1 of Eysenck’s
A2 Level Psychology (see A2 Level Psychology page 6) for a typical question on this topic.