STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES
• Statistical Package for the Social Sciences (SPSS) is one of the most widely
used quantitative analysis packages for conducting statistical analysis,
manipulating data and generating tables and graphs that summarize data.
• Statistical analyses range from basic descriptive statistics, such as averages
and frequencies, to advanced techniques such as regression models, ANOVA and
factor analysis.
• SPSS provides a windowed graphical user interface, which makes the software
very user friendly.
• Researchers can use simple menus, pop-up boxes and dialogue boxes to
perform complex analyses without writing a single line of syntax.
• It is an integrated package that also provides a spreadsheet-like utility
for entering data and browsing the working data file.
• Its whole operation revolves around three windows.
o The data window opens with a blank data sheet ready for analysis. It is
used to define and enter data and to perform statistical procedures.
o The syntax window is used to keep a record of all commands issued in an
SPSS session. Researchers do not have to know the language to write the
syntax; instead they can select the appropriate option from the menu and
dialogue box and click the Paste button. This pastes the equivalent syntax
of the selected operation into the syntax window. Besides serving as a log
of operations, the syntax window can be used to run commands directly.
o Whenever a procedure is run, the output is directed to a separate
window called the output window.
• SPSS automatically adds a three-letter suffix to the end of the file name:
‘.sav’ for data editor files, ‘.spv’ for output files (‘.spo’ in versions
before 16) and ‘.sps’ for syntax files.
Entering Data in SPSS
• Researchers can create a data file by simply entering the data.
• The following are step-by-step procedures for entering information about the
variables:
o The SPSS main data window provides two tabs in the bottom left-hand
corner of the screen, through which the researcher can access either the
data view or the variable view window.
o The variable view window depicts the characteristics of variables in
terms of name, type, labels and values.
Name: Variable names in SPSS are not case sensitive, but they must
begin with a letter. In older versions of SPSS a variable name cannot
exceed 8 characters; version 12 and later allow up to 64.
Type: SPSS lets the researcher specify the data type, i.e., whether the
data are in numeric, string or date format; the default is numeric.
Labels: SPSS also allows labels to be attached to variable names. A
variable label can use up to 256 characters, which allows very
descriptive labels to appear in the output. Researchers can enhance
the readability of the output by using the labels option.
Values: Researchers can assign labels to variable values,
e.g., Male and Female can be coded as 1 and 2 respectively.
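The four variable characteristics above can be pictured as a small data dictionary. The sketch below is only an illustration in Python, not SPSS syntax; the names `variables`, `is_valid_name` and `decode` are hypothetical, and the 8-character limit is passed as a parameter because it depends on the SPSS version.

```python
# Hypothetical data dictionary mirroring the Variable View columns
# (name, type, label, values). This is not an SPSS API.
variables = {
    "gender": {
        "type": "numeric",
        "label": "Gender of respondent",
        "values": {1: "Male", 2: "Female"},
    },
}

def is_valid_name(name, max_len=8):
    """Check the naming rules stated above: starts with a letter, length limit."""
    return bool(name) and name[0].isalpha() and len(name) <= max_len

def decode(var, codes):
    """Replace numeric codes with their value labels; unknown codes pass through."""
    mapping = variables[var.lower()]["values"]  # names are not case sensitive
    return [mapping.get(code, code) for code in codes]

raw = [1, 2, 2, 1]
print(decode("GENDER", raw))         # ['Male', 'Female', 'Female', 'Male']
print(is_valid_name("gender"))       # True
print(is_valid_name("1st_score"))    # False: does not begin with a letter
```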
DESCRIPTIVE STATISTICS
Descriptive statistics describe the properties of a group of data or scores.
Researchers use descriptive statistics to get a first-hand feel of the data; for
example, in the case of categorical data, counts, proportions, rates and ratios are
calculated.
They describe the performance of one class of variables and do not make
generalizations about several classes.
Example: Graphs, Tables and Charts.
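As a minimal sketch of the kind of summary described above, the Python snippet below computes averages for a numeric variable and counts and proportions for a categorical one. The data values are invented for illustration, and the code uses only the Python standard library rather than SPSS itself.

```python
from collections import Counter
from statistics import mean, median, stdev

scores = [52, 61, 61, 70, 76]   # illustrative numeric scores
gender = [1, 2, 2, 1, 2]        # illustrative codes: 1 = Male, 2 = Female

# Frequencies and proportions for the categorical variable
counts = Counter(gender)
proportions = {code: n / len(gender) for code, n in counts.items()}

print("mean:", mean(scores))
print("median:", median(scores))
print("sample std dev:", round(stdev(scores), 2))
print("counts:", dict(counts))
print("proportions:", proportions)
```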
INFERENTIAL STATISTICS
Inferential statistics are a set of statistics used to make inferences about a
population from a sample selected from that population. They test statistical
significance to assess how accurately a sample predicts the population parameters.
Inferential statistics are also used to test differences between two samples, to
assess whether the differences actually exist or are just due to chance.
Inferential statistics are generally used for two purposes:
i. Tests for differences of means and
ii. Tests for statistical significance, which are further subdivided into
a. Parametric and
b. Non-parametric
depending upon the assumptions made about the distribution of the population.
Generalizations are made beyond the data for the original class of variables.
Example: Chi-square, t-test, correlation, regression, ANOVA, etc.
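To make the idea of testing a difference of means concrete, here is a small sketch of the pooled-variance t statistic for two independent samples. The function name `pooled_t` and the data are assumptions made for illustration, and the equal-variance form is only one of several t-tests. The statistic is compared with the tabulated two-tailed 5% critical value for 8 degrees of freedom (2.306) instead of computing a p-value.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t statistic for two independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / na + 1 / nb))   # standard error of the mean difference
    return (mean(a) - mean(b)) / se, df

group_a = [12, 14, 15, 16, 18]   # illustrative sample 1
group_b = [8, 10, 11, 12, 14]    # illustrative sample 2

t, df = pooled_t(group_a, group_b)
critical = 2.306                 # two-tailed 5% critical value of t for df = 8
print("t =", round(t, 3), "df =", df)
print("difference significant at 5%:", abs(t) > critical)
```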
Advanced techniques for data analysis
i. Factor Analysis
ii. Discriminant Analysis
iii. Conjoint Analysis
iv. Multidimensional Scaling
v. Cluster Analysis
ANOVA - Analysis of Variance
Analysis of variance is a method that separates the variation ascribable to one
set of causes from the variation ascribable to another set, i.e., it splits the
total variation of the data into constituent parts that measure the different
sources of variation:
a) Variation within the subgroups of the sample and
b) Variation between the subgroups of the sample.
This technique of analysis of variance is referred to as ANOVA.
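The splitting of total variation can be checked numerically. The sketch below is a one-way ANOVA on invented data; the function name `one_way_anova` is hypothetical. It computes the between-group and within-group sums of squares, confirms that they add up to the total, and forms the F ratio from the two mean squares.

```python
from statistics import mean

def one_way_anova(groups):
    """Split total variation into between-group and within-group parts."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation between the subgroups: group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation within the subgroups: observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, ss_total, f_ratio

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]   # three illustrative subgroups
ss_b, ss_w, ss_t, f = one_way_anova(groups)
print(ss_b, ss_w, ss_t)   # 24 6 30, so between + within = total
print(f)                  # 12.0
```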
Factor Analysis
Factor analysis is concerned with identifying the underlying source of variation
common to two or more variables.
The two objectives of factor analysis are
i. To reduce the number of variables
ii. To detect the structure in the relationships between variables.
Factor analysis can be either exploratory or confirmatory in nature. There are two
methods of factor analysis:
i. Common factor analysis, which extracts factors based on the variance shared by
the variables.
ii. Principal components analysis, which extracts factors based on the total
variance of the variables.
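A full factor analysis needs a linear algebra library, but the principal components idea can be sketched for the simplest case of two standardized variables. Their correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + |r| and 1 - |r|, so the first component accounts for (1 + |r|) / 2 of the total variance. The data and the helper `correlation` are assumptions made for illustration.

```python
from statistics import mean, pstdev

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

x = [1, 2, 3, 4, 5]   # illustrative variable 1
y = [2, 1, 4, 3, 5]   # illustrative variable 2

r = correlation(x, y)
# Eigenvalues of the 2 x 2 correlation matrix [[1, r], [r, 1]]
eigenvalues = [1 + abs(r), 1 - abs(r)]
# Total variance of two standardized variables is 2
explained = [ev / 2 for ev in eigenvalues]

print("r =", round(r, 3))
print("eigenvalues:", [round(ev, 3) for ev in eigenvalues])
print("share of total variance:", [round(e, 3) for e in explained])
```

Here one component carries most of the total variance, which is the sense in which the number of variables can be reduced.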
Discriminant Analysis
It involves deriving linear combinations of independent variables that
discriminate between defined groups in such a way that misclassification error is
minimized. This is done by maximizing the between-group variance relative to the
within-group variance.
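One common way to obtain such a linear combination for two groups is Fisher's linear discriminant, sketched below for two independent variables. The weights are w = Sw^-1 (m1 - m2), where Sw is the pooled within-group scatter matrix and m1, m2 are the group centroids. The function names and data are illustrative assumptions, and the cutoff at the midpoint of the projected centroids assumes equal group sizes and equal misclassification costs.

```python
from statistics import mean

def fisher_direction(g1, g2):
    """Weights w = Sw^-1 (m1 - m2) for two groups of (x, y) points."""
    def centroid(g):
        return [mean([p[0] for p in g]), mean([p[1] for p in g])]

    def scatter(g, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in g)
        syy = sum((p[1] - m[1]) ** 2 for p in g)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in g)
        return sxx, sxy, syy

    m1, m2 = centroid(g1), centroid(g2)
    a1, b1, c1 = scatter(g1, m1)
    a2, b2, c2 = scatter(g2, m2)
    a, b, c = a1 + a2, b1 + b2, c1 + c2   # pooled within-group scatter [[a, b], [b, c]]
    det = a * c - b * b
    d0, d1 = m1[0] - m2[0], m1[1] - m2[1]
    # Multiply the inverse of the 2 x 2 scatter matrix by the centroid difference
    w = [(c * d0 - b * d1) / det, (a * d1 - b * d0) / det]
    # Cutoff: discriminant score halfway between the two projected centroids
    cutoff = sum(wi * (u + v) / 2 for wi, u, v in zip(w, m1, m2))
    return w, cutoff

def classify(point, w, cutoff):
    """Assign a point to group 1 or group 2 by its discriminant score."""
    score = w[0] * point[0] + w[1] * point[1]
    return 1 if score > cutoff else 2

group1 = [(4, 2), (5, 3), (6, 2), (5, 1)]   # illustrative observations
group2 = [(1, 4), (2, 5), (3, 4), (2, 3)]

w, cutoff = fisher_direction(group1, group2)
print("weights:", w, "cutoff:", cutoff)
print([classify(p, w, cutoff) for p in group1 + group2])
```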
Conjoint Analysis
• It is concerned with the measurement of psychological judgments, such as
consumer preferences.
• In conjoint analysis, researchers try to decompose the overall responses so
that the utility of each attribute can be inferred. There are two types of
conjoint analysis:
i. Metric conjoint analysis – where the dependent variable has a metric
value
ii. Non-metric conjoint analysis – where the dependent variable is
non-metric in nature
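For the metric case, the utility (part-worth) of each attribute level can be sketched very simply when every combination of levels has been rated once (a balanced full factorial design): the part-worth of a level is the mean rating of the profiles containing it minus the grand mean. The ratings, attribute names and the function `part_worths` below are invented for illustration; real studies usually estimate part-worths by regression on a fractional design.

```python
from statistics import mean

# Hypothetical ratings (1-10) given by one respondent to every product profile
ratings = {
    ("low price", "brand A"): 9,
    ("low price", "brand B"): 7,
    ("high price", "brand A"): 5,
    ("high price", "brand B"): 3,
}

def part_worths(ratings):
    """Main-effect utilities: level mean minus grand mean (balanced design only)."""
    grand = mean(list(ratings.values()))
    n_attrs = len(next(iter(ratings)))
    worths = {}
    for i in range(n_attrs):
        levels = {profile[i] for profile in ratings}
        for level in sorted(levels):
            level_mean = mean([r for p, r in ratings.items() if p[i] == level])
            worths[level] = level_mean - grand
    return worths

worths = part_worths(ratings)
print(worths)

# The overall rating is rebuilt as grand mean + sum of part-worths
grand = mean(list(ratings.values()))
print(grand + worths["low price"] + worths["brand A"])   # 9
```

The larger spread of the price part-worths suggests that, for this invented respondent, price matters more than brand.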
Multidimensional Scaling (MDS)
• It originated in the study of psychometrics and was first proposed by
Torgerson. MDS can be described as an alternative to factor analysis. It refers
to a set of methods used to obtain a spatial representation of the similarities
or proximities between the objects in a data set.
• The goal of MDS is to use the proximities or similarities between objects to
create a map of appropriate dimensionality such that distances in the map
closely resemble the similarities that are used to create it.
• It is also described as a set of multivariate statistical methods for estimating
the parameters, and assessing the fit, of various spatial distance models for
proximity data.
• It is also described as a decompositional approach that uses perceptual
mapping to present the dimensions.
• It is further classified into
o Metric multidimensional scaling – similarity data are treated as
reflecting the actual distances between the objects and are used to
recover the underlying configuration
o Non-metric multidimensional scaling – the rank order of proximities is
used as the base data to explore the underlying configuration.
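The metric case can be sketched with classical (Torgerson) scaling: square the distances, double-center them to obtain the matrix B = -1/2 J D^2 J, and take coordinates from the leading eigenvector of B scaled by the square root of its eigenvalue. The pure-Python sketch below recovers a one-dimensional map only, and uses power iteration as a stand-in for a proper eigenvalue routine; the function name and the distance matrix are illustrative assumptions.

```python
from math import sqrt

def classical_mds_1d(dist, iterations=200):
    """One-dimensional classical (Torgerson) scaling of a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenvector of B
    # (assumes the starting vector is not orthogonal to that eigenvector)
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return [x * sqrt(eigenvalue) for x in v]

# Illustrative distances between three objects that lie on a line at 0, 3 and 7
distances = [
    [0, 3, 7],
    [3, 0, 4],
    [7, 4, 0],
]
coords = classical_mds_1d(distances)
print([round(c, 3) for c in coords])
```

The recovered map is centered on zero and is determined only up to reflection, so it is the distances between the coordinates, not the coordinates themselves, that reproduce the input.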
Cluster Analysis
Cluster analysis involves grouping similar observations into clusters or
categories, based on the similarities between observations. Cluster analysis is
undertaken with the objective of addressing the heterogeneity of data. The three
main clustering methods are
i. Hierarchical
ii. Non-hierarchical
iii. Combination of both
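As a small illustration of the hierarchical method, the sketch below performs agglomerative clustering with single linkage on one-dimensional data: each observation starts as its own cluster, and the two closest clusters are merged repeatedly until the requested number remains. The function name `single_linkage`, the choice of single linkage and the data are assumptions made for illustration.

```python
def single_linkage(points, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

data = [1.0, 1.5, 2.0, 8.0, 8.4, 15.0]   # illustrative observations
print(single_linkage(data, 3))
# [[1.0, 1.5, 2.0], [8.0, 8.4], [15.0]]
```

A non-hierarchical method such as k-means would instead fix the number of clusters in advance and reassign observations to the nearest cluster centre until the assignments stop changing.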