Data analysis PTK48

PTK Course for Local Governments, UTM, Skudai, 20-25/11/2008
RESEARCH MANAGEMENT
DATA ANALYSIS
Assoc. Prof. Dr. Abdul Hamid b. Hj. Mar Iman,
Centre for Real Estate Studies,
Faculty of Geoinformation Engineering & Sciences,
Universiti Teknologi Malaysia.
E-mail: hamidiman@utm.my
Web: http://ac.utm.my/web/hamidiman
(C) Copyrights of the Author. No part of materials in these slides should be
extracted in any electronic or non-electronic method without permission from
the Author.
Definition
 A systematic programme of planning, coordinating,
implementing, and controlling the knowledge process
through information development, with a view to
obtaining a strategic fit between an organisation’s
goals and its internal capabilities.
 It is basically practice-related research
management.
 The nature of the research may be fundamental,
developmental, or commercial.
Purpose of Research
 Intelligence purposes
 Ad-hoc or planned problem-solving.
 Strengthening overall research programs
within a particular organisation.
 Enhancing organisational capabilities, e.g.
→ medium-term & long-term planning,
strategy, decision-making ability, etc.
 Else?
Basic Structure of Research Unit
Components of the structure:
 Institutional links
 Resources
 Targets (e.g. groups)
 Supervisory system
 Administration, rules & regulations
 Research training program
Questions to ask:
 The ‘state of affairs’ of each component of this structure?
 What, how, how much, when, and who to improve?
 Possible outcomes & obstacles?
Developing Research Skills
 Organisational research philosophy.
 Strategic research areas.
 Proper administrative structure.
 Adequate & good facilities.
 Qualified staff.
 Research training:
* Research programs;
* Research methodology;
* Intelligence gathering & ad-hoc research;
* Information management.
 Funding.
Organisational Research Philosophy
 Go beyond administrative functions.
 Producing practice-related research outcomes.
 Fulfilling organisational mission.
 Directed research:
* Problem-solving research.
* Industry orientation (applied research),
aligning with government’s policies & within
the ambit of organisational policies.
What is Directed Research?
♦ Research that is pivoted on the priority areas of an
organisation in which it has the expertise, resources, and
institutional set-up readily available.
♦ To help an organisation focus on some strategic research
areas, reflecting its research niches and strength and thus
giving it competitive advantages in those areas.
♦ These focus areas are the “shooting targets” at which Key
Performance Indicators (KPI) are used to gauge
institutional achievement.
♦ Can be implemented in collaboration with universities
through Intensification of Research in Priority Areas
(IRPA), E-Science, Technofund, and the National Property
Research Coordinator (NAPREC), etc.
Strategic Research Areas
 Need for strategic research planning.
 Purpose: to identify research niches, strengths, and thus,
comparative advantages.
 Should be established at departmental level.
 Example:
* Set research mission, goal & objectives, portfolio &
functional strategies;
* Establish two-tier research programs: (1) priority
research; (2) fundamental research;
* Documentation of departmental strategy.
Strategic Research Planning
Assoc. Prof. Dr. Abdul Hamid b. Hj. Mar Iman
Centre for Real Estate Studies
Faculty of Engineering and Geoinformation Science
Universiti Teknologi Malaysia
Skudai, Johor
Objectives
 Overall: Reinforce your understanding from the main
lecture
 Specific:
* Concepts of data analysis
* Some data analysis techniques
* Some tips for data analysis
 What I will not do:
* Teach every bit and piece of statistical analysis
techniques
Data analysis – “The Concept”
 Approach to de-synthesizing data, informational,
and/or factual elements to answer research
questions
 Method of putting together facts and figures
to solve research problem
 Systematic process of utilizing data to address
research questions
 Breaking down research issues through utilizing
controlled data and factual information
Categories of data analysis
 Narrative (e.g. laws, arts)
 Descriptive (e.g. social sciences)
 Statistical/mathematical (pure/applied sciences)
 Audio-Optical (e.g. telecommunication)
 Others
Most research analyses, arguably, adopt the first
three.
The second and third are, arguably, most popular
in pure, applied, and social sciences
Statistical Methods
 Something to do with “statistics”
 Statistics: “meaningful” quantities about a sample of objects,
things, persons, events, phenomena, etc.
 Widely used in social sciences.
 Simple to complex issues. E.g.
* correlation
* ANOVA
* MANOVA
* regression
* econometric modelling
 Two main categories:
* Descriptive statistics
* Inferential statistics
Descriptive Statistics
 Use sample information to explain/make abstraction of
population “phenomena”.
 Common “phenomena”:
 * Association (e.g. σ1,2.3 = 0.75)
 * Tendency (left-skew, right-skew)
 * Causal relationship (e.g. if X, then, Y)
 * Trend, pattern, dispersion, range
 Used in non-parametric analysis (e.g. chi-square, t-test,
2-way anova)
 Basically non-parametric
Examples of “abstraction” of phenomena
[Figure: number of houses by district (Batu Pahat, Johor Bahru, Kluang, Kota Tinggi, Mersing, Muar, Pontian, Segamat), 1991 vs 2000]
[Figure: trends in property loan, shop house demand & supply, 1990-1997]
Chart data (Year 1 to 8 = 1990 to 1997):
Year   Loan to property sector (RM million)   Demand for shop houses (units)   Supply of shop houses (units)
1      32635.8                                71719                            85534
2      38100.6                                73892                            85821
3      42468.1                                85843                            90366
4      47684.7                                95916                            101508
5      48408.2                                101107                           111952
6      61433.6                                117857                           125334
7      77255.7                                134864                           143530
8      97810.1                                86323                            154179
[Figure: proportion (%) by age category (years old)]
[Figure: price (RM/sq. ft of built area) against demand (% sales success)]
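A minimal sketch of how such an “abstraction” could be computed, using the loan and supply figures for 1990-1997 from the chart data above. The helper names (`mean`, `median`, `pearson_r`) are illustrative only, and comparing the mean with the median is just a rough indication of skew, not a formal test.

```python
from math import sqrt

# Figures read from the chart data above (years 1990-1997).
loans = [32635.8, 38100.6, 42468.1, 47684.7,
         48408.2, 61433.6, 77255.7, 97810.1]      # RM million
supply = [85534, 85821, 90366, 101508,
          111952, 125334, 143530, 154179]         # units


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def median(values):
    """Middle value (average of the two middle values if the count is even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


loan_mean = mean(loans)                    # central tendency
loan_median = median(loans)
value_range = max(loans) - min(loans)      # dispersion
r = pearson_r(loans, supply)               # association

print(f"mean={loan_mean:.1f} median={loan_median:.1f} range={value_range:.1f}")
print("mean above median (suggests right skew):", loan_mean > loan_median)
print(f"correlation between loans and supply: {r:.3f}")
```

These numbers only describe the eight observations; by themselves they do not say that loans cause supply.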
Examples of “abstraction” of phenomena
[Figure: distance from Rakaia (km) against demand (% sales success)]
[Figure: % prediction error against distance from Ashburton (km)]
Inferential statistics
 Using sample statistics to infer some “phenomena” of
population parameters
 Common “phenomena”: cause-and-effect
* One-way relationship:
Y = f(X)
* Multi-directional relationship:
Y1 = f(Y2, X, e1)
Y2 = f(Y1, Z, e2)
* Recursive relationship:
Y1 = f(X, e1)
Y2 = f(Y1, Z, e2)
 Use parametric analysis
Examples of relationship
[Figure: fitted trend lines Dep = 9t - 215.8 and Dep = 7t - 192.6]
Coefficients (a), Model 1
              Unstandardized Coefficients   Standardized Coefficients
Variable      B           Std. Error        Beta       t          Sig.
(Constant)    1993.108    239.632                      8.317      .000
Tanah         -4.472      1.199             -.190      -3.728     .000
Bangunan      6.938       .619              .705       11.209     .000
Ansilari      4.393       1.807             .139       2.431      .017
Umur          -27.893     6.108             -.241      -4.567     .000
Flo_go        34.895      89.440            .020       .390       .697
a. Dependent Variable: Nilaism
(Variable names are in Malay: Tanah = land, Bangunan = building, Ansilari = ancillary, Umur = age.)
Which one to use?
 Nature of research
* Descriptive in nature?
* Attempts to “infer”, “predict”, find “cause-and-effect”,
“influence”, “relationship”?
* Is it both?
 Research design (incl. variables involved). E.g.
 Outputs/results expected
* research issue
* research questions
* research hypotheses
At post-graduate level research, failure to choose the correct data
analysis technique is an almost sure ingredient for thesis failure.
Common mistakes in data analysis
 Wrong techniques. E.g.
Issue: To study factors that “influence” visitors to come to a recreation site
* Wrong technique: Likert scaling based on interviews
* Correct technique: Data tabulation based on open-ended questionnaire survey
Issue: “Effects” of KLIA on the development of Sepang
* Wrong technique: Likert scaling based on interviews
* Correct technique: Descriptive analysis based on ex-ante/ex-post experimental investigation
Note: No way can Likert scaling show “cause-and-effect” phenomena!
 Infeasible techniques. E.g.
How to design ex-ante effects of KLIA? Development occurs
“before” and “after”! What is the control treatment?
Further explanation!
 Abuse of statistics. E.g.
 Simply exclude a technique
Common mistakes (contd.) – “Abuse of statistics”
Issue: Measure the “influence” of a variable on another
* Example of abuse: Using partial correlation (e.g. Spearman coeff.)
* Correct technique: Using a regression parameter
Issue: Finding the “relationship” between one variable and another
* Example of abuse: Multi-dimensional scaling, Likert scaling
* Correct technique: Simple regression coefficient
Issue: To evaluate whether a model fits data better than the other
* Example of abuse: Using R²
* Correct technique: Many, among others: Box-Cox χ² test for model equivalence
Issue: To evaluate accuracy of “prediction”
* Example of abuse: Using R² and/or F-value of a model
* Correct technique: Hold-out sample’s MAPE
Issue: “Compare” whether a group is different from another
* Example of abuse: Multi-dimensional scaling, Likert scaling
* Correct technique: Many, among others: two-way ANOVA, χ², Z test
Issue: To determine whether a group of factors “significantly influence” the observed phenomenon
* Example of abuse: Multi-dimensional scaling, Likert scaling
* Correct technique: Many, among others: MANOVA, regression
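The difference between in-sample R² and a hold-out sample’s MAPE (mean absolute percentage error) can be sketched as follows. This is an illustration only: it fits a straight line of shop house supply on property loans (figures from the earlier chart data), and the split into six fitting years and two hold-out years is an assumption made for the example, not the author’s model.

```python
# Figures from the earlier chart data (years 1990-1997).
loans = [32635.8, 38100.6, 42468.1, 47684.7,
         48408.2, 61433.6, 77255.7, 97810.1]      # RM million
supply = [85534, 85821, 90366, 101508,
          111952, 125334, 143530, 154179]         # units


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(actual, fitted):
    """Share of the variation in `actual` explained by `fitted`."""
    m = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - m) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Assumed split: fit on the first six years, hold out the last two.
x_fit, y_fit = loans[:6], supply[:6]
x_hold, y_hold = loans[6:], supply[6:]

a, b = fit_line(x_fit, y_fit)
r2 = r_squared(y_fit, [a + b * x for x in x_fit])            # how well the line fits the data it saw
holdout_mape = mape(y_hold, [a + b * x for x in x_hold])     # how well it predicts data it did not see

print(f"in-sample R2 = {r2:.3f}")
print(f"hold-out MAPE = {holdout_mape:.1f}%")
```

R² is computed on the same observations used for fitting, so it says nothing directly about prediction; the hold-out MAPE is computed only on observations the model never used.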
How to avoid mistakes - Useful tips
 Crystallise the research problem → make sure it is operable!
 Read literature on data analysis techniques.
 Evaluate various techniques that can do similar
things with respect to the research problem
 Know what a technique does and what it doesn’t
 Consult people, esp. supervisor
 Pilot-run the data and evaluate results
 Don’t do research??
Principles of analysis
 Goal of an analysis:
* To explain cause-and-effect phenomena
* To relate research with real-world event
* To predict/forecast the real-world
phenomena based on research
* Finding answers to a particular problem
* Making conclusions about real-world event
based on the problem
* Learning a lesson from the problem
Principles of analysis (contd.)
 Data can’t “talk”
 An analysis contains some aspects of scientific
reasoning/argument:
* Define
* Interpret
* Evaluate
* Illustrate
* Discuss
* Explain
* Clarify
* Compare
* Contrast
Principles of analysis (contd.)
 An analysis must have four elements:
* Data/information (what)
* Scientific reasoning/argument (what?
who? where? how? what happens?)
* Finding (what results?)
* Lesson/conclusion (so what? so how?
therefore,…)
 Example
Principles of data analysis
 Basic guide to data analysis:
* “Analyse” NOT “narrate”
* Go back to research flowchart
* Break down into research objectives and
research questions
* Identify phenomena to be investigated
* Visualise the “expected” answers
* Validate the answers with data
* Don’t tell something not supported by
data
Principles of data analysis (contd.)
Shoppers          Number
Male, old         6
Male, young       4
Female, old       10
Female, young     15
More female shoppers than male shoppers
More young female shoppers than young male shoppers
Young male shoppers are not interested in shopping at the shopping complex
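A small sketch of “validate the answers with data”, using the shopper counts above. The dictionary layout and variable names are illustrative. The counts can confirm or reject statements about numbers of shoppers, but they contain nothing about shoppers’ interest, so a statement about interest cannot be checked against them.

```python
# Shopper counts from the table above, keyed by (sex, age group).
counts = {
    ("Male", "Old"): 6,
    ("Male", "Young"): 4,
    ("Female", "Old"): 10,
    ("Female", "Young"): 15,
}

# Totals by sex.
female_total = sum(n for (sex, _), n in counts.items() if sex == "Female")
male_total = sum(n for (sex, _), n in counts.items() if sex == "Male")

# Statement 1: more female shoppers than male shoppers.
more_female = female_total > male_total

# Statement 2: more young female shoppers than young male shoppers.
more_young_female = counts[("Female", "Young")] > counts[("Male", "Young")]

print("female:", female_total, "male:", male_total)
print("statement 1 supported:", more_female)
print("statement 2 supported:", more_young_female)
```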
Data analysis (contd.)
 When analysing:
* Be objective
* Accurate
* True
 Separate facts and opinion
 Avoid “wrong” reasoning/argument. E.g. mistakes in
interpretation.
Some Principles of Statistical Methods in Data Analysis
What is Statistics
 “Meaningful” quantities about a sample of objects,
things, persons, events, phenomena, etc.
 Something to do with “data”
 Widely used in various scientific disciplines.
 Used to solve simple to complex issues.
 Three main categories:
* Descriptive statistics
* Inferential statistics
* Probability theory
Forms of “Statistical” Relationship
 Relationship can be non-parametric or parametric
 Examples of non-parametric relationships:
* Correlation
* Contingency
 Examples of parametric relationships → cause-and-effect:
* Causal
* Feedback
* Multi-directional
* Recursive
 The “parametric” categories are normally dealt with
through regression
Non-Parametric Data Analysis Methods – A Summary
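Scale of measurement: Nominal
* One-sample: Binomial test; one-way contingency table
* Single treatment repeat measures: McNemar test
* Multiple treatment repeat measures: Cochran Q test
* Two independent samples: Two-way contingency table
* K independent samples: Contingency table
* Measures of association: Contingency coefficients
Scale of measurement: Ordinal
* One-sample: Runs test
* Single treatment repeat measures: Wilcoxon signed rank test
* Multiple treatment repeat measures: Friedman test
* Two independent samples: Mann-Whitney test
* K independent samples: Kruskal-Wallis test
* Measures of association: Spearman rank correlation
Scale of measurement: Interval/ratio
* One-sample: Z- or t-test of variance
* Single treatment repeat measures: Paired t-test
* Multiple treatment repeat measures: Repeat measures ANOVA
* Two independent samples: Unpaired t-test; tests of variance
* K independent samples: ANOVA
* Measures of association: Regression, Pearson correlation, time series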
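The summary above works as a two-way lookup: scale of measurement first, then sample design. A minimal sketch of that lookup is given below; the function name `suggest_test` and the lower-case keys are assumptions made for the illustration, and only part of the table is encoded to keep it short.

```python
# Part of the summary table, keyed by scale of measurement and sample design.
TESTS = {
    "nominal": {
        "one-sample": "Binomial test; one-way contingency table",
        "single treatment repeat measures": "McNemar test",
        "multiple treatment repeat measures": "Cochran Q test",
        "two independent samples": "Two-way contingency table",
    },
    "ordinal": {
        "one-sample": "Runs test",
        "single treatment repeat measures": "Wilcoxon signed rank test",
        "multiple treatment repeat measures": "Friedman test",
        "two independent samples": "Mann-Whitney test",
        "k independent samples": "Kruskal-Wallis test",
        "measures of association": "Spearman rank correlation",
    },
    "interval/ratio": {
        "single treatment repeat measures": "Paired t-test",
        "multiple treatment repeat measures": "Repeat measures ANOVA",
        "k independent samples": "ANOVA",
    },
}


def suggest_test(scale, design):
    """Return the technique listed in the table for a scale and a design."""
    try:
        return TESTS[scale.lower()][design.lower()]
    except KeyError:
        raise ValueError(f"no entry for scale={scale!r}, design={design!r}") from None


print(suggest_test("Ordinal", "two independent samples"))
print(suggest_test("Nominal", "single treatment repeat measures"))
```

The lookup only names a candidate technique; whether it suits a given study still depends on the research questions and design.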
Parametric Analysis - Regression
The significance of each variable to the model can be determined by looking at the “t”
values.
Coefficients (a), Model 1
              Unstandardized Coefficients   Standardized Coefficients
Variable      B            Std. Error       Beta       t          Sig.
(Constant)    29680.695    2885.532                    10.286     .000
AGE           -705.817     38.491           -.212      -18.337    .000
NB211001      12374.064    2176.815         .061       5.684      .000
NB211002      -1094.891    1527.977         -.008      -.717      .474
NB211003      -938.838     1136.671         -.010      -.826      .409
NB211005      12639.946    2139.489         .066       5.908      .000
NB211006      852.109      2535.266         .004       .336       .737
SQFT1         31.388       7.815            .039       4.016      .000
SQFT2         44.166       1.365            .595       32.349     .000
SQFT3         52.939       1.265            .808       41.857     .000
SQFT4         60.447       3.561            .164       16.974     .000
SQFT5         94.723       2.943            .312       32.186     .000
LAND75        11.788       .433             .303       27.240     .000
BATHS         7714.093     1338.204         .076       5.765      .000
POOL          13359.275    1184.469         .105       11.279     .000
GARAGE        10.750       3.137            .038       3.427      .001
a. Dependent Variable: SALEPRIC
Rule of thumb: the absolute value of each “t” score should be 2.0 or greater.
NB211002, NB211003 and NB211006 are insignificant.
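The “t” values in the coefficient table are each coefficient divided by its standard error, so the rule of thumb can be checked directly. The sketch below applies it to the B and Std. Error figures from the table; the pairing of the AGE and GARAGE labels with their rows is read from the slide layout and should be treated as an assumption, and the absolute value is used so that large negative t values also count as significant.

```python
# (variable, B, Std. Error) as read from the coefficient table above.
# Assumption: AGE is the -705.817 row and GARAGE the 10.750 row.
rows = [
    ("(Constant)", 29680.695, 2885.532),
    ("AGE", -705.817, 38.491),
    ("NB211001", 12374.064, 2176.815),
    ("NB211002", -1094.891, 1527.977),
    ("NB211003", -938.838, 1136.671),
    ("NB211005", 12639.946, 2139.489),
    ("NB211006", 852.109, 2535.266),
    ("SQFT1", 31.388, 7.815),
    ("SQFT2", 44.166, 1.365),
    ("SQFT3", 52.939, 1.265),
    ("SQFT4", 60.447, 3.561),
    ("SQFT5", 94.723, 2.943),
    ("LAND75", 11.788, 0.433),
    ("BATHS", 7714.093, 1338.204),
    ("POOL", 13359.275, 1184.469),
    ("GARAGE", 10.750, 3.137),
]

T_THRESHOLD = 2.0  # rule of thumb from the slide


def t_ratio(b, std_error):
    """t value of a coefficient: estimate divided by its standard error."""
    return b / std_error


# Variables whose |t| falls below the rule-of-thumb threshold.
insignificant = [name for name, b, se in rows if abs(t_ratio(b, se)) < T_THRESHOLD]

for name, b, se in rows:
    print(f"{name:<12} t = {t_ratio(b, se):8.3f}")
print("insignificant:", insignificant)
```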