The Analysis and Forecast of Chinese Population
Instructor: 王凯波
Group Members:
李文华 2009210538
杨丽丹 2009210561
宋 芹 2009210568
杨春晖 2009220200
--自强不息,厚德载物-- (Self-Discipline and Social Commitment)
IE @Applied Statistics, Group Report
OUTLINE
PART 1: Introduction (Background, Objective, Terminology)
PART 2: Descriptive Analysis
PART 3: Hypothesis Test of Male-Female Birth Rate
(Descriptive data, Exploratory data, Cause analysis)
PART 4: Fertility Comparison
(Descriptive data, Exploratory data, Cause analysis)
PART 5: Analysis of Ratio
(Descriptive data, Exploratory data, Cause analysis)
PART 6: Analysis of Death Rate
(Descriptive data, Exploratory data, Cause analysis)
PART 7: Time Series Analysis of Total Population Size
(Trend analysis, Model-based analysis)
PART 8: Conclusion
INTRODUCTION--Background


Before 1950 China had the demographic characteristics of a premodern society, with high death rates and high fertility rates. This situation kept the population size fairly stable or, at most, led to a slow increase.
After the founding of the People's Republic of China in 1949, China entered its demographic transition: first, death rates began to fall rapidly; second, fertility remained for many years at an average of about six children per woman. As a result, China experienced rapid population growth, driven by the high number of births and a sharp decline in the infant death rate.
INTRODUCTION-China Population Development
[Figure: bar chart "China Population Development". Periods shown: Spring and Autumn / Warring States, Qin-Han, Wei-Jin, Sui-Tang, Song-Yuan (AD 1100), Ming (AD 1600), Qing (AD 1800), Qing (AD 1840), Qing (AD 1911), 1953, 1977, 1982. Axis: Population (10k persons), 0 to 120000.]
INTRODUCTION-China Population Today
China now has a population of over 1.3 billion (2007), nearly 1/5 of the world population.
Most of the population (94%) lives in the east, which is more developed and has a relatively lower death rate and a lower infant death rate.
INTRODUCTION--Objective
This report applies statistical methods to data from the CHINA POPULATION STATISTICS YEARBOOK (1995-2006). We analyze the male-female birth rate, fertility rate and death rate across area types (city, town, village) and across years, and then carry out a trend analysis and prediction of China's total population.
INTRODUCTION--Terminology



City, Town & Village: Cities and towns in China are administratively defined as statutory cities and statutory towns, judged on four aspects: population, economy, public finance and infrastructure. Village refers to areas other than cities and towns.
Birth Rate (or crude birth rate): The number of live births per 1,000 population in a given year. Not to be confused with the growth rate.
Death Rate (or crude death rate): The number of deaths per 1,000 population in a given year.
Sex Ratio: The number of males per 100 females in a population.
Fertility Rate: The number of live births per 1,000 women ages 15-44 or 15-49 years in a given year.
----Definitions from the Administrative Office of the State Council & the Population Reference Bureau, USA
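The definitions above are simple ratios, so a minimal Python sketch can make the units explicit. The function names and the sample counts are illustrative assumptions, not values from the yearbook.

```python
def crude_rate(events, population):
    """Events (births or deaths) per 1,000 population in a given year."""
    return 1000 * events / population


def sex_ratio(males, females):
    """Number of males per 100 females."""
    return 100 * males / females


def fertility_rate(live_births, women_of_childbearing_age):
    """Live births per 1,000 women of childbearing age (15-44 or 15-49)."""
    return 1000 * live_births / women_of_childbearing_age


# Illustrative counts only (hypothetical, not yearbook data).
print(crude_rate(12, 1000))       # births per 1,000 population
print(sex_ratio(105, 100))        # males per 100 females
print(fertility_rate(40, 1000))   # births per 1,000 women
```

Birth and death rates share one helper because they differ only in which events are counted.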
DESCRIPTIVE ANALYSIS-Variables
Population size of China: ×10^4 persons
Fertility rate: ‰ (1994-2005)
Male-female birth rate: boys per 100 girls (1994-2005)
Male (female) ratio of a certain age: %, the number of males of that age as a percentage of the total male population (likewise for females).
Death rate: ‰
DESCRIPTIVE ANALYSIS-Data Sheet
The data were collected from the internet, from sources such as the CHINA POPULATION STATISTICS YEARBOOK (1995-2006) (中国人口统计年鉴).
DESCRIPTIVE ANALYSIS-Population size
Observation: The population has increased continuously since 1962.
The rate of increase has declined over the last 20 years.
DESCRIPTIVE ANALYSIS- Fertility rate
Observation
The age distribution of fertility differs between area types.
The birth peak comes at an earlier age in villages than in cities.
At all ages, villages have the higher fertility rate.
DESCRIPTIVE ANALYSIS- Male/female rate of newborn
Observation
The town data fluctuate strongly, while the city and village data are stable.
All three exceed the normal range (102 to 107).
DESCRIPTIVE ANALYSIS- Male-Female rate & death rate (2005)
• Is there any gender selection?
• Do females live longer?
Male-Female Birth Rate--Data
Data selection: mainly survey and observation
• 10 variables
• 27,000 data points
Data processing
• integrate the original data (15 tables) into one table
• recalculate them to derive new data
• select data points to build a new sample, e.g. fertility rate, death rate
Population Balance: Male-Female Birth Rate
The main basis of population balance, and of great importance.
Number of baby boys per 100 baby girls:
$$\text{male-female rate} = \frac{\text{baby boys}}{\text{baby girls}} \times 100$$
(years: 1994-2005; types: city, town, village; 36 data points.)
• One-way ANOVA:
Male-female birth rate versus type (city, town and village)
$$H_0: \mu_C = \mu_T = \mu_V$$
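As a hedged illustration of the one-way ANOVA named above, the sketch below computes the F statistic for three groups in plain Python. The function name and the sample values are assumptions made for illustration; they are not the yearbook data, and the slides' own analysis appears to have been run in a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical male-female birth rates (boys per 100 girls), illustration only.
city = [110.0, 112.0, 111.0]
town = [115.0, 118.0, 114.0]
village = [120.0, 122.0, 121.0]
print(one_way_anova_f([city, town, village]))
```

A large F relative to the F distribution with (k-1, n-k) degrees of freedom leads to rejecting H0; the p-value lookup is left out of this sketch.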
Population Balance: Male-Female Birth Rate
Conclusions and cause analysis:
The three types differ significantly in gender balance and gender preference.
P-value = 0.000 < 0.05
Boy preference: village (highest) > town > city (lowest)
• The view that men are superior to women
• Farm work and lifestyle
• Education
• Medical technology (which makes gender selection easier)
Population replacement: Fertility rate
• The main basis of population balance, and of great importance.
• Number of babies per 1,000 women aged 15-49:
(years: 1994-2005; types: city, town, village; 2520 original data points.)
[Figure: four scatterplots of city, town and village fertility rate versus age (ages 10 to 50), one panel each for 2001, 2002, 2003 and 2004.]
Population replacement: Fertility rate
• Two-way ANOVA:
Fertility rate versus type (city, town and village) and year
• Interaction of type and year
Both factors are significant!
Type difference: village > town > city
Year difference: negative trend
Conclusions and cause analysis:
The downward trend suggests that the one-child policy has been effective.
The small rise and fall around 2003 may result from the SARS epidemic of 2003, which reduced social contact and the chance of pregnancy.
Population replacement: Fertility rate
Data processing
Year (2001-2005), age (with the highest fertility), type (city, town and village)
Build a new sample, "fertility peak"
• Scatter plot of fertility peak
Fertility peak (age of highest fertility) versus type and year
• The peak age is around 24
• The fertility peak decreases over time (city changes most; village is most stable)
• The village peak is the highest
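Building the "fertility peak" sample amounts to picking, for each type and year, the age with the highest fertility rate. A minimal sketch, assuming the age-specific rates are held in a dictionary keyed by age; the function name and the numbers are hypothetical.

```python
def fertility_peak(rates_by_age):
    """Return (age, rate) for the age with the highest fertility rate."""
    peak_age = max(rates_by_age, key=rates_by_age.get)
    return peak_age, rates_by_age[peak_age]


# Hypothetical age-specific fertility rates (per 1,000 women) for one type and year.
village_2003 = {22: 90.0, 23: 110.0, 24: 120.0, 25: 105.0, 26: 100.0}
print(fertility_peak(village_2003))
```

Repeating this for every type and year gives one (age, rate) pair per combination, which is the sample plotted on this slide.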
ANALYSIS of RATIO
Step 1: Data collection
Take city male ratio for example
City male ratio=
ANALYSIS of RATIO
Step 2: Descriptive data analysis
[Figure: boxplot of ratio (about 0.48 to 0.52) by area (C, T, V) within sex (F, M).]
In the city, both the male and female ratios are near 0.5, but the gap between the male and female ratios widens from town to village. Overall, there are more males than females in the population, which is one reason many young men find it hard to find "Mrs. Right".
ANALYSIS of RATIO
Step 3: Exploratory data analysis
[Figure: residual plots for ratio (normal probability plot, residuals versus fits, histogram, residuals versus order).]
ANALYSIS of RATIO
Step 3: Exploratory data analysis
[Figure: interaction plot (data means) for ratio: mean ratio (0.485 to 0.515) versus area (C, T, V), one line per sex (F, M).]
Conclusion:
• In the city, the male ratio is about equal to the female ratio, but it is larger than the female ratio in town and village.
• The gap between the male and female ratios widens from city to village.
ANALYSIS of RATIO
Step 4: Cause analysis
Reason 1: City people
• Have just one child
• Higher education
• Higher pressure in life
• DINK (dual income, no kids) families
Reason 2: Town and village people
• Have more than one child
• Value only the male child
This phenomenon is more pronounced in villages than in towns, so the gap between the male and female ratios is larger in villages than in towns.
ANALYSIS of DEATH RATE
Step 1: Data collection
Take the city male death rate as an example
City male death rate =
ANALYSIS of DEATH RATE
Step 2: Descriptive data analysis
[Figure: boxplot of death rate (about 0.20 to 0.45) by area (C, T, V) within sex (F, M).]
Observation: the male death rate is higher than the female death rate, and the death rate increases from city to village.
ANALYSIS of DEATH RATE
Step 3: Exploratory data analysis
[Figure: residual plots for death rate (normal probability plot, residuals versus fits, histogram, residuals versus order).]
ANALYSIS of DEATH RATE
Step 3: Exploratory data analysis
[Figure: interaction plot (data means) for death rate: mean death rate (0.20 to 0.40) versus area (C, T, V), one line per sex (F, M).]
Conclusion:
• The male death rate is higher than the female death rate.
• The death rate increases from city to village.
ANALYSIS of DEATH RATE
Step 4: Cause analysis
Reason 1: The male death rate is higher
• Males have just one X chromosome
• Men are the main labor force in society
• Bad habits: smoking, drinking
• Accidents, crime
Reason 2: The death rate increases from city to village
City
• Higher education
• Better living standards
• Better medical care
• Better working conditions
Generally speaking, conditions in cities are better than in towns and villages, and towns are a little better than villages.
TOTAL POPULATION SIZE PREDICT-Guideline
The analysis here is based on population data since the founding of the People's Republic of China. With 58 years of population data, we can carry out trend analysis and prediction, both qualitatively and quantitatively.
Trend analysis
• Linear, exponential, quadratic and S-curve
• Deviation analysis
ARIMA
• Stationarity
• Model determination based on ACF/PACF
• Prediction and deviation analysis
TOTAL POPULATION SIZE PREDICT- Trend object
The annually collected population data show a continuous increase with one anomalous decrease. The figure shows that the rate of increase has declined over the last 20 years.
China's population reached 132,129 × 10^4 in 2007. We still face a serious population problem, as well as the problem of an aging population.
TOTAL POPULATION SIZE PREDICT- Result of trend analysis
[Figure: four trend analysis plots for population size (actual and fitted values versus index), one per model:
Linear Trend Model: Yt = 51011 + 1458*t (MAPE 2, MAD 1633, MSD 3932649)
Growth Curve Model: Yt = 56244.4 * (1.01628**t) (MAPE 4, MAD 3525, MSD 19641671)
Quadratic Trend Model: Yt = 49384 + 1618.1*t - 2.668*t**2 (MAPE 2, MAD 1494, MSD 3454107)
S-Curve Trend Model: Yt = (10**6) / (6.07150 + 14.0151*(0.961598**t)) (curve parameters: intercept 49784, asymptote 164704, asymptotic rate 1; MAPE 2, MAD 1189, MSD 2225774)]
• Linear, exponential, quadratic and S-curve models were used to analyze the growth pattern. Parameter estimation is based on the OLS method.
• The four results were evaluated with three basic measures: MAD, MAPE and MSD. The results show that the S-curve model best fits China's pattern of rapid growth followed by slower growth.
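The four fitted equations and the three accuracy measures can be written out directly. The sketch below is a minimal Python version under two assumptions: the measure definitions follow the usual MAPE/MAD/MSD formulas, and the index t = 60 corresponds to 2008, which is inferred because it brings the S-curve value close to the 2008 figure reported in the deck.

```python
def linear(t):
    return 51011 + 1458 * t


def growth(t):
    return 56244.4 * 1.01628 ** t


def quadratic(t):
    return 49384 + 1618.1 * t - 2.668 * t ** 2


def s_curve(t):
    return 10 ** 6 / (6.07150 + 14.0151 * 0.961598 ** t)


def mape(actual, fitted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, fitted)) / len(actual)


def mad(actual, fitted):
    """Mean absolute deviation."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def msd(actual, fitted):
    """Mean squared deviation."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)


# Assumed index: t = 60 for 2008 (unit: 10^4 persons).
for model in (linear, growth, quadratic, s_curve):
    print(model.__name__, round(model(60)))
```

As t grows, the S-curve approaches 10**6 / 6.07150, which is the asymptote of about 164704 listed with the curve parameters.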
TOTAL POPULATION SIZE PREDICT-Evaluation of 4 models
Method | Equation | MAPE | MAD | MSD
linear | Yt = 51011 + 1458*t | 2 | 1633 | 3932649
exponential growth | Yt = 56244.4 * (1.01628**t) | 4 | 3525 | 19641671
quadratic trend | Yt = 49384 + 1618.1*t - 2.668*t**2 | 2 | 1494 | 3454107
S-curve | Yt = (10**6) / (6.07150 + 14.0151*(0.961598**t)) | 2 | 1189 | 2225774

The purpose of trend analysis is to predict the future, so we focus on the most recent regression deviations to evaluate the four models. That is, we only consider the deviations for 2000-2007.

Method | MAPE | MSE | MAD
linear | 1.8 | 8193544 | 2385
exponential growth | 6.4 | 80808235 | 8344
quadratic trend | 1.2 | 3602967 | 1518
S-curve | 0.7 | 1144129 | 872

S-curve forecast (unit: 1×10^4):
Year | 2008 | 2009 | 2010 | 2011 | 2012
Population size | 134975 | 135918 | 136836 | 137731 | 138603
TOTAL POPULATION SIZE PREDICT-ARIMA Description
ARIMA was developed by Box and Jenkins in the 1970s. It is a well-known model in time series analysis that combines autoregression, moving averages and differencing to handle nonstationary time series data.
An ARIMA(p,d,q) model is determined in 3 steps:
• Stationarity transformation
• Model determination
• Parameter estimation
TOTAL POPULATION SIZE PREDICT-Stationary Test
A stationary time series is one whose mean does not change over time and whose standard deviation stays within a limited range.
[Figure: time series plot of C2, 1950-2004 (values from about -1000 to 2500).]
An obvious increasing trend was observed, so differencing is needed to transform the nonstationary series into a stationary one. But what order of differencing is needed?
The Augmented Dickey-Fuller (ADF) test is run in Matlab to test whether the series has a unit root.
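The differencing step itself is easy to sketch. The Python function below applies first differences repeatedly; the name `difference` and the sample series are illustrative assumptions, and the ADF test (run in Matlab in this report) is not reproduced here.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: x_t - x_{t-1}."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


# Illustrative series with a curved upward trend (not population data).
levels = [1, 4, 9, 16, 25]
print(difference(levels))            # first differences: yearly increase
print(difference(levels, order=2))   # second differences
```

Each pass shortens the series by one value, so a series of n observations leaves n - d values after differencing of order d.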
TOTAL POPULATION SIZE PREDICT-One order difference solution
$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \varepsilon_t$$
H0: the series has a unit root.
H1: the series has no unit root.
The P-value is smaller than 0.05, so we reject the null hypothesis: the series does not have a unit root.
It passes the ADF test, and we move on to model determination.
TOTAL POPULATION SIZE PREDICT-One order difference solution
After the differencing order d=1 is determined, the ARIMA model reduces to an ARMA model for the differenced series. Model determination is based on the ACF and PACF.
[Figure: autocorrelation function for C1 (lags 1 to 24) and partial autocorrelation function for C1 (lags 1 to 15), each with 5% significance limits.]
The ACF tails off and the PACF cuts off, which indicates an AR(1) model. So, following the Box-Jenkins approach, the model is ARIMA(1,1,0).
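For readers who want to see what the ACF plot measures, here is a minimal Python sketch of the sample autocorrelation function. The function name and the test series are assumptions for illustration; the plots in this report were produced by a statistics package, and the PACF is not computed here.

```python
def acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)          # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


# Illustrative series only (not the differenced population data).
print(acf([1, 2, 3, 4, 5], max_lag=2))
```

An ACF that decays slowly across many lags, together with a PACF that drops to near zero after lag 1, is the pattern the slide reads as AR(1).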
TOTAL POPULATION SIZE PREDICT-One order difference solution
The residual check shows an outlier at 1961, far from the normality line. What shall we do?
• Transform the series?
• Cut the series?
We cut the series and use only 1962-2007.
[Figure: residual plots for C1 (normal probability plot, residuals versus fits, histogram, residuals versus order).]
TOTAL POPULATION SIZE PREDICT-Two order difference solution
ADF test?
The decision vector H shows that the 1962-2007 segment is no longer a stationary series.
Observation also shows a decreasing trend, so higher-order differencing is needed.
[Figure: time series plot of C10, 1963-2008 (values from about -500 to 750).]
TOTAL POPULATION SIZE PREDICT-Two order difference solution
[Figure: autocorrelation function (lags 1 to 24) and partial autocorrelation function (lags 1 to 11) for 2I, each with 5% significance limits.]
The ARMA model has p=3, q=1, based on the Box-Jenkins method.
SS = 1114603, MS = 27865.
[Figure: residual plots for 2I (normal probability plot, residuals versus fits, histogram, residuals versus order).]
TOTAL POPULATION SIZE PREDICT-Two order difference solution
ARIMA Model: C1
Final Estimates of Parameters
Type Coef SE Coef T P
AR 1 0.9138 0.0714 12.88 0.000
Constant 124.68 31.32 3.37 0.002
Mean 1313.7 390.3
Number of observations: 46
Residuals: SS = 1773508 (backforecasts excluded)
MS = 40307 DF = 44
y_t = (±)21.74 (±) 0.8538 y_{t-1} (±) 0.0203 y_{t-2} (±) 0.5614 y_{t-3} (±) 0.5958 z_{t-1}
where (±) marks an operator whose sign is not legible in the source.
The population increase can be estimated with the function above, giving future values from historical ones.
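To show how a fitted ARMA(3,1) equation on the twice-differenced series turns into population forecasts, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name, the convention that the moving-average term is added, the choice to set future shocks to zero, and the coefficients and series in the usage lines. The signs of the fitted coefficients on this slide are not recoverable, so they are not used.

```python
def forecast_arima_321(series, const, ar, ma, steps):
    """Forecast `steps` future levels from an ARMA(3,1) model
    fitted to the second differences of `series`.

    const : constant term of the ARMA equation
    ar    : three autoregressive coefficients (lags 1, 2, 3)
    ma    : moving-average coefficient on the previous shock (added)
    """
    d1 = [b - a for a, b in zip(series, series[1:])]   # first differences
    d2 = [b - a for a, b in zip(d1, d1[1:])]           # second differences

    # In-sample shocks: one-step-ahead errors, started at zero.
    shocks = [0.0] * len(d2)
    for t in range(3, len(d2)):
        fitted = const + sum(ar[k] * d2[t - 1 - k] for k in range(3)) + ma * shocks[t - 1]
        shocks[t] = d2[t] - fitted

    y, z = list(d2), list(shocks)
    last_increase, last_level = d1[-1], series[-1]
    forecasts = []
    for _ in range(steps):
        nxt = const + sum(ar[k] * y[-1 - k] for k in range(3)) + ma * z[-1]
        y.append(nxt)
        z.append(0.0)                 # future shocks are assumed to be zero
        last_increase += nxt          # undo the second difference
        last_level += last_increase   # undo the first difference
        forecasts.append(last_level)
    return forecasts


# Hypothetical series and coefficients, for illustration only.
levels = [10, 12, 14, 16, 18, 20]
print(forecast_arima_321(levels, const=0.0, ar=(0.5, 0.0, 0.0), ma=0.0, steps=3))
```

The two running sums at the end of the loop are what turn a forecast of the second difference back into a forecast of the yearly increase and then of the total population.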
[Figure: three time series plots: 2I with FITS1; the 1962-2007 series with C11; C5 with C7.]
TOTAL POPULATION-Prediction and Deviation Analysis
Method | MAPE | MSD | MAD
S-curve | 0.7 | 1144129 | 872
ARIMA(1,1,0) | 0.06 | 6058 | 73
Improvement (over S-curve) | 0.914286 | 0.994705 | 0.916284
(2000-2007)
The future increase and its 95% CI can be predicted, and the total population can then be obtained accordingly. Relative to the S-curve, MAPE dropped by about 91%, MSD by about 99% and MAD by about 92%.
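The improvement row is the relative reduction of each error measure with respect to the S-curve. A short Python check, using the values from the table above (the function name is an assumption):

```python
def relative_improvement(baseline, improved):
    """Fraction by which an error measure dropped relative to the baseline."""
    return (baseline - improved) / baseline


# (S-curve value, ARIMA value) for 2000-2007, taken from the table.
measures = {"MAPE": (0.7, 0.06), "MSD": (1144129, 6058), "MAD": (872, 73)}
for name, (s_curve_value, arima_value) in measures.items():
    print(name, round(relative_improvement(s_curve_value, arima_value), 6))
```

The three results agree with the improvement row to six decimal places.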
[Figure: three time series plots with prediction limits: 2I with C19, C20, C21; pred, pred-lim and pred-up with the 1962-2007 series; total population with C12, C13, C14.]
TOTAL POPULATION PREDICT-ARIMA Result
[Figure: time series plot of C9, C10, C11, C12 versus year, 2000-2012 (values from about 126000 to 138000).]
Forecasts (unit: 1×10^4):
Year | 2008 | 2009 | 2010 | 2011 | 2012
ARIMA(3,2,1) | 132,790 | 133,404 | 134,030 | 134,614 | 135,186
S-curve | 134,975 | 135,918 | 136,836 | 137,731 | 138,603
TOTAL POPULATION PREDICT-Conclusion
Features: total population, population increase
What's right now?
• The Chinese population makes up nearly 1/5 of the world population.
• The post-80s generation has entered its peak childbearing years, which keeps population growth relatively high.
• Health care has improved greatly, and people will live much longer.
What's in the future?
• Around 2050 China will face its first population decrease.
• Social problems as well as economic challenges will appear.
Economic growth? Social welfare? Stability?
Is it necessary for the government to revise the birth control policy to keep China's population and its growth within a reasonable range?
Thanks!