The Analysis and Forecast of Chinese Population
Instructor: 王凯波
Group Members: 李文华 2009210538, 杨丽丹 2009210561, 宋芹 2009210568, 杨春晖 2009220200
--Self-Discipline and Social Commitment (自强不息,厚德载物)-- IE @Applied Statistics, Group Report

OUTLINE
PART 1: Introduction (Background, Objective, Terminology)
PART 2: Descriptive Analysis
PART 3: Hypothesis Test of Male-Female Birth Rate (Descriptive data, Exploratory data, Cause analysis)
PART 4: Fertility Comparison (Descriptive data, Exploratory data, Cause analysis)
PART 5: Analysis of Ratio (Descriptive data, Exploratory data, Cause analysis)
PART 6: Analysis of Death Rate (Descriptive data, Exploratory data, Cause analysis)
PART 7: Time Series Analysis of Total Population Size (Trend analysis, Model-based analysis)
PART 8: Conclusion

INTRODUCTION--Background
Before 1950, China had the demographic characteristics of a premodern society, with high death rates and high fertility rates. This kept the population size stable or, at most, slowly increasing. After the founding of the People's Republic of China in 1949, China entered its demographic transition: first, death rates began to fall rapidly; second, fertility remained for many years at an average of about six children per woman. As a result, China experienced rapid population growth, driven by the high number of births and a sharp decline in the infant death rate.

INTRODUCTION-China Population Development
[Figure: China Population Development. Bar chart of population (10k persons, axis 0 to 120,000) by period: Spring and Autumn/Warring States, Qin-Han, Wei-Jin, Sui-Tang, Song-Yuan (AD 1100), Ming (AD 1600), Qing (AD 1800), Qing (AD 1840), Qing (AD 1911), 1953, 1977, 1982.]

INTRODUCTION-China Population Today
China now has a population of over 1.3 billion (2007), nearly 1/5 of the world population.
Most of the population (94%) lives in the east, which is more developed and has a relatively lower death rate and a lower infant death rate.

INTRODUCTION--Objective
This report applies statistical methods to data from the CHINA POPULATION STATISTICS YEARBOOK (1995-2006). We analyze the male-female birth rate, fertility rate and death rate across areas (city, town, village) and across years, and then carry out a trend analysis and forecast of China's total population.

INTRODUCTION--Terminology
City, Town & Village (城市,乡镇,农村): Cities and towns in China are administratively defined as statutory cities and statutory towns, judged on four aspects: population, economy, public finance and infrastructure. Village refers to the areas other than cities and towns.
Birth Rate (or crude birth rate, 出生率): The number of live births per 1,000 population in a given year. Not to be confused with the growth rate.
Death Rate (or crude death rate, 死亡率): The number of deaths per 1,000 population in a given year.
Sex Ratio (出生人口性别比): The number of males per 100 females in a population.
Fertility Rate (生育率): The number of live births per 1,000 women ages 15-44 or 15-49 years in a given year.
----Definitions from the Administrative Office of the State Council & Population Reference Bureau, USA

DESCRIPTIVE ANALYSIS-Variables
• Population size of China: unit 1×10^4 persons
• Fertility rate (生育率): ‰ (1994-2005)
• Male-female birth rate: males per 100 females (F:100) (1994-2005)
• Male (female) ratio of a certain age: %, the number of males (females) of that age as a percentage of the total male (female) population
• Death rate: ‰

DESCRIPTIVE ANALYSIS-Data Sheet
The data were collected from online sources such as the CHINA POPULATION STATISTICS YEARBOOK (1995-2006) (中国人口统计年鉴).

DESCRIPTIVE ANALYSIS-Population size
Observation:
• Continuous increase since 1962
• The rate of increase has declined over the last 20 years

DESCRIPTIVE ANALYSIS-Fertility rate
Observation:
• The age distribution of fertility differs by area.
• The birth peak comes earlier in villages than in cities.
• At all ages, villages have a higher fertility rate.

DESCRIPTIVE ANALYSIS-Male/female rate of newborn
Observation:
• Town data fluctuate, while city and village data are stable
• All exceed the normal range (102 to 107)

DESCRIPTIVE ANALYSIS-Male-Female rate & death rate (2005)
• Is there any gender selection?
• Do females live longer?

Male-Female Birth Rate--Data
Data selection: mainly survey and observation
• 10 variables
• 27,000 data points
Data processing:
• integrate the original data (15 forms) into one form
• recalculate them to derive new data
• select data points to build a new sample, e.g. fertility rate, death rate

Population Balance: Male-Female Birth Rate
Main basis of population balance, of great importance.
Number of baby boys per 100 baby girls:
$$\text{male-female rate} = \frac{\text{baby boys}}{\text{baby girls}} \times 100$$
(Years: 1994-2005. Type: city, town, village. 36 data points.)
• One-way ANOVA: male-female birth rate versus type (city, town and village)
$$H_0: \mu_C = \mu_T = \mu_V$$

Population Balance: Male-Female Birth Rate
Conclusions and cause analysis:
The three types differ significantly in gender balance and gender choice.
P-value = 0.000 < 0.05
Boy preference: village (highest) > town > city (lowest)
• The view that men are superior to women
• Farm work and lifestyle
• Education
• Medical technology (makes gender selection more effective)

Population replacement: Fertility rate
• Main basis of population balance, of great importance.
• Number of babies per 1,000 women aged 15-49
(Years: 1994-2005. Type: city, town, village. 2,520 original points.)
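As a minimal sketch of the rate defined above (babies per 1,000 women), the Python below computes age-specific rates, finds the age with the highest rate, and pools the counts into one overall rate. The function name `rate_per_thousand` and all counts are hypothetical illustrations, not values from the yearbook.

```python
def rate_per_thousand(events: float, population: float) -> float:
    """Return the number of events per 1,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * events / population


# Hypothetical counts for one area and one year:
# age -> (live births to women of that age, number of women of that age)
sample = {20: (45, 1000), 24: (150, 1000), 30: (60, 1000)}

# Age-specific fertility rate for each age group
age_specific = {age: rate_per_thousand(b, w) for age, (b, w) in sample.items()}

# Age with the highest fertility rate (the "fertility peak")
peak_age = max(age_specific, key=age_specific.get)

# Overall rate from the pooled counts
total_births = sum(b for b, _ in sample.values())
total_women = sum(w for _, w in sample.values())
overall_rate = rate_per_thousand(total_births, total_women)

print(age_specific, peak_age, overall_rate)
```

The same helper applies to the crude birth and death rates, which are also expressed per 1,000 population.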
[Figure: Scatterplots of city, town and village fertility rate vs. age, one panel each for 2001, 2002, 2003 and 2004. x-axis: Age (10 to 50); y-axis: fertility rate.]

Population replacement: Fertility rate
• Two-way ANOVA: fertility rate versus type (city, town and village) and year
• Interaction of type and year
Both significant!
• Type difference: village > town > city
• Year difference: negative trend
Conclusions and cause analysis:
• This indicates that the one-child policy in China has had a strong effect.
• The small rise and fall around 2003 results from the SARS epidemic of that year, which reduced contact and chances of pregnancy.

Population replacement: Fertility rate
Data processing: year (2001-2005), age (with the highest fertility), type (city, town and village)
Build a new sample, "fertility peak"
• Scatter plot of fertility peak: fertility peak (highest-fertility age) versus type, year
• The peak age is around 24
• The fertility peak decreases (city changes most, village is most stable)
• The village peak is the highest

ANALYSIS of RATIO
Step 1: Data collection
Taking the city male ratio as an example:
City male ratio =

ANALYSIS of RATIO
Step 2: Descriptive data analysis
[Figure: Boxplot of ratio by area (C, T, V) and sex (F, M); y-axis: ratio, 0.48 to 0.52.]
• In cities, both the male and female ratios are near 0.5.
• The gap between the male ratio and the female ratio widens from town to village.
• Overall, there are more males than females in society, which is why many young men find it hard to find "Mrs. Right".
ANALYSIS of RATIO
Step 3: Exploratory data analysis
[Figure: Residual plots for ratio (normal probability plot, residuals versus fits, histogram, residuals versus observation order; 30 observations).]

ANALYSIS of RATIO
Step 3: Exploratory data analysis
[Figure: Interaction plot for ratio (data means) by area (C, T, V) and sex (F, M); y-axis: mean, 0.485 to 0.515.]
Conclusion:
• In cities, the male ratio equals the female ratio, but in towns and villages the male ratio is larger.
• The gap between the male ratio and the female ratio widens from city to village.

ANALYSIS of RATIO
Step 4: Cause analysis
Reason 1: City people
• Have just one child
• Higher education
• Higher pressures in life
• DINK families
Reason 2: Town and village people
• Have more than one child
• Value only male children
This phenomenon is more serious in villages than in towns, so the gap between the male ratio and the female ratio is larger in villages than in towns.
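The area-by-sex comparison above can be sketched as a balanced two-way ANOVA with interaction. The slides do not state which model was actually fitted, so this is an assumption; the function name `two_way_anova` and the ratio values are hypothetical, chosen only to mimic the pattern in the interaction plot (no gap in cities, a growing gap in towns and villages). The sketch returns sums of squares and F statistics only; p-values would need an F distribution from a statistics package.

```python
from itertools import product


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    cells maps (level_a, level_b) -> list of replicate observations.
    Returns sums of squares and F statistics (no p-values).
    Assumes at least two levels per factor and non-zero error variance.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n_a, n_b = len(a_levels), len(b_levels)
    n_rep = len(next(iter(cells.values())))
    if len(cells) != n_a * n_b or any(len(v) != n_rep for v in cells.values()):
        raise ValueError("design must be fully crossed and balanced")
    if n_rep < 2:
        raise ValueError("need at least two replicates per cell")

    def mean(values):
        return sum(values) / len(values)

    all_obs = [x for obs in cells.values() for x in obs]
    grand = mean(all_obs)
    a_mean = {a: mean([x for b in b_levels for x in cells[(a, b)]]) for a in a_levels}
    b_mean = {b: mean([x for a in a_levels for x in cells[(a, b)]]) for b in b_levels}
    cell_mean = {key: mean(obs) for key, obs in cells.items()}

    # Main effects, interaction, error and total sums of squares
    ss_a = n_b * n_rep * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n_a * n_rep * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n_rep * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a, b in product(a_levels, b_levels)
    )
    ss_error = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)
    ss_total = sum((x - grand) ** 2 for x in all_obs)

    ms_error = ss_error / (n_a * n_b * (n_rep - 1))
    return {
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab,
        "ss_error": ss_error, "ss_total": ss_total,
        "f_a": (ss_a / (n_a - 1)) / ms_error,
        "f_b": (ss_b / (n_b - 1)) / ms_error,
        "f_ab": (ss_ab / ((n_a - 1) * (n_b - 1))) / ms_error,
    }


# Hypothetical ratios: factor A = area (C, T, V), factor B = sex (F, M)
ratios = {
    ("C", "F"): [0.499, 0.500, 0.501],
    ("C", "M"): [0.501, 0.500, 0.499],
    ("T", "F"): [0.494, 0.495, 0.496],
    ("T", "M"): [0.506, 0.505, 0.504],
    ("V", "F"): [0.489, 0.490, 0.491],
    ("V", "M"): [0.511, 0.510, 0.509],
}
result = two_way_anova(ratios)
print(result)
```

A large interaction F statistic corresponds to non-parallel lines in an interaction plot, which is the pattern the conclusion above describes.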
ANALYSIS of DEATH RATE
Step 1: Data collection
Taking the city male death rate as an example:
City male death rate ratio =

ANALYSIS of DEATH RATE
Step 2: Descriptive data analysis
[Figure: Boxplot of death rate by area (C, T, V) and sex (F, M); y-axis: death rate, 0.20 to 0.45.]
Observation:
• The male death rate is higher than the female death rate.
• The death rate increases from city to village.

ANALYSIS of DEATH RATE
Step 3: Exploratory data analysis
[Figure: Residual plots for death rate (normal probability plot, residuals versus fits, histogram, residuals versus observation order; 30 observations).]

ANALYSIS of DEATH RATE
Step 3: Exploratory data analysis
[Figure: Interaction plot for death rate (data means) by area (C, T, V) and sex (F, M); y-axis: mean, 0.20 to 0.40.]
Conclusion:
• The male death rate is higher than the female death rate
• The death rate increases from city to village

ANALYSIS of DEATH RATE
Step 4: Cause analysis
Reason 1: The male death rate is higher
• Males have only one X chromosome
• Main labor force in society
• Bad habits: smoking, drinking
• Accidents, crime
Reason 2: The death rate increases from city to village
City:
• Higher education
• Better living standards
• Better medical care
• Better working conditions
Generally speaking, cities are better off than towns and villages, and towns are slightly better off than villages.

TOTAL POPULATION SIZE PREDICT-Guideline
The analysis is based on population data since the founding of the People's Republic of China. With 58 years of population data, we carry out trend analysis and prediction, both qualitatively and quantitatively.
Trend analysis
• Linear, exponential, quadratic and S-curve
• Deviation analysis
ARIMA
• Stationarity
• Model determination based on ACF/PACF
• Prediction and deviation analysis

TOTAL POPULATION SIZE PREDICT-Trend object
• A continuous increase with one anomalous decrease point is observed in the annual population data.
• From the figure, the rate of increase has declined over the last 20 years.
• China's population reached 132,129 × 10^4 (2007); we still face a serious population problem as well as an aging-population problem.
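The S-curve trend model reported in this part, Yt = (10**6) / (6.07150 + 14.0151*(0.961598**t)), can be evaluated directly. A minimal sketch follows. The mapping t = year - 1948 is an assumption: the slides do not state the index origin, and this choice is adopted here only because it reproduces the report's 2008-2012 S-curve forecasts to within rounding. The names `s_curve` and `BASE_YEAR` are illustrative.

```python
def s_curve(t: float) -> float:
    """S-curve trend model from the report, in units of 10,000 persons."""
    return 1e6 / (6.07150 + 14.0151 * (0.961598 ** t))


# Assumption (not stated in the slides): t = year - 1948, i.e. t = 1 for 1949.
BASE_YEAR = 1948

# Forecasts for the years tabulated in the report
forecast = {year: round(s_curve(year - BASE_YEAR)) for year in range(2008, 2013)}

# Curve parameters implied by the formula
intercept = s_curve(0)        # value at t = 0
asymptote = 1e6 / 6.07150     # limit as t grows, since 0.961598**t -> 0

for year, value in forecast.items():
    print(year, value)
print("intercept:", round(intercept), "asymptote:", round(asymptote))
```

The intercept and asymptote computed this way agree with the curve parameters listed alongside the model (49784 and 164704), which gives some confidence that the formula was transcribed correctly.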
TOTAL POPULATION SIZE PREDICT-Result of trend analysis
[Figure: Trend analysis plots for population size (actual, fits and forecasts vs. index), one panel per model.]
• Linear trend model: Yt = 51011 + 1458*t
• Growth curve model: Yt = 56244.4 * (1.01628**t)
• Quadratic trend model: Yt = 49384 + 1618.1*t - 2.668*t**2
• S-curve trend model: Yt = (10**6) / (6.07150 + 14.0151*(0.961598**t)); curve parameters: intercept 49784, asymptote 164704, asymptotic rate 1
• Linear, exponential, quadratic and S-curve models were used to analyze the growth pattern. Parameter estimation is based on OLS.
• The four fits were evaluated with three basic measures: MAD, MAPE and MSD.
• The results show that the S-curve model best fits the pattern of China's population growth: rapid at first, then slower.

TOTAL POPULATION SIZE PREDICT-Evaluation of 4 models
Method | Equation | MAPE | MAD | MSD
Linear | Yt = 51011 + 1458*t | 2 | 1633 | 3932649
Exponential growth | Yt = 56244.4 * (1.01628**t) | 4 | 3525 | 19641671
Quadratic trend | Yt = 49384 + 1618.1*t - 2.668*t**2 | 2 | 1494 | 3454107
S-curve | Yt = (10**6) / (6.07150 + 14.0151*(0.961598**t)) | 2 | 1189 | 2225774

Trend analysis is used to predict the future, so we evaluate the four models on their most recent deviations only, i.e. the deviations for 2000-2007.

Method | MAPE | MSD | MAD
Linear | 1.8 | 8193544 | 2385
Exponential growth | 6.4 | 80808235 | 8344
Quadratic trend | 1.2 | 3602967 | 1518
S-curve | 0.7 | 1144129 | 872

S-curve forecast (unit: 1×10^4 persons):
Year | 2008 | 2009 | 2010 | 2011 | 2012
Population size | 134975 | 135918 | 136836 | 137731 | 138603

TOTAL POPULATION SIZE PREDICT-ARIMA Description
ARIMA was developed by Box and Jenkins in the 1970s. It is a well-known time series model that combines autoregression, moving average and differencing to handle non-stationary time series data.
ARIMA(p,d,q) is determined in three steps:
• Stationarity transformation
• Model determination
• Parameter estimation

TOTAL POPULATION SIZE PREDICT-Stationary Test
A stationary time series is one whose mean does not change over time and whose standard deviation stays within a bounded range.
[Figure: Time series plot of C2 vs. year (1950 to 2004); y-axis: -1000 to 2500.]
An obvious increasing trend is observed, so differencing is needed to transform the non-stationary series into a stationary one. But what order of differencing? The Augmented Dickey-Fuller (ADF) test, run in Matlab, is used to test whether the series has a unit root.

TOTAL POPULATION SIZE PREDICT-One order difference solution
$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + \varepsilon_t$$
$$H_0: \rho = 0 \ \text{(unit root)} \qquad H_1: \rho < 0 \ \text{(no unit root)}, \qquad \rho = \phi_1 + \cdots + \phi_p - 1$$
The p-value is smaller than 0.05, so we reject the null hypothesis: the series does not have a unit root. It passes the ADF test, and we proceed to model determination.

TOTAL POPULATION SIZE PREDICT-One order difference solution
After determining the differencing order d = 1, the ARIMA model reduces to an ARMA model on the differenced series. Model determination is based on the ACF and PACF.
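As an illustration of how the ACF and PACF guide model determination, the sketch below computes a sample ACF and derives the PACF from autocorrelations with the Durbin-Levinson recursion. This is not the Minitab procedure used for the report. The function names are illustrative, and the AR(1) coefficient 0.9 is a hypothetical value used only to show the signature pattern: an ACF that decays slowly and a PACF that vanishes after lag 1.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r[0..max_lag] of a sequence of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series has zero variance")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(r):
    """Partial autocorrelations for lags 1..len(r)-1 (Durbin-Levinson).

    r[0] must be 1 and r[k] the autocorrelation at lag k.
    """
    pacf, prev = [], []
    for k in range(1, len(r)):
        if k == 1:
            phi_kk = r[1]
            cur = [phi_kk]
        else:
            num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1 - sum(prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf


# Theoretical ACF of a hypothetical AR(1) process with coefficient 0.9
phi = 0.9
ar1_acf = [phi ** k for k in range(6)]
ar1_pacf = pacf_from_acf(ar1_acf)

print("ACF :", [round(v, 3) for v in ar1_acf])
print("PACF:", [round(v, 3) for v in ar1_pacf])

# Sample ACF of a tiny toy series
print(sample_acf([1, 2, 3, 4], 1))
```

For real data, `sample_acf` would be applied to the differenced series and the result passed to `pacf_from_acf`; values inside the significance limits are treated as zero.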
[Figure: Autocorrelation function and partial autocorrelation function for C1, with 5% significance limits.]
The ACF tails off slowly and the PACF cuts off after lag 1, which indicates an AR(1) model. Following Box-Jenkins, the model is therefore ARIMA(1,1,0).

TOTAL POPULATION SIZE PREDICT-One order difference solution
The residual check shows one anomalous point, 1961, which lies far from the normality line. How should it be handled: transform the data, or cut the series? We cut the series and keep only 1962-2007.
[Figure: Residual plots for C1 (normal probability plot, residuals versus fits, histogram, residuals versus observation order).]

TOTAL POPULATION SIZE PREDICT-Two order difference solution
ADF test: the decision vector H shows that the 1962-2007 series is no longer stationary, and the plot shows a decreasing trend, so a higher order of differencing is needed.
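The differencing step itself can be sketched in a few lines. The helper name `difference` is illustrative, and the sample values are the report's S-curve forecasts for 2008-2012, used here purely as convenient numbers rather than as the observed 1962-2007 series. The ADF test is not reproduced in this sketch.

```python
def difference(series, order=1):
    """Apply the differencing operator `order` times to a sequence."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


# S-curve forecasts for 2008-2012 from the report (unit: 10,000 persons),
# used only as sample input.
population = [134975, 135918, 136836, 137731, 138603]

first_diff = difference(population, 1)    # annual increase
second_diff = difference(population, 2)   # change in the annual increase

print(first_diff)
print(second_diff)
```

On these numbers the first difference still drifts downward while the second difference is nearly constant, which is the kind of behavior that motivates moving from d = 1 to d = 2.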
[Figure: Time series plot of C10 vs. year (1963 to 2008); y-axis: -500 to 750.]

TOTAL POPULATION SIZE PREDICT-Two order difference solution
[Figure: Autocorrelation function and partial autocorrelation function for 2I, with 5% significance limits.]
[Figure: Residual plots for 2I (normal probability plot, residuals versus fits, histogram, residuals versus observation order).]
Based on the Box-Jenkins method, the ARMA model has p = 3, q = 1. SS = 1114603, MS = 27865.

TOTAL POPULATION SIZE PREDICT-Two order difference solution
ARIMA Model: C1
Final Estimates of Parameters
Type | Coef | SE Coef | T | P
AR 1 | 0.9138 | 0.0714 | 12.88 | 0.000
Constant | 124.68 | 31.32 | 3.37 | 0.002
Mean | 1313.7 | 390.3 | |
Number of observations: 46
Residuals: SS = 1773508 (backforecasts excluded), MS = 40307, DF = 44

The population increase can be estimated with the following function, which predicts future values from historical ones:
$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + \theta_1 z_{t-1}$$
with coefficient magnitudes |c| = 21.74, |φ1| = 0.8538, |φ2| = 0.0203, |φ3| = 0.5614, |θ1| = 0.5958 (the signs are not preserved in this copy).
[Figure: Time series plots of 2I with FITS1 (1963 to 2007); of 1962-2007 with C11 (2001 to 2046); and of C5 with C7 (1963 to 2017).]

TOTAL POPULATION-Prediction and Deviation Analysis
Measure (2000-2007) | S-curve | ARIMA(1,1,0) | Improvement (vs. S-curve)
MAPE | 0.7 | 0.06 | 0.914286
MSD | 1144129 | 6058 | 0.994705
MAD | 872 | 73 | 0.916284
The future increase and its 95% CI can be predicted, and the total population can then be obtained accordingly. MAPE dropped by about 91%, MSD by over 99% and MAD by about 92%.
[Figure: Time series plots of 2I with C19, C20, C21 (1963 to 2008); of pred, pred-lim, pred-up and 1962-2007 (1962 to 2012); and of total popu with C12, C13, C14 (1962 to 2012).]

TOTAL POPULATION PREDICT-ARIMA Result
[Figure: Time series plot of C9, C10, C11, C12 vs. year (2000 to 2012); y-axis: 126000 to 138000.]
Forecast (unit: 1×10^4 persons):
Year | 2008 | 2009 | 2010 | 2011 | 2012
ARIMA(3,2,1) | 132,790 | 133,404 | 134,030 | 134,614 | 135,186
S-curve | 134,975 | 135,918 | 136,836 | 137,731 | 138,603

TOTAL POPULATION PREDICT-Conclusion
Features: total population, population increase
What is the situation now?
• China's population makes up nearly 1/5 of the world population.
• The post-80s generation has entered its peak childbearing years, which keeps population growth relatively high.
• Health care has improved greatly, and people will live much longer.
What lies ahead?
• Around 2050, China will face its first population decrease.
• Both social problems and economic challenges will emerge: economic growth? Social welfare? Stability?
Is it necessary for the government to revise the birth control policy to keep China's population and its growth within a reasonable range?

Thanks!