Analysis of per capita Net State Domestic Product as a function of Unemployment & Literacy

Charvi Rampuria (310), Sukshma Amogha (762)

ABSTRACT: This project aims to learn more about India's unemployment and literacy conditions and how they affect the Net State Domestic Product (NSDP), the state counterpart to a country's net domestic product. We have used data for the year 2011-2012. Regression analysis, the technique of finding relationships between two or more variables, is used to determine the relationship between NSDP, the unemployment rate, and the literacy rate. The unemployment rate and literacy rate are the independent variables and NSDP is the dependent variable. The findings are presented as simple linear and multiple regression analyses, which show how the unemployment and literacy rates of the various States and Union Territories of India influence NSDP. Unemployment is closely linked to GDP: India's unemployment rate tends to fall as the country's GDP rises. In this project, we study how unemployment affects the state counterpart of NDP (GDP minus depreciation). There is also enough evidence to motivate studying the impact of education, or literacy, on GDP, which we analyze as well.

Keywords: unemployment, NSDP, literacy, GDP

1. INTRODUCTION

Net state domestic product (NSDP) is the state counterpart to a country's net domestic product (NDP), which equals the gross domestic product (GDP) minus depreciation on a country's capital goods. In this study, Indian States and Union Territories are compared by NSDP per capita.

According to UNESCO, the literacy rate is defined as the percentage of a given age group population that can read and write. The adult literacy rate corresponds to ages 15 and above, the youth literacy rate to ages 15 to 24, and the elderly literacy rate to ages 65 and above.
It is typically measured according to the ability to comprehend a short, simple statement on everyday life. Generally, literacy also encompasses numeracy, and measurement may incorporate a simple assessment of arithmetic ability. The literacy rate and the number of literates should be distinguished from functional literacy, a more comprehensive measure of literacy assessed on a continuum in which multiple proficiency levels can be determined.

According to the OECD, the unemployed are people of working age who are without work, are available for work, and have taken specific steps to find work. The uniform application of this definition results in estimates of unemployment rates that are more internationally comparable than estimates based on national definitions of unemployment. This indicator is measured as the number of unemployed people as a percentage of the labor force, and it is seasonally adjusted. The labor force is defined as the total number of unemployed people plus those in employment.

In this project, we analyze the impact of the literacy rate and the unemployment rate on the per capita net state domestic product across all States and Union Territories of India. We use regression analysis to understand the impact of the two independent variables, the unemployment rate and the literacy rate, on per capita NSDP. The next section presents a review of the studies that have been referred to in producing this research. We conclude with some policy suggestions.

2. LITERATURE REVIEW

In their paper Impact of Schooling on the Economic Development of Low-Income Nations, Germinal G. Van and Marcella Taleb Da Costa (2021) indicate that average years of education can raise GDP per capita in low-income countries. They chose to look into the case of India, where there is no way to assess educational quality due to a lack of data. They used data from the World Statistics Bank for GDP per capita and data from In Our World for Mean Years of Schooling.
They employed polynomial regression to estimate the model's parameters. Their findings revealed a positive relationship between educational attainment squared and GDP per capita, implying that a one-year increase in average educational attainment improves GDP per capita by about 132.75 dollars. In the context of this study, we could also speculate that the relationship runs in the other direction: a country's average years of schooling may be influenced by its GDP per capita, with increases in GDP per capita improving education levels. Their findings suggest that education is a critical instrument for improving a country's economy. Individuals become more productive and gain more skills to contribute to the labor market as a result of their education. Furthermore, education fosters creativity, which leads to innovation, another factor that adds to a country's growth and economic prosperity.

Altaf Hussain Padder and B. Mathavan (2021), in Granger Causality Approach: The Relationship between Unemployment and Economic Growth in India, looked at the relationship between unemployment and real gross domestic product in India's economy from 1990 to 2020. Their analysis used the gross domestic product, taken as an indicator of economic growth, and the unemployment rate. The estimated regression of unemployment on economic growth for India indicated that economic growth explains only 6% of the variation in unemployment and that the two are negatively related, with the remaining 94% explained by other factors. The small R-squared value showed that the evolution of the unemployment rate is largely driven by other factors, which were not part of that study.
According to the study, the government should urgently create more employment opportunities to absorb the country's large population of unemployed workers by modernizing the agriculture sector, which is the most important sector, providing more than 42 percent of livelihoods while contributing only 13 percent to GDP.

Okun's Law is an empirically observed relationship between unemployment and losses in a country's production (GDP). It can also be used to estimate gross national product (GNP). Okun's law states that a country's GDP must grow at about a 4% rate for one year to achieve a 1% reduction in the rate of unemployment. The law has evolved to fit the current economic climate and employment trends.

3. METHODOLOGY

The study is based on secondary data. To understand the impact of the unemployment and literacy rates, econometric analysis has been carried out using regression models for the States and Union Territories of India. Cross-sectional data for the year 2011-12 have been taken from the Reserve Bank of India data source. The NSDP variable is in INR. The ordinary least squares (OLS) method under the Classical Linear Regression Model is used for the regression analysis, with NSDP as the dependent variable and the unemployment and literacy rates as the independent variables. R has been used for the analysis.

Regression is a technique used to model and analyze the relationships between variables and how they jointly contribute to producing a particular outcome. There are various types of regression, but here we cover only those relevant to our research purpose.

Linear Regression attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is treated as the explanatory variable and the other as the dependent variable.
A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of Y when X = 0). We have used linear regression in one variable for Model 1 and Model 2 of our study.

Multiple Linear Regression seeks to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Formally, the model for multiple linear regression with p explanatory variables, given n observations, is

$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i$ for $i = 1, 2, \dots, n$.

We have used multiple linear regression for Model 3 of our study.

THE CLASSICAL LINEAR REGRESSION MODEL
• The regression model is linear in the parameters; it may or may not be linear in the variables. That is, the regression model is of the type Yi = B1 + B2Xi + ui.
• The explanatory variable(s) X is uncorrelated with the disturbance term u.
• Given the value of Xi, the expected, or mean, value of the disturbance term u is zero. That is, E(u | Xi) = 0. This assumption states that the other factors, which are not explicitly included in the model and are therefore captured by u, are unrelated to Xi, so that, given the value of Xi, their mean value is zero.
• The variance of each ui is constant, that is, homoscedastic: var(ui) = σ².
• There is no correlation between two error terms. This is the assumption of no autocorrelation. Algebraically, it can be written as cov(ui, uj) = 0 for i ≠ j. It means that there is no systematic relationship between two error terms, that is, the error terms ui are random.
• The regression model is correctly specified. In other words, there is no specification bias or specification error in the model used in empirical analysis.

4. RESEARCH OBJECTIVE

The main objectives of this project are:
• To understand the concept of the NSDP and trends across all the Indian States and Union Territories.
• To highlight the nature of the relationship between unemployment, literacy, and NSDP.
• To analyze the literacy and unemployment situation across all States and Union Territories of India in order to provide policy recommendations towards achieving a higher per capita net state domestic product and a higher GDP.

5. RESULTS & INTERPRETATION OF REGRESSION

Interpretation of Model 1
• One dependent and one independent variable
• Per capita Net State Domestic Product and Literacy rates

Y: Per Capita Net State Domestic Product
X1: Literacy Rate

Y = B1 + B2X1 + ui (Population regression)
Here,
B1: Intercept coefficient
B2: Slope coefficient
ui: random error term

Ŷ = b1 + b2X1, where Ŷ is the estimator of Y, and b1 and b2 are the OLS estimators of B1 and B2 respectively.

A priori expectations of coefficients: b2 is expected to be positive, because an increase in literacy rates across States and Union Territories is expected to raise the net state domestic product, establishing a positive relationship between the literacy rate and NSDP.
H0: B2 = 0
Ha: B2 > 0

Running regression by OLS method
Dependent variable: NSDP (Y)
Independent variable: Literacy rate (X1)

                 Coefficient   Std. Error   t-value   p-value
Intercept             4.5705       3.5728     1.279     0.211
Literacy Rate         1.5272       0.8221     1.858     0.073

Mean of dependent variable   77.55        S.D. of dependent variable   8.8255
Sum of squared residuals     8.206232     Residual standard error      0.523
Multiple R-squared           0.1032       Adjusted R-squared           0.07328
F-statistic (1,30)           3.451        p-value (F)                  0.07304

Using the result of the regression run by the OLS method, the estimated coefficients are:
b1 = 4.5705
b2 = 1.5272
Ŷ = 4.5705 + 1.5272X1

Interpretation of coefficients
b1 = 4.5705 means that when the Literacy rate is 0, the NSDP would be about 4.5705%. In our linear model, b1 has no practical interpretation, as the Literacy rate can never be zero.
b2 = 1.5272 means that, other things remaining the same, an increase in the Literacy rate by one unit leads to an increase in NSDP by 1.5272%. The positive sign of b2 indicates a positive relationship between Literacy rates and NSDP in India.
R2 (the overall goodness-of-fit measure) of 0.1032 means that 10.32% of the total variation in NSDP around its mean value is explained by Literacy rates.

Significance of the model
H0: B2 = 0
Ha: B2 > 0
Observed p-value = 0.073
α = 0.05
Since the p-value > α, the coefficient is statistically insignificant at the 5% level and we fail to reject the null hypothesis.

ANOVA Table (Analysis of Variance)

              Df   Sum of Squares   Mean Square
Regression     1            0.944        0.9441
Residuals     30            8.206        0.2735
Total         31            9.15         1.2176

R2 = 0.944/9.15 = 0.1032
F(1,30) = 0.9441/0.2735 = 3.451 [p-value 0.073]

F-test (test of overall significance)
H0: R2 = 0
Ha: R2 > 0
F-observed = 3.451
F-critical = 4.1708
F-observed < F-critical, which means R2 is statistically insignificant at the 5% level of significance. Thus, the null hypothesis cannot be rejected.

Test for normality of residuals
H0: errors are normally distributed
Ha: errors are not normally distributed
Test statistic: Chi-square = 41.9 with observed p-value 0.073 (right tail)
Chi-square critical = 43.7729
Since Chi-square calculated < Chi-square critical, we fail to reject the null hypothesis; that is, there is no evidence at the 5% level that the errors depart from normality.

Interpretation of Model 2
• One dependent and one independent variable
• Per capita Net State Domestic Product and Unemployment rates

Y: Per Capita Net State Domestic Product
X2: Unemployment Rate

Y = B1 + B2X2 + uj (Population regression)
Here,
B1: Intercept coefficient
B2: Slope coefficient
uj: random error term

Ŷ = b1 + b2X2, where Ŷ is the estimator of Y, and b1 and b2 are the OLS estimators of B1 and B2 respectively.
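For reference, the OLS estimators b1 and b2 have a simple closed form when there is a single regressor: b2 is the ratio of the sum of cross-deviations to the sum of squared deviations of X, and b1 = mean(Y) - b2 × mean(X). The following is a minimal sketch in Python (the study itself used R). The function name ols_simple and the data values are hypothetical, chosen only for illustration; they are not the data used in this study.

```python
def ols_simple(x, y):
    """Closed-form OLS for Y = b1 + b2*X.

    Returns (b1, b2, r_squared). Illustrative helper, not the study's code.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squared deviations and cross-deviations from the means.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if s_xx == 0 or s_yy == 0:
        raise ValueError("x and y must each vary across observations")

    b2 = s_xy / s_xx           # slope estimator
    b1 = y_bar - b2 * x_bar    # intercept estimator

    # R-squared = 1 - (residual sum of squares / total sum of squares).
    rss = sum((yi - (b1 + b2 * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - rss / s_yy
    return b1, b2, r_squared


# Hypothetical values for five regions (NOT the study's data).
unemployment = [2.0, 3.5, 5.0, 6.5, 8.0]
nsdp = [11.4, 11.1, 11.3, 10.9, 11.0]

b1, b2, r2 = ols_simple(unemployment, nsdp)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}, R-squared = {r2:.4f}")
```

With these made-up values the slope comes out negative, which is the sign expected a priori for the unemployment rate in Model 2.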
A priori expectations of coefficients: b2 is expected to be negative, because an increase in unemployment rates across States and Union Territories is expected to lower the net state domestic product, establishing a negative relationship between the unemployment rate and NSDP.
H0: B2 = 0
Ha: B2 < 0

Running regression by OLS method
Dependent variable: NSDP (Y)
Independent variable: Unemployment rate (X2)

                     Coefficient   Std. Error   t-value   p-value
Intercept               11.25698      0.48892    23.024    <2e-16 ***
Unemployment Rate       -0.01435      0.13444    -0.107     0.916

Mean of dependent variable   77.55        S.D. of dependent variable   8.8255
Sum of squared residuals     9.146831     Residual standard error      0.5522
Multiple R-squared           0.0003794    Adjusted R-squared           -0.03294
F-statistic (1,30)           0.01139      p-value (F)                  0.9157

Using the result of the regression run by the OLS method, the estimated coefficients are:
b1 = 11.25698
b2 = -0.01435
Ŷ = 11.25698 - 0.01435X2

Interpretation of coefficients
b1 = 11.25698 means that when the Unemployment rate is 0, the NSDP would be about 11.25698%. In our linear model, b1 has no practical interpretation, as unemployment can never be removed altogether in a developing economy like India.
b2 = -0.01435 means that, other things remaining the same, a decrease in the Unemployment rate by one unit leads to an increase in the NSDP by 0.01435%. The negative sign of b2 indicates a negative relationship between Unemployment rates and NSDP in India.
R2 (the overall goodness-of-fit measure) of 0.0003794 means that 0.03794% of the total variation in NSDP around its mean value is explained by Unemployment rates.

Significance of the model
H0: B2 = 0
Ha: B2 < 0
Observed p-value = 0.9157
α = 0.05
Since the p-value > α, the coefficient is statistically insignificant at the 5% level and we fail to reject the null hypothesis.
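The figures reported for Model 2 can be cross-checked with a few lines of arithmetic. The sketch below, written in Python rather than the R used for the study, recomputes the t-value (coefficient divided by its standard error), the F-statistic (which equals the squared t-value when there is a single regressor), and the adjusted R-squared from the reported coefficient, standard error, and R-squared. It assumes n = 32 observations, as implied by the 31 total degrees of freedom; the variable names are illustrative.

```python
# Values reported for Model 2 in the regression output above.
coef = -0.01435        # slope coefficient on the unemployment rate
std_error = 0.13444    # its standard error
r_squared = 0.0003794  # multiple R-squared
f_critical = 4.1708    # 5% critical value of F(1, 30), as quoted in the text

n = 32  # ASSUMPTION: number of observations, inferred from 31 total df
k = 1   # number of regressors, excluding the intercept

# t-value: estimated coefficient divided by its standard error.
t_value = coef / std_error

# With one regressor, the overall F-statistic is the square of the slope's t-value.
f_value = t_value ** 2

# Adjusted R-squared penalises R-squared for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Decision rule of the overall F-test at the 5% level.
reject_h0 = f_value > f_critical

print(f"t-value            = {t_value:.3f}")
print(f"F-statistic        = {f_value:.5f}")
print(f"Adjusted R-squared = {adj_r_squared:.5f}")
print(f"Reject H0 at 5%?   = {reject_h0}")
```

The recomputed values agree, up to rounding, with the reported t-value, F-statistic, and adjusted R-squared. The negative adjusted R-squared reflects that the unemployment rate alone explains less variation than would be expected by chance from one extra regressor.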
ANOVA Table (Analysis of Variance)

              Dof   Sum of Squares   Mean Square
Regression      1            0.003       0.00347
Residuals      30            9.147       0.30489
Total          31            9.15        0.30836

R2 = 0.00347/9.15 ≈ 0.00038
F = 0.00347/0.30489 = 0.01138 [p-value 0.9157]

F-test (test of overall significance)
H0: R2 = 0
Ha: R2 > 0
F-observed = 0.01139
F-critical = 4.1708
F-observed < F-critical, which means R2 is statistically insignificant at the 5% level of significance. Thus, the null hypothesis cannot be rejected.

Test for normality of residuals
H0: errors are normally distributed
Ha: errors are not normally distributed
Test statistic: Chi-square = 20.03 with observed p-value 0.9157 (right tail)
Chi-square critical = 43.7729
Since Chi-square calculated < Chi-square critical, we fail to reject the null hypothesis; that is, there is no evidence at the 5% level that the errors depart from normality.

Interpretation of Model 3
• One dependent and two independent variables
• Per capita Net State Domestic Product, Literacy and Unemployment rates

Y: Per Capita Net State Domestic Product
X1: Literacy Rate
X2: Unemployment Rate

Y = B1 + B2X1 + B3X2 + ui (Population regression)
Here,
B1: Intercept coefficient
B2: Partial regression coefficient of X1
B3: Partial regression coefficient of X2
ui: random error term

Ŷ = b1 + b2X1 + b3X2, where Ŷ is the estimator of Y, and b1, b2, and b3 are the OLS estimators of B1, B2, and B3 respectively.

A priori expectations of partial coefficients
b2: b2 is expected to be positive, because an increase in literacy rates across States and Union Territories is expected to raise the net state domestic product, establishing a positive relationship between the literacy rate and NSDP.
b3: Similarly, b3 is expected to be negative, because an increase in unemployment rates across States and Union Territories is expected to lower the net state domestic product, establishing a negative relationship between the unemployment rate and NSDP.
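With two regressors, the OLS estimators b2 and b3 can still be written in closed form using sums of squares and cross-products of deviations from the means, and b1 then follows from the sample means. The following is a minimal sketch in Python (the study itself used R). The function name ols_two_regressors and the data are hypothetical; the data are constructed so that Y = 1 + 2·X1 - 3·X2 holds exactly, which makes it easy to see that the formulas recover the coefficients.

```python
def ols_two_regressors(x1, x2, y):
    """Closed-form OLS for Y = b1 + b2*X1 + b3*X2.

    Returns (b1, b2, b3). Illustrative helper, not the study's code.
    """
    n = len(y)
    if len(x1) != n or len(x2) != n or n < 3:
        raise ValueError("x1, x2 and y must have the same length (at least 3)")

    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n

    # Deviations from the sample means.
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]

    # Sums of squares and cross-products of the deviations.
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))

    # The determinant is zero when X1 and X2 are perfectly collinear.
    det = s11 * s22 - s12 ** 2
    if abs(det) < 1e-12:
        raise ValueError("regressors are perfectly collinear")

    b2 = (s1y * s22 - s2y * s12) / det   # partial slope on X1
    b3 = (s2y * s11 - s1y * s12) / det   # partial slope on X2
    b1 = my - b2 * m1 - b3 * m2          # intercept
    return b1, b2, b3


# Hypothetical data (NOT the study's data), built so that Y = 1 + 2*X1 - 3*X2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1 + 2 * a - 3 * b for a, b in zip(x1, x2)]

b1, b2, b3 = ols_two_regressors(x1, x2, y)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}, b3 = {b3:.4f}")
```

Each slope here is a partial coefficient: it measures the effect of one regressor while holding the other constant, which is why b2 in Model 3 need not equal b2 in Model 1.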
The hypotheses to be tested are:
H0: B2 = 0    Ha: B2 > 0
H0: B3 = 0    Ha: B3 < 0

Running regression by OLS method
Dependent variable: NSDP (Y)
Independent variable: Literacy rate (X1)
Independent variable: Unemployment rate (X2)

                     Coefficient   Std. Error   t-value   p-value
Intercept                4.12743      3.66112     1.127    0.2688
Literacy Rate            1.70544      0.86860     1.963    0.0593
Unemployment Rate       -0.09287      0.13455    -0.690    0.4955

Mean of dependent variable   77.55        S.D. of dependent variable   8.8255
Sum of squared residuals     8.073593     Residual standard error      0.5276
Multiple R-squared           0.1177       Adjusted R-squared           0.05682
F-statistic (2,29)           1.934        p-value (F)                  0.1628

Using the result of the regression run by the OLS method, the estimated coefficients are:
b1 = 4.12743
b2 = 1.70544
b3 = -0.09287
Ŷ = 4.12743 + 1.70544X1 - 0.09287X2

Interpretation of coefficients
b1 = 4.12743 means that when the Unemployment rate and Literacy rate are 0, the NSDP would be about 4.12743%. In our linear model, b1 has no practical interpretation, as unemployment and literacy rates can never be zero.
b2 = 1.70544 means that, other things remaining the same, an increase in the literacy rate by one unit leads to an increase in the NSDP by 1.70544%. The positive sign of b2 implies a positive relationship between Literacy rates and NSDP in India.
b3 = -0.09287 means that, other things remaining the same, a decrease in the unemployment rate by one unit leads to an increase in the NSDP by 0.09287%. The negative sign of b3 implies a negative relationship between Unemployment rates and NSDP in India.
R2 (the overall goodness-of-fit measure) of 0.1177 means that 11.77% of the total variation in NSDP around its mean value is explained by Unemployment and Literacy rates in India.

Significance of the model
b1 is statistically insignificant as its p-value is greater than α = 5%, i.e., 0.2688 > 0.05.

H0: B2 = 0
Ha: B2 > 0
Observed p-value = 0.0593
α = 0.05
Since the p-value > α, the coefficient is statistically insignificant at the 5% level and we fail to reject the null hypothesis.
H0: B3 = 0
Ha: B3 < 0
Observed p-value = 0.4955
α = 0.05
Since the p-value > α, the coefficient is statistically insignificant at the 5% level and we fail to reject the null hypothesis.

ANOVA Table (Analysis of Variance)

              Dof   Sum of Squares   Mean Square
Regression      2            1.077        0.5385
Residuals      29            8.074        0.2784
Total          31            9.151        1.3551

R2 = 1.077/9.151 = 0.1177
F = 0.5385/0.2784 = 1.934 [p-value 0.1628]

F-test (test of overall significance)
H0: R2 = 0
Ha: R2 > 0
F-observed = 1.934
F-critical = 3.33
F-observed < F-critical, which means R2 is statistically insignificant at the 5% level of significance. Thus, the null hypothesis cannot be rejected.

Test for normality of residuals
H0: errors are normally distributed
Ha: errors are not normally distributed
Test statistic: Chi-square = 36.38 with observed p-value 0.1628 (right tail)
Chi-square critical = 42.5569
Since Chi-square calculated < Chi-square critical, we fail to reject the null hypothesis; that is, there is no evidence at the 5% level that the errors depart from normality.

6. DISCUSSION OF RESULTS & POLICY RECOMMENDATIONS

The estimated coefficients show a positive relationship between NSDP and the Literacy rate and a negative relationship between NSDP and the Unemployment rate, although neither is statistically significant at the 5% level. This suggests that with an increase in literacy level, educational attainment, and the average years of schooling, NSDP would rise, which would further increase the GDP of the country. Similarly, with a decrease in the unemployment rate and an improvement in the country's labor force, NSDP can improve, adding more to the GDP. However, factors other than unemployment and literacy, which are not included in the model, also affect net state product. All of these aspects should be taken into consideration while building a more productive nation. Some of our recommendations are as follows:
➢ Providing scholarships, so that talented students from the economically disadvantaged population can also access better education and, therefore, a brighter future.
Scholarships and grants would also incentivize parents to send their children to school rather than to work.
➢ Mid-day meals encourage children to attend school, thereby increasing the enrolment ratio in primary and secondary education.
➢ A more relevant education system, like those in some foreign nations, which prepares students for the job market rather than focusing on rote learning, should be encouraged.
➢ Revamping the teacher education (TE) system. We should focus on revamping the curriculum and pedagogy to bring in modern and innovative elements and to make it considerably more rigorous.
➢ Since India is a labour-abundant country, the study suggests that the government should, as a matter of urgency, create more employment opportunities to absorb the large unemployed workforce in the country.
➢ Various measures such as MGNREGA, the National Policy for Skill Development and Entrepreneurship, the Startup India Initiative, and Pradhan Mantri Kaushal Vikas Yojana have been implemented by the Government of India in the past to improve the employment situation. More such initiatives should be taken up by the government, especially in the post-pandemic period, when a lot of workers have lost their livelihoods.

7. LIMITATIONS & DIRECTIONS FOR FUTURE WORK

No work is free from limitations, and this paper is no exception. The limitations are highlighted below to allow a better critical appreciation.
➢ The estimated regression of NSDP on unemployment and literacy shows that these two variables explain only 11.77 percent of the variation in NSDP, while the remaining 88.23 percent is due to other factors. For this study, only two factors were taken into account, while various other factors also affect NSDP.
➢ In this study, the CLRM assumptions were not tested, although the residuals were checked for normality.
➢ The statistical significance of the estimated relationships could not be established, so the null hypotheses could not be rejected.

8. CONCLUSION

This study has analysed the relationship between per capita Net State Domestic Product, Unemployment, and Literacy rate using data from Indian States and Union Territories for the year 2011-12. From the study, it can be concluded that if the state governments take appropriate policy measures for the expansion of literacy and employment opportunities, NSDP, and in turn GDP, will be higher. Furthermore, the results of descriptive statistics revealed that the variables are not normally distributed. Thus, our model has been effective in identifying two factors that affect NSDP in different states and regions of the country. However, several other factors could not be included in this model, because they cannot easily be measured numerically, because data are not yet available for them, or because building a mathematical model with them may be difficult. Therefore, the model only partially explains why NSDP is higher in some states and lower in others.

9. BIBLIOGRAPHY

• https://m.rbi.org.in/scripts/AnnualPublications.aspx?head=Handbook+of+Statistics+on+Indian+States#
• https://www.investopedia.com/articles/economics/12/okuns-law.asp
• http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm
• http://www.stat.yale.edu/Courses/1997-98/101/linmult.htm
• https://acrobat.adobe.com/link/review?uri=urn:aaid:scds:US:ea998ee1-85f1-3ab0-a2f7-c121abd48cd9
• https://acrobat.adobe.com/link/review?uri=urn:aaid:scds:US:0fe36525-ef0b-3f33-87d5-0a05da77d941
• https://data.oecd.org/unemp/unemployment-rate.htm
• http://uis.unesco.org/en/glossary-term/literacy-rate
• https://data.gov.in/keywords/net-state-domestic-product