STATISTICS FOR THE SOCIAL & BEHAVIORAL SCIENCES
Recitation #1 Answers
Using and describing data

2. The dataset has 602,833 observations, which means that 602,833 subjects were surveyed. That is roughly 0.05% of the Indian population (602,833 / 1,237,000,000 x 100).

3.
- The command ‘describe’ shows the storage type, display format, and label of each variable. The variable label tells us what the variable measures. The storage type tells us whether the variable contains integers, strings (letters), or numbers with decimals (float). For example, variable year’s type is integer (int) since years take only whole-number values (this dataset only has 2004 data). On the other hand, variable wtper’s type is float (number with decimals) since it measures the respondent’s sampling weight, which can take fractional values.
- The command ‘codebook’ shows the variable type, label, and range, among other important information (whether there are any missing values and how many unique values there are). By using ‘codebook’ we can see which variables are continuous, discrete, categorical, and quantitative.
- 602,833 individuals are part of this sample.
- India’s population is 1.237 billion. Therefore, roughly 0.05% of the Indian population was surveyed (602,833 / 1,237,000,000 x 100).
- By typing ‘codebook’, we can see that variable serial (Household serial number) has 124,680 unique values. That means that 124,680 different households were surveyed.
- Continuous variables: wtper
- Discrete variables: cntry, year, sample, serial, urban, regionw, geolev1, geo1a_in, geo1b_in, geo2b_in, pernum, age, age2, sex
- Quantitative variables: year, sample, serial, pernum, wtper, age
- Categorical variables: cntry, urban, regionw, geolev1, geo1a_in, geo1b_in, geo2b_in, age2, sex

Note: For those of you confused about the variable pernum, this variable is not needed for our exercise today.
However, you may be interested in knowing that pernum numbers all persons within each household consecutively (starting with "1" for the first person record of each household). When combined with serial, pernum uniquely identifies each person in our sample.

5. $\frac{294205 \text{ women}}{308627 \text{ men}} = 0.953$ women per man in India

6. Our finding is consistent with Amartya Sen’s assertion that “in South Asia, West Asia, and China, the ratio of women to men can be as low as 0.94”. Our ratio of 0.953 women per man is clearly lower than the ratio of women to men in Europe as reported by Amartya Sen, where there are slightly more women than men. Our data suggest that in India it is the other way around.

7.
Punjab’s women per man ratio: $\frac{10467 \text{ women}}{11522 \text{ men}} = 0.908$
Haryana’s women per man ratio: $\frac{6677 \text{ women}}{7507 \text{ men}} = 0.889$
Kerala’s women per man ratio: $\frac{12086 \text{ women}}{10942 \text{ men}} = 1.105$

8. Yes, we can use a similar approach as in question 7. We type:

    tabulate age sex

The resulting women to men ratios are:
- Age less than one year: 0.921
- Age 1: 0.905
- Age 2: 0.952
- Age 3: 0.977
- Age 4: 0.892
- Age 10: 0.840
- Age 21: 0.934
- Age 31: 1.159
- Age 41: 0.912

Professor Sen states that at birth there are around 106 male children for every 100 female children everywhere in the world (a female to male ratio of 0.943), but that from then on biology seems to side with women. Our results do not show a clearly increasing pattern in the female to male ratio, and thus seem to contradict Amartya Sen’s statement. However, we must note that he assumes that women and men receive similar nutritional and medical attention. Amartya Sen suggests that if women do not engage in ‘gainful’ employment, then women and men are less likely to receive similar nutritional and medical attention.
In the article, he also explains that Southern Asia ranks among the lowest in both ‘gainful’ employment for women and female life expectancy. Thus, we cannot use our results to seek evidence for the increasing biological advantage of women, because there is evidence that women and men do not face the same conditions in India.

9.
Rural female to male ratio: 0.958
Urban female to male ratio: 0.945

There is evidence that the ratio is lower in urban areas. One possible interpretation is that individuals in urban areas may have easier access to selective abortion. This is in line with Amartya Sen’s assertion that policies or differences that initially seem neutral and unrelated to female-male ratios end up having a negative effect on female life expectancy because of already existing societal norms. In this case, the difference in access to abortion may seem neutral in itself. Nevertheless, the different employment prospects of men and women may produce a preference for male babies, which in turn could lead to seeking selective abortion where it is available.
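
The arithmetic behind answers 2, 5, 7, and 8 can be double-checked outside Stata. The following is a minimal sketch in Python, not part of the recitation itself. The counts are copied from the answers above, while the names `counts`, `women_per_man`, `share_surveyed`, and `ratio_at_birth` are illustrative choices made here. It is not applied to answer 9, because only the final rural and urban ratios are reported there, not the underlying counts.

```python
# Counts of (women, men) copied from answers 5 and 7.
counts = {
    "India": (294205, 308627),
    "Punjab": (10467, 11522),
    "Haryana": (6677, 7507),
    "Kerala": (12086, 10942),
}


def women_per_man(women, men):
    """Return the number of women per man."""
    return women / men


# Women-per-man ratio for each place.
ratios = {place: women_per_man(w, m) for place, (w, m) in counts.items()}

# Share of the Indian population surveyed, in percent (answer 2).
sample_size = 602833
population = 1_237_000_000
share_surveyed = sample_size / population * 100

# Female-to-male ratio implied by 106 male births per 100 female births (answer 8).
ratio_at_birth = 100 / 106

for place, ratio in ratios.items():
    print(f"{place}: {ratio:.3f} women per man")
print(f"Share of population surveyed: {share_surveyed:.3f}%")
print(f"Female-to-male ratio at birth: {ratio_at_birth:.3f}")
```

The same function could be reused for the age-specific ratios in answer 8 once the counts from ‘tabulate age sex’ are at hand.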