SOC 100 – Introduction to Sociology

Income Inequality in the United States
Data Analysis Assignment
For this assignment we will explore the impact of gender and race on the earnings of
full-time workers in 2000. The purpose of this assignment is to introduce you to some
basic data analysis software (WebCHIP), to develop some familiarity with working with
data from the Current Population Survey, and to apply what you have learned in the
course to try to explain differences in earnings based on race and gender.
Learning Objectives
Skill
After using this module, students will gain skills in:
• Using software to access and analyze census data
• Identifying independent and dependent variables
• Employing control variables
• Forming testable hypotheses using quantitative data
• Quantitative writing
• Constructing, reading, and interpreting bivariate tables displaying
frequencies and percentages
• Using real-world data to enhance and support key course concepts
I. Data Sets
You will need the EARN2KC.DAT and EARN2KC.NY data sets, plus one other data set
for a state of your choice. You can access WebCHIP through the SSDAN
website by following these steps:
1. Go to http://www.ssdan.net/datacounts/data/
2. From there, click “Browse” on the left sidebar. Find “geo2000” in the
drop-down box and select it.
3. Scroll down through the list of data sets until you find the appropriate
data set. Highlight it and click “submit.” This will bring up the data set in
the WebCHIP program, ready for analysis.
4. You can also click here to open EARN2KC.DAT in WebCHIP.
5. You can also click here to open EARN2KC.NY in WebCHIP.
II. Variables
Although there are several ways in which the following terms may be
conceptualized, defined and measured, these are the definitions used by the U.S.
Census Bureau and the Bureau of Labor Statistics:
Income (Earning) – the money a person makes from working, as wages, salary,
or a form of self-employment, expressed as an annual amount.
Race (RaceLat) – individual’s self-identification as:
• Non-Hispanic White (NHwhite) – all persons who indicated their race as
white and not of Hispanic origin.
• Black – all persons who indicated their race as black.
• Hispanic – persons of white or “other” races who identified themselves
as Mexican, Puerto Rican, Cuban, or Other Spanish/Hispanic.
• Asian (or Pacific Islander) – includes all persons who indicated their race
or ethnicity as Chinese, Filipino, Japanese, Asian Indian, Korean,
Vietnamese, Cambodian, Hmong, Laotian, Thai, or other Asian as well as
Hawaiian, Samoan, Guamanian or other Pacific Islander.
• American Indian (AmIndian) – all persons who classified themselves as
American Indian, Eskimo or Aleut.
Gender (Gender) – individual’s self-identification as either male or female.
Work Age (WkAge) – age in years, grouped into 9 – 10 year intervals, starting at
age 16.
III. Using WebCHIP
A. Frequency Distributions or Marginals
To get a listing of the variables and their frequencies, use the
“Marginals” function. You should get the following output:
RaceLat   NHwhite   Black    Hispanic   Asian    AmIndian   Total
          73.6      11.7     9.9        4.0      0.7        100.0%

Gender    Male      Female   Total
          58.7      41.3     100.0%

Earning   <15K      15-25K   25-35K     35-50K   50K-75K    75K+    Total
          12.3      22.3     20.9       20.4     14.7       9.4     100.0%

WkAge     16-24     25-34    35-44      45-54    55-64      65+     Total
          7.8       24.4     30.2       25.0     10.7       1.9     100.0%
This is a frequency table for the CPS sample of full-time, year-round workers in
2000. According to the table, 58.7% of all full-time workers are male as opposed
to 41.3% female. 12.3% of full-time workers earn less than $15,000 a year while
9.4% earn more than $75,000. 73.6% of full-time workers are non-Hispanic
white, 11.7% are black, 9.9% are Hispanic, 4.0% are Asian, and only 0.7% are
American Indian. Workers between the ages of 35-54 account for 55.2% of all
full-time workers. Only 1.9% of full-time workers are age 65 or older.
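The combined figures in the paragraph above come from adding adjacent categories in the marginals. A minimal Python sketch of that arithmetic follows. The dictionary and function names are illustrative assumptions and are not part of WebCHIP; the percentages are copied from the WkAge marginals above.

```python
# Marginal percentages for WkAge, copied from the WebCHIP output above.
wkage = {"16-24": 7.8, "25-34": 24.4, "35-44": 30.2,
         "45-54": 25.0, "55-64": 10.7, "65+": 1.9}

def combined_share(marginals, categories):
    """Add the percentages of several categories, rounded to one decimal."""
    return round(sum(marginals[c] for c in categories), 1)

# Workers aged 35-54 span two adjacent categories.
print(combined_share(wkage, ["35-44", "45-54"]))  # 55.2

# Sanity check: the categories of one variable should total about 100%.
print(round(sum(wkage.values()), 1))  # 100.0
```

The same function works for any of the four variables, for example to combine the two lowest earnings categories.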
The frequency table gives an overall sense of the distribution of a
particular variable or set of variables, which is an important place to start.
However, what we are interested in exploring further is the impact of gender
and race on earnings. That is, do men still generally earn more than women?
What racial group typically has the highest earnings? Which group has the lowest
earnings? Do blacks earn more than Hispanics? In order to address these
questions, we will need to crosstabulate the variables of interest.
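To see what a crosstabulation does, it may help to work through a tiny example by hand. The sketch below is only an illustration, assuming a handful of made-up records and a simplified two-category earnings variable; it is not WebCHIP code and the numbers are not CPS data. It counts cases in each combination of categories and converts the counts to percentages within each gender, so each gender's percentages add to 100.

```python
from collections import Counter

# Made-up records for illustration only: (gender, earnings category).
records = [
    ("Male", "50K+"), ("Male", "50K+"), ("Male", "<50K"), ("Male", "<50K"),
    ("Female", "50K+"), ("Female", "<50K"), ("Female", "<50K"), ("Female", "<50K"),
]

def percent_down(rows):
    """Return {column: {row_category: percent}}; each column sums to 100."""
    cell_counts = Counter(rows)                    # cases per (column, row) pair
    col_totals = Counter(col for col, _ in rows)   # cases per column category
    table = {}
    for (col, row), n in cell_counts.items():
        table.setdefault(col, {})[row] = round(100 * n / col_totals[col], 1)
    return table

table = percent_down(records)
for col in table:
    print(col, table[col])
# Male {'50K+': 50.0, '<50K': 50.0}
# Female {'50K+': 25.0, '<50K': 75.0}
```

Because the percentages are computed within each gender, the two groups can be compared directly even when one group has more cases than the other.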
IV. Crosstabs
A. Before crosstabulating two variables, be sure you know how they might be
associated. It makes sense to say that one’s gender may influence his or her
earnings, but it does not make sense to say that earnings influence whether one
is male or female. The variable that influences or affects another variable is
known as the independent variable (x), and the variable that is influenced or
affected by another variable is called the dependent variable (y). You can write
this as: x → y. In this case, we are interested in how gender influences
earnings. Gender would be the independent variable (x) and earnings would be the
dependent variable (y). In other words, earnings, to some extent, depend on
gender: Gender → Earnings.
B. To create a crosstabulation table in WebCHIP, you need to tell the program
which is the independent and which is the dependent variable. Create a
percent down crosstab with the dependent (y) or row variable “Earning” and
the independent (x) or column variable “Gender.” You should get the
following table:
Earning   Male       Female     All
75K+      13.3       3.9        9.4
50-75K    18.6       9.3        14.7
35-50K    21.9       18.3       20.4
25-35K    19.1       23.4       20.9
15-25K    17.9       28.5       22.3
<15K      9.2        16.7       12.3
100%=     56212417   39610636   N = 95823053
In this case, we can see that 31.9% of male full-time workers make $50,000 or
more a year as opposed to only 13.2% of females. On the other hand, 45.2% of
female full-time workers earn under $25,000 a year as opposed to only 27.1% of
male full-time workers.
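The figures quoted above are sums of adjacent cells within a column. The Python sketch below reproduces them from the Earning by Gender table; the variable and function names are illustrative assumptions, not WebCHIP features. It also checks that the “All” column is the weighted average of the Male and Female columns, using the column totals from the “100%=” row as weights.

```python
# Column percentages and totals copied from the Earning by Gender table above.
male = {"75K+": 13.3, "50-75K": 18.6, "35-50K": 21.9,
        "25-35K": 19.1, "15-25K": 17.9, "<15K": 9.2}
female = {"75K+": 3.9, "50-75K": 9.3, "35-50K": 18.3,
          "25-35K": 23.4, "15-25K": 28.5, "<15K": 16.7}
n_male, n_female = 56212417, 39610636

def share(column, categories):
    """Add the percentages of several earnings categories in one column."""
    return round(sum(column[c] for c in categories), 1)

top = ["75K+", "50-75K"]      # $50,000 or more
bottom = ["15-25K", "<15K"]   # under $25,000
print(share(male, top), share(female, top))        # 31.9 13.2
print(share(male, bottom), share(female, bottom))  # 27.1 45.2

def all_column(category):
    """Weighted average of the two gender columns for one earnings category."""
    weighted = male[category] * n_male + female[category] * n_female
    return round(weighted / (n_male + n_female), 1)

print(all_column("75K+"), all_column("<15K"))  # 9.4 12.3
```

The weighted-average check is a quick way to confirm that a table was copied correctly before you interpret it.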
1. Do male or female full-time workers typically earn more money?
Report some of the specific findings (as I did above).
2. What does this suggest about the current state of income
inequality between men and women? Does the gender earnings
gap appear to be any smaller in the year 2000?
Now examine how race influences earnings by following the same process as
in step B above, substituting the “RaceLat” variable for “Gender.” You
should be able to complete the table below and answer the following questions:
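Earnings   NHwhite    Black      Hispanic   Asian     AmIndian   All
75K+       _____      _____      _____      _____     _____      _____
50-75K     _____      _____      _____      _____     _____      _____
35-50K     _____      _____      _____      _____     _____      _____
25-35K     _____      _____      _____      _____     _____      _____
15-25K     _____      _____      _____      _____     _____      _____
<15K       _____      _____      _____      _____     _____      _____
100%=      70500890   11220490   9510429    3880288   710956     N = 95823053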
3. What racial group has the highest earnings? Report some specific
percentages from the table.
4. What racial group has the lowest earnings? Again, report some
specifics.
5. Since these are all full-time workers, what factors might
contribute to the observed racial differences in earnings?
6. What do you find most interesting and/or surprising about these
findings?
V. Finishing Up
A. Repeat Parts III and IV using the EARN2KC.NY dataset for New York State.
After calculating the marginals and crosstabulations for New York, answer the
following:
1. What differences in income do you observe in New York State in comparison
to the entire United States?
2. Is the gender gap in wage earnings higher or lower in New York?
3. How do the racial differences in earnings in New York compare to the
national data?
4. What might account for the differences observed in New York? Be
sure to summarize your findings using the marginal frequencies as
well as the two crosstabulation tables using gender and race. Report
specific findings from your tables, just as I did in my earlier analysis.
B. Repeat Parts III and IV again for a different state of your choosing.
Remember that the suffix for each data set indicates the state it represents
(for example, EARN2KC.WI is the 2000 earnings data for Wisconsin). Answer the
four questions for this state as you did in part A above. How do gender and
racial income inequality in this state compare to New York and the nation as a
whole? Why do you think the income situation in this state might be different?
C. Your completed assignment should include:
1. A cover page with your name, course, section number, and title.
2. ALL of the output from your WebCHIP analysis, which includes the
marginals, earnings by gender and earnings by race for the national,
New York and the additional state data. Note that the output from
WebCHIP is easily copied by left clicking and dragging your mouse on
the output and then pasting it (right click) into your word processor.
Please make sure that you have cleaned up your output, so you don’t
have four copies of the same table before you print it! If you copy the
output into a word-processor, it would be helpful to label each table
so that you can refer to it in your analysis.
3. A 3-4 page typed (word-processed) analysis of your findings
addressing:
a. The marginals from each of the three data sets. Describe the
distribution of gender, race and earnings of full-time workers
at the national and state level. Is there much difference in the
New York frequencies in comparison to the other state you
selected? How does New York compare to the national level
data?
b. The six questions posed in Part IVB and the four questions in
Part VA. You should refer to the specific tables from your
WebCHIP output and the web resources provided at the end
of this assignment to guide you in understanding and
interpreting your findings.
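One way to organize the state and national comparisons is to reduce each gender crosstab to a single number, such as the percentage-point gap between men and women in the share earning $50,000 or more. The Python sketch below is only a suggestion. The function name is an assumption, and the state values are placeholders, NOT real WebCHIP output; replace them with the columns from your own tables.

```python
def high_earner_gap(male_col, female_col, categories=("75K+", "50-75K")):
    """Percentage-point gap (men minus women) in the share earning $50,000+."""
    male_share = sum(male_col[c] for c in categories)
    female_share = sum(female_col[c] for c in categories)
    return round(male_share - female_share, 1)

# National values copied from the Earning by Gender table in Part IV.
national_male = {"75K+": 13.3, "50-75K": 18.6}
national_female = {"75K+": 3.9, "50-75K": 9.3}

# PLACEHOLDER values, not real data: fill in from your own state output.
state_male = {"75K+": 10.0, "50-75K": 20.0}
state_female = {"75K+": 5.0, "50-75K": 10.0}

national_gap = high_earner_gap(national_male, national_female)
state_gap = high_earner_gap(state_male, state_female)

print(national_gap)  # 18.7
# A negative difference means the state gap is smaller than the national gap.
print(round(state_gap - national_gap, 1))
```

A single summary number does not replace the full tables, so still report specific percentages from your output in the written analysis.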
VI. Useful Web Resources
Bureau of Labor Statistics: http://www.bls.gov/
CensusScope: http://www.censusscope.org/
Current Population Survey: http://www.bls.census.gov/cps/cpsmain.htm
The Social Science Data Analysis Network: http://www.ssdan.net/
U.S. Census Bureau: http://www.census.gov/