Dummy variables refresher


Dummy Variables and Interactions

Dummy Variables

What is the relationship between the % of non-Swiss residents (IV) and discretionary social spending (DV) in Swiss municipalities?

. reg def_social_head log_pctforeign if year==2005

      Source |       SS       df       MS              Number of obs =    1151
-------------+------------------------------           F(  1,  1149) =    0.54
       Model |  155287.676     1  155287.676           Prob > F      =  0.4631
    Residual |   331138316  1149  288196.968           R-squared     =  0.0005
-------------+------------------------------           Adj R-squared = -0.0004
       Total |   331293604  1150  288081.395           Root MSE      =  536.84

--------------------------------------------------------------------------------
def_social_h~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
log_pctforeign |   7.449417   10.14842     0.73   0.463     -12.4621    27.36093
         _cons |   663.3708   27.61287    24.02   0.000     609.1935    717.5481
--------------------------------------------------------------------------------

Data drawn from 6 different cantons (states).


. tabstat def_social_head pctforeign, by(canton)

Summary statistics: mean
  by categories of: canton

  canton |  def_so~d  pctfor~n
---------+--------------------
       1 |  504.8596   .135417
       2 |  202.6653  .2730006
       3 |  474.0729  .0919272
       5 |  945.3242  .0633038
       6 |  377.7097  .1101544
       7 |  110.8237  .0918817
---------+--------------------
   Total |  501.6756  1.689802
------------------------------


We can control for the fact that municipalities are drawn from different cantons by allowing the default expectation (intercept) for each canton to vary:

. reg def_social_head log_pctforeign i.canton if year==2005

      Source |       SS       df       MS              Number of obs =    1151
-------------+------------------------------           F(  6,  1144) =  127.25
       Model |   132602310     6  22100384.9           Prob > F      =  0.0000
    Residual |   198691294  1144  173681.201           R-squared     =  0.4003
-------------+------------------------------           Adj R-squared =  0.3971
       Total |   331293604  1150  288081.395           Root MSE      =  416.75

--------------------------------------------------------------------------------
def_social_h~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
log_pctforeign |   220.8124   16.05465    13.75   0.000     189.3126    252.3123
               |
        canton |
             2 |  -670.2634    67.5107    -9.93   0.000    -802.7221   -537.8047
             3 |  -94.19855   40.78477    -2.31   0.021    -174.2199   -14.17721
             5 |   468.9531   35.77836    13.11   0.000     398.7545    539.1516
             6 |  -1193.182   86.82131   -13.74   0.000    -1363.529   -1022.835
             7 |  -468.4175   41.32189   -11.34   0.000    -549.4927   -387.3424
               |
         _cons |   1180.505   40.99826    28.79   0.000     1100.065    1260.946
--------------------------------------------------------------------------------

Canton 1 is the ‘reference category’: the intercept for canton 1 is “_cons”, and each canton coefficient is the difference between that canton's intercept and canton 1's.
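As a worked check on the reference-category logic, the intercept implied for each canton can be recovered by adding that canton's coefficient to _cons. The sketch below is plain Python arithmetic on the coefficients reported in the output above; the names `cons`, `canton_coef` and `intercepts` are illustrative and not part of the original Stata session.

```python
# Coefficients copied from the canton-dummy regression output (year == 2005).
cons = 1180.505  # _cons: intercept for canton 1, the reference category
canton_coef = {
    2: -670.2634,
    3: -94.19855,
    5: 468.9531,
    6: -1193.182,
    7: -468.4175,
}

# Each canton's intercept is _cons plus that canton's dummy coefficient.
intercepts = {1: cons}
for canton, coef in canton_coef.items():
    intercepts[canton] = cons + coef

for canton in sorted(intercepts):
    print(f"canton {canton}: intercept = {intercepts[canton]:.4f}")
```

These intercepts are the expected spending when log_pctforeign = 0, so they are starting values for each canton's line rather than typical spending levels.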


This is not the same as running separate regressions for each canton, because we still assume that the slope is identical for every subgroup.

Social Spending = B0 + B1*Log_Pctforeign + B2*Canton1 + B3*Canton2 + B4*Canton3 + …

For municipalities in Canton 1:

Social Spending = B0 + B1*Log_Pctforeign + B2*1 + B3*0 + B4*0 + …

Social Spending = (B0 + B2) + B1*Log_Pctforeign

For municipalities in Canton 2:

Social Spending = B0 + B1*Log_Pctforeign + B2*0 + B3*1 + B4*0 + …

Social Spending = (B0 + B3) + B1*Log_Pctforeign

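All we are doing is changing the starting value: each canton gets its own expected level of social spending when log_pctforeign = 0. B1 still describes the effect of Log_Pctforeign on Social Spending across the entire sample. Note that one canton never gets a dummy of its own: it is the reference category, and its intercept is B0 alone.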
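The "same slope, different starting value" idea can be illustrated with the fitted coefficients from the canton-dummy regression. The sketch below assumes canton 1 is the reference category, as in the Stata output; the function `predict` and the values of log_pctforeign passed to it are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the canton-dummy regression output (year == 2005).
CONS = 1180.505    # intercept for canton 1 (reference category)
SLOPE = 220.8124   # common slope on log_pctforeign
CANTON_SHIFT = {1: 0.0, 2: -670.2634, 3: -94.19855,
                5: 468.9531, 6: -1193.182, 7: -468.4175}


def predict(canton, log_pctforeign):
    """Fitted social spending for a municipality in the given canton."""
    return CONS + CANTON_SHIFT[canton] + SLOPE * log_pctforeign


# Hypothetical values of log_pctforeign, used only to trace out the lines.
for x in (-4.0, -3.0, -2.0):
    gap = predict(2, x) - predict(1, x)
    print(f"log_pctforeign={x}: canton 1 = {predict(1, x):.2f}, "
          f"canton 2 = {predict(2, x):.2f}, gap = {gap:.4f}")
```

The gap between the two cantons is the canton 2 coefficient at every value of log_pctforeign, which is what parallel regression lines look like.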


What would happen if I added the following variables to the previous regression:

A) Variable that measures whether a canton is German-speaking (1) or French-speaking (0)

B) Variable that measures average GDP per capita in the canton.

C) Variable that measures whether a municipality allows (1) or does not allow (0) immigrants to vote.

Dummy Variables and Interactions

We can allow the relationship between log_pctforeign and def_social_head (the “slope”) to vary for each canton by using interactions:

Social Spending = B0 + B1*Log_Pctforeign + B2*Canton1 + B3*Canton2 …

+ B4*Canton1*Log_Pctforeign + B5*Canton2*Log_Pctforeign

For municipalities in Canton 1:

Social Spending = B0 + B1*Log_Pctforeign + B2*1 + B3*0 + B4*1*Log_Pctforeign + B5*0*Log_Pctforeign

Social Spending = (B0 + B2) + (B1 + B4)*Log_Pctforeign

For municipalities in Canton 2:

Social Spending = (B0 + B3) + (B1 + B5)*Log_Pctforeign

This yields the same intercept and slope for each canton as estimating the relationship between social spending and log_pctforeign separately for each subgroup. We are now allowing the relationship to differ across cantons instead of assuming a common slope.
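The equivalence between a fully interacted model and separate subgroup regressions can be checked numerically. The following is a minimal sketch in plain Python, not the original analysis: the two groups and their data are made up for illustration, group A plays the role of the reference category, and the helper functions `simple_ols`, `solve` and `ols` are written here only for this example.

```python
def simple_ols(xs, ys):
    """Intercept and slope from a one-predictor least-squares fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, ys):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data for two groups (not the Swiss municipality data).
x_a, y_a = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
x_b, y_b = [1.0, 2.0, 3.0, 4.0], [10.0, 10.6, 10.9, 11.7]

# Separate regressions, one per group.
int_a, slope_a = simple_ols(x_a, y_a)
int_b, slope_b = simple_ols(x_b, y_b)

# Pooled model with a group dummy d (1 for group B) and the interaction d*x.
rows = [[1.0, x, 0.0, 0.0] for x in x_a] + [[1.0, x, 1.0, x] for x in x_b]
b0, b1, b2, b3 = ols(rows, y_a + y_b)

print("group A:", (int_a, slope_a), "vs pooled:", (b0, b1))
print("group B:", (int_b, slope_b), "vs pooled:", (b0 + b2, b1 + b3))
```

The coefficients match, but the standard errors generally do not, because the pooled model estimates a single residual variance for all groups.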

Interactions

Interactions should be justified by theory. For instance, we might reasonably assume that the relationship between % foreign and social expenditure would be different if municipalities allowed immigrants to vote.

Voteright is a variable coded 1 if immigrants have voting rights in a municipality, 0 otherwise.

. gen logpctforeignXvoteright = voteright * log_pctforeign

. reg def_social_head log_pctforeign logpctforeignXvoteright voteright i.canton if year == 2005

-----------------------------------------------------------------------------------------
        def_social_head |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
         log_pctforeign |   179.9446   19.08151     9.43   0.000     142.5076    217.3816
logpctforeignXvoteright |   85.12434   32.40203     2.63   0.009     21.55309    148.6956
              voteright |   112.3438   417.7021     0.27   0.788    -707.1681    931.8557
                        |
                 canton |
                      2 |   -703.456   68.08389   -10.33   0.000    -837.0333   -569.8786
                      3 |  -73.26651   41.22884    -1.78   0.076    -154.1556     7.62253
                      4 |  -1033.193   60.82019   -16.99   0.000    -1152.519   -913.8663
                      5 |   361.7032   411.5004     0.88   0.380    -445.6411    1169.048
                      6 |  -1079.953    424.194    -2.55   0.011    -1912.201   -247.7039
                      7 |  -559.3247   409.6522    -1.37   0.172    -1363.043    244.3936
                        |
                  _cons |   1160.733    414.167     2.80   0.005     348.1573     1973.31
-----------------------------------------------------------------------------------------


Social Spending = B0 + B1*Log_Pctforeign + B2*Canton1 + B3*Canton2 + …

+ B4*VotingRights + B5*Log_Pctforeign*VotingRights

For municipalities in Canton 1 where immigrants cannot vote (VotingRights = 0):

Social Spending = B0 + B1*Log_Pctforeign + B2*1 + B3*0 + B4*0 + B5*Log_Pctforeign*0

Social Spending = (B0+B2) + B1*Log_Pctforeign

For municipalities in Canton 1 where immigrants can vote (VotingRights = 1):

Social Spending = B0 + B1*Log_Pctforeign + B2*1 + B3*0 + B4*1 + B5*Log_Pctforeign*1

Social Spending = (B0+B2+B4) + (B1+B5)*Log_Pctforeign
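Plugging in the estimates from the voteright regression makes the two lines concrete. The sketch below is plain Python arithmetic on the reported coefficients. It follows the Stata output, in which canton 1 is the reference category, so canton 1 contributes no shift of its own to the intercept; the dictionary `coef` and the function `line_for` are illustrative names.

```python
# Coefficients copied from the regression with the voteright interaction.
coef = {
    "log_pctforeign": 179.9446,
    "logpctforeignXvoteright": 85.12434,
    "voteright": 112.3438,
    "_cons": 1160.733,
}


def line_for(voteright):
    """Return (intercept, slope) for a canton 1 municipality.

    voteright is 1 if immigrants can vote in the municipality, 0 otherwise.
    """
    intercept = coef["_cons"] + coef["voteright"] * voteright
    slope = (coef["log_pctforeign"]
             + coef["logpctforeignXvoteright"] * voteright)
    return intercept, slope


for v in (0, 1):
    intercept, slope = line_for(v)
    print(f"voteright={v}: intercept = {intercept:.4f}, slope = {slope:.5f}")
```

In the output above the interaction coefficient is distinguishable from zero (p = 0.009) while the voteright main effect is not (p = 0.788), so the evidence is for a difference in slopes, not in starting values.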

Direct + Indirect Effects

Direct Effect = Multivariate Regression Coefficient

Indirect Effect = Bivariate Regression Coefficient – Multivariate Regression Coefficient
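As a worked example of this decomposition, the two log_pctforeign coefficients reported earlier can be plugged in. This is only a sketch: it assumes the first regression (no controls) is the bivariate model and the regression with i.canton is the multivariate model, a pairing the slides do not state explicitly.

```python
# log_pctforeign coefficients copied from the two regression outputs.
bivariate = 7.449417     # no controls
multivariate = 220.8124  # controlling for canton (i.canton)

direct_effect = multivariate
indirect_effect = bivariate - multivariate

print(f"direct effect   = {direct_effect:.4f}")
print(f"indirect effect = {indirect_effect:.6f}")
```

Under that assumption the indirect effect is negative and almost as large in magnitude as the direct effect, which is why the bivariate coefficient is close to zero.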
