7 Regressions

  1. Model 1: \(y\) = pctGOP, using variable 1 from each category.

  2. Model 2: \(y\) = pctGOP, using variable 2 from each category.

  3. Model 3: \(y\) = log(pctGOP), using variable 1 from each category.

  4. Model 4: \(y\) = log(pctGOP), using variable 2 from each category.
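As a compact restatement of the four specifications above, the models can be written as linear regressions with four regressors each. The symbols \(x_{ij}\) (the \(j\)-th selected variable for observation \(i\)), \(\beta_j\), and \(u_i\) (the error term) are notation assumed here for illustration and do not appear in the original.

\[
\begin{aligned}
\text{Models 1 and 2:}\quad & \text{pctGOP}_i = \beta_0 + \sum_{j=1}^{4} \beta_j x_{ij} + u_i \\
\text{Models 3 and 4:}\quad & \log(\text{pctGOP}_i) = \beta_0 + \sum_{j=1}^{4} \beta_j x_{ij} + u_i
\end{aligned}
\]

Models 1 and 3 share one set of regressors and models 2 and 4 share the other, so each pair differs only in whether the outcome enters in levels or in logs.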

library(stargazer)  # provides stargazer() for the regression table

# Models 1 and 2 use pctGOP in levels; models 3 and 4 use log(pctGOP)
model1 <- lm(pctGOP ~ pct_white + pct_whiteage + pct_us_born + pct_employment_male, data = countyGIS_stat)
model2 <- lm(pctGOP ~ pct_asian + pct_asianage + pct_us_naturalized + pct_employment_female, data = countyGIS_stat)
model3 <- lm(log(pctGOP) ~ pct_white + pct_whiteage + pct_us_born + pct_employment_male, data = countyGIS_stat)
model4 <- lm(log(pctGOP) ~ pct_asian + pct_asianage + pct_us_naturalized + pct_employment_female, data = countyGIS_stat)

# Report coefficients, significance stars, and p-values side by side as HTML
stargazer(model1, model2, model3, model4,
          type = "html",
          report = "vc*p",
          keep.stat = c("n", "rsq", "adj.rsq"),
          notes = "<em>&#42;p&lt;0.1;&#42;&#42;p&lt;0.05;&#42;&#42;&#42;p&lt;0.01</em>",
          notes.append = FALSE,
          model.numbers = FALSE,
          column.labels = c("(1)", "(2)", "(3)", "(4)"))
Dependent variable: pctGOP in columns (1) and (2); log(pctGOP) in columns (3) and (4)

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| pct_white | 0.021*** (p = 0.000) | | 0.038*** (p = 0.000) | |
| pct_whiteage | -0.021*** (p = 0.000) | | -0.037*** (p = 0.000) | |
| pct_us_born | 0.009*** (p = 0.000) | | 0.017*** (p = 0.000) | |
| pct_employment_male | 0.009*** (p = 0.000) | | 0.016*** (p = 0.000) | |
| pct_asian | | -0.007 (p = 0.466) | | 0.019 (p = 0.283) |
| pct_asianage | | -0.009 (p = 0.458) | | -0.059*** (p = 0.009) |
| pct_us_naturalized | | -0.018*** (p = 0.000) | | -0.036*** (p = 0.000) |
| pct_employment_female | | -0.013*** (p = 0.000) | | -0.022*** (p = 0.000) |
| Constant | -0.828*** (p = 0.000) | 1.228*** (p = 0.000) | -3.315*** (p = 0.000) | 0.558*** (p = 0.000) |
| Observations | 3,083 | 3,083 | 3,083 | 3,083 |
| R² | 0.490 | 0.283 | 0.498 | 0.293 |
| Adjusted R² | 0.489 | 0.282 | 0.497 | 0.292 |
Note: *p<0.1;**p<0.05;***p<0.01
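
One way to read the log(pctGOP) columns, offered as a sketch and not as the author's stated interpretation: in a model with a logged outcome, a one-unit increase in a regressor is associated with a change of \(100\,(e^{\hat\beta}-1)\) percent in pctGOP, holding the other regressors fixed. The units of the regressors are not stated here, so "one unit" means one unit of the variable as it is recorded in countyGIS_stat. Applied to the pct_white coefficient in model 3 and the pct_asianage coefficient in model 4:

\[
\begin{aligned}
\hat\beta = 0.038 \ (\text{model 3}):\quad & 100\,(e^{0.038}-1) \approx 3.87\% \\
\hat\beta = -0.059 \ (\text{model 4}):\quad & 100\,(e^{-0.059}-1) \approx -5.73\%
\end{aligned}
\]

For coefficients this small, \(100\,\hat\beta\) is already a close approximation, which is why log-outcome coefficients are often read directly as percent changes.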

Correlation between two variables does not mean that one causes the other. For example, pct_white might be correlated with pctGOP, but this does not mean that the size of the White population causes the share of votes the Republican Party receives. A third factor that is not included in the model could be driving both. Because of this, we rely on the “zero conditional mean” assumption: the error term, which collects every factor left out of the model, has an expected value of zero at every value of the regressors. In other words, when we select a random member of a group such as White adults, the omitted factors should average out to zero, so no confounding variable is correlated with any of the variables included in the model.
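
To make the role of this assumption concrete, here is a minimal simulation sketch. It does not use the county data: the variables `x`, `y`, and `z`, the sample size, and every coefficient are made up for illustration. The outcome `y` depends only on a confounder `z`, yet a regression that leaves `z` out reports a clearly nonzero coefficient on `x`, because the omitted factor is correlated with `x` and the zero conditional mean assumption fails.

```r
set.seed(42)
n <- 5000

# Hypothetical data: z is an unobserved confounder that drives both x and y
z <- rnorm(n)
x <- 0.8 * z + rnorm(n)      # x is correlated with z
y <- 1 + 2 * z + rnorm(n)    # y depends on z only; x has no causal effect

# Short regression omits z, so z sits in the error term and is correlated with x
fit_short <- lm(y ~ x)

# Long regression includes z, which restores the zero conditional mean condition
fit_long <- lm(y ~ x + z)

b_short <- unname(coef(fit_short)["x"])
b_long  <- unname(coef(fit_long)["x"])

# In large samples b_short tends to Cov(x, y) / Var(x) = 1.6 / 1.64,
# while b_long tends to the true effect of x, which is 0
cat(sprintf("x coefficient without z: %.3f\n", b_short))
cat(sprintf("x coefficient with z:    %.3f\n", b_long))
```

The same logic applies to the regressions above: a coefficient such as the one on pct_white can be read causally only if no omitted factor plays the role that `z` plays in this sketch.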