Federico Vegetti
ECPR Summer School in Methods and Techniques
For a long time, theories of voting behavior in Europe focused on people’s political loyalties, regarded as a political expression of their socio-structural characteristics. This was due to the strong legacy of the “cleavage” theory. In the last decades, however, other models relating shorter-term factors (such as issues or party characteristics) to the vote have replaced the old idea of party preferences as long-term loyalties. In this view, party preferences do not come as a direct consequence of people’s membership in social groups (like class or religion), but depend to a greater extent on what parties do - e.g. what positions they take on issues or ideology, how competent they are in dealing with salient problems, and so on.
Germany is quite a textbook example of a political system where theories of voting ‘work’, so let’s have a look at how theories of voting based on social group membership perform there. To do so, we take German data from the European Elections of 2009. Back then, five parties competed for most of the votes:
CDU, or Christian Democratic Union, the party of the (then as now) chancellor Angela Merkel. The CDU is a rather conservative Christian-democratic party, which in Bavaria has its expression in the CSU - the somewhat more conservative Christian Social Union. Note that in our file there is only one option for both CDU and CSU, as the two parties are regarded as functionally equivalent.
SPD, or Social Democratic Party of Germany, the biggest left-wing German party: a rather moderate social-democratic party that has been in government several times, both on its own and in a grand coalition with the CDU.
The Greens, a middle-sized party that is decidedly placed on the left, in spite of being a bit more conservative than other green parties in Europe (like, for instance, GroenLinks in the Netherlands). Born as a single-issue environmental party, it has been gaining more and more support among young left-wingers who care more about social policy issues than economic policy issues.
FDP, or Free Democratic Party, the German liberal party. As the label suggests, this is a liberal party in terms of both economic and social policies (unlike the CDU, which on the latter dimension is more concerned with traditional family values).
The Linke, an old-fashioned left-wing party, born from a merger between the leftmost fringe of the SPD and the PDS, the descendant of the East German ruling socialist party.
install.packages("mlogit") # This is my favorite package to do multinomial logit in R
options(scipen=9) # To avoid scientific notation in the output
library(foreign)
library(ggplot2)
library(MASS)
library(xtable)
library(mlogit)
library(reshape2) # We will need this package at the very end
setwd("~/Dropbox/Teaching/GLM_ECPR_2017/Lab/Day4")
data <- read.dta("EES_2009_DE.dta")
names(data)
## [1] "vote_int" "lrsp" "eusp" "polint" "polinfo"
## [6] "educ" "female" "age" "class" "workclass"
## [11] "midclass" "upclass" "relig"
head(data)
## vote_int lrsp eusp polint polinfo educ female age class workclass
## 1 NA 4 2 2 5 1 1 70 3 0
## 2 NA 10 3 2 2 NA 0 20 1 1
## 3 1 5 1 1 1 1 1 48 3 0
## 4 4 3 10 1 4 2 0 68 2 0
## 5 1 5 5 2 4 2 1 43 3 0
## 6 NA 5 5 0 1 1 1 54 1 1
## midclass upclass relig
## 1 1 0 8
## 2 0 0 3
## 3 1 0 1
## 4 1 0 10
## 5 1 0 7
## 6 0 0 4
The variables that we have are: “vote_int” (vote intention, coded as below), “lrsp” and “eusp” (left-right and EU integration self-placement scales), “polint” (political interest), “polinfo” (political information), “educ” (education), “female” (a gender dummy), “age” (age in years), “class” (subjective social class, from which the three dummies “workclass”, “midclass” and “upclass” are derived), and “relig” (religiosity, on a 0-10 scale).
The party codes are: 1 = CDU/CSU, 2 = SPD, 3 = Greens, 4 = Linke, 5 = FDP.
table(data$vote_int, useNA = "always")
##
## 1 2 3 4 5 <NA>
## 288 161 84 51 104 316
There are quite a few missing observations. We can drop them.
data <- data[!is.na(data$vote_int),]
This drops more than 300 cases, but we would not have used them anyway.
For a multinomial model of vote choice we pick social class (two of the three dummies, leaving middle class as the reference category), religiosity, and a bunch of controls: gender, age, education and political information.
The function “mlogit()” (from the package of the same name) will do the work. The syntax is a bit different from that of R’s “glm()” function, but in an intuitive way. The intercept (the “1”) is specified explicitly (while in “glm()” it is included by default), and we need to add two more arguments: “shape”, which tells the function how the data are organized (“wide” here, i.e. one row per respondent), and “reflevel”, the choice category to use as the reference. We pick “1”, that is CDU/CSU: after all, it’s the most important party in Germany, so it makes sense to compare all the other parties to it.
mnl.1 <- mlogit(vote_int ~ 1 | age + female + educ + polinfo +
workclass + upclass +
relig,
data = data, shape = "wide", reflevel = "1")
summary(mnl.1)
##
## Call:
## mlogit(formula = vote_int ~ 1 | age + female + educ + polinfo +
## workclass + upclass + relig, data = data, reflevel = "1",
## shape = "wide", method = "nr", print.level = 0)
##
## Frequencies of alternatives:
## 1 2 3 4 5
## 0.424437 0.228296 0.118971 0.072347 0.155949
##
## nr method
## 16 iterations, 0h:0m:1s
## g'(-H)^-1g = 3.78E-07
## gradient close to zero
##
## Coefficients :
## Estimate Std. Error t-value Pr(>|t|)
## 2:(intercept) -0.5494873 0.6039645 -0.9098 0.3629277
## 3:(intercept) -2.8011235 0.8436574 -3.3202 0.0008995 ***
## 4:(intercept) -3.4008645 1.0762256 -3.1600 0.0015777 **
## 5:(intercept) -0.0175918 0.6545086 -0.0269 0.9785571
## 2:age 0.0085560 0.0067985 1.2585 0.2082086
## 3:age -0.0142864 0.0088833 -1.6082 0.1077849
## 4:age 0.0255630 0.0112477 2.2727 0.0230422 *
## 5:age -0.0119719 0.0074704 -1.6026 0.1090282
## 2:female 0.1361002 0.2290180 0.5943 0.5523267
## 3:female 0.8890121 0.2963157 3.0002 0.0026978 **
## 4:female -0.3204562 0.3807699 -0.8416 0.4000115
## 5:female -0.2210516 0.2622717 -0.8428 0.3993212
## 2:educ 0.0548316 0.2249891 0.2437 0.8074572
## 3:educ 0.8279904 0.3002898 2.7573 0.0058280 **
## 4:educ 0.9082849 0.3942995 2.3035 0.0212485 *
## 5:educ 0.0184443 0.2496423 0.0739 0.9411036
## 2:polinfo -0.0779493 0.0695449 -1.1208 0.2623520
## 3:polinfo 0.2084637 0.0942486 2.2118 0.0269771 *
## 4:polinfo -0.0159211 0.1166877 -0.1364 0.8914719
## 5:polinfo 0.0086176 0.0785360 0.1097 0.9126248
## 2:workclass 0.4276752 0.2938251 1.4555 0.1455189
## 3:workclass -0.7570396 0.6387980 -1.1851 0.2359778
## 4:workclass 0.8381831 0.4620890 1.8139 0.0696932 .
## 5:workclass -0.3083670 0.3973650 -0.7760 0.4377315
## 2:upclass 0.9718464 1.0281899 0.9452 0.3445562
## 3:upclass -14.8264715 2680.7259079 -0.0055 0.9955871
## 4:upclass 1.7267618 1.3263762 1.3019 0.1929628
## 5:upclass 0.9324537 1.0253184 0.9094 0.3631240
## 2:relig -0.0950517 0.0386427 -2.4598 0.0139031 *
## 3:relig -0.0765886 0.0492395 -1.5554 0.1198433
## 4:relig -0.3139166 0.0637395 -4.9250 0.0000008436 ***
## 5:relig -0.0576338 0.0444641 -1.2962 0.1949107
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -841.84
## McFadden R^2: 0.056196
## Likelihood ratio test : chisq = 100.25 (p.value = 0.00000000046142)
The output of “mlogit()” is not very handy (at least not as well organized as the output of a multinomial logit in Stata), but we can work with it a bit. Let’s extract the betas and standard errors for the different categories and tidy them up.
betas <- coef(mnl.1)
varcov <- vcov(mnl.1)
se <- sqrt(diag(varcov))
# Betas for each alternative (vs. the reference, CDU/CSU): the vector
# alternates over alternatives, so each alternative gets every 4th element
cf.2 <- betas[c(1, 5, 9, 13, 17, 21, 25, 29)] # SPD
cf.3 <- betas[c(2, 6, 10, 14, 18, 22, 26, 30)] # Greens
cf.4 <- betas[c(3, 7, 11, 15, 19, 23, 27, 31)] # Linke
cf.5 <- betas[c(4, 8, 12, 16, 20, 24, 28, 32)] # FDP
# Same indexing for the standard errors
se.2 <- se[c(1, 5, 9, 13, 17, 21, 25, 29)]
se.3 <- se[c(2, 6, 10, 14, 18, 22, 26, 30)]
se.4 <- se[c(3, 7, 11, 15, 19, 23, 27, 31)]
se.5 <- se[c(4, 8, 12, 16, 20, 24, 28, 32)]
tab <- xtable(data.frame(SPD.beta = cf.2, SPD.se = se.2, Greens.beta = cf.3, Greens.se = se.3,
Linke.beta = cf.4, Linke.se = se.4, FDP.beta = cf.5, FDP.se = se.5), digits = 3)
print.xtable(tab, type = "html")
| | SPD.beta | SPD.se | Greens.beta | Greens.se | Linke.beta | Linke.se | FDP.beta | FDP.se |
|---|---|---|---|---|---|---|---|---|
| 2:(intercept) | -0.549 | 0.604 | -2.801 | 0.844 | -3.401 | 1.076 | -0.018 | 0.655 |
| 2:age | 0.009 | 0.007 | -0.014 | 0.009 | 0.026 | 0.011 | -0.012 | 0.007 |
| 2:female | 0.136 | 0.229 | 0.889 | 0.296 | -0.320 | 0.381 | -0.221 | 0.262 |
| 2:educ | 0.055 | 0.225 | 0.828 | 0.300 | 0.908 | 0.394 | 0.018 | 0.250 |
| 2:polinfo | -0.078 | 0.070 | 0.208 | 0.094 | -0.016 | 0.117 | 0.009 | 0.079 |
| 2:workclass | 0.428 | 0.294 | -0.757 | 0.639 | 0.838 | 0.462 | -0.308 | 0.397 |
| 2:upclass | 0.972 | 1.028 | -14.826 | 2680.726 | 1.727 | 1.326 | 0.932 | 1.025 |
| 2:relig | -0.095 | 0.039 | -0.077 | 0.049 | -0.314 | 0.064 | -0.058 | 0.044 |
What do you read into this?
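As an aside, picking indices by hand (as above) is error-prone. Since the coefficient vector keeps the ordering shown in the summary (alternatives varying fastest within each variable), the same split can be done in one step. A minimal sketch, assuming that ordering:
# Reshape the coefficient vector into a matrix: one row per alternative
# (2 = SPD, 3 = Greens, 4 = Linke, 5 = FDP), one column per variable
cf.mat <- matrix(betas, nrow = 4,
                 dimnames = list(c("SPD", "Greens", "Linke", "FDP"),
                                 c("intercept", "age", "female", "educ",
                                   "polinfo", "workclass", "upclass", "relig")))
se.mat <- matrix(se, nrow = 4, dimnames = dimnames(cf.mat))
round(cf.mat, 3)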
Let’s simulate the predicted probabilities of voting for each party as a function of religiosity. First, let’s look at the coefficients and standard errors of “relig” for the four parties:
tab.relig <- data.frame(beta = betas[c(29:32)], se = se[c(29:32)])
row.names(tab.relig) <- c("SPD", "Greens", "Linke", "FDP")
print.xtable(xtable(tab.relig, digits = 3), type = "html")
| | beta | se |
|---|---|---|
| SPD | -0.095 | 0.039 |
| Greens | -0.077 | 0.049 |
| Linke | -0.314 | 0.064 |
| FDP | -0.058 | 0.044 |
For the SPD and, more clearly, the Linke, the slopes of “relig” are significantly different from the CDU’s. However, these results do not tell us much about the effect of religiosity on the probability to vote for the CDU itself. By simulating predicted probabilities, we will be able to see that as well.
Let’s set up the simulation.
set.seed(461982)
n <- 1000
# Draw n coefficient vectors from the estimated sampling distribution
sim.pred <- mvrnorm(n, betas, varcov)
Extract the coefficient names for each alternative (we will use them to index the simulated draws).
c.n.2 <- names(betas)[c(1, 5, 9, 13, 17, 21, 25, 29)]
c.n.3 <- names(betas)[c(2, 6, 10, 14, 18, 22, 26, 30)]
c.n.4 <- names(betas)[c(3, 7, 11, 15, 19, 23, 27, 31)]
c.n.5 <- names(betas)[c(4, 8, 12, 16, 20, 24, 28, 32)]
Set the independent variables to some meaningful values, storing them in an object. The order must match that of the coefficients: age, female, educ, polinfo, workclass, upclass (the intercept and “relig” are added separately inside the loop).
means <- c(mean(data$age, na.rm = T), 1, mean(data$educ, na.rm = T),
mean(data$polinfo, na.rm = T), 0, 0) # a middle-class woman with average age, education and information
x.range <- seq(min(data$relig, na.rm = T), max(data$relig, na.rm = T), length.out = 10)
Create empty matrices for the predictions.
res.1 <- matrix(NA, nrow = length(x.range), ncol = 4)
res.2 <- matrix(NA, nrow = length(x.range), ncol = 4)
res.3 <- matrix(NA, nrow = length(x.range), ncol = 4)
res.4 <- matrix(NA, nrow = length(x.range), ncol = 4)
res.5 <- matrix(NA, nrow = length(x.range), ncol = 4)
Run the loop.
for(i in 1:length(x.range)){
a <- x.range[i]
# Generate vectors of predictions on the linear predictor
b.2 <- sim.pred[, c.n.2]%*%c(1, means,a)
b.3 <- sim.pred[, c.n.3]%*%c(1, means,a)
b.4 <- sim.pred[, c.n.4]%*%c(1, means,a)
b.5 <- sim.pred[, c.n.5]%*%c(1, means,a)
# Generate vectors of probabilities (notice that category 1 is the baseline)
p.1 <- 1/(1 + exp(b.2) + exp(b.3) + exp(b.4) + exp(b.5))
p.2 <- exp(b.2)/(1 + exp(b.2) + exp(b.3) + exp(b.4) + exp(b.5))
p.3 <- exp(b.3)/(1 + exp(b.2) + exp(b.3) + exp(b.4) + exp(b.5))
p.4 <- exp(b.4)/(1 + exp(b.2) + exp(b.3) + exp(b.4) + exp(b.5))
p.5 <- exp(b.5)/(1 + exp(b.2) + exp(b.3) + exp(b.4) + exp(b.5))
# Put means and CIs in 4 matrices
res.1[i, ] <- c(a,
mean(p.1),
quantile(p.1, p = 0.025),
quantile(p.1, p = 0.975))
res.2[i, ] <- c(a,
mean(p.2),
quantile(p.2, p = 0.025),
quantile(p.2, p = 0.975))
res.3[i, ] <- c(a,
mean(p.3),
quantile(p.3, p = 0.025),
quantile(p.3, p = 0.975))
res.4[i, ] <- c(a,
mean(p.4),
quantile(p.4, p = 0.025),
quantile(p.4, p = 0.975))
res.5[i, ] <- c(a,
mean(p.5),
quantile(p.5, p = 0.025),
quantile(p.5, p = 0.975))
}
data.sim <- data.frame(rbind(res.1, res.2, res.3, res.4, res.5))
names(data.sim) <- c("X", "Y", "LO", "HI")
data.sim$GR <- rep(c("1. CDU", "2. SPD", "3. Greens", "4. Linke", "5. FDP"), each = length(x.range))
Let’s now see what we just did.
ggplot(data.sim, aes(x = X, y = Y, group = GR)) +
geom_line() +
geom_ribbon(aes(ymin = LO, ymax = HI), fill = "gray70", alpha = 0.5) +
facet_wrap(~ GR, ncol = 3) +
scale_y_continuous(limits = c(0,1)) +
scale_x_continuous(breaks = seq(0, 10, by = 1)) +
xlab("Religiosity") +
ylab("Probability to vote for party") +
theme_bw()
Questions: what do you conclude about the effect of religiosity on the probability of voting for each party? For which parties does it make the biggest difference?
Suppose we want to study people’s engagement with the European Elections campaign in Belgium. Specifically, our research question is: does online mobilization have an effect on psychological engagement with the political campaign, even after controlling for a variety of socio-demographic predictors and, most importantly, for offline mobilization?
BE.data <- read.spss(file="~/Dropbox/Teaching/GLM_ECPR_2017/Lab/Day4/EES_2009_BE.sav",
to.data.frame=TRUE, use.value.labels=FALSE)
head(BE.data)
## age male employed union eng_tri edu_rec off_mob onl_mob
## 1 -1 1 1 0 1 1 NA NA
## 2 23 0 0 1 2 2 0 0
## 3 31 1 0 0 1 -5 0 0
## 4 30 0 0 0 1 1 0 0
## 5 9 0 1 1 2 6 1 1
## 6 NA 0 NA 0 1 NA NA NA
The variables in the dataset are: “age” (age), “male” (a gender dummy), “employed” (an employment dummy), “union” (union membership), “eng_tri” (psychological engagement with the campaign, in three categories: 0 = low, 1 = middle, 2 = high), “edu_rec” (education, recoded), “off_mob” (offline mobilization) and “onl_mob” (online mobilization).
Let’s see how “eng_tri” is distributed:
BE.data <- BE.data[!is.na(BE.data$eng_tri), ] # Kill the missing cases
BE.data$eng_tri <- as.factor(BE.data$eng_tri) # Convert it into a factor (required by R)
round(table(BE.data$eng_tri)/length(BE.data$eng_tri), digits = 2)
##
## 0 1 2
## 0.27 0.55 0.18
ggplot(BE.data, aes(x = eng_tri)) + geom_bar(col = "black", fill = "brown") + theme_bw()
OLS can’t really work here (the outcome takes only three ordered values), and binary logit does not apply either. How does engagement covary with the two variables indicating the two types of mobilization?
round(prop.table(table(BE.data$onl_mob, BE.data$eng_tri), 1), digits = 2)
##
## 0 1 2
## 0 0.29 0.54 0.17
## 1 0.20 0.51 0.30
round(prop.table(table(BE.data$off_mob, BE.data$eng_tri), 1), digits = 2)
##
## 0 1 2
## 0 0.30 0.54 0.16
## 1 0.15 0.54 0.31
Both cross-tabs offer descriptive evidence that online and offline mobilization have an effect on political engagement: mobilized respondents are noticeably more likely to be in the highest engagement category.
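As a rough check before any modeling, we can test whether engagement is independent of each mobilization variable. This is just a sketch: the chi-squared test ignores the ordering of the engagement categories.
# Chi-squared tests of independence (these ignore that eng_tri is ordinal)
chisq.test(table(BE.data$onl_mob, BE.data$eng_tri))
chisq.test(table(BE.data$off_mob, BE.data$eng_tri))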
Ordinal logistic regression can be run with the “polr()” command from the “MASS” package. Note that there is also a package called “ordinal” that deals specifically with ordinal models. We will use “polr()” here, but if you need to specify more complex ordinal models (e.g. multilevel models), the “ordinal” package is the way to go. Before doing anything substantive, let’s get one thing straight: how do we interpret the estimated thresholds? To get an idea, let’s run an ordinal logit model without predictors:
ord.0 <- polr(eng_tri ~ 1,
method = "logistic", Hess = T,
data = BE.data)
summary(ord.0)
## Call:
## polr(formula = eng_tri ~ 1, data = BE.data, Hess = T, method = "logistic")
##
## No coefficients
##
## Intercepts:
## Value Std. Error t value
## 0|1 -0.9983 0.0717 -13.9195
## 1|2 1.5361 0.0833 18.4328
##
## Residual Deviance: 1950.707
## AIC: 1954.707
In “polr” objects, the threshold values are stored in a vector called “zeta”. Let’s extract and exponentiate them:
thr <- ord.0$zeta
exp(thr)
## 0|1 1|2
## 0.3684939 4.6462619
To understand what the threshold values represent, let’s calculate the cumulative odds of ending up in category 0 (versus 1 and 2) and in categories 0 or 1 (versus category 2):
table(BE.data$eng_tri)
##
## 0 1 2
## 266 547 175
# Odds of y = 0 (vs. y = 1 or 2)
(266)/(547+175)
## [1] 0.3684211
# Odds of y = 0 or 1 (vs. y = 2)
(266+547)/(175)
## [1] 4.645714
This should clarify that, in ordinal logit models, the threshold values are the cumulative log odds of observing y at or below the threshold rather than above it (when all predictors are 0 - which doesn’t matter here, since we have no predictors). We can also convert them into probabilities:
# What is the probability (when all predictors are 0) to end up in category 0?
exp(thr[1])/(1 + exp(thr[1]))
## 0|1
## 0.2692697
# What is the probability to end up in category 0 OR 1?
exp(thr[2])/(1 + exp(thr[2]))
## 1|2
## 0.8228917
# What is the probability to end up exactly in category 1?
exp(thr[2])/(1 + exp(thr[2])) - exp(thr[1])/(1 + exp(thr[1]))
## 1|2
## 0.553622
# What is the probability to end up in category 2?
1 - exp(thr[2])/(1 + exp(thr[2]))
## 1|2
## 0.1771083
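As a side note, R’s built-in “plogis()” function is exactly this inverse-logit transformation, so the same probabilities can be computed more concisely:
# plogis(q) is exp(q)/(1 + exp(q)), the inverse logit
plogis(thr[1]) # P(y = 0)
plogis(thr[2]) - plogis(thr[1]) # P(y = 1)
1 - plogis(thr[2]) # P(y = 2)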
Let’s now get down to business and run an ordinal logit model with just the socio-demographic indicators:
ord.1 <- polr(eng_tri ~ age + union + male + edu_rec + employed,
method = "logistic", Hess = T,
data = BE.data)
summary(ord.1)
## Call:
## polr(formula = eng_tri ~ age + union + male + edu_rec + employed,
## data = BE.data, Hess = T, method = "logistic")
##
## Coefficients:
## Value Std. Error t value
## age 0.01322 0.003879 3.4084
## union -0.25230 0.141188 -1.7870
## male 0.35487 0.134962 2.6294
## edu_rec 0.10525 0.023335 4.5104
## employed 0.08211 0.140408 0.5848
##
## Intercepts:
## Value Std. Error t value
## 0|1 -0.8859 0.1248 -7.0985
## 1|2 1.7951 0.1384 12.9695
##
## Residual Deviance: 1657.993
## AIC: 1671.993
## (122 observations deleted due to missingness)
Let’s see the probability to end up in the 3 categories for men (vs. women). Note that “thr” still holds the thresholds from the intercept-only model “ord.0”; strictly speaking, predictions from “ord.1” should use its own thresholds (“ord.1$zeta”), so take the numbers below as illustrative.
c.m <- as.numeric(coef(ord.1)["male"])
# Probability to end up in category 0 when you are a man (and all other predictors are 0)
p.0.m <- exp(thr[1] - (1*c.m))/(1 + exp(thr[1] - (1*c.m)))
p.0.m
## 0|1
## 0.2053477
# Let's compare it with women
p.0.w <- exp(thr[1] - (0*c.m))/(1 + exp(thr[1] - (0*c.m)))
p.0.w
## 0|1
## 0.2692697
# Probability to end up exactly in category 1 when you are a man
p.1.m <- exp(thr[2] - (1*c.m))/(1 + exp(thr[2] - (1*c.m))) - exp(thr[1] - (1*c.m))/(1 + exp(thr[1] - (1*c.m)))
p.1.m
## 1|2
## 0.5598148
# Let's compare it with women
p.1.w <- exp(thr[2] - (0*c.m))/(1 + exp(thr[2] - (0*c.m))) - exp(thr[1] - (0*c.m))/(1 + exp(thr[1] - (0*c.m)))
p.1.w
## 1|2
## 0.553622
# Probability to end up exactly in category 2 when you are a man
p.2.m <- 1 - exp(thr[2] - (1*c.m))/(1 + exp(thr[2] - (1*c.m)))
p.2.m
## 1|2
## 0.2348375
# Let's compare it with women
p.2.w <- 1 - exp(thr[2] - (0*c.m))/(1 + exp(thr[2] - (0*c.m)))
p.2.w
## 1|2
## 0.1771083
Let’s plot these values:
prob.dat <- data.frame(cat = as.factor(rep(c(0, 1, 2), each = 2)),
p = c(p.0.m, p.0.w, p.1.m, p.1.w, p.2.m, p.2.w),
sex = rep(c("Men", "Women"), 3))
ggplot(prob.dat, aes(x = cat, y = p, group = sex)) +
geom_bar(aes(fill = sex), stat = "identity", position = "dodge") +
theme_bw()
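Alternatively, “predict()” can return these probabilities directly. Note that it uses the thresholds of “ord.1” itself, so the values will differ somewhat from the manual calculations above, which reused the “ord.0” thresholds stored in “thr”:
# Predicted probabilities for a man (first row) and a woman (second row),
# with all other predictors set to 0
newd <- data.frame(age = 0, union = 0, male = c(1, 0),
                   edu_rec = 0, employed = 0)
predict(ord.1, newdata = newd, type = "probs")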
Let’s focus on the original question, which concerned online and offline mobilization:
ord.2 <- polr(eng_tri ~ age + union + male + edu_rec + employed +
off_mob + onl_mob,
method = "logistic", Hess = T,
data = BE.data)
summary(ord.2)
## Call:
## polr(formula = eng_tri ~ age + union + male + edu_rec + employed +
## off_mob + onl_mob, data = BE.data, Hess = T, method = "logistic")
##
## Coefficients:
## Value Std. Error t value
## age 0.01049 0.004242 2.4730
## union -0.37587 0.151803 -2.4760
## male 0.37410 0.145315 2.5744
## edu_rec 0.09523 0.025902 3.6763
## employed -0.06165 0.151116 -0.4079
## off_mob 0.74097 0.187717 3.9473
## onl_mob 0.61689 0.214837 2.8714
##
## Intercepts:
## Value Std. Error t value
## 0|1 -0.7680 0.1381 -5.5620
## 1|2 1.8892 0.1556 12.1422
##
## Residual Deviance: 1423.613
## AIC: 1441.613
## (242 observations deleted due to missingness)
Surprisingly, there is an effect for both variables, even when they enter the model together. Let’s convert the coefficients into odds ratios and inspect their confidence intervals:
mat.res <- cbind(exp(ord.2$coefficients), exp(confint.default(ord.2)))
colnames(mat.res) <- c("OR", "Lower CI", "Upper CI")
rownames(mat.res) <- c("Age", "Union membership", "Male", "Education", "Employed", "Offline mobilization", "Online mobilization")
mat.res
## OR Lower CI Upper CI
## Age 1.0105460 1.0021788 1.0189832
## Union membership 0.6866921 0.5099734 0.9246484
## Male 1.4536883 1.0933989 1.9326979
## Education 1.0999065 1.0454606 1.1571880
## Employed 0.9402160 0.6991934 1.2643228
## Offline mobilization 2.0979702 1.4521606 3.0309862
## Online mobilization 1.8531480 1.2163000 2.8234461
Question: How do you interpret the coefficients?
These coefficients are called proportional odds ratios, and one interprets them similarly to the effects in a binary logistic regression. Since “onl_mob” is a dummy, we would say that going from 0 to 1 (i.e. from not being mobilized online to being mobilized), the odds of moving from “low engagement” to “middle” or “high” (or, from “low” and “middle” to “high engagement”) increase by a factor of 1.85.
What “proportional odds” refers to in “polr()” is a very important assumption of ordinal logistic (and ordinal probit) regression: the coefficient that describes the relationship between the lowest category (“low engagement”) and the two higher ones is the same as the coefficient that describes the relationship between the two lower categories and the highest one. This is what allows the model to estimate only one set of coefficients.
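A quick, informal way to probe this assumption is to fit one binary logit per cumulative split of the outcome and compare the coefficients across the two equations: if the assumption holds, each predictor should show roughly similar values in both. This is a minimal sketch of that check (a formal version is known as the Brant test):
# Informal check of the proportional odds assumption: one binary logit
# per cumulative split of eng_tri, then compare the coefficients
BE.data$eng.num <- as.numeric(as.character(BE.data$eng_tri))
split.1 <- glm(I(eng.num >= 1) ~ age + union + male + edu_rec + employed +
                 off_mob + onl_mob, family = binomial, data = BE.data)
split.2 <- glm(I(eng.num >= 2) ~ age + union + male + edu_rec + employed +
                 off_mob + onl_mob, family = binomial, data = BE.data)
round(cbind(split.1 = coef(split.1), split.2 = coef(split.2)), 3)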
Let’s plot the predicted probabilities of engagement as a function of “age”, using a different technique from the one we saw yesterday and earlier today.
First, let’s create a new data matrix where we set our predictors at interesting values (age in a sequence between minimum and maximum, and all the other predictors fixed at zero, which makes sense since most of them are dummies and education is centered around the middle):
sim.data <- cbind(age = seq(min(BE.data$age, na.rm=TRUE),
max(BE.data$age, na.rm=TRUE), length.out = 100),
union = rep(0, each = 100),
male = rep(0, each = 100),
edu_rec = rep(0, each = 100),
employed = rep(0, each = 100),
off_mob = rep(0, each = 100),
onl_mob = rep(0, each = 100))
head(sim.data)
## age union male edu_rec employed off_mob onl_mob
## [1,] -26.00000 0 0 0 0 0 0
## [2,] -25.27273 0 0 0 0 0 0
## [3,] -24.54545 0 0 0 0 0 0
## [4,] -23.81818 0 0 0 0 0 0
## [5,] -23.09091 0 0 0 0 0 0
## [6,] -22.36364 0 0 0 0 0 0
Second, we use the “predict()” function to obtain predicted probabilities given the model coefficients and the values of X that we have specified in “sim.data”. This function does something similar to what we did with the loop, the only difference being that the predicted probabilities are not “simulated” here but obtained analytically. Note, however, that calling “predict()” on a “polr()” model does not seem to offer any way to extract the standard errors; hence we won’t be able to add confidence intervals to our plot (a simulation-based workaround is sketched below).
predictions <- data.frame(predict(ord.2, sim.data, type = "prob"))
names(predictions) <- c("0", "1", "2")
predictions$age <- seq(18, 90, length.out = 100) # Remember the original range for age
head(predictions)
## 0 1 2 age
## 1 0.3786621 0.5181164 0.1032215 18.00000
## 2 0.3768687 0.5192014 0.1039299 18.72727
## 3 0.3750787 0.5202787 0.1046426 19.45455
## 4 0.3732920 0.5213484 0.1053596 20.18182
## 5 0.3715088 0.5224102 0.1060809 20.90909
## 6 0.3697291 0.5234643 0.1068066 21.63636
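That said, if we really wanted uncertainty estimates, the simulation approach we used for the multinomial model carries over. Here is a rough sketch for a single respondent profile; it assumes that “vcov(ord.2)” returns the parameters in the order shown in the summary (the seven slopes first, then the two thresholds):
# Simulate parameter draws and compute P(y = 0) for one profile
sim.par <- mvrnorm(1000, c(coef(ord.2), ord.2$zeta), vcov(ord.2))
b <- sim.par[, 1:7] # slope draws
z <- sim.par[, 8:9] # threshold draws
x <- c(mean(BE.data$age, na.rm = TRUE), 0, 0, 0, 0, 0, 0) # average age, all else 0
xb <- as.vector(b %*% x)
p.0 <- plogis(z[, 1] - xb) # P(y = 0), as in the manual calculations above
quantile(p.0, c(0.025, 0.5, 0.975)) # median and 95% interval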
To make a nice plot, we need to restructure the data a little. Specifically, we need the predictions for the three categories to be “stacked” on top of each other in a single variable (with an additional variable identifying which rows belong to which category). We can do this using the function “melt()” from the “reshape2” package (which we loaded at the beginning):
pred.melt <- melt(predictions, id.vars = "age")
head(pred.melt)
## age variable value
## 1 18.00000 0 0.3786621
## 2 18.72727 0 0.3768687
## 3 19.45455 0 0.3750787
## 4 20.18182 0 0.3732920
## 5 20.90909 0 0.3715088
## 6 21.63636 0 0.3697291
Finally, we can plot our data. We can make an “area plot”, since we do not have confidence intervals to show anyway:
ggplot(pred.melt, aes(x = age, y = value, group = variable)) +
geom_area(aes(fill = variable)) +
scale_fill_manual(values = c("gray80", "gray50", "gray20"), "Category") +
ylab("Probability") + xlab("age") +
theme_bw()
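Since the three probabilities sum to one at every value of age, the stacked areas always fill the whole panel. If you prefer to follow each category separately, a simple line plot of the same data works as well:
ggplot(pred.melt, aes(x = age, y = value, linetype = variable)) +
  geom_line() +
  scale_linetype_discrete("Category") +
  ylab("Probability") + xlab("age") +
  theme_bw()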