if (!require("pacman")) install.packages("pacman") # solo la primera vez
pacman::p_load(lme4,
reghelper,
haven,
stargazer,
ggplot2, # gráficos
texreg, # tablas de regresion (screenreg)
dplyr # manipulacion de datos
) # paquetes a cargarPráctica 3. Estimación modelos multinivel con lmer en R
Correspondiente a la sesión del jueves, 3 de abril de 2025
En esta práctica:
- Correlación intra clase
- Modelos con efectos aleatorios
1. Cargar/instalar librerías
Para esta práctica utilizaremos la librería lme4, en particular la función lmer, así como también la librería haven, para lectura de base de datos externa. Además stargazer para visualizar outputs.
2. Leer datos base High School and Beyond (HSB)**
Como en la sesión anterior
mlm = read_dta("http://www.stata-press.com/data/mlmus3/hsb.dta")“mlm”” es el nombre que le daremos a la base de datos “High School and Beyond”
Variables relevantes para ejercicios:
- Nivel 1:
- minority, an indicator for student ethnicity (1 = minority, 0 = other)
- female, an indicator for student gender (1 = female, 0 = male)
- ses, a standardized scale constructed from variables measuring parental education, occupation, and income
- mathach, a measure of mathematics achievement
- Nivel 2
- size (school enrollment)
- sector (1 = Catholic, 0 = public)
- pracad (proportion of students in the academic track)
- disclim (a scale measuring disciplinary climate)
- himnty (1 = more than 40% minority enrollment, 0 = less than 40%)
- mnses (mean of the SES values for the students in this school who are included in the level-1 file)
- Cluster variable: schoolid
Seleccionar variables
mlm = mlm %>%
select(minority,female,ses,mathach,size,sector,pracad,disclim,himinty,mnses,schoolid) %>%
as.data.frame()
names(mlm) [1] "minority" "female" "ses" "mathach" "size" "sector"
[7] "pracad" "disclim" "himinty" "mnses" "schoolid"
#Tabla estadisticos descriptivos
stargazer(mlm,title="Estadísticos descriptivos", type = "text")
Estadísticos descriptivos
=================================================
Statistic N Mean St. Dev. Min Max
-------------------------------------------------
minority 7,185 0.275 0.446 0 1
female 7,185 0.528 0.499 0 1
ses 7,185 0.0001 0.779 -3.758 2.692
mathach 7,185 12.748 6.878 -2.832 24.993
size 7,185 1,056.862 604.172 100 2,713
sector 7,185 0.493 0.500 0 1
pracad 7,185 0.534 0.251 0.000 1.000
disclim 7,185 -0.132 0.944 -2.416 2.756
himinty 7,185 0.280 0.449 0 1
mnses 7,185 0.0001 0.414 -1.194 0.825
schoolid 7,185 5,277.898 2,499.578 1,224 9,586
-------------------------------------------------
3. Modelos
Modelo 0: Null model
results_0 = lmer(mathach ~ 1 + (1 | schoolid), data = mlm)
summary(results_0)Linear mixed model fit by REML ['lmerMod']
Formula: mathach ~ 1 + (1 | schoolid)
Data: mlm
REML criterion at convergence: 47116.8
Scaled residuals:
Min 1Q Median 3Q Max
-3.0631 -0.7539 0.0267 0.7606 2.7426
Random effects:
Groups Name Variance Std.Dev.
schoolid (Intercept) 8.614 2.935
Residual 39.148 6.257
Number of obs: 7185, groups: schoolid, 160
Fixed effects:
Estimate Std. Error t value
(Intercept) 12.6370 0.2444 51.71
screenreg(results_0) # de library texreg
========================================
Model 1
----------------------------------------
(Intercept) 12.64 ***
(0.24)
----------------------------------------
AIC 47122.79
BIC 47143.43
Log Likelihood -23558.40
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 8.61
Var: Residual 39.15
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
En este modelo es posible identificar \(\gamma_{00}\) (intercept - fixed effects), así como también los componentes de la varianza:
- \(\tau_{00}\) (Var:id)
- \(\sigma^2\) (Var:Residual).
Con estos últimos coeficientes aleatorios es posible calcular la correlación intraclase (\(\rho\)).
Correlación intra-clase = ICC = \(\rho=\frac{\tau_{00}}{\tau_{00}+\sigma^2}\)
La correlación intraclase indica la proporción de la varianza de la variable dependiente que se asocia a la pertenencia a unidades de nivel 2
# Cálculo directo en R:
8.614/(8.614+39.148) # calculo ICC [1] 0.1803526
# O de manera directa con funcion ICC de reghelper
reghelper::ICC(results_0)[1] 0.1803518
En este ejemplo, el 18% de la varianza de la variable dependiente se asocia a la pertenencia a una escuela
Modelo 1: Predictores de nivel individual
results_1 = lmer(mathach ~ 1 + ses + female + (1 | schoolid), data = mlm)
screenreg(results_1, naive=TRUE)
========================================
Model 1
----------------------------------------
(Intercept) 13.28 ***
(0.20)
ses 2.36 ***
(0.11)
female -1.19 ***
(0.17)
----------------------------------------
AIC 46605.73
BIC 46640.13
Log Likelihood -23297.86
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 4.49
Var: Residual 36.81
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 2: Predictores nivel 2
results_2 = lmer(mathach ~ 1 + sector + mnses + (1 | schoolid), data = mlm)
screenreg(results_2)
========================================
Model 1
----------------------------------------
(Intercept) 12.13 ***
(0.20)
sector 1.23 ***
(0.31)
mnses 5.33 ***
(0.37)
----------------------------------------
AIC 46956.50
BIC 46990.90
Log Likelihood -23473.25
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.31
Var: Residual 39.16
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 3: Predictores individuales y grupales
results_3 = lmer(mathach ~ 1 + ses + female + sector + mnses + (1 | schoolid), data = mlm)
screenreg(results_3)
========================================
Model 1
----------------------------------------
(Intercept) 12.74 ***
(0.21)
ses 2.15 ***
(0.11)
female -1.20 ***
(0.16)
sector 1.25 ***
(0.30)
mnses 3.07 ***
(0.37)
----------------------------------------
AIC 46515.61
BIC 46563.77
Log Likelihood -23250.80
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.15
Var: Residual 36.80
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Comparación individual, agregado y multinivel
Generar regresiones para comparar
reg_ind=lm(mathach ~ ses + female + mnses + sector, data=mlm)
agg_mlm=mlm %>% group_by(schoolid) %>% summarise_all(funs(mean))
reg_agg=lm(mathach ~ ses + female + mnses + sector, data=agg_mlm)Observar: ¿Qué sucede con los coeficientes y errores estándar cuando se comparan los coeficientes y los errores estándar?
screenreg(list(reg_ind, reg_agg, results_3))
=================================================================
Model 1 Model 2 Model 3
-----------------------------------------------------------------
(Intercept) 12.80 *** 13.13 *** 12.74 ***
(0.13) (0.35) (0.21)
ses 2.15 *** 5.19 *** 2.15 ***
(0.11) (0.37) (0.11)
female -1.34 *** -1.97 *** -1.20 ***
(0.15) (0.56) (0.16)
mnses 2.90 *** 3.07 ***
(0.22) (0.37)
sector 1.32 *** 1.25 *** 1.25 ***
(0.16) (0.31) (0.30)
-----------------------------------------------------------------
R^2 0.18 0.67
Adj. R^2 0.18 0.67
Num. obs. 7185 160 7185
AIC 46515.61
BIC 46563.77
Log Likelihood -23250.80
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.15
Var: Residual 36.80
=================================================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Generación de tabla para publicar en HTML
htmlreg(list(reg_ind, reg_agg, results_3),
custom.model.names = c("Individual","Agregado","Multinivel"),
custom.coef.names = c("Intercepto", "$SES_{ij}$","$Mujer_{ij}$", "$SES_{j}$", "$Sector_{j}$"),
custom.gof.names=c(NA,NA,NA,NA,NA,NA,NA,
"Var:id ($\\tau_{00}$)","Var: Residual ($\\sigma^2$)"),
custom.note = "%stars. Errores estándar en paréntesis",
caption="Comparación de modelos Individual, Agregado y Multinivel",
caption.above=TRUE,
doctype = FALSE)| Individual | Agregado | Multinivel | |
|---|---|---|---|
| Intercepto | 12.80*** | 13.13*** | 12.74*** |
| (0.13) | (0.35) | (0.21) | |
| \(SES_{ij}\) | 2.15*** | 5.19*** | 2.15*** |
| (0.11) | (0.37) | (0.11) | |
| \(Mujer_{ij}\) | -1.34*** | -1.97*** | -1.20*** |
| (0.15) | (0.56) | (0.16) | |
| \(SES_{j}\) | 2.90*** | 3.07*** | |
| (0.22) | (0.37) | ||
| \(Sector_{j}\) | 1.32*** | 1.25*** | 1.25*** |
| (0.16) | (0.31) | (0.30) | |
| R2 | 0.18 | 0.67 | |
| Adj. R2 | 0.18 | 0.67 | |
| Num. obs. | 7185 | 160 | 7185 |
| AIC | 46515.61 | ||
| BIC | 46563.77 | ||
| Log Likelihood | -23250.80 | ||
| Num. groups: schoolid | 160 | ||
| Var:id (\(\tau_{00}\)) | 2.15 | ||
| Var: Residual (\(\sigma^2\)) | 36.80 | ||
| ***p < 0.001; **p < 0.01; *p < 0.05. Errores estándar en paréntesis | |||