if (!require("pacman")) install.packages("pacman") # solo la primera vez
::p_load(lme4,
pacman
reghelper,
haven,
stargazer,# gráficos
ggplot2, # tablas de regresion (screenreg)
texreg, # manipulacion de datos
dplyr # paquetes a cargar )
Práctica 3. Estimación modelos multinivel con lmer en R
Correspondiente a la sesión del jueves, 3 de abril de 2025
En esta práctica:
- Correlación intra clase
- Modelos con efectos aleatorios
1. Cargar/instalar librerías
Para esta práctica utilizaremos la librería lme4
, en particular la función lmer
, así como también la librería haven
, para lectura de base de datos externa. Además stargazer
para visualizar outputs.
2. Leer datos base High School and Beyond (HSB)**
Como en la sesión anterior
= read_dta("http://www.stata-press.com/data/mlmus3/hsb.dta") mlm
“mlm”” es el nombre que le daremos a la base de datos “High School and Beyond”
Variables relevantes para ejercicios:
- Nivel 1:
- minority, an indicator for student ethnicity (1 = minority, 0 = other)
- female, an indicator for student gender (1 = female, 0 = male)
- ses, a standardized scale constructed from variables measuring parental education, occupation, and income
- mathach, a measure of mathematics achievement
- Nivel 2
- size (school enrollment)
- sector (1 = Catholic, 0 = public)
- pracad (proportion of students in the academic track)
- disclim (a scale measuring disciplinary climate)
- himnty (1 = more than 40% minority enrollment, 0 = less than 40%)
- mnses (mean of the SES values for the students in this school who are included in the level-1 file)
- Cluster variable: schoolid
Seleccionar variables
= mlm %>%
mlm select(minority,female,ses,mathach,size,sector,pracad,disclim,himinty,mnses,schoolid) %>%
as.data.frame()
names(mlm)
[1] "minority" "female" "ses" "mathach" "size" "sector"
[7] "pracad" "disclim" "himinty" "mnses" "schoolid"
#Tabla estadisticos descriptivos
stargazer(mlm,title="Estadísticos descriptivos", type = "text")
Estadísticos descriptivos
=================================================
Statistic N Mean St. Dev. Min Max
-------------------------------------------------
minority 7,185 0.275 0.446 0 1
female 7,185 0.528 0.499 0 1
ses 7,185 0.0001 0.779 -3.758 2.692
mathach 7,185 12.748 6.878 -2.832 24.993
size 7,185 1,056.862 604.172 100 2,713
sector 7,185 0.493 0.500 0 1
pracad 7,185 0.534 0.251 0.000 1.000
disclim 7,185 -0.132 0.944 -2.416 2.756
himinty 7,185 0.280 0.449 0 1
mnses 7,185 0.0001 0.414 -1.194 0.825
schoolid 7,185 5,277.898 2,499.578 1,224 9,586
-------------------------------------------------
3. Modelos
Modelo 0: Null model
= lmer(mathach ~ 1 + (1 | schoolid), data = mlm)
results_0 summary(results_0)
Linear mixed model fit by REML ['lmerMod']
Formula: mathach ~ 1 + (1 | schoolid)
Data: mlm
REML criterion at convergence: 47116.8
Scaled residuals:
Min 1Q Median 3Q Max
-3.0631 -0.7539 0.0267 0.7606 2.7426
Random effects:
Groups Name Variance Std.Dev.
schoolid (Intercept) 8.614 2.935
Residual 39.148 6.257
Number of obs: 7185, groups: schoolid, 160
Fixed effects:
Estimate Std. Error t value
(Intercept) 12.6370 0.2444 51.71
screenreg(results_0) # de library texreg
========================================
Model 1
----------------------------------------
(Intercept) 12.64 ***
(0.24)
----------------------------------------
AIC 47122.79
BIC 47143.43
Log Likelihood -23558.40
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 8.61
Var: Residual 39.15
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
En este modelo es posible identificar \(\gamma_{00}\) (intercept - fixed effects), así como también los componentes de la varianza:
- \(\tau_{00}\) (Var:id)
- \(\sigma^2\) (Var:Residual).
Con estos últimos coeficientes aleatorios es posible calcular la correlación intraclase (\(\rho\)).
Correlación intra-clase = ICC = \(\rho=\frac{\tau_{00}}{\tau_{00}+\sigma^2}\)
La correlación intraclase indica la proporción de la varianza de la variable dependiente que se asocia a la pertenencia a unidades de nivel 2
# Cálculo directo en R:
8.614/(8.614+39.148) # calculo ICC
[1] 0.1803526
# O de manera directa con funcion ICC de reghelper
::ICC(results_0) reghelper
[1] 0.1803518
En este ejemplo, el 18% de la varianza de la variable dependiente se asocia a la pertenencia a una escuela
Modelo 1: Predictores de nivel individual
= lmer(mathach ~ 1 + ses + female + (1 | schoolid), data = mlm)
results_1 screenreg(results_1, naive=TRUE)
========================================
Model 1
----------------------------------------
(Intercept) 13.28 ***
(0.20)
ses 2.36 ***
(0.11)
female -1.19 ***
(0.17)
----------------------------------------
AIC 46605.73
BIC 46640.13
Log Likelihood -23297.86
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 4.49
Var: Residual 36.81
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 2: Predictores nivel 2
= lmer(mathach ~ 1 + sector + mnses + (1 | schoolid), data = mlm)
results_2 screenreg(results_2)
========================================
Model 1
----------------------------------------
(Intercept) 12.13 ***
(0.20)
sector 1.23 ***
(0.31)
mnses 5.33 ***
(0.37)
----------------------------------------
AIC 46956.50
BIC 46990.90
Log Likelihood -23473.25
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.31
Var: Residual 39.16
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Modelo 3: Predictores individuales y grupales
= lmer(mathach ~ 1 + ses + female + sector + mnses + (1 | schoolid), data = mlm)
results_3 screenreg(results_3)
========================================
Model 1
----------------------------------------
(Intercept) 12.74 ***
(0.21)
ses 2.15 ***
(0.11)
female -1.20 ***
(0.16)
sector 1.25 ***
(0.30)
mnses 3.07 ***
(0.37)
----------------------------------------
AIC 46515.61
BIC 46563.77
Log Likelihood -23250.80
Num. obs. 7185
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.15
Var: Residual 36.80
========================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Comparación individual, agregado y multinivel
Generar regresiones para comparar
=lm(mathach ~ ses + female + mnses + sector, data=mlm)
reg_ind=mlm %>% group_by(schoolid) %>% summarise_all(funs(mean))
agg_mlm=lm(mathach ~ ses + female + mnses + sector, data=agg_mlm) reg_agg
Observar: ¿Qué sucede con los coeficientes y errores estándar cuando se comparan los coeficientes y los errores estándar?
screenreg(list(reg_ind, reg_agg, results_3))
=================================================================
Model 1 Model 2 Model 3
-----------------------------------------------------------------
(Intercept) 12.80 *** 13.13 *** 12.74 ***
(0.13) (0.35) (0.21)
ses 2.15 *** 5.19 *** 2.15 ***
(0.11) (0.37) (0.11)
female -1.34 *** -1.97 *** -1.20 ***
(0.15) (0.56) (0.16)
mnses 2.90 *** 3.07 ***
(0.22) (0.37)
sector 1.32 *** 1.25 *** 1.25 ***
(0.16) (0.31) (0.30)
-----------------------------------------------------------------
R^2 0.18 0.67
Adj. R^2 0.18 0.67
Num. obs. 7185 160 7185
AIC 46515.61
BIC 46563.77
Log Likelihood -23250.80
Num. groups: schoolid 160
Var: schoolid (Intercept) 2.15
Var: Residual 36.80
=================================================================
*** p < 0.001; ** p < 0.01; * p < 0.05
Generación de tabla para publicar en HTML
htmlreg(list(reg_ind, reg_agg, results_3),
custom.model.names = c("Individual","Agregado","Multinivel"),
custom.coef.names = c("Intercepto", "$SES_{ij}$","$Mujer_{ij}$", "$SES_{j}$", "$Sector_{j}$"),
custom.gof.names=c(NA,NA,NA,NA,NA,NA,NA,
"Var:id ($\\tau_{00}$)","Var: Residual ($\\sigma^2$)"),
custom.note = "%stars. Errores estándar en paréntesis",
caption="Comparación de modelos Individual, Agregado y Multinivel",
caption.above=TRUE,
doctype = FALSE)
Individual | Agregado | Multinivel | |
---|---|---|---|
Intercepto | 12.80*** | 13.13*** | 12.74*** |
(0.13) | (0.35) | (0.21) | |
\(SES_{ij}\) | 2.15*** | 5.19*** | 2.15*** |
(0.11) | (0.37) | (0.11) | |
\(Mujer_{ij}\) | -1.34*** | -1.97*** | -1.20*** |
(0.15) | (0.56) | (0.16) | |
\(SES_{j}\) | 2.90*** | 3.07*** | |
(0.22) | (0.37) | ||
\(Sector_{j}\) | 1.32*** | 1.25*** | 1.25*** |
(0.16) | (0.31) | (0.30) | |
R2 | 0.18 | 0.67 | |
Adj. R2 | 0.18 | 0.67 | |
Num. obs. | 7185 | 160 | 7185 |
AIC | 46515.61 | ||
BIC | 46563.77 | ||
Log Likelihood | -23250.80 | ||
Num. groups: schoolid | 160 | ||
Var:id (\(\tau_{00}\)) | 2.15 | ||
Var: Residual (\(\sigma^2\)) | 36.80 | ||
***p < 0.001; **p < 0.01; *p < 0.05. Errores estándar en paréntesis |