---
title: 'CSSS/POLS 510 Maximum Likelihood Estimation: Lab 4'
editor_options:
  chunk_output_type: console
date: '2020-10-30'
output:
  beamer_presentation:
    colortheme: seagull
    fonttheme: structurebold
    keep_tex: yes
    theme: Szeged
  slidy_presentation: default
fontsize: 12pt
subtitle: Heteroskedastic Normal & Logit model
author: Kenya Amano
---

# Mid-quarter Evaluation

Your feedback is valuable: [\underline{Evaluation URL}](https://uw.iasystem.org/survey/230028)

# Agenda

1. Heteroskedastic Normal Example: Simulation
2. Logit model: Voting example
3. Homework: Recap HW2, Question HW3

# 1. Heteroskedastic Normal Example

+ Steps
    + The full `R` code can be found [\underline{here from Chris' website}](https://faculty.washington.edu/cadolph/mle/HeteroskedMLE.r)

1.1 Generate Data

1.2 Fit OLS - `lm()`

1.3 Fit MLE - `optim()`

1.4 Calculate quantities of interest

  - Use `predict()`
  - Use simulation

# 1.1 Recap

\scriptsize
```{r, warning=F, message=F}
rm(list=ls())     # Clear memory
set.seed(123456)  # For reproducible random numbers

library(MASS)     # Load packages
library(simcf)
library(tidyverse)

n <- 1500         # Generate 1500 observations

w0 <- rep(1, n)   # Create the constant
w1 <- runif(n)    # Create two covariates
w2 <- runif(n)

x <- cbind(w0, w1, w2)  # Create a matrix of the covariates
z <- x                  # i.e., same covariates affect mu and sigma

beta <- c(0, 5, 15)     # Set a parameter vector for the mean
                        # One for constant, one for covariate 1,
                        # one for covariate 2.

gamma <- c(1, 0, 3)     # Set a parameter vector for the variance
                        # Gamma estimate for covariate 2 is set to be 3,
                        # which creates heteroskedasticity
```

# 1.1 Recap

\scriptsize
```{r}
mu <- x %*% beta            # Create systematic component for the mean

sigma2 <- exp(z %*% gamma)  # Create systematic component for the variance
                            # Since ith row of sigma2 = exp(1 + 0 + w2_i * 3),
                            # it is a function of w2 (heteroskedastic)

y <- rnorm(n = n,           # Create the outcome variable
           mean = mu,       # Think about the stochastic component!
           sd = sqrt(sigma2)
           )

data <- cbind(y, w1, w2)    # Save the data to a data frame
data <- as.data.frame(data)
names(data) <- c("y", "w1", "w2")
```

# 1.1 Recap

\scriptsize
```{r}
ls.result <- lm(y ~ w1 + w2, data = data)  # Fit a linear model
                                           # using simulated data

ls.aic <- AIC(ls.result)   # Calculate and print the AIC,
                           # i.e. the Akaike Information Criterion,
                           # to assess goodness of fit; lower AIC is better
```

# 1.1 Recap

\scriptsize
```{r}
# A likelihood function for ML heteroskedastic Normal
llk.hetnormlin <- function(param, y, x, z) {
  x <- as.matrix(x)      # x (some covariates) as a matrix
  z <- as.matrix(z)      # z (some covariates) as a matrix
  os <- rep(1, nrow(x))  # Set the intercept as 1 (constant)
  x <- cbind(os, x)      # Add intercept to covariates x
  z <- cbind(os, z)      # Add intercept to covariates z

  b <- param[1 : ncol(x)]                          # Parameters for x
  g <- param[(ncol(x) + 1) : (ncol(x) + ncol(z))]  # Parameters for z

  xb <- x %*% b          # Systematic components for mean
  s2 <- exp(z %*% g)     # Systematic components for variance

  sum(0.5 * (log(s2) + (y - xb)^2 / s2))  # Likelihood we want to maximize
                                          # optim is a minimizer by default
                                          # To maximize lnL is to minimize -lnL,
                                          # so the +/- signs are reversed
}
```
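# 1.1 Recap

Before handing the likelihood to `optim()`, it can help to check that it evaluates sensibly. This quick check is not part of the original lab code; it simply evaluates the negative log-likelihood at the true parameters and at an arbitrary guess, using the simulated `y`, `w1`, and `w2` from above.

\scriptsize
```{r}
# Sanity check (sketch, not in the original lab code):
# the value should be smaller (better) at the true parameters
llk.hetnormlin(c(beta, gamma), y = y, x = cbind(w1, w2), z = cbind(w1, w2))
llk.hetnormlin(rep(0, 6),      y = y, x = cbind(w1, w2), z = cbind(w1, w2))
```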
# 1.1 Recap

\scriptsize
```{r}
# Create input matrices
xcovariates <- cbind(w1, w2)
zcovariates <- cbind(w1, w2)

# Initial guesses of beta0, beta1, ..., gamma0, gamma1, ...
# We need one entry per parameter, in order!
# Note: also include beta and gamma estimates for the constants
stval <- c(0, 0, 0, 0, 0, 0)

# Run ML, and get the output we need
hetnorm.result <- optim(stval,            # Initial guesses
                        llk.hetnormlin,   # Likelihood function
                        method = "BFGS",  # Gradient method
                        hessian = TRUE,   # Return Hessian matrix
                        y = y,            # Outcome variable
                        x = xcovariates,  # Covariates x (w/o constant)
                        z = zcovariates   # Covariates z (w/o constant)
                        )
```

Note: By default, `optim()` is a minimizer. You can make it a maximizer by adding `control = list(fnscale = -1)`

# 1.1 Recap

\scriptsize
```{r}
pe <- hetnorm.result$par             # Point estimates
round(pe, 3)

vc <- solve(hetnorm.result$hessian)  # Var-cov matrix (for computing s.e.)
round(vc, 5)

se <- sqrt(diag(vc))                 # To compute standard errors (s.e.),
                                     # take the diagonal of the var-cov matrix
                                     # (the inverse Hessian), then the square root
round(se, 3)
```

# 1.1 Recap

\scriptsize
```{r}
mle.result <- cbind(pe[1:3], se[1:3])  # Report ML results
colnames(mle.result) <- c("Estimate", "Std.Error")
rownames(mle.result) <- c("(Intercept)", "w1", "w2")
round(mle.result, 3)

round(coef(summary(ls.result))[, c(1, 2)], 3)  # Compare with lm() results

ll <- -hetnorm.result$value  # Report the log likelihood at the maximum
                             # No need for the negative sign
                             # if optim is set as a maximizer
ll
```

# 1.1 Recap

\scriptsize
```{r}
# AIC is 2 * number of parameters - 2 * ll (i.e. the log likelihood at the maximum)
hetnorm.aic <- 2 * length(stval) - 2 * ll

# Compare AIC from ML and from lm(); lower is better
print(hetnorm.aic)
print(ls.aic)
```

# 1.2 Recap2: Predict QoI

**Motivation**: We want to study how a change in a particular explanatory variable affects the outcome variable, *all else being equal*

# 1.2 Recap2: Predict QoI

+ Scenario 1: Vary covariate 1; hold covariate 2 constant

1. Create a data frame with a set of hypothetical scenarios for covariate 1, while keeping covariate 2 at its mean
    + What is a sensible range of hypothetical scenarios for covariate 1? Consider the original range of `w1`.
2. Calculate the predicted values using the `predict()` function
    + Hint: you need at least the following arguments: `predict(object = ... , newdata = ... , interval = ... , level = ...)`
3. Plot the prediction intervals

# 1.2 Recap2: Predict QoI

4. Similarly, calculate the confidence intervals using the `predict()` function

5. Plot the confidence intervals; compare them with the prediction intervals (a single-scenario sketch follows on the next slide)
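# 1.2 Recap2: Predict QoI

Before running the full set of scenarios, here is a minimal sketch (not part of the original lab code) comparing the two interval types for a single hypothetical scenario, using `ls.result` and the simulated data from above. The prediction interval adds fundamental uncertainty on top of estimation uncertainty, so it is noticeably wider.

\scriptsize
```{r}
# One illustrative scenario: w1 = 0.5, w2 held at its mean (sketch)
xhypo.one <- data.frame(w1 = 0.5, w2 = mean(w2))

predict(ls.result, newdata = xhypo.one, interval = "confidence", level = 0.95)
predict(ls.result, newdata = xhypo.one, interval = "prediction", level = 0.95)
```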
# 1.2 Recap2: Predict QoI

\scriptsize
```{r}
# Set up
w1range <- seq(from = 0, to = 1, by = 0.05)   # Set up hypothetical values for w1
w2mean <- mean(w2)                            # Set up the mean of w2

xhypo <- crossing(w1 = w1range, w2 = w2mean)  # Create a new dataset using crossing()

# Calculate predicted values
simPI.w1 <- predict(ls.result,                # A model object
                    newdata = xhypo,          # New dataset
                    interval = "prediction",  # What kind of intervals?
                    level = 0.95)             # What level of confidence?
```

# 1.2 Recap2: Predict QoI

\scriptsize
```{r}
simPI.w1 <- simPI.w1 %>%
  as_tibble() %>%          # Coerce it into a tibble
  bind_cols(w1 = w1range)  # Combine hypo. w1 with predicted y

# Calculate confidence intervals using predict()
simCI.w1 <- predict(ls.result,
                    newdata = xhypo,
                    interval = "confidence",
                    level = 0.95)

simCI.w1 <- simCI.w1 %>%
  as_tibble() %>%
  bind_cols(w1 = w1range)
```

# 1.2 Recap2: Predict QoI

\scriptsize
```{r}
# ggplot2
theme_set(theme_classic())

simALL.w1 <- bind_rows(
  simPI.w1 %>% mutate(type = "PI"),
  simCI.w1 %>% mutate(type = "CI")
)
```

# 1.2 Recap2: Predict QoI

\scriptsize
```{r, fig.asp = 0.618, out.width = '60%', fig.align = "center"}
# Plot confidence intervals and prediction intervals side by side
ggplot(simALL.w1, aes(x = w1, y = fit, ymax = upr, ymin = lwr)) +
  geom_line() +
  geom_ribbon(alpha = 0.1) +
  labs(y = "Predicted Y", x = "Covariate 1") +
  facet_grid(~ type)
```

# 1.2 Recap2: Predict QoI

Scenario 2: Vary covariate 2; hold covariate 1 constant

\scriptsize
```{r}
w2range <- seq(from = 0, to = 1, by = 0.05)   # Set up hypothetical values for w2
w1mean <- mean(w1)                            # Set up the mean of w1

xhypo <- crossing(w1 = w1mean,                # Create a new dataset
                  w2 = w2range)               # crossing() is from the tidyr package

# Calculate predicted values
simPI.w2 <- predict(ls.result,                # A model object
                    newdata = xhypo,          # New dataset
                    interval = "prediction",  # What kind of intervals?
                    level = 0.95)             # What level of confidence?

simPI.w2 <- simPI.w2 %>%
  as_tibble() %>%
  bind_cols(w2 = w2range)
```
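# 1.2 Recap2: Predict QoI

As a quick visual check (a sketch, not in the original lab code), the Scenario 2 prediction intervals from `predict()` can be plotted the same way as Scenario 1. Because `lm()` assumes constant error variance, this band stays roughly constant in width; the simulation-based intervals later in the lab, which model the variance, should instead widen with `w2`.

\scriptsize
```{r, fig.asp = 0.618, out.width = '60%', fig.align = "center"}
# Plot the lm()-based prediction intervals for hypothetical w2 (sketch)
ggplot(simPI.w2, aes(x = w2, y = fit, ymax = upr, ymin = lwr)) +
  geom_line() +
  geom_ribbon(alpha = 0.1) +
  labs(y = "Predicted Y", x = "Covariate 2")
```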
# 1.3 Simulating QoI using `simcf`

**Motivation**: Can we use simulation methods to produce the same prediction and confidence intervals?

Recall the lecture: we can draw a bunch of $\tilde{\beta}$ from a multivariate normal distribution

Demonstration:

+ Scenario 1: Vary covariate 1; hold covariate 2 constant

1. Create a data frame with a set of hypothetical scenarios for covariate 1 while keeping covariate 2 at its mean
2. Simulate the predicted values using `MASS` and `simcf`
3. Plot the results

# 1.3 Simulating QoI using `simcf`

In order to use `simcf` to generate quantities of interest, the following steps are needed:

1. Estimate: MLE $\hat{\beta}$ and its variance $\hat{V}(\hat{\beta})$

2. Simulate estimation uncertainty from a multivariate normal distribution:\
Draw $\tilde{\beta} \sim MVN \big[\hat{\beta}, \hat{V}(\hat{\beta})\big]$

3. Create hypothetical scenarios of your substantive interest:\
Choose values of X: $X_c$

# 1.3 Simulating QoI using `simcf`

4. Calculate expected values:\
$\tilde{\mu_c} = g(X_c, \tilde{\beta})$

5. Simulate fundamental uncertainty:\
$\tilde{y_c} \sim f(\tilde{\mu_c}, \tilde{\alpha})$\
or

5. Compute EVs, First Differences, or Relative Risks\
EV: $\mathbb{E}(y|X_{c1})$\
FD: $\mathbb{E}(y|X_{c2}) - \mathbb{E}(y|X_{c1})$\
RR: $\frac{\mathbb{E}(y|X_{c2})}{\mathbb{E}(y|X_{c1})}$

# 1.3 Simulating QoI using `simcf`

In order to use `simcf` to generate quantities of interest, the following steps are needed:

1. Estimate: MLE $\hat{\beta}$ and its variance $\hat{V}(\hat{\beta})$\
$\textcolor{red}{\rightarrow \texttt{optim(), glm()}}$

2. Simulate estimation uncertainty from a multivariate normal distribution:\
Draw $\tilde{\beta} \sim MVN \big[\hat{\beta}, \hat{V}(\hat{\beta})\big]$\
$\textcolor{red}{\rightarrow \texttt{MASS::mvrnorm()}}$

3. Create hypothetical scenarios of your substantive interest:\
Choose values of X: $X_c$\
$\textcolor{red}{\rightarrow \texttt{simcf::cfMake(), cfChange()} \dots}$

# 1.3 Simulating QoI using `simcf`

4. Calculate expected values:\
$\tilde{\mu_c} = g(X_c, \tilde{\beta})$

5. Simulate fundamental uncertainty:\
$\tilde{y_c} \sim f(\tilde{\mu_c}, \tilde{\alpha})$\
$\textcolor{red}{\rightarrow \texttt{simcf::hetnormsimpv()} \dots}$\
or

5. Compute EVs, First Differences, or Relative Risks\
EV: $\mathbb{E}(y|X_{c1})$\
$\textcolor{red}{\rightarrow \texttt{simcf::logitsimev()} \dots}$\
FD: $\mathbb{E}(y|X_{c2}) - \mathbb{E}(y|X_{c1})$\
$\textcolor{red}{\rightarrow \texttt{simcf::logitsimfd()} \dots}$\
RR: $\frac{\mathbb{E}(y|X_{c2})}{\mathbb{E}(y|X_{c1})}$\
$\textcolor{red}{\rightarrow \texttt{simcf::logitsimrr()} \dots}$

# 1.3 Simulating QoI using `simcf`

Recall that our likelihood function is:

\scriptsize
$$
\begin{aligned}
\mathrm{P}\left(\mathbf{y} | \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}\right) &=\prod_{i=1}^{n} f_{\mathcal{N}}\left(y_{i} | \mu_{i}, \sigma_{i}^{2}\right) & \scriptsize[\text{Joint probability}] \\
\mathrm{P}\left(\mathbf{y} | \boldsymbol{\mu}, \boldsymbol{\sigma}^{2}\right) &=\prod_{i=1}^{n}\left(2 \pi \sigma_{i}^{2}\right)^{-1 / 2} \exp \left[\frac{-\left(y_{i}-\mu_{i}\right)^{2}}{2 \sigma_{i}^{2}}\right] & \scriptsize[\text{Expressed with the Normal distribution}]\\
\log \mathcal{L}\left(\boldsymbol{\beta}, \boldsymbol{\sigma}^{2} | \mathbf{y}\right) & \propto-\frac{1}{2} \sum_{i=1}^{n} \log \sigma_{i}^{2}-\frac{1}{2} \sum_{i=1}^{n} \frac{\left(y_{i}-\mathbf{x}_{i} \boldsymbol{\beta}\right)^{2}}{\sigma_{i}^{2}} & \scriptsize[\text{Converted to the log likelihood; simplified}] \\
\log \mathcal{L}(\boldsymbol{\beta}, \boldsymbol{\gamma} | \mathbf{y}) & \propto-\frac{1}{2} \sum_{i=1}^{n} \mathbf{z}_{i} \boldsymbol{\gamma}-\frac{1}{2} \sum_{i=1}^{n} \frac{\left(y_{i}-\mathbf{x}_{i} \boldsymbol{\beta}\right)^{2}}{\exp \left(\mathbf{z}_{i} \boldsymbol{\gamma}\right)} & \scriptsize[\text{Substitute in the systematic components}]
\end{aligned}
$$

# 1.3 Simulating QoI using `simcf`

\scriptsize
```{r}
# Draw parameters from the model's predictive distribution
sims <- 10000
simparam <- mvrnorm(n = sims,   # Multivariate Normal draws
                    mu = pe,    # with mean = pe (from ML) and
                    Sigma = vc) # var-covar matrix = vc (from ML)
head(simparam)                  # Inspect
```
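# 1.3 Simulating QoI using `simcf`

A quick check (a sketch, not in the original lab code): with this many draws, the means and standard deviations of the simulated parameters should closely reproduce the ML point estimates `pe` and standard errors `se` computed earlier.

\scriptsize
```{r}
# Compare the simulated draws with the ML estimates (sketch)
round(rbind(ML = pe, sims = colMeans(simparam)), 3)      # point estimates vs. means of draws
round(rbind(ML = se, sims = apply(simparam, 2, sd)), 3)  # s.e. vs. sd of draws
```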
# 1.3 Simulating QoI using `simcf` :S1

\scriptsize
```{r}
# Separate into the simulated betas and simulated gammas
simbetas  <- simparam[ , 1:(ncol(xcovariates) + 1)]
simgammas <- simparam[ , (ncol(simbetas) + 1):ncol(simparam)]

# Specify the model formula
varmodel <- (y ~ w1 + w2)

# Create hypothetical scenarios for covariate 1
w1range <- seq(from = 0, to = 1, by = 0.2)
```

# 1.3 Simulating QoI using `simcf` :S1

\scriptsize
```{r}
# Use `cfMake` to create a baseline dataset
xhypo <- cfMake(varmodel, data, nscen = length(w1range))
xhypo
```

# 1.3 Simulating QoI using `simcf` :S1

\scriptsize
```{r}
# Use `cfChange` and a loop to step through each hypothetical x value
for (i in 1:length(w1range)) {
  xhypo <- cfChange(xscen = xhypo, covname = "w1", x = w1range[i], scen = i)
}
xhypo
```

# 1.3 Simulating QoI using `simcf` :S1

\scriptsize
```{r}
# Repeat the same procedure for z
zhypo <- cfMake(varmodel, data, nscen = length(w1range))

for (i in 1:length(w1range)) {
  zhypo <- cfChange(zhypo, "w1", x = w1range[i], scen = i)
}
zhypo
```

# 1.3 Simulating QoI using `simcf` :S1

\scriptsize
```{r}
# Simulate predictive intervals for heteroskedastic linear models
# `hetnormsimpv()` is from the simcf package
simRES.w1 <- hetnormsimpv(xhypo, simbetas,
                          zhypo, simgammas,
                          ci = 0.95,
                          constant = 1, varconstant = 1)
simRES.w1
```

# 1.3 Simulating QoI using `simcf` :S1

\scriptsize
```{r, fig.asp = 0.618, out.width = '60%', fig.align = "center"}
simRES.w1 <- simRES.w1 %>%
  bind_rows() %>%          # Collapse the list into a data frame
  bind_cols(w1 = w1range)  # Combine with hypo. w1 values

simRES.w1                  # Inspect
```

# 1.3 Simulating QoI using `simcf` :S1

\scriptsize
```{r, fig.asp = 0.618, out.width = '80%', fig.align = "center"}
# ggplot2
ggplot(simRES.w1, aes(x = w1, y = pe, ymax = upper, ymin = lower)) +
  geom_line() +
  geom_ribbon(alpha = 0.1) +
  labs(y = "Predicted Y", x = "Covariate 1")
```
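# 1.3 Simulating QoI using `simcf` :S1

The step list earlier also mentions first differences. As a minimal sketch (not part of the original lab code), an FD can be computed directly from the simulated betas, assuming the scenario of interest is moving `w1` from 0 to 1 while holding `w2` at its mean.

\scriptsize
```{r}
# First difference sketch: E(y | w1 = 1, w2 = mean) - E(y | w1 = 0, w2 = mean)
x.lo <- c(1, 0, mean(w2))                         # scenario with w1 = 0
x.hi <- c(1, 1, mean(w2))                         # scenario with w1 = 1
fd.sims <- simbetas %*% x.hi - simbetas %*% x.lo  # one FD per simulated beta
quantile(fd.sims, probs = c(0.025, 0.5, 0.975))   # point estimate and 95% interval
```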
# 1.3 Simulating QoI using `simcf` :S2

Scenario 2: Vary covariate 2; hold covariate 1 constant

1. Create a data frame with a set of hypothetical scenarios for covariate 2 while keeping covariate 1 at its mean
2. Simulate the predicted values using `MASS` and `simcf`
3. Plot the results

# 1.3 Simulating QoI using `simcf` :S2

\scriptsize
```{r}
# Create hypothetical scenarios for w2
w2range <- seq(0, 1, by = 0.05)

xhypo <- cfMake(varmodel, data, nscen = length(w2range))
for (i in 1:length(w2range)) {
  xhypo <- cfChange(xhypo, "w2", x = w2range[i], scen = i)
}

zhypo <- cfMake(varmodel, data, nscen = length(w2range))
for (i in 1:length(w2range)) {
  zhypo <- cfChange(zhypo, "w2", x = w2range[i], scen = i)
}
```

# 1.3 Simulating QoI using `simcf` :S2

\scriptsize
```{r}
# Simulate the predicted Y's and PI's
simRES.w2 <- hetnormsimpv(xhypo, simbetas,
                          zhypo, simgammas,
                          ci = 0.95,
                          constant = 1, varconstant = 1)

simRES.w2 <- simRES.w2 %>%
  bind_rows() %>%
  bind_cols(w2 = w2range)
```

# 1.3 Simulating QoI using `simcf` :S2

\scriptsize
```{r, fig.asp = 0.618, out.width = '70%', fig.align = "center"}
# Plot the predictive intervals for hypothetical w2
ggplot(simRES.w2, aes(x = w2, y = pe, ymax = upper, ymin = lower)) +
  geom_line() +
  geom_ribbon(alpha = 0.1) +
  labs(y = "Predicted Y", x = "Covariate 2")
```

Can we compare them with the prediction intervals generated by `predict()`?

# 1.3 Simulating QoI using `simcf` :S2

\scriptsize
```{r}
# Combine the two data frames
simPI.w2 <- simPI.w2 %>%
  rename(pe = fit, lower = lwr, upper = upr)

simALL.w2 <- bind_rows(
  simRES.w2 %>% mutate(method = "sim"),
  simPI.w2 %>% mutate(method = "lm")
)
```

# 1.3 Simulating QoI using `simcf` :S2

\scriptsize
```{r, fig.asp = 0.618, out.width = '70%', fig.align = "center"}
# Plot the predictive intervals for hypothetical w2, by method
ggplot(simALL.w2, aes(x = w2, y = pe, ymax = upper, ymin = lower)) +
  geom_line() +
  geom_ribbon(alpha = 0.1) +
  labs(y = "Predicted Y", x = "Covariate 2") +
  facet_grid(~ method)
```

# 2. Logit model: Voting example

Let's open \texttt{RStudio} and [**\underline{nesLogitPlot.r}**](http://staff.washington.edu/kamano/MLE/LabMaterials/Lab04/nesLogitPlot.r)

# 3. Homework: Recap HW2

Problem 2: In which set of estimates do you have more confidence?

```{r, echo=F}
pe1 <- c(-0.046, 0.979, 2.096)
se1 <- c(0.010, 0.029, 0.030)
pe2 <- c(0.088, 0.857, 2.019)
se2 <- c(0.055, 0.095, 0.027)

sample <- tibble(
  beta = c("beta0", "beta1", "beta2"),
  pe1 = pe1,
  se1 = se1,
  pe2 = pe2,
  se2 = se2
)

sample %>% pander::pander()
```

# 3. Homework: Recap HW2

How about this?

```{r, echo=F, fig.asp = 0.618, out.width = '90%', fig.align = "center"}
dta <- tibble(pe = c(pe1, pe2),
              se = c(se1, se2),
              trueBeta = rep(c(0, 1, 2), 2),
              beta = rep(c("beta0", "beta1", "beta2"), 2),
              type = c(rep("pe1", 3), rep("pe2", 3))) %>%
  mutate(lower = pe - 1.96 * se,
         upper = pe + 1.96 * se,
         Conf95 = case_when(lower <= trueBeta & upper >= trueBeta ~ 1,
                            TRUE ~ 0) %>% factor())

dta %>%
  ggplot(aes(x = beta, y = pe, ymin = lower, ymax = upper,
             shape = Conf95, fill = Conf95)) +
  geom_point(size = 2, show.legend = F) +
  geom_linerange() +
  geom_hline(yintercept = c(0, 1, 2), linetype = "dotted") +
  labs(y = "Point estimates", x = "") +
  scale_shape_manual(values = c(1, 16)) +
  scale_fill_manual(values = c("white", "black")) +
  facet_grid(~type) +
  theme_bw()
```

# 3. Homework: Recap HW2

Lazy coder

\scriptsize
```{r, fig.asp = 0.618, out.width = '40%', fig.align = "center"}
y <- c(1, 0, 0, 1, 0, 0, 0, 0)

fun1 <- function(pi){
  (log(pi) * sum(y) + log(1 - pi) * sum(1 - y)) %>% exp()
}

ggplot(data = data.frame(x = 0), mapping = aes(x = x)) +
  stat_function(fun = fun1) +
  xlim(0, 1) +
  labs(x = "pi", y = "Likelihood") +
  theme_bw()
```

# 3. Homework: Question HW3

+ One common problem when knitting: the math mode environment doesn't like white space or empty lines
+ Try `\begin{aligned}` instead of `\begin{split}` (a short example follows on the next slide)
+ The `R Markdown` guide is [\underline{here}](https://bookdown.org/yihui/rmarkdown)
+ Email subject: **MLE510HW3**
+ File name: **MLE510HW3KenyaAmano**
+ Your feedback is valuable: [\underline{Evaluation URL}](https://uw.iasystem.org/survey/230028)
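# 3. Homework: Question HW3

For instance, a multi-line derivation can be written with `aligned` inside display math, with no blank lines inside the block. This is an illustrative snippet (it reuses the Bernoulli log likelihood from the "Lazy coder" slide), not a required template:

\scriptsize
$$
\begin{aligned}
\log \mathcal{L}(\pi | \mathbf{y}) &= \sum_{i=1}^{n} \left[ y_i \log \pi + (1 - y_i) \log (1 - \pi) \right] \\
&= \log \pi \sum_{i=1}^{n} y_i + \log (1 - \pi) \sum_{i=1}^{n} (1 - y_i)
\end{aligned}
$$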