---
title: "lab2_code"
author: "Kai Ping (Brian) Leung"
date: "10/4/2019"
output: pdf_document
editor_options:
  chunk_output_type: console
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Prerequisites

```{r}
# Good practice to remove all objects from the workspace
rm(list = ls())

# Use library() for packages you need, or source() for other R files.
library(tidyverse)

# Setting the seed ensures that we get the same random draws every time the code is run.
set.seed(20191004)
rnorm(5) # Check
```

## 1. Build a Bernoulli distribution using the sample() function, where the probability of "success" is 0.7.

```{r}
# Create an imaginary guy to flip the coin once for you
sample(c(0, 1),
       size = 1,
       prob = c(0.3, 0.7)
)
```

## 2. How do you know if it is working properly? Run a simulation to check whether the assigned probabilities match the empirical frequencies.

```{r}
# Specify the number of simulations
sims <- 10000

# Create an empty vector as "container"
bern_result <- vector(mode = "numeric", length = sims)

# For loop: flip the coin once per iteration and store the outcome
for (i in 1:sims) {
  bern_result[i] <- sample(c(0, 1),
                           size = 1,
                           prob = c(0.3, 0.7)
  )
}

# Share of successes; should be close to 0.7
mean(bern_result)

## Faster way w/o loop...but we want to build the intuition
# sample(c(0, 1),
#        size = sims,
#        replace = TRUE,
#        prob = c(0.3, 0.7)
# )

## Or use rbinom (remember: Bernoulli is just a special case of the binomial
## when the number of trials M = 1)
# rbinom(sims,
#        size = 1,
#        prob = 0.7)
```

## 3. Plot the above Bernoulli distribution

```{r}
# Base graphics
hist(bern_result)

# ggplot2
bern_table <- tibble(outcome = bern_result)

ggplot(bern_table, aes(x = outcome)) +
  geom_histogram() + # try adding: aes(y = after_stat(count / sum(count))); what does it do?
  scale_x_continuous(breaks = c(0, 1), expand = c(1, 0)) +
  labs(y = "Prob", x = "x")
```

## 4. Based on the above, generate a binomial distribution with the number of trials equal to 10, without using rbinom()

```{r}
# Create an imaginary guy to flip the coin ten times for you
# Let's test it outside of the loop:
sample(c(0, 1),
       size = 10,
       replace = TRUE,
       prob = c(0.3, 0.7)
)

# Reuse the number of simulations (sims) and create an empty vector as container
binom_result <- vector(mode = "numeric", length = sims)

for (i in 1:sims) {
  # Create an imaginary guy to flip the coin ten times for you
  flips <- sample(c(0, 1),
                  size = 10,
                  replace = TRUE,
                  prob = c(0.3, 0.7)
  )
  # Sum up the number of "successes" for that guy
  count <- sum(flips)
  # Store it in the container; repeat 10,000 times
  binom_result[i] <- count
}

# Faster way w/o loop and sample()...but we want to build the intuition
rbinom(n = sims, size = 10, prob = 0.7)
```

## 5. Plot the above binomial distribution

```{r}
# Base graphics
hist(binom_result)

# ggplot2
binom_table <- tibble(outcome = binom_result)

# binwidth = 1 gives one bar per possible number of successes (0 to 10)
ggplot(binom_table, aes(x = outcome)) +
  geom_histogram(aes(y = after_stat(count / sum(count))), binwidth = 1) +
  scale_x_continuous(breaks = 0:10) +
  labs(y = "Prob", x = "x")
```

## 6. Explore the rbinom, dbinom, pbinom functions. What do they do? Answer the following questions:

a. The probability of a coin landing on heads is 0.7. If you were to flip the coin 10 times, what is the probability of getting exactly 7 heads?
b. What is the probability of getting 7 heads or fewer?
c. How do you know (b) is true?

```{r}
# Pr(exactly 7 heads)
dbinom(x = 7, size = 10, prob = 0.7)

# Pr(7 heads or fewer)
pbinom(q = 7, size = 10, prob = 0.7)

# Double check: sum the probabilities of 0 through 7 heads
dbinom(x = 0:7, size = 10, prob = 0.7) %>%
  sum()
```
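
Question (c) can be checked in more than one way. The chunk below is a minimal sketch, not part of the original lab: it compares pbinom() with the binomial formula written out by hand, with the complement rule, and with a simulated share. All object names (n_trials, p_head, p_builtin, p_formula, p_complement, p_sim) are made up for illustration, and the 10,000 draws simply mirror the number of simulations used earlier.

```{r}
# Illustrative cross-check of Pr(X <= 7); all object names are assumptions
n_trials <- 10
p_head <- 0.7

# 1. Built-in cumulative distribution function
p_builtin <- pbinom(q = 7, size = n_trials, prob = p_head)

# 2. Binomial formula by hand: choose(n, k) * p^k * (1 - p)^(n - k),
#    summed over k = 0, 1, ..., 7
k <- 0:7
p_formula <- sum(choose(n_trials, k) * p_head^k * (1 - p_head)^(n_trials - k))

# 3. Complement rule: 1 - Pr(8, 9, or 10 heads)
p_complement <- 1 - sum(dbinom(x = 8:10, size = n_trials, prob = p_head))

# 4. Simulation: share of 10,000 simulated experiments with 7 heads or fewer
set.seed(20191004)
draws <- rbinom(n = 10000, size = n_trials, prob = p_head)
p_sim <- mean(draws <= 7)

# Put the four answers side by side
c(builtin = p_builtin, formula = p_formula,
  complement = p_complement, simulated = p_sim)
```

The first three values should agree up to floating-point error (about 0.617), while the simulated share should only be close, since it carries sampling noise. Note that the hand-written sum starts at k = 0: zero heads has a small but nonzero probability, 0.3^10.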