---
title: "lab2_code"
author: "Kai Ping (Brian) Leung"
date: "10/4/2019"
output: pdf_document
editor_options:
  chunk_output_type: console
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Prerequisite
```{r}
# Good practice to remove all objects from the workspace
rm(list = ls())
# Use library() for packages you need, or source() for other R files.
library(tidyverse)
# Setting the seed ensures that we get the same random draw over and over again.
set.seed(20191004)
rnorm(5) # Check
```
## 1. Build a Bernoulli distribution using the sample() function, where the probability of "success" is 0.7.
```{r}
# Create an imaginary person to flip the coin once for you
sample(c(0, 1),
  size = 1,
  prob = c(0.3, 0.7)
)
```
## 2. How do you know if it is working properly? Conduct a simulation to check whether the assigned probabilities match the empirical frequencies
```{r}
# Specify the number of simulations
sims <- 10000
# Create an empty vector as "container"
bern_result <- vector(mode = "numeric", length = sims)
# For loop
for (i in 1:sims) {
  bern_result[i] <- sample(c(0, 1),
    size = 1,
    prob = c(0.3, 0.7)
  )
}
mean(bern_result)
## A faster way without the loop (the loop above is kept to build intuition)
# sample(c(0, 1),
#   size = sims,
#   replace = TRUE,
#   prob = c(0.3, 0.7)
# )
## Or use rbinom() (remember: the Bernoulli is the special case of the binomial when the number of trials M = 1, i.e. size = 1)
# rbinom(sims,
#   size = 1,
#   prob = 0.7)
```
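
One way to judge whether the simulated proportion is "close enough" to 0.7 is to compare the gap against the standard error of a sample proportion, sqrt(p(1 - p)/n). The chunk below is a minimal sketch rather than part of the original exercise; the names `p`, `n_sims`, `draws`, `p_hat`, and `se` are introduced here purely for illustration.

```{r}
# Illustrative check; all object names in this chunk are assumptions
set.seed(20191004)
p <- 0.7          # assigned probability of "success"
n_sims <- 10000   # number of simulated flips

# Draw all flips at once; replace = TRUE is needed because we draw
# more values than the two-element vector contains
draws <- sample(c(0, 1), size = n_sims, replace = TRUE, prob = c(1 - p, p))

# Empirical proportion of each outcome
prop.table(table(draws))

# Empirical success rate and the theoretical standard error of a proportion
p_hat <- mean(draws)
se <- sqrt(p * (1 - p) / n_sims)

# Gap between simulation and theory, measured in standard errors;
# values below roughly 2 are what we would usually expect
abs(p_hat - p) / se
```

If the ratio is consistently large across different seeds, the `prob` argument is probably misspecified (for example, with the two probabilities in the wrong order).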
## 3. Plot the above Bernoulli distribution
```{r}
# Base graphics
hist(bern_result)
# ggplot2
bern_table <- tibble(outcome = bern_result)
ggplot(bern_table, aes(x = outcome)) +
  geom_histogram() + # try adding: aes(y = after_stat(count / sum(count))) ; What does it do?
  scale_x_continuous(breaks = c(0, 1), expand = c(1, 0)) +
  labs(y = "Prob", x = "x")
```
## 4. Based on the above, generate a binomial distribution, with number of trials equal to 10, without using rbinom()
```{r}
# Create an imaginary person to flip the coin ten times for you
# Let's test it outside of the loop:
sample(c(0, 1),
  size = 10,
  replace = TRUE,
  prob = c(0.3, 0.7)
)
# Reuse the number of simulations from above and create an empty vector as container
binom_result <- vector(mode = "numeric", length = sims)
for (i in 1:sims) {
  # Create an imaginary person to flip the coin ten times for you
  flips <- sample(c(0, 1),
    size = 10,
    replace = TRUE,
    prob = c(0.3, 0.7)
  )
  # Sum up the number of "successes" for that person
  count <- sum(flips)
  # Store it in the container; repeat 10,000 times
  binom_result[i] <- count
}
# A faster way without the loop and sample() (the loop above is kept to build intuition)
# rbinom(n = sims,
#   size = 10,
#   prob = 0.7)
```
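
A binomial count with 10 trials and success probability 0.7 has mean 10 × 0.7 = 7 and variance 10 × 0.7 × 0.3 = 2.1, so the simulated counts can be checked against those two numbers. The following is a rough sketch that uses `replicate()` in place of the explicit loop; `n_trials`, `p`, `n_sims`, and `counts` are names assumed here for illustration.

```{r}
# Illustrative check; all object names in this chunk are assumptions
set.seed(20191004)
n_trials <- 10
p <- 0.7
n_sims <- 10000

# Each replicate flips the coin n_trials times and counts the successes
counts <- replicate(
  n_sims,
  sum(sample(c(0, 1), size = n_trials, replace = TRUE, prob = c(1 - p, p)))
)

# Compare simulated moments with the theoretical ones
c(simulated = mean(counts), theoretical = n_trials * p)
c(simulated = var(counts), theoretical = n_trials * p * (1 - p))
```

The simulated and theoretical values should agree to about one decimal place at 10,000 simulations, and the agreement should tighten as `n_sims` grows.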
## 5. Plot the above binomial distribution
```{r}
# Base graphics
hist(binom_result)
# ggplot2
binom_table <- tibble(outcome = binom_result)
ggplot(binom_table, aes(x = outcome)) +
  geom_histogram(aes(y = after_stat(count / sum(count))), binwidth = 1) +
  scale_x_continuous(breaks = 0:10) +
  labs(y = "Prob", x = "x")
```
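
To see how closely the simulated proportions track the theoretical distribution, the values from `dbinom()` can be drawn on top of the empirical bars. This is only a sketch: it draws the counts with `rbinom()` for brevity instead of the loop, and the data frame `compare` and its column names are made up for this example.

```{r}
library(ggplot2)
set.seed(20191004)

# Simulated counts (rbinom() is used here only to keep the example short)
counts <- rbinom(n = 10000, size = 10, prob = 0.7)

# Empirical proportion and theoretical probability for each outcome 0..10;
# factor(..., levels = 0:10) keeps outcomes that never occurred as zeros
compare <- data.frame(
  outcome = 0:10,
  empirical = as.numeric(table(factor(counts, levels = 0:10))) / length(counts),
  theoretical = dbinom(0:10, size = 10, prob = 0.7)
)

ggplot(compare, aes(x = outcome)) +
  geom_col(aes(y = empirical), fill = "grey70") +     # simulated proportions
  geom_point(aes(y = theoretical), colour = "red") +  # dbinom() values
  scale_x_continuous(breaks = 0:10) +
  labs(y = "Prob", x = "x")
```

The red points should sit at or very near the top of each bar; visible gaps would suggest a problem in the simulation code.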
## 6. Explore the rbinom, dbinom, pbinom functions. What do they do? Answer the following questions:
a. The probability of a coin landing on heads is 0.7. If you flip the coin 10 times, what is the probability of getting exactly 7 heads?
b. What is the probability of getting 7 heads or fewer?
c. How can you verify that the answer to (b) is correct?
```{r}
# Pr(exactly 7 heads)
dbinom(x = 7, size = 10, prob = 0.7)
# Pr(7 heads or fewer)
pbinom(q = 7, size = 10, prob = 0.7)
# Double check: sum the probabilities of 0 through 7 heads (0 must be included)
dbinom(x = 0:7, size = 10, prob = 0.7) %>% sum()
```
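
The four functions are related: `dbinom()` gives the probability of each count, `pbinom()` is its running sum, `qbinom()` inverts `pbinom()`, and `rbinom()` draws random counts whose frequencies approximate `dbinom()`. The chunk below is a small sketch of those relationships; the object names (`n_trials`, `p`, `pmf`, `cdf`, `sim`) are assumptions made for this example.

```{r}
# Illustrative exploration; all object names in this chunk are assumptions
n_trials <- 10
p <- 0.7

# dbinom(): probability of each possible count 0..10
pmf <- dbinom(0:n_trials, size = n_trials, prob = p)

# pbinom(): cumulative probability, i.e. the running sum of dbinom()
cdf <- pbinom(0:n_trials, size = n_trials, prob = p)
all.equal(cumsum(pmf), cdf)

# Two equivalent ways to get Pr(more than 7 heads)
1 - pbinom(7, size = n_trials, prob = p)
pbinom(7, size = n_trials, prob = p, lower.tail = FALSE)

# qbinom(): smallest count whose cumulative probability reaches 0.5 (the median)
qbinom(0.5, size = n_trials, prob = p)

# rbinom(): the share of simulated draws equal to 7 approximates dbinom(7, ...)
set.seed(20191004)
sim <- rbinom(n = 10000, size = n_trials, prob = p)
c(simulated = mean(sim == 7), theoretical = dbinom(7, size = n_trials, prob = p))
```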