Hypothesis Testing Intro- Suggested Answers

Packages

library(tidyverse)
library(tidymodels)
library(openintro)

Bumba or Kiki

How well can humans distinguish one “Martian” letter from another? In today’s activity, we’ll find out. When shown the two Martian letters, kiki and bumba, answer the poll: https://app.sli.do/event/etoay5PwN5Mg5qiYg6BnDf/embed/polls/647dc94f-b22d-491d-b0ad-35552b27d01f

– Option 1: 75

– Option 2: 8

The question is: “Which letter is Bumba?”

Option 1

Once it’s revealed which option is correct, please write our sample statistic below:

\(\hat{p} = 75/83 \approx 0.904\)

Option 1

Let’s write out the null and alternative hypotheses below.

\(H_0\): \(\pi = 0.5\)

\(H_a\): \(\pi > 0.5\)

Now, let’s quickly make a data frame of the data we just collected as a class. Replace the … with the number of correct and incorrect guesses.

class_data <- tibble(
  correct_guess = c(rep("Correct", 75), rep("Incorrect", 8))
)
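As a quick check (an addition, not part of the original handout), we can count the guesses and recompute the sample proportion:

class_data |>
  count(correct_guess) |>
  mutate(prop = n / sum(n))

The “Correct” row should show 75 of 83 guesses, or about 0.904, matching \(\hat{p}\) above.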

Capture Variability

Now let’s simulate our null distribution by filling in the blanks. First, describe how this distribution is created.

We can use a data-generating mechanism, such as a spinner or a coin. Here we use a spinner that lands on “correct” 50% of the time and “incorrect” 50% of the time. We spin the spinner n = 83 times and take the proportion of correct guesses out of 83; that proportion is one simulated observation. We repeat this process many, many times to build the null distribution.
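The spinner process described above can also be sketched directly in base R (a minimal illustration; the infer pipeline below does the same thing more conveniently). Each spin is a fair 50/50 draw via rbinom():

# one simulated p-hat: spin the 50/50 spinner 83 times
one_phat <- mean(rbinom(83, size = 1, prob = 0.5))

# repeat 1000 times to build a null distribution by hand
many_phats <- replicate(1000, mean(rbinom(83, size = 1, prob = 0.5)))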

set.seed(333)

null_dist <- class_data |> 
  specify(response = correct_guess, success = "Correct") |>
  hypothesize(null = "point", p = .5) |> #fill in the blank
  generate(reps = 1000, type = "draw") |> #fill in the blank
  calculate(stat = "prop") #fill in the blank

Helpful Hint: Remember that you can use ? next to the function name to pull up the help file!

Calculate and visualize the distribution below.

visualize(null_dist) +
  shade_p_value(obs_stat = 0.904, direction = "right") #fill in the blank

null_dist |>
  get_p_value(obs_stat = 0.904, direction = "right") #fill in the blank
Warning: Please be cautious in reporting a p-value of 0. This result is an
approximation based on the number of `reps` chosen in the `generate()` step.
See `?get_p_value()` for more information.
# A tibble: 1 × 1
  p_value
    <dbl>
1       0
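For reference (an addition to the handout), get_p_value() here is just the proportion of simulated statistics at least as large as the observed one, which we can compute by hand:

null_dist |>
  summarize(p_value = mean(stat >= 0.904))

Since none of the 1000 simulated proportions reach the observed statistic, this returns 0.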

What did we just calculate?

A p-value!

So, can we read Martian?
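As a sanity check (not part of the original activity), the exact binomial test reaches the same conclusion without simulation:

# exact p-value for H0: pi = 0.5 vs Ha: pi > 0.5
binom.test(75, 83, p = 0.5, alternative = "greater")

The exact p-value is vanishingly small, consistent with the simulated p-value of 0: strong evidence that people can distinguish the two Martian letters better than chance.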

TED Talk

http://www.ted.com/talks/vilayanur_ramachandran_on_your_mind