Interaction Models + Intro Model Selection

Lecture 16

Dr. Elijah Meyer

Duke University
STA 199 - Fall 2023

2023-10-24

Checklist

– Clone ae-15

– HW-5 due on Tuesday

– Project proposal due Wednesday (Nov 1st)

– HW-6 is out now (due on the last day of class)

Project Proposal

Your team should have a project repo

– This is where you will complete all components of your project

– It will be hosted on a website

– Turned in via GitHub

– Feedback will be within GitHub

Project Grading (Teamwork)

– Each person in your group should be contributing to the project

– Group feedback survey for lab leaders / TAs

– Reach out to your instructor or lab leader if a teammate is hard to communicate with, and we will remind the entire group of project expectations

– Project work day attendance and GitHub commits will be used to adjust grades if there is an ongoing issue

Project Tips

Own your work vs “that part wasn’t mine”

Research Question

– Can you answer it?

Introduction

– Do not copy + paste the description given on the website

– Do some digging

– Write a brief description of the observations.

Address ethical concerns about the data, if any.

Put in the effort so there are pieces to give feedback on

Homework 6

A Statistics Experience: The goal of the statistics experience assignments is to help you engage with the statistics and data science communities outside of the classroom.

– Can be found in the last row of our schedule

– No GitHub repo for HW-6

Examples

– Attend a talk or conference

– Talk with a statistician / data scientist (the TAs and I do not count)

– Listen to a podcast / watch a video

– Participate in a data science competition or challenge

– Read a book on statistics/data science

– TidyTuesday challenges

– Coding out loud project

Warm Up

– Are the intercepts different?

– Does the relationship between body mass and flipper length change based on which island the penguin is on?

Interaction Model

Interaction vs Additive

– An interaction model allows the slope for one covariate to differ across the values of another covariate

– The interaction model is the more complicated of the two models
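One way to write the difference, as a sketch: let \(\widehat{y}\) be predicted body mass, \(x\) flipper length, and \(z\) an indicator that is 1 for one island and 0 for another. Using only two islands is a simplifying assumption made here, and the symbols \(b_0, \dots, b_3\) are illustrative rather than taken from the slides.

```latex
% Additive model: both islands share the slope b_1; only the intercept shifts (by b_2).
\[ \widehat{y} = b_0 + b_1 x + b_2 z \]

% Interaction model: the extra term b_3 (x * z) lets the slope change with the island.
\[ \widehat{y} = b_0 + b_1 x + b_2 z + b_3 (x \times z) \]

% Island with z = 0:  intercept b_0,        slope b_1
% Island with z = 1:  intercept b_0 + b_2,  slope b_1 + b_3
```

Here \(b_3\) is the difference in slopes between the two islands, and the additive model is the special case \(b_3 = 0\).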

Goals

– Describe evidence for an interaction model

– Fit an interaction model in R

– Model Selection
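A minimal sketch of fitting both kinds of model in R. It assumes the penguins data come from the `palmerpenguins` package and uses base R's `lm()`; ae-15 may use different data or a different fitting workflow, and the object names `fit_add` and `fit_int` are made up for illustration.

```r
# Assumption: the penguins data come from the palmerpenguins package.
library(palmerpenguins)

# Additive model: one common slope for flipper length,
# plus a separate intercept shift for each island.
fit_add <- lm(body_mass_g ~ flipper_length_mm + island, data = penguins)

# Interaction model: `*` expands to both main effects plus their interaction,
# so each island gets its own intercept and its own slope.
fit_int <- lm(body_mass_g ~ flipper_length_mm * island, data = penguins)

# Compare the estimated coefficients of the two models.
coef(fit_add)
coef(fit_int)
```

The interaction fit has two more coefficients than the additive fit: one slope adjustment for each island other than the baseline.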

ae-15

Model Selection

We have fit many models to analyze the body mass of penguins. Let’s go over strategies to figure out which model is “the best”

Occam’s Razor

In philosophy, Occam’s razor is the problem-solving principle that recommends searching for explanations constructed with the smallest possible set of elements. It is also known as the principle of parsimony.

Model Selection

The best model is not always the most complicated:

– R-squared (why we shouldn’t use this)

– Adjusted R-squared

– AIC (Next Time)

– Stepwise selection (Next Time)

R-squared

R-squared tells us the proportion of variability in the response that our model explains.

– Why might this not be the best model selection tool?
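One possible way to explore this question, as a sketch: add a predictor made of pure random noise and compare R-squared before and after. The example assumes the `palmerpenguins` data; the `noise` column, the seed, and the object names are hypothetical and exist only for this illustration.

```r
# Assumption: same penguins data as the warm-up (palmerpenguins package).
library(palmerpenguins)

set.seed(199)  # arbitrary seed so the noise column is reproducible

# Keep only complete rows so both models are fit to the same observations.
peng <- na.omit(penguins[, c("body_mass_g", "flipper_length_mm", "island")])

# Hypothetical predictor: random numbers with no connection to body mass.
peng$noise <- rnorm(nrow(peng))

fit_small <- lm(body_mass_g ~ flipper_length_mm, data = peng)
fit_noise <- lm(body_mass_g ~ flipper_length_mm + noise, data = peng)

# R-squared for the larger model is never lower than for the smaller one,
# even though the added predictor is meaningless.
summary(fit_small)$r.squared
summary(fit_noise)$r.squared
```

Because least squares can always set the new coefficient to zero, adding a predictor cannot reduce R-squared, so choosing models by R-squared alone always favors the larger model.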

demo - ae-15

Adjusted R-squared

– What is it?

– How is it different from R-squared?
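For reference, a commonly used definition is sketched below. The notation is an assumption made here, not taken from the slides: \(n\) is the number of observations, \(k\) the number of predictor terms (not counting the intercept), and \(SS_{\text{res}}\) and \(SS_{\text{tot}}\) are the residual and total sums of squares.

```latex
% R-squared: proportion of variability in the response explained by the model.
\[ R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}} \]

% Adjusted R-squared: each sum of squares is divided by its degrees of freedom,
% which penalizes models that use more predictors.
\[ R^2_{\text{adj}} = 1 - \frac{SS_{\text{res}} / (n - k - 1)}{SS_{\text{tot}} / (n - 1)}
                   = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1} \]
```

Since \((n - 1)/(n - k - 1) \ge 1\), adjusted R-squared is never larger than R-squared, and it can decrease when a new predictor adds little explanatory value.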

Adjusted R-squared

– Does not have the same interpretation as R-squared

– Generally interpreted as a measure of the strength of model fit

– When comparing models, prefer the one with the higher adjusted R-squared
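A sketch of how this comparison might look in R, assuming the `palmerpenguins` data and base R's `lm()`. The object names are illustrative, and the models in ae-15 may differ.

```r
# Assumption: the penguins data come from the palmerpenguins package.
library(palmerpenguins)

# Two candidate models for body mass.
fit_add <- lm(body_mass_g ~ flipper_length_mm + island, data = penguins)
fit_int <- lm(body_mass_g ~ flipper_length_mm * island, data = penguins)

# summary() on an lm fit stores adjusted R-squared in `adj.r.squared`.
adj_r2 <- c(
  additive    = summary(fit_add)$adj.r.squared,
  interaction = summary(fit_int)$adj.r.squared
)

adj_r2                    # print both values side by side
names(which.max(adj_r2))  # name of the model with the higher adjusted R-squared
```

Both models use the same variables, so they are fit to the same rows after `lm()` drops missing values, which keeps the comparison fair.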

ae-15

Something to think about

We want our covariates to do a good job at modeling our response y. Is the goal \(R^2 = 1\)? Is the goal to have a perfectly fitting model?

Review

– SLR

– MLR - Additive Case

– MLR - Interaction Case

What can we use these models for?