Applied Data Analysis

Course Introduction

Professor Benjamin Noble

“[This class] will acquaint students with statistical methodology as it is used in the social sciences.”

Expectation…

Silver Bulletin election forecast chart of midterm win probabilities.
via Silver Bulletin
Steve Kornacki at the MSNBC big board on election night.
via ABC

Reality…

Cartoon of a frantic researcher holding red string next to a corkboard covered with sticky-note pictures of an ice cream cone, sun, bar chart, ballot box, and thermometer all connected by tangled red string, illustrating how messy causal inference can be.

Causal Questions in Political Science

  • Does ideological extremism cause a candidate to win an election?
  • Does inflation cause presidential approval to decrease?
  • Does democratization cause countries to become more peaceful?

Machine Learning

Which factors predict the outcome?

Diagram: Candidate party, Fundraising, Candidate race, The economy, Hair color → Win election

Causal Inference

Does this factor cause the outcome?

Diagram: Candidate party, Candidate race, The economy, Hair color; Fundraising → Win election?

Correlation and Causation

XKCD comic: correlation does not imply causation, but it does waggle its eyebrows suggestively.

via XKCD

An Example

Does Eating Ice Cream Cause Sunburn?

Diagram: Weather → Ice cream sales, Weather → Sunburn; Ice cream sales → Sunburn?
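To make the ice cream example concrete, here is a minimal simulation sketch in R. Everything in it is assumed for illustration: the variable names, the sample size, and the effect sizes are made up, and weather is set up to drive both ice cream sales and sunburn while ice cream has no effect on sunburn at all.

```r
# Simulated data: every number here is invented for illustration.
set.seed(170)
n <- 1000

temperature <- rnorm(n, mean = 75, sd = 10)          # weather, the confounder
ice_cream   <- 2 * temperature + rnorm(n, sd = 10)   # weather drives ice cream sales
sunburn     <- 0.5 * temperature + rnorm(n, sd = 5)  # weather drives sunburn; ice cream does not

# Naive regression: ice cream sales appear to predict sunburn
naive <- lm(sunburn ~ ice_cream)

# Regression that also accounts for weather
adjusted <- lm(sunburn ~ ice_cream + temperature)

naive_coef    <- unname(coef(naive)["ice_cream"])
adjusted_coef <- unname(coef(adjusted)["ice_cream"])

# Compare the two ice cream coefficients side by side
print(c(naive = naive_coef, adjusted = adjusted_coef))
```

In this made-up world, the naive coefficient is clearly positive even though ice cream does nothing, and it should shrink toward zero once temperature is included. With real data we rarely know the full list of confounders, which is what makes causal questions hard.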

Does Being an Extremist Increase Your Election Odds?

Marjorie Taylor Greene in a red jacket wearing a black face mask reading TRUMP WON, seated in the House chamber.
Marjorie Taylor Greene, via NPR
Wall Street Journal headline: Marjorie Taylor Greene Easily Wins Re-Election in Georgia.
Washington Post headline: The media can't ignore Marjorie Taylor Greene. Can they figure out how to cover her?
via Washington Post

Why Not Just Control for Everything?

  • What is “everything”?
  • Can you measure “everything”?
  • Controls can change what we’re estimating.
Diagram: Education → Turnout (direct effect); Education → Political knowledge (mediator) → Turnout (indirect path)
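A rough sketch of the last point, using simulated data in R. The variable names, effect sizes, and the choice to treat turnout as a continuous score are all assumptions made for illustration; none of this comes from a real dataset.

```r
# Simulated data: effect sizes are invented for illustration.
set.seed(170)
n <- 1000

education <- rnorm(n)
knowledge <- 0.8 * education + rnorm(n)                    # mediator: education raises knowledge
turnout   <- 0.3 * education + 0.5 * knowledge + rnorm(n)  # direct effect plus indirect path

# Without the mediator: estimates the total effect of education.
# True total effect in this simulation: 0.3 + 0.8 * 0.5 = 0.7
total_effect <- unname(coef(lm(turnout ~ education))["education"])

# Controlling for the mediator: estimates only the direct effect.
# True direct effect in this simulation: 0.3
direct_effect <- unname(coef(lm(turnout ~ education + knowledge))["education"])

print(c(total = total_effect, direct = direct_effect))
```

Neither regression is wrong, but they answer different questions. Adding political knowledge as a control blocks the indirect path, so the education coefficient no longer captures the full effect of education on turnout.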

About Me

Professor Benjamin Noble headshot.

  • From St. Louis, MO.
  • My research: congressional and presidential rhetoric, polarization.
  • I enjoy yoga, surfing, and cooking.
  • My favorite San Diego coffee shop: Lovesong (North Park).

Our Class

  • Our course syllabus is linked on Canvas.
  • Two meetings per week, fully remote over Zoom (link on Canvas).
    • Monday & Wednesday, 11am–1:50pm.
    • Each session mixes lecture and an in-class group lab.
  • Lectures are recorded and posted to Canvas. Please keep your camera on.

Our Textbooks

  • Our textbooks →
  • Both free online (see links on syllabus).
  • Reading can be done before or after lecture but should be done before class.
Cover of Causal Inference: The Mixtape by Scott Cunningham. Cover of The Effect by Nick Huntington-Klein.

Your Grade: Attendance and Participation (15%)

  • Attend lecture synchronously.
  • Keep your camera on.
  • Participate actively in labs and discussion.
  • One excused class absence, no questions asked.

Your Grade: Labs (25%)

  • In-class group work; submit your own files on Canvas.
  • Graded on participation and effort; you don’t need to finish.
  • Everyone will be called on at least once to walk the class through their solutions; answers will be posted after class.
  • One excused absence drops that day’s lab.

Your Grade: Homework (30%)

  • Three individual homework assignments with conceptual and applied questions.
  • Completed on Canvas; you run provided R code in DataHub and enter your answers.
  • Must be done individually.
  • Late submissions lose one letter grade per 24 hours; no late submissions are accepted after solutions are posted.

Your Grade: Oral Exam (25%)

  • One-on-one conversation with me over Zoom (about 8–10 minutes), during the last two class days.
  • I give you a short scenario; you talk me through how you’d answer a causal question about it.
  • No coding or analysis. It’s about reasoning, not execution.
  • You’ll get practice scenarios ahead of time; more details closer to the date.

Your Grade: Final (5%)

  • Asynchronous final, available from Friday 7/24 through Friday 7/31.
  • Must be completed in one sitting (budget 2–3 hours).
  • Similar to the homeworks, but longer and more comprehensive.
  • Completed alone; open book, notes, internet, etc.

Your Grade

  • Attendance and participation (15%)
  • Labs (25%)
  • Homework (30%)
  • Oral Exam (25%)
  • Final (5%)
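As a quick illustration of how these weights combine, here is a small R sketch. The weights are the ones listed above; the component scores are hypothetical and only there to show the arithmetic.

```r
# Weights from the grade breakdown above
weights <- c(attendance = 0.15, labs = 0.25, homework = 0.30,
             oral_exam = 0.25, final = 0.05)

# Sanity check: the weights should add up to 100%
stopifnot(isTRUE(all.equal(sum(weights), 1)))

# Hypothetical scores (out of 100), invented for illustration
scores <- c(attendance = 100, labs = 90, homework = 85,
            oral_exam = 80, final = 70)

# Course grade is the weighted average of the component scores
course_grade <- sum(weights * scores[names(weights)])
print(course_grade)
```

Because the final is worth only 5%, a weak final moves the course grade far less than a weak homework or oral exam score would.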

A Brief Note on R

  • I’ll be teaching this class using the R programming language.
  • You’ll mostly run code I provide and read the output.
  • No prior programming experience needed; you won’t write R from scratch.
  • Today’s lab and homework get you set up.
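my_grade <- 95  # hypothetical example value so the snippet runs on its own
# Print a message that depends on the grade
if (my_grade >= 90) {
  print('POLI 170 is the best!')
} else {
  print('POLI 170 stinks!')
}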

A Brief Note on AI/LLMs

  • Feel free to use AI.
  • But use it responsibly.

Office Hours

  • Low-stakes opportunity to get help or ask questions about the course/assignments.
  • Remote on Zoom (same link as class), Monday & Wednesday, 2:30–3:30pm.

Have a Question?

  • See if the answer is on the syllabus!
  • Attend office hours.
  • Email me (responses within 24–48 hours, or longer on weekends).

How to Succeed

In this class (and in any stats class)

  • You are a math person!
  • (Obviously) do the reading, come to class, participate in lab.
  • Iterate; you don’t have to understand everything on the first go.
  • Get help.