POLS 2580: Revised Research Questions

Overview

Based on my feedback on Assignment 1, your task for this assignment is to find a paper on a topic of interest to you that you can replicate and extend in your final project for the course.

For this assignment, please upload to Canvas an HTML file named a2_your_last_name.html, produced from a Qmd file called a2_your_last_name.qmd (replace your_last_name with your last name, e.g. a2_testa.html). The file should contain:

- A summary of the paper you will be replicating
- A description of the paper’s empirical design (e.g. regression, experiment, diff-in-diff, RDD)
- The core results you will replicate (e.g. Table 2 or Figure 3, not the whole paper)
- One possible extension (e.g. a different modeling strategy, additional data, subgroup analyses)
- Code to load the replication data into R

Finding articles to replicate:

I will supply you with some possible papers for replication in my comments, but you can also find your own. Most empirical papers published in the discipline’s general interest journals (APSR, AJPS and JOP) are required to make the data and code needed to replicate their results publicly available, generally through Harvard’s Dataverse.

Tip

Try using Google Scholar’s advanced search to find articles published in a particular journal within the past five years on a topic of interest to you. In a given article, do a Command/Control+F search for “replication” to find a link to the replication files. You can also search a particular journal’s dataverse if it has one.

Replication Summary

Please write a brief summary of the paper you will be replicating for the course (essentially an abstract in your own words). Include the full citation and links to:

- the published article
- the article’s replication files

Empirical Design:

Please tell me if the study you are replicating is an observational or experimental design.

If it is observational, does it make causal claims? If so, what is the identification strategy (RDD, diff-in-diff, instrumental variables, or regression with some kind of selection-on-observables claim)?

If the study is experimental, please describe the treatments and design (simple treatment and control, multiple treatments, factorial, conjoint, etc.)

Then describe conceptually the primary outcome (dependent variable) and key predictor(s) (independent variable/treatment) as well as any relevant covariates (typically moderators and/or mediators).

Core result

For this project you need not replicate every finding from the paper (although you might). Instead, find at least one core result in the paper that you think is a central finding and/or something you want to understand better.

Tell your reader (me) what this result is, where to find it in the text (e.g. Table 2 or Figure 3), and your current understanding of how it is produced and interpreted. (It’s ok if at present your answer is “I’m not totally sure” or “I think it means this, but it might mean that.” Part of why we do a replication is to deepen our understanding of these topics and methods.)

Possible Extension

Write down a few ideas for how you might extend the paper’s analysis, perhaps by:
- Using a different coding of the outcome or key predictors
- Incorporating additional data
- Using a different modeling strategy
- Including an interaction term
This may be hard to understand now, but will become clearer as you delve deeper into the project. Basically, the goal is to think of some other way of analyzing the data or additional data that might yield further insights into the paper’s broader question.
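
To make the interaction-term idea a little more concrete, here is a minimal sketch using simulated data. The variable names (`treat`, `x`, `y`) and the numbers are invented for illustration and do not come from any paper; in your project you would substitute the outcome, key predictor, and moderator from your replication data.

```r
# Simulated stand-in data (hypothetical variables, not from a real study)
set.seed(2580)
n     <- 200
treat <- rep(c(0, 1), each = n / 2)   # binary key predictor
x     <- rnorm(n)                     # a possible moderator
y     <- 1 + 2 * treat + 0.5 * x + 1.5 * treat * x + rnorm(n)
df    <- data.frame(y, treat, x)

# Baseline model: the effect of treat is assumed to be the same for every x
fit_base <- lm(y ~ treat + x, data = df)

# Extension: treat * x expands to treat + x + treat:x,
# letting the effect of treat vary with x
fit_int <- lm(y ~ treat * x, data = df)

coef(fit_int)
```

The coefficient labelled `treat:x` is the interaction: how much the estimated effect of `treat` changes for a one-unit increase in `x`.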

Code to load the data into R.

All of you should be replicating papers that have replication archives, most of which I believe are available through Harvard’s dataverse.

There are two approaches to loading data. First, following the example from the labs, you can use the dataverse package to load the files directly from the web.
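
As a rough sketch of loading directly from the web, the helper below wraps `get_dataframe_by_name()` from the dataverse package. The helper name `load_from_dataverse`, the file name, and the DOI in the commented usage lines are placeholders made up for illustration; swap in the values listed on your paper's replication archive. It assumes the dataverse package is installed.

```r
# Hypothetical helper: download one file from a Harvard Dataverse archive
# and return it as a data frame. Assumes install.packages("dataverse") was run.
load_from_dataverse <- function(filename, doi) {
  dataverse::get_dataframe_by_name(
    filename = filename,                  # file name as listed in the archive
    dataset  = doi,                       # DOI of the replication archive
    server   = "dataverse.harvard.edu"
  )
}

# Usage (placeholders, not a real archive):
# df <- load_from_dataverse("example_data.tab", "10.7910/DVN/XXXXXX")
```

If the file you need is stored in its original format (a Stata .dta file, say), the package documentation describes the `original = TRUE` and `.f` arguments for supplying your own reader function.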

Second, and this is what I would probably do, you can download the full set of files onto your computer and save them in a folder (or subfolder!) for your final paper in this course.

Next, save your qmd file for Assignment 2 in (or near) the replication folder and write code that directs R to load the files from where they are saved on your computer.
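
Here is a minimal sketch of what loading from a local folder can look like. Everything named here (the `replication_files` folder, `example_data.csv`, the toy data) is a stand-in so the code runs on its own; the first few lines just fake a downloaded archive in a temporary directory.

```r
# Fake a downloaded replication folder so this example is self-contained
# (hypothetical folder and file names)
rep_dir <- file.path(tempdir(), "replication_files")
dir.create(rep_dir, showWarnings = FALSE)
write.csv(
  data.frame(id = 1:3, y = c(0.2, 0.5, 0.9)),
  file.path(rep_dir, "example_data.csv"),
  row.names = FALSE
)

# Build the path with file.path() rather than pasting slashes by hand
data_path <- file.path(rep_dir, "example_data.csv")

# Check the file is where you think it is before trying to read it
stopifnot(file.exists(data_path))

df <- read.csv(data_path)
```

In your own project the path would more likely be relative to your qmd file, for example `file.path("replication_files", "example_data.csv")`. One common reason code works live but fails when rendered is that rendering, by default, resolves relative paths from the folder containing the qmd, while your console may be sitting in a different working directory (check with `getwd()`).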

It’s hard to describe this process in words, which is why I’m having you do it for this assignment. Once you’ve done it a couple of times it becomes second nature, but when you’re first navigating code and data, getting file paths correct can cause confusion. If you’re having trouble loading the data, or it works when you’re coding live but not when you knit, that is OK and exactly the outcome this portion of the assignment is designed to create. Portals of discovery and whatnot.

Just include some code that loads the data, and maybe does some initial HLO type stuff.
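
For example, a first look at the data might run along these lines. I am reading HLO here as a quick high-level overview, and R's built-in `mtcars` data is only a stand-in for whatever your replication data turn out to be.

```r
# Stand-in for your loaded replication data
df <- mtcars

dim(df)        # number of rows and columns
head(df)       # first few observations
str(df)        # variable names and types
summary(df)    # basic descriptive statistics for each variable

# Count missing values per variable
n_missing <- colSums(is.na(df))
n_missing
```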