As a scholar in the "Peace Science" tradition, I work with quantitative conflict data to conduct much of my substantive research. Beyond simply using the available data (or collecting my own), I have also sought to develop new methods and tools for studying these data quantitatively. This has entailed thinking about best practices for empirically studying international relations and developing software to assist in this analysis. Overall, I have made four contributions to the quantitative study of international relations.
Rethinking the "Data Generating Process" of Conflict Data
One project is more conceptual than methodological. This project raises questions about how we understand the underlying processes that generate our data on war and conflict. Erik Gartzke and I, on the basis of a number of workshops we hosted (including one co-hosted with Kris Ramsay at Princeton University in 2015), wrote a contribution for the Oxford Encyclopedia of Empirical International Relations Research in which we raised questions about the feasibility of directly testing the core components of the bargaining model of war. Tanisha Fazal and I wrote a piece in Foreign Affairs that calls into question how scholars interpret the trends in war data, and whether these data point to a decline in the propensity for military violence in the world. And a new book project explores how many of the events that make up the conflict data used by IR scholars are driven by the behavior of, or reactions to, the foreign policy of Russia.
My efforts to rethink the processes generating conflict data began with a piece published in Political Analysis. In this piece, I pioneered a new unit of analysis: the k-ad. This unit of analysis is for scholars studying multilateral events, such as alliance formation and the creation of nonaggression pacts. A k-ad is an observation that captures the characteristics of k actors (where k is greater than or equal to 2). This means k can be equal to 2 (a "dyad"), 3 (a "triad"), or larger.
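The definition above can be made concrete with a small sketch. It assumes that a k-ad is an unordered group of k distinct actors; the function name `enumerate_kads` and the four actor codes are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def enumerate_kads(actors, k):
    """Return every unordered group of k distinct actors (one tuple per k-ad)."""
    if k < 2:
        raise ValueError("a k-ad needs at least two actors")
    # Sorting first gives a stable, reproducible ordering of the groups.
    return list(combinations(sorted(actors), k))


# Hypothetical actor codes, used only for illustration.
states = ["USA", "GBR", "FRA", "RUS"]

dyads = enumerate_kads(states, 2)   # all groups of two actors
triads = enumerate_kads(states, 3)  # all groups of three actors

print(len(dyads), "dyads;", len(triads), "triads")
```

Each tuple would correspond to one row in a k-adic dataset for a given year, to which covariates and an event indicator could then be attached.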
Creating datasets with k-ads can be computationally intensive due to the potentially immense size of the dataset. For example, a dataset with all possible two-, three-, and four-member combinations of 100 countries contains 4,950 + 161,700 + 3,921,225 = 4,087,875 observations (and that would be for only a single year)! Hence, I show how choice-based sampling methods can reduce the number of k-ads one must include in a dataset.
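A short sketch may help illustrate both points. The first function counts the possible k-ads with the binomial coefficient; the second shows one generic form of choice-based sampling, which keeps every event and draws a random subset of non-events. The function names, the toy rows, and the sample size are assumptions made for illustration, not the procedure used in the article.

```python
import random
from math import comb


def kad_counts(n_actors, max_k):
    """Number of possible k-ads for each k from 2 up to max_k."""
    return {k: comb(n_actors, k) for k in range(2, max_k + 1)}


def choice_based_sample(rows, n_nonevents, seed=0):
    """Keep every event row plus a simple random sample of non-event rows.

    rows is a list of (kad, event) pairs, where event is 1 or 0.
    """
    events = [row for row in rows if row[1] == 1]
    nonevents = [row for row in rows if row[1] == 0]
    rng = random.Random(seed)  # seeded so the draw is reproducible
    kept = rng.sample(nonevents, min(n_nonevents, len(nonevents)))
    return events + kept


counts = kad_counts(100, 4)
total = sum(counts.values())
print(counts, total)

# Toy data (hypothetical): one event k-ad and fifty non-event k-ads.
rows = [(("A", "B"), 1)] + [((f"S{i}", f"S{i + 1}"), 0) for i in range(50)]
sample = choice_based_sample(rows, n_nonevents=5)
print(len(sample), "rows kept out of", len(rows))
```

Because a sample drawn this way over-represents events, estimates based on it generally need a correction or weighting step, which this sketch leaves out.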
Easing the difficulties scholars face in creating and working with k-adic data was a core motivation for creating the NewGene software (see below). I also created commands for the Stata statistical software to help scholars convert a dyadic dataset into a k-adic dataset (kadcreate_dytok.ado & kadcreate_dytok.hlp) and to add "non-event" observations to a dataset already in k-adic format (kadcreate_zeros.ado & kadcreate_zeros.hlp).
Because of my research on k-adic data, I was invited to write a piece in International Studies Quarterly discussing the questions and theoretical claims for which it is still appropriate to use state-to-state dyadic data (the dyad being perhaps the most commonly used unit of analysis in international relations research). The editors of International Studies Quarterly then created an online symposium where several quantitative IR scholars responded to the points raised by my piece (and by pieces from Skyler Cranmer and Bruce Desmarais and from Paul Diehl and Thorin Wright).
Several of my projects required collecting new data or working with a large number of existing datasets. To assist scholars in using these and other data to quantitatively study international politics, I collaborated with Scott Bennett and Allan Stam to create the NewGene Data Management Software. With the financial support of the National Science Foundation, we developed NewGene as Windows- and OS X-based software that manages, organizes, and merges the various datasets produced by international relations scholars, public institutions, and private organizations. This Journal of Conflict Resolution piece describes the software and our motivation for creating it.
The software was officially released in July 2017 and I marked its release with a public lecture in October 2017. The lecture was sponsored by the Center for International Social Science Research and the Division of the Social Sciences (this news story describes the public lecture, the NewGene software, and the software's place in the history of quantitative IR research at the University of Chicago). Work to improve and update this software is ongoing.
Because NewGene is intended to make it easier for scholars to access and use the various datasets created by international relations scholars, it contributes to the broader issue of research transparency and replication in the social sciences. For this reason, I helped organize a conference at the University of Chicago titled "Replication and Transparency in Social Science: Crisis or Crossroads". The conference brought together scholars from across the social sciences to discuss how to respond to concerns regarding the fragility of empirical findings in the social sciences (including psychology, political science, and economics). An excellent summary of the conference was posted on the Nature blog.
Effect Bounds in International Relations Research
In another piece published in Political Analysis and co-authored with Walter Mebane, I encourage quantitative international relations researchers to "embrace uncertainty" by adopting variations of Manski bounds. Whether working with experimental or observational data, scholars must make assumptions (sometimes rather strong assumptions) in order to draw an inference about how an explanatory variable, X, influences an outcome variable, Y. Manski bounds offer an assumption-free approach for defining the plausible range for the effect of a binary variable (e.g. a country being democratic or not; two countries having a territorial dispute or not) on a binary outcome (e.g. the two countries going to war or remaining at peace).
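As a rough illustration of the idea, the sketch below computes the textbook worst-case (no-assumption) bounds on the average effect of a binary X on a binary Y. It is not the specific variation developed in the article; the function name `manski_bounds` and the toy data are hypothetical. The logic is that each unit's unobserved potential outcome is set first to 0 and then to 1, which yields the lowest and highest effects consistent with the observed data.

```python
def manski_bounds(treatment, outcome):
    """Worst-case bounds on E[Y(1)] - E[Y(0)] for binary treatment and outcome."""
    n = len(treatment)
    if n == 0 or n != len(outcome):
        raise ValueError("treatment and outcome must be non-empty and equal length")

    pairs = list(zip(treatment, outcome))
    p_y1_d1 = sum(1 for d, y in pairs if d == 1 and y == 1) / n  # P(Y=1, X=1)
    p_y1_d0 = sum(1 for d, y in pairs if d == 0 and y == 1) / n  # P(Y=1, X=0)
    p_d1 = sum(1 for d, _ in pairs if d == 1) / n                # P(X=1)
    p_d0 = 1 - p_d1                                              # P(X=0)

    # E[Y(1)] lies in [P(Y=1, X=1), P(Y=1, X=1) + P(X=0)]
    # E[Y(0)] lies in [P(Y=1, X=0), P(Y=1, X=0) + P(X=1)]
    lower = p_y1_d1 - (p_y1_d0 + p_d1)
    upper = (p_y1_d1 + p_d0) - p_y1_d0
    return lower, upper


# Toy data (hypothetical): 1 = condition present / event occurred, 0 = otherwise.
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

low, high = manski_bounds(x, y)
print(f"effect lies between {low:.2f} and {high:.2f}")
```

Without further assumptions the interval always has width one and therefore always contains zero; narrowing it requires adding assumptions, which is the trade-off the bounds make explicit.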
I applied this method in a piece on alliance politics published in the journal International Organization. I was able to apply the technique in that piece because I was working with a binary variable (the negotiation witnessing an offer of trade cooperation) and a binary outcome variable (the negotiation ending in agreement).