In the course of my substantive research, I had to develop new methods and tools for studying international relations quantitatively. This entailed thinking about best practices for empirical work in international relations and developing software to assist in that analysis (see my "Statistics Software" page). Overall, I have made three contributions to the quantitative study of international relations.
First, several of my projects required collecting new data or working with a large number of existing datasets. To help scholars use these and other data to study international politics quantitatively, I collaborated with Scott Bennett and Allan Stam to create the NewGene Data Management Software. With the financial support of the National Science Foundation, we developed NewGene as Windows- and OS X-based software that manages, organizes, and merges the various datasets produced by international relations scholars, public institutions, and private organizations.
The software was officially released in July 2017 and I marked its release with a public lecture in October 2017. The lecture was sponsored by the Center for International Social Science Research and the Division of the Social Sciences (this news story describes the public lecture, the NewGene software, and the software's place in the history of quantitative IR research at the University of Chicago). Work to improve and update this software is ongoing.
Because NewGene is intended to make it easier for scholars to access and use the various datasets created by international relations scholars, it contributes to the broader issue of research transparency and replication in the social sciences. For this reason, I helped organize a conference at the University of Chicago titled "Replication and Transparency in Social Science: Crisis or Crossroads". The conference brought together scholars from across the social sciences (including psychology, political science, and economics) to discuss how to respond to concerns regarding the fragility of empirical findings. An excellent summary of the conference was posted on the Nature blog.
Second, in a piece published in Political Analysis, I pioneered a new unit of analysis: the k-ad. This unit of analysis is intended for scholars studying multilateral events, such as alliance formation and the creation of nonaggression pacts. A k-ad is an observation that captures the characteristics of k actors, where k is greater than or equal to 2. This means k can be equal to 2 (a "dyad"), 3 (a "triad"), or larger.
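As a rough illustration of the definition above, the sketch below enumerates every k-ad that can be formed from a small set of actors. It treats a k-ad simply as an unordered tuple of actor codes; the function name `make_kads` and the four country codes are hypothetical choices made for this example, and a real k-adic dataset would also attach covariates and an outcome to each tuple.

```python
from itertools import combinations


def make_kads(actors, k):
    """Return every unordered group of k distinct actors (a "k-ad")."""
    if k < 2:
        raise ValueError("a k-ad needs at least two actors (k >= 2)")
    # Sorting first gives each k-ad a single canonical ordering.
    return list(combinations(sorted(actors), k))


# Hypothetical actor codes, used only for illustration.
countries = ["USA", "GBR", "FRA", "DEU"]

dyads = make_kads(countries, 2)   # k = 2
triads = make_kads(countries, 3)  # k = 3

print(len(dyads), len(triads))
print(triads)
```

With four actors this yields six dyads and four triads; each tuple would become one row of a k-adic dataset.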
Creating datasets with k-ads can be computationally intensive due to the potentially immense size of the dataset. For example, a dataset with all possible dyadic (k = 2), triadic (k = 3), and four-actor (k = 4) combinations of 100 countries contains 4,950 + 161,700 + 3,921,225 = 4,087,875 observations (and that would only be for a single year)! Hence, I show how choice-based sampling methods can reduce the number of k-ads one must include in a dataset.
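The size of a full k-adic dataset follows from binomial coefficients: n actors form "n choose k" distinct k-ads. The sketch below computes those counts for 100 countries and then illustrates the general idea behind choice-based sampling, namely keeping every k-ad in which the event occurred and only a random subset of the non-event k-ads. The function names, the toy event rule, and the sample size are assumptions made for illustration; this is not the procedure from the published article, and estimation on such a sample typically requires a correction (for example, weighting) that is not shown here.

```python
import math
import random
from itertools import combinations


def count_kads(n_actors, k_values):
    """Number of distinct k-ads for each k: n_actors choose k."""
    return {k: math.comb(n_actors, k) for k in k_values}


def choice_based_sample(kads, is_event, n_nonevents, seed=0):
    """Keep all event k-ads plus a random subset of non-event k-ads."""
    events = [kad for kad in kads if is_event(kad)]
    nonevents = [kad for kad in kads if not is_event(kad)]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sampled = rng.sample(nonevents, min(n_nonevents, len(nonevents)))
    return events + sampled


# Dyads, triads, and four-actor k-ads among 100 countries, one year.
counts = count_kads(100, (2, 3, 4))
print(counts, sum(counts.values()))

# Toy illustration: six hypothetical actors (0..5); a triad counts as an
# "event" if it contains both actor 0 and actor 1.
toy_kads = list(combinations(range(6), 3))
sample = choice_based_sample(
    toy_kads, lambda kad: 0 in kad and 1 in kad, n_nonevents=5
)
print(len(toy_kads), len(sample))
```

In the toy case, all event triads are retained and the non-event triads are thinned to five, so the sampled dataset is less than half the size of the full one.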
Easing the difficulties scholars face in creating and working with k-adic data was a core motivation for creating the NewGene software discussed above. I also created two commands for the Stata statistical software: one to help scholars convert a dyadic dataset into a k-adic dataset (kadcreate_dytok.ado & kadcreate_dytok.hlp) and one to add "non-event" observations to a dataset already in k-adic format (kadcreate_zeros.ado & kadcreate_zeros.hlp).
Because of my research on k-adic data, I was invited to write a piece in International Studies Quarterly discussing the questions and theoretical claims for which it is still appropriate to use state-to-state dyadic data (the dyad is perhaps the most commonly used unit of analysis in international relations research). The editors of International Studies Quarterly then created an online symposium in which several quantitative IR scholars responded to the points raised by my piece (and by pieces from Skyler Cranmer and Bruce Desmarais and from Paul Diehl and Thorin Wright).
Presently, I am working on using a k-adic approach to study war onset. This is part of a broader project I am leading with Erik Gartzke to encourage scholars to empirically test the bargaining model of war, the core model presently used to understand the onset of war. We refer to this initiative as Empirical Implications of Bargaining Theory. In addition to holding a number of workshops on the topic (such as a workshop co-hosted by Kris Ramsay at Princeton University in 2015), we were invited to write a piece detailing this project for the Oxford Encyclopedia of Empirical International Relations Research.
Effect Bounds in International Relations Research
Third, in another piece published in Political Analysis and co-authored with Walter Mebane, I encourage quantitative international relations researchers to "embrace uncertainty" by adopting variations of Manski bounds. Whether working with experimental or observational data, scholars must make assumptions (sometimes rather strong assumptions) in order to draw an inference about how an explanatory variable, X, influences an outcome variable, Y. Manski bounds offer an assumption-free approach for defining the plausible range for the effect of a binary variable (e.g., a country being democratic or not; two countries having a territorial dispute or not) on a binary outcome (e.g., the two countries going to war or remaining at peace).
I applied this method in a piece on alliance politics published in the journal International Organization. I was able to apply the technique in that piece because I was working with a binary variable (the negotiation witnessing an offer of trade cooperation) and a binary outcome variable (the negotiation ending in agreement).
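To make the bounding logic concrete, the sketch below computes the textbook no-assumption (worst-case) bounds on the average effect of a binary X on a binary Y. The unobserved potential outcomes are replaced by their logical extremes, 0 and 1. The function name `manski_ate_bounds` and the ten observations are hypothetical, and this is the basic form of the bounds rather than the specific variations developed in the articles described above.

```python
def manski_ate_bounds(x, y):
    """Worst-case bounds on E[Y(1)] - E[Y(0)] for binary x and binary y."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    p_x1 = sum(1 for xi in x if xi == 1) / n
    p_x0 = 1 - p_x1
    # Joint shares: outcome observed to be 1 under each value of x.
    p_y1_x1 = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1) / n
    p_y1_x0 = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1) / n

    # E[Y(1)] is unobserved for units with x = 0, so it lies between
    # p_y1_x1 (all missing outcomes are 0) and p_y1_x1 + p_x0 (all are 1).
    ey1_low, ey1_high = p_y1_x1, p_y1_x1 + p_x0
    # E[Y(0)] is unobserved for units with x = 1, by the same logic.
    ey0_low, ey0_high = p_y1_x0, p_y1_x0 + p_x1

    return ey1_low - ey0_high, ey1_high - ey0_low


# Hypothetical data: x = 1 if a condition is present, y = 1 if the outcome occurs.
x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

lower, upper = manski_ate_bounds(x, y)
print(round(lower, 3), round(upper, 3))
```

For these data the bounds are roughly -0.2 and 0.8. Without further assumptions the interval always has width one and therefore always contains zero, which is why applied work usually adds assumptions step by step to see how far each one narrows the range.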