Predicting House and Senate Votes: How it Works

My model for predicting congressional votes is fairly simple. It has two steps: the first estimates the probability that a given lawmaker will vote in favor of a bill, based on peers' announced positions; the second runs a Monte Carlo simulation using the resulting “yes” probabilities to estimate the likelihood that the bill earns 218 yeas, a majority in the 435-seat House.

Logistic Regression

Dependent variable: Announced yes/no votes, taken from available whip counts for high-profile bills. These counts come either from engaged Twitter users who actively track lawmaker statements or from news organizations such as The Hill, the New York Times, and the Washington Post.

Independent variables: First- and second-dimension DW-NOMINATE scores and membership in various caucuses (e.g., the House Freedom Caucus, the Congressional Black Caucus).  I’ll be testing other variables to see whether they add significant explanatory or predictive value.

I then run 1,000 bootstrapped predictions and, for lawmakers who are undecided or otherwise mum on their support for a bill, use the mean predicted probability across those runs in the ensuing step.
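To make the bootstrap step concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not my production code: the function names, the gradient-descent fitting routine, the small ridge penalty (added so that perfectly separable resamples stay stable), and the toy data are all hypothetical. Each feature row is assumed to be [first DW-NOMINATE score, second DW-NOMINATE score, caucus flag].

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # w[0] is the intercept; w[1:] line up with the features in x.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, n_iter=300, l2=0.01):
    # Batch gradient descent on the logistic loss.
    # The l2 penalty is an assumption of this sketch, not part of the model description.
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, grad)]
    return w

def bootstrap_mean_probs(X, y, X_undecided, n_boot=1000, seed=0):
    # Resample announced lawmakers with replacement, refit, and
    # average the predicted "yes" probability for each undecided lawmaker.
    rng = random.Random(seed)
    n = len(X)
    totals = [0.0] * len(X_undecided)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        w = fit_logistic([X[i] for i in idx], [y[i] for i in idx])
        for j, x in enumerate(X_undecided):
            totals[j] += predict(w, x)
    return [t / n_boot for t in totals]

# Hypothetical toy data: 12 lawmakers with announced positions (1 = yes, 0 = no).
X_announced = [
    [-0.8, 0.1, 0], [-0.6, -0.2, 1], [-0.5, 0.3, 0],
    [-0.4, 0.0, 1], [-0.3, -0.1, 0], [-0.7, 0.2, 0],
    [0.8, -0.1, 0], [0.6, 0.2, 1], [0.5, -0.3, 0],
    [0.4, 0.0, 1], [0.3, 0.1, 0], [0.7, -0.2, 0],
]
y_announced = [0] * 6 + [1] * 6
X_undecided = [[0.6, 0.0, 0], [-0.6, 0.0, 0], [0.05, 0.1, 1]]

# n_boot is kept small here so the demo runs quickly; the post uses 1,000.
mean_probs = bootstrap_mean_probs(X_announced, y_announced, X_undecided, n_boot=50)
print([round(p, 3) for p in mean_probs])
```

In practice a statistics library would replace the hand-rolled fit; the point here is only the resample, refit, predict, and average loop.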

Monte Carlo Simulation

With the probabilities and presumed actuals (lawmakers could change their minds or vote against their public stance at the behest of party leadership), I simulate the congressional vote 100,000 times to estimate the bill’s likelihood of passing (or of attaining any given number of votes).
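The simulation step can be sketched as follows. This is a simplified illustration with hypothetical names and made-up inputs: it treats announced positions as fixed and each undecided lawmaker as an independent coin flip with their estimated “yes” probability. Allowing announced lawmakers to defect would mean giving them a probability slightly below 1 rather than counting them outright.

```python
import random

def simulate_passage(yes_probs, announced_yes, threshold=218, n_sims=100_000, seed=0):
    # yes_probs: estimated "yes" probabilities for undecided lawmakers.
    # announced_yes: count of lawmakers presumed to vote yes (held fixed in this sketch).
    # Returns the share of simulated votes that reach `threshold` yeas.
    rng = random.Random(seed)
    passes = 0
    for _ in range(n_sims):
        yeas = announced_yes + sum(rng.random() < p for p in yes_probs)
        if yeas >= threshold:
            passes += 1
    return passes / n_sims

# Hypothetical inputs: 205 announced yeas and 30 undecided lawmakers.
undecided_probs = [0.9] * 5 + [0.6] * 10 + [0.3] * 15
# n_sims is reduced here for speed; the post uses 100,000.
print(simulate_passage(undecided_probs, announced_yes=205, n_sims=10_000))
```

Changing `threshold` gives the likelihood of reaching any other vote total, such as a two-thirds supermajority.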

