1. 11/25/2016

    I really liked the state-level plot, especially the overlapping density plots of predicted values. I’ve seen Nate Silver report that electronic voting is significant when other factors are NOT controlled for. Is this because electronic voting is correlated with poor or rich / white or black districts? This doesn’t affect the validity of the finding, but it provides interesting context.

  2. Tracy Carr

    This is a very helpful post and an interesting and thorough analysis. Two comments to consider: (1) (the easier one) a logistic regression model is more appropriate than a linear regression model when the outcome of interest is a proportion, so your current p-value is somewhat suspect; and (2) (the harder one) the model is fit at the national level, and I highly doubt that hackers would have attacked all e-voting machines nationwide. It makes more sense to attack vulnerable machines only in swing states and in districts with a very high density of Democratic voters. Moreover, if a hacker collaborated with a statistician such as myself, I could take available polling data and run simulations to find out where making votes disappear would have the most impact. I hypothesize that simulations on late-October polling data would show that attacking vulnerable districts in Milwaukee County, Wayne County, and Philadelphia County, and making some of those votes disappear, could potentially swing the entire Electoral College.

    • 11/29/2016

      Hi Tracy – thank you for your comments! I like your suggestions and have tried to take the time to do them justice.
      With regard to comment #1: I have updated the article to include a replication of the original models assuming a quasi-binomial response. I elected to use a quasi-binomial model over a standard (binomial) logistic model because of observed under-dispersion in the data. A quasi-binomial likelihood allows a dispersion parameter to be estimated to account for this. You’ll find the new plots and tables at the end of the article. They’re very similar to the results from the linear models, which is to be expected given that the observed proportions are concentrated near 0.5, away from the 0 and 1 extremes.
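      To illustrate what the dispersion parameter does, here is a minimal sketch in Python. It is not the code behind the article: the function name `fit_quasibinomial` and the toy counts are made up for illustration, and it assumes a logit link with a single predictor. It fits the binomial model by iteratively reweighted least squares and then estimates the dispersion as the Pearson chi-square statistic divided by the residual degrees of freedom.

```python
import math

def fit_quasibinomial(x, successes, trials, iters=25):
    """Fit logit(p) = b0 + b1 * x by IRLS; return (b0, b1, dispersion).

    Hypothetical helper for illustration only. Assumes at least three
    observations, x values that are not all identical, and fitted
    probabilities strictly between 0 and 1.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Accumulate the 2x2 weighted normal equations for one Newton step.
        s_w = s_wx = s_wxx = g0 = g1 = 0.0
        for xi, yi, ni in zip(x, successes, trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)      # binomial variance of the count
            r = yi - ni * p             # raw residual (observed - expected)
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            g0 += r
            g1 += r * xi
        det = s_w * s_wxx - s_wx * s_wx
        b0 += (s_wxx * g0 - s_wx * g1) / det
        b1 += (s_w * g1 - s_wx * g0) / det

    # Pearson chi-square: squared residuals scaled by the binomial variance.
    chi2 = 0.0
    for xi, yi, ni in zip(x, successes, trials):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        chi2 += (yi - ni * p) ** 2 / (ni * p * (1.0 - p))
    dispersion = chi2 / (len(x) - 2)    # two fitted coefficients
    return b0, b1, dispersion

# Toy data: counts that vary less than a binomial with n = 100 would suggest.
b0, b1, phi = fit_quasibinomial([0, 1, 2, 3], [48, 52, 49, 51], [100] * 4)
print(f"intercept={b0:.4f} slope={b1:.4f} dispersion={phi:.3f}")
```

      A dispersion below 1 indicates under-dispersion relative to the binomial. The coefficient estimates are the same as in the plain binomial fit; only the standard errors change, being multiplied by the square root of the dispersion.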
      With regard to comment #2: Not all of the models are fit at the national level. Twenty-two state-level models are fit and shown in the state-wise figures (hence the caution about a multiple comparisons correction). Two national models are fit and shown in tables (one bivariate model and one multiple regression model). I agree that it would be simple to simulate “interventions”, or to estimate optimal interventions by which hackers might minimally influence returns to achieve a desired national-level outcome while avoiding detection. I suspect that we would need sub-county-level returns data, which are unavailable, to identify this behavior.

  3. Tracy Carr

    I just went to the Verified Voting website and noticed that Milwaukee County was paper ballot only. So that avenue of attack was not available in Wisconsin. I withdraw my hypothesis above. It is possible that other avenues for cherry-picking districts are available, but I certainly haven’t done that analysis myself.
