The Palm Beach Story?

Robert Max Jackson

Extending the Analysis: Is the Buchanan Vote Anomalous When Examined on a National Scale?

In response to some thoughtful questions raised by a number of people, I have explored the possibilities for testing the peculiarity of the Palm Beach Buchanan vote on a national scale. Here are the most essential results of that analysis.

When we raise the comparative standard from Florida to the United States, we are making a demanding test. Several obstacles cloud efforts to reproduce at the national level the state-level results that suggest the Palm Beach vote was anomalous. First, the sheer numbers weigh against it. The number of units being analyzed increases roughly fiftyfold, and, of course, the larger the number of units, the more likely the set is to include extreme values. Second, and more important, the states are highly diverse, not only in their demographic and political compositions, but also in the actual political contests being held. Thus, we are asking if we can identify 3,000 votes cast for President as being extreme in the one county in which they occurred compared to about 3,000 other counties which together accounted for 100,000,000 votes, and in which the actual political contests differed considerably across the fifty states. This is a modern version of the princess and the pea.
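The point about numbers can be made concrete with a rough calculation. The sketch below is only an illustration under strong assumptions that the real data do not satisfy: it treats county deviations as independent draws from a standard normal distribution and uses an arbitrary cutoff of three standard deviations. The function name is hypothetical.

```python
import math

def prob_at_least_one_extreme(n_units, z_cutoff):
    """Chance that at least one of n independent standard-normal
    values lies farther than z_cutoff from zero (in either direction)."""
    # Two-sided tail probability for a single unit: P(|Z| > z_cutoff).
    p_single = math.erfc(z_cutoff / math.sqrt(2))
    return 1.0 - (1.0 - p_single) ** n_units

# Assumption: 67 units for Florida, about 3,000 for the nation.
p_florida = prob_at_least_one_extreme(67, 3.0)
p_nation = prob_at_least_one_extreme(3000, 3.0)
print(round(p_florida, 3), round(p_nation, 4))
```

Under these assumptions, a deviation of that size somewhere among 67 units is fairly unlikely, while among 3,000 units it is close to certain. This is why being extreme within one state does not by itself establish being extreme nationally.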

Given the numbers with which we begin, it would be unreasonable to expect unequivocal results. Still, it is interesting to try, if only to get a measure of just how odd those Palm Beach County votes really were. People have fairly and reasonably called into question analyses based only on comparing Palm Beach with other counties in Florida, implying that being an oddity among sixty-seven counties is not enough; it must be odd among 3,000. Is the Palm Beach vote for Buchanan a clear oddity at this standard? To answer this question, let us look at the national data in a couple of ways.


First Analysis: Expanding the Predictors for the Buchanan Vote to the National Level

First, let us look at a chart showing all counties in the United States (except those in Michigan, where Buchanan did not appear as a candidate) that displays the actual vote for Buchanan compared to the predicted vote. The prediction for each county was based on the following: 1) the degree of support for Buchanan in other counties in the same state which were similarly urban or nonurban (more on this below, after the charts), 2) prior political leanings in the county as indicated by support for Dole in the '96 election and by the degree to which the county gave more or less support for Perot in the '96 election than did other similarly urban or nonurban counties in its state, and 3) demographic characteristics indicated by the number of blacks and the proportion of college educated in the county. Here are the results (all values were transformed by natural logs before regression to lessen heteroskedasticity issues; the predicted values were transformed back to represent votes for simpler display).

What does this show? Well, obviously there are counties with more votes for Buchanan, but the difference between the Buchanan vote and what we would predict according to the criteria we applied is greater for Palm Beach than anywhere else. This portrait is confirmed by a chart of the Studentized residuals showing how much the (logged) vote counts deviated from the prediction after statistical adjustment. The deviations are charted against the population of a county (for simplicity, the display is limited to those counties with more than 75,000 votes in the Presidential election).
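For readers who want to see the mechanics, here is a minimal sketch of a logged regression, a back-transformation of the predictions, and Studentized residuals. It is not the analysis reported above. It uses a single predictor (total votes cast) instead of the several predictors described, it computes the internally Studentized version of the residual (which variant was used above is not stated), and all of the numbers are invented for illustration.

```python
import math

def studentized_residuals(x, y):
    """Fit y = a + b*x by least squares.
    Returns (a, b, internally studentized residuals)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    a = y_mean - b * x_mean
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))
    # Leverage of each point in a one-predictor regression.
    leverage = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    t = [e / (s * math.sqrt(1.0 - h)) for e, h in zip(resid, leverage)]
    return a, b, t

# Hypothetical counties: total votes cast and votes for a minor candidate.
total_votes = [20000, 45000, 80000, 150000, 300000, 430000, 600000]
minor_votes = [110, 230, 390, 700, 1350, 9000, 2600]

# Natural logs are taken before the regression, as in the text.
log_x = [math.log(v) for v in total_votes]
log_y = [math.log(v) for v in minor_votes]
a, b, t = studentized_residuals(log_x, log_y)

# Predictions are transformed back to vote counts for display.
predicted_votes = [math.exp(a + b * lx) for lx in log_x]

# Index of the county whose logged vote deviates most from its prediction.
most_extreme = max(range(len(t)), key=lambda i: abs(t[i]))
print(most_extreme, round(t[most_extreme], 2))
```

In this invented data, the sixth county receives several times the share that the others do, and it is the one the Studentized residuals single out. The division by `sqrt(1 - h)` matters because a county far from the center of the predictor values pulls the fitted line toward itself, so its raw residual understates how unusual it is.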

Second Analysis: Prediction of the Percent Voting for Buchanan

While the preceding charts make a fairly strong case, they do not respond directly to the objection that we should be looking at the proportion of votes received by Buchanan rather than the absolute number. This is inherently a much harder test because it suggests we place large counties like Palm Beach on a level with the far more numerous small, rural counties. Compared to all counties, Palm Beach has the 7th largest vote count for Buchanan, but ranks only 859th in the percentage vote it gave him. On the face of it, this seems to make it implausible that we can show the percentage is an outlier, but realistically, this depends on the quality of our prediction. Let us see what we can do.

This analysis looks at the proportion of votes received by Buchanan. To see if Palm Beach looks anomalous compared to all other counties in the nation, this percentage was predicted through standard regression procedures using these predictors: 1) the degree of support for Buchanan in other counties in the same state which were similarly urban or nonurban (more on this below, after the charts), 2) prior political leanings in the county as indicated by support for Dole in the '96 election, support for Perot in the '96 election, and by the degree to which the county gave more or less support for Perot in the '96 election than did other similarly urban or nonurban counties in its state. (To conform with statistical concerns, the actual numbers used were log odds ratios.) The results of this analysis appear in the following two charts.

Given the difficulty of the analysis being attempted, these charts show a surprising degree of evidence supporting the anomalous character of the Palm Beach Buchanan vote. The first chart shows the actual percentage vote compared to the predicted percentage for all counties with more than 100,000 votes cast for the President (the predictions are based on the regression using all counties). This chart shows how the Palm Beach Buchanan vote percentage is exceptionally large compared to what we predict, in contrast to all others but one. By restricting this chart to the larger counties, we have focused our attention on those where the small percentages received by Buchanan actually make a difference, and where we can expect to have greater accuracy in our predictions. But one could still argue that if we look at this chart with all the counties, Palm Beach would disappear as an exceptional case, as it is overcome by the mass of small counties where the Buchanan vote percentage is higher. To offset this assessment, consider the second chart. It displays the Studentized residuals (of the log odds ratios) for all counties in the United States against the size of those counties. This shows how the difficulty of predicting rises as the size of the counties decreases. It also shows how much of a standout is the error in predicting the vote percentage for Buchanan in Palm Beach, an error only exceeded in small counties where the percentage becomes harder to predict.
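The log odds transformation mentioned above can be sketched as follows. This assumes that "log odds ratios" refers to the ordinary logit of a vote share, that is, the natural log of the share divided by one minus the share. The function names and the example county are hypothetical.

```python
import math

def log_odds(share):
    """Log odds (logit) of a vote share strictly between 0 and 1."""
    return math.log(share / (1.0 - share))

def share_from_log_odds(value):
    """Inverse of log_odds: map a log-odds value back to a share in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))

# Hypothetical county: 3,000 votes for the candidate out of 400,000 cast.
share = 3000 / 400000
z = log_odds(share)
recovered = share_from_log_odds(z)
print(round(z, 3), round(recovered, 4))
```

The reason for the transformation is that a share is confined between 0 and 1, and small shares are crowded against the lower bound. The log odds has no such bounds, which suits ordinary regression better, and predictions made on that scale can be mapped back to percentages with the inverse function.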

Further Notes on the Data Used

The data used here are the original vote counts for all counties in the United States as reported by ABC and placed into a data set by Professor Robert Shimer of Princeton (http://www.princeton.edu/~shimer/election.html). With this were merged data on the results of the '96 Presidential election (derived from data available at http://www.iosphere.net.au/~lance/pres1996index.html and http://h0040055bf148.ne.mediaone.net/~dave/POL/pe1996data.html); double counting of areas as both cities and counties and differences in spellings were resolved. Then, county level demographic data from the 1990 census were also merged, again resolving differences in the identification of counties (the census data were obtained from http://fisher.lib.virginia.edu/ccdb/county94.html). The resulting data set includes 3111 counties with all of Alaska being represented as one county (as their votes are presented that way) and several minor counties being lost because they could not be properly matched.

Further Notes on the Definition of "Context"

By defining all counties to be urban or non-urban and cross-cutting that dimension with states, 99 geographical contexts were defined (Alaska lacked county distinctions). While the boundaries of states are obvious, those of urban areas are not. What distinguishes urban from non-urban counties is clear in the extremes but vague at the boundaries. Big cities clearly identify the counties they cover as urban. The great majority of counties in the United States have a relatively low population and density. The areas in between–small cities, suburbs, and the like–have a murkier status (and, of course, we can refine our notions into many more categories based on diverse criteria). For this purpose, I sought only to apply a reasonable standard that would identify as urban or big those counties that were large compared to others in their state (a local relative standard) or to others in the nation. Specifically, these were defined as including the top three counties in any state with six or more counties and any other county that had substantial size in a national ranking (defined as having a population in the top five percent or having both the population and population density in the top ten percent). Thus, the more rural states have their largest counties defined as their urban context even if they are not populous by national standards, while highly populated counties are counted as urban even if they do not stand out in their state because it has a surplus of urban areas. Of the 3110 counties present in the data (ignoring Alaska), 291 were defined as urban by these means.
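The classification rule just described can be sketched in a few lines. This is one reading of the rule and not the original procedure: it assumes that the "top three counties" in a state are ranked by population, it resolves ties by input order, and the field names and sample data are invented.

```python
from collections import defaultdict

def top_percent(values, percent):
    """Indices of the entries ranking in the top `percent` percent of `values`."""
    k = len(values) * percent // 100
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[:k])

def classify_urban(counties):
    """counties: list of dicts with 'state', 'population', 'density'.
    Returns one boolean per county (True = urban)."""
    pops = [c["population"] for c in counties]
    dens = [c["density"] for c in counties]

    # National standard: top 5% by population, or top 10% on both measures.
    urban = top_percent(pops, 5) | (top_percent(pops, 10) & top_percent(dens, 10))

    # Local standard: the three largest counties in states with six or more.
    by_state = defaultdict(list)
    for i, c in enumerate(counties):
        by_state[c["state"]].append(i)
    for members in by_state.values():
        if len(members) >= 6:
            ranked = sorted(members, key=lambda i: pops[i], reverse=True)
            urban.update(ranked[:3])

    return [i in urban for i in range(len(counties))]

# Hypothetical data: a rural state A, a populous state B with only
# four counties, and a mid-sized state C.
counties = (
    [{"state": "A", "population": (i + 1) * 1000, "density": 5} for i in range(6)]
    + [{"state": "B", "population": 500000 + i * 100000, "density": 900 + i * 100}
       for i in range(4)]
    + [{"state": "C", "population": 10000 + i * 1000, "density": 20} for i in range(10)]
)
flags = classify_urban(counties)
print(sum(flags), "of", len(flags), "counties classed as urban")
```

In the sample data, state A's three largest counties count as urban even though they are tiny by national standards, while state B's two largest counties qualify only through the national ranking because the state has fewer than six counties.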

Once the contexts were defined, it was possible to use them in a simple way to help predict the votes in each county. The contextual preference for Buchanan was defined as the percentage vote given to Buchanan by all counties in a particular context omitting the county for which the context was being calculated. Note that this type of predictor would not be appropriate for many problems we might wish to examine, but it has considerable value for pursuing the question whether a particular county has anomalous outcomes. In particular, it would not make sense to use this approach if we thought a number of counties sharing a context might have similar anomalous outcomes, as the context defined this way could cloak rather than reveal the problem. It also would not make sense if we were seeking to identify the demographic, cultural, and political causes of voting outcomes, as it would submerge much of these effects into the regional context. The way that context is defined and used here is specifically aimed at the problem being pursued.
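A minimal sketch of the contextual predictor follows. It assumes that the percentage is pooled, meaning the candidate's votes summed over the other counties in the context divided by the total votes summed over those same counties, and not an average of county percentages. The field names and figures are hypothetical.

```python
from collections import defaultdict

def contextual_share(counties):
    """counties: list of dicts with 'context', 'candidate_votes', 'total_votes'.
    Returns, for each county, the pooled candidate share among the OTHER
    counties in its context, or None if it is alone in its context."""
    cand = defaultdict(int)
    total = defaultdict(int)
    for c in counties:
        cand[c["context"]] += c["candidate_votes"]
        total[c["context"]] += c["total_votes"]

    shares = []
    for c in counties:
        # Subtract the county's own votes so it cannot influence its own predictor.
        other_cand = cand[c["context"]] - c["candidate_votes"]
        other_total = total[c["context"]] - c["total_votes"]
        shares.append(other_cand / other_total if other_total > 0 else None)
    return shares

# Hypothetical data: three urban counties in one state, plus a state
# reported as a single unit.
counties = [
    {"context": ("FL", "urban"), "candidate_votes": 3000, "total_votes": 400000},
    {"context": ("FL", "urban"), "candidate_votes": 800, "total_votes": 300000},
    {"context": ("FL", "urban"), "candidate_votes": 700, "total_votes": 200000},
    {"context": ("AK", "all"), "candidate_votes": 500, "total_votes": 100000},
]
shares = contextual_share(counties)
print(shares)
```

Leaving out the county's own votes is what makes the predictor useful for spotting an anomaly: in the sample data the first county's unusually high count raises the contextual share seen by its neighbors but leaves its own predictor untouched. A county with no neighbors in its context has no contextual share at all, which is returned here as `None`.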


Copyright ©  2000 NYU/Robert Max Jackson. All rights reserved.
Last Revised:  12/4/00
 
