This post is an attempt to map the political space in each state (and territory) based on the above-the-line Senate votes. I am not overly satisfied with the results. Principal component analysis (at least as I implement it – it's not obvious what to do about exhausting votes) tends to isolate the major party how-to-vote cards and call them left and right, and then sort the other parties by ideology on the vertical axis. My hope of improving on that with force-directed graphs was half-realised: the left-right sorting looks much better, not being dominated by the HTV cards, but there is usually no clear meaning to the vertical axis.
I thought this post would take a day, and it's instead taken me a week, as I've barked up wrong trees and gone down rabbit holes. I don't really know what I'm doing, and perhaps there are obvious improvements to make. A page with lots of code is here, and all code and images can be downloaded here (5.4 MB). I go through a fairly long description of why I wanted something better than PCA before getting into the results, but you can skip to NSW, Vic, Qld, WA, SA, Tas, ACT, NT.
I have a reasonable understanding of PCA, but I don't understand it deeply, nor do I have the ability to predict details of the results. So while I was happily tweeting PCA correlation circles, I wasn't really sure if preference numbers are an appropriate sort of variable for this procedure. I mean, preferences are these things that go up in increments of 1, but in PCA you convert the data to z-scores; it feels a bit dimensionally wrong.
So I made a toy dataset to investigate, where parties were arranged on a spectrum left to right on the ballot paper, voters have some set ideology and preference the closest parties, and the population of voters was uniformly distributed across the spectrum. Specifically, each party got two votes, one where the second preference went immediately to the right, and one where the second preference went immediately to the left. So, for instance, if there are ten groups, then the twenty votes would be (the following numbers are preferences as they'd appear on the ballot paper):
1,2,3,4,5,6,7,8,9,10
1,2,3,4,5,6,7,8,9,10
2,1,3,4,5,6,7,8,9,10
3,1,2,4,5,6,7,8,9,10
4,2,1,3,5,6,7,8,9,10
5,3,1,2,4,6,7,8,9,10
6,4,2,1,3,5,7,8,9,10
7,5,3,1,2,4,6,8,9,10
8,6,4,2,1,3,5,7,9,10
9,7,5,3,1,2,4,6,8,10
10,8,6,4,2,1,3,5,7,9
10,9,7,5,3,1,2,4,6,8
10,9,8,6,4,2,1,3,5,7
10,9,8,7,5,3,1,2,4,6
10,9,8,7,6,4,2,1,3,5
10,9,8,7,6,5,3,1,2,4
10,9,8,7,6,5,4,2,1,3
10,9,8,7,6,5,4,3,1,2
10,9,8,7,6,5,4,3,2,1
10,9,8,7,6,5,4,3,2,1
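The pattern in those twenty votes can be generated for any number of groups. Here is a minimal Python sketch of one way to do it; the function name `toy_ballots` is made up for illustration and is not taken from the code linked above.

```python
def toy_ballots(n_groups):
    """Two ballots per party on a line: one leaning left first, one right first.

    Each ballot is a list of preference numbers in ballot-paper order,
    so ballot[g] is the preference written next to group g (0-indexed).
    """
    ballots = []
    for party in range(n_groups):
        for first_step in (-1, 1):  # -1: second preference goes left, +1: right
            order = [party]
            for gap in range(1, n_groups):
                # Alternate sides, moving one step further out each time.
                for side in (first_step, -first_step):
                    neighbour = party + side * gap
                    if 0 <= neighbour < n_groups:
                        order.append(neighbour)
            ballot = [0] * n_groups
            for rank, group in enumerate(order, start=1):
                ballot[group] = rank
            ballots.append(ballot)
    return ballots


for ballot in toy_ballots(10):
    print(",".join(str(p) for p in ballot))
```

At either end of the spectrum there is nothing further out to preference, so the two ballots for an extreme party come out identical, which is the irregularity visible in the first and last pairs of votes.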
This is a perfectly linear political spectrum (albeit with a bit of an irregularity with the extreme parties whose two votes are identical), so it would be convenient if the first principal component ordered the parties as such. But that is not what happens. Here are the results (for 38 groups, because I was using the Qld ballot paper as a template):
The problem is that while you can clearly trace out the spectrum, the horizontal axis isn't enough – if you just look at PC1, it looks like the most left-wing party is around party 9 or 10 instead of 1 (and similarly on the right). That's pretty bad! Maybe with real-world preferences it wouldn't be quite so jumbled at the extremes, but it doesn't inspire confidence.
Also of note is that despite being a one-dimensional political spectrum, the first principal component only explains 70% of the variance. It gets worse if these perfectly one-dimensional voters stop preferencing at 6, as most real above-the-line voters do – only 18.5% of the variance is explained by the first principal component, and there are really weird edge effects:
(Here I've replaced all blank preferences with 38. Doing PCA on data with missing values wasn't something I wanted to learn, so I had to replace the blanks with something. Replacing with 7 doesn't qualitatively change these toy results much. I did a quick test on the Victorian results. Replacing blanks with the next preference (e.g. a 7 if the voter wrote 1-6) drastically distorted the axes, but the effect of the HTV cards was still clear. Replacing blanks with the number half-way between the next preference and the number of groups (e.g., (7 + 38)/2 = 22.5) gave very similar results to what is shown below.)
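To make the blank-filling options concrete, here is a stdlib-only Python sketch. It rebuilds the toy ballots, truncates them at 6, fills the blanks under the three rules just described, and estimates the share of variance on the first principal component. The function names are hypothetical, and the power iteration on the correlation matrix of the z-scored preferences is a shortcut I've assumed for illustration. A real analysis would use a library PCA routine, and this sketch is not guaranteed to reproduce the percentages quoted above.

```python
import math
import random


def toy_ballots(n_groups):
    """Two ballots per party on a line (left-leaning and right-leaning)."""
    ballots = []
    for party in range(n_groups):
        for first_step in (-1, 1):
            order = [party]
            for gap in range(1, n_groups):
                for side in (first_step, -first_step):
                    if 0 <= party + side * gap < n_groups:
                        order.append(party + side * gap)
            ballot = [0] * n_groups
            for rank, group in enumerate(order, start=1):
                ballot[group] = rank
            ballots.append(ballot)
    return ballots


def truncate(ballot, keep):
    """Blank out (None) every preference after the first `keep`."""
    return [p if p <= keep else None for p in ballot]


def fill_blanks(ballot, rule):
    """Replace blanks using one of the three rules discussed in the text."""
    n_groups = len(ballot)
    next_pref = sum(p is not None for p in ballot) + 1
    value = {
        "max": n_groups,                        # number of groups
        "next": next_pref,                      # e.g. 7 after a 1-6 vote
        "halfway": (next_pref + n_groups) / 2,  # e.g. (7 + 38) / 2 = 22.5
    }[rule]
    return [value if p is None else p for p in ballot]


def first_pc_share(rows, iterations=1000, seed=0):
    """Fraction of total variance on PC1 of the z-scored columns."""
    n_rows = len(rows)
    columns = []
    for col in zip(*rows):
        mean = sum(col) / n_rows
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n_rows)
        if sd > 0:  # constant columns carry no variance and are skipped
            columns.append([(v - mean) / sd for v in col])
    m = len(columns)
    corr = [[sum(a * b for a, b in zip(ci, cj)) / n_rows for cj in columns]
            for ci in columns]
    rng = random.Random(seed)
    vec = [rng.uniform(-1, 1) for _ in range(m)]
    for _ in range(iterations):  # power iteration towards the top eigenvector
        vec = [sum(c * v for c, v in zip(row, vec)) for row in corr]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    top = sum(v * sum(c * w for c, w in zip(row, vec))
              for v, row in zip(vec, corr))
    return top / m  # the trace of a correlation matrix equals m


full = toy_ballots(10)
print("full ballots:", round(first_pc_share(full), 3))
for rule in ("max", "next", "halfway"):
    filled = [fill_blanks(truncate(b, 6), rule) for b in full]
    print(rule, round(first_pc_share(filled), 3))
```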
The small fraction of variance explained once votes exhaust means that I'm not sure what to make of the variance-explained percentages in the real data. I haven't quoted them in the results below, though you can dig them out of my tweet thread. Maybe I could create another one-dimensional dataset based on the results below: take a calculated political spectrum, then replace all preferences from 2 onwards by the closest parties to whoever got the primary vote. Do PCA on that dataset, and use the percentages of variance explained as an ugly but maybe useful benchmark. But if I do that, it'll be on another day.
OK, so the first principal component of voting preferences doesn't necessarily give us the main political spectrum in as much detail as we'd like. My next idea was to use gradient descent (or something related): define some penalty function that is high when unlike parties are close together, and evolve until the penalty is low (and unlike parties are far away). My attempts on that front generally failed, as I instead discovered lots of useless local minima when the apparent spectrum doubled back on itself.
Eventually I decided to borrow from force-directed graph drawing, in particular the algorithm of Kamada and Kawai. A distance is defined between each pair of parties (the distance between A and B will be low when voters for A usually give an early preference to B, and vice versa), and then the system is modelled as a set of springs, each spring having an equilibrium length equal to the distance just described (the "graph-theoretic distance").
How to properly define the graph-theoretic distance? I tried a few approaches (once again, there are choices on how to deal with exhaustion). The basic idea is that distances get defined for each vote between the party that got the primary vote and all of the subsequent preferences. So if, say, a vote went 1 ALP, 2 Grn, 3 NXT before exhausting, then that would contribute a distance of 1 between ALP and Grn, a distance of 2 between ALP and NXT, and then some choice for the rest. (In my early tests, including those in the paragraphs that follow, these distances were symmetric right the way through the calculation, i.e., I'd also define Grn to ALP as 1, NXT to ALP as 2, etc. For the main results on this page, I stored the directional distances separately, and took an average at the end.)
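As a rough illustration of the directional version, here is a Python sketch. It assumes that a blank is treated as the number of groups, which is only one of the possible choices for exhausted votes, and that a party with no primary votes simply contributes nothing in its own direction. The function names and the four-party example are made up.

```python
def directional_distances(ballots, n_groups):
    """Mean distance from each primary-vote party to every other party.

    A ballot is a list of preferences in ballot-paper order, with None
    for a blank. A preference of k contributes a distance of k - 1.
    """
    totals = [[0.0] * n_groups for _ in range(n_groups)]
    counts = [0] * n_groups
    for ballot in ballots:
        first = ballot.index(1)  # group that received the primary vote
        counts[first] += 1
        for other, pref in enumerate(ballot):
            if other == first:
                continue
            filled = n_groups if pref is None else pref  # assumed blank rule
            totals[first][other] += filled - 1
    return [
        [totals[a][b] / counts[a] if counts[a] else None
         for b in range(n_groups)]
        for a in range(n_groups)
    ]


def symmetrise(directional):
    """Average the A-to-B and B-to-A distances, using whichever exist."""
    n = len(directional)
    sym = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            known = [d for d in (directional[a][b], directional[b][a])
                     if d is not None]
            sym[a][b] = sym[b][a] = sum(known) / len(known) if known else None
    return sym


parties = ["ALP", "Grn", "NXT", "Lib"]
ballots = [
    [1, 2, 3, None],     # 1 ALP, 2 Grn, 3 NXT, then exhausts
    [2, 1, None, None],  # 1 Grn, 2 ALP, then exhausts
    [None, 3, 2, 1],     # 1 Lib, 2 NXT, 3 Grn, then exhausts
]
directional = directional_distances(ballots, len(parties))
distance = symmetrise(directional)
for name, row in zip(parties, distance):
    print(name, row)
```

The first ballot reproduces the example in the text: ALP to Grn is 1 and ALP to NXT is 2, while the blank next to Lib is filled with 4 and so contributes a distance of 3.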
When the distance between two parties is greater than the graph-theoretic distance between them, the spring pulls them towards each other; when they're too close, the spring pushes them apart. After randomising the starting positions, it doesn't take long for the system to evolve into an equilibrium. (Often this equilibrium will only be a local optimum, and so for the main results I ran 50 trials with different randomised starting locations and took the average of the final locations.)
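The spring picture can be sketched in a few lines of Python. This is not Kamada and Kawai's actual scheme: I'm assuming equal spring constants and a plain gradient step of size 1/n on the total spring energy, and the target distances are a made-up six-party line. Averaging over trials is not shown, since the runs would first need to be brought into a common orientation.

```python
import math
import random


def stress(points, dist):
    """Total spring energy: half the squared mismatch, summed over pairs."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            total += 0.5 * (r - dist[i][j]) ** 2
    return total


def random_start(n, seed):
    rng = random.Random(seed)
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


def relax(points, dist, steps=500):
    """Move every point along the net spring force, all points at once."""
    n = len(points)
    for _ in range(steps):
        moved = []
        for i, (xi, yi) in enumerate(points):
            fx = fy = 0.0
            for j, (xj, yj) in enumerate(points):
                if i == j:
                    continue
                r = max(math.hypot(xi - xj, yi - yj), 1e-12)
                # pull > 0: spring is stretched, so i is drawn towards j;
                # pull < 0: spring is compressed, so i is pushed away.
                pull = (r - dist[i][j]) / r
                fx -= pull * (xi - xj)
                fy -= pull * (yi - yj)
            moved.append((xi + fx / n, yi + fy / n))
        points = moved
    return points


# Hypothetical target distances: six parties evenly spaced on a line.
dist = [[abs(i - j) for j in range(6)] for i in range(6)]
for seed in range(5):
    start = random_start(6, seed)
    final = relax(start, dist)
    print(seed, round(stress(start, dist), 3), round(stress(final, dist), 6))
```

Printing the final energy for several seeds is a quick way to see whether different starting positions settle into different equilibria.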
So, how does that idea do on the toy one-dimensional-spectrum dataset?
Badly. OK, what if we square all the distances?
(Note that the orientation of the line is completely arbitrary.) That's better, but the parties should be evenly spaced, and the above plot has the parties in the middle further apart than at the extremes. So, try cubing the distances:
That looks OK. And with that level of rigour, I cube all the distances for the results in this page. (A full set of images, with various parameter choices, is in the download.)
What happens when preferences exhaust after 6?
Maybe it's asking too much to get a one-dimensional spectrum popping out when voters only preference 6 out of 38 parties. As far as the votes are saying, the most left-wing party is as far away from the 7th-most left-wing party as it is from the most right-wing party. A not-quite-complete circle is a pretty valid way to resolve that tension. So let's carry on.
This algorithm does not isolate preferred directions (as PCA does), so I arbitrarily translate and rotate the final positions so that Labor and the main Coalition party are equidistant either side of the origin at y = 0, and then flip the y-axis if necessary so that One Nation is in the upper half-plane. (I use the same left-right/up-down conventions for the PCA plots, where axes can be flipped arbitrarily.)
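That normalisation is a translation, a rotation and an optional reflection. A minimal Python sketch, assuming Labor goes on the left and the Coalition on the right (the function name, party labels and coordinates are made up for illustration):

```python
import math


def align(points, labor, coalition, one_nation):
    """Put Labor and the Coalition on the x-axis, One Nation above it.

    `points` maps a party label to an (x, y) pair.
    """
    (lx, ly), (cx, cy) = points[labor], points[coalition]
    mid_x, mid_y = (lx + cx) / 2, (ly + cy) / 2
    # Rotating by minus this angle sends the Labor-to-Coalition
    # direction onto the positive x-axis.
    angle = math.atan2(cy - ly, cx - lx)
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    aligned = {}
    for name, (x, y) in points.items():
        dx, dy = x - mid_x, y - mid_y
        aligned[name] = (dx * cos_a - dy * sin_a, dx * sin_a + dy * cos_a)
    if aligned[one_nation][1] < 0:  # flip so One Nation is in the upper half
        aligned = {name: (x, -y) for name, (x, y) in aligned.items()}
    return aligned


layout = {"ALP": (1.0, 1.0), "Lib": (1.0, 3.0), "PHON": (2.0, 2.5)}
for name, (x, y) in align(layout, "ALP", "Lib", "PHON").items():
    print(name, round(x, 3), round(y, 3))
```

Distances between parties are unchanged by these steps, so the alignment affects only how the plot is oriented, not its shape.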
I don't think my choices of parameters and algorithms are in any way definitive, and in that spirit I've put some semi-cryptic titles in the plots below. The 'max_asym' is my term for replacing blanks with the number of groups (the maximum possible preference), and using asymmetric distances: working out the average distance from, e.g., NXT primary votes to Liberal preferences separately from Liberal primary votes to NXT preferences, and then averaging the two at the end. (Using symmetric distances usually results in very circular arrays of points, with popular parties at the bottom and micro-parties at the top.) The d^3 indicates that I cubed all distances.
(In all plots below, the legend labels the Coalition party or group as "Lib", because I didn't work out how to programmatically change legend names in ggplot.)
In the more populous states, the PCA clearly generates clusters based on the major parties' HTV cards. Part of this might be an artefact of me setting blank preferences to the number of groups (for NSW, 41). This number is big when there are lots of groups on the ballot paper, and perhaps this exaggerates the effect of the HTVs. There are also a lot more ways for preferences to scatter when a voter deviates from the card, which should make for weaker relationships between a major party and parties not on the HTV.
The second principal component neatly separates the non-major-HTV parties into a spectrum with One Nation at one end and the Arts Party and Sustainable Australia at the other. The LDP was on Labor's HTV card but there's only the barest suggestion of it in these statistics. The close preferencing link between the Liberals and the Liberal Democrats is also shown by the raw preferencing numbers: less than 30% of (ATL) Labor voters put the LDP in their first 6, but almost two thirds of Coalition voters did. (The converse preferencing rates were similar: 31.5% of LDP voters preferenced Labor in the first 6, and 65.7% preferenced the Coalition in the first 6.)
The results here look like a much nicer sorting into left and right than PCA, but apart from the parties in the top-right quadrant – One Nation, Rise Up Australia, ... – it's hard to discern much meaning in the vertical axis (and judging by the other states, I think even that apparent grouping is a coincidence).
The major party HTV cards generate clusters, as for NSW, with Hinch replacing the LDP as the party on both cards. His party was more preferenced by Coalition voters than Labor voters, but not to as lopsided an extent as the LDP in NSW, and the DHJP dot is accordingly out on its lonesome. The second principal component again separates the non-HTV parties somewhat neatly along a spectrum from One Nation to progressive parties, albeit with odd results such as the Socialist Alliance being in the middle of the pack.
Once again, the force-directed graph feels like a nice organisation into left and right, but to me it looks like the vertical axis is just being used as space to plot parties in, rather than having any ideological meaning. Unlike for the NSW plot, the various racist parties are smeared out rather than clustered.
The PCA results are similar to those of NSW and Victoria, but Lazarus and JLN shift the clusters around relative to one another. I don't want to interpret the results with too much confidence (it's easy to come up with reasons to explain just about anything, even wrong results), but the natural explanation is that GLT and JLN votes and preferences are combining a rural vote and a Labor vote in a way that distorts what the principal components would otherwise be.
The relatively high popularity of One Nation may also be a factor in its position here relative to the HTV clusters: about 23% of Labor and LNP voters preferenced PHON in their top 6, more than NSW (17-18%) and Victoria (9-10%).
It's a little surprising to see KAP to the right of the LNP. But in the force-directed graphs, the majors' HTV cards don't dominate the statistics as much as they do in PCA (roughly, since the distance between each pair of parties is weighted equally), and presumably Katter is a lot more popular amongst voters for right-wing parties.
A fairly straightforward set of results, with Hinch not having enough of a presence in WA for DHJP to stand out from the Labor HTV card, and enough Labor voters preferencing the Shooters and the Nationals to drag them a little away from the Liberal cluster.
Another adequate layout left-right.
The correlation between the second principal component and group column on the ballot paper is 0.85, far higher than for any other state (even including the third PC). Looking at the group order, it's easy to see Labor, NXT, and Liberal in that order in the first third of the ballot paper, but a scatterplot shows that most of the correlation is driven by the parties to the right of the Liberals as printed.
The plot of PC1 and PC3 shows the usual pattern of these plots, with the main South Australian variant being NXT, which ends up in the middle of the non-HTV pile. (Note that the LDP is coloured in black here, but it was on a minority of Liberal HTV cards.)
The LDP is more clearly in the Liberal HTV cluster.
The nicest plot of the whole page: three distinct arms coming out from the centre, one for Labor, one for Liberal, one for Lambie.
I don't think I'm imagining that this is closer to the PCA plot than in the other states, just with the order of the parties in the three arms shuffled around.
For the territories I've only highlighted the first three parties on the HTV cards.