Geospatial Analysis of the Social and
Economic Reasons for the Scottish Independence Referendum 2014 Result, at Local
Authority Level.
1 Introduction
On 18 September 2014 a referendum on Scottish independence from the rest of
the UK took place in Scotland. Voters were asked “Should Scotland be an
independent country?”, to be answered “Yes” or “No”. The result was a victory
for the “No” side with 55.3% of the vote, on a very high turnout of 84.6%. Of
the 32 local authorities, only 4 returned a “Yes” majority. Many studies have
looked at the statistics of different groups (such as age, gender and
prosperity) drawn from the whole population, and the conclusion was drawn that
the “Yes” vote was won by an alliance of groups including Protestants, the very
young, women and average earners.
With only 4 local authorities (LAs) voting “Yes”, the map of the results looks
very ‘one-sided’. This is misleading, however, because it mainly reflects
Scotland’s lopsided population distribution: most Scots live in the central
belt between the Forth and the Clyde, with Glasgow alone accounting for 11% of
the total population. These 4 LAs have been described as areas where heavy
industry and mining were the main occupations, and as areas of increased
deprivation [1].
This analysis uses data aggregated by Scottish LA, so the paper investigates
the results at LA level. The data set consists of one row for each of the 32
LAs, with 72 variables aggregated to LA level from the 2011 Census, which
provides a rich source of social and economic statistics for each population.
To normalize the data set, the proportion of “No” votes in each LA was
calculated and used for the analysis, and each census variable was divided by
its appropriate total to produce percentage figures for easy comparison.
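The normalization described above can be sketched in a few lines of Python. This is only an illustration: the field names (yes, no, residents, aged_16_24) and all counts are hypothetical placeholders, not figures from the referendum or the census, and using valid votes as the denominator for the “No” share is an assumption.

```python
# Hypothetical counts for two illustrative areas; NOT real referendum or census data.
rows = [
    {"la": "Area A", "yes": 40_000, "no": 60_000, "residents": 150_000, "aged_16_24": 18_000},
    {"la": "Area B", "yes": 55_000, "no": 45_000, "residents": 120_000, "aged_16_24": 12_000},
]


def to_percent(count, total):
    """Express count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def normalise(row):
    """Turn raw counts for one area into comparable percentage figures."""
    valid_votes = row["yes"] + row["no"]  # assumed denominator for the vote share
    return {
        "la": row["la"],
        "no_pct": to_percent(row["no"], valid_votes),
        # Each census count is divided by the total appropriate to it,
        # here the resident population of the area.
        "aged_16_24_pct": to_percent(row["aged_16_24"], row["residents"]),
    }


normalised = [normalise(r) for r in rows]
```

Working in percentages means a large LA and a small one can be compared directly, which raw counts would not allow.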
The aim of this paper is to find the discriminating attributes of the census
data that appear to explain the spatial differences in the “No” vote, and then
to interpret the results in terms of current political, social and economic
science and their implications for future referendums. The key analytical
research questions are:
What attributes are significant in the geospatial differences in voting behaviour?
Are these attributes consistent with the current geospatial differences in Scotland?
2 Tasks and Approach
The object of this paper is to explain the geospatial differences in voting
preference at LA level. I begin by using choropleth maps with 2-colour
(diverging) schemes to visualize these differences. Because LAs vary in both
geographical area and population, an LA’s area on the map bears no relation to
the number of people who live there, which makes the maps hard to interpret.
To overcome this, I will use population-weighted maps in which each LA’s size
is based on the number of votes cast.
I need to find attributes in the census data set that identify voting
preference. Academics such as Tom Mullen [2] suggest that the main social and
economic reasons for voting preference are affluence, gender and age, but that
these were not uniform. Attention is drawn to the geographical pattern in which
the highest “No” percentages were returned in the areas nearest England and the
lowest in the most deprived areas.
I use Pearson’s correlation coefficient to assess each attribute’s correlation
with voting preference and to identify the significantly correlated attributes.
I then investigate the fit of these attributes using scatter plots and
regression lines, both because Pearson’s coefficient is known to be sensitive
to outliers and to assess the assumptions of normality of residuals and
independence. This should lead to a short list of attributes to model
geospatially.
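As a rough sketch of this screening step, the Python below computes Pearson’s r for each attribute against the “No” share and keeps the attributes whose t statistic exceeds a chosen critical value. The attribute names and values are hypothetical, and testing significance with a t statistic on n - 2 degrees of freedom is an assumption about how “significantly correlated” was judged.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 long")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 d.f.)."""
    denom = 1.0 - r * r
    if denom <= 0.0:  # perfect correlation
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)


def screen(attributes, response, t_crit):
    """Return (name, r, t) for attributes with |t| > t_crit, strongest first."""
    n = len(response)
    kept = []
    for name, values in attributes.items():
        r = pearson_r(values, response)
        t = t_statistic(r, n)
        if abs(t) > t_crit:
            kept.append((name, r, t))
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical values for five illustrative areas; NOT real data.
no_pct = [2.0, 1.0, 4.0, 3.0, 5.0]
attributes = {
    "attr_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "attr_b": [1.0, 2.0, 4.0, 3.0, 5.0],
    "attr_c": [5.0, 4.0, 2.0, 3.0, 1.0],
}
# 3.182 is the two-sided 5% critical value of Student's t with 3 degrees of freedom.
shortlist = screen(attributes, no_pct, t_crit=3.182)
```

With all 32 LAs there would be 30 degrees of freedom, for which the two-sided 5% critical value is about 2.042. A shortlisted attribute should still be checked on a scatter plot, since a single outlying LA can inflate or mask r.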
Regression has long been used in political, social and economic sciences [5,6]
to analyse population statistics using data that has been aggregated to some
degree. As a parametric technique, regression relies on the chosen explanatory
variables to account for the response variable (vote share in this case), so I
use the exploratory analysis above to select the most appropriate attributes to
include in the modelling. I will use multivariate regression techniques to
explore potential models that fit the geographic variations. I will use the
residual sum of squares (RSS) to assess numerically how well each model fits
the data, and choropleth maps of the model residuals, again with a 2-colour
(diverging) scheme, to assess the geographic variations visually.
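To make the two assessments concrete, here is a minimal sketch with a single predictor: it fits a least-squares line, computes the RSS, and assigns each area to one side of a diverging scheme according to the sign of its residual. The data, the function names and the class labels are all hypothetical, and a one-predictor fit stands in for the multivariate models only to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(xs, ys, intercept, slope):
    """Observed minus fitted value for every area."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def rss(res):
    """Residual sum of squares: smaller means a closer fit."""
    return sum(e * e for e in res)


def diverging_class(residual, tolerance=1e-9):
    """Side of the diverging colour scheme an area would fall on."""
    if residual > tolerance:
        return "under-predicted"  # observed share above the fitted share
    if residual < -tolerance:
        return "over-predicted"   # observed share below the fitted share
    return "on the line"


# Hypothetical percentages for four illustrative areas; NOT real data.
attr = [10.0, 20.0, 30.0, 40.0]
no_pct = [50.0, 54.0, 55.0, 59.0]

intercept, slope = fit_line(attr, no_pct)
res = residuals(attr, no_pct, intercept, slope)
model_rss = rss(res)
classes = [diverging_class(e) for e in res]
```

If the over-predicted and under-predicted areas cluster on the map rather than being scattered, the model is missing something that varies geographically.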
With many attributes, a more automated way of approaching this investigation
is stepwise regression, starting either from the full model and reducing the
number of attributes or from the null model and increasing the number of
attributes. Such an automated regression analysis can be done using a variety
of methods [7]; however, I will proceed with an exhaustive search using the
leaps package. This package performs an exhaustive search for the best subsets
of a given set of potential regressors, using a branch-and-bound algorithm.
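The idea behind a best-subsets search can be illustrated with a brute-force sketch in Python. This is not the leaps package and not its branch-and-bound algorithm: it simply fits every subset of each size and keeps the one with the lowest RSS, which is only feasible for a handful of candidate attributes. All names and data are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_rss(columns, y):
    """Fit y on an intercept plus the given predictor columns; return the RSS."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)  # normal equations; fine for a small sketch
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def best_subsets(predictors, y, max_size):
    """Return {size: (rss, names)} holding the lowest-RSS subset of each size."""
    names = sorted(predictors)
    best = {}
    for size in range(1, max_size + 1):
        scored = [
            (ols_rss([predictors[name] for name in subset], y), subset)
            for subset in combinations(names, size)
        ]
        best[size] = min(scored)
    return best


# Hypothetical data for six illustrative areas; y is built as 2 + 3*a - b.
predictors = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "c": [5.0, 3.0, 6.0, 2.0, 4.0, 1.0],
}
y = [3.0, 7.0, 7.0, 11.0, 11.0, 15.0]
best = best_subsets(predictors, y, max_size=2)
```

Because RSS can only fall as attributes are added, it is useful for comparing subsets of the same size; choosing between sizes needs a criterion that penalises model complexity.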
The goal of this investigation is to explain the geospatial differences using
social variables from the 2011 Census and Scottish Index of Multiple
Deprivation (SIMD) attributes at LA level. It seems reasonable to evaluate the
validity of the global model at each LA, as I am concerned with broad societal
issues rather than very local issues or individuals. Evaluation is achieved by
analysing the global model residuals geospatially using choropleth maps and
summary statistics (p value and RSS) [4].