CampaignLab/2026ElectionGenderAnalysis

2026 England Local Elections — Gender Analysis

https://campaignlab.github.io/2026ElectionGenderAnalysis/story.html

https://campaignlab.github.io/2026ElectionGenderAnalysis

An interactive one-page dashboard analysing the gender breakdown of candidates and elected councillors in the May 2026 England local elections.

What it shows

  • Summary statistics — overall % female candidates vs elected; incumbent retention rate
  • Seat changes panel — always-visible panel showing incumbents who stood, re-elected, defeated, and new councillors elected; updates on council or region selection
  • Choropleth map — gender balance by local authority, with a bivariate colour scheme that distinguishes high-confidence areas (strong hue) from areas where many candidates' genders are unresolved (washed-out/grey)
  • Party breakdown — stacked 100% bar charts for all parties with ≥ 30 candidates, candidates and elected side-by-side. Click any bar to open a party detail panel showing:
    • 6-tile stat grid: total candidates, female candidates, seat-slots contested, elected total, female elected (+ win rate), male elected (+ win rate)
    • Win-rate context sentences: female win rate vs other parties; vs national average; female vs male win rate within the same party
    • When a council is selected on the map: a scrollable table of that party's candidates in the council (name, ward, votes, result, gender, confidence)
  • Regional breakdown — same charts by ONS NUTS1 region
  • Council table — sortable, filterable table with candidate counts, % female, elected counts, and turnout; unknown-gender warnings surface where data quality is low
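The party win-rate figures described above could be computed along the following lines. This is a minimal sketch, not the code in build_data.py: the record keys (party, gender, elected) are hypothetical, and "win rate" is assumed to mean elected candidates divided by candidates standing, within each gender.

```python
from collections import defaultdict

MIN_CANDIDATES = 30  # threshold stated above for inclusion in the party charts


def party_win_rates(candidates):
    """Return per-party female/male win rates for parties with >= MIN_CANDIDATES.

    `candidates` is an iterable of dicts with hypothetical keys:
    'party' (str), 'gender' ('female', 'male', or anything else) and
    'elected' (bool).
    """
    stood = defaultdict(lambda: defaultdict(int))
    won = defaultdict(lambda: defaultdict(int))
    for c in candidates:
        stood[c["party"]][c["gender"]] += 1
        if c["elected"]:
            won[c["party"]][c["gender"]] += 1

    out = {}
    for party, by_gender in stood.items():
        total = sum(by_gender.values())
        if total < MIN_CANDIDATES:
            continue  # too few candidates to chart
        row = {
            "candidates": total,
            "pct_female": round(100 * by_gender["female"] / total, 1),
        }
        for gender in ("female", "male"):
            n = by_gender[gender]
            # Assumed definition: elected / stood, as a percentage.
            row[f"{gender}_win_rate"] = (
                round(100 * won[party][gender] / n, 1) if n else None
            )
        out[party] = row
    return out
```

Computing these once at build time keeps the browser code to simple lookups when a bar is clicked.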

Gender prediction method

~71% of candidates in the source data had no recorded gender. Genders were predicted using a four-tier cascade:

Method | Coverage | Notes
--- | --- | ---
Existing (from source data) | ~29% | Taken as ground truth; normalised to male/female/nonbinary
gender_guesser | ~53% | Open-source Python library; uses a compiled international names database; great_britain locale; mostly_* results accepted at low confidence
ONS baby names | ~2% | Falls back to the ONS historical top-100 baby names dataset (1904–2024) when gender_guesser returns andy/unknown; birth year is used to select the closest decade, so time-varying names (e.g. "Ashley") are handled appropriately
Claude AI | ~11% | Names still unresolved after the above steps were sent to Claude (Anthropic) for prediction; stored with claude method tag and a confidence level
Unknown | <0.3% | Primarily non-Western names not covered by any of the above sources
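The cascade in the table can be sketched roughly as follows. This is an illustration under stated assumptions rather than the logic in assign_genders.py: guess_name and ask_llm are hypothetical callables standing in for gender_guesser and the Claude step, the ons_table shape is invented for the example, and every confidence label other than "low" for mostly_* results is an assumption.

```python
# Hypothetical normalisation map for genders already present in the source data.
NORMALISE = {
    "m": "male", "male": "male",
    "f": "female", "female": "female",
    "nonbinary": "nonbinary", "non-binary": "nonbinary",
}


def closest_decade(birth_year, decades):
    """Pick the decade nearest to birth_year (first one wins on a tie)."""
    return min(decades, key=lambda d: abs(d - birth_year))


def ons_lookup(first_name, birth_year, ons_table):
    """ons_table is assumed to look like {name: {decade: 'male' | 'female'}}."""
    by_decade = ons_table.get(first_name.lower())
    if not by_decade or birth_year is None:
        return None
    return by_decade[closest_decade(birth_year, by_decade)]


def predict_gender(candidate, guess_name, ons_table, ask_llm):
    """Return (gender, method, confidence), trying each tier in order."""
    # Tier 1: gender already recorded in the source data.
    existing = (candidate.get("gender") or "").strip().lower()
    if existing in NORMALISE:
        return NORMALISE[existing], "existing", "high"

    first = candidate["first_name"]

    # Tier 2: name database. guess_name is expected to return one of
    # male / female / mostly_male / mostly_female / andy / unknown.
    guess = guess_name(first)
    if guess in ("male", "female"):
        return guess, "gender_guesser", "high"
    if guess.startswith("mostly_"):
        return guess[len("mostly_"):], "gender_guesser", "low"

    # Tier 3: ONS baby names, using the decade closest to the birth year.
    ons = ons_lookup(first, candidate.get("birth_year"), ons_table)
    if ons:
        return ons, "ons", "medium"

    # Tier 4: language model; assumed to return (gender, confidence) or None.
    answer = ask_llm(first)
    if answer:
        return answer[0], "claude", answer[1]

    return "unknown", "unknown", "none"
```

Passing the lookups in as arguments keeps the ordering of the tiers testable without the third-party library or any network access.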

Predictions are stored in genders.csv (keyed by person_id + surname) and are not written back to the source data.
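Keeping predictions in a separate file means they have to be joined to the source rows at read time. A minimal sketch of that join, assuming genders.csv has columns named person_id, surname, gender, method and confidence (only the person_id + surname key is stated above; the other column names and the case-insensitive surname comparison are assumptions):

```python
import csv
import io


def load_predictions(fh):
    """Map (person_id, lower-cased surname) -> prediction row."""
    return {
        (row["person_id"], row["surname"].strip().lower()): row
        for row in csv.DictReader(fh)
    }


def with_gender(candidate, predictions):
    """Return a copy of the candidate with a gender attached.

    The source record is copied, never modified, mirroring the rule that
    predictions are not written back to the source data.
    """
    key = (candidate["person_id"], candidate["surname"].strip().lower())
    pred = predictions.get(key)
    merged = dict(candidate)
    merged["gender"] = pred["gender"] if pred else "unknown"
    merged["gender_method"] = pred["method"] if pred else "unknown"
    return merged


# Usage with an in-memory stand-in for genders.csv:
sample = io.StringIO(
    "person_id,surname,gender,method,confidence\n"
    "101,Smith,female,gender_guesser,high\n"
)
predictions = load_predictions(sample)
print(with_gender({"person_id": "101", "surname": "Smith"}, predictions))
```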

Incumbency method

Incumbent status (whether a 2026 candidate was a sitting councillor in the same ward and council) is determined by matching against 2025 sitting councillor data from opencouncildata.co.uk. Matching requires:

  1. Council name match — prefix/suffix normalisation (strips "London Borough of", "Borough Council" etc.).
  2. Ward name match — exact, then fuzzy (difflib.get_close_matches, cutoff 0.6) with first-word constraint to prevent cross-area false positives.
  3. Full-name fuzzy match: difflib.SequenceMatcher ratio ≥ 0.80, to handle middle names and minor spelling differences.
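The three matching steps can be sketched with the standard library as follows. This is an illustration, not the code in identify_incumbents.py: the list of stripped prefixes and suffixes goes beyond the two examples given above, and the first-word constraint is interpreted here as "only compare wards whose first word is identical", both of which are assumptions.

```python
import difflib
import re

# Prefixes/suffixes to strip. Only "London Borough of" and "Borough Council"
# are named in the text; the rest are illustrative assumptions.
_COUNCIL_AFFIXES = re.compile(
    r"^(london borough of|royal borough of|city of)\s+"
    r"|\s+(metropolitan borough council|borough council|city council"
    r"|district council|county council|council)$"
)


def normalise_council(name):
    """Lower-case a council name and strip common prefixes and suffixes."""
    return _COUNCIL_AFFIXES.sub("", name.strip().lower()).strip()


def match_ward(ward, known_wards, cutoff=0.6):
    """Exact match first, then a fuzzy match restricted to the same first word."""
    lookup = {w.strip().lower(): w for w in known_wards}
    key = ward.strip().lower()
    if key in lookup:
        return lookup[key]
    first_word = key.split()[:1]
    if not first_word:
        return None
    # First-word constraint: avoids matching similarly named wards elsewhere.
    pool = [w for w in lookup if w.split()[:1] == first_word]
    close = difflib.get_close_matches(key, pool, n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None


def names_match(a, b, threshold=0.80):
    """True when two full names are similar enough to be the same person."""
    ratio = difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold
```

With these helpers, "Abbey Rd" resolves to "Abbey Road", while "Newton North" is not matched to "Newtown North" because the first words differ, even though the strings are otherwise very similar.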

The inc field is omitted from ward JSON when False (not an incumbent) to keep file sizes small. Output stats: inc_total, inc_elected, inc_defeated, new_elected, inc_retention_pct.
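The output stats and the omit-when-False rule might look like this in outline. It is a sketch under assumptions: the candidate keys inc and elected are taken as booleans, and inc_retention_pct is assumed to be re-elected incumbents as a percentage of incumbents who stood, which the text above does not define explicitly.

```python
def incumbency_stats(candidates):
    """Summarise incumbency outcomes for a list of candidate dicts."""
    incumbents = [c for c in candidates if c.get("inc")]
    inc_total = len(incumbents)
    inc_elected = sum(1 for c in incumbents if c["elected"])
    new_elected = sum(1 for c in candidates if c["elected"] and not c.get("inc"))
    return {
        "inc_total": inc_total,
        "inc_elected": inc_elected,
        "inc_defeated": inc_total - inc_elected,  # stood and lost
        "new_elected": new_elected,
        # Assumed definition: re-elected / stood, as a percentage.
        "inc_retention_pct": (
            round(100 * inc_elected / inc_total, 1) if inc_total else None
        ),
    }


def ward_record(candidate):
    """Copy a candidate for the ward JSON, writing 'inc' only when True."""
    record = {k: v for k, v in candidate.items() if k != "inc"}
    if candidate.get("inc"):
        record["inc"] = True
    return record
```

A consumer of the ward files then treats a missing inc key as "not an incumbent", for example with candidate.get("inc", False) in Python.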

Limitation: incumbents who chose not to re-stand cannot be identified — "Defeated" means stood and lost.

Repository layout

dc_data.csv                  Source data from Democracy Club
genders.csv                  Predicted genders (generated)
historicalnames2024.xlsx     ONS baby names historical dataset (local, not committed to git)
LAD_MAY_2025_UK_BUC_*.geojson  ONS LAD boundaries (local, not committed to git)

scripts/
  parse_ons_data.py          Parse historicalnames2024.xlsx → scripts/data/ons_lookup.json
  assign_genders.py          Produce genders.csv from dc_data.csv + ons_lookup.json
  identify_incumbents.py     Match 2026 candidates against 2025 sitting councillors
                             → scripts/data/incumbents.json + scripts/data/ward_fuzzy_log.json
  build_data.py              Aggregate data → docs/data/councils.json + wards/*.json

docs/                        GitHub Pages root
  index.html
  style.css
  app.js
  data/
    councils.json            Pre-aggregated data (generated by build_data.py);
                             includes per-party win rates, seat counts, and
                             comparison stats pre-computed at build time
    LAD_boundaries.geojson   Boundary file (generated by build_data.py)
    wards/
      {slug}.json            One file per council (~156 files); ward-level
                             candidate lists loaded on demand when a council
                             is selected on the map

requirements.txt             Python dependencies

Running locally

# 1. Install Python dependencies
py -m pip install -r requirements.txt

# 2. Build the ONS name lookup (requires historicalnames2024.xlsx in project root)
py scripts/parse_ons_data.py

# 3. Predict genders
py scripts/assign_genders.py

# 4. Build the frontend data files
py scripts/build_data.py

# 5. Serve locally
py -m http.server 8080 --directory docs
# → open http://localhost:8080

Sources & licences

Incumbency data

  • 2025 sitting councillor data from opencouncildata.co.uk.

Election data

  • Democracy Club candidate and results data, May 2026. https://democracyclub.org.uk — used under the terms of the Democracy Club open data licence.

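Gender prediction

  • gender_guesser (open-source Python library).
  • ONS historical baby names dataset (1904–2024).
  • Claude (Anthropic).
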
Boundaries

  • ONS local authority district (LAD) boundaries, May 2025.


Built with Leaflet and Chart.js.
