| Title: | Access UK Housing Data from Land Registry, EPC, and Planning |
|---|---|
| Description: | Fetch UK housing data from official sources. Access the UK House Price Index and Price Paid Data from 'HM Land Registry' <https://landregistry.data.gov.uk/>, domestic and non-domestic Energy Performance Certificates from the 'MHCLG' Open Data service <https://epc.opendatacommunities.org/>, and planning application, brownfield land, and local plan data from 'planning.data.gov.uk' <https://www.planning.data.gov.uk/>. The House Price Index covers more than 441 areas (countries, regions, counties, and local authorities) from 1995 to the present. Functions accept flexible filters (postcode, local authority, property type, date range) and return tidy data frames. Downloaded data is cached locally for subsequent calls. |
| Authors: | Charles Coverdale [aut, cre] |
| Maintainer: | Charles Coverdale <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-03 06:42:33 UTC |
| Source: | https://github.com/charlescoverdale/ukhousing |
Returns information about the local cache: where it lives, how many
files it contains, and how much disk space they take. Useful when
debugging stale results or deciding whether to call ukh_clear_cache().
ukh_cache_info()
A list with elements dir, n_files, size_bytes,
size_human, and files (a data frame).
Other configuration:
ukh_clear_cache(),
ukh_epc_set_key()
op <- options(ukhousing.cache_dir = tempdir())
ukh_cache_info()
options(op)
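The shape of the list described above can be approximated in base R. This is only a sketch, not the package's implementation: the helper name `sketch_cache_info` is hypothetical, and it assumes the cache is simply a directory of files.

```r
# Hypothetical helper (not part of the package): summarise a directory
# using the same element names that ukh_cache_info() documents.
sketch_cache_info <- function(dir) {
  paths <- list.files(dir, full.names = TRUE, recursive = TRUE)
  sizes <- file.size(paths)
  total <- sum(sizes)
  list(
    dir = dir,
    n_files = length(paths),
    size_bytes = total,
    # Reuse base R's object_size formatter for a human-readable size
    size_human = format(structure(total, class = "object_size"), units = "auto"),
    files = data.frame(file = basename(paths), size_bytes = sizes,
                       stringsAsFactors = FALSE)
  )
}

# Demonstrate on a throwaway directory containing one small file
demo_dir <- tempfile("ukh-cache-sketch")
dir.create(demo_dir)
writeLines("price,date", file.path(demo_dir, "example.csv"))
info <- sketch_cache_info(demo_dir)
str(info)
```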
Deletes all locally cached UK housing data files. The next call to any data function will re-download fresh data.
ukh_clear_cache()
Invisibly returns NULL.
Other configuration:
ukh_cache_info(),
ukh_epc_set_key()
op <- options(ukhousing.cache_dir = tempdir())
ukh_clear_cache()
options(op)
Downloads and extracts the MHCLG bulk ZIP containing all domestic
certificates and recommendations for one local authority. The ZIP
is typically 5 to 50 MB and contains certificates.csv,
recommendations.csv, and a licence file.
ukh_epc_bulk(
  la,
  refresh = FALSE,
  type = c("domestic", "non-domestic", "display")
)
la |
Character. Local authority GSS code (e.g. |
refresh |
Logical. Re-download even if cached? Default |
type |
Character. Register: |
A list with elements certificates and recommendations,
each giving the path to the extracted CSV.
Other energy performance certificates:
ukh_epc_certificate(),
ukh_epc_recommendations_summary(),
ukh_epc_search(),
ukh_epc_summary()
## Not run:
paths <- ukh_epc_bulk("E09000033")
certs <- read.csv(paths$certificates)
## End(Not run)
Returns all available fields and any improvement recommendations for one Energy Performance Certificate identified by its LMK key.
ukh_epc_certificate(lmk_key, type = c("domestic", "non-domestic", "display"))
lmk_key |
Character. The certificate's LMK key (found in the
|
type |
Character. Register: |
A list with two elements:
certificate: A one-row data frame with all certificate fields.
recommendations: A data frame of improvement recommendations (may be empty).
Other energy performance certificates:
ukh_epc_bulk(),
ukh_epc_recommendations_summary(),
ukh_epc_search(),
ukh_epc_summary()
## Not run:
cert <- ukh_epc_certificate("0000-0000-0000-0000-0000")
cert$certificate
cert$recommendations
## End(Not run)
Aggregates the improvement recommendations across certificates in a local authority, returning the frequency of each recommendation and the mean estimated cost and savings where available.
ukh_epc_recommendations_summary(
  la,
  type = c("domestic", "non-domestic", "display"),
  refresh = FALSE
)
la |
Character. Local authority GSS code. |
type |
Character. |
refresh |
Logical. Re-download bulk ZIP? Default |
This function uses the bulk per-LA ZIP download rather than the paginated API, because the bulk download is much faster for this aggregation. Requires EPC API credentials.
A data frame with columns improvement_id,
improvement_summary, count, mean_indicative_cost (where
numeric costs are reported), ordered by count descending.
Other energy performance certificates:
ukh_epc_bulk(),
ukh_epc_certificate(),
ukh_epc_search(),
ukh_epc_summary()
## Not run:
recs <- ukh_epc_recommendations_summary("E09000033")
head(recs)
## End(Not run)
Queries the MHCLG EPC Open Data service for domestic certificates
matching the given filters. Results are paginated automatically
using search-after tokens rather than the API's from offset
parameter, which caps at 10,000 records. (That offset parameter is
unrelated to the from date argument of this function.)
ukh_epc_search(
  postcode = NULL,
  la = NULL,
  property_type = NULL,
  rating = NULL,
  built_form = NULL,
  from = NULL,
  to = NULL,
  size = 1000L,
  max_records = 10000L,
  type = c("domestic", "non-domestic", "display")
)
postcode |
Optional character. Postcode or partial postcode. |
la |
Optional character. Local authority GSS code (e.g.
|
property_type |
Optional character. |
rating |
Optional character. Current energy rating ( |
built_form |
Optional character. |
from, to
|
Optional. Lodgement date range (YYYY-MM-DD). |
size |
Integer. Results per page, max 5000. Default 1000. |
max_records |
Integer. Maximum total records to fetch across pages. Default 10000. Set higher for bulk analysis. |
type |
Character. Register to query: |
Registration at https://epc.opendatacommunities.org/ is required and free.
A data frame of certificates. Columns include lmk_key,
address, postcode, uprn, local_authority, constituency,
property_type, built_form, inspection_date, lodgement_date,
current_energy_rating, current_energy_efficiency,
potential_energy_rating, potential_energy_efficiency,
total_floor_area, co2_emissions_current, and more.
Other energy performance certificates:
ukh_epc_bulk(),
ukh_epc_certificate(),
ukh_epc_recommendations_summary(),
ukh_epc_summary()
## Not run:
# Requires EPC API credentials
certs <- ukh_epc_search(postcode = "SW1A 1AA")
head(certs)

# All E-rated flats in Westminster lodged since 2020
wm <- ukh_epc_search(
  la = "E09000033",
  property_type = "Flat",
  rating = "E",
  from = "2020-01-01"
)
## End(Not run)
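The search-after pagination that ukh_epc_search() describes can be illustrated with a mock data source. Everything below is an assumption made for illustration: `fetch_page()` and `paginate()` are hypothetical names, the token is a simple row position, and no request is sent to the real service.

```r
# Mock data standing in for the remote service: 2,500 fake records
all_records <- data.frame(lmk_key = sprintf("key-%04d", 1:2500),
                          stringsAsFactors = FALSE)

# Hypothetical page fetcher. `search_after` is the token returned with the
# previous page (NULL for the first page); here it is just a row position.
fetch_page <- function(size, search_after = NULL) {
  start <- if (is.null(search_after)) 1L else search_after + 1L
  if (start > nrow(all_records)) {
    return(list(rows = all_records[0, , drop = FALSE], next_token = NULL))
  }
  end <- min(start + size - 1L, nrow(all_records))
  list(
    rows = all_records[start:end, , drop = FALSE],
    next_token = if (end < nrow(all_records)) end else NULL
  )
}

# Keep requesting pages until the token runs out or max_records is reached
paginate <- function(size = 1000L, max_records = 10000L) {
  pages <- list()
  token <- NULL
  fetched <- 0L
  repeat {
    page <- fetch_page(min(size, max_records - fetched), token)
    pages[[length(pages) + 1L]] <- page$rows
    fetched <- fetched + nrow(page$rows)
    token <- page$next_token
    if (is.null(token) || fetched >= max_records) break
  }
  do.call(rbind, pages)
}

result <- paginate(size = 1000L, max_records = 10000L)
nrow(result)
```

Because each request continues from the last token rather than from a numeric offset, the loop is not bound by an offset ceiling; only `max_records` limits the total.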
Stores the email and API key used to authenticate requests to the
MHCLG Energy Performance Certificate Open Data service. Credentials
persist for the current R session. Alternatively, set the
EPC_EMAIL and EPC_API_KEY environment variables in your
.Renviron file.
ukh_epc_set_key(email, key)
email |
Character. The email you registered with. |
key |
Character. The API key. |
Register for a free API key at https://epc.opendatacommunities.org/.
Invisible NULL.
Other configuration:
ukh_cache_info(),
ukh_clear_cache()
## Not run:
ukh_epc_set_key("[email protected]", "your_api_key_here")
## End(Not run)
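For the environment-variable route mentioned above, a quick check that both variables are visible to the current session might look like the following. It is a sketch that uses only base R and the variable names given in the description; how the package itself reads them is not shown here.

```r
# Sys.getenv() returns "" for variables that are not set
email <- Sys.getenv("EPC_EMAIL")
key <- Sys.getenv("EPC_API_KEY")

has_credentials <- nzchar(email) && nzchar(key)
if (!has_credentials) {
  message("EPC_EMAIL and EPC_API_KEY are not both set; add them to ",
          ".Renviron or call ukh_epc_set_key().")
}
```

Values added to .Renviron only take effect after R is restarted.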
Summarises the distribution of current energy ratings (A-G) for domestic certificates in a local authority. Useful for area-level comparisons of housing stock efficiency.
ukh_epc_summary(
  la,
  from = NULL,
  to = NULL,
  type = c("domestic", "non-domestic", "display")
)
la |
Character. Local authority GSS code. |
from, to
|
Optional. Lodgement date range. |
type |
Character. Register: |
A data frame with one row per rating (A-G) and columns
rating, count, percentage, mean_floor_area,
mean_co2_emissions.
Other energy performance certificates:
ukh_epc_bulk(),
ukh_epc_certificate(),
ukh_epc_recommendations_summary(),
ukh_epc_search()
## Not run:
ukh_epc_summary(la = "E09000033")
## End(Not run)
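A summary of this shape can be reproduced from a data frame of certificates with base R. The sketch below uses made-up values and assumes the column names listed for ukh_epc_search(); it is not the package's own code.

```r
# Made-up certificates for illustration only
certs <- data.frame(
  current_energy_rating = c("C", "D", "C", "E", "B", "D", "C", "G"),
  total_floor_area = c(70, 85, 62, 110, 55, 90, 74, 130),
  co2_emissions_current = c(2.1, 3.4, 1.9, 5.2, 1.2, 3.8, 2.3, 7.5)
)

# Fix the factor levels so ratings with no certificates still get a row
rating <- factor(certs$current_energy_rating, levels = LETTERS[1:7])
counts <- as.vector(table(rating))

rating_summary <- data.frame(
  rating = levels(rating),
  count = counts,
  percentage = round(100 * counts / sum(counts), 1),
  # tapply() yields NA for ratings that have no certificates
  mean_floor_area = as.vector(tapply(certs$total_floor_area, rating, mean)),
  mean_co2_emissions = as.vector(
    tapply(certs$co2_emissions_current, rating, mean)
  )
)
rating_summary
```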
Fetches monthly UK House Price Index data for a region from the HM Land Registry linked data service. Coverage: 441+ areas (countries, regions, counties, local authorities) from January 1995 (England and Wales), January 2004 (Scotland), January 2005 (Northern Ireland).
ukh_hpi(region, from = NULL, to = NULL, refresh = FALSE)
region |
Character. Region slug, GSS code, or common name (e.g.
|
from, to
|
Optional character or Date. Start and end dates in ISO format (YYYY-MM-DD). Default returns the full available history. |
refresh |
Logical. Re-download even if cached? Default |
The returned data frame includes the headline index, average price, monthly and annual percentage change, sales volume, and breakdowns by property type (detached, semi-detached, terraced, flat) and by buyer type (cash, mortgage, first-time buyer, former owner occupier, new build, existing property).
Sales volumes lag the headline index by approximately five months because the Land Registry needs the full transaction set to settle.
A data frame with one row per month. Columns include date,
region, hpi, avg_price, pct_change_monthly,
pct_change_annual, sales_volume, plus property-type and
buyer-type average prices.
Other house price index:
ukh_hpi_compare(),
ukh_transactions()
op <- options(ukhousing.cache_dir = tempdir())
# All UK average house prices
uk <- ukh_hpi("united-kingdom")
head(uk)

# London, last 10 years
london <- ukh_hpi("london", from = "2016-01-01")
tail(london)
options(op)
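The optional from and to arguments behave like an inclusive date filter whose default is the full history. A minimal base R sketch of that behaviour, using a hypothetical helper `filter_dates()` and made-up index values, could look like this:

```r
# Hypothetical helper: keep rows whose date lies within [from, to];
# a NULL bound means "no limit on that side".
filter_dates <- function(df, from = NULL, to = NULL) {
  keep <- rep(TRUE, nrow(df))
  if (!is.null(from)) keep <- keep & df$date >= as.Date(from)
  if (!is.null(to)) keep <- keep & df$date <= as.Date(to)
  df[keep, , drop = FALSE]
}

# Two years of made-up monthly index values
monthly <- data.frame(
  date = seq(as.Date("2015-01-01"), by = "month", length.out = 24),
  hpi = seq(100, by = 0.5, length.out = 24)
)

nrow(filter_dates(monthly))
nrow(filter_dates(monthly, from = "2016-01-01"))
nrow(filter_dates(monthly, from = "2016-01-01", to = "2016-06-01"))
```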
Fetches the UK House Price Index for several regions and returns a wide data frame with one measure (e.g. average price or annual % change) as a column per region. Useful for regional comparison charts.
ukh_hpi_compare(
  regions,
  measure = "avg_price",
  from = NULL,
  to = NULL,
  refresh = FALSE
)
regions |
Character vector of region slugs, GSS codes, or names. |
measure |
Character. Which measure to return. One of
|
from, to
|
Optional character or Date. Date range. |
refresh |
Logical. Re-download even if cached? Default |
A wide data frame with a date column and one column per
region.
Other house price index:
ukh_hpi(),
ukh_transactions()
op <- options(ukhousing.cache_dir = tempdir())
prices <- ukh_hpi_compare(
  c("london", "manchester", "newcastle-upon-tyne"),
  measure = "avg_price",
  from = "2015-01-01"
)
head(prices)
options(op)
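The wide layout returned by ukh_hpi_compare() (a date column plus one column per region) is the result of a long-to-wide reshape. The sketch below shows that reshape with base R on made-up prices; it is an illustration under those assumptions, not the package's implementation.

```r
# Long format: one row per region and month (made-up prices)
long <- data.frame(
  date = rep(as.Date(c("2024-01-01", "2024-02-01")), times = 2),
  region = rep(c("london", "manchester"), each = 2),
  avg_price = c(500000, 502000, 240000, 241500)
)

# Spread the chosen measure into one column per region
wide <- reshape(long, idvar = "date", timevar = "region",
                v.names = "avg_price", direction = "wide")

# reshape() prefixes new columns with the measure name; strip it
names(wide) <- sub("^avg_price\\.", "", names(wide))
rownames(wide) <- NULL
wide
```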
Fetches the UK's official private-rental price index from the ONS Beta API. The dataset replaced the Index of Private Housing Rental Prices (IPHRP) with the Price Index of Private Rents (PIPR) in January 2024, but the same dataset slug covers both.
ukh_pipr(refresh = FALSE)
refresh |
Logical. Re-download even if cached? Default |
Coverage: UK aggregate plus countries (England, Scotland, Wales, Northern Ireland) and English regions, monthly from January 2015.
A data frame with columns date, region, and index.
op <- options(ukhousing.cache_dir = tempdir())
rents <- ukh_pipr()
head(rents)
options(op)
Fetches records from the Digital Land planning data platform, which hosts over 100 datasets on planning applications, brownfield land, local plans, conservation areas, listed buildings, flood risk zones, and more.
ukh_planning(
  dataset,
  la = NULL,
  limit = 1000L,
  format = c("data.frame", "raw", "sf")
)
dataset |
Character. Dataset slug. Common values:
|
la |
Optional character. Local authority name (matched
case-insensitively against the |
limit |
Integer. Maximum records to return. Default |
format |
Character. Response format. |
A data frame (or list if format = "raw").
Other planning:
ukh_planning_datasets()
op <- options(ukhousing.cache_dir = tempdir())
bf <- ukh_planning("brownfield-land", limit = 100)
head(bf)
options(op)
Returns a data frame of datasets hosted by planning.data.gov.uk, with their slugs, names, typologies, and record counts.
ukh_planning_datasets()
A data frame with slug, name, typology, and
record_count.
Other planning:
ukh_planning()
op <- options(ukhousing.cache_dir = tempdir())
ds <- ukh_planning_datasets()
head(ds)
options(op)
Fetches individual property transaction records from the HM Land Registry Price Paid Data, filtered by local authority, postcode, property type, and other criteria.
ukh_ppd(
  year = as.integer(format(Sys.Date(), "%Y")),
  la = NULL,
  postcode = NULL,
  property_type = NULL,
  new_build = NULL,
  tenure = NULL,
  from = NULL,
  to = NULL,
  refresh = FALSE
)
year |
Integer. Year of transactions to fetch. Defaults to the current calendar year. |
la |
Optional character. Local authority name (matched
case-insensitively against the |
postcode |
Optional character. Postcode or postcode prefix. |
property_type |
Optional character. One of |
new_build |
Optional logical. If |
tenure |
Optional character. |
from, to
|
Optional character or Date. Date range within the year (YYYY-MM-DD). |
refresh |
Logical. Re-download the yearly file even if cached?
Default |
The full yearly CSV is ~150 MB (about 900,000 transactions). This
function downloads the yearly file, caches it, and then filters in
memory. Memory footprint during the call is roughly 1 GB because R
data frames are considerably larger than the source CSV; on
memory-constrained machines, prefer ukh_ppd_address() for
postcode lookups or ukh_ppd_summary() for aggregated stats.
Subsequent queries against the same year use the cache. For
multi-year queries, call the function once per year or use
ukh_ppd_years().
A data frame of individual transactions with columns
transaction_id, price, date, postcode, property_type,
new_build, tenure, paon, saon, street, locality,
town, district, county, category, record_status.
Other price paid data:
ukh_ppd_address(),
ukh_ppd_bulk(),
ukh_ppd_summary(),
ukh_ppd_transaction(),
ukh_ppd_years()
op <- options(ukhousing.cache_dir = tempdir())
# Westminster flats sold in 2024
wm <- ukh_ppd(2024, la = "Westminster", property_type = "flat")
head(wm)
options(op)
Uses the Land Registry linked data address lookup to find all transactions at a given postcode. Faster than downloading the yearly bulk file when you only want a single postcode.
ukh_ppd_address(postcode)
postcode |
Character. Full UK postcode (e.g. |
A data frame of transactions at addresses matching the postcode.
Other price paid data:
ukh_ppd(),
ukh_ppd_bulk(),
ukh_ppd_summary(),
ukh_ppd_transaction(),
ukh_ppd_years()
op <- options(ukhousing.cache_dir = tempdir())
tx <- ukh_ppd_address("SW1A 1AA")
head(tx)
options(op)
Downloads and caches a yearly (or the complete) Price Paid Data CSV from HM Land Registry. Returns the path to the cached file. Yearly files are typically 100-200 MB; the complete file (1995-present) is approximately 5.3 GB.
ukh_ppd_bulk(
  year = as.integer(format(Sys.Date(), "%Y")),
  full = FALSE,
  refresh = FALSE
)
year |
Integer. Year to download. Ignored when |
full |
Logical. If |
refresh |
Logical. Re-download even if cached? Default |
Character. The path to the cached CSV file.
Other price paid data:
ukh_ppd(),
ukh_ppd_address(),
ukh_ppd_summary(),
ukh_ppd_transaction(),
ukh_ppd_years()
op <- options(ukhousing.cache_dir = tempdir())
path <- ukh_ppd_bulk(2025)
options(op)
Returns summary statistics from Price Paid Data for a given year, aggregated by month, property type, or local authority. Useful when you want counts and median prices rather than roughly 150 MB of individual transactions.
ukh_ppd_summary(
  year = as.integer(format(Sys.Date(), "%Y")),
  by = c("month", "property_type", "la"),
  la = NULL,
  refresh = FALSE
)
year |
Integer. Year to summarise. |
by |
Character. Aggregation dimension: |
la |
Optional character. Restrict to one local authority. |
refresh |
Logical. Re-download even if cached? Default |
A data frame with one row per group and columns n,
median_price, mean_price, total_value.
Other price paid data:
ukh_ppd(),
ukh_ppd_address(),
ukh_ppd_bulk(),
ukh_ppd_transaction(),
ukh_ppd_years()
op <- options(ukhousing.cache_dir = tempdir())
# Monthly transaction counts and medians, nationally
s <- ukh_ppd_summary(2025, by = "month")
head(s)
options(op)
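The per-group columns that ukh_ppd_summary() returns (n, median_price, mean_price, total_value) correspond to a plain grouped aggregation. The sketch below computes them by property type for a handful of made-up transactions using base R; it is illustrative only and makes no claim about how the package computes them.

```r
# Made-up transactions for illustration
tx <- data.frame(
  property_type = c("flat", "flat", "terraced", "terraced", "terraced"),
  price = c(300000, 340000, 250000, 270000, 410000)
)

# Split prices by group, then compute one statistic per group
groups <- split(tx$price, tx$property_type)
ppd_summary <- data.frame(
  property_type = names(groups),
  n = vapply(groups, length, integer(1)),
  median_price = vapply(groups, median, numeric(1)),
  mean_price = vapply(groups, mean, numeric(1)),
  total_value = vapply(groups, sum, numeric(1)),
  row.names = NULL
)
ppd_summary
```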
Fetches the full metadata for one transaction from the Land Registry linked data service by its transaction unique identifier.
ukh_ppd_transaction(id)
id |
Character. A transaction unique identifier (GUID, with or without curly braces). |
A one-row data frame with the transaction fields, or an empty data frame if the transaction is not found.
Other price paid data:
ukh_ppd(),
ukh_ppd_address(),
ukh_ppd_bulk(),
ukh_ppd_summary(),
ukh_ppd_years()
op <- options(ukhousing.cache_dir = tempdir())
tx <- ukh_ppd_transaction("{A4C5B0C6-4D5D-47E2-E053-6C04A8C07E7C}")
tx
options(op)
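Since ukh_ppd_transaction() accepts identifiers with or without curly braces, some normalisation step is implied. One possible version is sketched below; the helper name `normalise_guid()` is hypothetical, and the choice of an upper-case, brace-free canonical form is an assumption rather than documented behaviour.

```r
# Hypothetical helper: strip surrounding whitespace and braces, then
# upper-case, so that both spellings compare equal.
normalise_guid <- function(id) {
  toupper(gsub("[{}]", "", trimws(id)))
}

normalise_guid("{A4C5B0C6-4D5D-47E2-E053-6C04A8C07E7C}")
normalise_guid("a4c5b0c6-4d5d-47e2-e053-6c04a8c07e7c")
```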
Convenience wrapper that calls ukh_ppd() for each year in a
vector and row-binds the results. Caches each year independently.
ukh_ppd_years(years, ...)
years |
Integer vector. Years to fetch. |
... |
Additional arguments passed to |
A single data frame combining transactions from all requested years.
Other price paid data:
ukh_ppd(),
ukh_ppd_address(),
ukh_ppd_bulk(),
ukh_ppd_summary(),
ukh_ppd_transaction()
op <- options(ukhousing.cache_dir = tempdir())
five_year <- ukh_ppd_years(2020:2024, la = "Westminster", property_type = "flat")
nrow(five_year)
options(op)
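The "call once per year, then row-bind" pattern that ukh_ppd_years() wraps can be sketched with a stand-in fetcher. Both `fake_ppd()` and `bind_years()` are hypothetical names used only for illustration, and the data are made up; nothing is downloaded.

```r
# Stand-in for ukh_ppd(): returns a tiny fake data frame for one year
fake_ppd <- function(year, ...) {
  data.frame(year = year, price = c(250000, 310000))
}

# Fetch each year separately, forwarding extra filters, then row-bind
bind_years <- function(years, ..., fetch = fake_ppd) {
  per_year <- lapply(years, function(y) fetch(y, ...))
  do.call(rbind, per_year)
}

combined <- bind_years(2020:2022)
nrow(combined)
```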
Returns a data frame of common UK HPI region slugs with their names,
GSS codes, and tier (country, region, county, or local authority).
Useful for looking up the slug to pass to ukh_hpi().
ukh_regions(tier = c("all", "country", "region", "county", "la"))
tier |
Character. Filter by tier: |
This is a selection of the most commonly used regions, not an
exhaustive list. The UK HPI covers 441+ areas total; any valid slug
can be passed to ukh_hpi() directly.
A data frame with columns slug, name, gss_code, and
tier.
Other helpers:
ukh_sparql()
# All regions
head(ukh_regions())

# Just the nine English regions
ukh_regions(tier = "region")

# Country-level series
ukh_regions(tier = "country")
Runs a SPARQL query against one of the supported endpoints and returns the result as a data frame. Useful for queries that aren't covered by the dedicated helpers, including custom HPI aggregations and Price Paid Data lookups by postcode.
ukh_sparql(
  query,
  endpoint = c("land-registry", "opendatacommunities"),
  timeout = 60L
)
query |
Character. A SPARQL query string. |
endpoint |
Character. One of |
timeout |
Integer. Request timeout in seconds. Default |
Both endpoints are free and require no authentication. The Land Registry endpoint covers HPI and Price Paid Data; the Open Data Communities endpoint covers 300+ housing-market datasets published by MHCLG.
A data frame of bindings. Column names match the SELECT variables in the query.
Other helpers:
ukh_regions()
op <- options(ukhousing.cache_dir = tempdir())
# All HPI observations for Westminster in January 2020
q <- '
PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>
SELECT ?hpi ?avgPrice
WHERE {
  <http://landregistry.data.gov.uk/data/ukhpi/region/city-of-westminster/month/2020-01>
    ukhpi:housePriceIndex ?hpi ;
    ukhpi:averagePrice ?avgPrice .
}'
ukh_sparql(q)
options(op)
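The query in the example above hard-codes one region and month. To vary them, the query string can be built with sprintf(), following the same URI pattern as that example. The helper name `build_hpi_query()` is hypothetical, and the sketch only constructs the string; it does not contact the endpoint.

```r
# Hypothetical helper: fill a region slug and a YYYY-MM month into the
# observation URI pattern used in the example query.
build_hpi_query <- function(slug, month) {
  sprintf(
    paste(
      "PREFIX ukhpi: <http://landregistry.data.gov.uk/def/ukhpi/>",
      "SELECT ?hpi ?avgPrice WHERE {",
      "  <http://landregistry.data.gov.uk/data/ukhpi/region/%s/month/%s>",
      "    ukhpi:housePriceIndex ?hpi ;",
      "    ukhpi:averagePrice ?avgPrice .",
      "}",
      sep = "\n"
    ),
    slug, month
  )
}

q <- build_hpi_query("city-of-westminster", "2020-01")
cat(q)
```

The resulting string can then be passed to ukh_sparql() in the usual way.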
Returns monthly residential transaction counts for a UK region.
This is a thin wrapper over ukh_hpi() that extracts the
sales_volume column with its date. Transaction volumes lag the
headline index by approximately five months because the Land
Registry needs the full transaction set to settle.
ukh_transactions(region, from = NULL, to = NULL, refresh = FALSE)
region |
Character. Region slug, GSS code, or common name. |
from, to
|
Optional. Date range (YYYY-MM-DD). |
refresh |
Logical. Re-download? Default |
A data frame with date, region, and sales_volume.
Other house price index:
ukh_hpi(),
ukh_hpi_compare()
op <- options(ukhousing.cache_dir = tempdir())
tx <- ukh_transactions("england", from = "2020-01-01")
head(tx)
options(op)
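Because ukh_transactions() is described as a thin wrapper over ukh_hpi(), the equivalent column selection can be shown on a small made-up data frame. The values below are invented, and the trailing NA merely imitates a month whose volume has not yet been published.

```r
# Made-up HPI rows with the column names documented for ukh_hpi()
hpi <- data.frame(
  date = as.Date(c("2025-01-01", "2025-02-01", "2025-03-01")),
  region = "england",
  hpi = c(101.2, 101.6, 102.0),
  sales_volume = c(52000, 48000, NA)
)

# Keep only the date, region, and volume columns
tx <- hpi[, c("date", "region", "sales_volume")]
tx
```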