| Title: | Download and Tidy Australian Taxation Office Data |
|---|---|
| Description: | Fetch Australian Taxation Office (ATO) Taxation Statistics and related datasets via the data.gov.au Comprehensive Knowledge Archive Network ('CKAN') API <https://data.gov.au/data/api/3/>. Provides tidy access to individual, company, superannuation, goods and services tax (GST), fringe benefits tax (FBT), Voluntary Tax Transparency Code (VTTC), Pay As You Go (PAYG) withholding, charity, excise, and Corporate Tax Transparency data, plus Division 293, Petroleum Resource Rent Tax, Medicare Levy Surcharge, fuel tax credits, compliance, and Working Holiday Maker aggregates. Includes reproducibility helpers (snapshot pinning, SHA-256 cache integrity, session manifest, optional 'Zenodo' deposit), classification crosswalks (ANZSIC 2006 to 2020, ANZSCO 2013 to 2021), panel harmonisation, reconciliation against Final Budget Outcome totals, and real-terms and per-capita helpers backed by bundled Australian Bureau of Statistics (ABS) Consumer Price Index and Estimated Resident Population series. Bridges to the 'taxstats' 2 per cent microdata sample via column-schema mapping. Data is published by the Australian Taxation Office under Creative Commons Attribution 2.5 Australia or 3.0 Australia licences (dataset-dependent). |
| Authors: | Charles Coverdale [aut, cre] |
| Maintainer: | Charles Coverdale <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-03 06:45:35 UTC |
| Source: | https://github.com/charlescoverdale/ato |
Inspect the local ato cache
ato_cache_info()
A list with dir, n_files, size_bytes,
size_human, and files.
Other configuration:
ato_clear_cache(),
ato_meta()
op <- options(ato.cache_dir = tempdir())
ato_cache_info()
options(op)
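To make the shape of the returned list concrete, here is a minimal base-R sketch that builds a list with the same five fields from any directory. The helper name `cache_summary_sketch()` and the demo directory are hypothetical; this is not the package's implementation, only an illustration of what each field could hold.

```r
# Hypothetical helper (not part of the package): summarise a directory
# into the five fields documented for ato_cache_info().
cache_summary_sketch <- function(dir) {
  files <- list.files(dir, full.names = TRUE, recursive = TRUE)
  total <- sum(file.size(files), na.rm = TRUE)
  list(
    dir        = dir,
    n_files    = length(files),
    size_bytes = total,
    # Reuse R's object_size formatter to get a human-readable size.
    size_human = format(structure(total, class = "object_size"), units = "auto"),
    files      = basename(files)
  )
}

# Demo on a throwaway directory with one small file.
demo_dir <- file.path(tempdir(), "ato-cache-demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("a,b\n1,2", file.path(demo_dir, "example.csv"))
info <- cache_summary_sketch(demo_dir)
str(info)
```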
Returns a summary of all datasets published by the Australian Taxation Office on data.gov.au. Each row is a CKAN "package" with an id (slug), title, licence, modification date, and resource count.
ato_catalog(q = NULL)
q |
Optional free-text filter (CKAN Solr query). |
An ato_tbl with one row per dataset.
'data.gov.au' CKAN endpoint https://data.gov.au/data/organization/australiantaxationoffice.
Other discovery:
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  cat <- ato_catalog()
  head(cat[, c("id", "title", "licence")])
})
options(op)
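For readers unfamiliar with CKAN, the sketch below shows how a free-text query and an organisation filter could be combined into a `package_search` URL. The helper `ckan_search_url_sketch()` is hypothetical, and the assumption that `ato_catalog()` builds its request this way is not confirmed by this documentation. The `package_search` action and its `q`, `fq`, and `rows` parameters are standard CKAN, and the organisation slug is taken from the endpoint URL above.

```r
# Hypothetical helper: build a CKAN package_search URL for one organisation.
ckan_search_url_sketch <- function(q = NULL,
                                   org = "australiantaxationoffice",
                                   base = "https://data.gov.au/data/api/3/action/package_search",
                                   rows = 100) {
  params <- c(fq = paste0("organization:", org), rows = as.character(rows))
  if (!is.null(q)) params <- c(q = q, params)
  # Percent-encode each value, then join as key=value pairs.
  encoded <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  paste0(base, "?", paste0(names(params), "=", encoded, collapse = "&"))
}

url_all    <- ckan_search_url_sketch()
url_filter <- ckan_search_url_sketch(q = "company tax")
print(url_filter)
```

No request is sent here; the URL can be inspected or pasted into a browser before it is wired into any download code.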
Returns the ATO's data on income tax-exempt entities and Deductible Gift Recipients (DGRs): entity counts, income, expenditure, and gift deductions by charity subtype and state. Covers public benevolent institutions, health promotion charities, environmental organisations, and other DGR categories.
ato_charities(year = "latest")
year |
Income year in |
Used by Treasury (charity tax expenditure estimates), researchers studying the non-profit sector, and civil society policy analysts.
An ato_tbl. Monetary values in nominal AUD.
Australian Taxation Office charity statistics on data.gov.au. Licensed CC BY 2.5 AU.
Other discovery:
ato_catalog(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  ch <- ato_charities(year = "2021-22")
  head(ch)
})
options(op)
Returns a citation suitable for footnotes, papers, and
Treasury-grade briefs. Uses the provenance attributes
attached to every ato_tbl: source URL, licence, retrieval
date, title, snapshot pin, and SHA-256 digest.
ato_cite(x, style = c("text", "bibtex", "apa"), doi = NULL)
x |
Either an |
style |
One of |
doi |
Optional DOI (e.g. from |
BibTeX output includes the SHA-256 digest (first 12 hex chars)
and snapshot pin (when set via ato_snapshot()) in the
note field, which is what research reviewers need to verify
the provenance of a downstream result.
A character string. For style = "bibtex", a complete
@misc{} entry.
Other discovery:
ato_catalog(),
ato_charities(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
x <- data.frame(a = 1)
x <- structure(x,
  ato_source = "https://data.gov.au/data/dataset/example.xlsx",
  ato_licence = "CC BY 2.5 AU",
  ato_retrieved = as.POSIXct("2026-04-23 00:00:00", tz = "UTC"),
  ato_title = "ATO individuals 2022-23",
  ato_sha256 = "abc123def456",
  ato_snapshot_date = "2026-04-23",
  class = c("ato_tbl", "data.frame"))
ato_cite(x)
ato_cite(x, style = "bibtex")
# DOI style: supply any minted DOI (Zenodo, DataCite, etc.).
# The placeholder below is illustrative only.
ato_cite(x, style = "apa", doi = "10.5281/zenodo.XXXXXXXX")
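The provenance attributes can also be read directly with `attr()`, which helps when a house citation style is needed. The sketch below assembles a plain-text citation from the attribute names used in the example above. `cite_sketch()` is a hypothetical helper and its wording is an assumption; it does not reproduce the output of `ato_cite()`.

```r
# Hypothetical helper: assemble a plain-text citation from provenance attributes.
cite_sketch <- function(x, doi = NULL) {
  retrieved <- format(attr(x, "ato_retrieved"), "%Y-%m-%d", tz = "UTC")
  out <- sprintf(
    "Australian Taxation Office. %s. Retrieved %s from %s. Licensed %s. SHA-256: %s.",
    attr(x, "ato_title"), retrieved, attr(x, "ato_source"),
    attr(x, "ato_licence"),
    substr(attr(x, "ato_sha256"), 1, 12)  # first 12 hex characters only
  )
  if (!is.null(doi)) out <- paste0(out, " doi:", doi)
  out
}

# Same illustrative object as in the example above.
y <- structure(data.frame(a = 1),
  ato_source = "https://data.gov.au/data/dataset/example.xlsx",
  ato_licence = "CC BY 2.5 AU",
  ato_retrieved = as.POSIXct("2026-04-23 00:00:00", tz = "UTC"),
  ato_title = "ATO individuals 2022-23",
  ato_sha256 = "abc123def456",
  class = c("ato_tbl", "data.frame"))
txt <- cite_sketch(y)
print(txt)
```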
Deletes all locally cached files. The next call to any data function will re-download.
ato_clear_cache()
Invisibly returns NULL.
Other configuration:
ato_cache_info(),
ato_meta()
op <- options(ato.cache_dir = tempdir())
ato_clear_cache()
options(op)
Returns the annual Company Taxation Statistics tables. The Company release ships tables covering entity type, turnover band, industry, taxable status, source of income, and expense deductions. Pick the table that matches your question:
ato_companies(
  year = "latest",
  table = c("industry", "snapshot", "key_items_by_size", "entity_type",
            "industry_by_size", "sub_industry", "taxable_status", "source",
            "expenses"),
  industry = NULL
)
year |
|
table |
One of |
industry |
Optional substring filter on industry name (applied only when the fetched table has an industry column). |
snapshot (T1): aggregate counts, total income, net tax across all companies (~1m entities)
key_items_by_size (T2): net tax by company size band
entity_type (T3): split by public/private/co-operative
industry (T4, default): key items by 2-digit ANZSIC subdivision
industry_by_size (T5): industry x turnover band
sub_industry (T6): 4-digit ANZSIC class detail
taxable_status (T7): items by taxable status
source (T8): source of income
expenses (T9): expense and deduction categories
Classification break. Releases from 2022-23 onwards use ANZSIC 2020; earlier releases use ANZSIC 2006. A warning is emitted when the requested year(s) are at or after this boundary, or when a multi-year request spans it.
An ato_tbl. Monetary values in nominal AUD of the
reporting year.
Australian Taxation Office Taxation Statistics Company Tables. Licensed CC BY 2.5 AU.
Australian Taxation Office (annual). Taxation Statistics: Company tables explanatory notes. Methodology notes on lodgement cut-off, entity-type definitions, and turnover-band thresholds. Accessible from https://www.ato.gov.au/about-ato/research-and-statistics/in-detail/taxation-statistics/.
Australian Bureau of Statistics (2020). Australian and New Zealand Standard Industrial Classification (ANZSIC), 2006 revision with 2020 update. Catalogue 1292.0.
op <- options(ato.cache_dir = tempdir())
try({
  s <- ato_companies(year = "2022-23", table = "snapshot")
  head(s)
  m <- ato_companies(year = "2022-23", industry = "mining")
  head(m)
  # Multi-year industry panel
  panel <- ato_companies(year = c("2021-22", "2022-23"))
})
options(op)
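Before stacking a multi-year panel, it can help to check whether the requested years sit on one side of the 2022-23 classification boundary or straddle it. The sketch below is a hypothetical standalone check (`classification_status_sketch()` is not a package function) that compares the starting calendar year of each financial-year label with that of the boundary.

```r
# Starting calendar year of a financial-year label such as "2021-22".
fy_start <- function(fy) as.integer(substr(fy, 1, 4))

# Hypothetical helper: classify a set of years relative to a boundary year.
classification_status_sketch <- function(years, boundary = "2022-23") {
  at_or_after <- fy_start(years) >= fy_start(boundary)
  if (all(at_or_after)) {
    "at_or_after"   # new classification only
  } else if (any(at_or_after)) {
    "spans"         # mixes old and new classifications
  } else {
    "before"        # old classification only
  }
}

status_panel  <- classification_status_sketch(c("2021-22", "2022-23"))
status_old    <- classification_status_sketch(c("2019-20", "2020-21"))
status_recent <- classification_status_sketch("2022-23")
print(c(status_panel, status_old, status_recent))
```

A result of `"spans"` signals that industry codes should be mapped through a crosswalk before the years are compared.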
Returns the ATO's annual compliance program outcomes: audit yield (tax raised from audits), settled disputes, collectable debt, and compliance cost recovery. These appear in the ATO annual report and related data.gov.au releases.
ato_compliance(year = "latest", metric = c("overview", "debt", "audit"))
year |
|
metric |
One of |
An ato_tbl.
Australian Taxation Office annual report data. Licensed CC BY 3.0 AU.
Other specialist:
ato_division293(),
ato_fuel_tax_credits(),
ato_international(),
ato_medicare_levy(),
ato_prrt(),
ato_rba(),
ato_state_tax(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(ato_compliance(year = "2022-23", metric = "debt"))
options(op)
Returns one of the bundled classification crosswalks. Used
internally by ato_harmonise() and available for user-level
panel work.
ato_crosswalk(name = c("anzsic", "anzsco", "postcode", "cpi", "erp", "budget"))
name |
One of |
Bundled crosswalks (at division/major-group level):
"anzsic": ANZSIC 2006 to 2020 (19 divisions, complete)
"anzsco": ANZSCO 2013 to 2021 (8 major groups, complete)
"postcode": postcode first-digit to state anchors
"cpi": ABS CPI annual, base 2011-12 = 1.0
"erp": ABS Estimated Resident Population, June 30 annual
"budget": Final Budget Outcome reference totals
For 4-digit ANZSIC, 6-digit ANZSCO, or postcode-to-SA2/LGA/CED crosswalks, fetch the full tables from ABS. The bundled division/major-group level covers cross-year ATO Taxation Statistics joins at the industry headings used in all ATO tables.
A data frame.
Australian Bureau of Statistics (2006). Australian and New Zealand Standard Industrial Classification (ANZSIC). Catalogue 1292.0.
Australian Bureau of Statistics (2020). ANZSIC 2006 Update, cat. 1292.0, divisional structure. Used by ATO Taxation Statistics from 2022-23.
Australian Bureau of Statistics (2013). Australian and New Zealand Standard Classification of Occupations (ANZSCO). Catalogue 1220.0.
Australian Bureau of Statistics (2022). ANZSCO Revised Edition, cat. 1220.0. Used by ATO Taxation Statistics from 2022-23 onward.
Other harmonisation:
ato_deflate(),
ato_harmonise(),
ato_per_capita(),
ato_reconcile(),
ato_schema_map(),
ato_to_taxstats()
ato_crosswalk("anzsic")
ato_crosswalk("cpi")
Converts a numeric vector of nominal AUD figures indexed by
financial year to real AUD of a chosen base year using the
bundled ABS CPI series (annual, All Groups Australia,
2011-12 = 1.0). For non-Australian contexts (for example, an
inflateR workflow), supply a matching CPI series through the
cpi = argument.
ato_deflate(x, year, base = "2022-23", cpi = NULL)
x |
Numeric vector of nominal AUD values. |
year |
Character vector of financial years for each entry
in |
base |
Base financial year for real terms (default
|
cpi |
Optional override: a data frame with columns
|
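Uses proportional (Laspeyres-style) adjustment:
$x_{\text{real}} = x_{\text{nominal}} \times \mathrm{CPI}_{\text{base}} / \mathrm{CPI}_{\text{year}}$. The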
bundled CPI is the ABS annual All Groups Australia index
published in cat. 6401.0, rebased so that 2011-12 = 1.000.
This is the standard rebasing used in most Australian
time-series work and is consistent with ABS System of National
Accounts methodology (cat. 5204.0).
The formula is exact for a chain-linked index after 1949 (when
the ABS CPI was introduced) and approximate for earlier values
that rely on Commonwealth Statistician retail-price series. Use
a custom cpi = argument if you need a different deflator
(e.g. GDP deflator, wage price index, or industry-specific PPI).
Numeric vector of real AUD values in base-year prices.
Australian Bureau of Statistics (2024). Consumer Price Index, Australia: Concepts, Sources and Methods. Catalogue 6461.0.
Australian Bureau of Statistics (2024). Consumer Price Index, Australia. Catalogue 6401.0.
Diewert, W.E. (1998). "Index Number Issues in the Consumer Price Index." Journal of Economic Perspectives, 12(1), 47-58. doi:10.1257/jep.12.1.47
Other harmonisation:
ato_crosswalk(),
ato_harmonise(),
ato_per_capita(),
ato_reconcile(),
ato_schema_map(),
ato_to_taxstats()
ato_deflate(c(100, 100, 100),
            year = c("2012-13", "2017-18", "2022-23"),
            base = "2022-23")
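The proportional adjustment scales each nominal value by the ratio of the base-year index to the index for its own year. The sketch below works through that arithmetic in base R. The function `deflate_sketch()`, the column names `year` and `index`, and the index values are all hypothetical; the numbers are made up for illustration and are not the bundled ABS series.

```r
# Hypothetical helper: real = nominal * CPI[base] / CPI[year].
deflate_sketch <- function(x, year, base, cpi) {
  idx      <- cpi$index[match(year, cpi$year)]
  base_idx <- cpi$index[match(base, cpi$year)]
  if (anyNA(idx) || is.na(base_idx)) stop("Year missing from the CPI table.")
  x * base_idx / idx
}

# Illustrative index values only (NOT real ABS figures).
cpi_demo <- data.frame(
  year  = c("2012-13", "2017-18", "2022-23"),
  index = c(1.00, 1.10, 1.30),
  stringsAsFactors = FALSE
)

real <- deflate_sketch(c(100, 100, 100),
                       year = c("2012-13", "2017-18", "2022-23"),
                       base = "2022-23", cpi = cpi_demo)
print(round(real, 2))
```

With these made-up indices, a value already in base-year prices is unchanged, and earlier values are scaled up by the cumulative price change.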
Builds the JSON metadata payload Zenodo expects for a data
deposit, using the current ato_manifest() and the snapshot
pin set via ato_snapshot(). The function does NOT upload by
default; it returns the payload and saved manifest path so
you can inspect before calling with upload = TRUE.
ato_deposit_zenodo(
  title = NULL,
  description = NULL,
  creators = list(list(name = "Anonymous")),
  keywords = c("ATO", "taxation", "Australia", "reproducibility"),
  upload = FALSE,
  sandbox = FALSE,
  token = Sys.getenv("ZENODO_TOKEN")
)
title |
Deposit title. Defaults to "ATO data snapshot YYYY-MM-DD" using the current snapshot pin. |
description |
Free-text description. Defaults to a short auto-generated note listing the datasets fetched. |
creators |
List of creator records. Each should be a list
with |
keywords |
Character vector of keywords. Defaults to
|
upload |
Logical; if |
sandbox |
Logical; if |
token |
Zenodo personal access token. Defaults to
|
To upload, supply a Zenodo personal access token via the
ZENODO_TOKEN environment variable (or the token argument).
Tokens can be generated at
https://zenodo.org/account/settings/applications/.
A list with payload (the JSON metadata), manifest_path
(where the CSV manifest was staged), and if upload = TRUE,
deposit_id, doi_prereserve, and url.
Other reproducibility:
ato_manifest(),
ato_manifest_clear(),
ato_manifest_write(),
ato_sha256(),
ato_snapshot()
ato_snapshot("2026-04-24")
ato_deposit_zenodo(
  title = "ATO data snapshot for working paper v1",
  creators = list(list(name = "Coverdale, Charles")),
  upload = FALSE
)
Returns Division 293 tax data: number of assessments, average Division 293 liability, and distribution by income band. Division 293 applies an extra 15% tax on concessional super contributions for individuals with combined income plus low-tax super contributions above AUD 250,000. Central to retirement-income reform analysis (e.g. Grattan's "Better Super" proposals).
ato_division293(year = "latest")
year |
|
Published as part of the Individuals Taxation Statistics (Table 3b in recent releases).
An ato_tbl.
Australian Taxation Office Taxation Statistics Individuals. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Income Tax Assessment Act 1997, Division 293. Extra 15 per cent tax on concessional super contributions for high-income earners.
Daley, J., Coates, B. and Wood, D. (2018). Money in retirement: more than enough. Grattan Institute. Uses Division 293 distributional data in reform analysis.
Other specialist:
ato_compliance(),
ato_fuel_tax_credits(),
ato_international(),
ato_medicare_levy(),
ato_prrt(),
ato_rba(),
ato_state_tax(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(ato_division293(year = "2022-23"))
options(op)
Low-level helper for arbitrary CKAN resources. Resolves the
package by id (slug) and picks the first resource matching
pattern, or the first resource if pattern is NULL.
ato_download(
  id,
  pattern = NULL,
  parse = c("auto", "csv", "xlsx", "none"),
  sheet = 1
)
id |
CKAN package id (e.g. |
pattern |
Optional regex applied to the resource filename and name. |
parse |
One of |
sheet |
For XLSX resources: sheet index or name. |
Either a file path (parse = "none") or an ato_tbl.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  cat <- ato_download("corporate-transparency", pattern = "2023", parse = "csv")
})
options(op)
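The "first resource matching pattern" rule can be pictured as a regex test against each resource's name and filename, taking the first hit. The sketch below applies that rule to a made-up resource table. `pick_resource_sketch()`, the column names `name` and `url`, and the example rows are hypothetical and do not reflect the package's internals or any real dataset.

```r
# Hypothetical helper: pick the first resource whose name or filename
# matches a regex, or the first resource when no pattern is given.
pick_resource_sketch <- function(resources, pattern = NULL) {
  if (is.null(pattern)) return(resources[1, , drop = FALSE])
  hit <- grepl(pattern, resources$name) |
    grepl(pattern, basename(resources$url))
  if (!any(hit)) stop("No resource matches pattern: ", pattern)
  resources[which(hit)[1], , drop = FALSE]
}

# Made-up resource listing for illustration only.
resources <- data.frame(
  name = c("Corporate report 2021-22", "Corporate report 2022-23"),
  url  = c("https://example.org/files/report-2022.csv",
           "https://example.org/files/report-2023.csv"),
  stringsAsFactors = FALSE
)

picked  <- pick_resource_sketch(resources, pattern = "2023")
default <- pick_resource_sketch(resources)
print(picked$url)
```

Because the first match wins, a narrow pattern (for example one that includes the file extension) is the safer choice when a package holds several similar files.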
Returns ATO excise data, covering four sub-releases:
beer : beer clearances summary (volumes by product class)
spirits : spirits and other excisable beverages clearances
excise_rates : historical excise rate schedule (all excise categories, quarterly indexed rates)
ftc_rates : historical Fuel Tax Credit rates
ato_excise(table = c("excise_rates", "ftc_rates", "beer", "spirits"))
table |
One of |
An ato_tbl. Rates are in AUD per litre (or per kg
for tobacco); volumes are in megalitres or similar.
Australian Taxation Office excise data. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Excise Act 1901; Excise Tariff Act 1921; Fuel Tax Act 2006.
Australian Taxation Office (annual). Excise data: methodology and indexation notes. Excise rates are indexed to the Consumer Price Index twice a year (February and August) for most commodities.
Productivity Commission (2016). Migrant Intake into Australia (for tobacco excise distributional analysis); Harmful Drinking inquiry (for alcohol excise distributional analysis).
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  rates <- ato_excise("excise_rates")
  head(rates)
})
options(op)
Returns the ATO's annual Fringe Benefits Tax (FBT) Taxation Statistics: employer counts, gross taxable value, FBT payable, and employee benefit counts by benefit type and industry. Used by Treasury, PBO, and researchers evaluating the FBT concession system (electric vehicles, remote area exemptions, novated leases).
ato_fbt(year = "latest")
year |
Income year in |
An ato_tbl. Monetary values in nominal AUD.
Australian Taxation Office FBT Taxation Statistics on data.gov.au. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Fringe Benefits Tax Assessment Act 1986. Substantive FBT law; ATO rulings (TR series) elaborate taxable-value methodology.
Australian Taxation Office (annual). FBT explanatory notes. Definitions of reportable benefits, gross-up factors (Type 1 and Type 2), and otherwise-deductible rule.
Treasury (2022). Electric Car Discount Bill. Explanatory memorandum for the EV FBT exemption introduced 1 July 2022.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  fbt <- ato_fbt(year = "2022-23")
  head(fbt)
})
options(op)
Returns the Fuel Tax Credits scheme data: entitlement rates by fuel type, claim totals by industry. FTC is a major implicit fossil-fuel subsidy and is a key lens for decarbonisation policy cost-benefit analysis.
ato_fuel_tax_credits(year = "latest", by = c("industry", "fuel", "period"))
year |
|
by |
One of |
The ATO publishes FTC data as part of the Excise Data release and in standalone FTC tables.
An ato_tbl.
Australian Taxation Office Excise and Fuel Tax Credit data. Licensed CC BY 3.0 AU.
Commonwealth of Australia. Fuel Tax Act 2006; Fuel Tax (Consequential and Transitional Provisions) Act 2006.
Denniss, R. and Grudnoff, M. (2021). Fossil fuel subsidies in Australia. The Australia Institute. FTC-as-subsidy framing used in decarbonisation policy analysis.
Intergovernmental Panel on Climate Change (2022). Climate Change 2022: Mitigation of Climate Change. Chapter 13 covers fossil-fuel subsidy reform.
Other specialist:
ato_compliance(),
ato_division293(),
ato_international(),
ato_medicare_levy(),
ato_prrt(),
ato_rba(),
ato_state_tax(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(head(ato_fuel_tax_credits(year = "latest", by = "industry")))
options(op)
Returns the Taxation Statistics GST tables (T1-T5) or the Activity Statement Ratios (A1-A5) for the requested year.
ato_gst(year = "latest", table = c("overview", "state", "industry", "ratios"))
year |
|
table |
One of |
An ato_tbl.
Australian Taxation Office Taxation Statistics. Licensed CC BY 2.5 AU.
Australian Taxation Office (annual). Taxation Statistics: GST and Activity Statement Ratios explanatory notes.
Commonwealth of Australia. A New Tax System (Goods and Services Tax) Act 1999. Enabling legislation for the 10 per cent value-added tax introduced 1 July 2000.
Productivity Commission (2018). Horizontal Fiscal Equalisation. Background reference on the GST distribution formula across states.
Other gst:
ato_industry()
op <- options(ato.cache_dir = tempdir())
try({
  g <- ato_gst(year = "2022-23", table = "industry")
  head(g)
})
options(op)
ATO renames columns across annual releases; a stacked panel
from ato_individuals_postcode(year = c("2020-21", "2021-22"))
may have inconsistent names like total_income vs
total_income_or_loss. ato_harmonise() renames columns to
the first variant in ATO_COL_VARIANTS so panels are join-ready.
ato_harmonise(df)
df |
A data frame (typically an |
Unknown columns are left alone. Columns that collide after renaming (because two variants map to the same canonical name) emit a warning; the first column wins.
A data frame with harmonised names. ato_tbl class
and provenance attributes are preserved.
Other harmonisation:
ato_crosswalk(),
ato_deflate(),
ato_per_capita(),
ato_reconcile(),
ato_schema_map(),
ato_to_taxstats()
df <- data.frame(postcode = "2000",
                 total_income_or_loss = 100,
                 state_territory = "NSW")
ato_harmonise(df)
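The renaming rule (map every known variant to the first name in its group, leave unknown columns alone, warn on collisions and keep the first column) can be sketched in base R as follows. The variant groups below are hypothetical stand-ins; the actual contents and structure of `ATO_COL_VARIANTS` are not shown in this documentation, and `harmonise_sketch()` is not the package's implementation.

```r
# Hypothetical variant groups: the first entry of each group is canonical.
variants_demo <- list(
  c("total_income", "total_income_or_loss"),
  c("state", "state_territory")
)

harmonise_sketch <- function(df, variants) {
  # Named lookup: variant name -> canonical (first) name of its group.
  lookup <- unlist(lapply(variants, function(v) {
    stats::setNames(rep(v[1], length(v)), v)
  }))
  new <- names(df)
  known <- new %in% names(lookup)
  new[known] <- lookup[new[known]]
  dup <- duplicated(new)
  if (any(dup)) {
    warning("Columns collide after renaming; keeping the first: ",
            paste(unique(new[dup]), collapse = ", "))
    df  <- df[!dup]
    new <- new[!dup]
  }
  names(df) <- unname(new)
  df
}

raw <- data.frame(postcode = "2000", total_income_or_loss = 100,
                  state_territory = "NSW", stringsAsFactors = FALSE)
tidy <- harmonise_sketch(raw, variants_demo)
print(names(tidy))
```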
Returns aggregate statistics on Australia's three main education-loan schemes:
HELP (Higher Education Loan Program, ~3m borrowers, AUD 80bn+ outstanding debt)
AASL (Australian Apprenticeship Support Loans, previously Trade Support Loans)
VSL (VET Student Loans, vocational education loans)
ato_help(scheme = c("help", "aasl", "vsl"))
scheme |
One of |
Headline coverage: new loans by income range, outstanding debt by age and gender, repayment rates, and median debt on entry. Used by Treasury and the PBO (for example, costings of HELP indexation changes) and by education policy researchers.
An ato_tbl. All dollar values in nominal AUD.
Australian Taxation Office Study and Training Support Loans statistics. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Higher Education Support Act 2003; VET Student Loans Act 2016.
Australian Department of Education (annual). Higher Education Statistics: HELP statistics collection.
Norton, A. and Cherastidtham, I. (2018). Mapping Australian higher education. Grattan Institute. Methodology reference for HELP repayment projections.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  help <- ato_help(scheme = "help")
  head(help)
})
options(op)
Returns the Individuals Table 1 snapshot: aggregate counts, total income, taxable income, tax payable, and deductions across all individual returns (roughly 14 million per year). The snapshot is the headline table; for finer cuts use the dedicated functions:
ato_individuals_postcode() for geographic breakdowns,
ato_individuals_occupation() for occupation × sex × income-range
detail, or
ato_download() with a custom pattern for specific
Tables 2 to 27 (age, sex, state, industry, source of income,
deductions, offsets, CGT, non-residents).
ato_individuals(year = "latest")
year |
Year in |
Monetary values are nominal AUD of the reporting year. Use
inflateR::inflate() or the ABS CPI series if you need
real-term comparisons.
An ato_tbl with one row per aggregate line-item and
columns for count and amount in nominal AUD.
Australian Taxation Office Taxation Statistics https://www.ato.gov.au/about-ato/research-and-statistics/. Licensed CC BY 2.5 AU.
Other individuals:
ato_individuals_age(),
ato_individuals_occupation(),
ato_individuals_postcode(),
ato_individuals_sex(),
ato_individuals_state()
op <- options(ato.cache_dir = tempdir())
try({
  ind <- ato_individuals(year = "2022-23")
  head(ind)
})
options(op)
Returns Taxation Statistics Individuals Table 2 (approximately): counts, total income, taxable income, and tax payable by age range and (usually) sex. Age ranges are 5-year bands for most of working life plus wider bands at the tails.
ato_individuals_age(year = "latest", sex = c("all", "male", "female"))
year |
|
sex |
One of |
An ato_tbl.
Australian Taxation Office Taxation Statistics Individuals. Licensed CC BY 2.5 AU.
Australian Taxation Office (annual). Taxation Statistics: Individuals explanatory notes. Age-range breakdowns use the taxpayer's reported date of birth at lodgement; sex is self-reported on the return.
Other individuals:
ato_individuals(),
ato_individuals_occupation(),
ato_individuals_postcode(),
ato_individuals_sex(),
ato_individuals_state()
op <- options(ato.cache_dir = tempdir())
try(ato_individuals_age(year = "2022-23", sex = "female"))
options(op)
Returns the Individuals Table 14 (occupation by sex by taxable income range): around 1,000 occupations classified by ANZSCO, with aggregate counts, total income, taxable income, and tax payable. The ATO migrated from ANZSCO 2013 to ANZSCO 2021 with the 2022-23 release; cross-year joins on occupation name or code must account for the recode.
ato_individuals_occupation(
  year = "latest",
  occupation = NULL,
  sex = c("all", "male", "female", "m", "f")
)
year |
|
occupation |
Optional substring filter (case-insensitive) applied to the occupation description column. |
sex |
One of |
Classification break. Releases from 2022-23 onwards use ANZSCO 2021; earlier releases use ANZSCO 2013. A warning is emitted when the requested year(s) are at or after this boundary, or when a multi-year request spans it.
An ato_tbl with one row per occupation-sex-income
combination. Multi-year queries add a year column.
Monetary values in nominal AUD of the reporting year.
Australian Taxation Office Taxation Statistics. Licensed CC BY 2.5 AU.
Other individuals:
ato_individuals(),
ato_individuals_age(),
ato_individuals_postcode(),
ato_individuals_sex(),
ato_individuals_state()
op <- options(ato.cache_dir = tempdir())
try({
  occ <- ato_individuals_occupation(year = "2022-23", occupation = "economist")
  head(occ)
  # Multi-year panel
  panel <- ato_individuals_occupation(year = c("2021-22", "2022-23"),
                                      occupation = "nurse")
})
options(op)
Returns the Individuals Table 6 (or standalone postcode dataset): taxable income, tax payable, and return counts by 4-digit postcode. Headline dataset for income-distribution journalism.
ato_individuals_postcode(year = "latest", state = NULL, postcode = NULL)
year |
|
state |
Optional character vector of state codes (e.g.
|
postcode |
Optional character vector of 4-digit postcodes. |
Privacy suppression. The ATO suppresses postcodes with
fewer than 50 returns; those cells are returned as NA after
parsing (the package maps "np", "*", and similar tokens
to NA so numeric columns stay numeric). Small or remote
postcodes will be silently missing from the output.
Monetary values are nominal AUD of the reporting year. Use
inflateR::inflate() for real-term series.
An ato_tbl with one row per postcode (or per postcode
per year for multi-year queries), including state, return
count, total income, taxable income, and tax payable. Schema
drifts year to year (SA3/SA4 columns present from 2017
onwards).
Australian Taxation Office Taxation Statistics postcode release. Licensed CC BY 2.5 AU.
Atkinson, A.B. and Leigh, A. (2007). "The Distribution of Top Incomes in Australia." Economic Record, 83(262), 247-261. doi:10.1111/j.1475-4932.2007.00412.x
Burkhauser, R.V., Hahn, M.H. and Wilkins, R. (2015). "Measuring top incomes using tax record data: a cautionary tale from Australia." Journal of Economic Inequality, 13(2), 181-205. doi:10.1007/s10888-014-9281-z
Other individuals:
ato_individuals(),
ato_individuals_age(),
ato_individuals_occupation(),
ato_individuals_sex(),
ato_individuals_state()
op <- options(ato.cache_dir = tempdir())
try({
  # Single year
  p <- ato_individuals_postcode(year = "2022-23", state = "NSW")
  head(p)
  # Multi-year stack with year column
  panel <- ato_individuals_postcode(year = c("2020-21", "2021-22"),
                                    state = "NSW")
})
options(op)
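The suppression handling described above (mapping tokens such as "np" and "*" to NA so that columns stay numeric) can be illustrated with a small base-R parser. `parse_suppressed_sketch()` and its token list are hypothetical; the package's full set of recognised tokens is not listed in this documentation.

```r
# Hypothetical helper: turn a raw character column into numeric values,
# mapping suppression tokens to NA instead of triggering coercion warnings.
parse_suppressed_sketch <- function(x, tokens = c("np", "*", "")) {
  x <- trimws(as.character(x))
  x[tolower(x) %in% tokens] <- NA_character_
  # Strip thousands separators before converting.
  as.numeric(gsub(",", "", x, fixed = TRUE))
}

raw_counts <- c("1,250", "np", "*", "48000")
counts <- parse_suppressed_sketch(raw_counts)
print(counts)

# Suppressed cells are NA, so totals need na.rm = TRUE and will
# understate the true figure wherever cells were suppressed.
total_visible <- sum(counts, na.rm = TRUE)
```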
Returns counts and aggregates split by sex. Thin wrapper around the ATO "Selected items by sex" table.
ato_individuals_sex(year = "latest")
year |
|
An ato_tbl.
Australian Taxation Office Taxation Statistics Individuals. Licensed CC BY 2.5 AU.
Australian Taxation Office (annual). Taxation Statistics: Individuals explanatory notes. Age-range breakdowns use the taxpayer's reported date of birth at lodgement; sex is self-reported on the return.
Other individuals:
ato_individuals(),
ato_individuals_age(),
ato_individuals_occupation(),
ato_individuals_postcode(),
ato_individuals_state()
op <- options(ato.cache_dir = tempdir())
try(ato_individuals_sex(year = "2022-23"))
options(op)
Returns counts and aggregates by state. Thin wrapper around the ATO "Selected items by state/territory" table.
ato_individuals_state(year = "latest")
year |
|
An ato_tbl.
Australian Taxation Office Taxation Statistics Individuals. Licensed CC BY 2.5 AU.
Australian Taxation Office (annual). Taxation Statistics: Individuals explanatory notes. Age-range breakdowns use the taxpayer's reported date of birth at lodgement; sex is self-reported on the return.
Other individuals:
ato_individuals(),
ato_individuals_age(),
ato_individuals_occupation(),
ato_individuals_postcode(),
ato_individuals_sex()
op <- options(ato.cache_dir = tempdir())
try(ato_individuals_state(year = "2022-23"))
options(op)
Derived helper that returns an ANZSIC industry breakdown for individuals, companies, or all entities in the given year.
ato_industry(
  year = "latest",
  entity = c("company", "individual", "all"),
  anzsic = NULL
)
year |
|
entity |
One of |
anzsic |
Optional substring filter on industry name. |
An ato_tbl.
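A substring filter of the kind the `anzsic` argument describes can be sketched as follows. This is an illustration, not the package's code: `filter_industry()` and the `industry` column name are hypothetical, and case-insensitive matching is an assumption.

```r
# Hypothetical helper: keep rows whose industry name contains `pattern`.
# Case-insensitive, literal (non-regex) matching is assumed here.
filter_industry <- function(df, pattern = NULL, column = "industry") {
  if (is.null(pattern)) return(df)
  keep <- grepl(tolower(pattern), tolower(df[[column]]), fixed = TRUE)
  df[keep, , drop = FALSE]
}

# Made-up rows for illustration only
demo <- data.frame(
  industry = c("Manufacturing", "Food Product Manufacturing", "Mining"),
  n = c(1, 2, 3)
)
hits <- filter_industry(demo, "manufacturing")
hits
```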
Australian Taxation Office Taxation Statistics. Licensed CC BY 2.5 AU.
Other gst:
ato_gst()
op <- options(ato.cache_dir = tempdir())
try({
  i <- ato_industry(year = "2022-23", entity = "company", anzsic = "manufacturing")
  head(i)
})
options(op)
Fetches OECD Revenue Statistics for cross-country tax-to-GDP benchmarking. Returns tax revenue as percent of GDP by tax category. Use to contextualise Australian ATO aggregates in cross-country policy arguments (e.g. OECD average corporate tax-to-GDP, international ranks for personal income tax).
ato_international(country = "AUS", year = "latest")
country |
Country ISO code or name (default |
year |
Four-digit year or |
Thin wrapper pointing users to readoecd:: for full OECD API
access; returns a minimal tax-to-GDP slice here for convenience.
An ato_tbl with columns country, year, tax,
pct_gdp.
OECD Revenue Statistics https://www.oecd.org/tax/tax-policy/revenue-statistics.htm.
Other specialist:
ato_compliance(),
ato_division293(),
ato_fuel_tax_credits(),
ato_medicare_levy(),
ato_prrt(),
ato_rba(),
ato_state_tax(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(ato_international(country = "AUS"))
options(op)
Returns the ATO's International Related Party Dealings data, which captures intra-group cross-border payments and receivables reported by Australian corporate taxpayers. Core dataset for BEPS and transfer-pricing research, transfer pricing risk assessment, and multinational tax analysis.
ato_irpd(year = "latest", table = 1L)
year |
Income year in |
table |
Integer 1, 2, or 3. Default |
The IRPD data is published as a separate CKAN package per income year (2019-20 through 2023-24). Each annual package contains three tables:
Table 1: IRPD totals from 2015-16 to the current year
Table 2: IRPDs by jurisdiction
Table 3: Index of chart data
An ato_tbl. Monetary values in nominal AUD.
Australian Taxation Office International Related Party Dealings release. Licensed CC BY 2.5 AU.
Organisation for Economic Co-operation and Development (2015). Transfer Pricing Documentation and Country-by-Country Reporting, Action 13: 2015 Final Report. OECD/G20 Base Erosion and Profit Shifting Project, Paris. doi:10.1787/9789264241480-en
Commonwealth of Australia. Income Tax Assessment Act 1997, Subdivision 815-B (Transfer Pricing); Multinational Anti-Avoidance Law (MAAL) and Diverted Profits Tax.
Australian Taxation Office (annual). International Dealings Schedule (IDS) instructions. Reporting framework underlying the IRPD dataset.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  by_jurisdiction <- ato_irpd(year = "2023-24", table = 2)
  head(by_jurisdiction)
})
options(op)
Every call to a data function (ato_individuals(),
ato_companies(), etc.) appends one row to the session
manifest, recording URL, dataset title, CKAN resource and
package IDs where resolvable, SHA-256 of the cached file,
size, retrieval timestamp, and the snapshot pin set via
ato_snapshot(). Duplicate URLs within a session are
deduplicated (last fetch wins).
ato_manifest(format = c("df", "yaml", "json"))
format |
One of |
Attach the output to your paper's appendix, deposit it to
Zenodo with ato_deposit_zenodo() to mint a DOI, or export
with ato_manifest_write() for CI artefacts.
A data frame, YAML string, or JSON string depending on
format.
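The "last fetch wins" deduplication of manifest rows can be illustrated in base R. This is a sketch under assumptions: `dedupe_last()` is a hypothetical helper, and the `url` and `sha256` column names and the example values are invented for the demonstration.

```r
# Hypothetical helper: keep only the most recent row for each URL.
dedupe_last <- function(manifest) {
  keep <- !duplicated(manifest$url, fromLast = TRUE)
  manifest[keep, , drop = FALSE]
}

# Made-up manifest rows; the first and third share a URL
m <- data.frame(
  url = c("https://example.org/a.xlsx",
          "https://example.org/b.xlsx",
          "https://example.org/a.xlsx"),
  sha256 = c("aaa", "bbb", "ccc")
)
deduped <- dedupe_last(m)
deduped
```

`duplicated(..., fromLast = TRUE)` flags the earlier occurrences, so the row recorded last for each URL is the one that survives.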
Other reproducibility:
ato_deposit_zenodo(),
ato_manifest_clear(),
ato_manifest_write(),
ato_sha256(),
ato_snapshot()
op <- options(ato.cache_dir = tempdir())
ato_manifest_clear()
ato_snapshot("2026-04-24")
try(ato_individuals(year = "2022-23"))
ato_manifest()
options(op)
Clear the session manifest
ato_manifest_clear()
Invisibly NULL. Useful at the top of a script when
running repeatedly.
Other reproducibility:
ato_deposit_zenodo(),
ato_manifest(),
ato_manifest_write(),
ato_sha256(),
ato_snapshot()
ato_manifest_clear()
Writes the manifest to a file in the requested format. Call at the end of an analysis script; commit the manifest alongside the paper for full reproducibility.
ato_manifest_write(path, format = c("auto", "csv", "yaml", "json"))
path |
Output file path. Extension determines format if
|
format |
One of |
Invisibly, the absolute path to the written file.
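Inferring the output format from the file extension might look like the sketch below. `resolve_format()` is a hypothetical helper; treating `.yml` as YAML and raising an error for unknown extensions are assumptions, not documented behaviour.

```r
# Hypothetical helper: pick the output format, using the file extension
# when format = "auto".
resolve_format <- function(path, format = c("auto", "csv", "yaml", "json")) {
  format <- match.arg(format)
  if (format != "auto") return(format)
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv = "csv",
    yaml = ,
    yml = "yaml",
    json = "json",
    stop("Cannot infer format from extension: '", ext, "'")
  )
}

resolve_format("manifest.csv")
resolve_format("manifest.txt", format = "json")
```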
Other reproducibility:
ato_deposit_zenodo(),
ato_manifest(),
ato_manifest_clear(),
ato_sha256(),
ato_snapshot()
p <- tempfile(fileext = ".csv")
ato_manifest_clear()
ato_manifest_write(p)
Returns aggregate Medicare Levy and Medicare Levy Surcharge (MLS) data from Taxation Statistics Individuals. The 2% Medicare Levy applies to most taxable income; the MLS is an additional 1.0 to 1.5% levied on high-income earners without adequate private hospital cover. Used in private health insurance reform analysis.
ato_medicare_levy(year = "latest", component = c("levy", "surcharge"))
year |
|
component |
One of |
An ato_tbl.
Australian Taxation Office Taxation Statistics Individuals. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Medicare Levy Act 1986; A New Tax System (Medicare Levy Surcharge – Fringe Benefits) Act 1999.
Productivity Commission (2015). Efficiency in Health. Analysis of Medicare Levy and MLS distributional effects.
Other specialist:
ato_compliance(),
ato_division293(),
ato_fuel_tax_credits(),
ato_international(),
ato_prrt(),
ato_rba(),
ato_state_tax(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(ato_medicare_levy(year = "2022-23", component = "surcharge"))
options(op)
Returns structured metadata for any ATO dataset on data.gov.au: title, notes, licence, last-modified timestamp, resource count, and all resource URLs. Useful for detecting silent updates before clearing the cache, or for auditing what version of data you have.
ato_meta(x)
x |
Either an |
A list with elements:
id: CKAN package slug
title: human-readable title
notes: dataset description (truncated to 400 chars)
licence: licence title
metadata_modified: ISO timestamp of last CKAN update
n_resources: number of downloadable files
resource_urls: character vector of all resource URLs
Other configuration:
ato_cache_info(),
ato_clear_cache()
op <- options(ato.cache_dir = tempdir())
try({
  # By package ID
  m <- ato_meta("taxation-statistics-2022-23")
  m$metadata_modified
  # From an ato_tbl
  tbl <- ato_individuals(year = "2022-23")
  ato_meta(tbl)
})
options(op)
Returns the ATO's Pay As You Go (PAYG) withholding data: employer counts, total withholding amounts, and employee counts by industry and state. Used by researchers studying labour market taxation, wage growth, and employer compliance.
ato_payg(year = "latest")
year |
Income year in |
An ato_tbl. Monetary values in nominal AUD.
Australian Taxation Office PAYG withholding data on data.gov.au. Licensed CC BY 2.5 AU.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  payg <- ato_payg(year = "2022-23")
  head(payg)
})
options(op)
Express an aggregate per capita using ABS ERP
ato_per_capita(x, year, erp = NULL)
x |
Numeric vector of aggregate values (same length as
|
year |
Character vector of financial years. |
erp |
Optional override: data frame with columns
|
Divides the input by Estimated Resident Population at 30 June
of the financial year's end (a stock measure). For flow-style
measures where a mid-year-average population is preferable,
substitute a custom erp = argument. ERP is ABS's preferred
population-denominator concept for per-capita economic
statistics (see cat. 3101.0 methodology).
Numeric vector of per-capita values (same units as x
per person).
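The per-capita calculation is a lookup followed by a division, which the following base R sketch illustrates. It does not use the bundled ABS series: `per_capita()` is a hypothetical helper, the `year` and `erp` column names are assumptions, and the population and aggregate figures are round placeholders rather than official values.

```r
# Hypothetical helper: divide each aggregate by the population for its year.
per_capita <- function(x, year, erp) {
  stopifnot(length(x) == length(year))
  pop <- erp$erp[match(year, erp$year)]
  if (anyNA(pop)) {
    stop("No population value for: ", paste(year[is.na(pop)], collapse = ", "))
  }
  x / pop
}

# Placeholder population figures, NOT official ABS ERP values
erp_demo <- data.frame(year = c("2021-22", "2022-23"), erp = c(25e6, 26e6))
pc <- per_capita(c(250e9, 260e9), c("2021-22", "2022-23"), erp_demo)
pc
```

Supplying a different table in place of `erp_demo` is the same idea as passing a custom `erp =` argument when a mid-year-average denominator is preferred.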
Australian Bureau of Statistics (2024). National, State and Territory Population. Catalogue 3101.0.
Other harmonisation:
ato_crosswalk(),
ato_deflate(),
ato_harmonise(),
ato_reconcile(),
ato_schema_map(),
ato_to_taxstats()
# Income tax per person, 2022-23 FBO headline
ato_per_capita(316.4e9, "2022-23")
Returns Petroleum Resource Rent Tax (PRRT) revenue and assessments. PRRT is a 40% tax on the profits of offshore petroleum projects; revenues are volatile and project-specific. Key dataset for resource-tax reform analysis.
ato_prrt(year = "latest")
year |
|
An ato_tbl.
Australian Taxation Office Taxation Statistics Company. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Petroleum Resource Rent Tax Assessment Act 1987. Enabling legislation for the 40 per cent rent tax on offshore petroleum projects.
Callaghan, M. (2017). Review of the Petroleum Resource Rent Tax. Treasury-commissioned review; reference for PRRT-reform analysis.
Other specialist:
ato_compliance(),
ato_division293(),
ato_fuel_tax_credits(),
ato_international(),
ato_medicare_levy(),
ato_rba(),
ato_state_tax(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(ato_prrt(year = "2022-23"))
options(op)
Pointer to the RBA's H1 series on Commonwealth receipts for long-run time series. The RBA has compiled the series since 1959-60, filling gaps in ATO Taxation Statistics, which start in 1994-95.
ato_rba(series = c("receipts", "income_tax"))
series |
One of |
The RBA publishes H1 as an XLSX file at a stable URL. This function fetches it and returns a tidy tibble.
An ato_tbl.
Reserve Bank of Australia Statistical Tables H1 https://www.rba.gov.au/statistics/tables/.
Other specialist:
ato_compliance(),
ato_division293(),
ato_fuel_tax_credits(),
ato_international(),
ato_medicare_levy(),
ato_prrt(),
ato_state_tax(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(ato_rba(series = "receipts"))
options(op)
Returns the annual "Report of data about Research and Development Tax Incentive entities": claimants, claimed expenditure, refundable and non-refundable tax offsets by industry and company size. Treasury and DISR use this series to evaluate the R&D Tax Incentive programme, which is the largest single element of Australia's business innovation policy (AUD 2 billion+ per year).
ato_rdti(year = "latest")
year |
Income year in |
An ato_tbl with one row per entity (or aggregated
cell, depending on the release schema). Monetary values in
nominal AUD.
Australian Taxation Office Research and Development Tax Incentive report. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Income Tax Assessment Act 1997, Division 355 (Research and Development Tax Incentive).
Department of Industry, Science and Resources and Australian Taxation Office (annual). R&DTI Transparency Report. Jointly administered programme methodology.
Ferris, B., Finkel, A. and Fraser, J. (2016). Review of the R&D Tax Incentive. Australian Government review (the "Three Fs review") informing subsequent programme design.
Organisation for Economic Co-operation and Development (annual). R&D Tax Incentives Database. International comparator data for R&D tax expenditures.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  rdti <- ato_rdti(year = "2022-23")
  head(rdti)
})
options(op)
Compares a scalar (or data frame total) against the published Final Budget Outcome figure for the same year and revenue line. Useful as a sanity check on an ATO Taxation Statistics sum before reporting it in a paper or brief.
ato_reconcile(value, year, measure, sum_column = NULL)
value |
Numeric; the figure to check, in AUD (not AUD
billions). An |
year |
Financial year, e.g. |
measure |
One of the measure codes in |
sum_column |
Column name to sum when |
Discrepancies between ATO Taxation Statistics aggregates and the Final Budget Outcome (FBO) are expected and meaningful:
Taxation Statistics are based on assessments made by a cut-off date (usually October of the following calendar year) and may exclude late-lodging returns.
FBO figures are cash-basis Commonwealth receipts; Taxation Statistics are accrual-basis tax assessed.
GST, excise, and fuel credits have timing and refund effects that further distort the cash-vs-assessment gap.
A 1-3 per cent gap is consistent with the accrual-to-cash
reconciliation Treasury publishes in the FBO statement of
revenues; larger gaps warrant investigation. The bundled
reference totals in inst/extdata/budget_reference_totals.csv
are taken from the relevant FBO release, with the precise table
cited in the source column of each row.
A one-row data frame: measure, year,
value_aud, reference_aud, diff_aud, pct_diff,
source. Emits a warning if abs(pct_diff) > 0.05.
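The arithmetic behind the comparison can be sketched in base R. This is an illustration under assumptions: `reconcile_sketch()` is a hypothetical helper, `pct_diff` is assumed to be a fraction of the reference total (so 0.05 means 5 per cent), and the two totals below are placeholders rather than published figures.

```r
# Hypothetical helper: compare a value with a reference total and warn
# when the relative gap exceeds `threshold`.
reconcile_sketch <- function(value, reference, threshold = 0.05) {
  diff_aud <- value - reference
  pct_diff <- diff_aud / reference
  if (abs(pct_diff) > threshold) {
    warning("Gap of ", round(100 * pct_diff, 1), " per cent exceeds threshold")
  }
  data.frame(
    value_aud = value,
    reference_aud = reference,
    diff_aud = diff_aud,
    pct_diff = pct_diff
  )
}

# Placeholder totals, NOT Final Budget Outcome figures
res <- reconcile_sketch(value = 102e9, reference = 100e9)
res
```

With these placeholder inputs the gap is 2 per cent of the reference, inside the 1-3 per cent range described above, so no warning is raised.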
Commonwealth of Australia (various years). Final Budget Outcome. The Treasury, Canberra. https://budget.gov.au/content/fbo/index.htm
Australian Bureau of Statistics (various years). Taxation Revenue, Australia. Catalogue 5506.0.
Australian Taxation Office (annual). Australian tax gaps – overview, methodology notes on accrual-vs-cash reconciliation.
Other harmonisation:
ato_crosswalk(),
ato_deflate(),
ato_harmonise(),
ato_per_capita(),
ato_schema_map(),
ato_to_taxstats()
ato_reconcile(value = 316.4e9, year = "2022-23", measure = "individuals_income_tax_net")
Convenience accessor for the bundled column-name mapping.
ato_schema_map()
A data frame with columns ato_aggregate and
taxstats_microdata.
Other harmonisation:
ato_crosswalk(),
ato_deflate(),
ato_harmonise(),
ato_per_capita(),
ato_reconcile(),
ato_to_taxstats()
head(ato_schema_map())
Computes a SHA-256 digest via the digest package when it is
available; otherwise falls back to tools::md5sum() plus file
length as a weaker check. For PBO- or Grattan-grade integrity
work, install the digest package (listed in Suggests).
ato_sha256(file)
file |
Path to a local file. |
A length-1 character string (hex digest), or NA if the
file does not exist.
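A digest-with-fallback helper along these lines could be written as below. This is a sketch, not the package's implementation: `file_digest()` is a hypothetical name, and the `md5:` prefix on the fallback string is an assumption added here so the weaker check is never mistaken for a SHA-256.

```r
# Hypothetical helper: SHA-256 via the digest package when installed,
# otherwise MD5 plus byte length as a weaker check.
file_digest <- function(file) {
  if (!file.exists(file)) return(NA_character_)
  if (requireNamespace("digest", quietly = TRUE)) {
    return(digest::digest(file, algo = "sha256", file = TRUE))
  }
  # Fallback is NOT a SHA-256; labelled so callers can tell the difference
  paste0("md5:", unname(tools::md5sum(file)), ":", file.size(file))
}

f <- tempfile()
writeLines("hello", f)
d <- file_digest(f)
d
```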
Other reproducibility:
ato_deposit_zenodo(),
ato_manifest(),
ato_manifest_clear(),
ato_manifest_write(),
ato_snapshot()
f <- tempfile()
writeLines("hello", f)
ato_sha256(f)
Returns the ATO's Small Business Benchmarks: industry-specific performance ranges (cost of sales / turnover, total expenses / turnover, labour / turnover, etc.) derived from small-business income tax returns. Used by the ATO to identify outlier taxpayers, by small-business advisors for comparative analysis, and by tax integrity researchers.
ato_sme_benchmarks(year = "latest")
year |
Income year in |
An ato_tbl with one row per (industry, turnover
band, ratio) combination. Ratios are percentages.
Australian Taxation Office Small Business Benchmarks. Licensed CC BY 2.5 AU.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_tax_gaps(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  bm <- ato_sme_benchmarks(year = "2023-24")
  head(bm)
})
options(op)
Call once at the top of an analysis script to declare the vintage
of ATO data you intend to use. Every subsequent ato_* fetch
records this date in the ato_tbl provenance header, in
ato_manifest() entries, and in ato_cite() output. Combined
with SHA-256 integrity (see ato_sha256() and ato_manifest()),
this gives a reproducible audit trail acceptable for PBO or
Grattan-style published work.
ato_snapshot(date)
date |
ISO |
If called with no arguments, returns the current pin (or NULL
if unset).
Invisibly, the new pinned date (as Date), or NULL.
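The get/set/clear behaviour of a session pin can be modelled with a private environment. The sketch below only illustrates that calling pattern; `snapshot_pin()` and the `.pin` environment are hypothetical, and how the package actually stores the pin is not stated here.

```r
# Hypothetical session-level pin held in a private environment.
.pin <- new.env()

snapshot_pin <- function(date) {
  # No argument: return the current pin (NULL if unset)
  if (missing(date)) return(.pin$date)
  # NULL clears the pin; anything else is stored as a Date
  .pin$date <- if (is.null(date)) NULL else as.Date(date)
  invisible(.pin$date)
}

snapshot_pin("2026-04-24")
current <- snapshot_pin()
snapshot_pin(NULL)
cleared <- snapshot_pin()
```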
Other reproducibility:
ato_deposit_zenodo(),
ato_manifest(),
ato_manifest_clear(),
ato_manifest_write(),
ato_sha256()
ato_snapshot("2026-04-24")
ato_snapshot()
ato_snapshot(NULL)
Fetches the ABS Taxation Revenue collection (cat. 5506.0), which gives land tax, payroll tax, stamp duty, motor vehicle taxes, and other state taxes by jurisdiction. Needed for complete-tax-system analysis alongside ATO Commonwealth data.
ato_state_tax(year = "latest")
year |
|
An ato_tbl.
Australian Bureau of Statistics, Taxation Revenue, catalogue 5506.0 https://www.abs.gov.au/statistics/economy/government/taxation-revenue-australia. Licensed CC BY 4.0.
Other specialist:
ato_compliance(),
ato_division293(),
ato_fuel_tax_credits(),
ato_international(),
ato_medicare_levy(),
ato_prrt(),
ato_rba(),
ato_tax_expenditures(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(ato_state_tax(year = "latest"))
options(op)
Returns Taxation Statistics Super Funds tables or Self-Managed
Superannuation Fund ('SMSF') aggregates, depending on type.
ato_super_funds(year = "latest", type = c("apra", "smsf", "all"))
year |
|
type |
One of |
An ato_tbl.
Australian Taxation Office Taxation Statistics Super Funds tables and SMSF statistical overview. Licensed CC BY 2.5 AU.
Australian Taxation Office (annual). Taxation Statistics: Super funds and SMSF explanatory notes. Distinguishes reporting populations: APRA-regulated large funds, SMSFs, and Pooled Superannuation Trusts.
Australian Prudential Regulation Authority (annual). Annual Superannuation Bulletin. Complementary APRA-regulated fund statistics.
Commonwealth of Australia. Superannuation Industry (Supervision) Act 1993 (SIS Act); Superannuation Guarantee (Administration) Act 1992 (SGAA).
Productivity Commission (2018). Superannuation: Assessing Efficiency and Competitiveness. Inquiry report.
op <- options(ato.cache_dir = tempdir())
try({
  s <- ato_super_funds(year = "2022-23", type = "apra")
  head(s)
})
options(op)
Returns the Treasury Tax Expenditures and Insights Statement (TEIS) annual table of concession-by-concession tax expenditure estimates in AUD millions. TEIS is the authoritative cost-of-concessions dataset used in PBO and Grattan tax reform costings.
ato_tax_expenditures(year = "latest")
year |
Reference year for the TEIS release, e.g. |
TEIS is published by Treasury, not ATO; the function attempts a CKAN search on data.gov.au for the TEIS release, and falls back to the Treasury web URL if not indexed.
Key concessions covered: CGT main residence exemption, CGT 50% discount, superannuation earnings tax concession, franking credit refundability, work-related deductions, fuel tax credit scheme, R&D tax incentive, GST food exemption, and many more.
An ato_tbl with one row per tax expenditure: label,
category, estimated revenue forgone in AUD millions by year.
Treasury Tax Expenditures and Insights Statement https://treasury.gov.au/publication/p2025-721342.
Commonwealth of Australia (annual). Tax Expenditures and Insights Statement. The Treasury, Canberra. https://treasury.gov.au/publication/p2025-721342
Other specialist:
ato_compliance(),
ato_division293(),
ato_fuel_tax_credits(),
ato_international(),
ato_medicare_levy(),
ato_prrt(),
ato_rba(),
ato_state_tax(),
ato_whm()
op <- options(ato.cache_dir = tempdir())
try(head(ato_tax_expenditures("latest")))
options(op)
Returns the ATO's annual Tax Gap publication: estimates of the difference between the tax theoretically payable under current law and the tax actually collected, across each tax type and taxpayer population (individuals not in business, small business, large corporate, GST, excise, fuel tax credits, PRRT, superannuation guarantee).
ato_tax_gaps(sheet = 1)
sheet |
Optional sheet name or index. The workbook
contains separate sheets for each tax-gap population (e.g.
"Large corporate", "Small business", "Individuals"). Pass
the sheet name to extract a specific population. |
The Tax Gap series is used by Treasury (every MYEFO), the Parliamentary Budget Office, and academic researchers as the headline measure of revenue integrity.
An ato_tbl. Tax-gap estimates are in nominal AUD
millions of the reporting year and typically accompanied by
a percentage-gap column.
Australian Taxation Office Tax Gaps publication, CC BY 2.5 AU.
Australian Taxation Office (annual). Australian tax gaps – overview. Methodology notes on bottom-up, top-down, and random-inquiry approaches to the tax-gap estimation.
HMRC (annual). Measuring tax gaps. Sister methodology paper applied by HM Revenue and Customs in the UK; the ATO series was partly inspired by this literature.
Organisation for Economic Co-operation and Development (2017). Shining Light on the Shadow Economy: Opportunities and Threats. Paris. Synthesises tax-gap measurement practice across OECD member countries.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_top_taxpayers(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  gaps <- ato_tax_gaps()
  head(gaps)
})
options(op)
Takes an ato_tbl with aggregate column names (produced by any
ato_* function) and renames columns to match the taxstats
(or taxstats2) 2% microdata sample schema used by Hugh
Parsonage's DRAT package. Enables consistent variable
definitions when moving between aggregate views and microdata
prototyping.
ato_to_taxstats(df, direction = c("to_taxstats", "from_taxstats"))
df |
An |
direction |
|
The bundled schema map (ato_schema_map()) mirrors the
column names from Parsonage's taxstats and taxstats2
packages, which in turn use the ATO Individual Sample File
variable names. Because taxstats is DRAT-distributed and not
on CRAN, this function imposes the mapping as a static table
rather than programmatically introspecting the taxstats
namespace. Re-check the bundled map against the taxstats
NAMESPACE when the ATO publishes a revised Sample File schema.
Unknown columns pass through unchanged. Use
ato_harmonise first if the input panel has drift
in source column names.
A data frame with renamed columns. ato_tbl class and
provenance attributes preserved.
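Renaming through a static two-column map, with unknown columns passing through, can be sketched in base R. The map column names `ato_aggregate` and `taxstats_microdata` follow the `ato_schema_map()` description; `rename_by_map()` and the single map entry below are illustrative assumptions, not the bundled map.

```r
# Hypothetical helper: rename columns found in the map, leave the rest.
rename_by_map <- function(df, map,
                          from = "ato_aggregate",
                          to = "taxstats_microdata") {
  idx <- match(names(df), map[[from]])
  hit <- !is.na(idx)
  names(df)[hit] <- map[[to]][idx[hit]]
  df
}

# Illustrative one-row map; the real bundled map may differ
demo_map <- data.frame(
  ato_aggregate = "taxable_income",
  taxstats_microdata = "Taxable_Income"
)
df <- data.frame(postcode = "2000", taxable_income = 80000)
renamed <- rename_by_map(df, demo_map)
names(renamed)
```

Swapping the `from` and `to` arguments gives the reverse direction. Because only `names()` is modified, other attributes on the data frame are left in place.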
Parsonage, H. (2019). taxstats: 2 per cent Individual Sample File from the Australian Taxation Office. R package (DRAT). https://github.com/HughParsonage/taxstats
Parsonage, H. (2024). grattan: Perform Common Quantitative Tasks for Australian Analysts. R package version 2026.1.1. https://cran.r-project.org/package=grattan
Australian Taxation Office (2024). Taxation Statistics: Individual Sample File documentation.
Other harmonisation:
ato_crosswalk(),
ato_deflate(),
ato_harmonise(),
ato_per_capita(),
ato_reconcile(),
ato_schema_map()
df <- data.frame(postcode = "2000", taxable_income = 80000, medicare_levy = 1600)
ato_to_taxstats(df)
Returns the ATO's annual Corporate Tax Transparency release, mandated by Part 5-25 of the Taxation Administration Act 1953. Covers every Australian public company, foreign-owned company, or Australian-owned private company above the AUD 100 million total-income threshold (the private-company threshold was lowered from AUD 200 million to AUD 100 million for the 2022-23 income year onwards, making all three categories uniform). The 2023-24 release was published 1 October 2025 and covered 4,110 entities.
ato_top_taxpayers(
  year = "latest",
  entity_type = c("all", "public", "private", "foreign"),
  sheet = c("income_tax", "prrt")
)
year |
|
entity_type |
One of |
sheet |
One of |
The underlying XLSX has three sheets:
Information (cover/metadata, ~7 rows).
Income tax details (the headline dataset, ~4,000 entities: total income, taxable income, tax payable).
PRRT details (petroleum resource rent tax filers, typically 10-20 entities).
Licensed under CC BY 3.0 Australia (the Corporate Tax Transparency and Voluntary Tax Transparency Code releases use CC BY 3.0 AU; most other Taxation Statistics use CC BY 2.5 AU).
An ato_tbl with one row per disclosed entity. All
monetary values are nominal AUD of the reporting year.
Australian Taxation Office Corporate Tax Transparency release. Licensed CC BY 3.0 AU.
Commonwealth of Australia. Taxation Administration Act 1953, Part 5-25 (Corporate Tax Transparency).
Australian Taxation Office (annual). Report of entity tax information. The statutory Corporate Tax Transparency release.
Commonwealth Treasury (2013). Improving the transparency of Australia's business tax system: Exposure draft explanatory memorandum. Rationale for the Part 5-25 regime.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_vttc()
op <- options(ato.cache_dir = tempdir())
try({
  top <- ato_top_taxpayers(year = "2023-24")
  head(top)
  # Petroleum resource rent tax sheet
  prrt <- ato_top_taxpayers(year = "2023-24", sheet = "prrt")
  head(prrt)
})
options(op)
Returns the ATO's Voluntary Tax Transparency Code (VTTC) disclosures: large private companies that voluntarily publish tax information beyond the Corporate Tax Transparency mandate. Covers total income, taxable income, tax payable, and effective tax rate for each disclosing entity.
ato_vttc(year = "latest")
year |
Income year in |
The VTTC complements ato_top_taxpayers() (which covers
mandatory CTT disclosures for entities above AUD 100m total
income). VTTC signatories may be below or above the CTT
threshold.
Licensed under CC BY 3.0 Australia (same as CTT data).
An ato_tbl. Monetary values in nominal AUD.
Australian Taxation Office Voluntary Tax Transparency Code disclosures on data.gov.au. Licensed CC BY 3.0 AU.
Other discovery:
ato_catalog(),
ato_charities(),
ato_cite(),
ato_download(),
ato_excise(),
ato_fbt(),
ato_help(),
ato_irpd(),
ato_payg(),
ato_rdti(),
ato_sme_benchmarks(),
ato_tax_gaps(),
ato_top_taxpayers()
op <- options(ato.cache_dir = tempdir())
try({
  vttc <- ato_vttc(year = "2022-23")
  head(vttc)
})
options(op)
Returns aggregate Working Holiday Maker tax data: number of backpackers, total earnings, tax paid. Relevant for migration and labour-market policy analysis.
ato_whm(year = "latest")
year |
|
An ato_tbl.
Australian Taxation Office Taxation Statistics. Licensed CC BY 2.5 AU.
Commonwealth of Australia. Migration Act 1958, visa subclasses 417 and 462; Working Holiday Maker Reform Act 2016. Establishes the 15 per cent flat tax rate from the first dollar of WHM earnings.
Productivity Commission (2016). Migrant Intake into Australia. Includes WHM labour-market analysis.
Other specialist:
ato_compliance(),
ato_division293(),
ato_fuel_tax_credits(),
ato_international(),
ato_medicare_levy(),
ato_prrt(),
ato_rba(),
ato_state_tax(),
ato_tax_expenditures()
op <- options(ato.cache_dir = tempdir())
try(ato_whm(year = "2022-23"))
options(op)
Prints a provenance header (title, source, licence, retrieval time, dimensions) followed by the data frame.
## S3 method for class 'ato_tbl'
print(x, ...)
x |
An |
... |
Passed to the next print method. |
Invisibly returns x.
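A provenance-header printer of this shape can be sketched as follows. It is not the package's method: `print_provenance()` is a hypothetical stand-in, the header layout is an assumption, and the attribute names are taken from the example for this help page.

```r
# Hypothetical stand-in for the print method: header first, then the data.
print_provenance <- function(x, ...) {
  cat("# Title:     ", attr(x, "ato_title"), "\n", sep = "")
  cat("# Source:    ", attr(x, "ato_source"), "\n", sep = "")
  cat("# Licence:   ", attr(x, "ato_licence"), "\n", sep = "")
  cat("# Retrieved: ", format(attr(x, "ato_retrieved")), "\n", sep = "")
  cat("# Dim:       ", nrow(x), " x ", ncol(x), "\n", sep = "")
  # Delegate the body to the standard data frame printer
  print.data.frame(x, ...)
  invisible(x)
}

demo <- structure(
  data.frame(postcode = "2000", taxable_income = 82000),
  ato_title = "Demo",
  ato_source = "https://data.gov.au",
  ato_licence = "CC BY 2.5 AU",
  ato_retrieved = Sys.time(),
  class = c("ato_tbl", "data.frame")
)
print_provenance(demo)
```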
x <- data.frame(postcode = "2000", taxable_income = 82000)
x <- structure(x,
  ato_title = "Demo",
  ato_source = "https://data.gov.au",
  ato_licence = "CC BY 2.5 AU",
  ato_retrieved = Sys.time(),
  class = c("ato_tbl", "data.frame"))
print(x)