| Title: | Inequality Measurement, Decomposition, and Poverty Analysis |
|---|---|
| Description: | Tools for measuring income and wealth inequality. Computes the Gini coefficient with bootstrap or asymptotic confidence intervals following Davidson (2009) <doi:10.1016/j.jeconom.2008.11.004>, the extended S-Gini family, Theil T and L indices (generalised entropy family), the Atkinson index, the Kolm absolute inequality index, Palma ratio, Hoover index, percentile ratios, and Lorenz curves. Supports between-within group decomposition following Bourguignon (1979) <doi:10.2307/1914138>, income share tabulation, concentration indices for health inequality with Erreygers (2009) correction, Kakwani tax progressivity and Reynolds-Smolensky redistribution indices, Foster-Greer-Thorbecke poverty measures including the Sen index, growth incidence curves following Ravallion and Chen (2003) <doi:10.1016/S0165-1765(02)00205-7>, and Wolfson polarisation. All functions accept optional survey weights and work with data from any source. |
| Authors: | Charles Coverdale [aut, cre] |
| Maintainer: | Charles Coverdale <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-30 15:06:53 UTC |
| Source: | https://github.com/charlescoverdale/inequality |
Computes the Atkinson inequality index, which incorporates an explicit normative judgement about inequality aversion through the parameter epsilon. Higher epsilon gives more weight to transfers at the bottom of the distribution.
    iq_atkinson(x, weights = NULL, epsilon = 0.5, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95)
| Argument | Description |
|---|---|
| x | Numeric vector of incomes (strictly positive). |
| weights | Optional numeric vector of survey weights. |
| epsilon | Numeric. Inequality aversion parameter (> 0). Default 0.5. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
The Atkinson index involves either a power transformation
x^(1 - epsilon) or log(x) (when epsilon = 1) and so requires
strictly positive values. Use the Gini, S-Gini, or Kolm index for
distributions that include zeros or negatives.
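As a rough illustration of the two cases described above, the following base R sketch computes an unweighted Atkinson index from the equally distributed equivalent income. The helper name atkinson_sketch is hypothetical and is not part of the package; the package's own weighting and bootstrap logic is not reproduced here.

```r
# Hypothetical helper, unweighted; not the package implementation.
atkinson_sketch <- function(x, epsilon = 0.5) {
  stopifnot(is.numeric(x), all(x > 0), epsilon > 0)
  mu <- mean(x)
  # Equally distributed equivalent income (EDE)
  ede <- if (epsilon == 1) {
    exp(mean(log(x)))                          # geometric mean
  } else {
    mean(x^(1 - epsilon))^(1 / (1 - epsilon))  # power mean
  }
  1 - ede / mu
}

atkinson_sketch(c(1, 4), epsilon = 1)    # 1 - 2 / 2.5 = 0.2
atkinson_sketch(c(1, 4), epsilon = 0.5)  # 1 - 2.25 / 2.5 = 0.1
```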
An S3 object of class "iq_atkinson" with elements:
Numeric. The Atkinson index (0 to 1).
Numeric. The inequality aversion parameter used.
Numeric. The equally distributed equivalent income.
Numeric. The mean income.
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
Atkinson, A. B. (1970). "On the Measurement of Inequality." Journal of Economic Theory, 2(3), 244–263.
Biewen, M. and Jenkins, S. P. (2006). "Variance Estimation for Generalized Entropy and Atkinson Inequality Indices: The Complex Survey Data Case." Oxford Bulletin of Economics and Statistics, 68(3), 371–383.
    d <- iq_sample_data("income")
    # Moderate inequality aversion
    iq_atkinson(d$income, epsilon = 0.5)
    # With bootstrap CIs
    iq_atkinson(d$income, epsilon = 0.5, ci = TRUE, R = 200)
    # High inequality aversion
    iq_atkinson(d$income, epsilon = 1)
    # Very high inequality aversion
    iq_atkinson(d$income, epsilon = 2)
Computes all major inequality indices on the same data and returns a summary table for easy comparison.
    iq_compare(x, weights = NULL, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95, negatives = c("error", "keep"))
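| Argument | Description |
|---|---|
| x | Numeric vector of incomes (strictly positive by default; see negatives). |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap CIs for every measure in the table? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
| negatives | Character. Either "error" (default; requires strictly positive values) or "keep" (permits zero or negative values). |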
When ci = TRUE the function runs a single bootstrap loop, recomputing
every measure on each resample. This is far cheaper than calling each
measure with its own ci = TRUE and produces a CI for every row of the
table.
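The shared-loop idea can be sketched in base R as below. This is only an illustration under stated assumptions: boot_table_sketch is a hypothetical helper, the two statistics are stand-ins chosen for brevity, and percentile intervals are assumed because the text does not say which bootstrap interval the package uses.

```r
# Hypothetical helper: one resampling loop shared by several statistics.
boot_table_sketch <- function(x, stats, R = 200L, level = 0.95) {
  point <- vapply(stats, function(f) f(x), numeric(1))
  # Each column holds every statistic recomputed on one resample
  draws <- matrix(
    replicate(R, {
      xb <- sample(x, replace = TRUE)
      vapply(stats, function(f) f(xb), numeric(1))
    }),
    nrow = length(stats)
  )
  a <- (1 - level) / 2
  # Percentile interval per statistic (an assumption, see lead-in)
  ci <- apply(draws, 1, quantile, probs = c(a, 1 - a), names = FALSE)
  data.frame(measure = names(stats), value = unname(point),
             ci_lower = ci[1, ], ci_upper = ci[2, ])
}

set.seed(1)
x <- rlnorm(300, meanlog = 10.5, sdlog = 0.8)
stats <- list(
  cv     = function(v) sd(v) / mean(v),
  p90p10 = function(v) unname(quantile(v, 0.9) / quantile(v, 0.1))
)
res <- boot_table_sketch(x, stats, R = 200L)
res
```

Each resample is drawn once and reused for every row, so the cost of sorting and sampling is not repeated per measure.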
By default iq_compare() requires strictly positive values because the
Theil and Atkinson rows are mathematically undefined at zero or below.
Pass negatives = "keep" to permit zero or negative values: the
Theil and Atkinson rows are returned as NA in that case, while the
Gini, S-Gini, Kolm, Wolfson, Palma, Hoover and percentile-ratio rows
are computed using the formulas appropriate for that input.
An S3 object of class "iq_comparison" with elements:
data.frame with columns measure, value, and (when
ci = TRUE) ci_lower and ci_upper.
Integer. Number of observations.
Numeric or NULL. Confidence level.
    d <- iq_sample_data("income")
    iq_compare(d$income)
    # CIs for every measure in the table (one bootstrap loop, all rows)
    iq_compare(d$income, ci = TRUE, R = 200)
    # Wealth distributions can include negatives
    wealth <- c(-5000, 0, 5000, 20000, 80000, 250000, 1e6)
    iq_compare(wealth, negatives = "keep")
Computes the concentration index, which measures inequality in a health (or other) variable across the income distribution. Unlike the Gini coefficient, the ranking variable and the outcome variable are different.
    iq_concentration(x, rank, weights = NULL, correction = c("none", "erreygers", "wagstaff"), bounds = c(0, 1), na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95)
| Argument | Description |
|---|---|
| x | Numeric vector of outcome values (e.g. health expenditure). |
| rank | Numeric vector of ranking values (e.g. income). Must be the same length as x. |
| weights | Optional numeric vector of survey weights. |
| correction | Character. Correction for bounded outcomes: "none" (default), "erreygers", or "wagstaff". |
| bounds | Numeric vector of length 2 giving the lower and upper bounds of x. Default c(0, 1). |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
A positive value indicates the outcome is concentrated among the better-off; a negative value indicates concentration among the worse-off.
For bounded variables (e.g. binary health indicators), the standard concentration index has bounds that depend on the mean. Two corrections are available:
correction = "erreygers": the Erreygers (2009) corrected index,
E = 4 * mu / (b - a) * C, which has fixed bounds of -1 to 1.
correction = "wagstaff": the Wagstaff (2005) normalised index,
W = C / (1 - mu / b) for variables bounded above at b, which is
the standard normalisation in much of the health-economics
literature.
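The two corrections can be checked by hand with a small base R sketch. The helpers below are hypothetical and unweighted, use fractional ranks (i - 0.5) / n, and ignore ties in the ranking variable; the package may handle these details differently.

```r
# Hypothetical helpers; unweighted, ties in the ranking variable ignored.
conc_sketch <- function(x, rank_var) {
  n <- length(x)
  x_sorted <- x[order(rank_var)]
  frac_rank <- (seq_len(n) - 0.5) / n
  2 * sum(x_sorted * frac_rank) / (n * mean(x)) - 1
}
# E = 4 * mu / (b - a) * C
erreygers_sketch <- function(x, rank_var, bounds = c(0, 1)) {
  4 * mean(x) / (bounds[2] - bounds[1]) * conc_sketch(x, rank_var)
}
# W = C / (1 - mu / b), lower bound assumed to be zero
wagstaff_sketch <- function(x, rank_var, upper = 1) {
  conc_sketch(x, rank_var) / (1 - mean(x) / upper)
}

income <- c(10, 20, 30, 40)
sick   <- c(1, 1, 0, 0)          # illness only among the two poorest
conc_sketch(sick, income)        # -0.5
erreygers_sketch(sick, income)   # -1
wagstaff_sketch(sick, income)    # -1
```

In this toy case the uncorrected index stops at -0.5 because the mean of the binary outcome is 0.5, while both corrected indices reach their lower bound of -1.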
An S3 object of class "iq_concentration" with elements:
Numeric. The concentration index.
Character. The correction applied.
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
Wagstaff, A., Paci, P. and van Doorslaer, E. (1991). "On the Measurement of Inequalities in Health." Social Science and Medicine, 33(5), 545–557.
Erreygers, G. (2009). "Correcting the Concentration Index." Journal of Health Economics, 28(2), 504–515.
Wagstaff, A. (2005). "The Bounds of the Concentration Index when the Variable of Interest is Binary, with an Application to Immunization Inequality." Health Economics, 14(4), 429–432.
    set.seed(1)
    income <- rlnorm(200, 10, 0.8)
    health_exp <- income * 0.05 + rnorm(200, 500, 100)
    iq_concentration(health_exp, rank = income)
    # With bootstrap CIs
    iq_concentration(health_exp, rank = income, ci = TRUE, R = 200)
    # Binary outcome with Erreygers correction
    sick <- as.numeric(income < median(income)) + rbinom(200, 1, 0.1)
    sick <- pmin(sick, 1)
    iq_concentration(sick, rank = income, correction = "erreygers")
Decomposes a generalised entropy index into a between-group component (inequality due to differences in group means) and a within-group component (inequality within each group). The decomposition is exact: between + within = total.
    iq_decompose(x, group, weights = NULL, index = "T", na.rm = FALSE)
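| Argument | Description |
|---|---|
| x | Numeric vector of incomes (strictly positive). |
| group | Factor or character vector identifying group membership. |
| weights | Optional numeric vector of survey weights. |
| index | Character or numeric. "T" for Theil T (GE(1)), "L" for Theil L (GE(0)), or a numeric alpha. Default "T". |
| na.rm | Logical. Remove NA values? Default FALSE. |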
An S3 object of class "iq_decomposition" with elements:
Numeric. The total GE index.
Numeric. The between-group component.
Numeric. The within-group component.
data.frame with columns group, n, mean_income,
pop_share, income_share, within_ge.
Character. Name of the index used.
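The identity between + within = total can be verified by hand for the Theil T case. The sketch below is unweighted base R with hypothetical helper names; for Theil T the within-group terms are weighted by each group's income share.

```r
# Hypothetical helpers; unweighted Theil T only.
theil_t_sketch <- function(x) {
  r <- x / mean(x)
  mean(r * log(r))
}
theil_decompose_sketch <- function(x, group) {
  parts     <- split(x, group)
  inc_share <- vapply(parts, sum, numeric(1)) / sum(x)
  grp_mean  <- vapply(parts, mean, numeric(1))
  # Within: income-share-weighted sum of each group's own Theil T
  within  <- sum(inc_share * vapply(parts, theil_t_sketch, numeric(1)))
  # Between: Theil T if everyone received their group mean
  between <- sum(inc_share * log(grp_mean / mean(x)))
  c(total = theil_t_sketch(x), between = between, within = within)
}

x     <- c(10, 20, 30, 100, 200, 300)
group <- rep(c("A", "B"), each = 3)
dec   <- theil_decompose_sketch(x, group)
dec
```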
Bourguignon, F. (1979). "Decomposable Income Inequality Measures." Econometrica, 47(4), 901–920.
    d <- iq_sample_data("grouped")
    iq_decompose(d$income, d$group)
Computes the Gini coefficient of a distribution, with optional survey weights and confidence intervals (bootstrap or asymptotic).
    iq_gini(x, weights = NULL, na.rm = FALSE, ci = FALSE, method = c("bootstrap", "asymptotic"), R = 1000L, level = 0.95, negatives = c("error", "keep"), normalised = FALSE)
| Argument | Description |
|---|---|
| x | Numeric vector of incomes or values. |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute confidence intervals? Default FALSE. |
| method | Character. CI method: "bootstrap" (default) or "asymptotic". |
| R | Integer. Number of bootstrap replicates (ignored for asymptotic). Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
| negatives | Character. Either "error" (default) or "keep" (permits negative values). |
| normalised | Logical. Use the Raffinetti, Siletti and Vernizzi (2017) normalised Gini? Default FALSE. |
For a strictly non-negative distribution the Gini ranges from 0 (perfect equality) to 1 (perfect inequality) and equals twice the area between the Lorenz curve and the 45-degree line.
Following feedback from Cowell and Flachaire (personal communication, 2026)
the package permits negative values via negatives = "keep". Two policies
are then available:
normalised = FALSE (default): the standard formula is applied. With
negatives present the index is no longer bounded in the unit interval.
When the population mean is non-positive the Gini has no inequality
interpretation and the function returns NA with a warning.
normalised = TRUE: the Raffinetti, Siletti and Vernizzi (2017)
normalised Gini, which rescales the index back into the unit interval
for distributions containing negatives. The denominator is replaced
by mean(|x|) so the index is well-defined whenever any observation
is non-zero.
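A minimal unweighted sketch of the two policies, assuming the normalisation works exactly as described above (the mean in the denominator is swapped for mean(|x|)). The helper gini_sketch is hypothetical and omits the NA-with-warning handling for non-positive means.

```r
# Hypothetical helper; unweighted, no small-sample correction.
gini_sketch <- function(x, normalised = FALSE) {
  xs <- sort(x)
  n  <- length(xs)
  denom <- if (normalised) mean(abs(x)) else mean(x)
  # Sorted-rank identity for the mean absolute difference
  sum((2 * seq_len(n) - n - 1) * xs) / (n^2 * denom)
}

gini_sketch(1:4)           # 0.25
gini_sketch(rep(100, 50))  # 0

wealth <- c(-5000, -1000, 0, 5000, 20000, 80000, 250000)
g_std  <- gini_sketch(wealth)
g_norm <- gini_sketch(wealth, normalised = TRUE)
c(standard = g_std, normalised = g_norm)
```

Because mean(|x|) is at least as large as the mean whenever negatives are present, the normalised value is never above the standard one for data with a positive mean.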
An S3 object of class "iq_gini" with elements:
Numeric. The Gini coefficient (or NA when undefined).
Integer. Number of observations.
Numeric or NULL. Standard error.
Numeric or NULL. Lower bound of the CI.
Numeric or NULL. Upper bound of the CI.
Numeric or NULL. Confidence level.
Character or NULL. CI method used.
Logical. Whether the input contained negatives.
Logical. Whether the Raffinetti et al. normalisation was applied.
Gini, C. (1912). "Variabilita e mutabilita." Reprinted in Memorie di metodologica statistica (Ed. Pizetti E, Salvemini, T). Rome: Libreria Eredi Virgilio Veschi.
Davidson, R. (2009). "Reliable Inference for the Gini Index." Journal of Econometrics, 150(1), 30–40.
Raffinetti, E., Siletti, E. and Vernizzi, A. (2017). "Analyzing the Effects of Negative and Non-negative Values on Income Inequality: Evidence from the Survey of Household Income and Wealth of the Bank of Italy (2012)." Social Indicators Research, 133(1), 185–207.
    d <- iq_sample_data("income")
    iq_gini(d$income)
    # Bootstrap CIs
    iq_gini(d$income, ci = TRUE, R = 500)
    # Asymptotic CIs (faster for large samples)
    iq_gini(d$income, ci = TRUE, method = "asymptotic")
    # Wealth distributions can include negative net worth
    wealth <- c(-5000, -1000, 0, 5000, 20000, 80000, 250000)
    iq_gini(wealth, negatives = "keep")
    # Same data with the Raffinetti et al. (2017) normalisation
    iq_gini(wealth, negatives = "keep", normalised = TRUE)
    # Perfect equality
    iq_gini(rep(100, 50))
Computes the growth incidence curve (GIC), showing the annualised or total growth rate at each quantile of the distribution between two time periods.
    iq_growth_incidence(x_t0, x_t1, weights_t0 = NULL, weights_t1 = NULL, n_quantiles = 20L, na.rm = FALSE)
| Argument | Description |
|---|---|
| x_t0 | Numeric vector of incomes in period 0. |
| x_t1 | Numeric vector of incomes in period 1. Must be the same length as x_t0. |
| weights_t0 | Optional weights for period 0. |
| weights_t1 | Optional weights for period 1. |
| n_quantiles | Integer. Number of quantile bins. Default 20. |
| na.rm | Logical. Remove NA values? Default FALSE. |
If the GIC is upward-sloping, the rich grew faster and inequality increased. If downward-sloping, growth was pro-poor.
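A growth incidence curve can be sketched in base R by comparing the same quantiles in both periods. The helper gic_sketch is hypothetical, unweighted, and reports total (not annualised) growth at bin midpoints; the package's quantile definition may differ.

```r
# Hypothetical helper; unweighted, total growth between the two periods.
gic_sketch <- function(x_t0, x_t1, n_quantiles = 20L) {
  p  <- (seq_len(n_quantiles) - 0.5) / n_quantiles  # bin midpoints
  q0 <- quantile(x_t0, p, names = FALSE)
  q1 <- quantile(x_t1, p, names = FALSE)
  data.frame(quantile = p, growth = q1 / q0 - 1)
}

x0 <- seq(100, 10000, by = 100)
# Uniform 10 percent growth gives a flat curve
flat <- gic_sketch(x0, 1.1 * x0)
# Faster growth at the top gives an upward-sloping curve
steep <- gic_sketch(x0, x0 * (1 + 0.2 * rank(x0) / length(x0)))
head(steep)
```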
An S3 object of class "iq_growth_incidence" with elements:
data.frame with columns quantile (midpoint), growth
(proportional growth rate at that quantile).
Numeric. Mean growth across all quantiles.
Numeric. Median growth rate.
Integer.
Ravallion, M. and Chen, S. (2003). "Measuring Pro-Poor Growth." Economics Letters, 78(1), 93–99.
    d <- iq_sample_data("panel")
    gic <- iq_growth_incidence(d$income_t0, d$income_t1)
    plot(gic)
Computes the Hoover index, also known as the Robin Hood index or the Schutz coefficient. It equals the share of total income that would need to be redistributed to achieve perfect equality, or equivalently, half the mean absolute deviation divided by the mean.
    iq_hoover(x, weights = NULL, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95, negatives = c("error", "keep"))
| Argument | Description |
|---|---|
| x | Numeric vector of incomes. |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
| negatives | Character. How to treat negative values: "error" or "keep". |
An S3 object of class "iq_hoover" with elements:
Numeric. The Hoover index (0 to 1 with non-negative input).
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
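The definition as half the mean absolute deviation divided by the mean translates directly into base R. The helper hoover_sketch is hypothetical and unweighted.

```r
# Hypothetical helper; unweighted.
hoover_sketch <- function(x) {
  sum(abs(x - mean(x))) / (2 * sum(x))
}

hoover_sketch(rep(100, 50))      # 0
# One person holds everything: 75 percent must move to equalise four people
hoover_sketch(c(0, 0, 0, 100))   # 0.75
```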
    d <- iq_sample_data("income")
    iq_hoover(d$income)
    # With bootstrap CIs
    iq_hoover(d$income, ci = TRUE, R = 200)
    # Perfect equality
    iq_hoover(rep(100, 50))
Measures the progressivity of a tax or transfer system. A positive value indicates progressivity (the rich pay a larger share than their income share); a negative value indicates regressivity. Zero means proportional.
    iq_kakwani(pre_tax, tax, weights = NULL, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95, negatives = c("error", "keep"))
| Argument | Description |
|---|---|
| pre_tax | Numeric vector of pre-tax incomes (non-negative by default). |
| tax | Numeric vector of tax payments (same length as pre_tax). |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals on the Kakwani and Reynolds-Smolensky indices? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
| negatives | Character. How to treat negative values: "error" or "keep". |
The Kakwani index equals the concentration coefficient of the tax minus the pre-tax Gini coefficient: K = C_T - G_pre.
The post-tax Gini is computed on pre_tax - tax directly. Households
whose post-tax income is negative are kept as-is, so the post-tax Gini
may exceed 1 in distributions with extreme tax burdens. Pass
negatives = "error" to abort on negative pre-tax incomes.
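The relationships K = C_T - G_pre and Reynolds-Smolensky = G_pre - G_post can be illustrated with a small unweighted base R sketch. The helper names are hypothetical, taxes are ranked by pre-tax income, and the post-tax Gini is computed on pre_tax - tax ranked by itself, as described above.

```r
# Hypothetical helpers; unweighted, ties ignored.
conc_sketch <- function(y, rank_var) {
  n <- length(y)
  r <- (seq_len(n) - 0.5) / n
  2 * sum(y[order(rank_var)] * r) / (n * mean(y)) - 1
}
kakwani_sketch <- function(pre_tax, tax) {
  g_pre  <- conc_sketch(pre_tax, pre_tax)  # Gini = concentration on own rank
  c_tax  <- conc_sketch(tax, pre_tax)      # taxes ranked by pre-tax income
  post   <- pre_tax - tax
  g_post <- conc_sketch(post, post)
  c(kakwani = c_tax - g_pre, reynolds_smolensky = g_pre - g_post)
}

pre <- c(10, 20, 30, 40)
prop <- kakwani_sketch(pre, 0.2 * pre)        # proportional tax: both near 0
prog <- kakwani_sketch(pre, c(0, 2, 6, 12))   # progressive tax: both positive
prog
```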
An S3 object of class "iq_kakwani" with elements:
Numeric. The Kakwani index (-1 to 1).
Numeric. The pre-tax Gini coefficient.
Numeric. The concentration coefficient of taxes.
Numeric. The Reynolds-Smolensky index (pre-tax Gini minus post-tax Gini).
Numeric. The post-tax Gini coefficient.
Numeric. Average effective tax rate.
Integer. Number of observations.
Lists with lower and upper (or NULL).
Kakwani, N. C. (1977). "Measurement of Tax Progressivity: An International Comparison." The Economic Journal, 87(345), 71–80.
Reynolds, M. and Smolensky, E. (1977). Public Expenditures, Taxes, and the Distribution of Income. New York: Academic Press.
    set.seed(1)
    pre <- iq_sample_data("income")$income
    # Progressive tax: higher rate for higher incomes
    tax <- pre * (0.10 + 0.15 * (pre / max(pre)))
    iq_kakwani(pre, tax)
    # With bootstrap CIs
    iq_kakwani(pre, tax, ci = TRUE, R = 200)
Computes the Kolm index, an absolute (translation-invariant) inequality measure: adding the same amount to every income leaves the index unchanged. The other inequality indices in this package are relative (scale-invariant): multiplying every income by the same factor leaves them unchanged.
    iq_kolm(x, weights = NULL, alpha = 1, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95)
| Argument | Description |
|---|---|
| x | Numeric vector of incomes. |
| weights | Optional numeric vector of survey weights. |
| alpha | Numeric. Inequality aversion parameter (> 0). Default 1. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
Higher alpha gives more weight to inequality at the bottom of the distribution. The index is always non-negative and equals zero only under perfect equality. The Kolm index is well-defined for any real values, including negatives.
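Translation invariance is easy to verify with a small base R sketch of the usual Kolm formula, (1 / alpha) * log(mean(exp(alpha * (mean(x) - x)))). The helper kolm_sketch is hypothetical and unweighted; it subtracts the largest exponent before exponentiating, which is a numerical-stability choice of this sketch rather than something the text attributes to the package.

```r
# Hypothetical helper; unweighted, with a log-mean-exp guard against overflow.
kolm_sketch <- function(x, alpha = 1) {
  stopifnot(alpha > 0)
  z <- alpha * (mean(x) - x)
  m <- max(z)
  (m + log(mean(exp(z - m)))) / alpha
}

x <- c(1, 2, 3)
kolm_sketch(x)          # some positive value
kolm_sketch(x + 1000)   # same value: adding a constant changes nothing
kolm_sketch(rep(5, 10)) # 0 under perfect equality
```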
An S3 object of class "iq_kolm" with elements:
Numeric. The Kolm index.
Numeric. The inequality aversion parameter used.
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
Kolm, S.-C. (1976). "Unequal Inequalities II." Journal of Economic Theory, 13(1), 82–111.
    d <- iq_sample_data("income")
    iq_kolm(d$income, alpha = 1)
    # With bootstrap CIs
    iq_kolm(d$income, alpha = 1, ci = TRUE, R = 200)
    # Higher aversion to inequality at the bottom
    iq_kolm(d$income, alpha = 2)
Computes the Lorenz curve: the cumulative share of income held by the
cumulative share of the population, ordered from poorest to richest.
The result can be plotted with plot().
    iq_lorenz(x, weights = NULL, na.rm = FALSE)
| Argument | Description |
|---|---|
| x | Numeric vector of incomes (non-negative). |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
An S3 object of class "iq_lorenz" with elements:
data.frame with columns cum_pop and cum_income
(both 0 to 1). Starts at (0, 0) and ends at (1, 1).
Numeric. The Gini coefficient (twice the area between the curve and the diagonal).
Integer. Number of observations.
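The curve and its link to the Gini can be reproduced in a few lines of base R. Both helpers are hypothetical and unweighted; the area is taken with the trapezoid rule over the curve's points.

```r
# Hypothetical helpers; unweighted.
lorenz_sketch <- function(x) {
  xs <- sort(x)
  n  <- length(xs)
  data.frame(cum_pop    = (0:n) / n,
             cum_income = c(0, cumsum(xs)) / sum(xs))
}
gini_from_lorenz <- function(lc) {
  # Trapezoid area under the curve, then twice the gap to the diagonal
  area_under <- sum(diff(lc$cum_pop) *
                    (head(lc$cum_income, -1) + tail(lc$cum_income, -1)) / 2)
  1 - 2 * area_under
}

lc <- lorenz_sketch(1:4)
lc                     # starts at (0, 0), ends at (1, 1)
gini_from_lorenz(lc)   # 0.25
```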
Lorenz, M. O. (1905). "Methods of Measuring the Concentration of Wealth." Publications of the American Statistical Association, 9(70), 209–219.
    d <- iq_sample_data("income")
    lc <- iq_lorenz(d$income)
    plot(lc)
Computes the Palma ratio: the share of total income received by the top 10 percent divided by the share received by the bottom 40 percent.
    iq_palma(x, weights = NULL, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95, negatives = c("error", "keep"))
| Argument | Description |
|---|---|
| x | Numeric vector of incomes. |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
| negatives | Character. How to treat negative values: "error" or "keep". |
The Palma ratio is motivated by Palma's (2011) observation that the "middle" 50 percent (deciles 5–9) tends to capture a remarkably stable share of income across countries, so inequality is driven by what happens at the tails. A Palma ratio of 1 means the top 10 percent and bottom 40 percent receive equal shares.
Distributions containing negative values may produce a non-positive
bottom-40 share, in which case the Palma ratio is undefined. The
function returns NA with a warning rather than aborting.
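A minimal base R sketch of the ratio, reading both shares off a linearly interpolated Lorenz curve. The helper palma_sketch is hypothetical and unweighted; it returns NA silently for a non-positive bottom-40 share, whereas the function described here also issues a warning.

```r
# Hypothetical helper; unweighted, shares read from an interpolated Lorenz curve.
palma_sketch <- function(x) {
  xs <- sort(x)
  n  <- length(xs)
  cum_pop    <- (0:n) / n
  cum_income <- c(0, cumsum(xs)) / sum(xs)
  L <- approx(cum_pop, cum_income, xout = c(0.4, 0.9))$y
  bottom40 <- L[1]
  top10    <- 1 - L[2]
  if (bottom40 <= 0) return(NA_real_)   # ratio undefined
  top10 / bottom40
}

palma_sketch(rep(100, 100))            # 0.10 / 0.40 = 0.25
palma_sketch(c(-10, -5, 1, 2, 100))    # NA: bottom-40 share is negative
```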
An S3 object of class "iq_palma" with elements:
Numeric. The Palma ratio.
Numeric. Share of income held by the top 10 percent.
Numeric. Share of income held by the bottom 40 percent.
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
Palma, J. G. (2011). "Homogeneous Middles vs. Heterogeneous Tails, and the End of the 'Inverted-U': It's All About the Share of the Rich." Development and Change, 42(1), 87–153.
    d <- iq_sample_data("income")
    iq_palma(d$income)
    # With bootstrap CIs
    iq_palma(d$income, ci = TRUE, R = 200)
    # Equal distribution: Palma = 0.10/0.40 = 0.25
    iq_palma(rep(100, 100))
Computes the ratio of two percentiles of the distribution. Common choices include P90/P10 (interdecile ratio), P80/P20, and P50/P10.
    iq_percentile_ratio(x, weights = NULL, upper = 90, lower = 10, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95)
| Argument | Description |
|---|---|
| x | Numeric vector of incomes. |
| weights | Optional numeric vector of survey weights. |
| upper | Numeric. Upper percentile (0 to 100). Default 90. |
| lower | Numeric. Lower percentile (0 to 100). Default 10. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
An S3 object of class "iq_percentile_ratio" with elements:
Numeric. The percentile ratio.
Numeric. The value at the upper percentile.
Numeric. The value at the lower percentile.
Numeric. The upper percentile used.
Numeric. The lower percentile used.
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
    d <- iq_sample_data("income")
    # P90/P10 (interdecile ratio)
    iq_percentile_ratio(d$income)
    # With bootstrap CIs
    iq_percentile_ratio(d$income, ci = TRUE, R = 200)
    # P80/P20
    iq_percentile_ratio(d$income, upper = 80, lower = 20)
Computes the Wolfson bipolarisation index, which measures the extent to which a distribution is bimodal (clustering at the tails) rather than unimodal. Higher values indicate more polarisation.
    iq_polarisation(x, weights = NULL, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95, negatives = c("error", "keep"))
| Argument | Description |
|---|---|
| x | Numeric vector of incomes. |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
| negatives | Character. How to treat negative values: "error" or "keep". |
An S3 object of class "iq_polarisation" with elements:
Numeric. The Wolfson polarisation index.
Numeric. The Gini coefficient.
Numeric. The weighted median income.
Numeric. The weighted mean income.
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
Wolfson, M. C. (1994). "When Inequalities Diverge." American Economic Review, 84(2), 353–358.
Foster, J. E. and Wolfson, M. C. (2010). "Polarization and the Decline of the Middle Class: Canada and the US." Journal of Economic Inequality, 8(2), 247–273.
    d <- iq_sample_data("income")
    iq_polarisation(d$income)
    # With bootstrap CIs
    iq_polarisation(d$income, ci = TRUE, R = 200)
Computes the Foster-Greer-Thorbecke (FGT) family of poverty measures, plus the Sen index and the Watts index. All measures require a poverty line.
    iq_poverty(x, line, weights = NULL, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95)
| Argument | Description |
|---|---|
| x | Numeric vector of incomes (non-negative). |
| line | Numeric. The poverty line. Required. |
| weights | Optional numeric vector of survey weights. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals on the headcount, gap, severity, and Sen indices? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
An S3 object of class "iq_poverty" with elements:
Numeric. FGT(0): proportion below the poverty line.
Numeric. FGT(1): average normalised gap.
Numeric. FGT(2): average squared normalised gap.
Numeric. Sen index.
Numeric. Watts index.
Numeric. The poverty line used.
Integer. Number of observations.
Integer. Number of observations below the line.
Optional list of bootstrap CIs for the four standard FGT/Sen
measures (each a list with lower and upper).
Numeric or NULL. Confidence level.
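The three FGT measures share one formula, the mean of the normalised gap raised to a power alpha over the whole population, with non-poor observations contributing zero. The sketch below is unweighted base R with a hypothetical helper name, and it assumes that "below the line" means strictly below.

```r
# Hypothetical helper; unweighted, poor defined as x < line (an assumption).
fgt_sketch <- function(x, line, alpha = 0) {
  poor <- x < line
  gap  <- ifelse(poor, (line - x) / line, 0)
  # The explicit ifelse keeps non-poor at 0 even when alpha = 0 (0^0 is 1 in R)
  mean(ifelse(poor, gap^alpha, 0))
}

x <- c(5, 10, 20, 40)
fgt_sketch(x, line = 20, alpha = 0)  # headcount: 0.5
fgt_sketch(x, line = 20, alpha = 1)  # poverty gap: 0.3125
fgt_sketch(x, line = 20, alpha = 2)  # severity: 0.203125
```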
Foster, J., Greer, J. and Thorbecke, E. (1984). "A Class of Decomposable Poverty Measures." Econometrica, 52(3), 761–766.
Sen, A. (1976). "Poverty: An Ordinal Approach to Measurement." Econometrica, 44(2), 219–231.
    d <- iq_sample_data("income")
    # Poverty line at the 20th percentile
    p20 <- quantile(d$income, 0.20)
    iq_poverty(d$income, line = p20)
    # With bootstrap CIs
    iq_poverty(d$income, line = p20, ci = TRUE, R = 200)
Creates synthetic data for testing and demonstrating inequalitykit functions. Three types are available: individual incomes, a two-period panel for growth incidence analysis, and grouped incomes for decomposition.
    iq_sample_data(type = c("income", "panel", "grouped"))
| Argument | Description |
|---|---|
| type | Character. One of "income", "panel", or "grouped". |
A data.frame whose contents depend on type:
"income": 1000 rows with columns income and weight. Drawn from a lognormal distribution (mean log 10.5, sd log 0.8), producing realistic income-like data centred around 40,000.
"panel": 1000 rows with columns income_t0, income_t1, weight. Two periods with heterogeneous growth (bottom grows slower than top, mimicking rising inequality).
"grouped": 1000 rows with columns income, group, weight. Three groups (A, B, C) with different mean incomes for between/within decomposition.
    d <- iq_sample_data("income")
    head(d)
    panel <- iq_sample_data("panel")
    head(panel)
    grouped <- iq_sample_data("grouped")
    head(grouped)
Computes the S-Gini coefficient, a one-parameter generalisation of the
Gini that allows the user to specify how much weight to give different
parts of the distribution. The standard Gini is the special case
delta = 2.
    iq_sgini(x, weights = NULL, delta = 2, na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95, negatives = c("error", "keep"))
| Argument | Description |
|---|---|
| x | Numeric vector of incomes. |
| weights | Optional numeric vector of survey weights. |
| delta | Numeric. Inequality aversion parameter (> 1). Default 2. |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |
| negatives | Character. How to treat negative values: "error" or "keep". |
Lower delta (approaching 1) gives equal weight everywhere; higher delta gives more weight to the bottom of the distribution. The standard Gini (delta = 2) weights by rank position. Delta = 3 or 4 places even more emphasis on the poorest.
Like the standard Gini, the S-Gini is well-defined for distributions
containing negative values via negatives = "keep", though the
resulting index is no longer bounded in the unit interval.
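The role of delta can be seen in a small base R sketch using the discrete rank-weight form of the S-Gini (Donaldson and Weymark, 1980). The helper sgini_sketch is hypothetical and unweighted; the package may use a different but equivalent estimator.

```r
# Hypothetical helper; unweighted discrete S-Gini.
sgini_sketch <- function(x, delta = 2) {
  stopifnot(delta > 1)
  xs <- sort(x)
  n  <- length(xs)
  i  <- seq_len(n)
  # Rank weights: larger delta shifts weight towards the poorest
  w <- ((n - i + 1) / n)^delta - ((n - i) / n)^delta
  1 - sum(w * xs) / mean(xs)
}

sgini_sketch(1:4, delta = 2)  # 0.25, the standard Gini
sgini_sketch(1:4, delta = 3)  # 0.375
```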
An S3 object of class "iq_sgini" with elements:
Numeric. The S-Gini coefficient.
Numeric. The inequality aversion parameter used.
Integer. Number of observations.
Bootstrap CI fields, NULL unless
ci = TRUE.
Logical. Whether the input contained negatives.
Donaldson, D. and Weymark, J. A. (1980). "A Single-Parameter Generalization of the Gini Indices of Inequality." Journal of Economic Theory, 22(1), 67–86.
Yitzhaki, S. (1983). "On an Extension of the Gini Inequality Index." International Economic Review, 24(3), 617–628.
    d <- iq_sample_data("income")
    # Standard Gini (delta = 2)
    iq_sgini(d$income, delta = 2)
    # More weight on the bottom of the distribution
    iq_sgini(d$income, delta = 3)
    # With bootstrap CIs
    iq_sgini(d$income, delta = 3, ci = TRUE, R = 200)
Computes the Theil T index (GE(1)), Theil L / mean log deviation (GE(0)), or a generalised entropy index GE(alpha) for any non-negative alpha.
    iq_theil(x, weights = NULL, index = "T", na.rm = FALSE, ci = FALSE, R = 1000L, level = 0.95)
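| Argument | Description |
|---|---|
| x | Numeric vector of incomes (strictly positive). |
| weights | Optional numeric vector of survey weights. |
| index | Character or numeric. "T" for Theil T (GE(1)), "L" for Theil L / mean log deviation (GE(0)), or a numeric alpha for GE(alpha). Default "T". |
| na.rm | Logical. Remove NA values? Default FALSE. |
| ci | Logical. Compute bootstrap confidence intervals? Default FALSE. |
| R | Integer. Number of bootstrap replicates. Default 1000. |
| level | Numeric. Confidence level. Default 0.95. |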
Generalised entropy indices are the only class of inequality measures that are both decomposable by population subgroups and satisfy the transfer principle. Higher values indicate more inequality.
Theil T (GE(1)) and Theil L (GE(0)) involve log(x) and so require
strictly positive values. GE(alpha) for alpha > 1 is well-defined for
non-negative x but is highly sensitive to small or zero values. For
wealth or income net of taxes/transfers (which can be zero or negative)
use the Gini, S-Gini, or Kolm index instead.
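The three cases of the generalised entropy family can be written compactly in base R. The helper ge_sketch is hypothetical and unweighted, and it follows the textbook GE(alpha) convention.

```r
# Hypothetical helper; unweighted GE(alpha).
ge_sketch <- function(x, alpha = 1) {
  r <- x / mean(x)
  if (alpha == 1) {
    mean(r * log(r))                               # Theil T
  } else if (alpha == 0) {
    -mean(log(r))                                  # Theil L (mean log deviation)
  } else {
    (mean(r^alpha) - 1) / (alpha * (alpha - 1))    # general case
  }
}

x <- c(10, 20, 30, 100)
ge_sketch(x, alpha = 1)
ge_sketch(x, alpha = 0)
# GE(2) is half the squared coefficient of variation (population variance)
half_cv2 <- 0.5 * mean((x - mean(x))^2) / mean(x)^2
c(ge2 = ge_sketch(x, alpha = 2), half_cv2 = half_cv2)
```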
Note on cross-validation against ineq: this package uses the
textbook GE(alpha) convention, where index = "T" is GE(1) (Theil T)
and index = "L" is GE(0) (mean log deviation). The legacy
ineq package uses the opposite indexing, so
ineq::Theil(x, parameter = 0) matches iq_theil(x, "T") and
ineq::Theil(x, parameter = 1) matches iq_theil(x, "L").
An S3 object of class "iq_theil" with elements:
Numeric. The index value.
Numeric. The alpha parameter used.
Character. Human-readable name of the index.
Integer. Number of observations.
Numeric or NULL. Bootstrap standard error.
Numeric or NULL. Lower bound of the CI.
Numeric or NULL. Upper bound of the CI.
Numeric or NULL. Confidence level.
Theil, H. (1967). Economics and Information Theory. Amsterdam: North-Holland.
Cowell, F. A. (2011). Measuring Inequality. 3rd edition. Oxford University Press.
Shorrocks, A. F. (1980). "The Class of Additively Decomposable Inequality Measures." Econometrica, 48(3), 613–625.
    d <- iq_sample_data("income")
    # Theil T (GE(1))
    iq_theil(d$income, index = "T")
    # With bootstrap CIs
    iq_theil(d$income, index = "T", ci = TRUE, R = 200)
    # Mean log deviation (GE(0))
    iq_theil(d$income, index = "L")
    # GE(2): half the squared coefficient of variation
    iq_theil(d$income, index = 2)