Quantile Ratio Index Estimator for Grouped Data

Computes the quantile ratio index (QRI), a measure of inequality, from grouped frequency data, estimating quantiles by linear interpolation. The function is intended for administrative or tax data, which are often available only in grouped form. For this reason, sampling weights are not supported.

Usage

qri_grouped(
  freq,
  lower_bounds,
  upper_bounds,
  M = 100,
  midpoints = NULL,
  na.rm = TRUE
)

Arguments

freq: Numeric vector of class frequencies (counts). Must be non-negative.
lower_bounds: Numeric vector of lower class bounds.
upper_bounds: Numeric vector of upper class bounds.
M: Integer; number of quantile ratios to average (default: 100).
midpoints: Optional numeric vector of class midpoints. Used only as a fallback when a quantile falls in a class with zero frequency.
na.rm: Logical; should missing values in frequencies be removed? (default: TRUE)

Value

A numeric scalar: the quantile ratio index (QRI) estimated from the grouped data.

Details

Consider grouped data divided into $L$ classes with known boundaries and observed frequencies $f_1, \ldots, f_L$. The QRI estimator for grouped data is approximated as:

$$\mathrm{QRI} \approx \frac{1}{M}\sum_{m=1}^{M}\left(1 - \frac{\widetilde{Q}(p_m/2)}{\widetilde{Q}(1 - p_m/2)}\right)$$

where:

$p_m = (m - 0.5)/M$ for $m=1, \ldots, M$
$\widetilde{Q}(p)$ denotes the $p$-th quantile computed from grouped data using linear interpolation (see quantile_grouped)
$M$ is the number of quantile ratios to average (default: 100)

The quantiles $\widetilde{Q}(p)$ are computed via quantile_grouped(), which uses linear interpolation within classes and automatically handles open-ended classes (with -Inf or Inf bounds).
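The formula above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the helper names sketch_quantile_grouped() and sketch_qri_grouped() are hypothetical, and the sketch assumes finite, contiguous class bounds, so it does not cover the open-ended classes or the midpoints fallback that quantile_grouped() handles.

```r
# Hypothetical helper: p-th quantile from grouped data by linear interpolation.
# Assumes finite class bounds and 0 < p < 1.
sketch_quantile_grouped <- function(p, freq, lower, upper) {
  cum <- cumsum(freq) / sum(freq)      # cumulative share at each upper bound
  k <- which(cum >= p)[1]              # first class whose cumulative share reaches p
  cum_before <- if (k == 1) 0 else cum[k - 1]
  share <- cum[k] - cum_before         # relative frequency of class k
  lower[k] + (p - cum_before) / share * (upper[k] - lower[k])
}

# Hypothetical helper: average of 1 - Q(p_m / 2) / Q(1 - p_m / 2) over M ratios.
sketch_qri_grouped <- function(freq, lower, upper, M = 100) {
  p <- (seq_len(M) - 0.5) / M          # p_m = (m - 0.5) / M
  q_low <- vapply(p / 2, sketch_quantile_grouped, numeric(1),
                  freq = freq, lower = lower, upper = upper)
  q_high <- vapply(1 - p / 2, sketch_quantile_grouped, numeric(1),
                   freq = freq, lower = lower, upper = upper)
  mean(1 - q_low / q_high)
}

# Two equally populated classes, [0, 10] and [10, 20]:
# the 0.75 quantile lies halfway through the second class.
sketch_quantile_grouped(0.75, freq = c(1, 1), lower = c(0, 10), upper = c(10, 20))
#> [1] 15

sketch_qri_grouped(freq = c(1, 1), lower = c(0, 10), upper = c(10, 20))
```

Because the sketch omits the package's handling of open-ended and empty classes, its results need not match qri_grouped() on every input.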

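For non-negative data, the QRI ranges from 0 (perfect equality) to 1 (maximum inequality). The index averages, over $M$ pairs of quantiles placed symmetrically about the median, the relative gap between the lower quantile $\widetilde{Q}(p_m/2)$ and the upper quantile $\widetilde{Q}(1 - p_m/2)$, so it reflects the entire distribution rather than a single pair of quantiles.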

Comparison with Microdata QRI

When individual-level data (microdata) are available, use qri instead, which provides more accurate estimates. Use the grouped-data version, qri_grouped, when only frequency distributions are available, such as in published statistical tables or administrative aggregates.

The grouped QRI will generally approximate the microdata QRI well when:

Classes are sufficiently narrow
The distribution within classes is approximately uniform
Sample sizes within classes are adequate
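The effect of class width can be illustrated with a small deterministic experiment. The sketch below is an assumption-laden illustration rather than package code: it takes an exponential distribution with rate 1 as the "true" population, derives class shares from pexp() instead of observed counts, truncates the top class at 20, and uses a hypothetical helper qri_from_breaks() built on base R's approx() for the linear interpolation.

```r
# Hypothetical helper: grouped QRI from class breaks and a CDF, using
# approx() for linear interpolation of the quantile function.
# Assumes finite, increasing breaks with positive mass in every class.
qri_from_breaks <- function(breaks, cdf, M = 100) {
  p <- (seq_len(M) - 0.5) / M
  cum <- cdf(breaks)
  # Rescale so cumulative shares run from 0 to 1 over the covered range.
  cum <- (cum - cum[1]) / (cum[length(cum)] - cum[1])
  q <- function(prob) approx(x = cum, y = breaks, xout = prob)$y
  mean(1 - q(p / 2) / q(1 - p / 2))
}

# Reference value from exact exponential quantiles (same M = 100 ratios).
p <- (seq_len(100) - 0.5) / 100
qri_exact <- mean(1 - qexp(p / 2) / qexp(1 - p / 2))

qri_narrow <- qri_from_breaks(seq(0, 20, by = 0.1), pexp)  # 200 narrow classes
qri_wide <- qri_from_breaks(seq(0, 20, by = 2), pexp)      # 10 wide classes

# Absolute error of each grouped estimate relative to the reference.
abs(qri_narrow - qri_exact)
abs(qri_wide - qri_exact)
```

With wide classes the exponential density is far from uniform inside the first class, which holds most of the mass, so the interpolated quantiles, and hence the grouped QRI, drift further from the reference than with narrow classes.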

References

Prendergast LA, Staudte RG (2018). “A simple and effective inequality measure.” The American Statistician, 72, 328–343.

Examples

# Basic example with closed classes
income_freq <- c(120, 180, 150, 80, 40, 20, 10)
income_lower <- c(0, 15000, 30000, 45000, 60000, 80000, 100000)
income_upper <- c(15000, 30000, 45000, 60000, 80000, 100000, 150000)

qri_grouped(income_freq, income_lower, income_upper)
#> [1] 0.5888192

# Example with open-ended classes (Italian MEF-style data)
wage_freq <- c(150, 200, 180, 220, 180, 50, 15, 5)
wage_lower <- c(-Inf, 0, 10000, 15000, 26000, 55000, 75000, 120000)
wage_upper <- c(0, 10000, 15000, 26000, 55000, 75000, 120000, Inf)

# Compute QRI (automatically handles open classes)
qri_grouped(wage_freq, wage_lower, wage_upper)
#> [1] 0.7268536