Computes the influence function of the ratio between two quantiles (e.g., P90/P10) for all observations in the sample. See Deville (1999) and Osier (2009) for the definition of influence functions in finite population theory.

Usage

if_ratio_quantiles(
  y,
  weights = NULL,
  type = 6,
  prob_numerator = 0.9,
  prob_denominator = 0.1,
  na.rm = TRUE
)

Arguments

y

A numeric vector of data values.

weights

A numeric vector of sampling weights (optional). If NULL, equal weights are assumed.

type

Quantile estimation type: an integer from 4 to 9, or "HD" for Harrell–Davis (default: 6). See csquantile.

prob_numerator

Numeric in \((0,1)\); order of the quantile at the numerator (default: 0.90).

prob_denominator

Numeric in \((0,1)\); order of the quantile at the denominator (default: 0.10).

na.rm

Logical; remove missing values before computing? Default: TRUE.

Value

A numeric vector of influence function values, one per observation.

Details

The influence function of the ratio \(\widehat{R} = \widehat{Q}(p_n) / \widehat{Q}(p_d)\) is derived by applying the delta method to the quantile influence function of Deville (1999):

$$ {I}\left(\frac{\widehat{Q}(p_n)}{\widehat{Q}(p_d)}\right)_{k} = \frac{ \left( \frac{p_n - \mathbf{1}(y_k \leq \widehat{Q}(p_n))} {\widehat{f}(\widehat{Q}(p_n)) \, \widehat{N}} \right) \widehat{Q}(p_d) - \left( \frac{p_d - \mathbf{1}(y_k \leq \widehat{Q}(p_d))} {\widehat{f}(\widehat{Q}(p_d)) \, \widehat{N}} \right) \widehat{Q}(p_n) }{ \widehat{Q}(p_d)^2 } $$

where:

  • \(\widehat{Q}(p)\) is the weighted sample quantile of order \(p\), computed via csquantile,

  • \(p_n\) and \(p_d\) are the orders of the numerator and denominator quantiles, respectively,

  • \(\widehat{f}(\cdot)\) is the estimated density function,

  • \(\widehat{N} = \sum_i w_i\) is the estimated population size.
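
To make the delta-method step explicit, the following short derivation uses only the symbols defined above, with \({I}(\widehat{Q}(p))_k\) as shorthand for the quantile influence function of observation \(k\). It is meant as a reading aid, not as a statement of how the package computes the result internally.

$$ {I}\left(\widehat{Q}(p)\right)_{k} = \frac{p - \mathbf{1}(y_k \leq \widehat{Q}(p))}{\widehat{f}(\widehat{Q}(p)) \, \widehat{N}} $$

Linearizing the ratio \(g(a, b) = a / b\) at \(a = \widehat{Q}(p_n)\), \(b = \widehat{Q}(p_d)\), with partial derivatives \(\partial g / \partial a = 1/b\) and \(\partial g / \partial b = -a/b^2\), gives

$$ {I}\left(\frac{\widehat{Q}(p_n)}{\widehat{Q}(p_d)}\right)_{k} = \frac{{I}\left(\widehat{Q}(p_n)\right)_{k}}{\widehat{Q}(p_d)} - \frac{\widehat{Q}(p_n)}{\widehat{Q}(p_d)^2} \, {I}\left(\widehat{Q}(p_d)\right)_{k} = \frac{{I}\left(\widehat{Q}(p_n)\right)_{k} \, \widehat{Q}(p_d) - {I}\left(\widehat{Q}(p_d)\right)_{k} \, \widehat{Q}(p_n)}{\widehat{Q}(p_d)^2} $$

which matches the expression above once the two quantile influence functions are substituted.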

The density \(\widehat{f}(y)\) is estimated via a Gaussian kernel:

$$ \widehat{f}(y) = \frac{1}{\widehat{N}\, h \sqrt{2\pi}} \sum_{j \in s} w_j \exp\!\left\{ -\frac{(y - y_j)^2}{2h^2} \right\} $$

with bandwidth \(h = 0.79 \cdot \mathrm{IQR} \cdot \widehat{N}^{-1/5}\), where \(\mathrm{IQR}\) denotes the interquartile range of \(y\).
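
The formulas above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `kernel_density_w()` and `if_ratio_sketch()` are hypothetical helper names, the IQR is computed unweighted with `stats::IQR()`, and the ratio sketch handles only the equal-weights case, with `stats::quantile(type = 6)` standing in for csquantile.

```r
# Hypothetical helper (not part of the package): weighted Gaussian kernel
# density estimate at the points in x, following the formula above.
kernel_density_w <- function(x, y, w = rep(1, length(y))) {
  N_hat <- sum(w)
  # Assumption: unweighted IQR; the package may use a weighted version.
  h <- 0.79 * IQR(y) * N_hat^(-1 / 5)
  vapply(x, function(x0) {
    sum(w * exp(-(x0 - y)^2 / (2 * h^2))) / (N_hat * h * sqrt(2 * pi))
  }, numeric(1))
}

# Hypothetical helper (not part of the package): influence function of the
# quantile ratio for equal weights only.
if_ratio_sketch <- function(y, p_n = 0.9, p_d = 0.1) {
  N_hat <- length(y)  # equal weights, so N_hat is the sample size
  # Assumption: base R type-6 quantiles stand in for csquantile.
  q_n <- unname(quantile(y, p_n, type = 6))
  q_d <- unname(quantile(y, p_d, type = 6))
  # Quantile influence functions for the numerator and denominator
  if_n <- (p_n - (y <= q_n)) / (kernel_density_w(q_n, y) * N_hat)
  if_d <- (p_d - (y <= q_d)) / (kernel_density_w(q_d, y) * N_hat)
  # Delta-method combination for the ratio
  (if_n * q_d - if_d * q_n) / q_d^2
}

set.seed(1)
y <- rlnorm(30, 9, 0.7)
IF_sketch <- if_ratio_sketch(y, p_n = 0.8, p_d = 0.2)
```

The result has one value per observation. Because of the assumptions listed above, it is not guaranteed to agree numerically with `if_ratio_quantiles()`, in particular when sampling weights are supplied.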

References

Deville J (1999). “Variance estimation for complex statistics and estimators: linearization and residual techniques.” Survey Methodology, 25, 193–204.

Osier G (2009). “Variance estimation for complex indicators of poverty and inequality using linearization techniques.” Survey Research Methods, 3, 167–195.

See also

ratio_quantiles, csquantile

Other influence functions: if_gini(), if_qri(), if_quantile(), if_share_ratio()

Examples

# On synthetic data
set.seed(1)
eq_synth <- rlnorm(30, 9, 0.7)
IF_synth <- if_ratio_quantiles(y = eq_synth, prob_numerator = 0.80,
                               prob_denominator = 0.20)

# On survey data
data(synthouse)
eq <- synthouse$eq_income[1:30]
w  <- synthouse$weight[1:30]
IF_vals <- if_ratio_quantiles(y = eq, weights = w, type = 6)