Influence Function for the Ratio Between Quantiles

Computes the influence function of the ratio between two quantiles (e.g., P90/P10) for all observations in the sample. See Deville (1999) and Osier (2009) for the definition of influence functions in finite population theory.

Usage

if_ratio_quantiles(
  y,
  weights = NULL,
  type = 6,
  prob_numerator = 0.9,
  prob_denominator = 0.1,
  na.rm = TRUE
)

Arguments

y: A numeric vector of data values.
weights: A numeric vector of sampling weights (optional). If NULL, equal weights are assumed.
type: Quantile estimation type: integer 4–9 or "HD" for Harrell–Davis (default: 6). See csquantile.
prob_numerator: Numeric in $(0,1)$; order of the quantile at the numerator (default: 0.90).
prob_denominator: Numeric in $(0,1)$; order of the quantile at the denominator (default: 0.10).
na.rm: Logical; remove missing values before computing? Default: TRUE.

Value

A numeric vector of influence function values, one per observation.

Details

The influence function for the ratio $\widehat{R} = \widehat{Q}(p_n) / \widehat{Q}(p_d)$ is derived by applying the delta method to the quantile influence function of Deville (1999):

$$ {I}\left(\frac{\widehat{Q}(p_n)}{\widehat{Q}(p_d)}\right)_{k} = \frac{ \left( \frac{p_n - \mathbf{1}(y_k \leq \widehat{Q}(p_n))} {\widehat{f}(\widehat{Q}(p_n)) \, \widehat{N}} \right) \widehat{Q}(p_d) - \left( \frac{p_d - \mathbf{1}(y_k \leq \widehat{Q}(p_d))} {\widehat{f}(\widehat{Q}(p_d)) \, \widehat{N}} \right) \widehat{Q}(p_n) }{ \widehat{Q}(p_d)^2 } $$

where:

$\widehat{Q}(p)$ is the weighted sample quantile of order $p$, computed via csquantile,
$p_n$ and $p_d$ are the orders of the numerator and denominator quantiles, respectively,
$\widehat{f}(\cdot)$ is the estimated density function,
$\widehat{N} = \sum_i w_i$ is the estimated population size.
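
As a brief sketch of the delta-method step, using the same symbols as above and writing $I(\cdot)_k$ for the influence function evaluated at observation $k$, the quantile influence function and the quotient rule are:

$$ I\left(\widehat{Q}(p)\right)_{k} = \frac{p - \mathbf{1}(y_k \leq \widehat{Q}(p))}{\widehat{f}(\widehat{Q}(p)) \, \widehat{N}}, \qquad I\left(\frac{\widehat{Q}(p_n)}{\widehat{Q}(p_d)}\right)_{k} = \frac{ I\left(\widehat{Q}(p_n)\right)_{k} \, \widehat{Q}(p_d) - \widehat{Q}(p_n) \, I\left(\widehat{Q}(p_d)\right)_{k} }{ \widehat{Q}(p_d)^2 } $$

Substituting the first expression into the second, once with $p = p_n$ and once with $p = p_d$, gives the formula for the ratio stated above.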

The density $\widehat{f}(y)$ is estimated with a Gaussian kernel:

$$ \widehat{f}(y) = \frac{1}{\widehat{N}\, h \sqrt{2\pi}} \sum_{j \in s} w_j \exp\!\left\{ -\frac{(y - y_j)^2}{2h^2} \right\} $$

with bandwidth $h = 0.79 \cdot \mathrm{IQR} \cdot \widehat{N}^{-1/5}$, where $\mathrm{IQR}$ is the interquartile range of $y$.
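
The formulas in this section can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helper names `wq`, `wdens` and `if_ratio_sketch` are hypothetical, the quantile is a plain inversion of the weighted empirical CDF rather than csquantile (so values will generally differ from `if_ratio_quantiles` with `type = 6`), and the IQR in the bandwidth is assumed to be the weighted interquartile range.

```r
# Hypothetical helpers for illustration; base R only.

# Weighted quantile by inverting the weighted empirical CDF.
# A simple stand-in for csquantile, whose interpolation rules differ.
wq <- function(y, w, p) {
  o  <- order(y)
  cw <- cumsum(w[o]) / sum(w)
  y[o][which(cw >= p)[1]]
}

# Weighted Gaussian kernel density estimate at a single point y0.
wdens <- function(y0, y, w, h) {
  sum(w * exp(-(y0 - y)^2 / (2 * h^2))) / (sum(w) * h * sqrt(2 * pi))
}

# Influence function of Q(pn) / Q(pd), one value per observation.
if_ratio_sketch <- function(y, w = rep(1, length(y)), pn = 0.9, pd = 0.1) {
  N   <- sum(w)                               # estimated population size
  iqr <- wq(y, w, 0.75) - wq(y, w, 0.25)      # assumed: weighted IQR
  h   <- 0.79 * iqr * N^(-1 / 5)              # bandwidth from the formula above
  qn  <- wq(y, w, pn)
  qd  <- wq(y, w, pd)
  # Quantile influence functions for numerator and denominator
  if_n <- (pn - (y <= qn)) / (wdens(qn, y, w, h) * N)
  if_d <- (pd - (y <= qd)) / (wdens(qd, y, w, h) * N)
  # Quotient rule (delta method)
  (if_n * qd - if_d * qn) / qd^2
}

set.seed(1)
y <- rlnorm(30, 9, 0.7)
z <- if_ratio_sketch(y, pn = 0.8, pd = 0.2)
```

The result `z` has one entry per element of `y`, matching the shape of the value returned by `if_ratio_quantiles`.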

References

Deville J (1999). “Variance estimation for complex statistics and estimators: linearization and residual techniques.” Survey Methodology, 25, 193–204.
Osier G (2009). “Variance estimation for complex indicators of poverty and inequality using linearization techniques.” Survey Research Methods, 3, 167–195.

Examples

# On synthetic data
set.seed(1)
eq_synth <- rlnorm(30, 9, 0.7)
IF_synth <- if_ratio_quantiles(y = eq_synth, prob_numerator = 0.80,
                               prob_denominator = 0.20)

# On survey data
data(synthouse)
eq <- synthouse$eq_income[1:30]
w  <- synthouse$weight[1:30]
IF_vals <- if_ratio_quantiles(y = eq, weights = w, type = 6)