Influence Function for the Gini Coefficient

Computes the influence function for the Gini coefficient, which is useful for variance estimation and linearization in complex survey designs (Langel and Tillé, 2013).

Usage

if_gini(y, weights = NULL, na.rm = TRUE)

Arguments

y: Numeric vector of income or another variable of interest.
weights: Numeric vector of sampling weights. If NULL (default), equal weights are assumed (simple random sampling).
na.rm: Logical. Should missing values be removed? Default is TRUE.

Value

A numeric vector of the same length as y containing the influence function values for each observation, returned in the same order as the input y.
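
Because the returned values are aligned with y, they can be multiplied by the weights and passed to a standard variance estimator for a total. The following is a minimal sketch, not part of this package: `var_from_if` is a hypothetical helper, and it assumes a single-stratum, unclustered design treated as sampling with replacement. Other designs need the estimator that matches their stratification and clustering.

```r
# Hypothetical helper (not provided by the package): with-replacement
# variance estimator for the total of the weighted influence values.
# Assumes one stratum and no clustering.
var_from_if <- function(z, weights) {
  stopifnot(length(z) == length(weights), length(z) > 1)
  u <- weights * z                     # weighted linearized variable
  n <- length(u)
  n / (n - 1) * sum((u - mean(u))^2)   # estimated variance of sum(u)
}

# Toy values for illustration only; in practice z would come from if_gini().
z_toy <- c(0.002, -0.001, 0.0005, -0.0015)
w_toy <- c(100, 120, 80, 150)
se_toy <- sqrt(var_from_if(z_toy, w_toy))  # approximate standard error
```

The square root of the result approximates the standard error of the Gini estimate under the stated assumptions.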

Details

The influence function for the Gini coefficient is computed using the linearization method, following the framework of Deville (1999) and the definition given by Langel and Tillé (2013). For observation $k$, it is:

$${I}(\widehat{G})_{k} = \frac{2W_k(y_k - \bar{Y}_k) + \hat{Y} - \hat{N}y_k - G(\hat{Y} + y_k\hat{N})}{\hat{N}\hat{Y}}$$

where:

$W_k = \sum_{i=1}^k w_i$ is the cumulative sum of weights up to rank $k$, with observations ranked in ascending order of $y$
$\bar{Y}_k =\frac{\sum_{l \in S} w_l y_l 1\left(W_l \leqslant W_k\right)}{W_k}$ is the weighted mean of values up to rank $k$
$\hat{N} = \sum_i w_i$ is the total sum of weights
$\hat{Y} = \sum_i w_i y_i$ is the weighted total of the variable
$G$ is the Gini coefficient estimate
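
The formula and definitions above can be transcribed almost line by line. The sketch below is illustrative only and is not the package's implementation: `if_gini_sketch` is a hypothetical name, missing values and tied incomes are not handled specially, and the expression used for $G$ is an assumed weighted Gini estimator in the form given by Langel and Tillé (2013), which may differ from what `if_gini()` computes internally.

```r
# Hypothetical helper, not the package's if_gini(): a direct transcription
# of the formula above. Assumes no missing values and positive weights.
if_gini_sketch <- function(y, weights = NULL) {
  if (is.null(weights)) weights <- rep(1, length(y))
  stopifnot(length(y) == length(weights))

  ord <- order(y)                    # rank observations by ascending y
  ys  <- y[ord]
  ws  <- weights[ord]

  N_hat  <- sum(ws)                  # total sum of weights
  Y_hat  <- sum(ws * ys)             # weighted total of y
  W_k    <- cumsum(ws)               # cumulative weights up to rank k
  Ybar_k <- cumsum(ws * ys) / W_k    # weighted mean of values up to rank k

  # Assumed Gini estimator (Langel and Tillé form); the package may differ.
  G <- 2 * sum(ws * W_k * ys) / (N_hat * Y_hat) -
    (1 + sum(ws^2 * ys) / (N_hat * Y_hat))

  z_sorted <- (2 * W_k * (ys - Ybar_k) + Y_hat - N_hat * ys -
                 G * (Y_hat + ys * N_hat)) / (N_hat * Y_hat)

  z <- numeric(length(y))
  z[ord] <- z_sorted                 # restore the input order of y
  z
}

set.seed(1)
y <- rlnorm(200, meanlog = 10, sdlog = 0.8)  # simulated incomes, illustration only
w <- runif(200, 50, 150)                     # simulated sampling weights
z <- if_gini_sketch(y, w)
sum(w * z)  # numerically close to zero with this choice of G
```

With this choice of $G$, the weighted sum of the influence values cancels algebraically, which gives a quick sanity check on a hand-written implementation.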

References

Deville J (1999). “Variance estimation for complex statistics and estimators: linearization and residual techniques.” Survey Methodology, 25, 193–204.

Langel M, Tillé Y (2013). “Variance estimation of the Gini index: revisiting a result several times published.” Journal of the Royal Statistical Society Series A, 176, 521–540.

Examples


data(synthouse)

eq <- synthouse$eq_income # Equivalized disposable income

# Simple example
z <- if_gini(eq)

# With weights
w <- synthouse$weight
z_weighted <- if_gini(y = eq, weights = w)