This helper function returns quantiles of a deprivation index (or any other continuous variable), which is useful for constructing independent variables for statistical analysis. If character labels are desired, the function supports anything from a split at the median (2 quantiles) through deciles (10 quantiles).
dep_quantiles(.data, source_var, new_var, n = 4L, return = "label")
.data: A tibble containing the data to be used for calculating quantiles.
source_var: Required; the quoted or unquoted source variable to be divided into quantiles.
new_var: Required; the quoted or unquoted name of the new variable to be created containing the quantile values.
n: Required integer scalar; the number of quantiles to divide the source variable into. Defaults to 4L (quartiles), but can be set to any value appropriate for your data as long as it is greater than or equal to 2L.
return: Required character scalar; either "label" (default) or "factor". If "label", the function will return a character vector of quantile labels. If "factor", the function will return the underlying factor used in the creation of the quantiles measure.
A copy of .data with a new variable containing the requested quantile.
## load sample data
ndi_m <- dep_sample_data(index = "ndi_m")
## calculate NDI with sample data
ndi_m <- dep_calc_index(ndi_m, geography = "county", index = "ndi_m", year = 2022,
    return_percentiles = TRUE)
#> Warning: The proportion of variance explained by PC1 is less than 0.50.
## calculate quantiles, return label
ndi_m <- dep_quantiles(ndi_m, source_var = NDI_M, new_var = ndi_m_quartiles_l)
unique(sort(ndi_m$ndi_m_quartiles_l))
#> [1] "(1) Lowest Quartile" "(2) Second Quartile" "(3) Third Quartile"
#> [4] "(4) Highest Quartile"
## calculate sextiles (n = 6L), return label
ndi_m <- dep_quantiles(ndi_m, source_var = NDI_M, new_var = ndi_m_quartiles_l6,
    n = 6L)
unique(sort(ndi_m$ndi_m_quartiles_l6))
#> [1] "(1) Lowest Sextile" "(2) Second Sextile" "(3) Third Sextile"
#> [4] "(4) Fourth Sextile" "(5) Fifth Sextile" "(6) Highest Sextile"
## calculate quantiles, return factor
ndi_m <- dep_quantiles(ndi_m, source_var = NDI_M, new_var = ndi_m_quartiles_f,
    return = "factor")
levels(ndi_m$ndi_m_quartiles_f)
#> [1] "0.00 - 25.00" "25.01 - 50.00" "50.01 - 75.00" "75.01 - 100.00"
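
The difference between the "label" and "factor" return types can be illustrated with a small base R sketch. This is not the implementation used by dep_quantiles(); it assumes a conventional approach built on quantile() and cut(), and the scores vector and quartile_labels lookup are hypothetical values made up for illustration.

```r
## hypothetical percentile-style scores, not taken from the sample data
scores <- c(5, 20, 35, 50, 65, 80, 95, 100)

## break points at the 0th, 25th, 50th, 75th, and 100th percentiles
breaks <- quantile(scores, probs = seq(0, 1, by = 0.25))

## cut() returns a factor with one level per interval;
## include.lowest = TRUE keeps the minimum value in the first interval
quartile_f <- cut(scores, breaks = breaks, include.lowest = TRUE)

## map each factor level to a character label (assumed label text,
## mirroring the quartile labels shown in the examples above)
quartile_labels <- c("(1) Lowest Quartile", "(2) Second Quartile",
                     "(3) Third Quartile", "(4) Highest Quartile")
quartile_l <- quartile_labels[as.integer(quartile_f)]
```

In this sketch, quartile_f plays the role of the "factor" return (levels describe the numeric ranges), while quartile_l plays the role of the "label" return (a plain character vector that sorts in quantile order because of its numeric prefix).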