deprivateR provides a unified framework for calculating
measures of area-level deprivation in the United States. These measures
are commonly used in social determinants of health research to quantify
neighborhood disadvantage.
The package supports the following indices:

- Area Deprivation Index (`"adi"`) - a factor-based measure of socioeconomic deprivation (via sociome)
- Gini coefficient (`"gini"`) - a measure of income inequality (via tidycensus)
- Neighborhood Deprivation Index, Messer (`"ndi_m"`) - a factor-based deprivation measure (via ndi)
- Neighborhood Deprivation Index, Powell-Wiley (`"ndi_pw"`) - an alternative NDI formulation (via ndi)
- Social Vulnerability Index (`"svi10"`, `"svi14"`, `"svi20"`, `"svi20s"`) - the CDC’s composite vulnerability measure, with four methodology variants

Data can be retrieved at the county, census tract, ZCTA5, or ZCTA3 level for years 2010 through 2022.
The easiest way to install deprivateR is from CRAN:

```r
install.packages("deprivateR")
```

Alternatively, you can install deprivateR from GitHub:

```r
# install.packages("remotes")
remotes::install_github("pfizer-opensource/deprivateR")
```

To download data from the Census Bureau, you need a free API key. You can request one at https://api.census.gov/data/key_signup.html.
Once you have your key, store it for use with tidycensus:

```r
tidycensus::census_api_key("YOUR_KEY_HERE", install = TRUE)
```

This saves the key to your `.Renviron` file so it is available across sessions.
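As a quick check that the key is visible to R, the sketch below reads the `CENSUS_API_KEY` environment variable, which is the name tidycensus uses when it writes the key to `.Renviron`. It relies only on base R, and the `key_is_set` name is purely illustrative. After installing a key, R usually needs to be restarted (or `readRenviron("~/.Renviron")` run) before the variable appears in the current session.

```r
# Read the key from the environment; Sys.getenv() returns "" if it is unset
key <- Sys.getenv("CENSUS_API_KEY")

# TRUE when a non-empty key is available in this session
key_is_set <- nzchar(key)

if (key_is_set) {
  message("Census API key found.")
} else {
  message("No Census API key found; the bundled sample data can still be used.")
}
```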
The package includes sample data so you can explore functionality without an API key. The sample data contains 2022 ACS 5-year estimates for all 115 counties in Missouri.
```r
library(deprivateR)

# load sample data for the Messer NDI
ndi_data <- dep_sample_data(index = "ndi_m")

# calculate the index
ndi_results <- dep_calc_index(
  ndi_data,
  geography = "county",
  index = "ndi_m",
  year = 2022,
  return_percentiles = TRUE
)
#> Warning: The proportion of variance explained by PC1 is less than 0.50.

# view the results
ndi_results[, c("GEOID", "NAME", "NDI_M")]
#> # A tibble: 115 × 3
#>    GEOID NAME                       NDI_M
#>    <chr> <chr>                      <dbl>
#>  1 29001 Adair County, Missouri     63.2
#>  2 29003 Andrew County, Missouri     6.14
#>  3 29005 Atchison County, Missouri  24.6
#>  4 29007 Audrain County, Missouri   58.8
#>  5 29009 Barry County, Missouri     59.6
#>  6 29011 Barton County, Missouri    92.1
#>  7 29013 Bates County, Missouri     83.3
#>  8 29015 Benton County, Missouri    68.4
#>  9 29017 Bollinger County, Missouri 78.9
#> 10 29019 Boone County, Missouri     14.0
#> # ℹ 105 more rows
```

The `NDI_M` column contains the calculated Neighborhood Deprivation Index scores, here expressed as percentiles because `return_percentiles = TRUE` was set. Higher values indicate greater deprivation.
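The `NDI_M` values printed above are all multiples of 100/114, which is what a percentile rank of the form (rank - 1) / (n - 1) × 100 produces for 115 areas. The base R sketch below illustrates that formula on made-up scores. It is an inference from the printed output, not a description of the package's internals, and the `scores` values are hypothetical.

```r
# Hypothetical raw deprivation scores for five areas (illustrative values only)
scores <- c(-1.2, 0.4, 2.1, -0.3, 0.9)

# Percentile rank: lowest score maps to 0, highest to 100
n <- length(scores)
pct_rank <- (rank(scores, ties.method = "min") - 1) / (n - 1) * 100

pct_rank
#> [1]   0  50 100  25  75
```

With this formulation the least deprived area always receives 0 and the most deprived receives 100, which matches the 0.00 to 100.00 range seen in the map break labels later in this document.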
To use deprivation scores as categorical variables in statistical models, you can split them into quantiles:
```r
# split NDI into quartiles
ndi_results <- dep_quantiles(
  ndi_results,
  source_var = NDI_M,
  new_var = ndi_quartile,
  n = 4L,
  return = "label"
)

# view the distribution
table(ndi_results$ndi_quartile)
#>
#>  (1) Lowest Quartile (2) Second Quartile  (3) Third Quartile
#>                   29                  29                  28
#> (4) Highest Quartile
#>                   29
```

To create choropleth maps, use `dep_map_breaks()` to calculate appropriate classification breaks:
```r
# calculate Fisher-Jenks breaks with 5 classes
ndi_results <- dep_map_breaks(
  ndi_results,
  var = "NDI_M",
  new_var = "map_class",
  classes = 5,
  style = "fisher"
)

# view the break labels
levels(ndi_results$map_class)
#> [1] "0.00 - 19.74"   "19.75 - 39.91"  "39.92 - 60.09"  "60.10 - 80.26"
#> [5] "80.27 - 100.00"
```

You can also specify manual breaks:
```r
# define custom break points
my_breaks <- c(
  min(ndi_results$NDI_M, na.rm = TRUE),
  25, 50, 75,
  max(ndi_results$NDI_M, na.rm = TRUE)
)

# apply manual breaks
ndi_results <- dep_map_breaks(
  ndi_results,
  var = "NDI_M",
  new_var = "map_class_manual",
  breaks = my_breaks
)

levels(ndi_results$map_class_manual)
#> [1] "0.00 - 25.00"   "25.01 - 50.00"  "50.01 - 75.00"  "75.01 - 100.00"
```

When you have a Census API key configured, `dep_get_index()` handles the full workflow of downloading raw data and computing indices in one step:
```r
# download and calculate SVI for Missouri tracts
mo_svi <- dep_get_index(
  geography = "tract",
  index = "svi20",
  year = 2020,
  state = "MO"
)
```

You can request multiple indices in a single call:
```r
# calculate ADI and Gini together for Missouri counties
mo_multi <- dep_get_index(
  geography = "county",
  index = c("adi", "gini"),
  year = 2022,
  state = "MO"
)
```

Set `output = "sf"` to get results as an sf object with geometry attached, ready for mapping with ggplot2 or leaflet:
```r
# get SVI with geometry for mapping
mo_svi_sf <- dep_get_index(
  geography = "tract",
  index = "svi20",
  year = 2020,
  state = "MO",
  output = "sf"
)

# plot with ggplot2
library(ggplot2)

ggplot(mo_svi_sf) +
  geom_sf(aes(fill = SVI20), color = NA) +
  scale_fill_viridis_c(direction = -1) +
  theme_void() +
  labs(title = "Social Vulnerability Index, Missouri Tracts (2020)")
```

For deeper analysis, you can retain subscales and the underlying component variables:
```r
# keep SVI theme subscales and all component variables
mo_detailed <- dep_get_index(
  geography = "county",
  index = "svi20",
  year = 2020,
  state = "MO",
  keep_subscales = TRUE,
  keep_components = TRUE
)
```

For more control, you can separate data retrieval from calculation. This is useful when you want to inspect or modify the raw data before computing scores:
```r
# step 1: build the variable list and download data
library(tidycensus)

vars <- dep_build_varlist(
  geography = "county",
  index = "ndi_m",
  year = 2022
)

raw_data <- get_acs(
  geography = "county",
  variables = vars,
  year = 2022,
  state = "MO",
  output = "wide"
)

# step 2: calculate the index on your data
results <- dep_calc_index(
  raw_data,
  geography = "county",
  index = "ndi_m",
  year = 2022
)
```

| Function | Purpose |
|---|---|
| `dep_get_index()` | Download data and calculate indices (one step) |
| `dep_calc_index()` | Calculate indices on existing data |
| `dep_build_varlist()` | Get the Census variable names needed for an index |
| `dep_sample_data()` | Load bundled sample data (no API key required) |
| `dep_quantiles()` | Split scores into quantile categories |
| `dep_percentiles()` | Calculate percentile ranks |
| `dep_map_breaks()` | Create classification breaks for choropleth maps |