Downloads raw data and then calculates various measures of
deprivation and/or vulnerability, with a range of options for structuring
output. The measures include four versions of the CDC's social vulnerability
index, which is a unique offering, along with wrappers that bring in
additional measures from related packages: the area deprivation index
(ADI; via sociome), the Gini coefficient (via tidycensus), and
the neighborhood deprivation index (NDI; via ndi). Both ADI and NDI
contain variations as well. See Details for more information.
dep_get_index(geography, index, year, survey = "acs5",
return_percentiles = FALSE, keep_subscales = FALSE,
keep_components = FALSE, output = "wide",
state = NULL, county = NULL, puerto_rico = FALSE, zcta = NULL,
zcta_geo_method = NULL, zcta_cb = FALSE, zcta3_method = NULL,
shift_geo = FALSE, key = NULL)
geography
A character scalar; one of "county", "zcta3", "zcta5", or "tract".
index
A character scalar or vector listing deprivation measures
to return. These include the area deprivation index ("adi"),
the Gini coefficient ("gini"), two versions of the neighborhood
deprivation index, by Messer ("ndi_m") and by Powell and Wiley
("ndi_pw"), and four versions of the social vulnerability
index ("svi10", "svi14", "svi20", and "svi20s").
See Details.
year
A numeric scalar or vector. 2010 is the earliest year deprivateR
supports, while 2022 is the most recent year.
survey
A character scalar representing the Census product. It can
be any American Community Survey product (either "acs1",
"acs3", or "acs5"). Note that "acs3" was
discontinued after 2013.
return_percentiles
A logical scalar; if TRUE, scales
(and their subscales) will be returned as percentiles instead of
raw scores. If FALSE (default), raw scores will be returned. Note
that SVI is returned as a percentile regardless of what
return_percentiles is set to.
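To make the raw-score versus percentile distinction concrete, here is a minimal base R sketch of one common percentile-rank formula, (rank - 1) / (N - 1). Whether deprivateR uses this exact formula, and whether it reports percentiles on a 0 to 100 or a 0 to 1 scale, is an assumption here rather than something this page states. The raw values are the first five county ADI scores from the example output.

```r
# Raw ADI scores for five counties (taken from the example output on this page)
raw <- c(88.5, 84.1, 137, 124, 109)

# Assumed formula: percentile rank = (rank - 1) / (N - 1), scaled to 0-100
pct <- (rank(raw, ties.method = "min") - 1) / (length(raw) - 1) * 100

data.frame(raw = raw, percentile = pct)
```

Because the percentile depends on the set of jurisdictions being ranked, the same raw score can map to a different percentile when the comparison set changes.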
keep_subscales
A logical scalar; if FALSE (default), only the
full ADI and/or SVI scores (depending on what is passed to the index
argument) will be returned. If TRUE and "svi" is listed for
the index argument, the four SVI "themes" (see Details) will be
returned along with the full SVI score. Similarly, if "adi" is
listed for the index argument, the three ADI subscales (see Details)
will be returned.
keep_components
A logical scalar; if FALSE (default), none of
the components used to calculate the deprivation measures will be returned.
If TRUE, all of the demographic variables used to calculate ADI
and/or SVI will be returned.
output
A character scalar; if "wide" (default), a tibble
will be returned with one row per jurisdiction and individual measures of
deprivation stored in columns. If "tidy", a tibble will be returned
with one row for each combination of jurisdiction and deprivation measure.
If "sf", a "wide" data set will be returned with geometric data
appended to facilitate mapping and/or spatial statistics.
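The difference between the "wide" and "tidy" layouts can be illustrated with a toy data frame in base R. This is only a sketch of the two shapes: the `measure` and `value` column names are hypothetical, the Gini values are made up for illustration, and the actual column names returned by `dep_get_index()` may differ.

```r
# "Wide" layout: one row per jurisdiction, one column per measure
# (ADI values from the example output; GINI values are illustrative only)
wide <- data.frame(
  GEOID = c("01001", "01003"),
  ADI   = c(88.5, 84.1),
  GINI  = c(0.45, 0.46),
  stringsAsFactors = FALSE
)

# "Tidy" layout: one row per jurisdiction-measure combination
measures <- setdiff(names(wide), "GEOID")
tidy <- do.call(rbind, lapply(measures, function(m) {
  data.frame(GEOID = wide$GEOID, measure = m, value = wide[[m]],
             stringsAsFactors = FALSE)
}))
tidy <- tidy[order(tidy$GEOID, tidy$measure), ]
rownames(tidy) <- NULL

tidy
```

The tidy layout tends to suit plotting and grouped summaries, while the wide layout is convenient for joining onto other jurisdiction-level tables.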
state
A character scalar or vector with character state abbreviations
(e.g. "MO") or numeric FIPS codes (e.g. 29).
county
A character scalar or vector with character GEOIDs (e.g.
"29510").
puerto_rico
A logical scalar; if TRUE, data for Puerto
Rico will be included in calculations. If FALSE (default), Puerto Rico will
not be included.
zcta
An optional vector of ZCTAs that demographic data are requested
for. If this is NULL and geography = "zcta5", data will be
returned for all ZCTAs. If a vector is supplied and geography = "zcta5",
only data for those requested ZCTAs will be returned. The vector can be created
with zippeR::zi_get_geometry() and should only contain five-digit ZCTAs.
zcta_geo_method
A character scalar; if geography = "zcta5" or
geography = "zcta3", either "intersect" or "centroid"
should be supplied. These two options alter how ZCTA overlap with states
or counties is defined. See zippeR::zi_get_geometry() for more
information.
zcta_cb
A logical scalar; if FALSE, the most detailed TIGER/Line
data will be used for geography = "zcta5". If TRUE, a
generalized (1:500k) version of the data will be used. The generalized
data will download significantly faster, though they show less detail.
According to the tigris::zctas() documentation, the download size
if TRUE is ~65MB while it is ~500MB if zcta_cb = FALSE.
This argument does not apply to geography = "zcta3", which only returns
generalized data. It only applies if output = "sf".
zcta3_method
A character scalar; if geography = "zcta3", a
method for aggregating spatially intensive values should be given;
either "mean" or "median". In either case, a weighted approach
is used where total population for each five-digit ZCTA is used to calculate
individual ZCTAs' weights. For American Community Survey data, this is
applied to the margin of error as well.
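The population-weighted aggregation from five-digit to three-digit ZCTAs can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the populations and values are made up, `weighted_median()` is a hypothetical helper, and it uses the "lower" weighted median (the smallest value whose cumulative weight share reaches 50%), which may differ from the definition used internally.

```r
# Toy five-digit ZCTA data (populations and values are made up)
zcta5 <- data.frame(
  GEOID = c("63101", "63102", "63103", "65201", "65202"),
  pop   = c(1000, 3000, 6000, 2000, 2000),
  value = c(10, 20, 40, 30, 50),
  stringsAsFactors = FALSE
)

# A three-digit ZCTA is the first three digits of the five-digit code
zcta5$zcta3 <- substr(zcta5$GEOID, 1, 3)

# Hypothetical helper: smallest value whose cumulative weight share is >= 0.5
weighted_median <- function(x, w) {
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  x[which(cumsum(w) / sum(w) >= 0.5)[1]]
}

groups <- split(zcta5, zcta5$zcta3)
agg <- data.frame(
  zcta3   = names(groups),
  wmean   = vapply(groups, function(g) weighted.mean(g$value, g$pop), numeric(1)),
  wmedian = vapply(groups, function(g) weighted_median(g$value, g$pop), numeric(1)),
  row.names = NULL
)

agg
```

In this toy data the two methods disagree for both three-digit areas, which shows why the choice between "mean" and "median" matters when a few populous ZCTAs dominate.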
shift_geo
A logical scalar; if TRUE, Alaska, Hawaii, and Puerto Rico
will be re-positioned so that they lie to the southwest of the continental
United States. This defaults to FALSE, and can only be used when
states are not listed for the state argument. It only applies if
output = "sf".
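As a sketch of how `output = "sf"` and `shift_geo` combine, the wrapper below requests county-level ADI as spatial data with Alaska, Hawaii, and Puerto Rico re-positioned. `county_adi_sf` is a hypothetical helper name, not part of deprivateR. It only uses arguments listed in the usage above, and it leaves `state` unset because `shift_geo` cannot be combined with it.

```r
# Hypothetical wrapper; defining it does not contact the Census API,
# but calling it does (and requires deprivateR to be installed).
county_adi_sf <- function(year, shift = TRUE) {
  deprivateR::dep_get_index(
    geography = "county",
    index = "adi",
    year = year,
    output = "sf",      # wide data with geometry appended
    shift_geo = shift   # only valid when `state` is not supplied
  )
}

# Example call (not run here because it needs network access):
# adi_map <- county_adi_sf(2020)
```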
key
A Census API key, which can be obtained at
https://api.census.gov/data/key_signup.html. This can be omitted if
tidycensus::census_api_key() has been used to write your key to
your .Renviron file. You can check whether an API key has been
written to .Renviron by using Sys.getenv("CENSUS_API_KEY").
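The check described above can be wrapped in a small helper. This is a minimal sketch: `has_census_key` is a hypothetical function name, and it only tests whether the `CENSUS_API_KEY` environment variable is non-empty, not whether the key is valid.

```r
# Hypothetical helper: TRUE if CENSUS_API_KEY is set to a non-empty string
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

if (has_census_key()) {
  message("Census API key found; the key argument can be omitted.")
} else {
  message("No Census API key found; supply one via the key argument.")
}
```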
deprivateR provides a unique implementation of the Centers
for Disease Control's Social Vulnerability Index for a greater range
of years and geographies than the CDC originally supported. Four versions
of the SVI are offered:
"svi10"
The CDC's 2010 SVI vintage did not include a measure
of civilians with a disability, unlike their later vintages. This version
can be calculated using deprivateR for each year from 2010 through
2021.
"svi14"
The CDC's 2014, 2016, and 2018 vintages added the
measure of civilians with a disability to their SVI calculations. The
disability measure was added to the American Community Survey beginning
in 2012, so this version can be calculated using deprivateR for
each year from 2012 through 2021.
"svi20"
The CDC's 2020 vintage made multiple substantive
changes to how SVI is calculated that changed the underlying data
used for the first three of the four themes. In the SES theme: (1) per
capita income was replaced with a measure of housing burden; (2) poverty
was converted to 150% of the poverty level; and (3) a measure of health
insurance coverage was added. The Household Composition & Disability (HCD) theme was renamed
Household Characteristics (HOU), and the English language proficiency measure
was moved here from the former Minority Status and Language (MSL) theme.
Since the English language measure was removed from the MSL theme, that theme
was renamed Racial & Ethnic Minority Status (REM). Though the CDC released
this definition with their 2020 data, the underlying data can be
accessed from the American Community Survey from 2012 onward. This means
that this version can be calculated using deprivateR for
each year from 2012 through 2021.
"svi20s"
The CDC's 2020 vintage changed the variables
used to calculate the number of single-parent households. Their new
approach does not have the backward compatibility that the other
changes made in 2020 do. This version of SVI uses the same underlying
data for single-parent households that the CDC's 2020 vintage does,
along with the other changes made in 2020. This version can be
calculated using deprivateR for each year from 2012 through
2019.
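The year ranges listed for the four SVI versions can be collected into a small lookup for checking a request before making it. This is a sketch that transcribes the ranges stated in these Details; `svi_years` and `svi_year_ok` are hypothetical names, and the package itself may accept or reject years differently.

```r
# Year ranges as stated in the Details above (transcribed, not queried from the package)
svi_years <- list(
  svi10  = 2010:2021,
  svi14  = 2012:2021,
  svi20  = 2012:2021,
  svi20s = 2012:2019
)

# Hypothetical helper: is `year` within the documented range for `index`?
svi_year_ok <- function(index, year) {
  stopifnot(index %in% names(svi_years))
  year %in% svi_years[[index]]
}

svi_year_ok("svi10", 2010)  # within the documented range
svi_year_ok("svi14", 2011)  # before the disability measure was available
```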
In addition, wrappers to the sociome, ndi, and tidycensus
packages create a single point of departure for comparative work using multiple
measures of deprivation or inequality.
# \donttest{
# calculate ADI for all US counties
dep_get_index(geography = "county", index = "adi", year = 2022)
#> Warning: • You have not set a Census API key. Users without a key are limited to 500
#> queries per day and may experience performance limitations.
#> ℹ For best results, get a Census API key at
#> http://api.census.gov/data/key_signup.html and then supply the key to the
#> `census_api_key()` function to use it throughout your tidycensus session.
#> This warning is displayed once per session.
#> Warning:
#> The variables C24010_039 and C24010_040 are both present.
#> C24010_039 will be used for "civilian females age 16+ in
#> white-collar occupations", which is incorrect for pre-2010 data.
#> If seeking pre-2010 estimates, remove C24010_039 from dataset.
#>
#> Single imputation performed
#> # A tibble: 3,144 × 3
#> GEOID NAME ADI
#> <chr> <chr> <dbl>
#> 1 01001 Autauga County, Alabama 88.5
#> 2 01003 Baldwin County, Alabama 84.1
#> 3 01005 Barbour County, Alabama 137.
#> 4 01007 Bibb County, Alabama 124.
#> 5 01009 Blount County, Alabama 109.
#> 6 01011 Bullock County, Alabama 147.
#> 7 01013 Butler County, Alabama 122.
#> 8 01015 Calhoun County, Alabama 115.
#> 9 01017 Chambers County, Alabama 117.
#> 10 01019 Cherokee County, Alabama 107.
#> # ℹ 3,134 more rows
# calculate two forms of SVI for all Missouri ZCTAs
dep_get_index(geography = "zcta5", index = c("svi20", "svi20s"), year = 2022,
state = "MO")
#> # A tibble: 33,642 × 3
#> GEOID SVI_20 SVI_20S
#> <chr> <dbl> <dbl>
#> 1 01001 0.771 0.766
#> 2 01002 0.676 0.691
#> 3 01003 0.206 0.211
#> 4 01005 0.215 0.202
#> 5 01007 0.362 0.379
#> 6 01008 0.0468 0.0501
#> 7 01009 0.143 0.154
#> 8 01010 0.240 0.202
#> 9 01011 0.361 0.376
#> 10 01012 0.222 0.147
#> # ℹ 33,632 more rows
# calculate ADI and one form of SVI for all US counties over three years
# percentiles are returned to ease comparison
dep_get_index(geography = "county", index = c("adi", "svi14"),
year = c(2018:2020), return_percentiles = TRUE)
#>
#> Single imputation performed
#>
#> Single imputation performed
#>
#> Single imputation performed
#> # A tibble: 9,427 × 5
#> GEOID NAME YEAR SVI ADI
#> <chr> <chr> <int> <dbl> <dbl>
#> 1 01001 Autauga County, Alabama 2018 44.2 28.3
#> 2 01001 Autauga County, Alabama 2019 38.8 31.6
#> 3 01001 Autauga County, Alabama 2020 47.5 35.1
#> 4 01003 Baldwin County, Alabama 2018 22.7 16.6
#> 5 01003 Baldwin County, Alabama 2019 24.1 16.3
#> 6 01003 Baldwin County, Alabama 2020 22.9 15.5
#> 7 01005 Barbour County, Alabama 2018 99.6 96.0
#> 8 01005 Barbour County, Alabama 2019 99.7 97.2
#> 9 01005 Barbour County, Alabama 2020 99.5 97.0
#> 10 01007 Bibb County, Alabama 2018 59.5 65.9
#> # ℹ 9,417 more rows
# }