This function takes five-digit ZCTA data and aggregates it to three-digit ZCTAs, which cover considerably larger areas. These regions are sometimes used in American health care contexts for publishing geographic identifiers.
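As a rough illustration of how the two geographies relate, a three-digit area can be thought of as the first three characters of a five-digit ZCTA. The sketch below uses base R only and does not call zippeR. The vector name zcta5 and its codes are illustrative, and the prefix rule is stated here as an assumption rather than as the package's internal implementation.

```r
# illustrative five-digit ZCTAs, stored as character to preserve leading zeros
zcta5 <- c("63101", "63103", "63005", "65201")

# assumption: the three-digit area is the first three characters of the code
zcta3 <- substr(zcta5, 1, 3)

# count how many five-digit ZCTAs fall into each three-digit area
table(zcta3)
```

Keeping the codes as character rather than numeric matters for areas whose codes begin with zero.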
Usage
zi_aggregate(.data, year, extensive = NULL, intensive = NULL,
  intensive_method = "mean", survey, output = "tidy", zcta = NULL,
  key = NULL)
Arguments
- .data
A tidy set of demographic data containing one or more variables that should be aggregated to three-digit ZCTAs. This data frame or tibble should contain all five-digit ZCTAs within the three-digit ZCTAs that you plan to use for aggregating data. See Details below for formatting requirements.
- year
A four-digit numeric scalar for year. zippeR currently supports data from 2010 to 2022. Different survey products are available for different years. See the survey parameter for more details.
- extensive
A character scalar or vector listing all extensive (i.e. count data) variables you wish to aggregate. These will be summed. For American Community Survey data, the margin of error will be calculated by taking the square root of the summed, squared margins of error for each five-digit ZCTA within a given three-digit ZCTA.
- intensive
A character scalar or vector listing all intensive (i.e. ratio, percent, or median data) variables you wish to aggregate. These will be combined using the approach listed for intensive_method.
- intensive_method
A character scalar; either "mean" (default) or "median". In either case, a weighted approach is used where total population for each five-digit ZCTA is used to calculate individual ZCTAs' weights. For American Community Survey data, this is applied to the margin of error as well.
- survey
A character scalar representing the Census product. It can be either a Decennial Census product (either "sf1" or "sf3") or an American Community Survey product (either "acs1", "acs3", or "acs5"). For Decennial Census calls, only the 2010 Census is available. In addition, if a variable cannot be found in "sf1", the function will look in "sf3". Also note that "acs3" was discontinued after 2013.
- output
A character scalar; one of "tidy" (long output) or "wide" depending on the type of data format you want. If you are planning to join these data with geometric data, "wide" is the strongly encouraged format.
- zcta
An optional vector of ZCTAs that demographic data are requested for. If this is NULL, data will be returned for all ZCTAs. If a vector is supplied, only data for those requested ZCTAs will be returned. The vector can be created with zi_get_geometry(). If style = "zcta5", this vector should be made up of five-digit GEOID values. If style = "zcta3", this vector should be made up of three-digit ZCTA3 values.
- key
A Census API key, which can be obtained at https://api.census.gov/data/key_signup.html. This can be omitted if tidycensus::census_api_key() has been used to write your key to your .Renviron file. You can check whether an API key has been written to .Renviron by using Sys.getenv("CENSUS_API_KEY").
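The arithmetic described for the extensive and intensive arguments can be sketched by hand in base R. This is a minimal illustration, not zippeR's implementation: the data frame, its column names, and its values are made up, and only the "mean" method is shown. The weighted median and the weighting of margins of error for intensive variables are omitted because their exact computation is not spelled out above.

```r
# made-up estimates for three five-digit ZCTAs within one three-digit ZCTA;
# column names are illustrative and not zippeR's required input format
zcta <- data.frame(
  pop     = c(1000, 3000, 6000),    # extensive: total population
  pop_moe = c(30, 40, 120),         # margin of error for pop
  income  = c(40000, 50000, 70000)  # intensive: median household income
)

# extensive variables: sum the estimates
pop_total <- sum(zcta$pop)

# margin of error: square root of the summed, squared margins of error
pop_total_moe <- sqrt(sum(zcta$pop_moe^2))

# intensive variables with a mean: population-weighted mean across ZCTAs
income_mean <- weighted.mean(zcta$income, w = zcta$pop)
```

The weighted mean lets the most populous five-digit ZCTA contribute the most to the three-digit value, which is why total population needs to be available for every ZCTA being combined.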
Examples
# load sample demographic data
mo22_demos <- zi_mo_pop
# the above data can be replicated with the following code:
# zi_get_demographics(year = 2022, variables = c("B01003_001", "B19013_001"),
#   survey = "acs5")
# load sample geometric data
mo22_zcta3 <- zi_mo_zcta3
# the above data can be replicated with the following code:
# zi_get_geometry(year = 2022, style = "zcta3", state = "MO",
#   method = "intersect")
# aggregate a single variable
zi_aggregate(mo22_demos, year = 2020, extensive = "B01003_001", survey = "acs5",
  zcta = mo22_zcta3$ZCTA3)
#> # A tibble: 62 × 4
#> # Groups: ZCTA3 [31]
#> ZCTA3 variable estimate moe
#> <chr> <chr> <dbl> <dbl>
#> 1 501 B01003_001 190606 1928.
#> 2 501 B19013_001 4927943 224587.
#> 3 516 B01003_001 22779 649.
#> 4 516 B19013_001 1326659 128553.
#> 5 525 B01003_001 109992 1607.
#> 6 525 B19013_001 2965635 153947.
#> 7 526 B01003_001 101629 1496.
#> 8 526 B19013_001 2187815 116391.
#> 9 620 B01003_001 296331 2842.
#> 10 620 B19013_001 4623927 163752.
#> # ℹ 52 more rows
# \donttest{
# aggregate multiple variables, outputting wide data
zi_aggregate(mo22_demos, year = 2020,
  extensive = "B01003_001", intensive = "B19013_001", survey = "acs5",
  zcta = mo22_zcta3$ZCTA3, output = "wide")
#> Warning: • You have not set a Census API key. Users without a key are limited to 500
#> queries per day and may experience performance limitations.
#> ℹ For best results, get a Census API key at
#> http://api.census.gov/data/key_signup.html and then supply the key to the
#> `census_api_key()` function to use it throughout your tidycensus session.
#> This warning is displayed once per session.
#> # A tibble: 31 × 5
#> # Groups: ZCTA3 [31]
#> ZCTA3 B01003_001E B01003_001M B19013_001E B19013_001M
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 501 190606 1928. 74666. 19684.
#> 2 516 22779 649. 69824. 21896.
#> 3 525 109992 1607. 63099. 17528.
#> 4 526 101629 1496. 70575. 16179.
#> 5 620 296331 2842. 66056. 16494.
#> 6 630 736536 7216. 84735. 12280.
#> 7 631 901642 8313. 69807. 9031
#> 8 633 536266 5361. 78343. 16652
#> 9 634 67129 1583. 60690. 21671.
#> 10 635 60410 1185. 54968. 17621.
#> # ℹ 21 more rows
# }
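The warning in the output above appears when no Census API key has been set. A minimal way to check for a key before calling zi_aggregate() is sketched below. It relies only on base R, and the object names key and has_key are illustrative.

```r
# read the environment variable that tidycensus::census_api_key() can write
# to .Renviron; Sys.getenv() returns "" when the variable is unset
key <- Sys.getenv("CENSUS_API_KEY")

# TRUE when a non-empty key is available in this session
has_key <- nzchar(key)

if (has_key) {
  message("Census API key found.")
} else {
  message("No Census API key set; supply one through the key argument.")
}
```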