The output from zi_load_crosswalk() for HUD data requires additional processing to be used in the zi_crosswalk() function.
This function prepares the HUD data for use in joins.
Arguments
- .data
The output from zi_load_crosswalk() with HUD data.
- by
Character scalar; the column name to use for identifying the best match for a given ZIP Code. This can be one of "residential", "commercial", or "total".
- return_max
Logical scalar; if TRUE (default), only the county with the highest proportion of the ZIP Code type will be returned. If the ZIP Code straddles two states, two records will be returned. If FALSE, all records for the ZIP Code will be returned. Where a tie exists (i.e. two counties each contain half of all addresses), the county with the lowest GEOID value will be returned.
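The selection rule described for return_max = TRUE can be illustrated with a small base R sketch. This is not the package's implementation: the data frame below is made up, the column names zip5, geoid, and ratio are borrowed from the example output, and the two-state case is left out for brevity.

```r
# Toy crosswalk with hypothetical values (not real HUD data)
xwalk <- data.frame(
  zip5  = c("00001", "00001", "00002", "00002", "00003"),
  geoid = c("29189", "29099", "29071", "29055", "29510"),
  ratio = c(0.75, 0.25, 0.50, 0.50, 1.00),
  stringsAsFactors = FALSE
)

# Sort by ZIP, then by descending ratio, then by ascending GEOID, so the
# first row within each ZIP has the largest share (lowest GEOID on a tie)
ordered <- xwalk[order(xwalk$zip5, -xwalk$ratio, xwalk$geoid), ]

# Keep only the first row for each ZIP
best <- ordered[!duplicated(ordered$zip5), ]
rownames(best) <- NULL
best
```

ZIP 00002 is split evenly between two counties here, so the row with the lower GEOID is the one that survives.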
Examples
# load sample crosswalk data
mo_xwalk <- zi_mo_hud
# the above data can be replicated with the following code:
# zi_load_crosswalk(zip_source = "HUD", year = 2023, qtr = 1,
# target = "COUNTY", query = "MO")
# prep crosswalk
# when a ZIP Code crosses county boundaries, the portion with the largest
# number of residential addresses will be returned
zi_prep_hud(mo_xwalk, by = "residential", return_max = TRUE)
#> # A tibble: 1,127 × 5
#>    zip5  geoid state state_fips ratio
#>    <chr> <chr> <chr> <chr>      <dbl>
#>  1 63005 29189 MO    29         0.997
#>  2 63006 29189 MO    29         1
#>  3 63010 29099 MO    29         1
#>  4 63011 29189 MO    29         1
#>  5 63012 29099 MO    29         1
#>  6 63013 29071 MO    29         1
#>  7 63014 29071 MO    29         0.871
#>  8 63015 29071 MO    29         0.762
#>  9 63016 29099 MO    29         1
#> 10 63017 29189 MO    29         1
#> # ℹ 1,117 more rows
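To show why one row per ZIP Code is convenient for joins, here is a rough base R sketch that attaches county GEOIDs to ZIP-level records. It uses merge() as a stand-in rather than zi_crosswalk(); the records data frame and its count column are hypothetical, and the three crosswalk rows are copied from the output above.

```r
# Three rows copied from the prepped crosswalk output above
xwalk <- data.frame(
  zip5  = c("63005", "63010", "63013"),
  geoid = c("29189", "29099", "29071"),
  stringsAsFactors = FALSE
)

# Hypothetical ZIP-level records to assign to counties
records <- data.frame(
  zip5  = c("63005", "63005", "63013", "99999"),
  count = c(10, 5, 2, 1),
  stringsAsFactors = FALSE
)

# Left join: every record is kept, and ZIPs missing from the crosswalk
# get an NA geoid
joined <- merge(records, xwalk, by = "zip5", all.x = TRUE)

# Sum counts by county; the formula interface drops rows with NA geoid
county_totals <- aggregate(count ~ geoid, data = joined, FUN = sum)
county_totals
```

Because each ZIP Code appears only once in the crosswalk, the join cannot duplicate records, which is the main reason to prepare the data with return_max = TRUE before joining.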