The output from zi_load_crosswalk() for HUD data requires additional processing before it can be used in the zi_crosswalk() function. This function prepares the HUD data for use in joins.

Usage

zi_prep_hud(.data, by, return_max = TRUE)

Arguments

.data

The output from zi_load_crosswalk() with HUD data.

by

Character scalar; the column name to use for identifying the best match for a given ZIP Code. This must be one of "residential", "commercial", or "total".

return_max

Logical scalar; if TRUE (default), only the county containing the highest proportion of the ZIP Code's addresses, for the address type given in by, will be returned. Where a tie exists (i.e. two counties each contain half of all addresses), the county with the lowest GEOID value will be returned. If the ZIP Code straddles two states, two records will be returned. If FALSE, all records for the ZIP Code will be returned.
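
The selection rule described for return_max = TRUE can be illustrated with a minimal sketch. This is not the package's internal implementation; it assumes dplyr is installed, uses a made-up toy table whose column names mirror the example output on this page, and treats a ZIP Code and state pair as the unit within which one record is kept. The ZIP Codes and ratios are hypothetical.

```r
library(dplyr)

# hypothetical toy crosswalk; values are invented for illustration only
toy <- tibble(
  zip5  = c("00001", "00001", "00002", "00002", "00003"),
  geoid = c("29189", "29099", "29071", "29073", "29510"),
  state = "MO",
  ratio = c(0.8, 0.2, 0.5, 0.5, 1)
)

# within each ZIP Code and state, sort by highest ratio first and then by
# lowest GEOID, so that ties resolve to the lowest GEOID; keep the first row
best <- toy %>%
  group_by(zip5, state) %>%
  arrange(desc(ratio), geoid, .by_group = TRUE) %>%
  slice_head(n = 1) %>%
  ungroup()

best
```

In this sketch, ZIP Code 00002 is split evenly between two counties, so the record with the lower GEOID (29071) is retained. GEOIDs are compared as character strings here, which is safe only because all values have the same width.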

Value

A tibble that has been further prepared for use as a crosswalk.

Examples

# load sample crosswalk data
mo_xwalk <- zi_mo_hud

  # the above data can be replicated with the following code:
  # zi_load_crosswalk(zip_source = "HUD", year = 2023, qtr = 1,
  #   target = "COUNTY", query = "MO")

# prep crosswalk
# when a ZIP Code crosses county boundaries, the county containing the
# largest share of its residential addresses will be returned
zi_prep_hud(mo_xwalk, by = "residential", return_max = TRUE)
#> # A tibble: 1,127 × 5
#>    zip5  geoid state state_fips ratio
#>    <chr> <chr> <chr> <chr>      <dbl>
#>  1 63005 29189 MO    29         0.997
#>  2 63006 29189 MO    29         1    
#>  3 63010 29099 MO    29         1    
#>  4 63011 29189 MO    29         1    
#>  5 63012 29099 MO    29         1    
#>  6 63013 29071 MO    29         1    
#>  7 63014 29071 MO    29         0.871
#>  8 63015 29071 MO    29         0.762
#>  9 63016 29099 MO    29         1    
#> 10 63017 29189 MO    29         1    
#> # ℹ 1,117 more rows
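
For comparison, the call below sets return_max = FALSE so that every record for each ZIP Code is kept rather than only the best match. It is a sketch that assumes the zippeR package is installed and that the zi_mo_hud sample data are available; the output is not shown because it has not been reproduced here.

```r
library(zippeR)

# keep all ZIP Code to county records instead of only the best match
all_matches <- zi_prep_hud(zi_mo_hud, by = "residential", return_max = FALSE)

# inspect the first few records
head(all_matches)
```

Keeping all records is useful when addresses should be apportioned across counties rather than assigned to a single one.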