Compute representative points for census tracts
Source: R/representative-points.R
Computes a single representative point for each census tract polygon. Three methods are available: geometric point-on-surface (default), centroid, or population-density-weighted using WorldPop raster data.
Usage
compute_representative_points(
  tracts_sf,
  method = c("point_on_surface", "centroid", "pop_weighted"),
  pop_raster = NULL,
  min_area_for_pop_weight = 1,
  tract_id = "id",
  osm_roads = NULL,
  verbose = TRUE
)
Arguments
- tracts_sf
An sf polygon object containing census tract geometries.
- method
Character. Method for computing representative points:
  - "point_on_surface"
    (Default) Uses sf::st_point_on_surface(). Guarantees the point falls inside the polygon, unlike centroids, which can fall outside concave shapes.
  - "centroid"
    Uses sf::st_centroid(). Classic geometric centroid. May fall outside concave polygons.
  - "pop_weighted"
    Uses a population density raster (WorldPop Constrained 2020 by default) to find the best representative point within each tract via cluster analysis and optional road proximity. Only applied to tracts with area >= min_area_for_pop_weight; smaller tracts use point_on_surface. Requires the terra package.
- pop_raster
A terra::SpatRaster object, a file path to a GeoTIFF, or NULL. Population density raster for method = "pop_weighted". If NULL (default), the WorldPop Brazil Constrained 2020 raster (~48 MB) is downloaded automatically and cached. Ignored for other methods.
- min_area_for_pop_weight
Numeric. Minimum tract area in km² for applying the population-weighted method. Tracts smaller than this threshold use point_on_surface instead. Default: 1 (km²). Only used when method = "pop_weighted".
- tract_id
Character. Name of the ID column in tracts_sf. Default: "id".
- osm_roads
An sf object with OSM road geometries, or NULL. When provided and method = "pop_weighted", cells within 200m of roads are preferred, using a tiered hierarchy: Tier 1 (primary, secondary, tertiary, residential) > Tier 2 (unclassified, service) > Tier 3 (track, path, footway). The algorithm tries the highest tier first and falls back to the next tier only if no cells are within 200m of that tier's roads. Typically read from the clipped PBF file by compute_travel_times().
- verbose
Logical. Print progress messages? Default: TRUE.
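The tiered fallback described for osm_roads can be sketched in base R. This is only an illustration of the stated rule, not the package's implementation: the helper pick_tier_cells(), the toy data frame and its column names (cell, highway, dist_m), the inclusive 200 m cutoff, and the behaviour when no tier matches are all assumptions made for the example.

```r
# Road tiers as listed for osm_roads, highest priority first.
road_tiers <- list(
  tier1 = c("primary", "secondary", "tertiary", "residential"),
  tier2 = c("unclassified", "service"),
  tier3 = c("track", "path", "footway")
)

# Hypothetical helper: `cell_road_dist` has one row per (cell, road) pair,
# with the road's OSM highway type and the distance to the cell in metres.
pick_tier_cells <- function(cell_road_dist, max_dist = 200) {
  for (tier in names(road_tiers)) {
    near <- cell_road_dist$highway %in% road_tiers[[tier]] &
      cell_road_dist$dist_m <= max_dist
    # Stop at the first (highest) tier that has any cell within range.
    if (any(near)) {
      return(list(tier = tier, cells = unique(cell_road_dist$cell[near])))
    }
  }
  # Assumption: with no roads in range, keep every cell as a candidate.
  list(tier = NA_character_, cells = unique(cell_road_dist$cell))
}

# Toy input: the only Tier 1 road is 450 m away, so Tier 2 is used.
toy <- data.frame(
  cell    = c(1, 1, 2, 3),
  highway = c("track", "primary", "service", "path"),
  dist_m  = c(50, 450, 120, 30)
)
res <- pick_tier_cells(toy)
print(res$tier)
print(res$cells)
```

In the toy input, cells 1 and 3 sit close to Tier 3 roads, but they are only considered if no cell qualifies under Tier 1 or Tier 2.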
Value
An sf POINT object in WGS84 (EPSG:4326) with one row per tract, preserving the tract_id column. Carries attributes:
- "point_method"
Character. Which method was used.
- "pop_raster"
The SpatRaster object (if pop_weighted).
- "no_pop_tracts"
Character vector of tract IDs that fell back to point_on_surface due to zero population.
- "pop_weighted_diagnostics"
Named list of per-tract diagnostic records (cluster counts, patch populations, selected point, road proximity info, road tier used). See plot_representative_points().
Details
The pop_weighted method uses cluster-based selection: it identifies
connected clusters of populated cells via terra::patches() and selects
the largest cluster by total population. If OSM road data is available
(via osm_roads), it applies tiered road proximity refinement:
cells within 200m of well-connected road types (primary, secondary,
tertiary, residential) are preferred over cells near isolated rural
roads (tracks, paths). This reduces the chance of placing the
representative point near a disconnected road fragment that r5r
cannot route from.
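
The cluster-selection step can be illustrated on a toy raster. The sketch below is a rough approximation under stated assumptions, not the function's actual code: the 4 x 4 grid and its values are made up, 8-neighbour connectivity is assumed, and choosing the most populated cell inside the winning cluster is a stand-in for whatever final selection rule the package applies.

```r
library(terra)

# Toy 4 x 4 population grid (values fill row by row from the top-left).
pop <- rast(nrows = 4, ncols = 4, xmin = 0, xmax = 4, ymin = 0, ymax = 4)
values(pop) <- c(5, 3, 0, 0,
                 2, 0, 0, 0,
                 0, 0, 0, 9,
                 0, 0, 0, 4)

# Label connected groups of populated cells; zero cells are treated as empty.
clusters <- patches(pop, directions = 8, zeroAsNA = TRUE)

# Total population per cluster, then the cluster with the largest total.
pop_vals <- values(pop)[, 1]
cl_vals  <- values(clusters)[, 1]
totals   <- tapply(pop_vals, cl_vals, sum)
best_id  <- as.numeric(names(totals)[which.max(totals)])

# Assumption: take the most populated cell of the winning cluster.
best_cells <- which(cl_vals == best_id)
best_cell  <- best_cells[which.max(pop_vals[best_cells])]
xy <- xyFromCell(pop, best_cell)
print(totals)
print(xy)
```

Here the top-left cluster has three cells with a total of 10, while the right-hand cluster has two cells with a total of 13, so the second cluster is selected even though it covers fewer cells.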
See also
plot_representative_points() to visualize the selection process.
Examples
if (FALSE) { # \dontrun{
tracts <- br_prepare_tracts(code_muni = 3170701)
tracts$id <- tracts$code_tract
# Default: point on surface
pts <- compute_representative_points(tracts)
# Population-weighted for large tracts
pts_pop <- compute_representative_points(tracts, method = "pop_weighted")
} # }
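
The difference between the "centroid" and "point_on_surface" methods can be checked without downloading any data. The following sketch uses a made-up U-shaped polygon in place of a real tract and calls the two sf functions directly; it does not call compute_representative_points().

```r
library(sf)

# Toy concave "tract": a 3 x 3 square with a notch cut from the top edge.
u_shape <- st_polygon(list(rbind(
  c(0, 0), c(3, 0), c(3, 3), c(2, 3), c(2, 1),
  c(1, 1), c(1, 3), c(0, 3), c(0, 0)
)))
tract <- st_sf(id = "toy", geometry = st_sfc(u_shape))

# The two geometric methods named in the `method` argument.
centroid   <- st_centroid(st_geometry(tract))
on_surface <- st_point_on_surface(st_geometry(tract))

# Check whether each point lies inside the polygon.
centroid_inside <- st_within(centroid, tract, sparse = FALSE)[1, 1]
surface_inside  <- st_within(on_surface, tract, sparse = FALSE)[1, 1]
print(c(centroid = centroid_inside, point_on_surface = surface_inside))
```

For this shape the centroid lands in the notch, outside the polygon, while the point-on-surface result stays inside.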