This function takes a crosswalk from pnadc_identify_periods and applies it to any PNADC dataset (quarterly or annual). It can optionally calibrate the survey weights to match external population totals at the chosen temporal granularity (month, fortnight, or week).

Usage

pnadc_apply_periods(
  data,
  crosswalk,
  weight_var,
  anchor,
  calibrate = TRUE,
  calibration_unit = c("month", "fortnight", "week"),
  calibration_min_cell_size = 1,
  target_totals = NULL,
  smooth = FALSE,
  keep_all = TRUE,
  verbose = TRUE
)

Arguments

data

A data.frame or data.table with PNADC microdata. Must contain join keys Ano, Trimestre, UPA, V1008, and V1014 to merge with the crosswalk.

crosswalk

A data.table crosswalk from pnadc_identify_periods.

weight_var

Character. Name of the survey weight column. Must be specified:

  • "V1028" for quarterly PNADC data

  • "V1032" for annual PNADC data (visit-specific or annual releases organized by quarters)

anchor

Character. How to anchor the weight redistribution. Must be specified:

  • "quarter" for quarterly data or annual releases organized by quarters (preserves quarterly totals)

  • "year" for annual visit-specific data (preserves yearly totals)

calibrate

Logical. If TRUE (default), calibrate weights to external population totals. If FALSE, only merge the crosswalk without calibration.

calibration_unit

Character. Temporal unit for weight calibration. One of "month" (default), "fortnight", or "week".

calibration_min_cell_size

Integer. Minimum sample size required in a cell for it to be used in hierarchical raking. Cells smaller than this threshold are collapsed to coarser levels. Default: 1 (use all cells).
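To make the collapsing rule concrete, here is a minimal base R sketch on toy data. It assumes only two levels (age group within region) and a simple fallback to the age group alone; the column names and the fallback rule are illustrative assumptions, not the package's internal logic.

```r
# Toy respondents; age_group and region are hypothetical cell variables.
min_cell_size <- 2
toy <- data.frame(
  age_group = c("young", "young", "young", "old", "old"),
  region    = c("N", "N", "S", "N", "S")
)

# Finest cell: age group crossed with region.
fine <- paste(toy$age_group, toy$region, sep = ":")

# Number of respondents in each fine cell, repeated per row.
n_fine <- ave(seq_along(fine), fine, FUN = length)

# Cells below the threshold fall back to the coarser age-group cell.
toy$cell <- ifelse(n_fine >= min_cell_size, fine, toy$age_group)
```

With the default threshold of 1, every non-empty cell passes the test, so no collapsing would take place in this sketch.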

target_totals

Optional data.table with population targets. If NULL (default), the function fetches monthly population totals from SIDRA and derives the fortnight and week targets from them. Each time period (month, fortnight, or week) is calibrated to the FULL Brazilian population from SIDRA.

If providing custom targets, the population column (m_populacao for months, f_populacao for fortnights, w_populacao for weeks) must be in thousands. The function multiplies by 1000 internally.
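As a rough sketch of the unit convention, the snippet below builds custom monthly targets expressed in thousands. The key column name ref_month_yyyymm and the population figures are assumptions for illustration; only the m_populacao column name and the thousands convention come from the description above.

```r
# Hypothetical monthly population in persons; figures are illustrative only.
persons <- c(215.8e6, 215.9e6, 216.0e6)

target_totals <- data.frame(
  ref_month_yyyymm = c(202301L, 202302L, 202303L),  # assumed key column
  m_populacao      = persons / 1000                 # thousands, as required
)

# The function is documented to multiply by 1000 internally, which
# recovers the counts in persons.
recovered <- target_totals$m_populacao * 1000
```

Because the argument is documented as a data.table, a table built this way would likely need converting with data.table::as.data.table() before being passed in.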

smooth

Logical. If TRUE, smooth calibrated weights to remove quarterly artifacts. Smoothing is adapted per time period: monthly (3-period window), fortnight (7-period window), weekly (no smoothing). Default: FALSE.

keep_all

Logical. If TRUE (default), keep all observations including those with undetermined reference periods. If FALSE, drop undetermined rows.

verbose

Logical. If TRUE (default), print progress messages.

Value

A data.table with the input data plus crosswalk columns:

ref_month_in_quarter, ref_month_in_year

Month position (1-3 in quarter, 1-12 in year)

ref_fortnight_in_month, ref_fortnight_in_quarter

Fortnight position (1-2 in month, 1-6 in quarter)

ref_week_in_month, ref_week_in_quarter

Week position (1-4 in month, 1-12 in quarter)

ref_month_yyyymm, ref_fortnight_yyyyff, ref_week_yyyyww

Integer period codes

determined_month, determined_fortnight, determined_week

Logical determination flags

weight_monthly, weight_fortnight, or weight_weekly

Calibrated weights (if calibrate=TRUE)
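The returned columns can be combined to produce period-level estimates. The sketch below uses a toy stand-in for the output rather than real microdata; the unemployed indicator is a hypothetical analysis variable, and the use of NA for undetermined rows is an assumption.

```r
# Toy stand-in for the function's output.
result_toy <- data.frame(
  ref_month_yyyymm = c(202301L, 202301L, 202302L, 202302L, NA),
  determined_month = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  weight_monthly   = c(100, 300, 200, 200, NA),
  unemployed       = c(1, 0, 1, 0, 1)  # hypothetical indicator
)

# Keep only rows whose reference month was determined.
det <- result_toy[result_toy$determined_month, ]

# Weighted share per month: sum(w * x) / sum(w).
num  <- tapply(det$weight_monthly * det$unemployed, det$ref_month_yyyymm, sum)
den  <- tapply(det$weight_monthly, det$ref_month_yyyymm, sum)
rate <- num / den
```

Filtering on determined_month first keeps undetermined rows, which are retained when keep_all = TRUE, out of the monthly estimate.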

Details

Merges a reference period crosswalk with PNADC microdata and optionally calibrates survey weights for sub-quarterly analysis.

Weight Calibration

When calibrate = TRUE, the function performs hierarchical raking:

  1. Groups observations by nested demographic/geographic cells

  2. Iteratively adjusts weights so sub-period totals match anchor-period totals

  3. Calibrates final weights against external population totals (FULL Brazilian population)

  4. Optionally smooths weights to remove quarterly artifacts
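The first three steps can be sketched on toy data with a single cell level. This is a simplified illustration under stated assumptions (one quarter, age group as the only cell variable, made-up population totals), not the package's actual implementation, which iterates over several nested levels.

```r
# Toy respondents from one quarter with their quarterly weights.
toy <- data.frame(
  age_group = c("young", "young", "young", "young", "old", "old", "old", "old"),
  ref_month = c(1, 1, 2, 3, 1, 2, 3, 3),
  w_quarter = c(100, 50, 120, 80, 60, 70, 30, 20)
)

# Steps 1-2 (single level): within each age group, rescale so that every
# month's weights sum to that group's quarterly total.
cell_total  <- ave(toy$w_quarter, toy$age_group, FUN = sum)
month_total <- ave(toy$w_quarter, toy$age_group, toy$ref_month, FUN = sum)
toy$w_redistributed <- toy$w_quarter * cell_total / month_total

# Step 3: scale each month to an external population total.
pop <- c("1" = 500, "2" = 510, "3" = 520)  # hypothetical targets
month_sum <- ave(toy$w_redistributed, toy$ref_month, FUN = sum)
toy$w_month <- unname(
  toy$w_redistributed * pop[as.character(toy$ref_month)] / month_sum
)

# Each month now sums to its target.
monthly_sums <- tapply(toy$w_month, toy$ref_month, sum)
```

After the redistribution step every age group carries its full quarterly total in each month, which is why the final step is a plain rescaling of each month to its target.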

Population Targets

All time periods (months, fortnights, and weeks) are calibrated to the FULL Brazilian population from SIDRA. This means:

  • Monthly weights sum to the Brazilian population for that month

  • Fortnight weights sum to the Brazilian population for the containing month

  • Weekly weights sum to the Brazilian population for the containing month
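One practical consequence is that sub-monthly weights should be summed within a single period, not across periods. The toy check below illustrates this for fortnights; the population figures and weights are made up, and only the column names follow the output described in the Value section.

```r
# Hypothetical monthly population targets, in persons.
pop_month <- c("202301" = 1000, "202302" = 1010)

# Toy calibrated output with fortnight weights.
toy <- data.frame(
  ref_month_yyyymm       = c(202301L, 202301L, 202301L, 202302L, 202302L, 202302L),
  ref_fortnight_in_month = c(1L, 1L, 2L, 1L, 2L, 2L),
  weight_fortnight       = c(600, 400, 1000, 1010, 500, 510)
)

# Sum of weights within each fortnight.
sums <- aggregate(
  weight_fortnight ~ ref_month_yyyymm + ref_fortnight_in_month,
  data = toy, FUN = sum
)

# Each fortnight should match the population of its containing month.
sums$target <- unname(pop_month[as.character(sums$ref_month_yyyymm)])
all_match <- all(abs(sums$weight_fortnight - sums$target) < 1e-8)

# Summing across both fortnights counts the population twice.
jan_total <- sum(toy$weight_fortnight[toy$ref_month_yyyymm == 202301L])
```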

Hierarchical Raking Levels

The number of hierarchical cell levels is automatically adjusted based on the calibration unit to avoid sparse cell issues:

  • "month": 4 levels (age, region, state, post-stratum) - full hierarchy

  • "fortnight": 2 levels (age, region) - simplified for lower sample size

  • "week": 1 level (age groups only) - minimal hierarchy for sparse data
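The mapping from calibration unit to hierarchy can be written as a small lookup. The helper below is hypothetical and only mirrors the list above; the level labels are descriptive placeholders, not column names used by the package.

```r
# Hypothetical helper; not part of the package.
raking_levels <- function(calibration_unit = c("month", "fortnight", "week")) {
  # match.arg() selects the first choice ("month") when the argument is omitted.
  calibration_unit <- match.arg(calibration_unit)
  switch(
    calibration_unit,
    month     = c("age", "region", "state", "post-stratum"),
    fortnight = c("age", "region"),
    week      = "age"
  )
}

length(raking_levels())   # 4
raking_levels("week")     # "age"
```

The vector default combined with match.arg() is a common R idiom and would explain why the Usage block lists all three choices while "month" is described as the default; whether the package uses this idiom internally is an assumption.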

Anchor Period

The anchor parameter determines how weights are redistributed:

  • "quarter": Quarterly totals are preserved and redistributed to months/fortnights/weeks

  • "year": Yearly totals are preserved and redistributed to months/fortnights/weeks

Use anchor = "quarter" with quarterly V1028 weights or with annual releases organized by quarters, and anchor = "year" with annual visit-specific V1032 weights.

See also

pnadc_identify_periods to build the crosswalk

Examples

if (FALSE) { # \dontrun{
# Build crosswalk
crosswalk <- pnadc_identify_periods(pnadc_stacked)

# Apply to quarterly data with monthly calibration
result <- pnadc_apply_periods(
  pnadc_2023,
  crosswalk,
  weight_var = "V1028",
  anchor = "quarter"
)

# Apply to annual data
result <- pnadc_apply_periods(
  pnadc_annual,
  crosswalk,
  weight_var = "V1032",
  anchor = "year"
)

# Weekly calibration
result <- pnadc_apply_periods(
  pnadc_2023,
  crosswalk,
  weight_var = "V1028",
  anchor = "quarter",
  calibration_unit = "week"
)

# No calibration (just merge crosswalk)
result <- pnadc_apply_periods(
  pnadc_2023,
  crosswalk,
  weight_var = "V1028",
  anchor = "quarter",
  calibrate = FALSE
)
} # }