Find optimal decay parameters (alpha) for spatial interpolation
Source:R/optimize.R
optimize_alpha.RdOptimizes the per-tract-per-bracket decay parameters that minimize the error between interpolated values and known population counts. Uses per-bracket SGD with column normalization and log-barrier penalty inside torch autograd. Each demographic bracket gets its own per-tract kernel; gradients flow through all computations. Works on both CPU and GPU (CUDA/MPS).
Usage
optimize_alpha(
time_matrix,
pop_matrix,
source_matrix,
row_targets = NULL,
optim = optim_control(),
offset = 1,
verbose = TRUE
)Arguments
- time_matrix
Numeric matrix [n x m]. Raw travel times. Rows = target zones, columns = source points.
- pop_matrix
Numeric matrix [n x k]. Known population per zone, with k demographic groups as columns.
- source_matrix
Numeric matrix [m x k]. Known counts at source points (e.g., registered voters by age group).
- row_targets
Numeric vector of length n, or NULL. Target row sums for balancing. Each element specifies how much weight a zone should attract, proportional to its share of total population. If NULL (default), auto-computed as
rowSums(pop_matrix) / sum(pop_matrix) * m.- optim
An
optim_control()object with optimization parameters (max_epochs, lr_init, convergence_tol, patience, barrier_mu, alpha_init, alpha_min, use_gpu, device, dtype). Default:optim_control().- offset
Numeric. Value added to travel times before exponentiation (power kernel only; ignored when
kernel = "exponential"). Default: 1.- verbose
Logical. Print progress messages? Default: TRUE.
Value
A list of class "interpElections_optim" with components:
- alpha
Numeric matrix [n x k]. Optimal per-tract-per-bracket decay parameters. Each row is a census tract, each column is a demographic bracket. Inactive brackets (zero population or voters) are filled with 1.
- value
Numeric. Poisson deviance at optimum.
- loss
Numeric. Full loss at optimum (deviance + barrier + entropy).
- W
Numeric matrix [n x m]. Column-normalized weight matrix from the best-epoch. Use directly for interpolation via
W \%*\% data. Column sums are approximately 1.- method
Character. Method used (e.g.,
"pb_sgd_colnorm_cpu","pb_sgd_colnorm_cuda").- convergence
Integer. 0 = converged (gradient-based or window-based early stopping); 1 = stopped at max_epochs.
- epochs
Integer. Number of epochs completed.
- steps
Integer. Total number of SGD gradient steps.
- elapsed
difftimeobject. Wall-clock time.- message
Character. Additional information.
- history
Numeric vector. Full-dataset loss at each epoch.
- grad_norm_final
Numeric. Final gradient norm (theta-space).
- grad_history
Numeric vector. Gradient norm (theta-space) at each epoch.
- lr_history
Numeric vector. Learning rate at each epoch.
- kernel
Character. Kernel used (
"power"or"exponential").
Details
The optimization requires the torch R package. Install it with
setup_torch() if not already available.
Kernel: Two kernel functions are available, controlled via the
kernel field in optim_control():
Power (default): \(K(t) = (t + \text{offset})^{-\alpha}\). Classic inverse distance weighting.
Exponential: \(K(t) = \exp(-\alpha \cdot t)\). Lighter tail; relative decay increases with distance. Does not use
offset.
Parameterization: alpha[i,b] is reparameterized as
alpha = alpha_min + softplus(theta) with theta unconstrained,
where softplus(x) = log(1 + exp(x)). With the default
alpha_min = 1 (power kernel), alpha is always at least 1
(inverse-distance decay or steeper). The exponential kernel defaults
to alpha_min = 0. Set alpha_min = 0 for unconstrained
optimization (similar to legacy exp(theta)).
Epoch structure: Each epoch is one full-data gradient step with exact gradients (column sums require all tracts). The loss reported at each epoch is the true loss evaluated on the full dataset.
Two execution paths:
CPU (default):
use_gpu = FALSEorNULL. Uses torch on CPU device. Fast for small/medium problems (< 2000 tracts).GPU:
use_gpu = TRUE. Uses CUDA or MPS. Faster for large problems (> 2000 tracts).
GPU memory usage is bounded by
2 * ka * n * m * bytes_per_elem.
See also
optim_control() for tuning parameters,
use_gpu() to toggle GPU globally, compute_weight_matrix()
to rebuild the weight matrix from pre-computed alpha,
setup_torch() to install torch.
Examples
if (FALSE) { # \dontrun{
tt <- matrix(c(2, 5, 3, 4, 6, 2), nrow = 2)
pop <- matrix(c(100, 200), nrow = 2)
src <- matrix(c(80, 120, 100), nrow = 3)
result <- optimize_alpha(tt, pop, src, verbose = FALSE)
result$alpha
# With custom control
result <- optimize_alpha(tt, pop, src,
optim = optim_control(use_gpu = TRUE, max_epochs = 5000),
verbose = FALSE)
} # }