`autoSTK`

autoSTK provides automatic spatio-temporal Kriging inspired by automap (Hiemstra et al. 2010). It extends the automap workflow to spatio-temporal variogram fitting, Kriging prediction, and cross-validation for common R spatio-temporal data classes.

Installation

remotes::install_github("sigmafelix/autoSTK")

Before installing, make sure the core spatial dependencies are available: gstat, spacetime, sp, sf, sftime, and automap. The cubble package is suggested and is only required when working directly with cubble_df objects.
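
As a quick pre-install check, the sketch below lists which of the core dependencies named above are not yet installed. `missing_packages()` is a hypothetical helper written for this example and is not part of autoSTK; it relies only on base R's `requireNamespace()`.

```r
# Hypothetical helper (not part of autoSTK): return the packages from `pkgs`
# that cannot be loaded on this machine.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

core_deps <- c("gstat", "spacetime", "sp", "sf", "sftime", "automap")
to_install <- missing_packages(core_deps)

if (length(to_install) > 0) {
  message("Install first: ", paste(to_install, collapse = ", "))
} else {
  message("All core dependencies are available.")
}
```

Anything reported here can be installed with `install.packages(to_install)` before running the `remotes::install_github()` call.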

Main Features

Automatic spatio-temporal variogram fitting with autofitVariogramST().
Automatic spatio-temporal Kriging with autoKrigeST().
Explicit fit/predict workflow with fitVariogramST() and predictKrigeST().
Cross-validation with autoKrigeST.cv() using spatial, temporal, spatio-temporal, or random folds.
Support for STFDF, STSDF, STIDF, sftime, and cubble inputs in the main fitting and prediction workflow.
Conversion helper cubble_to_sftime() for nested or temporal cubble objects.
Multiple spatio-temporal variogram structures, including sumMetric, metric, productSum, and separable.
Multiple optimizers, including lbfgsb, grid, sa, and ga.

Supported Functions

autofitVariogramST(): automatically fits a spatio-temporal variogram.
fitVariogramST(): fits a variogram and returns a reusable STVariogramFit object.
predictKrigeST(): predicts from a fitted STVariogramFit object.
autoKrigeST(): fits the variogram and predicts in one call.
autoKrigeST.cv(): cross-validates spatio-temporal Kriging results.
cubble_to_sftime(): converts cubble_df objects to sftime.

Data Classes

The main Kriging workflow accepts both legacy spacetime classes and modern simple-feature classes:

| Input class | Typical use |
| --- | --- |
| `STFDF` | Full space-time grids, such as all stations observed at all dates. |
| `STSDF` | Sparse space-time grids, such as missing station-date observations. |
| `STIDF` | Irregular space-time observations. |
| `sftime` | Simple-feature workflows with one geometry and one time column. |
| `cubble_df` | Tidy spatio-temporal data stored in spatial and temporal faces. |

Internally, autoSTK converts modern inputs to the spacetime classes required by gstat::variogramST() and gstat::krigeST().
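
To decide between the full-grid and sparse classes in the table above, it can help to check whether a long observation table contains every station-date combination. The sketch below does this in base R. The column names `id`, `time`, and `PM10`, the toy values, and the helper `is_full_grid()` are assumptions made for illustration. The final sort reflects the spacetime convention that `STFDF` data rows are ordered with the spatial index cycling fastest.

```r
# Toy long table (made-up values): 3 stations x 2 dates, rows shuffled.
obs <- data.frame(
  id   = c("B", "A", "C", "A", "C", "B"),
  time = as.Date(c("2020-01-02", "2020-01-01", "2020-01-01",
                   "2020-01-02", "2020-01-02", "2020-01-01")),
  PM10 = c(21, 10, 30, 11, 31, 20)
)

# Hypothetical helper: a full grid has exactly one row per station-date pair.
is_full_grid <- function(d) {
  nrow(d) == length(unique(d$id)) * length(unique(d$time)) &&
    !any(duplicated(d[c("id", "time")]))
}

# Sort by time first, then station, so stations cycle fastest within each date.
obs_sorted <- obs[order(obs$time, obs$id), ]

is_full_grid(obs_sorted)       # TRUE: a candidate for STFDF
is_full_grid(obs_sorted[-1, ]) # FALSE: one pair missing, so STSDF or STIDF fits better
obs_sorted$PM10                # 10 20 30 11 21 31
```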

Use Cases

Forecast Air Quality From Station Networks

Use autoKrigeST() when you have repeated measurements at monitoring stations and want predictions on a future spatio-temporal grid. This is the classic STFDF/STSDF workflow.

library(autoSTK)
library(gstat)
library(sp)
library(spacetime)
library(stars)

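# Loads `air` (a PM10 matrix), `stations`, and `dates` from the spacetime package.
data(air)

# Build a full space-time grid, then reproject it to EPSG:3857 (units: metres).
deair <- STFDF(stations, dates, data.frame(PM10 = as.vector(air)))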
deair_sf <- st_as_stars(deair) |>
  st_transform("+proj=longlat +ellps=sphere") |>
  st_transform(3857)

deair_r <- as(deair_sf, "STFDF")
deair_r@sp@proj4string <- CRS(
  "+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs +type=crs"
)

# Keep 100 time steps (columns 3701 to 3800), then convert to the sparse STSDF class.
deair_rs <- deair_r[, 3701:3800]
deair_rss <- as(deair_rs, "STSDF")

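akst_stk <- autoKrigeST(
  formula = PM10 ~ 1,             # ordinary Kriging (intercept only)
  input_data = deair_rss,
  cutoff = 300000,                # maximum spatial lag, in map units (metres)
  width = 30000,                  # spatial lag bin width, in map units
  type_stv = "sumMetric",         # spatio-temporal variogram structure
  model = c("Exc", "Mat", "Ste", "Exp", "Wav"),  # candidate variogram models
  tlags = 0:7,                    # temporal lags
  cores = 8,                      # parallel workers; adjust to your machine
  optimizer = "sa",               # one of "lbfgsb", "grid", "sa", "ga"
  verbose = TRUE
)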

akst_stk_stars <- st_as_stars(akst_stk$krige_output)
plot(akst_stk_stars[1, ])
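
The `cutoff` and `width` values in the call above determine the spatial lag bins of the sample variogram. The base-R sketch below works out those bins, assuming both arguments are passed through unchanged to `gstat::variogramST()`, whose default bin boundaries are `seq(0, cutoff, width)`. How autoSTK forwards them is an assumption here, not something this README states.

```r
cutoff <- 300000  # maximum spatial separation, in map units (metres in EPSG:3857)
width  <- 30000   # width of each spatial distance bin, in map units

# Bin boundaries implied by cutoff and width (gstat::variogramST() default form).
boundaries <- seq(0, cutoff, by = width)
n_space_bins <- length(boundaries) - 1

n_space_bins        # 10
boundaries / 1000   # bin edges in kilometres: 0, 30, 60, ..., 300
```

A smaller `width` gives more bins with fewer station pairs in each, which tends to make the sample variogram noisier.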

Work Directly With `sftime`

Use sftime when your data already lives in an sf-style workflow with one geometry column and one time column. autoKrigeST() accepts sftime for both training data and prediction data.

library(autoSTK)
library(sf)
library(sftime)

# station_long has columns:
#   id, time, PM10, geometry
# where geometry is an sfc_POINT column and time is Date/POSIXct.
station_sf <- st_as_sf(
  station_long,
  sf_column_name = "geometry",
  crs = 3857
)

station_st <- st_as_sftime(
  station_sf,
  time_column_name = "time",
  sf_column_name = "geometry"
)

fit <- fitVariogramST(
  formula = PM10 ~ 1,
  data = station_st,
  typestv = "sumMetric",
  candidate_model = c("Sph", "Exp", "Gau", "Ste"),
  cutoff = 300000,
  width = 30000,
  tlags = 0:7
)

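# prediction_grid_st is not created here; it is assumed to be an sftime object
# holding the prediction locations and times.
pred <- predictKrigeST(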
  fit = fit,
  data = station_st,
  newdata = prediction_grid_st,
  formula = PM10 ~ 1,
  nmax = 40
)

fitVariogramST() and predictKrigeST() are useful when you want to reuse the same fitted variogram across several prediction grids.

Use `cubble` Data

Use cubble when your workflow stores station metadata in the spatial face and repeated measurements in the temporal face. autoSTK can convert a cubble in either its nested (spatial) form or its long (temporal) form to sftime.

library(autoSTK)
library(cubble)

# cb is a cubble_df created from an sf object with station geometries and a
# temporal table containing PM10 observations.
pm10_st <- cubble_to_sftime(cb, time_col = "time", key_col = "id")

akst_cb <- autoKrigeST(
  formula = PM10 ~ 1,
  input_data = pm10_st,
  cutoff = 300000,
  width = 30000,
  tlags = 0:7,
  type_stv = "sumMetric"
)

You can also pass a cubble_df directly to the main Kriging workflow when supplying a prediction grid:

akst_cb <- autoKrigeST(
  formula = PM10 ~ 1,
  input_data = cb,
  new_data = prediction_grid_st,
  cutoff = 300000,
  width = 30000,
  tlags = 0:7
)

For cross-validation, convert cubble or sftime data to a spacetime object first, because autoKrigeST.cv() builds folds from the @sp and @time slots.

pm10_stfdf <- as(as(pm10_st, "STIDF"), "STFDF")

akst_cv_cb <- autoKrigeST.cv(
  formula = PM10 ~ 1,
  data = pm10_stfdf,
  nfold = 3,
  fold_dim = "temporal",
  cutoff = 300000,
  width = 30000,
  tlags = 0:7
)

Cross-Validation

autoKrigeST.cv() evaluates prediction performance by holding out different parts of the spatio-temporal data set.

| `fold_dim` | What is held out | Good for |
| --- | --- | --- |
| `"spatial"` | Groups of monitoring locations | Testing transfer to unseen stations. |
| `"temporal"` | Groups of dates/times | Testing forecasting over unseen periods. |
| `"spacetime"` | Blocks of locations and times | Testing combined spatial and temporal extrapolation. |
| `"random"` | Random observations | Testing interpolation under scattered missingness. |

akst_cv_t <- autoKrigeST.cv(
  formula = PM10 ~ 1,
  data = deair_rs,
  nfold = 3,
  fold_dim = "temporal",
  cutoff = 300000,
  width = 30000,
  tlags = 0:7,
  cores = 8,
  optimizer = "sa"
)

akst_cv_s <- autoKrigeST.cv(
  formula = PM10 ~ 1,
  data = deair_rs,
  nfold = 3,
  fold_dim = "spatial",
  cutoff = 300000,
  width = 30000,
  tlags = 0:7,
  cores = 8
)

akst_cv_spt <- autoKrigeST.cv(
  formula = PM10 ~ 1,
  data = deair_rs,
  nfold = 4,
  fold_dim = "spacetime",
  cutoff = 300000,
  width = 30000,
  tlags = 0:7,
  cores = 8
)

The result is a data frame with one row per fold:

akst_cv_t
#>   CVFold     RMSE      MAE      BIAS
#> 1      1  ...
#> 2      2  ...
#> 3      3  ...

The columns are:

CVFold: fold identifier.
RMSE: root mean squared prediction error.
MAE: mean absolute prediction error.
BIAS: mean prediction minus mean observed value.
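
The three error measures can be reproduced from paired observed and predicted values with a few lines of base R. This is a minimal sketch of the definitions listed above, not the package's internal code. `cv_metrics()` and the toy vectors are hypothetical, and dropping incomplete pairs before computing the metrics is an assumption.

```r
# Hypothetical helper: error metrics for one held-out fold.
cv_metrics <- function(observed, predicted) {
  keep <- complete.cases(observed, predicted)  # assumption: ignore NA pairs
  err <- predicted[keep] - observed[keep]
  data.frame(
    RMSE = sqrt(mean(err^2)),
    MAE  = mean(abs(err)),
    BIAS = mean(predicted[keep]) - mean(observed[keep])
  )
}

# Made-up values for illustration only.
observed  <- c(10, 12, 14, 16)
predicted <- c(11, 12, 13, 18)

metrics <- cv_metrics(observed, predicted)
metrics  # RMSE = sqrt(1.5) (about 1.22), MAE = 1, BIAS = 0.5
```

Here the errors are 1, 0, -1, and 2, so positive and negative errors partly cancel in BIAS but not in RMSE or MAE.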

Summarise the cross-validation results to compare model settings:

summary_cv <- data.frame(
  fold_dim = "temporal",
  mean_RMSE = mean(akst_cv_t$RMSE, na.rm = TRUE),
  mean_MAE = mean(akst_cv_t$MAE, na.rm = TRUE),
  mean_BIAS = mean(akst_cv_t$BIAS, na.rm = TRUE)
)

summary_cv

Lower RMSE and MAE indicate smaller prediction errors. BIAS values close to zero indicate little systematic over- or under-prediction.
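
To compare several fold designs side by side, the same summary can be wrapped in a small function and stacked into one table. The sketch below assumes each result has the `RMSE`, `MAE`, and `BIAS` columns described above. `summarise_cv()` is a hypothetical helper, and the two input tables contain made-up placeholder numbers, not output from autoSTK.

```r
# Hypothetical helper: collapse one cross-validation table to a single row.
summarise_cv <- function(cv, label) {
  data.frame(
    fold_dim  = label,
    mean_RMSE = mean(cv$RMSE, na.rm = TRUE),
    mean_MAE  = mean(cv$MAE, na.rm = TRUE),
    mean_BIAS = mean(cv$BIAS, na.rm = TRUE)
  )
}

# Placeholder tables standing in for real autoKrigeST.cv() results.
cv_temporal <- data.frame(CVFold = 1:3, RMSE = c(4, 5, 6),
                          MAE = c(3, 4, 5), BIAS = c(-1, 0, 1))
cv_spatial  <- data.frame(CVFold = 1:3, RMSE = c(6, 7, 8),
                          MAE = c(5, 5, 5), BIAS = c(1, 1, 1))

# One row per fold design.
comparison <- do.call(
  rbind,
  Map(summarise_cv, list(cv_temporal, cv_spatial), c("temporal", "spatial"))
)
comparison
```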

For fold_dim = "spacetime", nfold must be a perfect square, such as 4, 9, or 16, because the folds are built from spatial groups crossed with temporal groups.
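
The perfect-square requirement follows from crossing k spatial groups with k temporal groups, which gives k × k folds. The sketch below illustrates that idea in base R under stated assumptions: `make_spacetime_folds()` is a hypothetical function, and the contiguous grouping it uses may differ from how autoKrigeST.cv() actually assigns locations and times to groups.

```r
# Hypothetical illustration: assign each location-time cell to one of
# nfold = k * k folds by crossing k spatial groups with k temporal groups.
make_spacetime_folds <- function(n_locations, n_times, nfold) {
  k <- sqrt(nfold)
  if (k != round(k)) {
    stop("nfold must be a perfect square, such as 4, 9, or 16")
  }
  # Split indices into k contiguous groups of near-equal size (an assumption).
  space_group <- ceiling(seq_len(n_locations) * k / n_locations)
  time_group  <- ceiling(seq_len(n_times) * k / n_times)
  # Rows are locations, columns are times; each cell holds its fold id.
  outer(space_group, time_group, function(s, t) (s - 1) * k + t)
}

folds <- make_spacetime_folds(n_locations = 6, n_times = 4, nfold = 4)
folds
table(folds)  # four folds with 6 cells each
```

With `nfold = 3` the function stops with an error, because three folds cannot be formed by crossing equal numbers of spatial and temporal groups.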

Notes

Projected coordinate reference systems are recommended because variogram distances are computed in map units.
Universal Kriging formulas such as PM10 ~ elevation + traffic require new_data with matching covariate columns.
predict_chunk can be used for large prediction grids to reduce memory pressure.
variogram_from_full = TRUE in autoKrigeST.cv() fits one variogram on the full data set and reuses it across folds, which can be much faster when that assumption is appropriate.
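
Before running universal Kriging, it can be worth confirming that every covariate in the formula is present in the prediction data. The sketch below is one way to do that in base R. `missing_covariates()` is a hypothetical helper, the toy grid is made up, and the check assumes the prediction object exposes its columns through `names()`, as data frames do.

```r
# Hypothetical helper: right-hand-side variables that new_data lacks.
missing_covariates <- function(formula, new_data) {
  covariates <- all.vars(delete.response(terms(formula)))
  setdiff(covariates, names(new_data))
}

# Made-up prediction grid with coordinates and a single covariate.
grid_df <- data.frame(
  x = c(0, 1000),
  y = c(0, 1000),
  elevation = c(120, 340)
)

missing_covariates(PM10 ~ elevation + traffic, grid_df)  # "traffic"
missing_covariates(PM10 ~ 1, grid_df)                    # character(0)
```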
