Apply a specification to a data frame

Applies a herald_spec to a data frame in a single transactional call. The call performs six operations: it scaffolds missing variables as typed NAs, drops unspecified columns, coerces types, sets column-level and data frame-level metadata attributes, orders columns, and sorts rows by key variables.
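
To make the six operations concrete, here is a rough base-R sketch of what each one amounts to. It is not herald's implementation: `sketch_apply_spec` and all of its arguments are hypothetical names introduced for illustration, it handles only the "text" and "integer" data types, and it sets only the `label` attributes. The step order is also an assumption; in this sketch rows are sorted before attributes are set, because row subsetting in base R drops attributes on plain vectors.

```r
# Illustrative only: a base-R approximation of the six steps, not herald's code.
# `sketch_apply_spec` and its arguments are hypothetical.
sketch_apply_spec <- function(x, vars, types, labels, keys, ds_label) {
  out <- x  # R copies on modify, so the caller's `x` is left untouched

  # Scaffold: add each missing variable as an NA of the declared type
  na_of <- list(text = NA_character_, integer = NA_integer_)
  for (i in seq_along(vars)) {
    if (!vars[i] %in% names(out)) {
      out[[vars[i]]] <- rep(na_of[[types[i]]], nrow(out))
    }
  }

  # Drop unspecified columns and order the rest as listed in `vars`
  out <- out[, vars, drop = FALSE]

  # Coerce each column to its declared type
  coerce <- list(text = as.character, integer = as.integer)
  for (i in seq_along(vars)) {
    out[[vars[i]]] <- coerce[[types[i]]](out[[vars[i]]])
  }

  # Sort rows by the key variables, in the order the keys are given
  ord <- do.call(order, unname(as.list(out[, keys, drop = FALSE])))
  out <- out[ord, , drop = FALSE]
  rownames(out) <- NULL

  # Set column-level and data frame-level label attributes
  for (i in seq_along(vars)) {
    attr(out[[vars[i]]], "label") <- labels[i]
  }
  attr(out, "label") <- ds_label
  out
}

dm <- data.frame(
  USUBJID = c("001-002", "001-001"),
  AGE     = c(60L, 45L),
  EXTRA   = c("x", "y"),
  stringsAsFactors = FALSE
)

res <- sketch_apply_spec(
  dm,
  vars     = c("STUDYID", "USUBJID", "AGE"),
  types    = c("text", "text", "integer"),
  labels   = c("Study ID", "Unique Subject ID", "Age"),
  keys     = c("STUDYID", "USUBJID"),
  ds_label = "Demographics"
)
```

In this sketch `res` has the columns STUDYID, USUBJID, and AGE in that order, with rows sorted by the keys, while `dm` still contains EXTRA. The real apply_spec() additionally sets format.sas, sas.length, herald.dataset, and herald.sort_keys, which the sketch omits.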

Usage

apply_spec(x, spec, dataset)

Arguments

x: A data frame.
spec: A herald_spec object or a path to a spec file (.xlsx, .xml, .json).
dataset: Character. The dataset name to look up in spec.

Value

The decorated data frame, returned invisibly. All six operations are applied to a copy; the original x is never modified.

Column-level attributes set: label, format.sas, sas.length.
Data frame-level attributes set: label, herald.dataset, herald.sort_keys.

Examples

spec <- herald_spec(
  ds_spec = data.frame(
    dataset = "DM",
    label   = "Demographics",
    keys    = "STUDYID, USUBJID",
    stringsAsFactors = FALSE
  ),
  var_spec = data.frame(
    dataset   = "DM",
    variable  = c("STUDYID", "USUBJID", "AGE"),
    label     = c("Study ID", "Unique Subject ID", "Age"),
    data_type = c("text", "text", "integer"),
    order     = 1:3,
    stringsAsFactors = FALSE
  )
)

dm <- data.frame(
  USUBJID = c("001-001", "001-002"),
  AGE     = c(45L, 60L),
  EXTRA   = c("x", "y"),
  stringsAsFactors = FALSE
)

result <- apply_spec(dm, spec, "DM")
#> Scaffolded 1 variable: `STUDYID`
#> Dropped 1 variable: `EXTRA`
names(result)          # STUDYID, USUBJID, AGE — EXTRA dropped
#> [1] "STUDYID" "USUBJID" "AGE"    
attr(result$AGE, "label")  # "Age"
#> [1] "Age"
attr(result, "label")      # "Demographics"
#> [1] "Demographics"
