Reads dataset files from a path and checks them against a specification
and/or conformance rules. Returns a herald_validation object with
structured findings.
Usage
validate(
path = NULL,
datasets = NULL,
format = "xpt",
spec = NULL,
config = NULL,
rules = NULL,
standard = NULL,
version = NULL,
define_xml = NULL,
ignore_spec_checks = NULL,
files = NULL,
ct_path = NULL
)
Arguments
- path
Character. Path to a directory containing dataset files, or to a single
.xpt or .json file. Ignored when files is provided.
- datasets
Character vector of dataset names to validate, e.g.
c("DM", "AE"). Case-insensitive. NULL (default) validates all files matching format in path. Ignored when path is a single file.
- format
"xpt" (default) or "json". File type to read from path when it is a directory.
- spec
A herald_spec object, a path to a spec file (.xlsx, .xml, .json), or NULL.
- config
Optional. A herald-rules submission config identifier string (e.g.,
"fda-sdtm-ig-3.3", "pmda-adam-ig-1.1"). When provided, loads the pre-built rule profile from the bundled herald-rules set. Takes precedence over rules. When NULL, a config is auto-selected from standard + version (defaulting to the FDA authority) if a matching bundled config exists.
- rules
Optional. A character shortcut ("fda", "pmda", "core", "all"), or a list of herald_rule objects. Used when config is not provided and auto-selection finds no match.
- standard
Character. CDISC standard: "sdtmig", "adamig", or "sendig". When spec is provided, read from the standard column of the dataset sheet. When no spec is given, this parameter is required for anchor auto-detection. When both are absent, anchor detection is skipped.
- version
Character. Standard version, e.g. "3.4" for SDTMIG 3.4 or "1.1" for ADaMIG 1.1. When spec is provided and contains a version in the standard column (e.g. "SDTMIG 3.3"), the version is extracted automatically.
- define_xml
Character. Path to a Define-XML 2.1 file. Stored in the output context; used for cross-checks in future releases.
- ignore_spec_checks
Character vector of spec checks to skip. Default NULL runs all spec checks: "presence", "labels", "types", "lengths", "dataset_label", "codelist", "common". Example: ignore_spec_checks = c("lengths", "codelist") skips those two. Spec checks are silently skipped when spec = NULL.
- files
Optional. Named list or character vector of explicit file paths to load. Allows selecting specific datasets from different directories.
Named list:
list(ADAE = "/path/adae.xpt", ADSL = "/shared/adsl.xpt"), where list names become dataset names.
Unnamed character vector:
c("/path/adae.xpt", "/shared/adsl.xpt"), where dataset names are inferred from file basenames (uppercased, extension stripped).
When files is provided, path and datasets are ignored. Cross-dataset rules (anchor detection) fire when two or more files are loaded.
- ct_path
Optional character. Path to a custom controlled terminology file (.xlsx or .csv, NCI EVS column layout). When provided, the custom CT is merged on top of the bundled CDISC CT for this call only. To register CT for the entire session, use register_ct().
Value
A herald_validation object with:
- findings
Data frame of issues.
- summary
List with counts by impact level (reject, high, medium, low, total).
- datasets_checked
Character vector of dataset names validated.
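To make the relationship between findings and summary concrete, here is a minimal base R sketch that uses a mock data frame in place of a real result. The column names rule_id, dataset, and impact, and the impact values assigned to each rule, are assumptions for illustration only; the actual columns of findings are not specified here.

```r
# Mock findings table standing in for result$findings.
# Column names and impact values are assumed, not taken from the package.
findings <- data.frame(
  rule_id = c("HRL-VAR-001", "HRL-LBL-001", "HRL-LEN-001"),
  dataset = c("DM", "DM", "AE"),
  impact  = c("high", "low", "high"),
  stringsAsFactors = FALSE
)

# Count findings per impact level, keeping levels with zero findings.
impact_levels <- c("reject", "high", "medium", "low")
counts <- vapply(
  impact_levels,
  function(level) sum(findings$impact == level),
  integer(1)
)

# Assemble a list shaped like the documented summary element.
summary_list <- c(as.list(counts), total = nrow(findings))
str(summary_list)
```

With a real herald_validation object, the same kind of subsetting would presumably apply to result$findings directly, for example to keep only the rows for one dataset.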
Controlled Terminology
Herald uses the controlled terminology (CT) release bundled with the
installed package (inst/rules/ct/). To update to a newer CT release, call
fetch_ct(), which downloads it to the user cache. The validation report
always shows which CT version and source (bundled / fetched) was used.
Rule IDs
- HRL-VAR-001
Variable in spec but missing from data.
- HRL-VAR-002
Variable in data but not in spec.
- HRL-LBL-001
Variable label mismatch.
- HRL-TYP-001
Variable type mismatch.
- HRL-LEN-001
Character variable exceeds spec length.
- HRL-DS-001
Dataset label mismatch.
- HRL-CL-001
Variable value not found in spec codelist.
See also
herald_spec() for building specs, write_xpt() for writing
validated data, validation_report() for rendering the report.
Other conformance:
adam_rules(),
build_anchor_index(),
clear_ct(),
fda_rules(),
fetch_core_rules(),
fetch_herald_rules(),
herald_rules_cache_dir(),
list_ct(),
load_herald_config(),
new_herald_context(),
pmda_rules(),
register_ct(),
register_operator(),
rule_catalog(),
rule_config(),
update_core_rules(),
validate_spec(),
validate_spec_define(),
validation_report(),
verify_html_report()
Examples
# Validate a minimal XPT written to a temp directory
tmp_dir <- tempdir()
xpt_path <- file.path(tmp_dir, "dm.xpt")
on.exit(unlink(xpt_path), add = TRUE)
dm <- data.frame(
STUDYID = "STUDY001",
USUBJID = "001-001",
stringsAsFactors = FALSE
)
write_xpt(dm, xpt_path, dataset = "DM")
result <- validate(xpt_path)
result
#>
#> ── herald validation ──
#>
#> Datasets checked: 1
#> ℹ Spec checks only -- no conformance rules evaluated
#> Findings: 0 reject, 0 high, 0 medium, 0 low
