Define-XML 2.1 is required for every FDA and PMDA submission that
includes clinical datasets. It describes dataset structure, variable
metadata, codelists, derivation methods, and analysis results in a
machine-readable ODM 1.3 XML document. herald generates valid Define-XML
2.1 from a herald_spec with a single function call.
Building a multi-dataset spec
skip_xml <- !requireNamespace("xml2", quietly = TRUE)
spec <- herald_spec(
  ds_spec = data.frame(
    dataset = c("DM", "AE"),
    label = c("Demographics", "Adverse Events"),
    class = c("SPECIAL PURPOSE", "EVENTS"),
    structure = c("One record per subject",
                  "One record per subject per adverse event"),
    keys = c("STUDYID, USUBJID", "STUDYID, USUBJID, AESEQ"),
    stringsAsFactors = FALSE
  ),
  var_spec = data.frame(
    dataset = c("DM","DM","DM","DM", "AE","AE","AE","AE"),
    variable = c("STUDYID","USUBJID","AGE","SEX",
                 "STUDYID","USUBJID","AESEQ","AETERM"),
    label = c("Study Identifier","Unique Subject Identifier","Age","Sex",
              "Study Identifier","Unique Subject Identifier",
              "Sequence Number of AE","Reported Term for the Adverse Event"),
    data_type = c("text","text","integer","text",
                  "text","text","integer","text"),
    length = c(12L,11L,8L,1L, 12L,11L,8L,200L),
    order = c(1L,2L,3L,4L, 1L,2L,3L,4L),
    mandatory = c("Yes","Yes","No","No", "Yes","Yes","Yes","Yes"),
    origin = c("Assigned","Assigned","CRF","CRF",
               "Assigned","Assigned","Derived","CRF"),
    stringsAsFactors = FALSE
  ),
  codelist = data.frame(
    codelist_id = c("SEX","SEX"),
    term = c("M","F"),
    decoded_value = c("Male","Female"),
    stringsAsFactors = FALSE
  )
)
Generating Define-XML
xml_path <- tempfile(fileext = ".xml")
write_define_xml(spec, xml_path, validate = FALSE)
# Confirm the file was written
file.exists(xml_path)
#> [1] TRUE
file.info(xml_path)$size
#> [1] 4642
The generated file is valid ODM 1.3 XML with Define-XML 2.1 namespace extensions. It includes:
- <Study> → <MetaDataVersion> root structure
- <ItemGroupDef> for each dataset (with SASDatasetName, Label, Keys)
- <ItemDef> for each variable (with Name, Label, DataType, Length, Origin)
- <CodeList> for each controlled terminology codelist
- <leaf> hrefs pointing to the data files
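The element types listed above can be counted with xml2 once a file is parsed. The sketch below runs on a small hand-written fragment rather than real herald output, so the OIDs and attribute values are illustrative assumptions; only the element names and the ODM namespace URI come from this article.

```r
# Hand-written miniature ODM document (an assumption for illustration);
# a real Define-XML file carries many more elements and attributes.
define_fragment <- '<?xml version="1.0" encoding="UTF-8"?>
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3"
     xmlns:def="http://www.cdisc.org/ns/def/v2.1">
  <Study OID="S.DEMO">
    <MetaDataVersion OID="MDV.1" Name="Demo">
      <ItemGroupDef OID="IG.DM" Name="DM" Repeating="No">
        <ItemRef ItemOID="IT.DM.SEX" Mandatory="No"/>
      </ItemGroupDef>
      <ItemDef OID="IT.DM.SEX" Name="SEX" DataType="text" Length="1">
        <CodeListRef CodeListOID="CL.SEX"/>
      </ItemDef>
      <CodeList OID="CL.SEX" Name="SEX" DataType="text">
        <CodeListItem CodedValue="M"/>
        <CodeListItem CodedValue="F"/>
      </CodeList>
    </MetaDataVersion>
  </Study>
</ODM>'

doc <- xml2::read_xml(define_fragment)

# The default ODM namespace needs an explicit prefix for XPath queries.
ns <- c(o = "http://www.cdisc.org/ns/odm/v1.3")

# Count each element type of interest.
tags <- c("ItemGroupDef", "ItemDef", "CodeList")
counts <- vapply(
  tags,
  function(tag) length(xml2::xml_find_all(doc, paste0(".//o:", tag), ns = ns)),
  integer(1)
)
counts

# Coded values of every codelist item, in document order.
coded <- xml2::xml_attr(
  xml2::xml_find_all(doc, ".//o:CodeListItem", ns = ns),
  "CodedValue"
)
coded
```

Pointing xml2::read_xml() at a generated file instead of the inline string would apply the same counts to real output.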
Peek at the XML
doc <- xml2::read_xml(xml_path)
# Dataset names from ItemGroupDef elements
ns <- c(d = "http://www.cdisc.org/ns/def/v2.1",
o = "http://www.cdisc.org/ns/odm/v1.3")
igds <- xml2::xml_find_all(doc, ".//o:ItemGroupDef", ns = ns)
xml2::xml_attr(igds, "Name")
#> [1] "DM" "AE"
Rendering to HTML
write_define_html() produces a self-contained HTML
document in the same format reviewers see in the CDISC Define Viewer —
no external stylesheet or dependencies required.
html_path <- tempfile(fileext = ".html")
write_define_html(spec, html_path)
file.exists(html_path)
#> [1] TRUE
Reading Define-XML back
read_spec_define() parses an existing Define-XML 2.1
file back into a herald_spec. Use it for migration
workflows (existing submissions) or to verify that
write_define_xml() produced what you expect.
spec2 <- read_spec_define(xml_path)
spec2
#>
#> ── herald_spec ──
#>
#> Study: "UNKNOWN"
#> • Datasets: 2
#> • Variables: 8
#> • Codelist: 1
#> Datasets: "DM" and "AE"
spec2$ds_spec[, c("dataset", "label")]
#> dataset label
#> 1 DM Demographics
#> 2 AE Adverse Events
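For a round-trip check that goes beyond eyeballing printed tables, one option is to compare selected columns of the written and re-read metadata. The sketch below uses plain data frames as stand-ins for spec$var_spec and spec2$var_spec; same_metadata() is a hypothetical helper, not part of herald, and it assumes that row order and row names may differ after re-reading.

```r
# Hypothetical helper (not a herald function): TRUE when two metadata
# tables agree on the chosen columns, ignoring row order and row names.
same_metadata <- function(original, reread, cols) {
  normalise <- function(df) {
    out <- df[, cols, drop = FALSE]
    out[] <- lapply(out, as.character)   # compare values as text
    out <- out[do.call(order, unname(as.list(out))), , drop = FALSE]
    rownames(out) <- NULL                # drop any OID-style row names
    out
  }
  isTRUE(all.equal(normalise(original), normalise(reread)))
}

# Stand-in for the table that was written.
written <- data.frame(
  dataset = c("AE", "AE"),
  variable = c("AESEQ", "AETERM"),
  data_type = c("integer", "text"),
  stringsAsFactors = FALSE
)

# Stand-in for the table read back: same content, different order and row names.
reread <- written[2:1, ]
rownames(reread) <- c("IT.AE.AETERM", "IT.AE.AESEQ")

cols <- c("dataset", "variable", "data_type")
round_trip_ok <- same_metadata(written, reread, cols)

# A changed data type should be detected.
changed <- reread
changed$data_type[1] <- "float"
round_trip_changed <- same_metadata(written, changed, cols)
```

Here round_trip_ok is TRUE and round_trip_changed is FALSE. Limiting the comparison to named columns avoids false alarms from columns that one side fills with defaults.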
spec2$var_spec[spec2$var_spec$dataset == "AE",
c("variable", "label", "data_type")]
#> variable label data_type
#> IT.AE.STUDYID STUDYID Study Identifier text
#> IT.AE.USUBJID USUBJID Unique Subject Identifier text
#> IT.AE.AESEQ AESEQ Sequence Number of AE integer
#> IT.AE.AETERM AETERM Reported Term for the Adverse Event text
Validating the spec against Define-XML rules
validate_spec_define() checks the spec object against
Define-XML conformance rules (DD-prefix). This catches issues like
missing required metadata before you submit.
result <- validate_spec_define(spec)
result$summary
#> $reject
#> [1] 0
#>
#> $high
#> [1] 61
#>
#> $medium
#> [1] 0
#>
#> $low
#> [1] 0
#>
#> $total
#> [1] 61
ADaM Analysis Results Metadata (ARM)
For ADaM submissions, herald supports the ARM 1.0 extension. Add
arm_displays and arm_results slots to your
spec:
adsl_spec <- herald_spec(
  ds_spec = data.frame(
    dataset = "ADSL", label = "Subject-Level Analysis Dataset",
    stringsAsFactors = FALSE
  ),
  var_spec = data.frame(
    dataset = c("ADSL","ADSL","ADSL"),
    variable = c("STUDYID","USUBJID","AGE"),
    label = c("Study Identifier","Unique Subject Identifier","Age"),
    data_type = c("text","text","integer"),
    length = c(12L,11L,8L),
    stringsAsFactors = FALSE
  ),
  arm_displays = data.frame(
    display_name = "Table 14.1.1",
    display_description = "Summary of Demographics",
    display_title = "Table 14.1.1 Summary of Demographic and Baseline Characteristics",
    stringsAsFactors = FALSE
  ),
  arm_results = data.frame(
    display_name = "Table 14.1.1",
    result_key = "R.AGE.MEAN",
    parameter_oid = "ADSL.AGE",
    analysis_reason = "PRIMARY OUTCOME MEASURE",
    analysis_purpose = "Analysis",
    stringsAsFactors = FALSE
  )
)
adsl_spec
#>
#> ── herald_spec ──
#>
#> • Dataset: 1
#> • Variables: 3
#> • ARM: 1 display, 1 result
#> Datasets: "ADSL"
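Both ARM tables carry a display_name column, so a quick consistency check before writing may be worthwhile. The sketch below assumes display_name is what links a result to its display (an inference from the example above, not a documented herald rule) and uses extra illustrative table names to show both kinds of mismatch.

```r
# Illustrative ARM tables; "Table 14.2.1" and "Table 14.3.1" are made up
# to demonstrate mismatches and do not come from the example spec.
arm_displays <- data.frame(
  display_name = c("Table 14.1.1", "Table 14.2.1"),
  stringsAsFactors = FALSE
)
arm_results <- data.frame(
  display_name = c("Table 14.1.1", "Table 14.3.1"),
  result_key = c("R.AGE.MEAN", "R.AE.COUNT"),
  stringsAsFactors = FALSE
)

# Results that point at a display that was never declared.
orphan_results <- setdiff(arm_results$display_name, arm_displays$display_name)

# Displays that have no analysis result attached.
empty_displays <- setdiff(arm_displays$display_name, arm_results$display_name)

orphan_results
empty_displays
```

Both vectors being empty would indicate that every result has a display and every display has at least one result.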
arm_xml <- tempfile(fileext = ".xml")
write_define_xml(adsl_spec, arm_xml, validate = FALSE)
file.exists(arm_xml)
#> [1] TRUE
P21 Excel → Define-XML workflow
The typical production workflow reads a Pinnacle 21 Excel spec and generates Define-XML in one pipeline:
spec <- read_spec("path/to/study_spec.xlsx")
write_define_xml(spec, "sdtm/define.xml")
write_define_html(spec, "sdtm/define.html", define_xml = "sdtm/define.xml")
No GUI, no Java, no license: just one herald_spec
object flowing through to a standards-compliant submission
deliverable.
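In an automated pipeline it can help to stop before writing deliverables when validation reports serious findings. The sketch below assumes a summary list shaped like the result$summary output shown earlier (reject, high, medium, low, total); gate_on_findings() is a hypothetical helper, not a herald function, and which severities count as blocking is a project decision.

```r
# Hypothetical helper: raise an error if any blocking severity has findings.
gate_on_findings <- function(summary, blocking = c("reject", "high")) {
  counts <- unlist(summary[blocking])
  n_blocking <- as.integer(sum(counts))
  if (n_blocking > 0L) {
    detail <- paste(names(counts), counts, sep = "=", collapse = ", ")
    stop(
      sprintf("Validation found %d blocking finding(s): %s", n_blocking, detail),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Literal lists standing in for validate_spec_define(spec)$summary.
clean <- list(reject = 0, high = 0, medium = 0, low = 0, total = 0)
noisy <- list(reject = 0, high = 61, medium = 0, low = 0, total = 61)

# Passes silently on a clean summary.
clean_ok <- gate_on_findings(clean)

# Captures the error message instead of aborting the session.
outcome <- tryCatch(
  gate_on_findings(noisy),
  error = function(e) conditionMessage(e)
)
outcome
```

Placed between validation and write_define_xml(), a gate like this keeps a spec with blocking findings from silently producing a define.xml.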
Before vs After
| Task | Old way | herald |
|---|---|---|
| Generate Define-XML | Pinnacle 21 Enterprise (GUI, license, Java) | write_define_xml(spec, "define.xml") |
| Render to HTML | P21 Enterprise or separate XSLT tool | write_define_html(spec, "define.html") |
| Parse existing Define-XML | Manual XML parsing or P21 | read_spec_define("define.xml") |
| Validate Define-XML | P21 Validator (Java) | validate_spec_define(spec) |
| ARM 1.0 support | P21 Enterprise only | arm_displays + arm_results slots |
What to read next
- vignette("spec-management"): building specs from P21 Excel or JSON
- vignette("validation"): dataset conformance checking
- vignette("submission-workflow"): submit() calls write_define_xml() automatically
