Serialize a data frame to a CDISC Dataset-JSON v1.1 (.json) file.
Dataset-JSON is the native home of the artoo_meta shape: the file is
the metadata block plus a flat rows array. This is the emit end of the
artoo workflow (spec -> apply_spec -> write_json), implemented as a thin
wrapper over write_dataset() with format = "json".
Usage
write_json(
  x,
  path,
  on_invalid = c("error", "replace", "ignore"),
  created = NULL,
  strict = FALSE
)
Arguments
- x
The dataset to write.
<data.frame>: required. Typically the output of apply_spec(), carrying artoo_meta.
- path
Destination .json path.
<character(1)>: required.
- on_invalid
Policy for values that are not valid UTF-8.
<character(1)>: default "error". One of "error" (abort with artoo_error_codec, naming the offenders with their invalid bytes hex-escaped), "replace" (substitute ? and warn with artoo_warning_encoding), or "ignore" (drop the invalid bytes). The same policy vocabulary as write_xpt(); text correctly read through artoo is always valid UTF-8, so this only fires on bytes that entered the frame through a mis-declared source encoding.
- created
Creation timestamp.
<POSIXct(1)> | NULL. NULL (default) stamps the current time into datasetJSONCreationDateTime; freeze it for byte-stable output.
- strict
Suppress the _artoo extension block.
<logical(1)>: default FALSE. By default the file carries a single namespaced _artoo object when (and only when) there is content strict CDISC cannot express: SAS special-missing tags (.A-.Z, ._), the recorded source encoding, and informats. Data values stay plain nulls either way, so a foreign reader degrades gracefully.
Note: strict = TRUE writes a pure closed-vocabulary file and warns (artoo_warning_codec) naming exactly what was dropped; those attributes will not survive a read-back.
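The three on_invalid policies can be pictured with base R alone. The sketch below is not artoo's implementation and raises none of its conditions; it only assumes base R's validUTF8() and iconv(), and the sample vector is made up for illustration.

```r
# Hypothetical sample: the second element holds a lone Latin-1 byte (0xE9),
# which is not a valid UTF-8 sequence.
x <- c("ok", "caf\xe9 au lait")

# Find the offenders, as an "error"-style policy would before aborting.
bad <- !validUTF8(x)

# Hex-escape the invalid bytes for a readable report (iconv writes them as <xx>).
escaped <- iconv(x[bad], "UTF-8", "UTF-8", sub = "byte")

# "replace"-style: substitute ? for each invalid byte.
replaced <- iconv(x, "UTF-8", "UTF-8", sub = "?")

# "ignore"-style: drop the invalid bytes altogether.
dropped <- iconv(x, "UTF-8", "UTF-8", sub = "")
```

Running a check like validUTF8() on character columns before calling write_json() is one way to find a mis-declared source encoding early, rather than relying on "replace" or "ignore" to paper over it.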
Details
Full metadata, no loss. Unlike .xpt, a .json file records the
complete artoo_meta: keySequence, codelist, origin, targetDataType, and
significantDigits all survive. Dates, datetimes, and times are exchanged as
ISO 8601 strings, or as SAS-epoch numbers when their targetDataType is
"integer" (the ADaM numeric-date convention); decimal rides as a string
so exact precision is preserved. The file is always UTF-8 (RFC 8259 / CDISC
v1.1). NaN and infinite values are not valid CDISC numerics and abort the
write.
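To make the two date representations concrete, here is a minimal base-R sketch of the arithmetic involved. It is an illustration under stated assumptions, not artoo's conversion code: it assumes the usual SAS origin of 1960-01-01 (days for dates, seconds for datetimes), and all variable names are hypothetical.

```r
# Base-R illustration only; artoo's own conversion code may differ.
d  <- as.Date("2020-01-01")
dt <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")

# ISO 8601 string forms.
iso_date     <- format(d, "%Y-%m-%d")
iso_datetime <- format(dt, "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# SAS-epoch numbers: days (dates) or seconds (datetimes) since 1960-01-01.
# R counts from 1970-01-01, so the two origins differ by a fixed offset.
sas_origin   <- as.Date("1960-01-01")
offset_days  <- -as.numeric(sas_origin)          # days from 1960 to 1970
sas_date     <- as.numeric(d) + offset_days
sas_datetime <- as.numeric(dt) + 86400 * offset_days

# Flag NaN and infinite values; plain NA (missing) is not flagged here.
nums <- c(1.5, NA, NaN, Inf)
not_writable <- is.nan(nums) | is.infinite(nums)
```

A pre-flight check such as any(not_writable) on numeric columns lets you clean up NaN and infinite values before the write aborts on them.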
Streaming write, whole-file read. The writer streams the rows array
in bounded slabs (a .json.gz path gzips the stream transparently), but
read_json() must parse the whole array at once. For multi-million-row
datasets prefer the NDJSON variant (write_ndjson() / read_ndjson()),
which bounds memory in both directions.
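The memory argument for a line-delimited layout can be sketched in base R. This is a toy, not the Dataset-JSON NDJSON layout and not what write_ndjson() or read_ndjson() do internally: the rows are hand-built JSON strings, and the slab and chunk sizes are arbitrary assumptions.

```r
# Hypothetical toy rows, one JSON object per line (built by hand, base R only).
rows <- sprintf('{"USUBJID":"S-%03d","AGE":%d}', 1:10, 40:49)
path <- tempfile(fileext = ".ndjson.gz")

# Write in slabs of 4 rows through a gzip connection: only one slab
# needs to be in memory at a time.
con <- gzfile(path, open = "wt")
for (start in seq(1, length(rows), by = 4)) {
  end <- min(start + 3, length(rows))
  writeLines(rows[start:end], con)
}
close(con)

# Read back in chunks of 3 lines: each chunk can be parsed and discarded
# before the next one is pulled from the connection.
con <- gzfile(path, open = "rt")
n_read <- 0L
repeat {
  chunk <- readLines(con, n = 3)
  if (length(chunk) == 0) break
  n_read <- n_read + length(chunk)
}
close(con)
```

A single JSON array offers no such line boundary to a general-purpose parser, which is why the reader has to hold the whole rows array at once.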
See also
read_json() for the inverse; write_dataset() for the generic
dispatcher.
Examples
# ---- Example 1: write a conformed dataset as Dataset-JSON ----
#
# apply_spec() attaches the metadata; write_json() serializes the full
# itemGroup plus the data rows.
adsl <- apply_spec(cdisc_adsl, adam_spec, "ADSL", conformance = "off")
#> 6 variables the spec declares are absent from the data (not added):
#> `TRTDURD`, `DISONDT`, `EOSSTT`, `DCSREAS`, `EOSDISP`, and `MMS1TSBL`.
path <- tempfile(fileext = ".json")
write_json(adsl, path)
# ---- Example 2: a frozen timestamp for reproducible bytes ----
#
# Fixing `created` makes two writes byte-identical; the columns() pane on
# the written file shows the full metadata the file carries (DM is SDTM,
# so it conforms against the bundled sdtm_spec).
dm <- apply_spec(cdisc_dm, sdtm_spec, "DM", conformance = "off")
#> 1 variable the spec declares is absent from the data (not added):
#> `BRTHDTC`.
path2 <- tempfile(fileext = ".json")
write_json(dm, path2, created = as.POSIXct("2020-01-01", tz = "UTC"))
columns(path2)
#> <artoo_columns> DM -- 25 variables, 60 obs
#>  # Variable Type Len Format Label                             Key
#>  1 STUDYID  Char   7        Study Identifier                  1
#>  2 DOMAIN   Char   2        Domain Abbreviation
#>  3 USUBJID  Char  14        Unique Subject Identifier         2
#>  4 SUBJID   Char   6        Subject Identifier for the Study
#>  5 RFSTDTC  Char  10        Subject Reference Start Date/Time
#>  6 RFENDTC  Char  10        Subject Reference End Date/Time
#>  7 SITEID   Char   3        Study Site Identifier
#>  8 AGE      Num             Age
#>  9 AGEU     Char   5        Age Units
#> 10 SEX      Char  16        Sex
#> 11 RACE     Char  41        Race
#> 12 ETHNIC   Char  22        Ethnicity
#> 13 ARMCD    Char   8        Planned Arm Code
#> 14 ARM      Char  20        Description of Planned Arm
#> 15 COUNTRY  Char   3        Country
#> 16 RFXSTDTC Char  10
#> 17 RFXENDTC Char  10
#> 18 RFICDTC  Char   1
#> 19 RFPENDTC Char  16
#> 20 DTHDTC   Char  10
#> 21 DTHFL    Char   1
#> 22 ACTARMCD Char   8
#> 23 ACTARM   Char  20
#> 24 DMDTC    Char  10
#> 25 DMDY     Num