
Writes a data frame (or named list of data frames) to an XPT transport file in V5 (FDA submission standard) or V8 (extended) format. Pure R implementation with no SAS or haven dependency.

Usage

write_xpt(
  x,
  file,
  version = 5,
  dataset = NULL,
  label = NULL,
  encoding = "wlatin1"
)

Arguments

x

A data frame, or a named list of data frames for multiple members.

file

File path for the output .xpt file.

version

Transport format version: 5 (default, FDA standard) or 8 (extended names/labels).

dataset

Dataset name (e.g., "DM"). Default: the "dataset_name" attribute of x (set by read_xpt(), read_json(), or apply_spec()), then the uppercase file stem ("sdtm/dm.xpt" → "DM"), then "DATA". V5: max 8 characters, uppercased. V8: max 32 characters.

label

Dataset label. Defaults to attr(x, "label") or "".

encoding

Character encoding for the output file. Defaults to "wlatin1" (SAS WLATIN1 = Windows-1252), which converts UTF-8 text to single-byte Windows-1252 for SAS compatibility. Accepts SAS encoding names ("wlatin1", "latin1", "utf-8", "shift-jis") or standard names ("WINDOWS-1252", "ISO-8859-1"). Set to NULL to write bytes as-is without conversion.
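The fallback order for the dataset argument can be sketched in base R. This is only an illustration of the documented order, not the package's internal code: resolve_dataset_name() is a hypothetical helper, and the condition under which the file stem is skipped in favour of "DATA" (here, an empty stem) is an assumption.

```r
# Hypothetical helper mirroring the documented fallback order for `dataset`.
resolve_dataset_name <- function(x, file, dataset = NULL, version = 5) {
  # 1. An explicit argument wins; otherwise try the "dataset_name" attribute.
  if (is.null(dataset)) dataset <- attr(x, "dataset_name", exact = TRUE)
  # 2. Otherwise use the uppercase file stem, 3. and finally "DATA".
  if (is.null(dataset)) {
    stem <- tools::file_path_sans_ext(basename(file))
    dataset <- if (nzchar(stem)) toupper(stem) else "DATA"  # assumption: empty stem
  }
  # V5 names are uppercased; V8 keeps the case as given.
  if (version == 5) toupper(dataset) else dataset
}

dm <- data.frame(AGE = c(35, 41))
resolve_dataset_name(dm, "sdtm/dm.xpt")

attr(dm, "dataset_name") <- "AE"
resolve_dataset_name(dm, "sdtm/dm.xpt")
```

The first call falls through to the file stem; the second picks up the attribute. Length limits (8 characters for V5, 32 for V8) are left out because how over-long names are handled is not described here.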

Value

x invisibly (the input data frame, not the file path).

Details

If the herald.sort_keys attribute is set on x (e.g., by apply_spec() or sort_keys()), the data is sorted by those keys before writing.
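As a rough picture of what sorting by keys amounts to, the base R sketch below orders a data frame by the columns named in the attribute. It assumes the attribute holds a character vector of column names, which is not stated above; in practice the attribute would normally be set by apply_spec() or sort_keys() rather than by hand.

```r
dm <- data.frame(
  USUBJID = c("S-003", "S-001", "S-002"),
  AGE     = c(41, 35, 58),
  stringsAsFactors = FALSE
)

# Assumption: the attribute is a character vector of column names.
attr(dm, "herald.sort_keys") <- "USUBJID"

keys <- attr(dm, "herald.sort_keys", exact = TRUE)

# Order rows by each key column in turn; unname() keeps column names
# from being mistaken for named arguments of order().
row_order <- do.call(order, unname(as.list(dm[keys])))
sorted <- dm[row_order, , drop = FALSE]

sorted$USUBJID
```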

Date/datetime handling

R Date columns are converted to SAS date values (days since 1960-01-01) and automatically assigned format.sas = "DATE9." unless the column already has a format.sas attribute. Similarly, POSIXct columns are converted to SAS datetime values (seconds since 1960-01-01 00:00:00 UTC) with format.sas = "DATETIME20.". The format.sas attribute is written into the XPT NAMESTR header so that SAS recognizes the variable as a date or datetime. Informats are not auto-set (matching SAS behaviour); set informat.sas on the column before writing if needed.
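The epoch shift itself is simple arithmetic and can be checked in base R. The helpers to_sas_date() and to_sas_datetime() below are hypothetical names used for illustration; they are not part of the package.

```r
# R counts from 1970-01-01; SAS counts from 1960-01-01.
sas_date_epoch     <- as.Date("1960-01-01")
sas_datetime_epoch <- as.POSIXct("1960-01-01 00:00:00", tz = "UTC")

# Hypothetical helpers: days / seconds since the SAS epoch.
to_sas_date     <- function(d) as.numeric(d) - as.numeric(sas_date_epoch)
to_sas_datetime <- function(t) as.numeric(t) - as.numeric(sas_datetime_epoch)

to_sas_date(as.Date("1970-01-01"))
# 3653
to_sas_datetime(as.POSIXct("1970-01-01 00:00:00", tz = "UTC"))
# 315619200

# Overriding the automatic format and supplying an informat before writing:
rfstdtc <- as.Date(c("2024-03-15", NA))
attr(rfstdtc, "format.sas")   <- "DATE11."  # kept instead of the default DATE9.
attr(rfstdtc, "informat.sas") <- "DATE9."   # informats are never set automatically
```

The ten years between the two epochs include three leap days (1960, 1964, 1968), hence 3653 days.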

SAS missing values

  • Numeric NA, NaN, Inf, and -Inf are written as SAS missing (.). NA dates and datetimes are also written as SAS missing.

  • Character NA values are written as blank strings (spaces).
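To see in advance which values would end up as SAS missing, a base R check along these lines may help. It only mirrors the two rules above and does not call the package.

```r
num <- c(1.5, NA, NaN, Inf, -Inf)
chr <- c("PLACEBO", NA)

# NA, NaN, Inf and -Inf all fail is.finite(), so this flags every
# numeric element that would be written as SAS missing (.).
will_be_missing <- !is.finite(num)
will_be_missing
# FALSE  TRUE  TRUE  TRUE  TRUE

# Character NA is written as blanks; an empty string stands in for that here.
chr_out <- ifelse(is.na(chr), "", chr)
chr_out
```

Because all four numeric cases collapse to the same missing value, the distinction between them is not preserved in the file.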

V5 constraints

Variable names must be at most 8 characters (A-Z, 0-9, underscore only), character variables at most 200 bytes, labels at most 40 characters. All names are uppercased.
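A pre-flight check for these limits could look like the sketch below. check_v5() is a hypothetical helper, not a package function. It assumes variable labels are stored in a "label" attribute on each column, and it measures byte lengths in the session encoding rather than the output encoding, so treat the result as approximate.

```r
# Hypothetical helper: returns a character vector of V5 problems (empty if none).
check_v5 <- function(x) {
  problems <- character()
  for (nm in names(x)) {
    col <- x[[nm]]
    up  <- toupper(nm)  # names are uppercased on write
    if (nchar(up) > 8 || !grepl("^[A-Z0-9_]+$", up)) {
      problems <- c(problems, sprintf("%s: invalid V5 variable name", nm))
    }
    if (is.character(col) &&
        any(nchar(col, type = "bytes") > 200, na.rm = TRUE)) {
      problems <- c(problems, sprintf("%s: character value over 200 bytes", nm))
    }
    lbl <- attr(col, "label", exact = TRUE)  # assumption: per-column "label" attribute
    if (!is.null(lbl) && nchar(lbl) > 40) {
      problems <- c(problems, sprintf("%s: label over 40 characters", nm))
    }
  }
  problems
}

adsl <- data.frame(
  USUBJID      = "S-001",
  VERYLONGNAME = 1,
  stringsAsFactors = FALSE
)
attr(adsl$USUBJID, "label") <- strrep("x", 41)

check_v5(adsl)
```

Here the helper reports two problems: the 12-character variable name and the 41-character label.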

V8 extensions

Variable names up to 32 characters with mixed case. Labels up to 256 characters via LABELV8/LABELV9 extension records.

Character encoding

By default, write_xpt() converts UTF-8 character data to WLATIN1 (Windows-1252) before writing. This ensures the XPT file is compatible with SAS sessions using the default WLATIN1 encoding. For pure ASCII data, the conversion is a no-op. See read_xpt() for the full encoding reference table.
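The effect of the conversion on the bytes can be illustrated with base R's iconv(). This is a sketch of the general UTF-8 to Windows-1252 conversion, not necessarily the exact call the package makes.

```r
utf8_text <- enc2utf8("caf\u00e9")

# Convert UTF-8 to Windows-1252 (the encoding SAS calls WLATIN1).
wlatin1_text <- iconv(utf8_text, from = "UTF-8", to = "WINDOWS-1252")

charToRaw(utf8_text)     # 63 61 66 c3 a9  (two bytes for the accented letter)
charToRaw(wlatin1_text)  # 63 61 66 e9     (one byte after conversion)

# Pure ASCII passes through unchanged.
ascii_out <- iconv("USUBJID", from = "UTF-8", to = "WINDOWS-1252")
identical(charToRaw(ascii_out), charToRaw("USUBJID"))
```

The shorter byte length after conversion matters for the V5 limit on character variables, which is expressed in bytes.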