Infers the ADaM dataset class from the variables present in a dataset or spec. Uses variable signatures rather than dataset name conventions so it works across companies that name their ADaM datasets differently.

Classes returned:

ADSL

Subject-level: one row per subject, no PARAMCD/AVAL.

BDS

Basic Data Structure: PARAMCD + AVAL present. Includes lab, vitals, ECG, exposure, and similar parameter-based datasets.

TTE

Time-to-event: BDS signature plus CNSR (censoring indicator).

OCCDS

Occurrence Data Structure: occurrence-based records without a PARAMCD/AVAL parameter structure, e.g. AE, CM, MH, CE.

unknown

Insufficient variables to determine class.

Usage

detect_adam_class(vars)

Arguments

vars

Character vector of variable names (uppercase). Can be column names from a data frame or the variable column from a spec.
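As a minimal sketch of preparing `vars` from the two sources mentioned above: the toy data frame, the toy spec, and the spec column names `dataset` and `variable` are all made up for illustration, and a real spec may use different column names. The classifier is only called if it is available in the session.

```r
# Toy data frame with mixed-case column names (illustrative only).
adsl <- data.frame(studyid = "S1", usubjid = "S1-001", Age = 34, sex = "F")

# Toy spec with one row per variable; the column names `dataset` and
# `variable` are assumptions, not a documented spec layout.
spec <- data.frame(
  dataset  = c("ADAE", "ADAE", "ADAE"),
  variable = c("USUBJID", "AETERM", "AEDECOD"),
  stringsAsFactors = FALSE
)

# The argument is documented as uppercase, so normalise case first.
vars_from_df   <- toupper(names(adsl))
vars_from_spec <- toupper(spec$variable[spec$dataset == "ADAE"])

# Only call the classifier when the package providing it is loaded.
if (exists("detect_adam_class")) {
  print(detect_adam_class(vars_from_df))
  print(detect_adam_class(vars_from_spec))
}
```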

Value

A single character string: one of "ADSL", "BDS", "TTE", "OCCDS", or "unknown".
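Because the result is always a length-one character string, the function fits naturally into `vapply(..., character(1))` when classifying several datasets at once. The sketch below is one possible pattern, not part of the package: `classify_datasets()` and `stub_classify()` are hypothetical names, and the stub only stands in for `detect_adam_class()` so the snippet runs on its own.

```r
# Hypothetical helper: classify every data frame in a named list.
# `classify` is any function with the same contract as detect_adam_class():
# character vector of variable names in, single string out.
classify_datasets <- function(datasets, classify) {
  vapply(
    datasets,
    function(d) classify(toupper(names(d))),
    character(1)
  )
}

# Stand-in classifier, used only to keep this sketch self-contained.
stub_classify <- function(vars) if ("PARAMCD" %in% vars) "BDS" else "unknown"

datasets <- list(
  advs = data.frame(USUBJID = "S1-001", PARAMCD = "SYSBP", AVAL = 120),
  misc = data.frame(X = 1)
)

# Returns a named character vector, one class per dataset.
classes <- classify_datasets(datasets, stub_classify)
```

With the package loaded, `classify_datasets(datasets, detect_adam_class)` would be the corresponding call.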

Details

Signature rules (evaluated in order):

  1. TTE: PARAMCD + AVAL + CNSR all present.

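  2. BDS: PARAMCD + AVAL present (without CNSR). The AVALC example below is also classified as BDS, so a character analysis value appears to satisfy this rule as well.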

  3. ADSL: USUBJID present, no PARAMCD, no AVAL, no occurrence-flag pattern.

  4. OCCDS: USUBJID present, plus either (a) a term variable (*TERM, *DECOD, *DOSE), or (b) at least two occurrence-flag variables matching *FL and no PARAMCD.

  5. unknown: none of the above.
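The ordered rules above can be sketched as a chain of early returns. This is a hypothetical re-implementation for illustration, not the package source. The name `sketch_adam_class()` is made up, and three readings are assumptions: that AVALC may stand in for AVAL in the BDS rule (suggested by the AVALC example), that the "occurrence-flag pattern" in rule 3 means the term-or-flags pattern from rule 4, and that "no PARAMCD" in rule 4 applies to branch (b) only.

```r
# Hypothetical sketch of the documented rule order; not the package source.
sketch_adam_class <- function(vars) {
  vars <- toupper(vars)
  has <- function(x) x %in% vars

  # Assumption: AVALC also counts as an analysis value for the BDS rule.
  has_value <- has("AVAL") || has("AVALC")

  # Assumption: the occurrence pattern is a term variable (name ending in
  # TERM, DECOD or DOSE) or at least two variables whose names end in FL.
  has_term <- any(grepl("(TERM|DECOD|DOSE)$", vars))
  n_flags  <- sum(grepl("FL$", vars))
  occ_like <- has_term || n_flags >= 2

  # Rule 1: TTE is checked first because it is a special case of BDS.
  if (has("PARAMCD") && has("AVAL") && has("CNSR")) return("TTE")
  # Rule 2: BDS.
  if (has("PARAMCD") && has_value) return("BDS")
  # Rule 3: ADSL.
  if (has("USUBJID") && !has("PARAMCD") && !has("AVAL") && !occ_like) {
    return("ADSL")
  }
  # Rule 4: OCCDS. Assumption: "no PARAMCD" qualifies branch (b) only.
  if (has("USUBJID") && (has_term || (n_flags >= 2 && !has("PARAMCD")))) {
    return("OCCDS")
  }
  # Rule 5: nothing matched.
  "unknown"
}
```

Under this reading, a subject-level vector carrying two or more *FL variables (for example SAFFL and ITTFL) falls through to OCCDS rather than ADSL. Whether the package itself behaves that way is not stated here.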

Examples

detect_adam_class(c("STUDYID", "USUBJID", "AGE", "SEX", "RACE"))         # "ADSL"
#> [1] "ADSL"
detect_adam_class(c("USUBJID", "PARAMCD", "PARAM", "AVAL",  "BASE"))     # "BDS"
#> [1] "BDS"
detect_adam_class(c("USUBJID", "PARAMCD", "PARAM", "AVALC", "DTYPE"))    # "BDS"
#> [1] "BDS"
detect_adam_class(c("USUBJID", "PARAMCD", "AVAL", "CNSR", "STARTDT"))    # "TTE"
#> [1] "TTE"
detect_adam_class(c("USUBJID", "AETERM", "AEDECOD", "AESTDTC"))          # "OCCDS"
#> [1] "OCCDS"