Generate data sets under realistic parameter configurations

Generates a simulation instance, i.e. a list of multiple datasets to be processed (analyzed) with process_instance. Ground truth parameters (sensitivity and specificity) are initially generated according to a generative model in which multiple decision rules (with different parameter values) are derived by thresholding multiple biomarkers.

This function is only needed for simulation via batchtools; it is not relevant for interactive use.

Usage

generate_instance_roc(
  nrep = 10,
  n = 100,
  prev = 0.5,
  random = FALSE,
  m = 10,
  auc = "seq(0.85, 0.95, length.out = 5)",
  rhose = 0.5,
  rhosp = 0.5,
  dist = "normal",
  e = 10,
  k = 100,
  delta = 0,
  ...,
  data = NULL,
  job = NULL
)

Arguments

nrep: (numeric)
integer, number of datasets in the generated instance
n: (numeric)
integer, total sample size
prev: (numeric)
disease prevalence
random: (logical)
fixed prevalence (FALSE) or simple random sampling (TRUE)
m: (numeric)
integer, number of candidates
auc: (numeric)
vector of AUCs of biomarkers
rhose: (numeric)
correlation parameter for sensitivity
rhosp: (numeric)
correlation parameter for specificity
dist: (character)
either "normal" or "exponential" specifying the subgroup biomarker distributions
e: (numeric)
emulates better (worse) model selection quality with higher (lower) values of e
k: (numeric)
technical parameter which adjusts grid size
delta: (numeric)
relative importance of sensitivity versus specificity (default 0: equal importance)
...: (any)
further arguments
data: (NULL)
ignored (for batchtools compatibility)
job: (NULL)
ignored (for batchtools compatibility)
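
The difference between the two settings of random can be sketched in base R. This is only an illustration of the two sampling schemes under one plausible reading of the argument description (a fixed number of diseased subjects versus an independent Bernoulli draw per subject); the variable names are hypothetical and the code is not taken from the package.

```r
# Illustrative sketch only; not the package's internal code.
set.seed(42)

n    <- 100   # total sample size
prev <- 0.5   # disease prevalence

# random = FALSE (assumed reading): the number of diseased subjects is fixed
n_diseased <- round(n * prev)
y_fixed    <- rep(c(1L, 0L), times = c(n_diseased, n - n_diseased))

# random = TRUE (assumed reading): disease status is drawn independently
# for each subject, so the observed prevalence varies between datasets
y_random <- rbinom(n, size = 1, prob = prev)

mean(y_fixed)    # observed prevalence equals prev
mean(y_random)   # observed prevalence fluctuates around prev
```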

Value

(list)
a single (ROC) simulation instance of length nrep

Details

Uses the same arguments as draw_data_roc unless stated otherwise above.
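
The idea of deriving decision rules by thresholding a biomarker can be illustrated with a minimal base R sketch. It assumes an equal-variance binormal model (healthy subjects follow N(0, 1), diseased subjects follow N(mu, 1)), for which the AUC equals pnorm(mu / sqrt(2)). The cutoffs and variable names are hypothetical, and the sketch is not a reproduction of what draw_data_roc does internally.

```r
# Illustrative only: equal-variance binormal model, not the package's internals.
set.seed(1)

auc  <- 0.90                    # target AUC of one (hypothetical) biomarker
mu   <- sqrt(2) * qnorm(auc)    # mean shift in the diseased group giving that AUC
cuts <- c(0.5, 1.0, 1.5)        # hypothetical thresholds, one decision rule each

# Ground truth parameters of the rule "biomarker > cut => test positive"
truth <- data.frame(
  cut  = cuts,
  sens = 1 - pnorm(cuts - mu),  # P(marker > cut | diseased)
  spec = pnorm(cuts)            # P(marker <= cut | healthy)
)

# One simulated dataset with fixed prevalence
n    <- 100
prev <- 0.5
y    <- rep(c(1, 0), times = c(round(n * prev), n - round(n * prev)))
x    <- rnorm(n, mean = mu * y, sd = 1)

# Empirical sensitivity and specificity of each rule in the simulated data
est <- data.frame(
  cut  = cuts,
  sens = sapply(cuts, function(cc) mean(x[y == 1] > cc)),
  spec = sapply(cuts, function(cc) mean(x[y == 0] <= cc))
)

truth
est
```

Raising the threshold lowers the true sensitivity and raises the true specificity, which is why different cutoffs on the same biomarker yield candidate rules with different parameter values.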