Generate data sets under realistic parameter configurations
Source: R/generate_instance.R
Generates a simulation instance: a list of multiple datasets to be processed (analyzed) with process_instance. Ground-truth parameters (sensitivity and specificity) are initially generated according to a generative model in which multiple decision rules (with different parameter values) are derived by thresholding multiple biomarkers.
This function is only needed for simulation via batchtools and is not relevant for interactive use.
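The idea of deriving decision rules by thresholding biomarkers can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: it assumes a binormal model with unit variances (healthy subjects follow N(0, 1), diseased subjects follow N(mu, 1)), and the cutoffs and the object names `mu`, `cutoffs`, and `rules` are hypothetical.

```r
# Illustrative binormal model (an assumption, not the package's internal code):
# healthy subjects ~ N(0, 1), diseased subjects ~ N(mu, 1).
auc <- seq(0.85, 0.95, length.out = 5)  # one AUC per biomarker
mu  <- sqrt(2) * qnorm(auc)             # because AUC = pnorm(mu / sqrt(2))

# Two hypothetical cutoffs per biomarker; a test is positive if marker > cutoff.
cutoffs <- c(0.5, 1.0)

# One row per decision rule (biomarker x cutoff).
rules <- expand.grid(cutoff = cutoffs, biomarker = seq_along(auc))

# Ground-truth accuracy of each rule under the assumed model.
rules$se <- 1 - pnorm(rules$cutoff - mu[rules$biomarker])  # P(marker > cutoff | diseased)
rules$sp <- pnorm(rules$cutoff)                            # P(marker <= cutoff | healthy)

print(rules)
```

Each biomarker yields several candidate rules that trade sensitivity against specificity, so five biomarkers with two cutoffs each give ten candidates in this sketch.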
Usage
generate_instance_roc(
  nrep = 10,
  n = 100,
  prev = 0.5,
  random = FALSE,
  m = 10,
  auc = "seq(0.85, 0.95, length.out = 5)",
  rhose = 0.5,
  rhosp = 0.5,
  dist = "normal",
  e = 10,
  k = 100,
  delta = 0,
  ...,
  data = NULL,
  job = NULL
)
Arguments
- nrep
(numeric)
integer, number of instances
- n
(numeric)
integer, total sample size
- prev
(numeric)
disease prevalence
- random
(logical)
fixed prevalence (FALSE) or simple random sampling (TRUE)
- m
(numeric)
integer, number of candidates
- auc
(numeric)
vector of AUCs of biomarkers
- rhose
(numeric)
correlation parameter for sensitivity
- rhosp
(numeric)
correlation parameter for specificity
- dist
(character)
either "normal" or "exponential" specifying the subgroup biomarker distributions
- e
(numeric)
emulates better (worse) model selection quality with higher (lower) values of e
- k
(numeric)
technical parameter which adjusts grid size
- delta
(numeric)
specify importance between sensitivity and specificity (default 0: equal importance)
- ...
(any)
further arguments
- data
(NULL)
ignored (for batchtools compatibility)
- job
(NULL)
ignored (for batchtools compatibility)
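The difference between the two settings of random can be illustrated with a short base R sketch. It is an assumption about what "fixed prevalence" and "simple random sampling" mean for the number of diseased subjects, not a copy of the package's implementation; the names `n1_fixed` and `n1_random` are hypothetical.

```r
set.seed(1)
n    <- 100   # total sample size
prev <- 0.5   # disease prevalence

# random = FALSE (assumed behaviour): the number of diseased subjects is
# fixed by design at n * prev.
n1_fixed <- round(n * prev)

# random = TRUE (assumed behaviour): each subject is diseased with
# probability prev, so the number of diseased subjects varies between samples.
n1_random <- rbinom(1, size = n, prob = prev)

c(fixed = n1_fixed, random = n1_random)
```

Under simple random sampling the observed prevalence fluctuates around prev, which adds variability to the estimated sensitivity and specificity compared with the fixed design.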
Details
Uses the same arguments as draw_data_roc unless stated otherwise above.