This function splits continuous values, e.g. (risk) scores or (bio)markers, into two or more categories by specifying one or more cutoff values.
Arguments
- values
(matrix)
numeric matrix of continuous values to be categorized. Assume an (n x r) matrix with n observations (subjects) of r continuous values.
- cutoffs
(numeric | matrix)
numeric matrix of dimension m x k. Each row of cutoffs defines a split into k+1 distinct categories. Each row must contain distinct values. In the simplest case (k=1), cutoffs is a single-column matrix whereby each row defines a binary split (<=t vs. >t). In this case (k=1), cutoffs can also be a numeric vector.
- map
(numeric)
integer vector of length m with values in 1:r, whereby r = ncol(values). map_l indicates which column of values should be categorized by ...
- labels
(character)
character vector of length m (= number of predictions)
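The splitting rule described for cutoffs can be sketched in base R. This is only an illustration under the assumption, consistent with the example output on this page, that a value's category is the number of cutoffs it strictly exceeds. The helper name count_exceeded is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): the category of each value
# is assumed to be the number of cutoffs that the value strictly exceeds.
count_exceeded <- function(x, cuts) {
  # outer() builds a length(x) x length(cuts) logical matrix of x > cut
  rowSums(outer(x, cuts, FUN = ">"))
}

x <- c(-1.5, -1, 0.3, 2)

# k = 3 cutoffs give k + 1 = 4 categories (0, 1, 2, 3)
multi <- count_exceeded(x, c(-1, 0, 1))

# k = 1 cutoff gives a binary split (<= 0 vs. > 0)
binary <- count_exceeded(x, 0)
```

A value equal to a cutoff stays in the lower category, which matches the (<=t vs. >t) convention stated for the binary case.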
Examples
set.seed(123)
M <- as.data.frame(mvtnorm::rmvnorm(20, mean = rep(0, 3), sigma = 2 * diag(3)))
M
#>             V1          V2         V3
#> 1  -0.79263226 -0.32552013  2.2043464
#> 2   0.09971392  0.18284047  2.4254682
#> 3   0.65183395 -1.78906676 -0.9713566
#> 4  -0.63026120  1.73111308  0.5088536
#> 5   0.56677642  0.15652900 -0.7860781
#> 6   2.52707679  0.70406690 -2.7812167
#> 7   0.99186703 -0.66862802 -1.5101308
#> 8  -0.30826308 -1.45098941 -1.0308079
#> 9  -0.88393901 -2.38534456  1.1848098
#> 10  0.21690234 -1.60956869  1.7731621
#> 11  0.60311149 -0.41729409  1.2658988
#> 12  1.24186829  1.16189111  0.9738844
#> 13  0.78335786 -0.08755638 -0.4326965
#> 14 -0.53806725 -0.98246403 -0.2940394
#> 15 -1.78954068  3.06736694  1.7083162
#> 16 -1.58831539 -0.56976520 -0.6599503
#> 17  1.10303725 -0.11790166  0.3582465
#> 18 -0.04037121 -0.06062798  1.9354959
#> 19 -0.31928839  2.14461330 -2.1902672
#> 20  0.82676869  0.17515635  0.3053875
# Without further arguments, each column is split at the cutoff 0 (<= 0 vs. > 0)
categorize(M)
#>    V1_0 V2_0 V3_0
#> 1     0    0    1
#> 2     1    1    1
#> 3     1    0    0
#> 4     0    1    1
#> 5     1    1    0
#> 6     1    1    0
#> 7     1    0    0
#> 8     0    0    0
#> 9     0    0    1
#> 10    1    0    1
#> 11    1    0    1
#> 12    1    1    1
#> 13    1    0    0
#> 14    0    0    0
#> 15    0    1    1
#> 16    0    0    0
#> 17    1    0    1
#> 18    0    0    1
#> 19    0    1    0
#> 20    1    1    1
# Six sets of cutoffs (m = 6 rows), each with k = 3 cutoffs, i.e. four categories 0..3
C <- matrix(rep(c(-1, 0, 1, -2, 0, 2), 3), ncol = 3, byrow = TRUE)
C
#>      [,1] [,2] [,3]
#> [1,]   -1    0    1
#> [2,]   -2    0    2
#> [3,]   -1    0    1
#> [4,]   -2    0    2
#> [5,]   -1    0    1
#> [6,]   -2    0    2
# map: rows 1-2 of C are applied to column 1 of M, rows 3-4 to column 2, rows 5-6 to column 3
w <- c(1, 1, 2, 2, 3, 3)
categorize(M, C, w)
#>    V1_a V1_b V2_a V2_b V3_a V3_b
#> 1     1    1    1    1    3    3
#> 2     2    2    2    2    3    3
#> 3     2    2    0    1    1    1
#> 4     1    1    3    2    2    2
#> 5     2    2    2    2    1    1
#> 6     3    3    2    2    0    0
#> 7     2    2    1    1    0    1
#> 8     1    1    0    1    0    1
#> 9     1    1    0    0    3    2
#> 10    2    2    0    1    3    2
#> 11    2    2    1    1    3    2
#> 12    3    2    3    2    2    2
#> 13    2    2    1    1    1    1
#> 14    1    1    1    1    1    1
#> 15    0    1    3    3    3    2
#> 16    0    1    1    1    1    1
#> 17    3    2    1    1    2    2
#> 18    1    1    1    1    3    2
#> 19    1    1    3    3    0    0
#> 20    2    2    2    2    2    2
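To make the role of map in the last call more concrete, the first three rows of that result can be recomputed with base R alone. This is a hedged sketch, not the package's implementation: it assumes that row l of the cutoff matrix is applied to column w[l] of the data, and it copies the first three rows of M from the printed output above. The object names M3 and res are illustrative, and the column names are set by hand to mirror the printed labels.

```r
# First three rows of M, copied from the printed output above
M3 <- data.frame(
  V1 = c(-0.79263226, 0.09971392, 0.65183395),
  V2 = c(-0.32552013, 0.18284047, -1.78906676),
  V3 = c(2.2043464, 2.4254682, -0.9713566)
)
C <- matrix(rep(c(-1, 0, 1, -2, 0, 2), 3), ncol = 3, byrow = TRUE)
w <- c(1, 1, 2, 2, 3, 3)

# Assumption: row l of C categorizes column w[l] of the data.
# findInterval(..., left.open = TRUE) counts the cutoffs strictly below x,
# so a value equal to a cutoff falls into the lower category.
res <- sapply(seq_len(nrow(C)), function(l) {
  findInterval(M3[[w[l]]], C[l, ], left.open = TRUE)
})

# Column names chosen by hand to mirror the labels in the printed output
colnames(res) <- paste0("V", w, "_", c("a", "b"))
res
```

Under these assumptions, res agrees with rows 1 to 3 of the categorize(M, C, w) output shown above. The third subject's V2 value, for instance, falls into category 0 under the cutoffs (-1, 0, 1) but into category 1 under (-2, 0, 2).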