Takes the output of specify (a simpr_spec object) and defines the metaparameters (i.e. simulation factors).
define(.x = NULL, ..., .list = NULL, .suffix = "_index")
.x: a simpr_spec object (the output of specify).
...: metaparameters, given as named arguments containing vectors or lists of objects to be used in the simulation.
.list: additional parameters passed to define() as a list. Useful if you already have the desired metaparameters in list format or created by other functions.
.suffix: suffix to append to the name of the index column for list metaparameters, "_index" by default. See Details.
This is the second step in the simulation process, after specifying the simulated data using specify. The output of define is then passed to generate to actually generate the simulation.
Metaparameters are named arguments, passed via ..., that are used in the simulation. A metaparameter is a vector or list representing something that is to be systematically varied as part of the simulation design. Any metaparameter should also appear in the formulas of specify, so that the simulation changes depending on the value of the metaparameter.
When the simulation is created, simulations for all possible combinations of metaparameters are generated, resulting in a fully crossed simulation design. If only a subset of the fully crossed design is needed, use the filtering options available in generate.
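One rough way to picture a fully crossed design is base R's expand.grid(), which enumerates every combination of its inputs. The sketch below is an illustration only, not simpr's internal implementation; the values of n and mu are borrowed from the examples on this page.

```r
# Illustration only: enumerate every combination of two metaparameters
# using base R. simpr's own internals may work differently.
n  <- c(5, 10)
mu <- seq(-1, 1, length.out = 3)

design <- expand.grid(n = n, mu = mu)

nrow(design)  # 2 values of n times 3 values of mu
design        # one row per combination of n and mu
```

Each row of design corresponds to one condition in the crossed design.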
When one of the arguments in ... is a list, a new column is generated in the output of generate to serve as the index of the list. The name of this new column is the name of the list argument with the .suffix argument appended to the end. So if Y = list(a = 1:2, b = letters[2:3]) and .suffix = "_index" (the default), a column named Y_index is added to the output of generate, with values "a" and "b".
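The naming rule for the index column can be mimicked in a few lines of base R. This is a sketch of the rule as described above, not simpr's actual code; the variable names suffix, index_column and index_values are made up for the illustration.

```r
# Illustration only: how the index column's name and values relate to a
# named list metaparameter. This mimics the rule, not simpr's internals.
Y <- list(a = 1:2, b = letters[2:3])
suffix <- "_index"                   # stands in for the .suffix argument

index_column <- paste0("Y", suffix)  # name of the new column
index_values <- names(Y)             # values stored in that column

index_column
index_values
```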
library(simpr)

# Simple example of setting a metaparameter
simple_meta = specify(a = ~ 1 + rnorm(n)) %>%
  define(n = c(5, 10)) %>%
  generate(1)
simple_meta # $sim has a 5-row tibble and a 10-row tibble
#> full tibble
#> --------------------------
#> # A tibble: 2 × 4
#>   .sim_id     n   rep sim
#>     <int> <dbl> <int> <list>
#> 1       1     5     1 <tibble [5 × 1]>
#> 2       2    10     1 <tibble [10 × 1]>
#>
#> sim[[1]]
#> --------------------------
#> # A tibble: 5 × 1
#>       a
#>   <dbl>
#> 1 0.377
#> 2 0.667
#> 3 1.90
#> 4 0.141
#> 5 0.283
#>
multi_meta = specify(a = ~ mu + rnorm(n)) %>%
  define(n = c(5, 10),
         mu = seq(-1, 1, length.out = 3)) %>%
  generate(1)
multi_meta # generates simulations for all combos of n and mu
#> full tibble
#> --------------------------
#> # A tibble: 6 × 5
#>   .sim_id     n    mu   rep sim
#>     <int> <dbl> <dbl> <int> <list>
#> 1       1     5    -1     1 <tibble [5 × 1]>
#> 2       2    10    -1     1 <tibble [10 × 1]>
#> 3       3     5     0     1 <tibble [5 × 1]>
#> 4       4    10     0     1 <tibble [10 × 1]>
#> 5       5     5     1     1 <tibble [5 × 1]>
#> 6       6    10     1     1 <tibble [10 × 1]>
#>
#> sim[[1]]
#> --------------------------
#> # A tibble: 5 × 1
#>        a
#>    <dbl>
#> 1 -0.114
#> 2 -0.807
#> 3 -0.863
#> 4  0.488
#> 5 -1.45
#>
# define can handle lists which can contain multiple matrices, etc.
meta_list_out = specify(a = ~ MASS::mvrnorm(n, rep(0, 2), Sigma = S)) %>%
  define(n = c(10, 20, 30),
         S = list(independent = diag(2), correlated = diag(2) + 2)) %>%
  generate(1)
meta_list_out # generates S_index column
#> full tibble
#> --------------------------
#> # A tibble: 6 × 6
#>   .sim_id     n S_index       rep S             sim
#>     <int> <dbl> <chr>       <int> <list>        <list>
#> 1       1    10 independent     1 <dbl [2 × 2]> <tibble [10 × 2]>
#> 2       2    20 independent     1 <dbl [2 × 2]> <tibble [20 × 2]>
#> 3       3    30 independent     1 <dbl [2 × 2]> <tibble [30 × 2]>
#> 4       4    10 correlated      1 <dbl [2 × 2]> <tibble [10 × 2]>
#> 5       5    20 correlated      1 <dbl [2 × 2]> <tibble [20 × 2]>
#> 6       6    30 correlated      1 <dbl [2 × 2]> <tibble [30 × 2]>
#>
#> sim[[1]]
#> --------------------------
#> # A tibble: 10 × 2
#>       a_1    a_2
#>     <dbl>  <dbl>
#>  1 -1.00   1.82
#>  2  1.44  -0.358
#>  3 -0.180  1.07
#>  4  2.14  -1.14
#>  5  0.169  0.862
#>  6 -1.29   1.56
#>  7  2.19   0.533
#>  8 -1.58  -0.739
#>  9  1.80  -0.376
#> 10  0.219 -0.139
#>
# define can also take arguments as a list using the .list argument
meta_list_out_2 = specify(a = ~ MASS::mvrnorm(n, rep(0, 2), Sigma = S)) %>%
  define(.list = list(n = c(10, 20, 30),
                      S = list(independent = diag(2), correlated = diag(2) + 2))) %>%
  generate(1)