# S3 method for simpr_spec
generate(
  x,
  .reps,
  ...,
  .sim_name = "sim",
  .quiet = TRUE,
  .warn_on_error = TRUE,
  .stop_on_error = FALSE,
  .debug = FALSE,
  .progress = FALSE,
  .options = furrr_options(seed = TRUE)
)
x
  a simpr_spec object generated by define or specify, containing the specifications of the simulation

.reps
  number of replications to run (a whole number greater than 0)

...
  filtering criteria for which rows to simulate, passed to filter. This is useful for reproducing just a few selected rows of a simulation without needing to redo the entire simulation; see vignette("Reproducing simulations").

.sim_name
  name of the list-column to be created, containing simulation results. Default is "sim".

.quiet
  Should simulation errors be broadcast to the user as they occur?

.warn_on_error
  Should there be a warning when simulation errors occur? See vignette("Managing simulation errors").

.stop_on_error
  Should the simulation stop immediately when simulation errors occur?

.debug
  Run simulation in debug mode, allowing objects, etc. to be explored for each generated variable specification.

.progress
  A logical, for whether or not to print a progress bar for multiprocess, multisession, and multicore plans.

.options
  The future-specific options to use with the workers when using futures. This must be the result from a call to furrr_options(seed = TRUE).
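The filtering criteria passed through ... can be sketched as follows. This is a minimal illustration, assuming that the .sim_id column shown in the example output below is available to the filter; the specification itself (a single normal variable x) is hypothetical and chosen only to keep the sketch small. Reproducing the exact same draws also depends on seed handling, which is covered in vignette("Reproducing simulations").

```r
library(simpr)

## Hypothetical specification: one normal variable, three sample sizes
small_spec = specify(x = ~ rnorm(n)) %>%
  define(n = c(10, 20, 30))

## Assumption: the filter in ... can refer to .sim_id, so only the row
## with .sim_id == 4 is simulated instead of all 6 rows (3 sizes x 2 reps)
one_row = small_spec %>%
  generate(2, .sim_id == 4)

one_row
```

Only the selected row is generated, which is the point of the filter: the remaining rows are never simulated.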
a simpr_sims object, which is a tibble with a row for each repetition (a total of .reps repetitions) for each combination of metaparameters, plus some extra metadata used by fit. The columns include rep for the repetition number, the names of the metaparameters, and a list-column (named by the argument .sim_name) containing the dataset for each repetition and metaparameter combination. simpr_sims objects can be manipulated elementwise by dplyr and tidyr verbs: the command is applied to each element of the simulation list-column.
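As a sketch of the elementwise behavior described above, a dplyr verb applied to the generate output should act on every dataset in the list-column rather than on the outer tibble. The specification and the new column name x_sq are hypothetical and used only for illustration.

```r
library(simpr)
library(dplyr)

## Hypothetical specification: one normal variable, two sample sizes
sims = specify(x = ~ rnorm(n)) %>%
  define(n = c(5, 10)) %>%
  generate(2)

## mutate() is applied to each simulated dataset in the "sim" list-column,
## so each dataset is expected to gain an x_sq column
squared = sims %>%
  mutate(x_sq = x^2)

## Inspect the first simulated dataset
squared$sim[[1]]
```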
This is the third step in the simulation process: after specifying the population model and defining the metaparameters, if any, generate is the workhorse function that actually generates the simulated datasets, one for each replication and combination of metaparameters. You likely want to use the output of generate to fit model(s) with fit.
Errors you get using this function usually have to do with how you specified the simulation in specify and define.
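A minimal sketch of the error-handling arguments, assuming a deliberately failing specification: the stop() call and the message "n too small" are hypothetical and exist only to trigger an error for one metaparameter value. With .stop_on_error = TRUE the simulation is expected to halt at the first failure, so the call is wrapped in tryCatch here.

```r
library(simpr)

## Hypothetical specification that fails whenever n is below 5
failing_spec = specify(y = ~ if (n < 5) stop("n too small") else rnorm(n)) %>%
  define(n = c(2, 10))

## With .stop_on_error = TRUE, generate() should stop at the first error;
## tryCatch() captures the error message instead of aborting the script
outcome = tryCatch(
  generate(failing_spec, 1, .stop_on_error = TRUE),
  error = function(e) conditionMessage(e)
)

outcome
```

With the defaults (.warn_on_error = TRUE, .stop_on_error = FALSE) the simulation would instead continue past the failing row; see vignette("Managing simulation errors") for how errors are reported in that case.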
See specify and define for examples of how these functions affect the output of generate. See vignette("Optimization") and the furrr website for more information on working with futures: https://furrr.futureverse.org/
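Working with futures might look like the sketch below. It assumes the future and furrr packages are installed and that two local workers are acceptable; the specification is hypothetical. The plan is set before calling generate and reset afterwards.

```r
library(simpr)

## Assumption: two background R sessions are available on this machine
future::plan(future::multisession, workers = 2)

## Hypothetical specification: one normal variable, two sample sizes
parallel_sims = specify(x = ~ rnorm(n)) %>%
  define(n = c(10, 20)) %>%
  generate(2, .options = furrr::furrr_options(seed = TRUE))

## Return to sequential processing once the simulation is done
future::plan(future::sequential)

parallel_sims
```

Passing furrr_options(seed = TRUE) matches the default for .options, so it is written out here only to show where worker options go.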
library(simpr)

meta_list_out = specify(a = ~ MASS::mvrnorm(n, rep(0, 2), Sigma = S)) %>%
  define(n = c(10, 20, 30),
         S = list(independent = diag(2), correlated = diag(2) + 2)) %>%
  generate(1)
## View overall structure of the result and a single simulation output
meta_list_out
#> full tibble
#> --------------------------
#> # A tibble: 6 × 6
#> .sim_id n S_index rep S sim
#> <int> <dbl> <chr> <int> <list> <list>
#> 1 1 10 independent 1 <dbl [2 × 2]> <tibble [10 × 2]>
#> 2 2 20 independent 1 <dbl [2 × 2]> <tibble [20 × 2]>
#> 3 3 30 independent 1 <dbl [2 × 2]> <tibble [30 × 2]>
#> 4 4 10 correlated 1 <dbl [2 × 2]> <tibble [10 × 2]>
#> 5 5 20 correlated 1 <dbl [2 × 2]> <tibble [20 × 2]>
#> 6 6 30 correlated 1 <dbl [2 × 2]> <tibble [30 × 2]>
#>
#> sim[[1]]
#> --------------------------
#> # A tibble: 10 × 2
#> a_1 a_2
#> <dbl> <dbl>
#> 1 -0.664 0.526
#> 2 0.190 -1.07
#> 3 -1.03 -1.29
#> 4 -0.827 -0.653
#> 5 0.144 -0.322
#> 6 1.02 0.0764
#> 7 -0.936 -2.38
#> 8 -0.444 0.279
#> 9 -1.54 0.00321
#> 10 0.0709 -0.289
#>
## Changing .reps will change the number of replications and thus the number of
## rows in the output
meta_list_2 = specify(a = ~ MASS::mvrnorm(n, rep(0, 2), Sigma = S)) %>%
  define(n = c(10, 20, 30),
         S = list(independent = diag(2), correlated = diag(2) + 2)) %>%
  generate(2)
meta_list_2
#> full tibble
#> --------------------------
#> # A tibble: 12 × 6
#> .sim_id n S_index rep S sim
#> <int> <dbl> <chr> <int> <list> <list>
#> 1 1 10 independent 1 <dbl [2 × 2]> <tibble [10 × 2]>
#> 2 2 20 independent 1 <dbl [2 × 2]> <tibble [20 × 2]>
#> 3 3 30 independent 1 <dbl [2 × 2]> <tibble [30 × 2]>
#> 4 4 10 correlated 1 <dbl [2 × 2]> <tibble [10 × 2]>
#> 5 5 20 correlated 1 <dbl [2 × 2]> <tibble [20 × 2]>
#> 6 6 30 correlated 1 <dbl [2 × 2]> <tibble [30 × 2]>
#> 7 7 10 independent 2 <dbl [2 × 2]> <tibble [10 × 2]>
#> 8 8 20 independent 2 <dbl [2 × 2]> <tibble [20 × 2]>
#> 9 9 30 independent 2 <dbl [2 × 2]> <tibble [30 × 2]>
#> 10 10 10 correlated 2 <dbl [2 × 2]> <tibble [10 × 2]>
#> 11 11 20 correlated 2 <dbl [2 × 2]> <tibble [20 × 2]>
#> 12 12 30 correlated 2 <dbl [2 × 2]> <tibble [30 × 2]>
#>
#> sim[[1]]
#> --------------------------
#> # A tibble: 10 × 2
#> a_1 a_2
#> <dbl> <dbl>
#> 1 0.698 -1.78
#> 2 0.609 0.406
#> 3 -1.30 1.69
#> 4 0.198 -0.712
#> 5 1.05 1.48
#> 6 -0.0732 0.909
#> 7 -0.665 -0.0117
#> 8 1.48 -0.632
#> 9 0.157 -0.0286
#> 10 0.701 1.94
#>
## Fitting and tidying functions can be included in this step by calling them
## in the pipeline before generate. This can save computation time when doing
## large simulations, especially with parallel processing
meta_list_generate_after = specify(a = ~ MASS::mvrnorm(n, rep(0, 2), Sigma = S)) %>%
  define(n = c(10, 20, 30),
         S = list(independent = diag(2), correlated = diag(2) + 2)) %>%
  fit(lm = ~ lm(a_2 ~ a_1, data = .)) %>%
  tidy_fits() %>%
  generate(1)
meta_list_generate_after
#> # A tibble: 12 × 10
#> .sim_id n S_index rep Source term estim…¹ std.e…² stati…³ p.value
#> <int> <dbl> <chr> <int> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 1 10 independent 1 lm (Inte… 0.989 0.274 3.61 6.86e-3
#> 2 1 10 independent 1 lm a_1 0.358 0.240 1.49 1.74e-1
#> 3 2 20 independent 1 lm (Inte… 0.173 0.163 1.06 3.04e-1
#> 4 2 20 independent 1 lm a_1 -0.472 0.193 -2.44 2.51e-2
#> 5 3 30 independent 1 lm (Inte… -0.138 0.155 -0.885 3.84e-1
#> 6 3 30 independent 1 lm a_1 -0.0222 0.161 -0.138 8.91e-1
#> 7 4 10 correlated 1 lm (Inte… -0.431 0.564 -0.765 4.67e-1
#> 8 4 10 correlated 1 lm a_1 0.489 0.410 1.19 2.67e-1
#> 9 5 20 correlated 1 lm (Inte… -0.129 0.395 -0.327 7.48e-1
#> 10 5 20 correlated 1 lm a_1 0.879 0.228 3.87 1.13e-3
#> 11 6 30 correlated 1 lm (Inte… -0.0868 0.274 -0.316 7.54e-1
#> 12 6 30 correlated 1 lm a_1 0.718 0.154 4.68 6.72e-5
#> # … with abbreviated variable names ¹estimate, ²std.error, ³statistic