Work directly with simulation results using dplyr and tidyr

Applies dplyr and tidyr verbs to every simulation result, using the same syntax you would use on a single simulation result.

per_sim(obj)

Arguments

obj: A simpr_tibble or simpr_spec object.

Value

A simpr_sims object for use with dplyr and tidyr verbs.

Details

After producing simulation results (a simpr_tibble object), you sometimes need to transform the data before analysis. You can always do this inside specify with custom functions, but per_sim also lets you do it as a step in your pipeline. After calling per_sim, any dplyr or tidyr verb you would use on a single simulation result is applied to every simulation result.

To return to the default behavior after calling per_sim, in which a simpr_tibble is treated as a tibble with a list-column of simulation results, call whole_tibble.
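
The round trip between the two modes can be sketched as follows. This is a minimal illustration rather than output from an actual run: the column name total and the object names sims, with_total, and first_only are hypothetical, and it assumes that dplyr's mutate and filter dispatch on these objects the way the description above suggests.

```r
library(simpr)
library(dplyr)

# Two simulated data sets, each with columns a and b
sims <- specify(a = ~ runif(5, min = 0, max = 1),
                b = ~ runif(5, min = 0, max = 2)) %>%
  generate(2)

# After per_sim(), mutate() is applied inside every simulation result
# ('total' is a hypothetical column added for illustration)
with_total <- sims %>%
  per_sim() %>%
  mutate(total = a + b)

# After whole_tibble(), verbs act on the outer tibble again
# (assumed here: one row per simulation, so filter() selects simulations)
first_only <- with_total %>%
  whole_tibble() %>%
  filter(.sim_id == 1)
```

Printing with_total and first_only shows which level each verb acted on: the first changes the columns inside sim, the second changes which rows of the outer tibble remain.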

Examples

## Load simpr, plus tidyr for pivot_longer() and everything()
library(simpr)
library(tidyr)

## Often most convenient to specify simulations for 'wide' data
data_wide = specify(a = ~ runif(5, min = 0, max = 1),
                    b = ~ runif(5, min = 0, max = 2)) %>%
  generate(2)

data_wide
#> full tibble
#> --------------------------
#> # A tibble: 2 × 3
#>   .sim_id   rep sim             
#>     <int> <int> <list>          
#> 1       1     1 <tibble [5 × 2]>
#> 2       2     2 <tibble [5 × 2]>
#> 
#> sim[[1]]
#> --------------------------
#> # A tibble: 5 × 2
#>       a     b
#>   <dbl> <dbl>
#> 1 0.734 1.98 
#> 2 0.118 0.149
#> 3 0.283 0.929
#> 4 0.594 0.681
#> 5 0.166 0.680
#> 

## Any dplyr or tidyr verbs can be applied after per_sim()
data_long = data_wide %>%
  per_sim() %>%
  pivot_longer(everything(), names_to = "name",
               values_to = "value")
data_long
#> full tibble
#> --------------------------
#> # A tibble: 2 × 3
#>   .sim_id   rep sim              
#>     <int> <int> <list>           
#> 1       1     1 <tibble [10 × 2]>
#> 2       2     2 <tibble [10 × 2]>
#> 
#> sim[[1]]
#> --------------------------
#> # A tibble: 10 × 2
#>    name  value
#>    <chr> <dbl>
#>  1 a     0.734
#>  2 b     1.98 
#>  3 a     0.118
#>  4 b     0.149
#>  5 a     0.283
#>  6 b     0.929
#>  7 a     0.594
#>  8 b     0.681
#>  9 a     0.166
#> 10 b     0.680
#> 

## Now, ready for analysis
data_long %>%
  fit(lm = ~lm(value ~ name)) %>%
  tidy_fits()
#> # A tibble: 4 × 8
#>   .sim_id   rep Source term        estimate std.error statistic p.value
#>     <int> <int> <chr>  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
#> 1       1     1 lm     (Intercept)    0.379     0.231     1.64   0.139 
#> 2       1     1 lm     nameb          0.506     0.326     1.55   0.160 
#> 3       2     2 lm     (Intercept)    0.664     0.239     2.78   0.0240
#> 4       2     2 lm     nameb          0.323     0.338     0.955  0.367