Title: | Run a Function Iteratively While Varying Parameters |
---|---|
Description: | Run simulations or other functions while easily varying parameters from one iteration to the next. Some common use cases would be grid search for machine learning algorithms, running sets of simulations (e.g., estimating statistical power for complex models), or bootstrapping under various conditions. See the 'paramtest' documentation for more information and examples. |
Authors: | Jeffrey Hughes [aut, cre] |
Maintainer: | Jeffrey Hughes <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1 |
Built: | 2025-03-24 19:55:27 UTC |
Source: | https://github.com/jeff-hughes/paramtest |
gen_data will generate sample data based on a factor structure and effects structure specified by the user.
gen_data(factor_struct, effects_struct, n_cases = 1000, true_scores = FALSE)
factor_struct |
A matrix describing the measurement model of latent factors (columns) as measured by observed variables (rows). |
effects_struct |
A matrix describing the variances and covariances of the latent variables in the model. |
n_cases |
Number of sample cases to generate. |
true_scores |
Whether or not to include the data for each variable as measured without error. If set to TRUE, the resulting data frame will include all the variables in the model twice: once with measurement error, and once without. |
Returns a data frame with n_cases rows and columns for each observed and latent variable. These variables will approximately accord with the factor structure and effects structure that was specified, within sampling error.
# two uncorrelated predictors, one criterion, with measurement error in all
# variables
beta1 <- .5
beta2 <- .6
y_resid_var <- sqrt(1 - (beta1^2 + beta2^2))
fmodel <- matrix(
    c(.8, 0, 0,   # x1
      0, .6, 0,   # x2
      0, 0, .5),  # y
    nrow=3, ncol=3, byrow=TRUE, dimnames=list(
    c('x1', 'x2', 'y'), c('x1', 'x2', 'y')))
# in this case, observed and latent variables are the same
effects <- matrix(
    c(1, 0, beta1,
      0, 1, beta2,
      0, 0, y_resid_var),
    nrow=3, ncol=3, byrow=TRUE, dimnames=list(
    c('x1', 'x2', 'y'), c('x1', 'x2', 'y')))
sample_data <- gen_data(fmodel, effects, n_cases=1000)
round(var(sample_data), 2)
round(cor(sample_data), 2)
summary(lm(y ~ x1 + x2, data=sample_data))
    # note that beta coefficients are much smaller, due to measurement error
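To see why measurement error weakens observed relationships, it can help to build a single noisy indicator by hand. The sketch below is not how gen_data is implemented. It assumes a standardized latent score, a loading of .8 (as for x1 in the example), and independent normal error scaled so that the observed variable keeps a variance of about 1. The names latent, loading, and observed are illustrative only.

```r
set.seed(42)
n <- 10000
loading <- 0.8                       # assumed loading, as for x1 in the example

latent <- rnorm(n)                   # true score with variance of about 1
error <- rnorm(n)                    # independent measurement error

# Observed score: part true score, part error, scaled to keep unit variance
observed <- loading * latent + sqrt(1 - loading^2) * error

round(var(observed), 2)              # close to 1
round(cor(latent, observed), 2)      # close to the loading
```

Because the observed score only correlates with its latent score at roughly the loading, any regression on observed scores is attenuated relative to the latent effects.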
grid_search runs a user-defined function iteratively. Parameter values can be given to grid_search, which will fully cross all parameters so that each parameter value is tested at all other values of all parameters.
grid_search( func, params = NULL, n.iter = 1, output = c("list", "data.frame"), boot = FALSE, bootParams = NULL, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, beep = NULL, ... )
func |
A user-defined function. The first argument to this function will be the iteration number. |
params |
A list of parameters to be passed to |
n.iter |
Number of iterations (per set of params). |
output |
Specifies how |
boot |
Whether or not to use bootstrapped data to pass along to
|
bootParams |
If |
parallel |
The type of parallel operation to be used (if any). |
ncpus |
Integer: the number of processes to be used in parallel operation. |
cl |
An optional |
beep |
Include a numeric value or character vector indicating the sound you wish to play once the tests are done running. Requires the 'beepr' package, and information about supported values is available in the documentation for that package. |
... |
Additional arguments to be passed to |
Returns a list (by default) with one element per iteration. If output is specified as "data.frame", then func must return a (named) vector with the results you wish to capture.
lm_test <- function(iter, N, b0, b1) {
    x <- rnorm(N, 0, 1)
    y <- rnorm(N, b0 + b1*x, sqrt(1 - b1^2))
    data <- data.frame(y, x)
    model <- lm(y ~ x, data)
    # capture output from model summary
    est <- coef(summary(model))['x', 'Estimate']
    se <- coef(summary(model))['x', 'Std. Error']
    p <- coef(summary(model))['x', 'Pr(>|t|)']
    return(c(xm=mean(x), xsd=sd(x), ym=mean(y), ysd=sd(y), est=est, se=se, p=p,
        sig=est > 0 & p <= .05))
}
# test power for sample size N=200 and N=300, with 500 iterations for each
power_sim <- grid_search(lm_test, params=list(N=c(200, 300)), n.iter=500,
    b0=0, b1=.15)
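Fully crossing parameters means that every value of one parameter is paired with every value of the others. Base R's expand.grid shows the idea. This is only an illustration of the crossing, not necessarily how grid_search builds its grid internally, and the second parameter b1 is added here purely for the example.

```r
# Two values of N crossed with two values of b1 gives four parameter sets
param_grid <- expand.grid(N = c(200, 300), b1 = c(0.15, 0.30))
param_grid
nrow(param_grid)   # 4
```

Since n.iter is counted per set of params, a grid like this with n.iter = 500 would imply 4 x 500 = 2000 calls to func.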
lm_error_var will calculate the required error variance for a linear model, given specified model coefficients, to create variance for your dependent variable of approximately 'var'.
lm_error_var(var = 1, ...)
var |
The variance you wish your dependent variable to be. |
... |
Pass along all model coefficients, excluding the intercept. These can be named or unnamed. |
Note: This function assumes that all predictors are independent (i.e., uncorrelated).
Returns the required error variance so that the variance of your dependent variable is approximately 'var'.
lm_error_var(var=1, .15, .3) # returns error variance of 0.8875
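The value 0.8875 can be reproduced by hand if one additionally assumes that each predictor has unit variance (an assumption of this sketch; the documentation only states that predictors are independent). Under that assumption the variance of the dependent variable is the sum of the squared coefficients plus the error variance, so the error variance is var minus that sum. The helper error_var_sketch is hypothetical and is not the package function.

```r
# Hypothetical re-derivation; assumes independent predictors with variance 1
error_var_sketch <- function(var = 1, ...) {
  coefs <- c(...)
  var - sum(coefs^2)
}

err <- error_var_sketch(var = 1, .15, .3)
err   # 1 - (0.0225 + 0.09) = 0.8875

# Simulation check: the variance of y should come out near 1
set.seed(1)
n <- 100000
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- .15 * x1 + .3 * x2 + rnorm(n, 0, sqrt(err))
round(var(y), 2)
```

Note that rnorm takes a standard deviation, so the error variance is passed through sqrt before simulating.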
n.iter extracts information about the number of iterations (per specific test) performed by a parameter test.
n.iter(test, ...)
## S3 method for class 'paramtest'
n.iter(test, ...)
test |
An object of type 'paramtest'. |
... |
Not currently implemented; used to ensure consistency with S3 generic. |
Returns the number of iterations done in each test.
n.iter(paramtest): Number of iterations for a parameter test.
print.paramtest_summary prints a summary of the various combinations of parameter values tested in a given parameter test.
## S3 method for class 'paramtest_summary'
print(x, ...)
x |
An object of class 'paramtest_summary', from
|
... |
Not currently implemented; used to ensure consistency with S3 generic. |
Returns a data frame with one row per set of unique tests.
random_search runs a user-defined function iteratively. Lower and upper bounds for parameter values can be given to random_search, which will then (uniformly) randomly select values within those bounds on each iteration.
random_search( func, params = NULL, n.sample = 1, n.iter = 1, output = c("list", "data.frame"), boot = FALSE, bootParams = NULL, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, beep = NULL, ... )
func |
A user-defined function. The first argument to this function will be the iteration number. |
params |
A named list of parameters to be passed to |
n.sample |
Number of times to sample from the parameter values. |
n.iter |
Number of iterations (per set of params). |
output |
Specifies how |
boot |
Whether or not to use bootstrapped data to pass along to
|
bootParams |
If |
parallel |
The type of parallel operation to be used (if any). |
ncpus |
Integer: the number of processes to be used in parallel operation. |
cl |
An optional |
beep |
Include a numeric value or character vector indicating the sound you wish to play once the tests are done running. Requires the 'beepr' package, and information about supported values is available in the documentation for that package. |
... |
Additional arguments to be passed to |
Returns a list (by default) with one element per iteration. If output is specified as "data.frame", then func must return a (named) vector with the results you wish to capture.
lm_test <- function(iter, N, b0, b1) {
    x <- rnorm(N, 0, 1)
    y <- rnorm(N, b0 + b1*x, sqrt(1 - b1^2))
    data <- data.frame(y, x)
    model <- lm(y ~ x, data)
    # capture output from model summary
    est <- coef(summary(model))['x', 'Estimate']
    se <- coef(summary(model))['x', 'Std. Error']
    p <- coef(summary(model))['x', 'Pr(>|t|)']
    return(c(xm=mean(x), xsd=sd(x), ym=mean(y), ysd=sd(y), est=est, se=se, p=p,
        sig=est > 0 & p <= .05))
}
# test power for sample sizes between N=200 and N=300, with 500 iterations total
power_sim <- random_search(lm_test, params=list(N=c(200, 300)), n.iter=500,
    b0=0, b1=.15)
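The sampling step can be pictured with base R's runif. The sketch below draws n.sample values uniformly between a lower and an upper bound for each parameter. It illustrates the idea only; details such as whether random_search rounds values like N to whole numbers are not covered here. The names bounds and sampled, and the second parameter b1, are illustrative.

```r
set.seed(123)

# Lower and upper bound for each parameter (b1 is added for illustration)
bounds <- list(N = c(200, 300), b1 = c(0.1, 0.3))
n.sample <- 5

# One uniform draw per parameter for each of the n.sample parameter sets
sampled <- as.data.frame(
  lapply(bounds, function(b) runif(n.sample, min = b[1], max = b[2]))
)
sampled
```

Each row of sampled is one randomly chosen parameter set, in contrast to the fully crossed grid used by grid_search.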
results returns the raw data from a parameter test.
results(test, ...)
## S3 method for class 'paramtest'
results(test, ...)
test |
An object of type 'paramtest'. |
... |
Not currently implemented; used to ensure consistency with S3 generic. |
Returns a data frame with all the data returned from each test.
results(paramtest): Results for a parameter test.
run_test runs a user-defined function iteratively. It is intentionally kept general and flexible, to allow for a wide variety of applications. It is the general-purpose function called by functions such as grid_search and random_search, which provide different methods for generating the parameters to be tested.
run_test( func, params = NULL, n.iter = 1, output = c("list", "data.frame"), boot = FALSE, bootParams = NULL, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, beep = NULL, ... )
func |
A user-defined function. The first argument to this function will be the iteration number. |
params |
A list or data frame of parameters to be passed to |
n.iter |
Number of iterations (per set of params). |
output |
Specifies how |
boot |
Whether or not to use bootstrapped data to pass along to
|
bootParams |
If |
parallel |
The type of parallel operation to be used (if any). |
ncpus |
Integer: the number of processes to be used in parallel operation. |
cl |
An optional |
beep |
Include a numeric value or character vector indicating the sound you wish to play once the tests are done running. If set to TRUE, a random sound will be played. Requires the 'beepr' package, and information about supported values is available in the documentation for that package. |
... |
Additional arguments to be passed to |
Returns a list (by default) with one element per iteration. If output is specified as "data.frame", then func must return a (named) vector with the results you wish to capture.
lm_test <- function(iter, N, b0, b1) {
    x <- rnorm(N, 0, 1)
    y <- rnorm(N, b0 + b1*x, sqrt(1 - b1^2))
    data <- data.frame(y, x)
    model <- lm(y ~ x, data)
    # capture output from model summary
    est <- coef(summary(model))['x', 'Estimate']
    se <- coef(summary(model))['x', 'Std. Error']
    p <- coef(summary(model))['x', 'Pr(>|t|)']
    return(c(xm=mean(x), xsd=sd(x), ym=mean(y), ysd=sd(y), est=est, se=se, p=p,
        sig=est > 0 & p <= .05))
}
# test power for sample size N=200 and N=300, with 500 iterations for each
power_sim <- run_test(lm_test, params=data.frame(N=c(200, 300)), n.iter=500,
    b0=0, b1=.15)
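Conceptually, a run like this amounts to calling func once per iteration for every row of params and stacking the results. The loop below is a simplified base-R sketch of that idea, not the package's implementation: it ignores bootstrapping, parallel execution, and timing. The helper run_sketch and the toy function mean_test are hypothetical.

```r
# Toy function: the first argument is the iteration number
mean_test <- function(iter, N, mu) {
  x <- rnorm(N, mu, 1)
  c(m = mean(x), sig = t.test(x)$p.value <= .05)
}

# Hypothetical helper: call func n.iter times for each row of params
run_sketch <- function(func, params, n.iter = 1, ...) {
  out <- list()
  for (i in seq_len(nrow(params))) {
    for (iter in seq_len(n.iter)) {
      # iteration number first, then this row's parameters, then fixed extras
      args <- c(list(iter), as.list(params[i, , drop = FALSE]), list(...))
      res <- do.call(func, args)
      row <- c(list(iter = iter), as.list(params[i, , drop = FALSE]), as.list(res))
      out[[length(out) + 1]] <- as.data.frame(row)
    }
  }
  do.call(rbind, out)
}

set.seed(2024)
sim <- run_sketch(mean_test, params = data.frame(N = c(20, 50)),
                  n.iter = 100, mu = 0.3)

# Proportion of significant results (estimated power) for each N
aggregate(sig ~ N, data = sim, FUN = mean)
```

Averaging the sig column within each parameter set is the usual way to turn such output into a power estimate.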
summary.paramtest provides a summary of the various combinations of parameter values tested in a given parameter test.
## S3 method for class 'paramtest'
summary(object, ...)
object |
An object of class 'paramtest'. |
... |
Not currently implemented; used to ensure consistency with S3 generic. |
Returns a data frame with one row per set of unique tests.
lm_test <- function(iter, N, b0, b1) {
    x <- rnorm(N, 0, 1)
    y <- rnorm(N, b0 + b1*x, sqrt(1 - b1^2))
    data <- data.frame(y, x)
    model <- lm(y ~ x, data)
    # capture output from model summary
    est <- coef(summary(model))['x', 'Estimate']
    se <- coef(summary(model))['x', 'Std. Error']
    p <- coef(summary(model))['x', 'Pr(>|t|)']
    return(c(xm=mean(x), xsd=sd(x), ym=mean(y), ysd=sd(y), est=est, se=se, p=p,
        sig=est > 0 & p <= .05))
}
# test power for sample sizes between N=200 and N=300, with 500 iterations total
power_sim <- random_search(lm_test, params=list(N=c(200, 300)), n.iter=500,
    b0=0, b1=.15)
summary(power_sim)
tests extracts information about the set of specific tests (parameter values) for a parameter test.
tests(test, ...)
## S3 method for class 'paramtest'
tests(test, ...)
test |
An object of type 'paramtest'. |
... |
Not currently implemented; used to ensure consistency with S3 generic. |
Returns a data frame with one row for each set of tests that was performed.
tests(paramtest): Parameter values for a parameter test.
timing returns information about how long a parameter test took.
timing(test, ...)
## S3 method for class 'paramtest'
timing(test, ...)
test |
An object of type 'paramtest'. |
... |
Not currently implemented; used to ensure consistency with S3 generic. |
Returns an object of class "proc_time" with information about how long the parameter test process took.
timing(paramtest): Timing information for a parameter test.
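Objects of class "proc_time" are also what base R's system.time returns, so they can be handled in the same way. The snippet below uses system.time directly as a stand-in, so it does not assume any particular paramtest object; tm and elapsed_secs are illustrative names.

```r
# system.time() returns a "proc_time" object
tm <- system.time(for (i in 1:1000) sqrt(i))
class(tm)                         # "proc_time"

# Extract wall-clock seconds as a plain number
elapsed_secs <- tm[["elapsed"]]
elapsed_secs
```

The same indexing by "elapsed", "user.self", or "sys.self" should apply to any object of this class.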