matreg {metafor} | R Documentation |
Function to fit regression models based on correlation and covariance matrices.
matreg(y, x, R, n, V, cov=FALSE, means, ztor=FALSE, nearpd=FALSE, level=95, digits, ...)
y |
index (or name given as a character string) of the outcome variable. |
x |
indices (or names given as a character vector) of the predictor variables. |
R |
correlation or covariance matrix (or only the lower triangular part including the diagonal). |
n |
sample size based on which the elements in the correlation/covariance matrix were computed. |
V |
variance-covariance matrix of the lower triangular elements of the correlation/covariance matrix. Either |
cov |
logical to specify whether |
means |
optional vector to specify the means of the variables (only relevant when |
ztor |
logical to specify whether |
nearpd |
logical to specify whether the |
level |
numeric value between 0 and 100 to specify the confidence interval level (the default is 95). |
digits |
optional integer to specify the number of decimal places to which the printed results should be rounded. |
... |
other arguments. |
Let \(R\) be a \(p \times p\) correlation or covariance matrix. Let \(y\) denote the row/column of the outcome variable and \(x\) the row(s)/column(s) of the predictor variable(s) in this matrix. Let \(R_{x,x}\) and \(R_{x,y}\) denote the corresponding submatrices of \(R\). Then \[b = R_{x,x}^{-1} R_{x,y}\] yields the standardized or raw regression coefficients (depending on whether \(R\) is a correlation or covariance matrix, respectively) when regressing the outcome variable on the predictor variable(s).
The \(R\) matrix may be computed based on a single sample of \(n\) subjects. In this case, one should specify the sample size via argument n
. The variance-covariance matrix of the standardized regression coefficients is then given by \(\mbox{Var}[b] = \mbox{MSE} \times R_{x,x}^{-1}\), where \(\mbox{MSE} = (1 - b'R_{x,y}) / (n - m)\) and \(m\) denotes the number of predictor variables. The standard errors are then given by the square root of the diagonal elements of \(\mbox{Var}[b]\). Test statistics (in this case, t-statistics) and the corresponding p-values can then be computed as in a regular regression analysis. When \(R\) is a covariance matrix, one should set cov=TRUE
and specify the means of the \(p\) variables via argument means
to obtain raw regression coefficients including the intercept and corresponding standard errors.
Alternatively, \(R\) may be the result of a meta-analysis of correlation coefficients. In this case, the elements in \(R\) are pooled correlation coefficients and the variance-covariance matrix of these pooled coefficients should be specified via argument V
. The order of elements in V
should correspond to the order of elements in the lower triangular part of \(R\) column-wise. For example, if \(R\) is a \(4 \times 4\) matrix of the form: \[\begin{bmatrix} 1 & & & \\ r_{21} & 1 & & \\ r_{31} & r_{32} & 1 & \\ r_{41} & r_{42} & r_{43} & 1 \end{bmatrix}\] then the elements are \(r_{21}\), \(r_{31}\), \(r_{41}\), \(r_{32}\), \(r_{42}\), and \(r_{43}\) and hence V
should be a \(6 \times 6\) variance-covariance matrix of these elements in this order. The variance-covariance matrix of the standardized regression coefficients (i.e., \(\mbox{Var}[b]\)) is then computed as a function of V
as described in Becker (1992) using the multivariate delta method. The standard errors are then again given by the square root of the diagonal elements of \(\mbox{Var}[b]\). Test statistics (in this case, z-statistics) and the corresponding p-values can then be computed in the usual manner.
In case \(R\) is the result of a meta-analysis of Fisher r-to-z transformed correlation coefficients (and hence V
is then the corresponding variance-covariance matrix of these pooled transformed coefficients), one should set argument ztor=TRUE
, so that the appropriate back-transformation is then applied to R
(and V
) within the function.
Finally, \(R\) may be a covariance matrix based on a meta-analysis (e.g., the estimated variance-covariance matrix of the random effects in a multivariate model). In this case, one should set cov=TRUE
and V
should again be the variance-covariance matrix of the elements in \(R\), but now including the diagonal. Hence, if \(R\) is a \(4 \times 4\) matrix of the form: \[\begin{bmatrix} \tau_1^2 & & & \\ \tau_{21} & \tau_2^2 & & \\ \tau_{31} & \tau_{32} & \tau_3^2 & \\ \tau_{41} & \tau_{42} & \tau_{43} & \tau_4^2 \end{bmatrix}\] then the elements are \(\tau^2_1\), \(\tau_{21}\), \(\tau_{31}\), \(\tau_{41}\), \(\tau^2_2\), \(\tau_{32}\), \(\tau_{42}\), \(\tau^2_3\), \(\tau_{43}\), and \(\tau^2_4\), and hence V
should be a \(10 \times 10\) variance-covariance matrix of these elements in this order. Argument means
can then again be used to specify the means of the variables.
An object of class "matreg"
. The object is a list containing the following components:
tab |
a data frame with the estimated model coefficients, standard errors, test statistics, degrees of freedom (only for t-tests), p-values, and lower/upper confidence interval bounds. |
vb |
the variance-covariance matrix of the estimated model coefficients. |
... |
some additional elements/values. |
The results are formatted and printed with the print
function.
Only the lower triangular part of R
(and V
if it is specified) is used in the computations.
If \(R_{x,x}\) is not invertible, an error will be issued. In this case, one can set argument nearpd=TRUE
, in which case the nearPD
function from the Matrix package will be used to find the nearest positive semi-definite matrix, which should be invertible. The results should be treated with caution when this is done.
When \(R\) is a covariance matrix with V
and means
specified, the means are treated as known constants when estimating the standard error of the intercept.
Wolfgang Viechtbauer wvb@metafor-project.org https://www.metafor-project.org
Becker, B. J. (1992). Using results from replicated studies to estimate linear models. Journal of Educational Statistics, 17(4), 341–362. https://doi.org/10.3102/10769986017004341
Becker, B. J. (1995). Corrections to "Using results from replicated studies to estimate linear models". Journal of Educational and Behavioral Statistics, 20(1), 100–102. https://doi.org/10.3102/10769986020001100
Becker, B. J., & Aloe, A. (2019). Model-based meta-analysis and related approaches. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (3rd ed., pp. 339–363). New York: Russell Sage Foundation.
rma.mv
for a function to meta-analyze multiple correlation coefficients that can be used to construct an \(R\) matrix.
rcalc
for a function to construct the variance-covariance matrix of dependent correlation coefficients.
### copy data into 'dat' dat <- dat.craft2003 ### construct dataset and var-cov matrix of the correlations tmp <- rcalc(ri ~ var1 + var2 | study, ni=ni, data=dat) V <- tmp$V dat <- tmp$dat ### turn var1.var2 into a factor with the desired order of levels dat$var1.var2 <- factor(dat$var1.var2, levels=c("acog.perf", "asom.perf", "conf.perf", "acog.asom", "acog.conf", "asom.conf")) ### multivariate random-effects model res <- rma.mv(yi, V, mods = ~ var1.var2 - 1, random = ~ var1.var2 | study, struct="UN", data=dat) res ### restructure estimated mean correlations into a 4x4 matrix R <- vec2mat(coef(res)) rownames(R) <- colnames(R) <- c("perf", "acog", "asom", "conf") round(R, digits=3) ### check that order in vcov(res) corresponds to order in R round(vcov(res), digits=4) ### fit regression model with 'perf' as outcome and 'acog', 'asom', and 'conf' as predictors matreg(1, 2:4, R=R, V=vcov(res)) ### can also specify variable names matreg("perf", c("acog","asom","conf"), R=R, V=vcov(res)) ## Not run: ### repeat the above but with r-to-z transformed correlations dat <- dat.craft2003 tmp <- rcalc(ri ~ var1 + var2 | study, ni=ni, data=dat, rtoz=TRUE) V <- tmp$V dat <- tmp$dat dat$var1.var2 <- factor(dat$var1.var2, levels=c("acog.perf", "asom.perf", "conf.perf", "acog.asom", "acog.conf", "asom.conf")) res <- rma.mv(yi, V, mods = ~ var1.var2 - 1, random = ~ var1.var2 | study, struct="UN", data=dat) R <- vec2mat(coef(res)) rownames(R) <- colnames(R) <- c("perf", "acog", "asom", "conf") matreg(1, 2:4, R=R, V=vcov(res), ztor=TRUE) ## End(Not run) ############################################################################ ### a different example based on van Houwelingen et al. (2002) ### create dataset in long format dat.long <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.colditz1994) dat.long <- escalc(measure="PLO", xi=out1, mi=out2, data=dat.long) dat.long$tpos <- dat.long$tneg <- dat.long$cpos <- dat.long$cneg <- NULL levels(dat.long$group) <- c("CON", "EXP") ### fit bivariate model res <- rma.mv(yi, vi, mods = ~ group - 1, random = ~ group | trial, struct="UN", data=dat.long, method="ML") res ### regression of log(odds)_EXP on log(odds)_CON matreg(y=2, x=1, R=res$G, cov=TRUE, means=coef(res), n=res$g.levels.comb.k) ### but the SE of the CON coefficient is not computed correctly, since above we treat res$G as if ### it was a var-cov matrix computed from raw data based on res$g.levels.comb.k (= 13) data points ### fit bivariate model and get the var-cov matrix of the estimates in res$G res <- rma.mv(yi, vi, mods = ~ group - 1, random = ~ group | trial, struct="UN", data=dat.long, method="ML", cvvc="varcov", control=list(nearpd=TRUE)) ### now use res$vvc as the var-cov matrix of the estimates in res$G matreg(y=2, x=1, R=res$G, cov=TRUE, means=coef(res), V=res$vvc)