Title: | Instrumental Variables |
---|---|
Description: | Contains tools for instrumental variables estimation. Currently, non-parametric bounds, two-stage estimation and G-estimation are implemented. Balke, A. and Pearl, J. (1997) <doi:10.2307/2965583>, Vansteelandt S., Bowden J., Babanezhad M., Goetghebeur E. (2011) <doi:10.1214/11-STS360>. |
Authors: | Arvid Sjolander, Elisabeth Dahlqwist, Torben Martinussen |
Maintainer: | Arvid Sjolander <[email protected]> |
License: | LGPL (>= 3) |
Version: | 2.3.0 |
Built: | 2024-11-05 04:27:02 UTC |
Source: | https://github.com/cran/ivtools |
ah
is a wrapper around the ahaz
function in the ahaz
package, with a more user-friendly and standard interface. Refer to the
manual for ahaz
for details.
ah(formula, data, weights, robust=FALSE)
ah(formula, data, weights, robust=FALSE)
formula |
an object of class "formula": a symbolic description of the model to be fitted. |
data |
a data frame containing the variables in the model. |
weights |
an optional vector of prior weights to be used in the fitting process. |
robust |
robust calculation of variance; see manual for |
See manual for ahaz
.
An object of class "ah"
is a list containing the same elements as
an object of class "ahaz"
, plus
call |
the matched call. |
formula |
the formula argument. |
coefficients |
a named vector of estimated coefficients. |
vcov |
the variance-covariance matrix for the estimated coefficients. |
incl |
the |
The ahaz
function does not allow for ties. Thus, before calling
ah
any ties have to be manually broken.
Arvid Sjolander.
Lin D.Y., Ying Z. (1994). Semiparametric analysis of the additive risk model. Biometrika 81(1), 61-71.
require(ahaz) ##This example is adapted from the example given for the ahaz function ##in the ahaz package data(sorlie) # Break ties set.seed(10101) sorlie$time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Fit additive hazards model fit <- ah(formula=Surv(time, status)~X13+X14+X15+X16+X17+X18+X19+X20+X21+X22, data=sorlie) summary(fit)
require(ahaz) ##This example is adapted from the example given for the ahaz function ##in the ahaz package data(sorlie) # Break ties set.seed(10101) sorlie$time <- sorlie$time+runif(nrow(sorlie))*1e-2 # Fit additive hazards model fit <- ah(formula=Surv(time, status)~X13+X14+X15+X16+X17+X18+X19+X20+X21+X22, data=sorlie) summary(fit)
This is a confint
method for class "ivmod"
.
## S3 method for class 'ivmod' confint(object, parm, level=0.95, ...)
## S3 method for class 'ivmod' confint(object, parm, level=0.95, ...)
object |
an object of class |
parm |
not used. |
level |
the coverage probability of the confidence intervals. |
... |
not used. |
Arvid Sjolander.
"ivmod"
objects, fitted with estmethod="g"
.
estfun
computes the estimating function
for a
"ivmod"
object, fitted with estmethod="g"
,
for a range of values of . The
estfun
is not implemented
for "ivah"
objects, since G-estimation in additive hazards models
is based on a recursive estimation technique, and not standard estimating equations.
estfun(object, lower, upper, step)
estfun(object, lower, upper, step)
object |
an object of class |
lower |
an optional vector of lower values for |
upper |
an optional vector of upper values for |
step |
an optional vector of steps between |
estfun
may be useful for visual inspection of the estimating
function, to make sure that a solution to the estimating equation
was found, see ‘Examples’. For the :th element of
, the estimating
function sum is computed for a range of values within (
lower[i]
, upper[i]
), at the
G-estimate of the remaining elements of .
An object of class "estfun"
is a list containing
f |
a named list of matricies; one matrix for each element of |
est |
the G-estimate of |
Arvid Sjolander.
Burgess S, Granell R, Palmer TM, Sterne JA, Didelez V. (2014). Lack of identification in semiparametric instrumental variable models with binary outcomes. American Journal of Epidemiology 180(1), 111-119.
Vansteelandt S., Bowden J., Babanezhad M., Goetghebeur E. (2011). On instrumental variables estimation of causal odds ratios. Statistical Science 26(3), 403-422.
set.seed(9) ##Note: the parameter values in the examples below are chosen to make ##Y0 independent of Z, which is necessary for Z to be a valid instrument. n <- 1000 psi0 <- 0.5 psi1 <- 0.2 ##---Example 1: linear model and interaction between X and L--- L <- rnorm(n) Z <- rnorm(n, mean=L) X <- rnorm(n, mean=Z) m0 <- X-Z+L Y <- rnorm(n, mean=psi0*X+psi1*X*L+m0) data <- data.frame(L, Z, X, Y) #G-estimation fitZ.L <- glm(formula=Z~L, data=data) fitIV <- ivglm(estmethod="g", X="X", Y="Y", fitZ.L=fitZ.L, data=data, formula=~L, link="identity") summary(fitIV) H <- estfun(fitIV) plot(H) ##---Example 2: logistic model and no covariates--- Z <- rbinom(n, 1, 0.5) X <- rbinom(n, 1, 0.7*Z+0.2*(1-Z)) m0 <- plogis(1+0.8*X-0.39*Z) Y <- rbinom(n, 1, plogis(psi0*X+log(m0/(1-m0)))) data <- data.frame(Z, X, Y) #G-estimation fitZ.L <- glm(formula=Z~1, data=data) fitY.LZX <- glm(formula=Y~X+Z+X*Z, family="binomial", data=data) fitIV <- ivglm(estmethod="g", X="X", fitZ.L=fitZ.L, fitY.LZX=fitY.LZX, data=data, link="logit") summary(fitIV) H <- estfun(fitIV) plot(H)
set.seed(9) ##Note: the parameter values in the examples below are chosen to make ##Y0 independent of Z, which is necessary for Z to be a valid instrument. n <- 1000 psi0 <- 0.5 psi1 <- 0.2 ##---Example 1: linear model and interaction between X and L--- L <- rnorm(n) Z <- rnorm(n, mean=L) X <- rnorm(n, mean=Z) m0 <- X-Z+L Y <- rnorm(n, mean=psi0*X+psi1*X*L+m0) data <- data.frame(L, Z, X, Y) #G-estimation fitZ.L <- glm(formula=Z~L, data=data) fitIV <- ivglm(estmethod="g", X="X", Y="Y", fitZ.L=fitZ.L, data=data, formula=~L, link="identity") summary(fitIV) H <- estfun(fitIV) plot(H) ##---Example 2: logistic model and no covariates--- Z <- rbinom(n, 1, 0.5) X <- rbinom(n, 1, 0.7*Z+0.2*(1-Z)) m0 <- plogis(1+0.8*X-0.39*Z) Y <- rbinom(n, 1, plogis(psi0*X+log(m0/(1-m0)))) data <- data.frame(Z, X, Y) #G-estimation fitZ.L <- glm(formula=Z~1, data=data) fitY.LZX <- glm(formula=Y~X+Z+X*Z, family="binomial", data=data) fitIV <- ivglm(estmethod="g", X="X", fitZ.L=fitZ.L, fitY.LZX=fitY.LZX, data=data, link="logit") summary(fitIV) H <- estfun(fitIV) plot(H)
ivah
performs instrumental variable estimation of the causal exposure effect in
AH models with individual-level data. Below, ,
, and
are the instrument, the exposure, and the outcome, respectively.
is a vector of covariates that we wish to control for in the analysis;
these would typically be confounders for the instrument and the outcome.
ivah(estmethod, X, T, fitZ.L=NULL, fitX.LZ=NULL, fitT.LX=NULL, data, ctrl=FALSE, clusterid=NULL, event, max.time, max.time.psi, n.sim=100, vcov.fit=TRUE, ...)
ivah(estmethod, X, T, fitZ.L=NULL, fitX.LZ=NULL, fitT.LX=NULL, data, ctrl=FALSE, clusterid=NULL, event, max.time, max.time.psi, n.sim=100, vcov.fit=TRUE, ...)
estmethod |
a string specifying the desired estimation method; either |
X |
a string specifying the name of the exposure |
T |
a string specifying the name of the follow-up time |
fitZ.L |
an object of class |
fitX.LZ |
an object of class |
fitT.LX |
If |
data |
a data frame containing the variables in the model. The covariates, instrument,
exposure and outcome can have arbitrary names, e.g. they don't need to
be called |
ctrl |
logical. Should the control function |
clusterid |
an optional string containing the name of a cluster identification variable when
data are clustered. Specifying |
event |
a string specifying the name of the status indicator, 0="no event", 1="event".
This argument is not used when |
max.time |
optional follow-up for estimating |
max.time.psi |
optional follow-up for estimating |
n.sim |
optional number of resamplings for testing goodness-of-fit of constant effects model
for G-estimation. Defaults to 100. This argument is not used when |
vcov.fit |
logical. Should the variance-covariance matrix be computed? |
... |
optional arguments passed on to the |
The ivah
estimates different parameters, depending on whether
estmethod="ts"
or estmethod="g"
. If estmethod="ts"
, then
ivah
uses two-stage estimation to estimate the parameter in the causal AH model
Here, is counterfactual hazard function,
had the exposure been set to 0. The vector function
contains interaction terms
between
and
. These are specified
implicitly through the model
fitY
. The model fitX.LZ
is used to
construct predictions . These predictions are
subsequently used to re-fit the model
fitY
, with replaced with
. The obtained coefficient(s) for
is the two-stage
estimator of
.
If estmethod="g"
, then ivah
uses G-estimation to estimate the function
in the causal AH model
It also delivers an estimate of assuming that this function is
constant across time (=
).
ivah
returns an object of class "ivah"
, which inherits from
class "ivmod"
. An object of class "ivah"
is a list containing
call |
the matched call. |
input |
|
est |
a vector containing the estimate of |
vcov |
the variance-covariance matrix for the estimate of |
estfunall |
a matrix of all subject-specific contributions to the estimating functions used in the estimation process.
One row for each subject, one column for each parameter. If |
d.estfun |
the jacobian matrix of |
converged |
logical. Was a solution found to the estimating equations? |
fitY |
the re-fitted model |
stime |
the ordered event times within (0,max.time). This element is NULL when |
B |
the estimate of |
se_B |
the standard error of the estimate of |
pval_0 |
p-value corresponding to supremum test of the null |
eps_B |
the iid-decomposition of |
pval_psi |
p-value corresponding to the null |
pval_GOF_sup |
p-value corresponding to supremum test of the null |
pval_GOF_CvM |
as pval_GOF_sup but now based on the Cramer Von Mises test statistic.
This element is NULL when |
GOF.resamp |
a matrix with first row the ordered jump times in (0,max.time.bet),
second row the observed test process, and the remaining rows are
50 processes sampled under the null.
This element is NULL when |
ivah
allows for weights. However, these are defined implicitly
through the input models. Thus, when models are used as input to ivah
,
these models have to be fitted with the same weights.
Left-truncation and correction of standard errors for clustered data are currently not
implemented when estmethod="g"
.
Arvid Sjolander and Torben Martinussen.
Martinussen T., Vansteelandt S., Tchetgen Tchetgen E.J., Zucker D.M. (2017). Instrumental variables estimation of exposure effects on a time-to-event endpoint using structural cumulative survival models. Epidemiology 73(4): 1140-1149.
Sjolander A., Martinussen T. (2019). Instrumental variable estimation with the R package ivtools. Epidemiologic Methods 8(1), 1-20.
Tchetgen Tchetgen E.J., Walter S., Vansteelandt S., Martinussen T., Glymour M. (2015). Instrumental variable estimation in a survival context. Epidemiology 26(3): 402-410.
require(ahaz) set.seed(9) n <- 1000 psi0 <- 0.2 psi1 <- 0.0 U <- runif(n) L <- runif(n) Z <- rbinom(n, 1, plogis(-0.5+L)) X <- runif(n, min=Z+U, max=2+Z+U) T <- rexp(n, rate=psi0*X+psi1*X*L+0.2*U+0.2*L) C <- 5 #administrative censoring at t=5 d <- as.numeric(T<C) T <- pmin(T, C) data <- data.frame(L, Z, X, T, d) #break ties data$T <- data$T+rnorm(n=nrow(data), sd=0.001) #two-stage estimation fitX.LZ <- glm(formula=X~Z+L, data=data) fitT.LX <- ah(formula=Surv(T, d)~X+L+X*L, data=data) fitIV <- ivah(estmethod="ts", fitX.LZ=fitX.LZ, fitT.LX=fitT.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation fitZ.L <- glm(formula=Z~L, family="binomial", data=data) fitIV <- ivah(estmethod="g", X="X", T="T", fitZ.L=fitZ.L, data=data, event="d", max.time=4, max.time.psi=4, n.sim=100) summary(fitIV) plot(fitIV)
require(ahaz) set.seed(9) n <- 1000 psi0 <- 0.2 psi1 <- 0.0 U <- runif(n) L <- runif(n) Z <- rbinom(n, 1, plogis(-0.5+L)) X <- runif(n, min=Z+U, max=2+Z+U) T <- rexp(n, rate=psi0*X+psi1*X*L+0.2*U+0.2*L) C <- 5 #administrative censoring at t=5 d <- as.numeric(T<C) T <- pmin(T, C) data <- data.frame(L, Z, X, T, d) #break ties data$T <- data$T+rnorm(n=nrow(data), sd=0.001) #two-stage estimation fitX.LZ <- glm(formula=X~Z+L, data=data) fitT.LX <- ah(formula=Surv(T, d)~X+L+X*L, data=data) fitIV <- ivah(estmethod="ts", fitX.LZ=fitX.LZ, fitT.LX=fitT.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation fitZ.L <- glm(formula=Z~L, family="binomial", data=data) fitIV <- ivah(estmethod="g", X="X", T="T", fitZ.L=fitZ.L, data=data, event="d", max.time=4, max.time.psi=4, n.sim=100) summary(fitIV) plot(fitIV)
ivbounds
computes non-parametric bounds for counterfactual outcome probabilities
in instrumental variables scenarios. Let ,
, and
be the outcome, exposure, and instrument, respectively.
and
must be binary,
whereas
can be either binary or ternary.
Ternary instruments are common in, for instance, Mendelian randomization.
Let
be the counterfactual probability of the outcome, had all
subjects been exposed to level
.
ivbounds
computes bounds for the
counterfactuals probabilities and
. Below, we define
.
ivbounds(data, Z, X, Y, monotonicity=FALSE, weights)
ivbounds(data, Z, X, Y, monotonicity=FALSE, weights)
data |
either a data frame containing the variables in the model, or a named vector
|
Z |
a string containing the name of the instrument |
X |
a string containing the name of the exposure |
Y |
a string containing the name of the outcome |
monotonicity |
logical. It is sometimes realistic to make the monotonicity assumption
|
weights |
an optional vector of ‘prior weights’ to be used in the fitting process.
Should be NULL or a numeric vector. Only applicable if |
ivbounds
uses linear programming techniques to bound the counterfactual probabilities
and
. Bounds for a causal effect, defined as a contrast between these,
are obtained by plugging in the bounds for
and
into the
contrast. For instance, bounds for the causal risk difference
are obtained as
.
In addition to the bounds,
ivbounds
evaluates the IV inequality
An object of class "ivbounds"
is a list containing
call |
the matched call. |
p0 |
a named vector with elements |
p1 |
a named vector with elements |
p0.symbolic |
a named vector with elements |
p1.symbolic |
a named vector with elements |
IVinequality |
logical. Does the IV inequality hold? |
conditions |
a character vector containing the violated condiations, if |
Arvid Sjolander.
Balke, A. and Pearl, J. (1997). Bounds on treatment effects from studies with imperfect compliance. Journal of the American Statistical Association 92(439), 1171-1176.
Sjolander A., Martinussen T. (2019). Instrumental variable estimation with the R package ivtools. Epidemiologic Methods 8(1), 1-20.
##Vitamin A example from Balke and Pearl (1997). n000 <- 74 n001 <- 34 n010 <- 0 n011 <- 12 n100 <- 11514 n101 <- 2385 n110 <- 0 n111 <- 9663 n0 <- n000+n010+n100+n110 n1 <- n001+n011+n101+n111 #with data frame... data <- data.frame(Y=c(0,0,0,0,1,1,1,1), X=c(0,0,1,1,0,0,1,1), Z=c(0,1,0,1,0,1,0,1)) n <- c(n000, n001, n010, n011, n100, n101, n110, n111) b <- ivbounds(data=data, Z="Z", X="X", Y="Y", weights=n) summary(b) #...or with vector of probabilities p <- n/rep(c(n0, n1), 4) names(p) <- c("p00.0", "p00.1", "p01.0", "p01.1", "p10.0", "p10.1", "p11.0", "p11.1") b <- ivbounds(data=p) summary(b)
##Vitamin A example from Balke and Pearl (1997). n000 <- 74 n001 <- 34 n010 <- 0 n011 <- 12 n100 <- 11514 n101 <- 2385 n110 <- 0 n111 <- 9663 n0 <- n000+n010+n100+n110 n1 <- n001+n011+n101+n111 #with data frame... data <- data.frame(Y=c(0,0,0,0,1,1,1,1), X=c(0,0,1,1,0,0,1,1), Z=c(0,1,0,1,0,1,0,1)) n <- c(n000, n001, n010, n011, n100, n101, n110, n111) b <- ivbounds(data=data, Z="Z", X="X", Y="Y", weights=n) summary(b) #...or with vector of probabilities p <- n/rep(c(n0, n1), 4) names(p) <- c("p00.0", "p00.1", "p01.0", "p01.1", "p10.0", "p10.1", "p11.0", "p11.1") b <- ivbounds(data=p) summary(b)
ivcoxph
performs instrumental variable estimation of the causal exposure effect in
Cox PH models with individual-level data. Below, ,
, and
are the instrument, the exposure, and the outcome, respectively.
is a vector of covariates that we wish to control for in the analysis;
these would typically be confounders for the instrument and the outcome.
ivcoxph(estmethod, X, fitZ.L=NULL, fitX.LZ=NULL, fitX.L=NULL, fitT.LX=NULL, fitT.LZX=NULL, data, formula=~1, ctrl=FALSE, clusterid=NULL, t=NULL, vcov.fit=TRUE, ...)
ivcoxph(estmethod, X, fitZ.L=NULL, fitX.LZ=NULL, fitX.L=NULL, fitT.LX=NULL, fitT.LZX=NULL, data, formula=~1, ctrl=FALSE, clusterid=NULL, t=NULL, vcov.fit=TRUE, ...)
estmethod |
a string specifying the desired estimation method; either |
X |
a string specifying the name of the exposure |
fitZ.L |
an object of class |
fitX.LZ |
an object of class |
fitX.L |
an object of class |
fitT.LX |
an object of class |
fitT.LZX |
either an object of class |
data |
a data frame containing the variables in the model. The covariates, instrument,
exposure and outcome can have arbitrary names, e.g. they don't need to
be called |
formula |
an object of class |
ctrl |
logical. Should the control function |
clusterid |
an optional string containing the name of a cluster identification variable when
data are clustered. Specifying |
t |
a numeric scalar specifying the time point at which to solve the estimating
equation when |
vcov.fit |
logical. Should the variance-covariance matrix be computed? |
... |
optional arguments passed on to the |
ivcoxph
estimates the parameter in the causal Cox PH model
Here, is counterfactual hazard function,
had the exposure been set to 0. The vector function
contains interaction terms
between
and
. If
estmethod="ts"
, then these are specified
implicitly through the model fitT.LX
. If estmethod="g"
, then these
are specified explicitly through the formula
argument.
If estmethod="ts"
, then two-stage estimation of is performed.
In this case, the model
fitX.LZ
is used to construct predictions
. These predictions are subsequently used to re-fit
the model
fitT.LX
, with replaced with
. The obtained
coefficient(s) for
in the re-fitted model is the two-stage estimator of
.
If estmethod="g"
, then G-estimation of is performed. In this case,
the estimator is obtained as the solution to the estimating equation
where
The estimated function is chosen so that the true function
has conditional mean 0, given
;
.
The specific form of
is determined by the user-specified models.
If
fitX.LZ
and fitX.L
are specified, then ,
where
and
are obtained from
fitX.LZ
and fitX.L
, respectively. If these are not specified, then ,
where
is obtained from
fitZ.L
, which then must be specified.
The estimating equation is solved at the value of specified by the argument
t
.
is an estimate of
obtained
from the model
fitT.LZX
.
ivcoxph
returns an object of class "ivcoxph"
, which inherits from
class "ivmod"
. An object of class "ivcoxph"
is a list containing
call |
the matched call. |
input |
|
est |
a vector containing the estimate of |
vcov |
the variance-covariance matrix for the estimate of |
estfunall |
a matrix of all subject-specific contributions to the estimating functions used in the estimation process.
One row for each subject, one column for each parameter. If |
d.estfun |
the jacobian matrix of |
converged |
logical. Was a solution found to the estimating equations? |
fitT.LX |
the re-fitted model |
t |
the value of |
ivcoxph
allows for weights. However, these are defined implicitly
through the input models. Thus, when models are used as input to ivcoxph
,
these models have to be fitted with the same weights. When estmethod="g"
the weights are taken from fitX.LZ
, if specified by the user. If fitX.LZ
is not
specified then the weights are taken from fitZ.L
. Hence, if weights are used,
then either fitX.LZ
or fitZ.L
must be specified.
Arvid Sjolander.
Martinussen T., Sorensen D.D., Vansteelandt S. (2019). Instrumental variables estimation under a structural Cox model. Biostatistics 20(1), 65-79.
Sjolander A., Martinussen T. (2019). Instrumental variable estimation with the R package ivtools. Epidemiologic Methods 8(1), 1-20.
Tchetgen Tchetgen E.J., Walter S., Vansteelandt S., Martinussen T., Glymour M. (2015). Instrumental variable estimation in a survival context. Epidemiology 26(3), 402-410.
require(survival) set.seed(9) ##Note: the parameter values in the examples below are chosen to make ##Y0 independent of Z, which is necessary for Z to be a valid instrument. n <- 10000 psi0 <- 0.5 Z <- rbinom(n, 1, 0.5) X <- rbinom(n, 1, 0.7*Z+0.2*(1-Z)) m0 <- exp(0.8*X-0.41*Z) #T0 independent of Z at t=1 T <- rexp(n, rate=exp(psi0*X+log(m0))) C <- rexp(n, rate=exp(psi0*X+log(m0))) #50% censoring d <- as.numeric(T<C) T <- pmin(T, C) data <- data.frame(Z, X, T, d) #two-stage estimation fitX.LZ <- glm(formula=X~Z, data=data) fitT.LX <- coxph(formula=Surv(T, d)~X, data=data) fitIV <- ivcoxph(estmethod="ts", fitX.LZ=fitX.LZ, fitT.LX=fitT.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation with non-parametric model for S(t|L,Z,X) and model for Z fitZ.L <- glm(formula=Z~1, data=data) fitT.LZX <- survfit(formula=Surv(T, d)~X+Z, data=data) fitIV <- ivcoxph(estmethod="g", X="X", fitZ.L=fitZ.L, fitT.LZX=fitT.LZX, data=data, t=1) summary(fitIV) #G-estimation with Cox model for \lambda(t|L,Z,X) and model for Z fitZ.L <- glm(formula=Z~1, data=data) fitT.LZX <- coxph(formula=Surv(T, d)~X+X+X*Z, data=data) fitIV <- ivcoxph(estmethod="g", X="X", fitZ.L=fitZ.L, fitT.LZX=fitT.LZX, data=data, t=1) summary(fitIV)
require(survival) set.seed(9) ##Note: the parameter values in the examples below are chosen to make ##Y0 independent of Z, which is necessary for Z to be a valid instrument. n <- 10000 psi0 <- 0.5 Z <- rbinom(n, 1, 0.5) X <- rbinom(n, 1, 0.7*Z+0.2*(1-Z)) m0 <- exp(0.8*X-0.41*Z) #T0 independent of Z at t=1 T <- rexp(n, rate=exp(psi0*X+log(m0))) C <- rexp(n, rate=exp(psi0*X+log(m0))) #50% censoring d <- as.numeric(T<C) T <- pmin(T, C) data <- data.frame(Z, X, T, d) #two-stage estimation fitX.LZ <- glm(formula=X~Z, data=data) fitT.LX <- coxph(formula=Surv(T, d)~X, data=data) fitIV <- ivcoxph(estmethod="ts", fitX.LZ=fitX.LZ, fitT.LX=fitT.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation with non-parametric model for S(t|L,Z,X) and model for Z fitZ.L <- glm(formula=Z~1, data=data) fitT.LZX <- survfit(formula=Surv(T, d)~X+Z, data=data) fitIV <- ivcoxph(estmethod="g", X="X", fitZ.L=fitZ.L, fitT.LZX=fitT.LZX, data=data, t=1) summary(fitIV) #G-estimation with Cox model for \lambda(t|L,Z,X) and model for Z fitZ.L <- glm(formula=Z~1, data=data) fitT.LZX <- coxph(formula=Surv(T, d)~X+X+X*Z, data=data) fitIV <- ivcoxph(estmethod="g", X="X", fitZ.L=fitZ.L, fitT.LZX=fitT.LZX, data=data, t=1) summary(fitIV)
ivglm
performs instrumental variable estimation of the causal exposure effect in
generalized linear models with individual-level data. Below, ,
, and
are the instrument, the exposure, and the outcome, respectively.
is a vector of covariates that we wish to control for in the analysis;
these would typically be confounders for the instrument and the outcome.
ivglm(estmethod, X, Y, fitZ.L=NULL, fitX.LZ=NULL, fitX.L=NULL, fitY.LX=NULL, fitY.LZX=NULL, data, formula=~1, ctrl=FALSE, clusterid=NULL, link, vcov.fit=TRUE, ...)
ivglm(estmethod, X, Y, fitZ.L=NULL, fitX.LZ=NULL, fitX.L=NULL, fitY.LX=NULL, fitY.LZX=NULL, data, formula=~1, ctrl=FALSE, clusterid=NULL, link, vcov.fit=TRUE, ...)
estmethod |
a string specifying the desired estimation method; either |
X |
a string specifying the name of the exposure |
Y |
a string specifying the name of the outcome |
fitZ.L |
an object of class |
fitX.LZ |
an object of class |
fitX.L |
an object of class |
fitY.LX |
an object of class |
fitY.LZX |
an object of class |
data |
a data frame containing the variables in the model. The covariates, instrument,
exposure and outcome can have arbitrary names, e.g. they don't need to
be called |
formula |
an object of class |
ctrl |
logical. Should the control function |
clusterid |
an optional string containing the name of a cluster identification variable when
data are clustered. Specifying |
link |
a string specifying the link function for the causal generalized linear model;
see ‘Details’. Either |
vcov.fit |
logical. Should the variance-covariance matrix be computed? |
... |
optional arguments passed on to the |
ivglm
estimates the parameter in the causal generalized linear model
Here, is counterfactual mean of the outcome,
had the exposure been set to 0. The link function
is either the identity, log or logit link, as specified by
the
link
argument. The vector function contains interaction terms
between
and
. If
estmethod="ts"
, then these are specified
implicitly through the model fitY.LX
. If estmethod="g"
, then these
are specified explicitly through the formula
argument.
If estmethod="ts"
, then two-stage estimation of is performed.
In this case, the model
fitX.LZ
is used to construct predictions
. These predictions are subsequently used to re-fit
the model
fitY.LX
, with replaced with
. The obtained
coefficient(s) for
in the re-fitted model is the two-stage estimator of
.
If estmethod="g"
, then G-estimation of is performed. In this case,
the estimator is obtained as the solution to the estimating equation
The function is defined as
when link="identity"
,
when link="log"
, and
when link="logit"
. In the latter, is
an estimate of
obtained from the model
fitY.LZX
.
The estimated function is chosen so that the true function
has conditional mean 0, given
;
.
The specific form of
is determined by the user-specified models.
If
fitX.LZ
and fitX.L
are specified, then ,
where
and
are obtained from
fitX.LZ
and fitX.L
, respectively. If these are not specified, then ,
where
is obtained from
fitZ.L
, which then must be specified.
ivglm
returns an object of class "ivglm"
, which inherits from
class "ivmod"
. An object of class "ivglm"
is a list containing
call |
the matched call. |
input |
|
est |
a vector containing the estimate of |
vcov |
the variance-covariance matrix for the estimate of |
estfunall |
a matrix of all subject-specific contributions to the estimating functions used in the estimation process.
One row for each subject, one column for each parameter. If |
d.estfun |
the jacobian matrix of |
converged |
logical. Was a solution found to the estimating equations? |
fitY.LX |
the re-fitted model |
ivglm
allows for weights. However, these are defined implicitly
through the input models. Thus, when models are used as input to ivglm
,
these models have to be fitted with the same weights. When estmethod="g"
the weights are taken from fitX.LZ
, if specified by the user. If fitX.LZ
is not
specified then the weights are taken from fitZ.L
. Hence, if weights are used,
then either fitX.LZ
or fitZ.L
must be specified.
Arvid Sjolander.
Bowden J., Vansteelandt S. (2011). Mendelian randomization analysis of case-control data using structural mean models. Statistics in Medicine 30(6), 678-694.
Sjolander A., Martinussen T. (2019). Instrumental variable estimation with the R package ivtools. Epidemiologic Methods 8(1), 1-20.
Vansteelandt S., Bowden J., Babanezhad M., Goetghebeur E. (2011). On instrumental variables estimation of causal odds ratios. Statistical Science 26(3), 403-422.
set.seed(9) ##Note: the parameter values in the examples below are chosen to make ##Y0 independent of Z, which is necessary for Z to be a valid instrument. n <- 1000 psi0 <- 0.5 psi1 <- 0.2 ##---Example 1: linear model and interaction between X and L--- L <- rnorm(n) Z <- rnorm(n, mean=L) X <- rnorm(n, mean=Z) m0 <- X-Z+L Y <- rnorm(n, mean=psi0*X+psi1*X*L+m0) data <- data.frame(L, Z, X, Y) #two-stage estimation fitX.LZ <- glm(formula=X~Z, data=data) fitY.LX <- glm(formula=Y~X+L+X*L, data=data) fitIV <- ivglm(estmethod="ts", fitX.LZ=fitX.LZ, fitY.LX=fitY.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation with model for Z fitZ.L <- glm(formula=Z~L, data=data) fitIV <- ivglm(estmethod="g", X="X", Y="Y", fitZ.L=fitZ.L, data=data, formula=~L, link="identity") summary(fitIV) #G-estimation with model for X fitX.LZ <- glm(formula=X~L+Z, data=data) fitX.L <- glm(formula=X~L, data=data) fitIV <- ivglm(estmethod="g", Y="Y", fitX.LZ=fitX.LZ, fitX.L=fitX.L, data=data, formula=~L, link="identity") summary(fitIV) ##---Example 2: logistic model and no covariates--- Z <- rbinom(n, 1, 0.5) X <- rbinom(n, 1, 0.7*Z+0.2*(1-Z)) m0 <- plogis(1+0.8*X-0.39*Z) Y <- rbinom(n, 1, plogis(psi0*X+log(m0/(1-m0)))) data <- data.frame(Z, X, Y) #two-stage estimation fitX.LZ <- glm(formula=X~Z, family="binomial", data=data) fitY.LX <- glm(formula=Y~X, family="binomial", data=data) fitIV <- ivglm(estmethod="ts", fitX.LZ=fitX.LZ, fitY.LX=fitY.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation with model for Z fitZ.L <- glm(formula=Z~1, data=data) fitY.LZX <- glm(formula=Y~X+Z+X*Z, family="binomial", data=data) fitIV <- ivglm(estmethod="g", X="X", fitZ.L=fitZ.L, fitY.LZX=fitY.LZX, data=data, link="logit") summary(fitIV)
set.seed(9) ##Note: the parameter values in the examples below are chosen to make ##Y0 independent of Z, which is necessary for Z to be a valid instrument. n <- 1000 psi0 <- 0.5 psi1 <- 0.2 ##---Example 1: linear model and interaction between X and L--- L <- rnorm(n) Z <- rnorm(n, mean=L) X <- rnorm(n, mean=Z) m0 <- X-Z+L Y <- rnorm(n, mean=psi0*X+psi1*X*L+m0) data <- data.frame(L, Z, X, Y) #two-stage estimation fitX.LZ <- glm(formula=X~Z, data=data) fitY.LX <- glm(formula=Y~X+L+X*L, data=data) fitIV <- ivglm(estmethod="ts", fitX.LZ=fitX.LZ, fitY.LX=fitY.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation with model for Z fitZ.L <- glm(formula=Z~L, data=data) fitIV <- ivglm(estmethod="g", X="X", Y="Y", fitZ.L=fitZ.L, data=data, formula=~L, link="identity") summary(fitIV) #G-estimation with model for X fitX.LZ <- glm(formula=X~L+Z, data=data) fitX.L <- glm(formula=X~L, data=data) fitIV <- ivglm(estmethod="g", Y="Y", fitX.LZ=fitX.LZ, fitX.L=fitX.L, data=data, formula=~L, link="identity") summary(fitIV) ##---Example 2: logistic model and no covariates--- Z <- rbinom(n, 1, 0.5) X <- rbinom(n, 1, 0.7*Z+0.2*(1-Z)) m0 <- plogis(1+0.8*X-0.39*Z) Y <- rbinom(n, 1, plogis(psi0*X+log(m0/(1-m0)))) data <- data.frame(Z, X, Y) #two-stage estimation fitX.LZ <- glm(formula=X~Z, family="binomial", data=data) fitY.LX <- glm(formula=Y~X, family="binomial", data=data) fitIV <- ivglm(estmethod="ts", fitX.LZ=fitX.LZ, fitY.LX=fitY.LX, data=data, ctrl=TRUE) summary(fitIV) #G-estimation with model for Z fitZ.L <- glm(formula=Z~1, data=data) fitY.LZX <- glm(formula=Y~X+Z+X*Z, family="binomial", data=data) fitIV <- ivglm(estmethod="g", X="X", fitZ.L=fitZ.L, fitY.LZX=fitY.LZX, data=data, link="logit") summary(fitIV)
This is a plot
method for class "estfun"
.
## S3 method for class 'estfun' plot(x, ...)
## S3 method for class 'estfun' plot(x, ...)
x |
an object of class |
... |
additional arguments to |
Arvid Sjolander.
##See documentation for estfun.
##See documentation for estfun.
This is a plot
method for class "ivah"
. It only supports
objects fitted with estmethod="g"
.
## S3 method for class 'ivah' plot(x, gof=FALSE, CI.level=0.95, ...)
## S3 method for class 'ivah' plot(x, gof=FALSE, CI.level=0.95, ...)
x |
an object of class |
gof |
should we plot the goodness-of-fit process? If not, then |
CI.level |
level for the confidence intervals. |
... |
not used. |
Arvid Sjolander and Torben Martinussen.
##See documentation for ivah.
##See documentation for ivah.
This is a print
method for class "ivmod"
.
## S3 method for class 'ivmod' print(x, digits=max(3L, getOption("digits")-3L), ...)
## S3 method for class 'ivmod' print(x, digits=max(3L, getOption("digits")-3L), ...)
x |
an object of class |
digits |
the number of significant digits to use. |
... |
not used. |
Arvid Sjolander
##See documentation for ivglm, ivcoxph and ivah.
##See documentation for ivglm, ivcoxph and ivah.
This is a print
method for class "summary.ivbounds"
.
## S3 method for class 'summary.ivbounds' print(x, digits=max(3L, getOption("digits")-3L), ...)
## S3 method for class 'summary.ivbounds' print(x, digits=max(3L, getOption("digits")-3L), ...)
x |
an object of class |
digits |
the number of significant digits to use. |
... |
not used. |
Arvid Sjolander
##See documentation for ivbounds.
##See documentation for ivbounds.
This is a print
method for class "summary.ivmod"
.
## S3 method for class 'summary.ivmod' print(x, digits=max(3L, getOption("digits")-3L), signif.stars=getOption("show.signif.stars"), ...)
## S3 method for class 'summary.ivmod' print(x, digits=max(3L, getOption("digits")-3L), signif.stars=getOption("show.signif.stars"), ...)
x |
an object of class |
digits |
the number of significant digits to use. |
signif.stars |
logical. If TRUE, "significance stars" are printed for each coefficient. |
... |
not used. |
Arvid Sjolander
##See documentation for ivglm, ivcoxph and ivah.
##See documentation for ivglm, ivcoxph and ivah.
This is a summary
method for class "ivbounds"
.
## S3 method for class 'ivbounds' summary(object, ...)
## S3 method for class 'ivbounds' summary(object, ...)
object |
an object of class |
... |
not used. |
Provides the lower and and upper bounds for
Arvid Sjolander
##See documentation for ivbounds.
##See documentation for ivbounds.
This is a summary
method for class "ivmod"
.
## S3 method for class 'ivmod' summary(object, ...)
## S3 method for class 'ivmod' summary(object, ...)
object |
an object of class |
... |
not used. |
Arvid Sjolander
##See documentation for ivglm, ivcoxph and ivah.
##See documentation for ivglm, ivcoxph and ivah.
This dataset originates from a real cohort study on Vitamin D and mortailty,
described by Martinussen et al (2017). However, to allow public availability
the data were slightly mutilated before inclusion in the
ivtools
package.
data(VitD)
data(VitD)
The dataset contains the following variables:
age at baseline.
binary indicator of whether the subject has mutations in the filaggrin gene.
vitamin D level at baseline, measured as serum 25-OH-D (nmol/L).
follow-up time.
indicator of whether the subject died during follow-up.
Martinussen T., Sorensen D.D., Vansteelandt S. (2019). Instrumental variables estimation under a structural Cox model. Biostatistics 20(1), 65-79.