A General Framework for and New Normalization of Attributable Proportion

Ola Hössjer; Ingrid Kockum; Lars Alfredsson; Anna Karin Hedström; Tomas Olsson; Magnus Lekman

doi:10.1515/em-2015-0028

Article Publicly Available

A General Framework for and New Normalization of Attributable Proportion

Ola Hössjer , Ingrid Kockum , Lars Alfredsson , Anna Karin Hedström , Tomas Olsson and Magnus Lekman

Published/Copyright: December 23, 2016

Published by

Become an author with De Gruyter Brill

Submit Manuscript Author Information

From the journal Epidemiologic Methods Volume 6 Issue 1

Abstract

A unified theory is developed for attributable proportion (AP) and population attributable fraction (PAF) of joint effects, marginal effects or interaction among factors. We use a novel normalization with a range between –1 and 1 that gives the traditional definitions of AP or PAF when positive, but is different when they are negative. We also allow for an arbitrary number of factors, both those of primary interest and confounders, and quantify interaction as departure from a given model, such as a multiplicative, additive odds or disjunctive one. In particular, this makes it possible to compare different types of threeway or higher order interactions. Effect parameters are estimated on a linear or logit scale in order to find point estimates and confidence intervals for the various versions of AP and PAF, for prospective or retrospective studies. We investigate the accuracy of three confidence intervals; two of which use the delta method and a third bootstrapped interval. It is found that the delta method with logit type transformations, and the bootstrap, perform well for a wide range of models. The methodology is also applied to a multiple sclerosis (MS) data set, with smoking and two genetic variables as risk factors.

Keywords: attributable proportion; confidence interval; models of no interaction; normalization; population attributable fraction

1 Introduction

Most complex diseases are influenced by a number of genetic and environmental factors, co-factors or predictors that either have a marginal effect and/or interact in some way (Maher 2008; Eichler et al. 2010; Bookman et al. 2011). The impact of these factors on disease risk is specified with a penetrance function

(1)θ(x)=P(Y=1|X=x),

where Y equals 1 (0) for an individual with (without) the disease, and x=(x1,…,xp) is a vector with exposure status for p co-factors. We will assume that the factors are binary, with xi=1 or 0 depending on whether an individual is exposed to factor i or not, although extensions to continuous covariates will be discussed in Section 8. Often a few of the variables J⊂{1,…,p} are of main interest, whereas the others are confounders. In this paper we develop a unified theory for the attributable risk or attributable proportion (AP), and the population attributable fraction (PAF), when the effects of confounders are taken into account. This theory will involve both joint effects, marginal effects and interaction.

The joint effect of all non-confounders will refer to their cumulative marginal effects and interaction. The traditional definition of AP;

(2)APtrad(x)=θ(x)−θrem(x)θ(x),

is the joint fraction of disease risk due the factors in J. The penetrance function θrem(x) is the disease risk of an individual whose exposure to all factors i∈J of interest has been removed, whereas the confounding variables i∉J remain at the levels specified by x. Equation (2) generalizes the definition of attributable proportion, first introduced by Levin (1953) to quantify the effect that one single risk factor, smoking, has on lung cancer. The joint effect is then equivalent to a marginal effect, since it is not possible to have interaction for one single factor. Greenland and Robins (1988) noted that sometimes there is an ambiguity as to whether θrem should include all subjects that develop the disease without exposure to the relevant factor(s), or only those for which the factor(s) has no causal role. Often the former definition of attributable proportion is used, since it is easier to estimate from data.

Since smoking is a risk factor (θ>θrem), APtrad is a number between 0 and 1. On the other hand, it is negative without any lower bound for preventive factors (θ<θrem). It is then possible to use the preventive fraction

(3)PF(x)=θ(x)−θrem(x)θrem(x),

a number between 0 and 1 that equals the fraction of reduced disease risk for a subject exposed to preventive factors, see for instance Aschengrau et al. (2003, pp. 65–66). But it is redundant to use two different quantities APtrad and PF in order to quantify the effect of one set of factors, in particular when it is not known before a study whether a predictor increases or decreases risk. The relative risk could replace eqs (2) and (3) but it has no upper bound for risk factors. Instead, we propose another normalization

(4)AP(x)=θ(x)−θrem(x)maxθ(x),θrem(x)

of the attributable proportion, a number between –1 and 1 that equals APtrad for risk factors and −PF for preventive factors.

In order to define a version of AP that averages the effect of all confounders, while still controlling for their effect, we introduce θaddx, the penetrance function for a subject whose exposure to all factors i∈J has been added or turned on, whereas all confounding factors i∉J remain at the levels specified by x. Let =(X1,…,Xp) refer to the exposure vector of a randomly chosen subject in the population. The joint proportion of risk due to the predictors in J is

(5)AP=EθaddX−EθremXmaxEθaddX,EθremX,

when averaging the effect of all confounders. The numerator of eq. (5) is the average causal effect (ACE) when causation can be replaced by association (Agresti 2013), and the denominator normalizes ACE to a value between –1 and 1.

It is also possible to define counterparts of eqs (2), (3) and (4) for a whole population. The population attributable fraction

(6)PAFtrad=EθX−EθremXEθX,

quantifies the fraction of disease prevalence Eθ(X) that is jointly attributable to a subset J of risk factors. It is an important quantity for public health and a tool for deciding which possible interventions that should be prioritized, see Deubner et al. (1980), Northridge (1995), Benichou (2001) and Sjölander and Vansteelandt (2011). For preventive factors, one uses instead the population prevented fraction

(7)PPF=Eθ(X)−Eθrem(X)Eθrem(X),

i. e. the fraction by which prevalence is decreased when the population is jointly exposed to all factors in J. Analogously to eq. (4), we combine eqs (6) and (7) into a generalized version

(8)PAF=EθX−EθremXmaxEθX,EθremX

of population attributable fraction that takes values between –1 and 1 and equals PAFtrad for risk factors and −PPF for preventive factors.

It is possible to reinterpret eqs (4), (5) and (8) to quantify epistasis or interaction among a subset J of at least two predictors, rather than their joint effect. Several methods are available to quantify interaction, see Greenland (1983), Cordell (2002), Phillips (2008), VanderWeele (2010) and VanderWeele and Knol (2014) for reviews. Some epistasis definitions involve causation, counterfactuals or potential outcomes (Rothman 1976a), and under certain conditions they can be tested statistically (VanderWeele and Robins 2008; Sjölander et al. 2014; Lekman et al. 2014). We will adopt a common and weaker definition of epistasis in terms of a model whose interaction parameter between a subset J of factors is nonzero. When attributable proportion is used to quantify the strength of this interaction (rather than the joint effect), we must not include any marginal effects. This amounts to reinterpreting θrem(x) as the disease risk when exposure to each factor in J is the same as in the definition of θ(x), but interaction among them, not their marginal effects, is removed. This requires that a model M of no interaction is defined. Statisticians have commonly used logistic regression models and multiplicative odds, but additive models of interaction have recently gained popularity, since they are believed to approximate an underlying biological reality more accurately (Rothman 2012). In particular, a collection of factors exhibits sufficient cause interaction when they are all present in at least one causal mechanism that is sufficient to bring about the disease. For binary factors and outcomes, it has been shown in VanderWeele (2009) and references therein, that under mild regularity conditions, sufficient cause interaction is closely related to additive interaction.

Given that a model M has been specified, interaction is positive or negative when there is excess or lowered risk due to interaction, depending on the sign of APtrad. For positive interaction that increases disease risk (synergism), APtrad is a number between 0 and 1 that quantifies the proportion by which disease risk is lowered when interaction among the factors in J is removed for subjects with exposure x. But for negative interaction that lowers disease risk (antagonism), APtrad has no lower bound. The alternative definition eq. (4) is identical to eq. (2) when positive, but normalized differently for negative values, as minus the proportion by which disease risk is lowered for a subject with exposure x when interaction among the factors in J is added, with a lower bound of –1. The two extreme cases when AP equals –1 and 1 correspond to interaction that completely eliminates or dominates disease risk. The quantities AP and PAF in eqs (5) and (8) are also numbers between –1 and 1, with similar interpretation as the proportion of disease risk when interaction among the predictors in J is removed or added.

As far as we know, APtrad has only been used for interaction between pairs of binary factors under additive models of no interaction, but we will apply our new normalization of AP and PAF much more generally. Our framework is a model with an arbitrary number p of factors, when the joint effect or interaction among a subset J of them is of interest. We consider a large class of models M of no interaction, such as an additive, multiplicative or disjunctive one. Results of Hosmer and Lemeshow (1992), Assman et al. (1996) and Andersson et al. (2005) are extended, whereby linear or logit parametrization of effect parameters is used to obtain point estimates and three types of confidence intervals, two Wald intervals based on the multivariate delta method and asymptotic normality of parameter estimates, and a bootstrap interval. In particular, we utilize that eq. (4) is restricted to [−1,1] and show that a Wald interval based on a logit type transformation and the bootstrap work very well for a wide range of models.

Our work was motivated by a recent study in Lekman et al. (2014), where possible genetic interaction effects of major depressive disorder were sought for in terms of pairs of single nucleotide polymorphisms (SNPs), using different multiplicative and additive models of no interaction. For some pairs of SNPs, AP estimates for the additive models were far below –1, with difficulty of interpretation. Although different normalizations of AP due to interaction have been proposed (VanderWeele 2013), the one in eq. (4) is the first that guarantees a range of [−1,1].

In order to estimate AP and PAF one typically uses cohort data, and external information is needed in order to calculate these quantities from case-control data. But since odds ratios can be estimated consistently from case-control data, an alternative strategy is to redefine the attributable proportion in terms of odds ratios rather than disease risks. We apply such an odds ratio based definition of AP, with our new normalization, to estimate interaction effects of a multiple sclerosis (MS) data set previously analyzed in Hedström et al. (2011).

The paper is organized as follows. In Section 2 we consider the case of two factors and additive models of no interaction, and then in Section 3 an arbitrary number of factors, and a generalized linear model (McCullagh and Nelder 1989) family of no interaction models that include additive, additive odds, multiplicative and disjunctive models. In Section 4 we define the analogous versions of AP in terms of odds ratios. Then we define point estimates and confidence intervals for prospective and retrospective studies in Section 5, conduct a simulation study in Section 6, analyze the MS data set in Section 7, and conclude with a discussion in Section 8. Further mathematical and implementation details are given in Appendices A-F.

2 Two predictors

2.1 Marginal and joint effects

Suppose we have p=2 binary co-factors or predictors, and let

RR(x1,x2)=θ(x1,x2)θ(0,0)

be the relative risk for exposure profile (x1,x2) compared to a baseline risk when exposure is absent. The attributable proportion AP(x)=AP(x; J) in eq. (4) for a subset J⊂{1,2} of predictors takes the form

(9)AP(x1,x2; {1})=θ(x1,x2)−θ(0,x2)maxθ(x1,x2),θ(0,x2)=RR(x1,x2)−RR(0,x2)maxRR(x1,x2),RR(0,x2)

for the marginal effect of the first factor,

(10)AP(x1,x2; {2})=θ(x1,x2)−θ(x1,0)maxθ(x1,x2),θ(x1,0)=RR(x1,x2)−RR(x1,0)maxRR(x1,x2),RR(x1,0)

for the marginal effect of the second factor, and

(11)AP(x1,x2; {1,2})=θ(x1,x2)−θ(0,0)maxθ(x1,x2),θ(0,0)=RR(x1,x2)−1maxRR(x1,x2),1

for the joint effect of both factors. We see from eqs (9)–(11) that AP(x1,x2; J) is only a function of relative risk RR(x1,x2) when both factors are considered jointly, or when the effect of one factor is of interest and exposure from the other confounding factor is absent. In the latter case, if the confounding factor 2 (say) is exposed as well, AP(x1,1; {1}) is also a function of RR(0,1).

The marginal effect AP=AP(J) in eq. (5) of predictor 1 is

AP({1})=Eθ(1,X2)−Eθ(0,X2)maxEθ(1,X2),Eθ(0,X2).

where expectation is over the exposure distribution of the confounding variable X2. The marginal effect AP({2}) of predictor 2 is analogous.

The population attributable fraction PAF=PAF(J) in eq. (8) is defined as an average of eqs (9)–(11) with respect to an exposure distribution q(x1,x2)=P(X1=x1,X2=x2) of both predictors. For instance, the marginal effect of predictor 1 in the population is

PAF({1})=Eθ(X1,X2)−Eθ(0,X2)maxEθ(X1,X2),Eθ(0,X2).

The population attributable fraction of predictor 2 is defined in the same way, reversing the roles of X1 and X2. The joint effect of both predictor variables is

PAF({1,2})=Eθ(X1,X2)−θ(0,0)maxEθ(X1,X2),θ(0,0).

2.2 Interaction

Both variables J={1,2} are required to define AP(x)=AP(x; J,M) of interaction, and therefore J will be dropped in the notation. We also need a model M of no interaction, and write θrem(x)=θrem(x; M). The departure from no interaction is referred to as ERI(x)=θ(x)−θrem(x), the excess risk due to interaction. It is usually required that both factors i are turned on (xi=1) in order for ERI to be nonzero, and then it can either be positive or negative. For an additive model of no interaction the traditional definition is

θrem,trad(1,1; Additive)=θ(1,0)+θ(0,1)−θ(0,0).

Insertion into eq. (2) and division by the baseline risk θ(0,0) gives

(12)APtrad(1,1; Additive)=RR(1,1)−RR(0,1)+RR(1,0)−1RR(1,1).

The numerator of eq. (12) is commonly referred to as the interaction contrast ratio (ICR) or the relative excess risk due to interaction (RERI), whereas the denominator is the relative risk for a subject exposed to both factors. Their ratio in eq. (12) is a number between 0 and 1 that quantifies the proportion of risk due to interaction when RERI is positive, but it is negative without a lower bound when RERI<0. In order to overcome this deficiency of APtrad, let θi(x)=P(Y=1|Xi=x) be the marginal penetrance function of factor i. It has been suggested by Rothman (2012) to recode any factor i that is preventive (θi(1)<θi(0)) so that both factors become risk factors (θi(1)>θi(0)). Another suggestion of Knol et al. (2011) is to recode variables jointly, so that the penetrance function is minimized by θ(0,0). A third possibility is to redefine the denominator of eq. (2) to θ(1,1)−θ(0,0), or the one in eq. (12) to RR(1,1)−1, see Rothman (2012) and VanderWeele (2013).

Although all three of these suggestions can be helpful, the following example shows that none of them guarantees an AP between –1 and 1. We assume θ(0,0)=0, θ(1,0)=θ(0,1)=0.5, θ(1,1)=0.1 and that all exposure profiles are equally likely, i. e. q(x1,x2)=0.25. It follows that θi(0)=0.25 and θi(1)=0.3, so that both variables are risk factors. Neither marginal nor joint risk considerations will therefore cause any recoding, but yet

APtrad(1,1; Additive)=0.1−(0.5+0.5−0)0.1=−9,

the same value as when θ(1,1)−θ(0,0) is used in the denominator instead.

Our solution is to modify eq. (12) in two ways. First we truncate the predicted risk to a number between 0 and 1, i. e.

(13)θrem(x1,x2)=θ(0,0)+x1θ(1,0)−θ(0,0)+x2θ(0,1)−θ(0,0)01,

where [θ]ab=maxa,min(θ,b), arguing that it makes little sense to have a predictive risk that is not a probability. Second we apply the new normalization eq. (4) and divide by the baseline risk θ(0,0), so that

(14)AP(x1,x2; Additive)=RR(x1,x2)−RRrem(x1,x2)maxRR(x1,x2),RRrem(x1,x2),

where RRrem(x1,x2)=θrem(x1,x2)/θ(0,0) is the relative risk when interaction between the two factors has been removed. Defined in this way, eq. (14) is the proportion by which positive/antagonistic interaction increases risk, and minus the proportion by which protective/negative interaction decreases risk. In particular, eq. (14) equals –0.9 rather than –9 in the numerical example above. In order to motivate truncation in eq. (13), consider a second example with θ(0,0)=0.3 and θ(0,1)=θ(1,0)=θ(1,1)=0.1. Then AP(1,1; Additive)=1, but it would equal 2 without truncation.

The excess risk due to interaction in eq. (14) is only of interest when x1=x2=1, since otherwise it is zero. Since there are no confounders in the model (|J|=p=2), the attributable fraction [14] agrees with AP in eq. (5) when x1=x2=1. But we need to consider arbitrary exposure profiles (x1,x2) in order to define the population counterpart PAF=PAF(M) of eq. (14) in eq. (8). For two predictors it can be written as

(15)PAF(Additive)=ERR(X1,X2)−ERRrem(X1,X2)maxERR(X1,X2),ERRrem(X1,X2).

When positive, eq. (15) is the fraction of population risk due to additive interaction; when negative, it is minus the fraction by which such interaction decreases risk.

3 Arbitrary number of predictors

3.1 Marginal and joint effects

For an arbitrary number p of factors, we express in Appendix A.1 several quantities in terms of the relative risk

(16)RR(x)=θ(x)θ(x)

of exposure x compared to a scenario 0=(0,…,0) where exposure is absent for all factors. This includes the joint attributable proportion AP(x)=AP(x; J) in eq. (4) for exposure x, due to factors in J, the corresponding joint attributable proportion AP=AP(J) in eq. (5), when averaging over all confounders, and the joint population attributable fraction PAF=PAF(J) for factors J in eq. (8). The marginal attributable proportion or population attributable fraction for a single non-confounding variable is a special case of these definitions, with |J|=1.

3.2 Interaction

Consider an arbitrary subset J of at least two factors, and a model M that belongs to a class of generalized linear models of no interaction. To each x we associate another vector xJ=(xJ1,…,xJp), whose exposure to all factors of interest has been removed (xJi=0 if i∈J), whereas the confounding variables remain at the levels specified by x (xJi=xi if i∉J). With this notation, we can specify the disease risk for a subject with exposure x and no interaction among the variables in J, as

(17)θrem=θrem(x; J,M)=g−1ρg(θ(xJ))+∑i∈Jβixi,

where g is a link function that transforms the disease probability θ to regression space, βi=gθ(xJ+ei)−gθ(xJ) is a regression coefficient for predictor variable i∈J that is typically a function of xJ, ei is a unit vector of length p with one in position i and zeros elsewhere, and ρ a projection that guarantees θrem(x; J,M)∈[0,1].

A number of models (17) are shown in Table 1; the identity link g(θ)=θ gives an additive model, the odds link g(θ)=θ/(1−θ) an additive odds model, the canonical logit link g(θ)=logθ/(1−θ) a multiplicative model, the log link g(θ)=log(θ) a multiplicative risk model and g(θ)=−log(1−θ) a multicausal disjunctive model (Pearl 1988). No truncation is necessary (ρ(η)=η) for the multiplicative model in order to ensure θrem∈[0,1], for the additive model ρ(η)=[η]01, for the multiplicative risk model ρ(η)=[η]−∞0, and for the additive odds and disjunctive models ρ(η)=[η]0∞. Our framework also incorporates many other no interaction models than eq. (17), see Appendix E.

Table 1:

Examples of generalized linear models [17] of no interaction.

Model	Link function g(θ)	θ(xJ)	θ(xJ+ei)	d(O)	O(x_J)	O(x_J+e_i)
Additive	θ	β0	β0+βi	O1+O	β01−β0	β0+βi1−β0−βi
Additive odds	θ1−θ	β01+β0	β0+βi1+β0+βi	O	β0	β0+βi
Multiplicative	log(θ1−θ)	eβ01+eβ0	eβ0+βi1+eβ0+βi	log(O)	eβ0	eβ0+βi
Multipl. risk	log(θ)	eβ0	eβ0+βi	log(O1+O)	eβ01−eβ0	eβ0+β11−eβ0+β1
Disjunctive	−log(1−θ)	1−e−β0	1−e−β0−βi	log(1+O)	eβ0−1	eβ0+βi−1

Note: For each model, g is the link function that transforms the disease probability θ to regression space η=g(θ), and d is the function that similarly links odds O=θ/(1−θ) to η=d(O). Risks and odds formulas are shown when exposure is turned on for at most one of the factors in J⊂{1,…,p}.

In Appendix A.2 we express a number of quantities in terms of eq. (16) and

(18)RRrem(x; J,M)=θrem(x; J,M)θ(0),

the relative risk for exposure x when interaction among all factors in J has been removed. This includes the attributable proportion AP(x)=AP(x; J,M) of interaction in eq. (4) among the factors in J for exposure x, the corresponding attributable proportion AP=AP(J; M) of interaction in eq. (5) when averaging over confounders, and, finally, the attributable fraction of interaction PAF=PAF(J,M) in eq. (8) among the factors in J for the whole population.

4 Odds ratios

It is possible to estimate relative risks consistently for cohort studies or case-control studies with incident cases who developed the disease after exposure (Miettinen 1976). But many case-control studies involve prevalent cases that were diagnosed before the study began. It is not possible then to estimate relative risks consistently, unless the sampling fraction of cases and controls is known. But since odds ratios

(19)OR(x)=O(x)O(0)=θ(x)/1−θ(x)θ(0)/1−θ(0)

can be estimated consistently also for the latter kind of studies (Cornfield 1951, Prentice and Pyke 1979, Skrondal 2003) they may replace relative risks, both for AP due to marginal effects eq. (34) and interaction eq. (37).

Consider a subset J⊂{1,…,p} of factors, and a non-interaction model M among them. In Appendix A.3 we define a number of odds-ratio based quantities in terms of eq. (19) and

(20)ORrem(x; J,M)=θrem(x; J,M)/1−θrem(x; J,M)θ(0)/1−θ(0),

the odds ratio for exposure x when interaction among the factors in J has been removed, with θrem(x; J,M) as in eq. (17). This includes the joint attributable proportion APOR(x; J) of the odds ratio for exposure x due to the factors in J, the corresponding joint attributable proportion APOR(J) when averaging the effect of all confounders, the attributable proportion APOR(x; J,M) of the odds ratio due to interaction among the factors in J for exposure x, and finally, the corresponding attributable proportion APOR(J,M) of the odds ratio due interaction among the factors in J, when averaging over the confounding factors. When APOR>0, it is the relative proportion of the odds ratio of disease attributed to joint effect or interaction among the variables in J. On the other hand, when APOR<0, it is minus the proportion by which joint effect of interaction among the variables in J decreases the odds of developing the disease.

Odds ratios typically approximate relative risks well for rare diseases. When this is the case, the odds ratio based versions of the attributable proportion are close to the corresponding risk based quantities of Section 3. For the models studied by Kalilani and Atashili (2006), the odds ratio approximation was accurate for diseases with baseline risks up to 0.2. But sometimes the difference is notable, and depending on disease risks and mode of interaction the odds ratio based attributable proportion may be biased (Kalilani and Atashili 2006; Zou 2008).

The attributable proportion of the odds ratio due to interaction has a particularly explicit form for the additive odds model. For instance, when there are p=2 risk factors and x=(1,1), we tacitly assume in the notation that J={1,2} (since otherwise APOR=0) and find that

(21)APOR1,1; Additive odds=OR(1,1)−OR(0,1)+OR(1,0)−10∞maxOR(1,1),OR(0,1)+OR(1,0)−10∞

is an odds ratio analogue of eqs (13)–(14). The corresponding traditional definition

APOR,trad1,1; Additive odds=OR(1,1)−OR(0,1)+OR(1,0)−1OR(1,1)

can be found for instance in Rothman (2012).

5 Inference

We will outline a unified approach to estimate the attributable proportion eqs (4)–(5) or eqs (40)–(41), and the population attributable fraction [8] due to joint effects or interaction of an arbitrary subset J⊂{1,…,p} of factors. It is assumed that a saturated model Msat is fit to data. For the risk-based quantities eqs (4)–(5) or eq. (8), the parameter vector ψ=(ψx) of Msat has 2p components, one for each x. A linear parametrization of the risk function is

(22)θ(x)=θ(x,ψ)=ψx.

Another possibility is to parametrize a logit transformation of risk. This was utilized by Hosmer and Lemeshow (1992) for the traditional definition [12] of AP of interaction with p=2 risk factors and an additive model of no interaction. They noted that the disease probability for a subject with exposure x can be parametrized in two ways. Either

(23)θ(x)=θ(x,ψ)=exp∑y≤xψy1+exp∑y≤xψy,

with y=(y1,…,yp) and y≤x interpreted componentwise, i. e. yi≤xi for i=1,…,p, and ψx an interaction parameter defined for the |x|=∑ixi nonzero risk factors of x. This is an ordinary logistic regression parametrization for a model with p predictors having 0 as their baseline levels, and interactions of all orders (Agresti 2013). If all nonzero exposure profiles x are viewed as different levels of one single predictor variable, then

(24)θ(x)=θ(x,ψ)=expψ0+ψx1+expψ0+ψx,

with ψx the log odds of x compared to the baseline level 0. Since the two parametrizations eqs (23) and (24) only differ by a linear transformation, they give identical inference, although eq. (24) is often simpler. We can also express the odds ratio eq. (19) conveniently in terms of any of the two logistic parametrizations, either

(25)OR(x)=OR(x,ψ)=exp∑0<y≤xψy,

for eq. (23), or

(26)OR(x)=OR(x,ψ)={1,x=0,expψx,x≠0,

for eq. (24). Neither eq. (25) nor eq. (26) involve the baseline parameter ψ0, and therefore we redefine ψ to equal (ψx; x≠0) when quantities with odds ratios are estimated.

All quantities AP(x),APOR(x),AP,APOR and PAF of interest are functions

(27)ξ=ξ(ψ)=a(ψ)−b(ψ)maxa(ψ),b(ψ).

of the parameter vector ψ, with different choices of a and b, as summarized in Table 5. In order to estimate ξ, suppose we have data (xi,Yi), i=1,…,N from N subjects, with ψˆ=(ψˆx) the accompanying maximum likelihood estimator of ψˆ. If this is a cohort design or some other prospective study, ψˆ maximizes a prospective likelihood, the probability of disease given exposure. For the linear parametrization (22) we have that

ψˆx=Nx1Nx,

with Nxy the number of subjects i with xi=x and Yi=y, and Nx=Nx0+Nx1. For either of the two logit scale parametrizations (23)–(24), we use standard logistic regression software to compute ψˆ. This is also possible for a case-control study, where sampling is on the response variable, and ψˆ maximizes a retrospective likelihood, the probability of exposure given disease, as long as we focus on quantities that only involve odds ratios (Prentice and Pyke 1979). We estimate ξ by replacing ψ with ψˆ in eq. (27), so that

(28)ξˆ=ξ(ψˆ)=a(ψˆ)−b(ψˆ)maxa(ψˆ),b(ψˆ).

For large N one often approximates the distribution of ψˆ by a multivariate normal N(ψ,Σ) with asymptotic covariance matrix Σ. The denominator of eq. (27) is not differentiable when a(ψ)=b(ψ). But since the numerator is zero at such points, ξ(ψ) is still differentiable and ξˆ approximately normally distributed Nξ,σ2 with asymptotic variance σ2=DΣDT,D=∂ξ(ψ)/∂ψ, and DT its transpose. As in Hosmer and Lemeshow (1992) we use the delta method to find a Wald type confidence interval

(29)I= ξ^−z1−α/2σ^,ξ^+z1−α/2σ^

for ξ with asymptotic coverage probability 1−α, where zβ is the β-quantile of a standard normal distribution, and

(30)σˆ=(DˆΣˆDˆT)1/2

the standard error of ξˆ, with Σˆ and Dˆ estimates of Σ and D. For the linear parametrization (22), Σˆ is a diagonal matrix with elements

(31)Σˆxx=ψˆx(1−ψˆx)Nx,

since observed risks are independent binomial proportions. For the logit parametrizations, Σˆ is available from standard logistic regression software. A formula for Dˆ is derived in Appendix B.

When ξ is close to one of its two boundary values –1 and 1, the distribution of ξˆ will typically be skewed, unless the sample size N is very large. Since eq. (22) assumes a symmetric normal distribution of ξˆ, the actual coverage probability π(1−α)=P(ξ∈I) may deviate substantially from the nominal 1−α. This suggests that a conveniently chosen transformation of h(ξˆ) of ξˆ will have a distribution closer to normal. When ξ∉{−1,1}, the logit type transformation h(x)=log(1+x)/(1−x) gives a Wald confidence interval

(32)I=exph(ξˆ)−z1−α/2σ˜−1exph(ξˆ)−z1−α/2σ˜+1,exph(ξˆ)+z1−α/2σ˜−1exph(ξˆ)+z1−α/2σ˜+1,

where σ˜=h′(ξˆ)σˆ=2σ˜/{(1+ξˆ)(1−ξˆ)}. Whereas the interval in eq. (29) may extend outside (−1,1), eq. (32) is always a subset of (−1,1) as long as ξˆ∈(−1,1). Other confidence interval approaches for ξ that deal with asymmetric distributions include the methods of variance estimates recovery (Zou 2008), likelihood ratio based intervals obtained from a saturated additive odds ratio model (Richardson and Kaufman 2009), nonparametric bootstrap (Assman et al. 1996), with accelerated bias correction (Zou 2008). In Appendix D we define a bias-corrected and accelerated version of the percentile bootstrap method.

The exposure distribution q(x)=P(X=x) is assumed to be known in eq. (28) for the population based AP, PAF and APOR. In Appendix C we show how to compute standard error and confidence intervals when q is estimated from a prospective study.

6 Numerical illustration and simulation

Figures 1–2 illustrate how normalizations eqs (2) and (4) affect AP(1,1; M) of interaction for two saturated models I and II with p=2 factors. Model I has two risk factors, whereas II has one risk and one preventive factor. As expected, the normalization has a large impact for both models when AP is negative. The model of no interaction influences the value of AP a lot for model I, and to some extent for model II. Table 2 gives values of the disease risk θrem(x; J,M) when various types of interaction among the variables in J is removed. This is done for models I and II, as well as for a model III with p=3 co-factors.

$Figure 1: Plots of normalized attributable proportion AP(1,1)=AP(1,1; J,M)$${\rm AP}(1, 1) = {\rm AP}(1, 1; \ J, M)$$ of interaction [37] (solid) and the corresponding traditional quantity [2] (dashed). The curves are functions of θ(1,1)$$\theta (1, 1)$$ for a saturated model with p=2$$p = 2$$ co-factors, whose other three risks θ(0,0)=0.05$$\theta (0, 0) = 0.05$$, θ(1,0)=0.25$$\theta (1, 0) = 0.25$$, θ(0,1)=0.4$$\theta (0, 1) = 0.4$$ are the same as for Model I of Table 2. Both variables are non-confounders (J={1,2}$$J = \{1, 2\} $$), and three different no interaction models M$$M$$ are assumed; additive (thick line), additive odds (upper thin line) and multiplicative (lower thin line).$

Figure 1:

Plots of normalized attributable proportion AP(1,1)=AP(1,1; J,M) of interaction [37] (solid) and the corresponding traditional quantity [2] (dashed). The curves are functions of θ(1,1) for a saturated model with p=2 co-factors, whose other three risks θ(0,0)=0.05, θ(1,0)=0.25, θ(0,1)=0.4 are the same as for Model I of Table 2. Both variables are non-confounders (J={1,2}), and three different no interaction models M are assumed; additive (thick line), additive odds (upper thin line) and multiplicative (lower thin line).

$Figure 2: Plots of normalized attributable proportion AP(1,1)=AP(1,1; J,M)$${\rm AP}(1, 1) = {\rm AP}(1, 1; \ J, M)$$ of interaction [37] (solid) and the corresponding traditional quantity [2] (dashed) versus θ(1,1)$$\theta (1, 1)$$ for a saturated model with p=2$$p = 2$$ risk factors, whose other three risks θ(0,0)=0.1$$\theta (0, 0) = 0.1$$, θ(1,0)=0.05$$\theta (1, 0) = 0.05$$, θ(0,1)=0.15$$\theta (0, 1) = 0.15$$ are the same as for Model II of Table 2. Both variables are non-confounders (J={1,2}$$J = \{1, 2\} $$), and two different no interaction models M$$M$$ are assumed; additive (thick line), and multiplicative (thin line). The curves for the additive odds model (not shown) are very close to those of the additive model.$

Figure 2:

Plots of normalized attributable proportion AP(1,1)=AP(1,1; J,M) of interaction [37] (solid) and the corresponding traditional quantity [2] (dashed) versus θ(1,1) for a saturated model with p=2 risk factors, whose other three risks θ(0,0)=0.1, θ(1,0)=0.05, θ(0,1)=0.15 are the same as for Model II of Table 2. Both variables are non-confounders (J={1,2}), and two different no interaction models M are assumed; additive (thick line), and multiplicative (thin line). The curves for the additive odds model (not shown) are very close to those of the additive model.

Table 2:

Three saturated models I-III with p risk factors.

Sat. model						Additive	Add odds	Disj	Mult
	p	θ(0,0)	θ(1,0)	θ(0,1)	θ(1,1)	θrem(1,1)
I	2	0.05	0.25	0.4	0.4	0.6	0.486	0.526	0.809
II	2	0.10	0.05	0.15	0.3	0.1	0.106	0.103	0.077
	p	θ(0,0,0)	θ(1,0,0)	θ(0,1,0)	θ(0,0,1)	θ(1,1,0)	θ(1,0,1)	θ(0,1,1)	θ(1,1,1)
III	3	0.1	0.3	0.2	0.05	0.4	0.4	0.2	0.9
		Additive	Add odds	Disj	Mult	Additive	Add odds	Disj	Mult
		θrem(1,1,0)				θrem(1,0,1)
		0.400	0.362	0.378	0.491	0.250	0.270	0.261	0.169
		θrem(0,1,1)				θrem(1,1,1)
		0.150	0.161	0.156	0.106	0.350	0.337	0.343	0.314

Note: For each model, the risks θ(x) are displayed for all 2p exposure profiles x, and the predicted risks θrem(x)=θrem(x; J,M) for exposure profiles with at least two factors turned on. The latter are based on four different models M of no interaction among all co-factors (J={1,…,p}).

Table 3 displays estimated coverage probabilities of confidence intervals for the odds ratio based attributable fraction APOR(x; J,M) of interaction. Each coverage probability is estimated from a number of randomly drawn data sets with N/2 cases and controls, generated from the two saturated models I and II with p=2 co-factors, and the model III with p=3. We use two models M of no interaction, additive odds and multiplicative, and three types of confidence intervals; the delta method (29), the logit delta method (32) and the bootstrap method (48). It is seen that the accuracy of all three confidence intervals increase with N, and overall the bootstrap method has the best performance, although the simpler logit delta method works well for a wide range of models. Appendix F provides more details about the simulation study.

Table 3:

Estimates πˆ(1−α) of coverage probabilities for confidence intervals of odds ratio based APs of interaction.

Saturated model	APOR(x; J,M)				Delta		Logit delta		Bootstrap BCa
	x	M	Value	N	πˆ(0.95)	πˆ(0.99)	πˆ(0.95)	πˆ(0.99)	πˆ(0.95)	πˆ(0.99)
I	(1,1)	Add odds	–0.296	500	0.906	0.959	0.932	0.980	0.956	0.991
				1,000	0.944	0.981	0.954	0.988	0.951	0.990
				10,000	0.950	0.990	0.950	0.990	0.948	0.989
I	(1,1)	Mult	–0.842	500	0.917	0.956	0.954	0.993	0.957	0.993
				1,000	0.930	0.967	0.953	0.992	0.950	0.989
				10,000	0.949	0.988	0.951	0.990	0.951	0.989
II	(1,1)	Add odds	0.725	500	0.938	0.975	0.978	0.998	0.952	0.989
				1,000	0.951	0.988	0.967	0.994	0.949	0.988
				10,000	0.950	0.990	0.952	0.991	0.950	0.989
II	(1,1)	Mult	0.805	500	0.922	0.960	0.955	0.993	0.953	0.991
				1,000	0.934	0.972	0.952	0.991	0.950	0.989
				10,000	0.948	0.988	0.951	0.990	0.952	0.991
III	(1,1,0)	Add odds	0.149	1,000	0.901	0.960	0.923	0.977	0.957	0.992
				10,000	0.949	0.987	0.952	0.989	0.949	0.989
III	(1,1,0)	Mult	–0.309	1,000	0.876	0.934	0.922	0.975	0.958	0.991
				10,000	0.948	0.986	0.954	0.991	0.950	0.990
III	(1,1,1)	Add odds	0.943	1,000	0.916	0.957	0.958	0.993	0.959	0.993
				10,000	0.947	0.987	0.950	0.990	0.949	0.989
III	(1,1,1)	Mult	0.949	1,000	0.867	0.909	0.957	0.995	0.959	0.991
				10,000	0.939	0.976	0.950	0.991	0.947	0.990

Note: APOR(x; J,M) in eq. (40) of Appendix A.3 quantifies interaction among all co-factors J={1,…,p} for exposure x, with data generated from the three saturated models of Table 2, with Additive odds or Multiplicative models of no interaction. The confidence intervals with nominal coverage 1−α use the delta method (29), the logit delta method (32) or the bias corrected and accelerated bootstrap method (48) with B=1000 resamples. The delta methods (bootstrap) employ 1,00,000 (10,000) retrospectively simulated data sets from the saturated model, with N/2 cases and controls.

7 A real data set

MS is a complex and inflammatory disease causing damage to the central nervous system. A number of family studies indicate a genetic component of the disease, with a major contribution from variants at the human leukocyte antigen (HLA) complex. It is well known that presence of allele 15 of the HLA-DRB1 gene is a risk factor, whereas allele 02 of the HLA-A gene has a protective effect (Cree 2014). Several data sets reveal that environmental factors, in particular smoking, impact the aetiology of the disease as well. A recent paper of Hedström et al. (2011) demonstrated pairwise positive additive interaction between these three factors for a Scandinavian case-control data set with 843 cases and 1209 controls. Here we will redo parts of the analysis, using our novel measures APOR of joint effects and interaction. As in Hedström et al. (2011), we use a binary coding xDRB1∗15=1 for an individual with genotype 15/X or 15/15, xA∗02=1 if the genotype is Z/Z, and xsmoking=1 for a current smoker, with X and Z referring to any allele different from HLA-DRB1*15 and HLA-A*02 respectively. For simplicity, we will not adjust for the confounding effect of age, gender, residential area and ancestry, as in Hedström et al. (2011). Table 4 summarizes results for different models, three of which have one single factor, one with both genetic factors included, and one model with all three factors present.

Table 4:

Analysis of the Multiple Sclerosis data set of Hedström et al. (2011). This includes point estimates and confidence intervals of the odds ratio based attributable proportion APOR for joint effects and interaction, and point estimates of the odds ratio OR before and odds ratio ORrem after the joint effect or interaction has been removed.

Factors	x	J	M	APˆOR	ORˆ	ORˆrem	ILd	IBCa
DRB1*15	1	{1}		0.711	3.465	1.000	(0.654,0.761)	(0.652,0.760)
A*02	1	{1}		0.387	1.630	1.000	(0.273,0.490)	(0.267,0.486)
smoking	1	{1}		0.366	1.578	1.000	(0.232,0.487)	(0.224,0.479)
(DRB1*15,	(1,0)	{1}		0.718	3.542	1.000	(0.633,0.785)	(0.630,0.784)
A*02)	(0,1)	{2}		0.396	1.653	1.000	(0.234,0.535)	(0.222,0.530)
	(1,1)	{1}		0.704	5.576	1.653	(0.618,0.772)	(0.614,0.771)
	(1,1)	{2}		0.365	5.576	3.542	(0.178,0.526)	(0.158,0.516)
	(1,1)	{1,2}		0.821	5.576	1.000	(0.767,0.863)	(0.766,0.863)
	(1,1)	{1,2}	Add odds	0.248	5.576	4.196	(0.038,0.436)	(0.015,0.428)
	(1,1)	{1,2}	Mult	–0.048	5.576	5.856	(–0.383,0.299)	(–0.347,0.273)
(DRB1*15,	(1,0,0)	{1}		0.728	3.678	1.000	(0.632,0.802)	(0.628,0.801)
A*02,	(0,1,0)	{2}		0.351	1.542	1.000	(0.149,0.525)	(0.131,0.517)
smoking)	(1,1,0)	{1}		0.678	4.782	1.542	(0.570,0.763)	(0.565,0.760)
	(1,1,0)	{2}		0.231	4.782	3.678	(–0.023,0.457)	(–0.065,0.436)
	(1,1,0)	{1,2}		0.791	4.782	1.000	(0.718,0.847)	(0.717,0.846)
	(1,1,0)	{1,2}	Add odds	0.118	4.782	4.220	(–0.158,0.376)	(–0.169,0.362)
	(1,1,0)	{1,2}	Mult	–0.157	4.782	5.670	(–0.486,0.212)	(–0.457,0.232)
	(1,1,1)	{1}		0.803	13.340	2.635	(0.659,0.891)	(0.648,0.891)
	(1,1,1)	{2}		0.691	13.340	4.148	(0.457,0.835)	(0.429,0.833)
	(1,1,1)	{1,2}		0.900	13.340	1.335	(0.822,0.945)	(0.817,0.946)
	(1,1,1)	{1,2}	Add odds	0.593	13.340	5.447	(0.309,0.780)	(0.271,0.781)
	(1,1,1)	{1,2}	Mult	0.389	13.340	8.183	(–0.146,0.748)	(–0.255,0.724)
	(1,1,1)	{1,2,3}		0.925	13.340	1.000	(0.876,0.956)	(0.875,0.956)
	(1,1,1)	{1,2,3}	Add odds	0.660	13.340	4.555	(0.442,0.804)	(0.425,0.804)
	(1,1,1)	{1,2,3}	Mult	0.435	13.340	7.572	(–0.084,0.768)	(–0.222,0.741)

Note: The two genetic variables DRB1*15 and A*02 encode genotypes of human leukocyte antigens in a binary way, as described in the text, and smoking is an indicator of current smoking. All three variables are binary risk factors, with level 1 indicating increased risk. Column 1 lists the p∈{1,2,3} co-factors that are included in the model, column 2 displays for each factor whether risk is present (1) or absent (0), and column 3 shows which co-factors (J∈{1,…,p}) that are included in APOR, see eq. (40) for a definition. The fourth column displays whether joint effects (there is no model M of no interaction) or interaction (M is specified) between the factors in J is estimated. The fifth column either shows point estimates of APOR(x; J) in eq. (40) for joint effects, or of APOR(x; J,M) in eq. (42) for interaction. The sixth column contains point estimates of OR(x) in eq. (19), and the seventh column either lists estimates of OR(xJ) for joint effects, or of APOR(x; J,M) in eq. (20) for interaction. The eighth column is the Logit delta based confidence interval in eq. (32), and the last column the bootstrapped BCa interval of eq. (48), based on B=20000 replicates. Both confidence intervals have a nominal coverage probability of 1−α=0.95.

Table 5:

Examples of quantities ξ in eq. (27).

ξ	a(ψ)	b(ψ)	Type of quantity
AP(x; J)	θ(x,ψ)	θ(xJ,ψ)	Joint effect
AP(J)	∑xθ(xJ,ψ)q(x)	∑xθ(xJ,ψ)q(x)
PAF(J)	∑xθ(x,ψ)q(x)	∑xθ(xJ,ψ)q(x)
APOR(x; J)	OR(x,ψ)	OR(xJ,ψ)
APOR(J)	∑xOR(xJ,ψ)q(x)	∑xOR(xJ,ψ)q(x)
AP(x; J,M)	θ(x,ψ)	θrem(x,ψ; J,M)	Interaction
AP(J,M)	∑xθ(xJ,ψ)q(x)	∑xθrem(xJ,ψ; J,M)q(x)
PAF(J,M)	∑xθ(x,ψ)q(x)	∑xθrem(x,ψ; J,M)q(x)
APOR(x; J,M)	OR(x,ψ)	ORrem(x,ψ; J,M)
APOR(J,M)	∑xOR(xJ,ψ)q(x)	∑xORrem(xJ,ψ;j,M)q(X) ,

Note: All risk based quantities involve the θ(x,ψ), given by either of eqs (22)–(24). The odds ratio based quantities involve OR(x,ψ) in eqs (25) or (26). The quantity θrem(x,ψ; J,M) in rows 6–8 refers to the risk function [17] when interaction among the variables in J has been removed according to model M, see Table 6. The quantity ORrem(x,ψ; J,M) in the last two rows refers to the odds ratio [20] when interaction among the variables in J has been removed, see Table 7. The exposure distribution in the study population is q(x).

Each one-factor model quantifies the marginal effects of a single covariate, without taking the confounding effect of the other two into account. DRB1*15, A*02 and smoking are risk factors, since the estimated odds ratios ORˆ(1) between MS and each factor equals 3.47, 1.63 and 1.58 when the other two factors are not accounted for, see also Table 1 of Hedström et al. (2011). The fraction of odds ratio attributable to each factor is

APˆOR(1; {1})=ORˆ(1)−1maxORˆ(1),1={0.711,forDRB1∗15,0.387,forA∗02,0.366,forsmoking,

and the accompanying confidence intervals are displayed in the first three rows of Table 4. This does not imply that more than 70 % of the odds ratio between MS and various risk factors is attributable to DRB1*15, only that this is the case when no other co-factors or sufficient causes than DRB1*15 are taken into account, cf. Lekman et al. (2014) for a more extensive discussion. We also notice from Table 4 that the estimated APOR for DRB1*15 changes very little, to 0.718 and 0.704, when the confounding effect of A*02 is controlled for at levels 0 and 1, and to 0.728, 0.678 and 0.803 when A*02 and smoking are controlled for jointly, at levels (0,0), (1,0) and (1,1).

Next we estimate the joint effect of two or more variables, and notice from Table 4 that the estimated proportion of odds ratio attributable to DRB1*15 and A*02, is 0.821, 0.791 and 0.900 when smoking is not accounted for, among non-smokers and among smokers respectively. The estimated APOR is even larger, 0.925, for all three variables jointly. This corresponds to an odds ratio that drops from 13.340 to 1 when the joint effect of all factors is removed.

The framed numbers of Table 4 are estimates of additive APOR of interaction between the two genetic factors DRB1*15 and A*02 under various smoking scenarios. It is seen that APOR varies substantially; 0.248, 0.118 and 0.593, depending on whether smoking is not accounted for, if only non-smokers are considered, or if only smokers are included. In the first case the odds ratio drops from 5.576 to 4.196 when interaction between the two factors is removed, in the second case it drops from 4.782 to 4.220, and in the third case from 13.340 to 5.447. The corresponding confidence intervals in Table 4 agree fairly well with those in Table 5 of Hedström et al. (2011). We are also able to estimate the proportion of additive interaction (0.660) among all three factors jointly, which corresponds to an odds ratio that decreases from 13.340 to 4.555 when interaction among the three variables is removed. The corresponding fractions of odds ratio attributable to multiplicative interaction between two or three factors are given in Table 4 as well, with accompanying confidence intervals. For instance, the proportion of multiplicative interaction between DRB1*15 and A*02 is –0.048 when smoking is not accounted for, corresponding to an odds ratio that decreases from 5.856 to 5.576 when interaction among the two variables is added. The multiplicative interaction is often much weaker than the additive, in particular for the two genetic risk factors. This implies that the joint effects estimates of APOR are less sensitive to the levels of the confounding variables, see formula [51] of Appendix E.

Suppose, finally, that the two labels of A*02 are switched (before looking at data). Then this factor becomes protective, with xA∗02=1 if the genotype is 2/Z or 2/2, and the estimated odds ratio between A*02 and MS changes from 1.63 to 1/1.63. Consequently, the estimated marginal attributable proportion

APˆOR(1; {1})=1/1.63−1max(1/1.63,1)=−0.386

for A*02 becomes negative, with a 95 % confidence interval (−0.490,−0.273) for the logit delta method, and (−0.487,−0.267) for the bootstrap method. The proportion of odds ratio due to additive interaction between DRB1*15 and A*02 becomes negative as well (–0.280) when smoking is not controlled for, with 95 % confidence intervals (−0.482,−0.051) and (−0.468,−0.022) for the logit delta and bootstrap methods.

8 Discussion

We studied the attributable proportion (AP) and its population counterpart (PAF) for models of disease that involve a number of binary predictor variables. The definition of AP and PAF was extended in several ways. First, we put AP and PAF for joint effects, marginal effects and interaction into a general framework. Second, we introduced a novel normalization of AP and PAF between –1 and 1, with a natural interpretation in terms of proportion of risk attributable to removing or adding joint or interaction effects. Third, we allowed for an arbitrary number of co-factors and subsets thereof for which exposure is of main concern, whereas the other confounders are controlled for. Fourth, our framework incorporates not only additive models of no interaction, but also various other models. Fifth, we outlined a general approach for point estimates and various types of confidence intervals for prospective or case-control studies.

Several extensions are possible. First, we assume that a null model M of no interaction is embedded into a larger saturated model Msat. For a logit parametrization of risk, the multiplicative model is a linear subspace of Msat, but other null models are more complicated curved exponential families (Lehmann and Casella 1998). An advantage of our approach is that no parameter estimates are needed for M, whereas those for Msat are straightforward; binomial proportions or output from standard logistic regression software. Apart from this we only need formulas for how M predicts risks or odds, and the derivatives of these predictions with respect to model parameters. Our approach requires data for all 2p exposure profiles in order to estimate the parameters of the saturated model. If such data is not available, it is better to use a smaller alternative model than Msat∖M, or to reduce p.

Second, our confidence intervals for the attributable proportion can be used to define Wald type goodness of fit tests, which reject no joint effects or interaction among the tested factors whenever the confidence intervals do not include 0. One may also employ a likelihood ratio test between the smaller null model and Msat∖M. But this approach is less feasible when the null model is a complex no interaction model M, with a maximum likelihood estimator that is possibly much more complicated than the estimate for Msat.

Third, there is sometimes no á priori knowledge of which co-factor levels to use as baseline. For instance, among a pair of SNPs it may not be known which one that increases disease risk. Then one may use the symmetrized version

(33)APsymm=|θ(1,1)+θ(0,0)−θ(0,1)+θ(1,0)|maxθ(1,1)+θ(0,0),θ(0,1)+θ(1,0)

to quantify departures from an additive model of no interaction, since it is invariant with respect to labeling of the two factors, and takes a value between 0 and 1. A value of 0 indicates no additive interaction, and positive values that additive interaction is present. Since the labeling of at least one risk factor is arbitrary, we cannot decide whether interaction departs negatively or positively from the additive model. The odds ratio version APORsymm replaces θ(x1,x2) by OR(x1,x2) in eq. (33). Because of the singularity at 0, it is not straightforward to produce confidence intervals for the risk or odds ratio based versions of APsymm though, and a Bayesian approach might be preferable.

Fourth, attributable fractions have been used for models where some or all of the p covariates are continuous, see Eide (2001) and references therein. Our new definitions eqs. (4), (5) and (8) of the attributable proportion and population attributable fraction apply to these settings as well. However, the parameter vector ψ becomes infinite dimensional, and a challenging topic for future research is to modify the inference procedure of Section 5 to this setting. See also Moore (2004) and Moore et al. (2006) for nonparametric machine learning and data mining approaches to quantify interaction for binary predictors.

Fifth, the synergy index is a measure of additive interaction of two risk factors x. It has been advocated by Skrondal (2003) as less sensitive to additional covariates associated to the disease but not to x=(x1,x2), see also Greenland (1993). A generalized version

S(x; J,M)=θ(x)−θ(xJ)θrem(x; J,M)−θ(xJ)=RR(x)−RR(xJ)RRrem(x; J,M)−RR(xJ)

of the synergy index could be defined for any no interaction model M and number of risk factors, or a version SOR(x; J,M) that uses odds ratios instead of relative risks. They predict increased risk or OR, with values exceeding 1, when interaction among the variables in J is positive or synergetic. Point estimates and confidence intervals for S(x; J,M) and SOR(x; J,M) are obtained as in Section 5. The latter assume asymptotic normality of Sˆ, or of log(Sˆ), if it is known á priori that S>0 (Rothman 1976b).

In summary, we argue our unified theory for AP and PAF of joint effects or interactions among an arbitrary number of factors contributes with significant knowledge to understand synergies of biological and environmental factors that influence health and disease.

Appendix

A General definitions of attributable proportions and its generalizations in terms of risk ratios and odds ratios

A.1 Marginal and joint effects

Consider a subset J⊂{1,…,p} of factors, and an exposure vector x=(x1,…,xp). The joint attributable proportion of risk in eq. (4) due to the factors in J, can be expressed in terms of the risk ratio [16], as

(34)AP(x; J)=RR(x)−RR(xJ)maxRR(x),RR(xJ),

where xJ is the vector for which exposure to all factors in J has been removed (see Section 3.2). This follows by first substituting θrem(x)=θ(xJ) into eq. (4), and then dividing the numerator and denominator of this equation by θ(0).

In order to define the joint attributable proportion of risk eq. [5] among the factors in J, we associate to each x another vector xJ=(x1J,…,xpJ), whose exposure to all factors i∈J has been added or turned on (xiJ=1), whereas all confounding factors i∉J remain at the levels specified by x (xiJ=xi). Let also X refer to the exposure vector of a randomly chosen individual. Then eq. (5) takes the form

(35)AP(J)=ERR(XJ)−ERR(XJ)maxERR(XJ),ERR(XJ),

with expectation referring to the average effect of the confounders. This follows by substituting θadd(x)=θ(xJ) into eq. (5), and then dividing the numerator and denominator of this equation by θ(0). The joint population attributable fraction [8] due to the factors in J, is similarly expressed as

(36)PAF(J)=ERR(X)−ERR(XJ)maxERR(X),ERR(xJ).

A.2 Interaction

The versions eq. (4) and (5) of the attributable fraction due to interaction among the variables in J can be expressed in terms of the risk ratios [16] and [18], as

(37)AP(x; J,M)=RR(x)−RRrem(x; J,M)maxRR(x),RRrem(x; J,M)

for a fixed exposure vector x, and as

(38)AP(J; M)= E[RR(XJ)]−ERRrem(XJ; J,M)max{E[RR(XJ)],E[RRrem(XJ; J,M)]}

when averaging over confounders. The population attributable fraction [8] due to interaction among the factors in J is similarly rewritten as

(39)PAF(J,M)=ERR(X)−ERRrem(x; J,M)maxERR(X),ERRrem(x; J,M).

A.3 Odds ratio based quantities

The odds ratio versions of attributable proportion use odds ratios [19] and [20] instead of relative risks in eqs (34)–(35) and eqs (37)–(38). By making such a substitution, we find that

(40)APOR(x; J)=OR(x)−OR(xJ)maxOR(x),OR(xJ),

(41)APOR(J)=EOR(XJ)−EOR(XJ)maxEOR(XJ),EOR(XJ),

(42)APOR(x; J,M)=OR(x)−ORrem(x; J,M)maxOR(x),ORrem(x; J,M)

and

(43)APOR(J; M)=EOR(XJ)−EORrem(XJ; J,M)maxEOR(xJ),EORrem(XJ; J,M).

B Delta based confidence interval

In this appendix we describe how to produce delta based confidence intervals for ξ in eq. (27). This quantity either corresponds to one of the five notions AP(x; J), AP(J), PAF(J), AP_OR(x; J), AP_OR(J) of attributable proportion or fraction for joint effects in eqs (34)–(36) and eqs (40)–(41), and the corresponding quantities AP(x; J,M), AP(J,M), PAF(J,M), AP_OR(x; J,M), AP_OR(J,M) for interaction in eqs (37)–(39) and eqs (42)–(43). For each of these quantities, ξ=ξ(ψ) involves the parameter vector ψ and the two functions a=a(ψ) and b=b(ψ), as shown in Table 5. The population based quantities AP(J), PAF(J), APOR(J) of joint effects, and AP(J; M), PAF(J; M), APOR(J; M) of interaction, involve the exposure distribution q as well. In this appendix it is assumed to be known, but in Appendix C we extend the analysis to situations where q is estimated from a prospective study.

Let ψˆ the the maximum likelihood estimator of ψ, and ξˆ=ξ(ψˆ) the accompanying plug in estimate of ξ. The two delta based confidence intervals (29) and (32) involve the standard error eq. (30) of ξˆ. In order to find the standard error we need an estimate Σˆ of the covariance matrix of ψˆ. For a linear parametrization of the risk function, eq. (31) describes how to compute Σˆ, and for a logit parametrization, it is available from standard logistic regression software. The other quantity Dˆ of the standard error is an estimate of D=∂ξ(ψ)/∂ψ, the derivative of ξ with respect to the parameter vector ψ. In order to compute D we introduce A=da(ψ)/dψ and B=db(ψ)/dψ). Differentiating eq. (27) with respect to ψ, we get

(44)D(ψ)=A(ψ)−B(ψ)max[a(ψ),b(ψ)]−1{a(ψ)>b(ψ)}[a(ψ)−b(ψ)]A(ψ)a(ψ)2−1{a(ψ)< (ψ)}[a(ψ)−b(ψ)]B(ψ)b(ψ)2,

with 1Ω the indicator function of Ω, which equals 1 when Ω holds and 0 otherwise. Although there is a point of discontinuity of the derivative of the denominator of eq. (27), D is still differentiable, since the numerator vanishes at such points. We estimate eq. (44) by

Dˆ=D(ψˆ),

and insert this estimate into eq. (30) in order to obtain the standard error of ξˆ.

The two parametrizations ψ1=ψx1; x∈{0,1}p and ψ2=ψx2; x∈{0,1}p of the saturated logistic regression model in eqs (23) and (24), only differ by a linear transformation

ψx2={ψ01,x=0,∑0<y≤xψy1,x≠0,

or equivalently

(45)ψ2=ψ1C,

for a square matrix C of order 2p. This implies ξ1(ψ1)=ξ2(ψ1C), ψˆ2=ψˆ1C, Σˆ2=CTΣˆ1C and Dˆ1=Dˆ2CT. Therefore the two parametrizations give identical point estimates ξˆ=ξi(ψˆi), standard errors σˆ=(DˆiΣˆiDˆiT)1/2, delta based and logit delta based confidence intervals. It can be shown, by a Taylor expansion argument, that the linear parametrization (22) gives identical standard errors and (logit) delta based confidence intervals as well.

Example 1 (Odds ratio based AP of interaction between two predictors.). We illustrate the two logit parametrizations for p=2 factors, where ξ=APOR(1,1); J,M quantifies interaction between two predictors J={1,2} by means of a no interaction model M. We suppress J in the notation and notice from Table 5 that eq. (27) holds with a(ψ)=OR(1,1),ψ and b(ψ)=ORrem(1,1),ψ; M. Since none of the two logit parametrizations include the baseline risk ψ00, we write their parameter vectors as ψi=(ψ01i,ψ10i,ψ11i) for i=1,2. It follows from eqs (25)–(26) that these reduced parameter vectors satisfy the same type of linear relationship [45] as the non-reduced vectors, with

C=101011001,

so that ψ011=ψ012=ψ01, ψ101=ψ102=ψ10, and ψ112=ψ01+ψ10+ψ111. From eqs (25)–(26) we find that

a(ψ1)=exp(ψ011+ψ101+ψ111),a(ψ2)=exp(ψ112),

are functions of the chosen parametrization, but not of the no interaction model M. Taking the derivative with respect to ψ, it follows that

A(ψ1)=a(ψ1)(1,1,1),A(ψ2)=a(ψ2)(0,0,1).

By Table 7, b(ψ) only depends on first two coordinates ψ01 and ψ10 of the parameter vector, and hence is the same for both parametrizations. We have that

b(ψ1)=b(ψ2)={exp(ψ01)+exp(ψ10)−1,M=Additiveodds,exp(ψ01+ψ10),M=Multiplicative.

Differentiating with respect to ψ we obtain

B(ψ1)=B(ψ2)={exp(ψ01),exp(ψ10),0,M=Additiveodds,b(ψ2)(1,1,0),M=Multiplicative.

C Delta based confidence interval with unknown exposure distribution

In the definition of ξˆ for the population based quantities AP, PAF and APOR, it is assumed that the regression parameters ψ are unknown, whereas the exposure distribution q is known. Since this is rarely the case, q has to be estimated. Fortunately it is possible to enlarge the parameter vector ψ to include q as well. We will outline how the standard error σˆ for a prospective study can be computed, so that formulas [29] and [32] still apply for the two delta based confidence intervals.

Suppose the exposure distribution q={q(x); x∈{0,1}p} is unknown, represented as a probability vector with 2p−1 free parameters. This gives an enlarged parameters vector

ψˉ=(ψ,q)

that includes effect as well as exposure distribution quantities. Since AP, PAF and APOR all depend on the exposure distribution, we write these quantities as

(46)ξ=ξ(ψˉ)=a(ψˉ)−b(ψˉ)maxa(ψˉ),b(ψˉ)

instead of eq. (27). For a prospective study with data {(xi,Yi); i=1,…,N}, we estimate the exposure distribution as

qˆ(x)=NxN,

for all binary vectors x of length p, with Nx the number of subjects i with xi=x. Let ψ¯^=(ψ^,q^) be the vector of estimated effect and exposure distribution parameters, with qˆ={qˆ(x); x∈{0,1}p}. Insertion into eq. (46) gives a plug in estimate

ξ^=ξ(ψ¯^)=a(ψ¯^)−b(ψ¯^)max[a(ψ¯^),b(ψ¯^)]

of ξ. The two delta based confidence intervals eqs (29) and (32) still apply, with a standard error computed as

(47)σ^=(D¯^Σ¯^D¯^T)1/2

instead of eq. (30). Here D¯^=D¯(ψ¯^) is an estimate of Dˉ(ψˉ)=dξ(ψˉ)/dψˉ, computed similarly as in eq. (44), and

Σ¯^= (Σ^0 0 Σ^q)

is an estimate of the covariance matrix Σˉ of the enlarged parameter vector, using the fact that the effect and exposure parameters are asymptotically orthogonal. Since Nqˆ has a multinomial distribution, the estimated covariance matrix of the exposure parameters qˆ, has elements

Σˆq,xy={qˆ(x)1−qˆ(x)/N,x=y,−qˆ(x)qˆ(y)/N,x≠y.

D Bootstrap BCa confidence interval

We use nonparametric bootstrap (Efron and Tibshirani 1993) and define for each resampled data set (x1∗,Y1∗),…,(xN∗,YN∗) a corresponding estimate ξ^* =ξ(ψ^*), where

ψˆ∗=ψˆ{xi∗,Yi∗}.

The resampled data points (xi∗,Yi∗) are drawn randomly with replacement from the original data set {(xi,Yi)}i=1N for a prospective study, and separately with replacement within controls and cases for a retrospective study, keeping the total number of controls and cases fixed in each resample. The bias-corrected accelerated percentile method of Efron (1987) is based on B independent replicates ξˆ1∗,…,ξˆB∗ of ξˆ∗, and a confidence interval

(48)I= ξ^ *([Bπ1]),ξ^ *([Bπ2])

with asymptotic coverage 1−α, where ξˆ(1)∗≤…≤ξˆ(B)∗ are the ordered ξˆi∗,

π1=Φzˆ+zˆ+zα/21−aˆ(zˆ+zα/2),π2=Φzˆ+zˆ+z1−α/21−aˆ(zˆ+z1−α/2),

Φ is the cumulative distribution function of a standard normal,

zˆ=z∑i=1B1ξˆi∗≤ξˆ}/B

a bias correction term, and

a^= ∑i=1N(ξ^(⋅)−ξ^(−i))36{∑i=1N(ξ^(⋅)−ξ^(−i))2}3/2,

an acceleration term that is a rescaled estimate of the skewness of the influence function of ξˆ (Hampel et al. 1986), with ξ^(−i) the leave one out estimate of ξ for the original data set, when one observation (xi,Yi) is removed, and ξ^(⋅)=∑i=1Nξ^(−i)/N. Notice that π1=α/2 and π2=1−α/2 when zˆ=aˆ=0.

E Risks and odds ratios when interaction is removed

Suppose a specific model of no interaction M is chosen, and that we want to assess the amount of interaction among the co-factors in J. For a prospective study we use either of the three quantities AP(x; J,M), AP(J,M) or PAF(J,M), and it follows from Table 5 that this requires expressions for θrem(x,ψ; J,M), the disease risk when interaction among the variables in J is absent. For a retrospective study we use either of the two quantities APOR(x; J,M) and APOR(J,M), which requires analogous formulas for the odds ratio ORrem(x,ψ; J,M) when there is no interaction among the variables in J. In this appendix we provide the required expressions for θrem(x,ψ; J,M) and ORrem(x,ψ; J,M).

E.1 Prospective studies

Table 6 provides expressions for θrem(x,ψ; J,M) for some of the generalized linear models M of no interaction in eq. (17). Formulas are given for the linear and the two logistic parametrizations of ψ in eqs (22)–(24). Due to presence of the baseline risk parameter ψ0, several expressions are quite complicated. For this reason, it is important to adapt parametrization to the assumed model of no interaction, using the linear parametrization for an additive model of no interaction, and any of the two logit parametrizations for a multiplicative model of no interaction. More generally, it is appropriate parametrize ψ using the same link function transformation g of risk, as in eq. (17).

Table 6:

Disease risk θrem(x,ψ; J,M) for exposure profile x when interaction among the factors in J has been removed.

p	M	x	J	ψ	θrem(x,ψ; J,M)
2	Mult	(1,1)	{1,2}	lin	ψ011−ψ01⋅ψ101−ψ10⋅1−ψ00ψ00/1+ψ011−ψ01⋅ψ101−ψ10⋅1−ψ00ψ00
				logit	exp(ψ00+ψ01+ψ10)1+exp(ψ00+ψ01+ψ10)
	Add	(1,1)		lin	ψ01+ψ10−ψ00
				logit	exp(ψ00+ψ01)1+exp(ψ00+ψ01)+exp(ψ00+ψ10)1+exp(ψ00+ψ10)−exp(ψ00)1+exp(ψ00)
	Disj	(1,1)		lin	1−(1−ψ01)(1−ψ10)/(1−ψ00)
				logit	1−exp−log(1+eψ00+ψ01)−log(1+eψ00+ψ10)+log(1+eψ00)
	Mrisk	(1,1)		lin	ψ01ψ10/ψ00
				logit	exp(ψ00+ψ01)1+exp(ψ00+ψ01)⋅exp(ψ00+ψ10)1+exp(ψ00+ψ10)⋅1+exp(ψ00)exp(ψ00)
3	Mult	(1,1,0)	{1,2,3}	lin	ψ0101−ψ010⋅ψ1001−ψ100⋅1−ψ000ψ000/1+ψ0101−ψ010⋅ψ1001−ψ100⋅1−ψ000ψ000
				logit	exp(ψ000+ψ010+ψ100)1+exp(ψ000+ψ010+ψ100)
		(1,0,1)		logit	exp(ψ000+ψ001+ψ100)1+exp(ψ000+ψ001+ψ100)
		(0,1,1)		logit	exp(ψ000+ψ001+ψ010)1+exp(ψ000+ψ001+ψ010)
		(1,1,1)		logit	exp(ψ000+ψ001+ψ010+ψ100)1+exp(ψ000+ψ001+ψ010+ψ100)
		(1,1,0)	{1,2}	logit	exp(ψ000+ψ010+ψ100)1+exp(ψ000+ψ010+ψ100)
		(1,1,1)		[28]	exp(ψ000+ψ001+ψ010+ψ100+ψ011+ψ101)1+exp(ψ000+ψ001+ψ010+ψ100+ψ011+ψ101)
				[29]	exp(ψ000+ψ011+ψ101−ψ001)1+exp(ψ000+ψ011+ψ101−ψ001)
	Add	(1,1,0)	{1,2,3}	lin	ψ010+ψ100−ψ000
				logit	exp(ψ000+ψ010)1+exp(ψ000+ψ010)+exp(ψ000+ψ100)1+exp(ψ000+ψ100)−exp(ψ000)1+exp(ψ000)
		(1,0,1)		lin	ψ001+ψ100−ψ000
				logit	exp(ψ000+ψ001)1+exp(ψ00+ψ001)+exp(ψ000+ψ100)1+exp(ψ000+ψ100)−exp(ψ000)1+exp(ψ000)
		(0,1,1)		lin	ψ001+ψ010−ψ000
				logit	exp(ψ000+ψ001)1+exp(ψ00+ψ001)+exp(ψ000+ψ010)1+exp(ψ000+ψ010)−exp(ψ000)1+exp(ψ000)
		(1,1,1)		lin	ψ001+ψ010+ψ100−2ψ000
				logit	exp(ψ000+ψ001)1+exp(ψ000+ψ001)+exp(ψ000+ψ010)1+exp(ψ000+ψ010)+exp(ψ000+ψ100)1+exp(ψ000+ψ100)−2exp(ψ000)1+exp(ψ000)
		(1,1,0)	{1,2}	lin	ψ010+ψ100−ψ000
				logit	exp(ψ000+ψ010)1+exp(ψ000+ψ010)+exp(ψ000+ψ100)1+exp(ψ000+ψ100)−exp(ψ000)1+exp(ψ000)
		(1,1,1)		lin	ψ011+ψ101−ψ001
				[28]	exp(ψ000+ψ001+ψ010+ψ011)1+exp(ψ000+ψ001+ψ010+ψ011)+exp(ψ000+ψ001+ψ100+ψ101)1+exp(ψ000+ψ001+ψ100+ψ101)−exp(ψ000+ψ001)1+exp(ψ000+ψ001)
				[29]	exp(ψ000+ψ011)1+exp(ψ000+ψ011)+exp(ψ000+ψ101)1+exp(ψ000+ψ101)−exp(ψ000+ψ001)1+exp(ψ000+ψ001)

Note: Each disease risk θrem(x,ψ; J,M) is based on an assumed model of no interaction M among a subset J⊂{1,…,p} of the p co-factors, for a given exposure profile x=(x1,…,xp). The vector ψ=(ψx) contains the 2p parameters, either on the linear [22] or logit risk scale [23]–[24]. For each model, the table includes those x for which at least two factors in J are turned on. An entry ’logit’ in the parameter column means that both eqs (23) and (24) apply. All models M are from Table 1, with Mrisk=Multiplicative risk, Disj=Disjunctive, Add=Additive and Mult=Multiplicative.

Table 7:

Odds ratios ORrem(x,ψ; J,M) for exposure profile x when interaction among the factors in J is removed.

p	M	x	J	ψ	ORrem(x,ψ; J,M)
2	Multiplicative	(1,1)	{1,2}	logit	exp(ψ01+ψ10)
	Additive odds	(1,1)		logit	exp(ψ01)+exp(ψ10)−1
3	Multiplicative	(1,1,0)	{1,2,3}	logit	exp(ψ010+ψ100)
		(1,0,1)		logit	exp(ψ001+ψ100)
		(0,1,1)		logit	exp(ψ001+ψ010)
		(1,1,1)		logit	exp(ψ001+ψ010+ψ100)
		(1,1,0)	{1,2}	logit	exp(ψ010+ψ100)
		(1,1,1)		[30]	exp(ψ001+ψ010+ψ100+ψ011+ψ101)
		(1,1,1)		[31]	exp(ψ011+ψ101−ψ001)
	Additive odds	(1,1,0)	{1,2,3}	logit	exp(ψ010)+exp(ψ100)−1
		(1,0,1)		logit	exp(ψ001)+exp(ψ100)−1
		(0,1,1)		logit	exp(ψ001)+exp(ψ010)−1
		(1,1,1)		logit	exp(ψ001)+exp(ψ010)+exp(ψ100)−2
		(1,1,0)	{1,2}	logit	exp(ψ010)+exp(ψ100)−1
		(1,1,1)		[30]	exp(ψ010+ψ001+ψ011)+exp(ψ100+ψ001+ψ101)−exp(ψ001)
		(1,1,1)		[31]	exp(ψ011)+exp(ψ101)−exp(ψ001)

Note: The odds ratios ORrem(x,ψ; J,M) are based on an assumed model M of no interaction among the factors in J⊂{1,…,p}, for a given exposure profile x, where ψ=(ψx; x≠0) contains the 2p−1 non-baseline parameters of the saturated logistic regression model [25] or [26]. An entry ’logit’ for ψ means that both parametrizations (25) or (26) apply. For each model, the table includes those x for which at least two factors in J are turned on.

But our framework includes other ways of choosing the model of no interaction. When no variables are confounders (J={1,…,p}), two possible models are

(49)θremx,ψ; {1,…,n},M={ψ0+ψ11{x≠0},Mdominant,ψ0+ψ11{x=1},Mrecessive.

They both involve two parameters ψ0 and ψ1 of the linear saturated model [22]; the disease risk for an individual exposed to no (x=0=(0,…,0)) or all (x=1=(1,…,1)) factors. It is also possible to divide the population into strata with different mechanisms of interaction (Westerlind et al. 2014), viewed as a Bayesian network (Koski and Noble 2009). Other models of no interaction impose monotonicity constraints on θrem(x) (Traskin et al. 2013).

E.2 Retrospective studies

Table 7 provides ORrem(x,ψ; J,M) expressions for two models M of no interaction and either of the two logit parametrizations (25)–(26) of ψ. Since the baseline parameter is absent, many of the formulas in Table 7 are simpler than those of Table 6.

We can use Table 7 (or more generally formula (17)) to deduce that APOR(x; J) of joint effects is independent of the levels of the confounding variables when there is multiplicative non-interaction among the variables in J. Indeed, assuming an odds ratio parametrization (25), we notice that

(50)OR(x,ψ)=ORrem(x,ψ; J,Multiplicative)=expψxJ+∑i∈JxiψxJ+ei.

Inserting eq. (50) into eq. (40), we find that

(51)APOR(x; J)=O(x,ψ)/O(xJ,ψ)−1maxO(x,ψ)/O(xJ,ψ),1=exp(∑i∈JxiψxJ+ei)−1maxexp(∑i∈JxiψxJ+ei),1,

independently of the levels xi of the confounding variables i≠J.

F Details from simulation study

The retrospective simulation study of Section 6 (and Table 3) contains a number of data sets with N/2 cases and N/2 controls, for three models I, II and III of no interaction. For each simulated data set, exposure profiles were generated independently from

(52)P(x|Y=y)={q(x)1−θ(x)/1−E(θ(X)),y=0,q(x)θ(x)/E(θ(X)),y=1,

and some conveniently chosen prior distribution q. In most runs of the present simulation study, the prior eq. (52) was uniform q(x)≡2−p.

If N is too small, estimates of ψ may be ill-conditioned for data sets that lack some exposure profile x among cases (Nx1=0) or among controls (Nx0=0). This may happen if either θ(x) or 1−θ(x) is small. Since the saturated model is used for inference, N≫2p is required. For simulated data sets, the prior distribution q in eq. (52) can be adjusted in order to minimize the required N. This is particularly important for the bootstrap confidence interval method of Appendix D, since each simulated data set must be resampled B times, and some of the resampled estimates ψ^* may be even more ill-conditioned than ψˆ.

In order to avoid ill-conditioned estimates, we employed non-uniform priors in Table 3 for the following runs: For Model I we used q(0,0)=4/7, q(1,0)=q(0,1)=q(1,1)=1/7 for all three methods when N=500, and for the bootstrap also when N=1000. For Model II we chose q(0,0)=2/7, q(1,0)=3/7, q(0,1)=q(1,1)=1/7 for all three methods when N=500, and for the bootstrap also when N=1000. For Model III we defined q(x) to be inversely proportional to minθ(x),1−θ(x) for the bootstrap when N=1000.

Acknowledgements

Support of Ola Hössjer from the Swedish Research Council, contract 621-2013-4633 is acknowledged, as well as valuable comments from an associate editor and a reviewer.

References

Agresti, A. (2013). Categorical Data Analysis. 3rd edition. Hoboken, NJ: Wiley.Search in Google Scholar

Andersson, T., Alfredsson, L., Källberg, H., Zdravkovic, S. and Ahlbom, A. (2005). Calculating measures of biological interaction. European Journal of Epidemiology, 20:575–579.10.1007/s10654-005-7835-xSearch in Google Scholar PubMed

Aschengrau, A. and Seage, G. R. (2003). Essentials of Epidemiology in Public Health. Epidemiology Series. Burlington, MA: Jones & Bartlett Publishers.Search in Google Scholar

Assman, S. F., Hosmer, D. W., Lemeshow, S. and Mundt, K. A. (1996). Confidence intervals for measures of interaction. Epidemiology, 7(3):286–290.10.1097/00001648-199605000-00012Search in Google Scholar PubMed

Benichou, J. (2001). A review of adjusted estimators of attributable risk. Statistical Methods in Medical Research, 10:195–216.10.1177/096228020101000303Search in Google Scholar PubMed

Bookman, E. B., McAllister, K., Gillanders, E., Wanke, K., Balshaw, D., Rutter, J., Reedy, J., Shaughnessy, D., Agurs-Collins, T., Paltoo, D., Atienza, A., Bierut, L., Kraft, P., Fallin, M. D., Perera, F., Turkheimer, E., Boardman, J., Marazita, M. L., Rappaport, S. M., Boerwinkle, E., Suomi, S. J., Caporaso, N. E., Hertz-Picciotto, I., Jacobson, K. C., Lowe, W. L., Goldman, L. R., Duggal, P., Gunnar, M. R., Manolio, T. A., Green, E. D., Olster, D. H., Birnbaum, L. S., for the NIH GxE Interplay Workshop Participants. (2011). Gene-environment interplay in common complex diseases: Forging an integrative model – recommendations from an NIH workshop. Genetic Epidemiology, 35:217–225.10.1002/gepi.20571Search in Google Scholar PubMed PubMed Central

Cordell, H. J. (2002). Epistasis: What it means, what it doesn’t mean, and statistical methods to detect it in humans. Human Molecular Genetics, 11:2463–2468.10.1093/hmg/11.20.2463Search in Google Scholar PubMed

Cornfield, J. (1951). A method of estimating comparative rates from clinical data. Application to cancer of the lung, breast and cervix. Journal of the National Cancer Institute, 11:1269–1275.Search in Google Scholar

Cree, B. (2014). Multiple sclerosis genetics. Handbook of Clinical Neurology, 122:193–209.10.1016/B978-0-444-52001-2.00009-1Search in Google Scholar PubMed

Deubner, D. C., Wilkinson, W. E., Helms, M. J., Tyroler, H. A. and Hames, C. G. (1980). Logistic model estimation of death attributable to risk factors for cardiovascular disease in Evans County, Georgia. American Journal of Epidemiology, 112:135–143.10.1093/oxfordjournals.aje.a112963Search in Google Scholar PubMed

Efron, B. (1987). Better bootstrap confidence intervals (with discussion). Journal of the American Statistical Association, 82:171–200.10.1080/01621459.1987.10478410Search in Google Scholar

Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Monographs on Statistics and Applied Probability 57. Boc Raton, Florida: Chapman and Hall/CRC.10.1007/978-1-4899-4541-9Search in Google Scholar

Eichler, E. E., Flint, J., Gibson, G., Kong, A., Leal S. M., Moore, J. H. and Nadeau, J. H. (2010). Missing heritability and strategies for finding the underlying causes of complex disease. Nature Reviews Genetics, 11(6):446–450.10.1038/nrg2809Search in Google Scholar PubMed PubMed Central

Eide, G. E. (2001). Attributable fractions: Fundamental concepts and their visualization. Statistical Methods in Medical Research, 10:159–193.10.1177/096228020101000302Search in Google Scholar PubMed

Greenland, S. (1983). Tests for interaction in epidemiological studies: A review and a study of power. Statistics in Medicine, 2(2):243–251.10.1002/sim.4780020219Search in Google Scholar PubMed

Greenland, S. (1993). Additive risk versus additive relative risk models. Epidemiology, 4(1):32–36.10.1097/00001648-199301000-00007Search in Google Scholar PubMed

Greenland, S. and Robins, J. (1988). Conceptual problems in the definition and interpretation of attributable fractions. American Journal of Epidemiology, 128:1185–1197.10.1093/oxfordjournals.aje.a115073Search in Google Scholar PubMed

Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. and Stahel, W. A. (1986). Robust Statistics: The Approach Based on Influence Functions. New York: Wiley.Search in Google Scholar

Hedström, A. K., Sundqvist, E., Bäärnhielm, M., Nordin, N., Hillert, J., Kockum, I., Olsson, T. and Alfredsson, L. (2011). Smoking and two human leukocyte antigen genes interact to increase the risk for multiple sclerosis. Brain, 134:653–664.10.1093/brain/awq371Search in Google Scholar PubMed

Hosmer, D. W. and Lemeshow, S. (1992). Confidence interval estimation of interaction. Epidemiology, 3(5):452–456.10.1097/00001648-199209000-00012Search in Google Scholar PubMed

Kalilani, L. and Atashili, J. (2006). Measuring additive interaction using odds ratios. Epidemiologic Perspectives & Innovations, 3:5.10.1186/1742-5573-3-5Search in Google Scholar PubMed PubMed Central

Knol, M. J., VanderWeele, T. J., Groenwold, R. H. H., Klungel, O. H., Rovers, M. M. and Grobbee, D. E. (2011). Estimating measures of interaction on an additive scale for preventive exposures. European Journal of Epidemiology, 26:433–438.10.1007/s10654-011-9554-9Search in Google Scholar PubMed PubMed Central

Koski, T. and Noble, J. (2009). Bayesian Networks: An Introduction. Chichester, United Kingdom: John Wiley and Sons.10.1002/9780470684023Search in Google Scholar

Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation. New York: Springer.Search in Google Scholar

Lekman, M, Hössjer O., Andrews, P., Källberg, H., Uvehag, D., Charney, D., Manji, H., Rush, J. A., McMahon, F. J., Moore, J. H. and Kockum, I. (2014). The genetic interacting landscape of 63 candidate genes in major depressive disorder: An explorative study. Biodata Mining, 7:19.10.1186/1756-0381-7-19Search in Google Scholar PubMed PubMed Central

Levin, M. L. (1953). The occurrence of lung cancer in man. Acta – Unio Internationalis Contra Cancrum, 9(3):531–541.Search in Google Scholar PubMed

Maher, B. (2008). The case of the missing heritability. Nature, 456(6):18–21.10.1038/456018aSearch in Google Scholar PubMed

McCullagh, P. and Nelder, J. (1989). Generalized Linear Models. 2nd edition. London: Chapman & Hall.10.1007/978-1-4899-3242-6Search in Google Scholar

Miettinen, O. S. (1976). Estimability and estimation in case-referent studies. American Journal of Epidemiology, 103:226–235.10.1093/oxfordjournals.aje.a112220Search in Google Scholar PubMed

Moore, J. H. (2004). Computational analysis of gene-gene interactions using multifactor dimensionality reduction. Expert Review of Molecular Diagnostics, 4:795–803.10.1586/14737159.4.6.795Search in Google Scholar PubMed

Moore, J. H., Gilbert, J. C., Tsai, C. -T., Chiang, F. -T., Holden, T., Barney, N. and White, B. C. (2006). A flexible computational framework for detecting, characterizing and detecting statistical patterns of epistasis in genetic studies of human disease susceptibility. Journal of Theoretical Biology, 241:252–261.10.1016/j.jtbi.2005.11.036Search in Google Scholar PubMed

Northridge, M. E. (1995). Public health methods- attributable risk as a link between causality and public health action. American Journal of Public Health, 85(9):1202–1203.10.2105/AJPH.85.9.1202Search in Google Scholar PubMed

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.Search in Google Scholar

Phillips, P. C. (2008). Epistasis – the essential role of genetic interactions in the structure of evolution of genetic systems. Nature Reviews Genetics, 9:855–867.10.1038/nrg2452Search in Google Scholar PubMed PubMed Central

Prentice, R. L. and Pyke, R. (1979). Logistic disease incidence models and case-control studies. Biometrika, 66(3):403–411.10.1093/biomet/66.3.403Search in Google Scholar

Richardson, D. B. and Kaufman, J. S. (2009). Estimation of the relative excess risk due to interaction and associated confidence intervals. American Journal of Epidemiology, 169:756–760.10.1093/aje/kwn411Search in Google Scholar PubMed PubMed Central

Rothman, K. J. (1976a). Causes. American Journal of Epidemiology, 104:587–592.10.1093/oxfordjournals.aje.a112335Search in Google Scholar PubMed

Rothman, K. J. (1976b). The estimation of synergy or antagonism. American Journal of Epidemiology, 103:506–511.10.1093/oxfordjournals.aje.a112252Search in Google Scholar PubMed

Rothman, K. J. (2012). Epidemiology. An introduction. 2nd edition. NY: Oxford Univ. Press.Search in Google Scholar

Sjölander, A., Lee, W., Källberg, H. and Pawitan, Y. (2014). Bounds on causal interaction for binary outcomes. Biometrics, 70:500–505.10.1111/biom.12166Search in Google Scholar PubMed

Sjölander, A. and Vansteelandt, S. (2011). Doubly robust estimation of attributable fractions. Biostatistics, 12:112–121.10.1093/biostatistics/kxq049Search in Google Scholar PubMed

Skrondal, A. (2003). Interaction as departure from additivity in case-control studies: A cautionary note. American Journal of Epidemiology, 158:251–258.10.1093/aje/kwg113Search in Google Scholar PubMed

Traskin, M., Wang, W., Ten Have, T. R. and Small, D. S. (2013). Efficient estimation of the attributable fraction when there are monotonicity constraints and interactions. Biostatistics, 14(1):173–188.10.1093/biostatistics/kxs019Search in Google Scholar PubMed

VanderWeele, T. J. (2009). Sufficient cause interactions and statistical interactions. Epidemiology, 20(1):6–13.10.1097/EDE.0b013e31818f69e7Search in Google Scholar PubMed

VanderWeele, T. J. (2010). Epistatic interactions. Statistical Applications in Genetics and Molecular Biology, 9(1):Article 1.10.2202/1544-6115.1517Search in Google Scholar PubMed PubMed Central

VanderWeele, T. J. (2013). Reconsidering the denominator of the attributable proportion for interaction. European Journal of Epidemiology, 28:779–784.10.1007/s10654-013-9843-6Search in Google Scholar PubMed PubMed Central

VanderWeele, T. J. and Knol, M. (2014). A tutorial on interaction. Epidemiologic Methods, 3(1):33–72.10.1515/em-2013-0005Search in Google Scholar

VanderWeele, T. J. and Robins, J. M. (2008). Empirical and counterfactual conditions for sufficient cause interactions. Biometrika, 95:49–61.10.1093/biomet/asm090Search in Google Scholar

Westerlind, H., Jääskinen, V., Corander, J., Hillert, J. and Koski, T. (2014). The learning for mixtures of multicausal interaction networks. In Westerlund: Modeling genetic susceptibility to Multiple Sclerosis, PhD thesis, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm.Search in Google Scholar

Zou, G. Y. (2008). On the estimation of additive interaction by use of the four-by-two table and beyond. American Journal of Epidemiology, 168(2):212–224.10.1093/aje/kwn104Search in Google Scholar PubMed

Published Online: 2016-12-23

Articles in the same Issue

https://doi.org/10.1515/em-2015-0028

Keywords for this article

attributable proportion; confidence interval; models of no interaction; normalization; population attributable fraction