Identification of an additive interaction using parameter regularization and model selection in epidemiology
- Academic Editor
- Aslı Suner
- Subject Areas
- Epidemiology, Statistics
- Keywords
- Additive interactions, Parameter regularization, Model selection, Epidemiology, Real data
- Copyright
- © 2024 Hu et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
- Cite this article
- Hu et al. 2024. Identification of an additive interaction using parameter regularization and model selection in epidemiology. PeerJ 12:e18304 https://doi.org/10.7717/peerj.18304
Abstract
Background
In epidemiology, indicators such as the relative excess risk due to interaction (RERI), the attributable proportion due to interaction (AP), and the synergy index (S) are commonly used to assess additive interactions between two variables. However, the results of these indicators are sometimes inconsistent in real-world applications, and it may be difficult to draw conclusions from them.
Method
Based on the relationship between the RERI, AP, and S, we propose a method that yields consistent results, achieved by constraining the interaction contrast (IC) to zero, and whose results have a simple and clear interpretation. We present two pathways to this end: one imposes the constraint by adding a regularization penalty term to the model likelihood function; the other uses model selection.
Result
Using simulated and real data, our proposed methods effectively identified additive interactions and proved to be applicable to real-world data. Simulations were used to evaluate the performance of the methods in scenarios with and without additive interactions. The penalty term converged to 0 as the tuning parameter increased, and the final models matched the expected interaction status, demonstrating that regularized estimation can effectively identify additive interactions. Model selection was compared with classical methods (delta and bootstrap) across scenarios with different interaction strengths, and its results aligned closely with the bootstrap results. In the real data, the coefficients of the model without interaction satisfied the simplifying constraint equation, reinforcing that there was no significant additive interaction between smoking and alcohol use on oral cancer risk.
Conclusion
In summary, the model selection method based on the Hannan-Quinn criterion (HQ) appears to be a competitive alternative to the bootstrap method for identifying additive interactions. Furthermore, when RERI, AP, and S are used to assess the additive interaction, the proposed approach yields results that are consistent, simple, and easy to understand.
Introduction
In epidemiology, it is important to determine the interaction between two factors of disease risk, as this information is closely related to disease prevention and appropriate interventions (Rothman, Greenland & Lash, 2008; Diaz-Gallo et al., 2021). Generally, the interactions are evaluated by including a product interaction term of the two factors in the model. However, epidemiologists frequently employ exponential models, such as logistic regression or Cox regression, to analyze the disease rates and risks, in which the product interaction term is thought to be the multiplicative interaction (Rothman, Greenland & Lash, 2008). As Rothman and others have pointed out, it is more meaningful and interesting to evaluate the interaction on an additive scale instead of a multiplicative scale (Rothman, 1976; VanderWeele & Robins, 2007; Whitcomb & Naimi, 2023).
For example, let A and B denote two binary risk factors measured in an epidemiologic study, with their presence and absence coded as 1 and 0, respectively. Let $p_{ab}$ denote the risk when $A = a$ and $B = b$. The additive interaction is defined as the difference between the risk difference for A across levels of B and the risk difference for B across levels of A (Rothman, Greenland & Lash, 2008). Thus:
$$(p_{11} - p_{01}) - (p_{10} - p_{00}) = p_{11} - p_{10} - p_{01} + p_{00}.$$
The additive interaction can be measured by $\mathrm{IC} = p_{11} - p_{10} - p_{01} + p_{00}$, which is known as the interaction contrast. $\mathrm{IC} = 0$ if and only if the risk differences for A are constant across B and the risk differences for B are constant across A; that is, $p_{11} - p_{01} = p_{10} - p_{00}$ or $p_{11} - p_{10} = p_{01} - p_{00}$, which correspond to additivity; thus there is no additive interaction between factors A and B. Departure from additivity implies the presence of an additive interaction. Specifically, superadditivity (or synergy) is defined as a "positive" departure, which corresponds to IC > 0. Subadditivity (or antagonism) is a "negative" departure, which corresponds to IC < 0.
Rothman (1986) proposed three measures to estimate departure from additivity, where $RR_{ab} = p_{ab}/p_{00}$ denotes the relative risk with respect to the doubly unexposed group: the relative excess risk due to interaction (RERI),
$$\mathrm{RERI} = RR_{11} - RR_{10} - RR_{01} + 1,$$
the attributable proportion due to interaction (AP),
$$\mathrm{AP} = \frac{\mathrm{RERI}}{RR_{11}},$$
and the synergy index (S),
$$S = \frac{RR_{11} - 1}{(RR_{10} - 1) + (RR_{01} - 1)}.$$
If there is no additive interaction, both RERI and AP will equal 0, and S will equal 1. A number of studies have addressed appropriate statistical methods for calculating confidence intervals for these measures (Hosmer & Lemeshow, 1992; Assmann et al., 1996; Knol et al., 2007; Zou, 2008; Richardson & Kaufman, 2009; Kuss, Schmidt-Pokrzywniak & Stang, 2010; Nie et al., 2010; Chu, Nie & Cole, 2011). However, results based on the confidence intervals of the individual measures may be inconsistent; for example, the confidence interval of RERI may contain 0 while that of AP does not. This can cause confusion as to whether the additive interaction is present.
It is noted that IC can be rewritten as:
$$\mathrm{IC} = p_{00}\left(RR_{11} - RR_{10} - RR_{01} + 1\right) = p_{00}\cdot\mathrm{RERI}.$$
Hence, all three measures are based on IC: IC = 0 if and only if RERI = AP = 0 and S = 1. This article describes two methods that use IC directly to identify the additive interaction. The first is based on regularization (Bickel et al., 2006), and the second is based on model selection (Burnham & Anderson, 2004). The regularization method includes an L1-norm penalty term in the model when estimating the parameters; by the nature of the L1 norm, this may result in parameter shrinkage and variable selection (Tibshirani, 2011). Once the tuning parameter is determined, whether IC = 0 can be inferred. The model selection method fits two models, with and without the constraint IC = 0, and compares them using the generalized Akaike information criterion (GAIC) (Lv & Liu, 2014) or a likelihood ratio test (Peers, 1971). There is no additive interaction between the two factors when the model with IC = 0 is favored.
Methods
Estimating the additive interaction using logistic regression
In this method, let y denote the binary outcome variable, coded 0 and 1, and let Z be the vector of potential confounders adjusted for in the model, whose dimension p depends on the number of potential confounders.
Using the method proposed by Rothman (1986), RERI, AP, and S can be estimated from the output of a multiple logistic regression
$$\mathrm{logit}\{P(y=1\mid A,B,Z)\} = \beta_0 + \beta_1 A + \beta_2 B + \beta_3 AB + \boldsymbol{\gamma}^{\top} Z, \qquad (1)$$
where $P(y=1\mid A,B,Z)$ is the corresponding probability of the outcome variable given the factors A, B, and Z, and $(\beta_1,\beta_2,\beta_3)$ and $\boldsymbol{\gamma}$ are the parameters corresponding to the two risk factors (and their product) and the confounders, respectively. If $\hat\beta_1$, $\hat\beta_2$, and $\hat\beta_3$ are used to stand for the estimated logistic regression coefficients, then the RERI, AP, and S can be estimated as
$$\widehat{\mathrm{RERI}} = e^{\hat\beta_1+\hat\beta_2+\hat\beta_3} - e^{\hat\beta_1} - e^{\hat\beta_2} + 1,$$
$$\widehat{\mathrm{AP}} = \frac{e^{\hat\beta_1+\hat\beta_2+\hat\beta_3} - e^{\hat\beta_1} - e^{\hat\beta_2} + 1}{e^{\hat\beta_1+\hat\beta_2+\hat\beta_3}},$$
and
$$\hat S = \frac{e^{\hat\beta_1+\hat\beta_2+\hat\beta_3} - 1}{e^{\hat\beta_1} + e^{\hat\beta_2} - 2}.$$
Confidence interval estimators can be calculated based on a normal distribution (for RERI and AP) or a log-normal distribution (for S). To derive the necessary variance estimators for these three measures, we used the delta method, which is based on a first-order Taylor series approximation (Hosmer & Lemeshow, 1992). The variances and covariances of the estimated coefficients are readily available from the logistic regression output in most statistical packages.
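As an illustration of the delta method (our own worked expression, not reproduced from the article), the variance of the estimated RERI can be written via the gradient of RERI with respect to $(\hat\beta_1,\hat\beta_2,\hat\beta_3)$ and their estimated covariance matrix, here denoted $\hat\Sigma$:
$$\widehat{\mathrm{Var}}(\widehat{\mathrm{RERI}}) \approx \nabla g(\hat{\boldsymbol\beta})^{\top}\,\hat\Sigma\,\nabla g(\hat{\boldsymbol\beta}), \qquad g(\boldsymbol\beta) = e^{\beta_1+\beta_2+\beta_3} - e^{\beta_1} - e^{\beta_2} + 1,$$
$$\nabla g(\boldsymbol\beta) = \left( e^{\beta_1+\beta_2+\beta_3} - e^{\beta_1},\; e^{\beta_1+\beta_2+\beta_3} - e^{\beta_2},\; e^{\beta_1+\beta_2+\beta_3} \right)^{\top}.$$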
As Rothman (1986) suggested, before running the logistic regression, the formulation can be simplified by combining the two risk factors into one variable with four levels. Specifically, given the two risk factors A and B, the new variable is coded as $C_{ab}$, where the subscripts indicate the corresponding values of A and B, giving the levels $C_{00}$, $C_{10}$, $C_{01}$, and $C_{11}$. The new variable is then treated as a categorical variable. Using dummy variable encoding and taking $C_{00}$ as the reference level, the new variable is incorporated into the logistic regression model:
$$\mathrm{logit}\{P(y=1\mid A,B,Z)\} = \alpha_0 + \alpha_1 I(C_{10}) + \alpha_2 I(C_{01}) + \alpha_3 I(C_{11}) + \boldsymbol{\gamma}^{\top} Z, \qquad (2)$$
where $I(\cdot)$ is the indicator function. Let $\hat\alpha_1$, $\hat\alpha_2$, and $\hat\alpha_3$ denote the estimated coefficients of Eq. (2). The models in Eqs. (1) and (2) are both saturated with respect to A and B, and the two estimations are equivalent. Furthermore, it is clear that $\alpha_1 = \beta_1$, $\alpha_2 = \beta_2$, and $\alpha_3 = \beta_1 + \beta_2 + \beta_3$. In terms of the $\alpha$'s, the RERI, AP, and S can be rewritten as
$$\widehat{\mathrm{RERI}} = e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1,$$
$$\widehat{\mathrm{AP}} = \frac{e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1}{e^{\hat\alpha_3}},$$
and
$$\hat S = \frac{e^{\hat\alpha_3} - 1}{e^{\hat\alpha_1} + e^{\hat\alpha_2} - 2},$$
which yield the same point estimates given above. Based on the variances and covariances of $\hat\alpha_1$, $\hat\alpha_2$, and $\hat\alpha_3$, the confidence intervals of these measures are calculated as described above, using the delta method.
When the confidence intervals of RERI and AP contain 0, or that of S contains 1, this indicates a lack of additive interaction between the two risk factors; otherwise, an additive interaction is present.
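To make the computation concrete, the following is a minimal sketch (not the authors' code) of how RERI, AP, and S and a delta-method confidence interval for RERI could be obtained from the four-level coding of Eq. (2) using Python; the data frame and column names (`df`, `y`, `A`, `B`) are hypothetical.

```python
# A minimal sketch (not the authors' code): estimate RERI, AP, and S from the
# four-level coding of Eq. (2), with a delta-method CI for RERI.
# Assumes a pandas DataFrame `df` with 0/1 columns "y", "A", "B" (hypothetical names).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm

def additive_interaction_measures(df, level=0.95):
    # Dummy variables for the combined four-level factor, with C00 as the reference.
    X = pd.DataFrame({
        "C10": ((df["A"] == 1) & (df["B"] == 0)).astype(int),
        "C01": ((df["A"] == 0) & (df["B"] == 1)).astype(int),
        "C11": ((df["A"] == 1) & (df["B"] == 1)).astype(int),
    })
    X = sm.add_constant(X)
    fit = sm.Logit(df["y"], X).fit(disp=0)

    a1, a2, a3 = fit.params[["C10", "C01", "C11"]]
    cov = fit.cov_params().loc[["C10", "C01", "C11"], ["C10", "C01", "C11"]].to_numpy()

    reri = np.exp(a3) - np.exp(a1) - np.exp(a2) + 1
    ap = reri / np.exp(a3)
    s = (np.exp(a3) - 1) / (np.exp(a1) + np.exp(a2) - 2)

    # Delta method for RERI: gradient with respect to (alpha1, alpha2, alpha3).
    grad = np.array([-np.exp(a1), -np.exp(a2), np.exp(a3)])
    se = np.sqrt(grad @ cov @ grad)
    z = norm.ppf(0.5 + level / 2)
    return {"RERI": reri, "AP": ap, "S": s,
            "RERI_CI": (reri - z * se, reri + z * se)}
```

Analogous delta-method intervals for AP and log(S) follow the same pattern with their respective gradients.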
Detect additive interaction based on regularized estimation of logistic regression
To establish notation, consider the logistic regression with sample size $n$, and assume that the observations $(y_i, A_i, B_i, z_i)$, $i = 1, \ldots, n$, are independent and identically distributed. To make estimation and exposition simple, Eq. (2) is adopted and rewritten. Let $x_i$ denote the ith row of the design matrix corresponding to the new combined variable, which is composed of zeros and ones denoting the combination of levels of the two risk factors A and B. The logistic regression models the conditional probability by
$$P(y_i = 1 \mid x_i, z_i) = p_i = \frac{\exp(\eta_i)}{1 + \exp(\eta_i)}, \quad \text{with} \quad \eta_i = \alpha_0 + \boldsymbol{\alpha}^{\top} x_i + \boldsymbol{\gamma}^{\top} z_i,$$
where $\alpha_0$ is the intercept and $\boldsymbol\alpha = (\alpha_1, \alpha_2, \alpha_3)^{\top}$ and $\boldsymbol\gamma$ are the parameter vectors corresponding to the risk factors and the confounders, respectively. For simplicity, let $\boldsymbol\theta$ denote the whole parameter vector, i.e., $\boldsymbol\theta = (\alpha_0, \boldsymbol\alpha^{\top}, \boldsymbol\gamma^{\top})^{\top}$.
The standard maximum likelihood estimation for the logistic regression estimates the coefficients by solving
$$\hat{\boldsymbol\theta} = \arg\max_{\boldsymbol\theta}\ \ell(\boldsymbol\theta), \qquad (3)$$
where $\ell(\boldsymbol\theta)$ is the log-likelihood function:
$$\ell(\boldsymbol\theta) = \sum_{i=1}^{n}\left\{ y_i \log p_i + (1 - y_i)\log(1 - p_i)\right\}.$$
In the spirit of the LASSO, which shrinks a penalized quantity toward zero and can set it exactly to zero (Tibshirani, 2011), the relative excess risk due to interaction can be bounded by adding a constraint to Eq. (3):
$$\hat{\boldsymbol\theta} = \arg\max_{\boldsymbol\theta}\ \ell(\boldsymbol\theta) \quad \text{subject to} \quad \left| e^{\alpha_3} - e^{\alpha_1} - e^{\alpha_2} + 1 \right| \le t, \qquad (4)$$
where $t \ge 0$ is the tuning parameter which controls the amount of penalization. Solving Eq. (4) is equivalent to the "Lagrange" version of the problem
$$\hat{\boldsymbol\theta}_{\lambda} = \arg\max_{\boldsymbol\theta}\left\{ \ell(\boldsymbol\theta) - \lambda \left| e^{\alpha_3} - e^{\alpha_1} - e^{\alpha_2} + 1 \right| \right\}, \qquad (5)$$
where $\lambda \ge 0$. There is a one-to-one correspondence between $\lambda$ and $t$, whose values are usually chosen by a model selection procedure such as K-fold cross-validation (Fushiki, 2011), the Akaike information criterion (AIC), or the Bayesian information criterion (BIC) (Burnham & Anderson, 2004). If $\hat{\boldsymbol\theta}_{\lambda}$ solves Eq. (5), then it also solves Eq. (4) with $t = \left| e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1 \right|$.
It should be noted that $e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1 = 0$ when $t$ equals 0 or $\lambda$ is large enough. This means that IC = 0; thus RERI = AP = 0 and S = 1, indicating the lack of an additive interaction. Therefore, if there is truly no additive interaction, one hopes that the selected $\lambda$ will be large in the model selection step. When $e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1 \neq 0$, the RERI, AP, and S can be calculated, and it can be inferred that there is an additive-scale interaction without requiring additional information such as confidence intervals.
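Because the penalty in Eq. (5) is on $\left| e^{\alpha_3} - e^{\alpha_1} - e^{\alpha_2} + 1 \right|$ rather than on the coefficients themselves, a general-purpose optimizer can be used. The sketch below is our own illustration of one way to solve Eq. (5) for a fixed $\lambda$ (the derivative-free Nelder-Mead choice and the settings are assumptions, not the authors' implementation).

```python
# Sketch of the regularized estimation in Eq. (5) for a fixed lambda (lam).
# X is the n x 4 design matrix (intercept, I(C10), I(C01), I(C11)); y is the 0/1 outcome.
# Nelder-Mead is used because the penalty is non-smooth where the RERI crosses zero.
import numpy as np
from scipy.optimize import minimize

def penalized_neg_loglik(theta, X, y, lam):
    eta = X @ theta
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))  # Bernoulli log-likelihood
    a1, a2, a3 = theta[1], theta[2], theta[3]
    penalty = abs(np.exp(a3) - np.exp(a1) - np.exp(a2) + 1.0)
    return -loglik + lam * penalty  # minimizing this maximizes Eq. (5)

def fit_penalized(X, y, lam):
    theta0 = np.zeros(X.shape[1])
    res = minimize(penalized_neg_loglik, theta0, args=(X, y, lam),
                   method="Nelder-Mead",
                   options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-8})
    return res.x
```

In practice, Eq. (5) would be solved over a grid of $\lambda$ values and $\lambda$ chosen by cross-validation or the generalization error, as described in the simulation section.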
Detect additive interaction based on model selection
Although the regularized estimation of logistic regression can determine whether there is an additive-scale interaction, it depends on the choice of the tuning parameter $t$ or $\lambda$. Whether cross-validation or an information criterion is used, the value is fixed through trial and error. In detail, a collection of tuning parameter values, say $\lambda_1 < \lambda_2 < \cdots < \lambda_K$, is given in advance. For each value of $\lambda$, Eq. (5) is solved. After all candidate values have been evaluated, the best $\lambda$ is determined based on K-fold cross-validation or an information criterion.
Model selection is key for determining whether there is an additive interaction (IC = 0). Specifically, two models are fitted: one (model $M_0$) is the logistic regression without any constraint, obtained by solving Eq. (3); the other (model $M_1$) is the logistic regression with the constraint $e^{\alpha_3} - e^{\alpha_1} - e^{\alpha_2} + 1 = 0$, which is equivalent to solving Eq. (5) with a sufficiently large $\lambda$. In contrast to regularized estimation, only two logistic models need to be fitted.
In regard to model selection, let $\hat L$ be the maximum value of the likelihood function for a model. For a parametric statistical model and within a likelihood-based inferential procedure, the fit of a model can be assessed by its fitted global deviance (GDEV), defined as $\mathrm{GDEV} = -2\log \hat L$. Models $M_0$ and $M_1$ may be evaluated by their fitted global deviances $\mathrm{GDEV}_0$ and $\mathrm{GDEV}_1$ and degrees of freedom $df_0$ and $df_1$, respectively. The two models, $M_0$ and $M_1$, are nested and may be compared using the likelihood ratio test or the generalized Akaike information criterion (GAIC) (Lv & Liu, 2014; Peers, 1971). The test statistic of the likelihood ratio is
$$\Lambda = \mathrm{GDEV}_1 - \mathrm{GDEV}_0,$$
which has an asymptotic $\chi^2_{d}$ distribution under the null hypothesis of no additive interaction, where $d = df_0 - df_1$; when $\Lambda$ exceeds the corresponding critical value, the model $M_0$ is recommended. The GAIC is obtained by adding to the fitted deviance a penalty $\kappa$ for each degree of freedom used in the model, as follows:
$$\mathrm{GAIC}(\kappa) = \mathrm{GDEV} + \kappa \cdot df. \qquad (6)$$
Then the model with the smallest value of the criterion is selected. The AIC and BIC are used most often; they correspond to $\kappa = 2$ and $\kappa = \log(n)$ in Eq. (6), respectively:
$$\mathrm{AIC} = \mathrm{GDEV} + 2\, df$$
and
$$\mathrm{BIC} = \mathrm{GDEV} + \log(n)\, df.$$
The AIC typically leads to overfitting in model selection, while the BIC leads to underfitting (McLachlan & Peel, 2000). The Hannan-Quinn criterion (HQ) is another special case of GAIC($\kappa$) to consider (Burnham & Anderson, 2002), with $\kappa = 2\log(\log(n))$:
$$\mathrm{HQ} = \mathrm{GDEV} + 2\log(\log(n))\, df.$$
When $n > e^{e} \approx 15.2$, $2\log(\log(n))$ lies between 2 and $\log(n)$, so HQ can be considered a compromise between AIC and BIC in the penalty term.
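For instance (our own numerical illustration, using natural logarithms), at two of the sample sizes used in the simulations below:
$$n = 400:\quad 2 \;<\; 2\log(\log 400) \approx 3.58 \;<\; \log 400 \approx 5.99,$$
$$n = 1000:\quad 2 \;<\; 2\log(\log 1000) \approx 3.87 \;<\; \log 1000 \approx 6.91.$$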
There is only one constraint equation in model $M_1$, so the difference in degrees of freedom between the two models equals one. Therefore, when the difference in GDEV between the two models, $\mathrm{GDEV}_1 - \mathrm{GDEV}_0$, is greater than 2, $\log(n)$, or $2\log(\log(n))$ for AIC, BIC, and HQ, respectively, the logistic regression without the constraint is chosen, and there is an additive interaction between the two risk factors A and B.
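A minimal sketch of this procedure follows (implementation details such as fitting the constrained model by reparameterizing $\alpha_3 = \log(e^{\alpha_1} + e^{\alpha_2} - 1)$ are our own choices, not necessarily the authors'): model $M_0$ is an ordinary logistic fit, model $M_1$ imposes the additivity constraint, and the GDEV difference is compared with the penalty of the chosen criterion.

```python
# Sketch of the model selection procedure: fit M0 (unconstrained) and M1
# (constrained, exp(a3) = exp(a1) + exp(a2) - 1) and compare GDEVs against
# the penalty kappa of the chosen criterion.
import numpy as np
from scipy.optimize import minimize

def neg_loglik(eta, y):
    return -np.sum(y * eta - np.logaddexp(0.0, eta))

def fit_m0(X, y):
    # X columns: intercept, I(C10), I(C01), I(C11).
    return minimize(lambda t: neg_loglik(X @ t, y), np.zeros(4), method="BFGS")

def fit_m1(X, y):
    # Free parameters (a0, a1, a2); a3 is implied by the additivity constraint.
    def obj(t):
        a0, a1, a2 = t
        s = np.exp(a1) + np.exp(a2) - 1.0
        if s <= 0:  # constrained model requires exp(a1) + exp(a2) > 1
            return 1e10
        eta = X @ np.array([a0, a1, a2, np.log(s)])
        return neg_loglik(eta, y)
    return minimize(obj, np.array([0.0, 0.1, 0.1]), method="Nelder-Mead")

def select_model(X, y, n, criterion="HQ"):
    kappa = {"AIC": 2.0, "BIC": np.log(n), "HQ": 2.0 * np.log(np.log(n))}[criterion]
    gdev0, gdev1 = 2.0 * fit_m0(X, y).fun, 2.0 * fit_m1(X, y).fun
    # M1 uses one parameter fewer, so the GDEV difference is compared with kappa.
    return "interaction (M0)" if gdev1 - gdev0 > kappa else "no interaction (M1)"
```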
Simulation
We conducted simulation studies, in which additive interactions were present or absent by design, to assess the performance of detecting the additive interaction based on regularized estimation of logistic regression or on model selection. For simplicity, only the two risk factors A and B were considered, without the potential confounders Z, in all simulation scenarios. The data were generated from the model
$$y_i \sim \mathrm{Bernoulli}(p_i), \quad \mathrm{logit}(p_i) = \alpha_0 + \alpha_1 I(C_{10,i}) + \alpha_2 I(C_{01,i}) + \alpha_3 I(C_{11,i}), \qquad (7)$$
where $\mathrm{Bernoulli}(\cdot)$ denotes the Bernoulli distribution and $\boldsymbol\alpha = (\alpha_0, \alpha_1, \alpha_2, \alpha_3)^{\top}$ is the coefficient vector, including the intercept.
Regularized estimation
According to Zou, Hastie & Tibshirani (2007), the effective degrees of freedom of the LASSO equal the number of nonzero coefficients. However, unless the constraint $e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1 = 0$ was met exactly, the degrees of freedom of the model would not change with $\lambda$. This would make the information criteria fail to determine $\lambda$. Thus, the value of $\lambda$ was tuned using the generalization error, which calculates the GDEV on new data and can be seen as a simplified form of cross-validation (Golub, Heath & Wahba, 1979).
Two datasets (600 observations each) were generated according to Eq. (7), with the coefficients set so that the two datasets corresponded to models without and with interaction, respectively. The AP is 1/3 in the interaction scenario, which can be regarded as weak synergy (Hosmer & Lemeshow, 1992). Each dataset was split into a training set of 400 cases and a testing set of 200 cases; the coefficients were estimated on the training set, and the generalization error was calculated on the testing set.
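A sketch of how such data could be generated from Eq. (7) and how $\lambda$ could be tuned by the generalization error (test-set deviance) is given below. The baseline log-odds, the exposure prevalences, the specific odds ratios, and the $\lambda$ grid are illustrative assumptions (the article does not report them here), and `fit_penalized` refers to the sketch in the previous subsection.

```python
# Sketch: generate data from Eq. (7), split into training/testing sets, and tune
# lambda by the deviance on the testing set (generalization error).
import numpy as np

rng = np.random.default_rng(1)

def simulate(n, a0, a1, a2, a3):
    A = rng.integers(0, 2, n)  # assumed Bernoulli(0.5) exposures (illustrative)
    B = rng.integers(0, 2, n)
    X = np.column_stack([np.ones(n), (A == 1) & (B == 0),
                         (A == 0) & (B == 1), (A == 1) & (B == 1)]).astype(float)
    p = 1.0 / (1.0 + np.exp(-(X @ np.array([a0, a1, a2, a3]))))
    return X, rng.binomial(1, p)

def deviance(theta, X, y):
    eta = X @ theta
    return -2.0 * np.sum(y * eta - np.logaddexp(0.0, eta))

# Interaction scenario with AP = 1/3 (e.g., ORs 4, 5, 12); a0 is illustrative.
X, y = simulate(600, a0=-1.0, a1=np.log(4), a2=np.log(5), a3=np.log(12))
X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]

lambdas = np.linspace(0.0, 50.0, 26)                        # illustrative grid
fits = [fit_penalized(X_tr, y_tr, lam) for lam in lambdas]  # previous sketch
best_lam = lambdas[int(np.argmin([deviance(t, X_te, y_te) for t in fits]))]
```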
Figure 1 shows the relation between the coefficients and the tuning parameter $\lambda$, along with the penalty term $\left| e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1 \right|$. The penalty term gradually converged to 0 as the tuning parameter increased. However, the primary concern was whether there was an additive-scale interaction rather than the path of the coefficients. Based on the tuning parameter selected by the generalization error, the final models matched the a priori information in both scenarios. Specifically, in the scenario without interaction the selected model satisfied the constraint, indicating no additive interaction, while in the scenario with interaction the constraint was not satisfied, indicating an additive-scale interaction.
Figure 1: Plot of the coefficients and penalty term for different values of $\lambda$. The dashed line indicates the $\lambda$ tuned via the generalization error. (A) Without interaction; (B) with interaction.
Model selection
In order to assess the performance of detecting additive-scale interactions based on model selection, a simulation study was conducted to compare it with the classical methods, namely the delta and bootstrap methods. Model selection used AIC, BIC, and HQ. Both the delta and bootstrap methods employed the three indices RERI, AP, and S, denoted by RERI_D, AP_D, S_D and RERI_B, AP_B, S_B, respectively.
For the simulation study, seven scenarios were considered (Table 1). With the exception of scenarios S2 and S7, all scenarios were derived from the study by Assmann et al. (1996). Because RERI = 0, AP = 0, and S = 1 in scenario S4, there was no additive interaction in that scenario; an interaction was present in all other scenarios. Together, the scenarios covered strong synergy, weak synergy, no interaction, weak antagonism, and strong antagonism.
Scenario | $OR_{10}$ | $OR_{01}$ | $OR_{11}$ | RERI | AP | S
---|---|---|---|---|---|---
S1 | 4 | 5 | 20 | 12 | 0.60 | 2.71 |
S2 | 4 | 5 | 16 | 8 | 0.50 | 2.14 |
S3 | 4 | 5 | 12 | 4 | 0.33 | 1.57 |
S4 | 4 | 5 | 8 | 0 | 0.00 | 1.00 |
S5 | 4 | 5 | 6 | −2 | −0.33 | 0.71 |
S6 | 4 | 5 | 4 | −4 | −1.00 | 0.43 |
S7 | 4 | 5 | 2 | −6 | −3.00 | 0.14 |
In each scenario, 1,000 samples were generated based on Eq. (7) with sample sizes of 400, 600, and 1,000, respectively. The parameters were the logarithms of the ORs. For example, in scenario S1, the parameters in Eq. (7) were set as $\alpha_1 = \log(4)$, $\alpha_2 = \log(5)$, and $\alpha_3 = \log(20)$.
For each sample, the interaction was assessed by model selection with AIC, BIC, and HQ, and by the confidence intervals of RERI, AP, and S using the delta and bootstrap methods. The methods were then compared based on the percentage of samples in which no additive interaction was identified.
For the bootstrap method (Efron, 1982), 300 bootstrap samples were drawn, resampling separately within the cases (y = 1) and the controls (y = 0). Therefore, the numbers of cases and controls in each bootstrap sample were identical to those in the original sample.
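A sketch of this stratified bootstrap is given below; the 300 replicates follow the text, while the percentile interval is one common choice and an assumption on our part. `additive_interaction_measures` refers to the earlier sketch.

```python
# Sketch of the stratified bootstrap: resample cases and controls separately so
# their counts match the original sample, then take percentile intervals for
# RERI, AP, and S.
import numpy as np
import pandas as pd

def bootstrap_measures(df, n_boot=300, seed=0):
    rng = np.random.default_rng(seed)
    cases, controls = df[df["y"] == 1], df[df["y"] == 0]
    stats = []
    for _ in range(n_boot):
        boot = pd.concat([
            cases.iloc[rng.integers(0, len(cases), len(cases))],
            controls.iloc[rng.integers(0, len(controls), len(controls))],
        ])
        m = additive_interaction_measures(boot)  # point estimates from earlier sketch
        stats.append([m["RERI"], m["AP"], m["S"]])
    lo, hi = np.percentile(np.array(stats), [2.5, 97.5], axis=0)
    return {"RERI_CI": (lo[0], hi[0]), "AP_CI": (lo[1], hi[1]), "S_CI": (lo[2], hi[2])}
```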
The simulation results are summarized in Table 2 as the percentage of samples in which no additive interaction was identified. In scenario S4 (without interaction), this percentage is the rate of correct identification, whereas in the other scenarios (with interaction) it is the rate of misidentification, so the rate of correct identification there is one minus the reported percentage. Scenario S4 provides an assessment of the Type I error rate when testing for additive interactions with the different methods, while the other scenarios provide an evaluation of the power of these methods across varying effect sizes. Clearly, for all methods, the percentage became smaller as the sample size increased and/or the interaction effect was augmented, because larger sample sizes and/or effect sizes make significant results in favor of the model with interaction more likely (Papoulis, 1990). Conversely, when the interaction effect was weak, the higher percentages reflect a lack of power to detect the interaction.
Sizes | Scenario | AIC | HQ | BIC | RERI_D | AP_D | S_D | RERI_B | AP_B | S_B |
---|---|---|---|---|---|---|---|---|---|---|
400 | S1 | 0.076 | 0.168 | 0.345 | 0.614 | 0.100 | 0.207 | 0.185 | 0.184 | 0.185 |
S2 | 0.161 | 0.299 | 0.503 | 0.762 | 0.199 | 0.345 | 0.302 | 0.302 | 0.302 | |
S3 | 0.378 | 0.547 | 0.747 | 0.937 | 0.434 | 0.591 | 0.557 | 0.556 | 0.556 | |
S4 | 0.790 | 0.919 | 0.975 | 0.997 | 0.873 | 0.930 | 0.911 | 0.910 | 0.910 | |
S4* | 0.801 | 0.914 | 0.976 | 0.995 | 0.873 | 0.932 | 0.919 | 0.918 | 0.918 | |
S5 | 0.686 | 0.838 | 0.926 | 0.943 | 0.950 | 0.844 | 0.845 | 0.846 | 0.846 | |
S6 | 0.044 | 0.104 | 0.248 | 0.272 | 0.374 | 0.118 | 0.117 | 0.119 | 0.119 | |
S7 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | |
600 | S1 | 0.011 | 0.051 | 0.151 | 0.147 | 0.025 | 0.055 | 0.053 | 0.053 | 0.053 |
S2 | 0.054 | 0.128 | 0.319 | 0.321 | 0.080 | 0.145 | 0.135 | 0.136 | 0.136 | |
S3 | 0.231 | 0.411 | 0.636 | 0.664 | 0.312 | 0.433 | 0.413 | 0.412 | 0.412 | |
S4 | 0.787 | 0.918 | 0.975 | 0.980 | 0.873 | 0.924 | 0.911 | 0.911 | 0.910 | |
S4* | 0.789 | 0.913 | 0.975 | 0.978 | 0.868 | 0.924 | 0.910 | 0.910 | 0.910 | |
S5 | 0.621 | 0.811 | 0.921 | 0.862 | 0.916 | 0.814 | 0.809 | 0.808 | 0.808 | |
S6 | 0.008 | 0.020 | 0.081 | 0.041 | 0.080 | 0.023 | 0.024 | 0.024 | 0.024 | |
S7 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | |
1,000 | S1 | 0.001 | 0.002 | 0.012 | 0.003 | 0.001 | 0.002 | 0.001 | 0.001 | 0.001 |
S2 | 0.004 | 0.012 | 0.095 | 0.040 | 0.009 | 0.013 | 0.015 | 0.015 | 0.015 | |
S3 | 0.087 | 0.211 | 0.442 | 0.324 | 0.147 | 0.215 | 0.198 | 0.198 | 0.198 | |
S4 | 0.742 | 0.895 | 0.981 | 0.950 | 0.834 | 0.906 | 0.884 | 0.888 | 0.887 | |
S4* | 0.766 | 0.902 | 0.975 | 0.946 | 0.853 | 0.904 | 0.892 | 0.892 | 0.892 | |
S5 | 0.512 | 0.724 | 0.894 | 0.760 | 0.835 | 0.714 | 0.707 | 0.707 | 0.707 | |
S6 | 0.000 | 0.000 | 0.007 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | |
S7 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
The bootstrap method is recommended for use in various applied fields (Jalali et al., 2022; Miwakeichi & Galka, 2023; Kaity et al., 2023) as well as for interaction detection (Assmann et al., 1996). Whichever measure was used, the bootstrap percentages of no interaction were very close in the various scenarios; therefore, the bootstrap method was used as the benchmark. In contrast, there was considerable variation among the delta-method results, especially when the sample size was relatively small. For example, in scenario S2 with a sample size of 400, the percentages of no interaction for RERI_D, AP_D, and S_D were 0.762, 0.199, and 0.345, respectively, with a standard deviation of 0.292, whereas the bootstrap percentages for RERI_B, AP_B, and S_B were all 0.302, with a standard deviation of almost 0. This indicates that the results of RERI_D, AP_D, and S_D often conflict when detecting an interaction. Model selection, in contrast, gives consistent results regardless of which measure is used, because the constraint $e^{\alpha_3} - e^{\alpha_1} - e^{\alpha_2} + 1 = 0$ implies RERI = AP = 0 and S = 1 simultaneously.
Using the bootstrap method as the reference, S_D had the smallest deviation and RERI_D the largest in almost all scenarios; that is, S_D was close to S_B, but RERI_D and AP_D were not. Compared with the bootstrap method, RERI_D favored the model without interaction regardless of whether the interaction was synergistic or antagonistic, whereas AP_D favored the model with interaction under synergy and the model without interaction under antagonism.
Obviously, as the penalty $\kappa$ increased, GAIC($\kappa$) penalized the extra parameter more heavily and tended to favor the model without interaction. Since $\kappa = 2 < 2\log(\log(n)) < \log(n)$ for AIC, HQ, and BIC, respectively, the percentage of no interaction rose in that order within each scenario, as shown in Table 2. It is noted that the results of HQ were almost the same as those of the bootstrap method; the difference between them was less than 2% in all scenarios. In contrast, the percentage of no interaction was smaller for AIC and larger for BIC, which means that the AIC tended to overestimate the interaction while the BIC tended to underestimate it (McLachlan & Peel, 2000).
Real data
The data came from a case-control study of oral cancer reported by Rothman & Keller (1972). The two factors were smoking and alcohol use. There were a total of 458 participants, male veterans under the age of 60, and the study investigated the effects of smoking and alcohol use on oral cancer. The distribution of the data is shown in Table 3. Several authors have used these data to illustrate how to calculate confidence intervals for measures of interaction (Hosmer & Lemeshow, 1992; Zou, 2008; Richardson & Kaufman, 2009; Chu, Nie & Cole, 2011). Although the confidence intervals differed among their methods, their results consistently showed that there was no additive-scale interaction.
 | Neither | Smoking only | Alcohol only | Smoking and alcohol
---|---|---|---|---
Cases | 3 | 8 | 6 | 225 |
Controls | 20 | 18 | 12 | 166 |
Using interaction detection based on model selection, two logistic models were employed: $M_0$ without constraint and $M_1$ with the constraint $e^{\alpha_3} - e^{\alpha_1} - e^{\alpha_2} + 1 = 0$. The results are summarized in Table 4. The GDEVs of models $M_0$ and $M_1$ were 605.927 and 607.730, respectively, so the difference between them was 1.803, which is less than two (the smallest penalty, corresponding to AIC). The model with the constraint, i.e., without an additive interaction, was therefore favored regardless of which criterion was used (AIC, BIC, or HQ). This conclusion agrees with that reached by the other authors. Remarkably, the coefficients of the model without an additive interaction satisfied the following equation:
$$e^{\hat\alpha_3} = e^{\hat\alpha_1} + e^{\hat\alpha_2} - 1,$$
Model | $\hat\alpha_1$ | $\hat\alpha_2$ | $\hat\alpha_3$ | GDEV
---|---|---|---|---
$M_0$ (unconstrained) | 1.204 | 1.086 | 2.201 | 605.927
$M_1$ (constrained) | 1.882 | 1.497 | 2.306 | 607.730
which makes the interpretation of the result simple. Additionally, we validated our algorithm in two other cases, a hypertension case (Zou, 2008) and a congenital heart disease case (Nie et al., 2016); the results were consistent with the original authors' analyses (Tables S1–S4).
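For reference, the Table 3 counts can be expanded into individual records and fed to the model selection sketch given earlier; this is a reproduction sketch under our own coding assumptions (smoking as A, alcohol as B), not the authors' code.

```python
# Sketch: expand the Table 3 counts into individual records and apply the model
# selection sketch above (A = smoking, B = alcohol are our coding assumptions).
import numpy as np

cells = [  # (A, B, y, count)
    (0, 0, 1, 3),  (1, 0, 1, 8),  (0, 1, 1, 6),  (1, 1, 1, 225),   # cases
    (0, 0, 0, 20), (1, 0, 0, 18), (0, 1, 0, 12), (1, 1, 0, 166),   # controls
]
rows = [(a, b, y) for a, b, y, k in cells for _ in range(k)]
A, B, y = (np.array(v) for v in zip(*rows))
X = np.column_stack([np.ones(len(y)), (A == 1) & (B == 0),
                     (A == 0) & (B == 1), (A == 1) & (B == 1)]).astype(float)

print(select_model(X, y, n=len(y), criterion="HQ"))  # 458 records in total
```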
Discussion
It is vital to recognize that disease prevention and intervention are closely related to a disease's influencing factors and their interactions. In epidemiology, the additive interaction plays an important role (Rothman, Greenland & Lash, 2008). There have been many discussions concerning appropriate methods for constructing confidence intervals or credible intervals (CIs) for RERI, AP, and S (Hosmer & Lemeshow, 1992; Assmann et al., 1996; Kuss, Schmidt-Pokrzywniak & Stang, 2010). If the CIs do not contain the null value (0 for RERI and AP, 1 for S), then the presence of the additive interaction is confirmed. In addition to CIs, this article proposes a novel method to assess the additive interaction. The method works by checking whether the constraint $e^{\alpha_3} - e^{\alpha_1} - e^{\alpha_2} + 1 = 0$ (equivalently, IC = 0) is met. When the constraint is met, RERI and AP equal 0 and S equals 1, so there is no additive interaction between the two factors; otherwise, an additive interaction exists.
In this article, there are two ways to verify whether the constraint is met: regularized estimation or model selection. It is easy to see that, when the same information criterion is used both for choosing the tuning parameter and for model selection, the conclusion will be the same regardless of the method used. For example, with BIC, if the constraint is met in the regularized estimation, the regularized fit agrees with the model with the constraint, so the constrained model is also favored in model selection; i.e., there is no additive interaction between the two factors regardless of which method is used, and vice versa. As the choice of the tuning parameter is computationally expensive, the model selection method is recommended.
From the simulation studies, the percentage of misidentification decreased for all methods as the sample size increased and/or the additive interaction effect was augmented. Among the confidence interval methods, the performance of the delta method was worse than that of the bootstrap method; this effect has been well documented (Assmann et al., 1996). The values of RERI_B, AP_B, and S_B were almost the same in each scenario, but RERI_D, AP_D, and S_D tended to be unusually volatile. Thus, conclusions based on the delta method may be inconsistent across measures, whereas this has little impact on the bootstrap method. It was a surprise in this study that the performance of S_D was similar to that of the bootstrap method, even though the indicator S has not been recommended by some authors. This might be attributable to the confidence interval of S being constructed on the logarithmic scale, which may make its distribution closer to normal and stabilize the variance (Muhsam, 1946). Among the model selection methods, the percentages of misidentification for HQ were almost the same as those of the bootstrap method, whereas AIC tended to overestimate the interaction and BIC tended to underestimate it. Thus, HQ is the better choice for model selection.
Furthermore, when confidence intervals are used to assess the additive interaction, whatever method is employed, the results may be inconsistent among the different measures. With the model selection method proposed in this article, however, if RERI = 0 then AP = 0 and S = 1, and vice versa, so the result is always consistent regardless of which measure is used. The model selection method also makes the interpretation of the result simple and straightforward. Specifically, when there is no additive interaction, the confidence interval method, even when the results of RERI, AP, and S agree, only yields intervals that contain the null values rather than estimates equal to them. In contrast, in the model selection method there is no additive interaction if and only if the interaction contrast equals 0. For example, in the smoking and alcohol example, there was no additive interaction, and thus the interaction contrast equals 0, i.e., $e^{\hat\alpha_3} - e^{\hat\alpha_1} - e^{\hat\alpha_2} + 1 = 0$.
In summary, a model selection method based on HQ appears to be a competitive alternative to the bootstrap method for identifying an additive interaction. Furthermore, when the RERI, AP, and S measures are used to assess the additive interaction, it yields a consistent conclusion with a clear interpretation, which may be useful in practice.