Logistic regression is used frequently in cohort studies and clinical
trials. When the incidence of an outcome of interest is common in the study
population (>10%), the adjusted odds ratio derived from the logistic regression
can no longer approximate the risk ratio. The more frequent the outcome, the
more the odds ratio overestimates the risk ratio when it is more than 1 or
underestimates it when it is less than 1. We propose a simple method to approximate
a risk ratio from the adjusted odds ratio and derive an estimate of an association
or treatment effect that better represents the true relative risk.
RELATIVE RISK has become one of the standard measures in biomedical
research. It usually means the multiple of risk of the outcome in one group
compared with another group and is expressed as the risk ratio in cohort studies
and clinical trials. When the risk ratio cannot be obtained directly (such
as in a case-control study), the odds ratio is calculated and often interpreted
as if it were the risk ratio. Consequently, the term relative
risk commonly refers to either the risk ratio or the odds ratio. However,
only under certain conditions does the odds ratio approximate the risk ratio. Figure 1 shows that when the incidence of
an outcome of interest in the study population is low (<10%), the odds
ratio is close to the risk ratio. However, the more frequent the outcome becomes,
the more the odds ratio will overestimate the risk ratio when it is more than
1 or underestimate the risk ratio when it is less than 1.
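The divergence described above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `odds_ratio_from_risks` and the chosen incidences are illustrative assumptions. It holds the risk ratio fixed and shows how the odds ratio moves away from it as the incidence in the nonexposed group rises.

```python
def odds_ratio_from_risks(p0, rr):
    """Odds ratio implied by a baseline incidence p0 and a risk ratio rr.

    p0 is the incidence in the nonexposed group; the incidence in the
    exposed group is p1 = rr * p0. (Hypothetical helper for illustration.)
    """
    p1 = rr * p0
    if not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("p0 and rr * p0 must both lie strictly between 0 and 1")
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# Hold the risk ratio fixed and let the baseline incidence grow.
for rr in (2.0, 0.5):
    for p0 in (0.01, 0.05, 0.10, 0.20, 0.40):
        odds_ratio = odds_ratio_from_risks(p0, rr)
        print(f"RR={rr:.1f}  P0={p0:.2f}  OR={odds_ratio:.2f}")
```

For a risk ratio above 1 the printed odds ratio drifts upward as P0 increases, and for a risk ratio below 1 it drifts downward, which is the pattern the text attributes to Figure 1.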
Logistic regression is a widely used technique to adjust
for confounders, not only in case-control studies but also in cohort studies.1 However, logistic regression yields an odds ratio
rather than a risk ratio, even in a cohort study. Under the same rule, when
the outcome of interest is common in the study population (though it could
be rare in the general population), the adjusted odds ratio from the logistic
regression may exaggerate a risk association or a treatment effect. For instance,
a previous study assessed the performance of neonatal units in Hospital A
and Hospital B by comparing neonatal mortality in very low birthweight neonates
between these 2 hospitals.2 At first glance,
Hospital A had a lower mortality rate than Hospital B (18% vs 24%, risk ratio,
18%:24% [0.75]). However, after adjusting for clinical variables and initial
disease severity using logistic regression, the adjusted odds ratio of Hospital
A vs Hospital B was 3.27 (95% confidence interval, 1.35-7.92). Can one therefore
conclude that neonates with very low birthweight in Hospital A had 3 times
the risk of death of those in Hospital B? Probably not, because the outcome
(neonatal death) was common in this study population. To provide a measure
that more accurately represents the concept of relative risk, correction of
the odds ratio may be desirable.
A modified logistic regression with special macro functions has been
developed to address this issue.3 However,
it is mathematically complex and uses a General Linear Interactive Modeling
System (Numerical Algorithms Group, Oxford, England). Consequently, this method
is rarely used. Another alternative is to use the Mantel-Haenszel method,4 which can adjust for 1 or 2 confounders and still
provide a risk ratio in a cohort study. However, this method becomes inefficient
when several factors, especially continuous variables, are being adjusted
for simultaneously. We herein propose an easy approximation with a simple
formula that can be applied not only in binary analysis5
but also in multivariate analysis.
In a cohort study, P0 indicates the incidence of the outcome
of interest in the nonexposed group and P1 in the exposed group;
OR, odds ratio; and RR, risk ratio: OR=[P1/(1−P1)]/[P0/(1−P0)]; thus, P1/P0=OR/[(1−P0)+(P0×OR)]. Since RR=P1/P0, the corrected RR=OR/[(1−P0)+(P0×OR)].
We can use this formula to correct the adjusted odds ratio obtained from logistic regression and derive
an estimate of an association or treatment effect that better represents the
true relative risk. It can also be used to correct the lower and upper limits
of the confidence interval by applying this formula to the lower and upper
confidence limits of the adjusted odds ratio. In the above example, after
the odds ratio is corrected (where OR=3.27 and P0=0.24), the risk
ratio becomes 2.12 (95% confidence interval, 1.25-2.98), ie, very low birthweight
neonates in Hospital A had twice the risk of neonatal death compared with those in Hospital B.
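The worked example can be reproduced in a few lines. This is a minimal sketch under the assumptions stated in the text (OR=3.27, 95% confidence limits 1.35 and 7.92, and P0=0.24 taken as the mortality in Hospital B); the function name `corrected_risk_ratio` is ours and purely illustrative.

```python
def corrected_risk_ratio(odds_ratio, p0):
    """Approximate a risk ratio from an odds ratio.

    Applies RR = OR / [(1 - P0) + (P0 x OR)], where p0 is the incidence
    of the outcome in the nonexposed group.
    """
    if not 0 <= p0 < 1:
        raise ValueError("p0 must satisfy 0 <= p0 < 1")
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    return odds_ratio / ((1 - p0) + p0 * odds_ratio)


P0 = 0.24  # neonatal mortality in Hospital B, the reference group

# The same formula is applied to the point estimate and to each
# confidence limit of the adjusted odds ratio.
rr = corrected_risk_ratio(3.27, P0)
lower = corrected_risk_ratio(1.35, P0)
upper = corrected_risk_ratio(7.92, P0)

print(f"corrected RR = {rr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
# prints: corrected RR = 2.12 (95% CI, 1.25-2.98)
```

When P0 approaches 0 the denominator approaches 1, so the corrected value stays close to the odds ratio itself, consistent with the rare-outcome approximation.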
To examine the validity of this correction method in various scenarios,
we simulated a series of hypothetical cohorts based on predetermined risk
ratios (called true RR). Each cohort consists of 1000 subjects with 1 binary
outcome (0,1), 1 exposure variable (0,1), and 2 confounders. Both confounders
have 3 levels (1,2,3). The true risk ratio is kept constant across strata
of the confounders. As expected, with an increase in incidence of outcome
and risk ratio, the discrepancy between risk ratio and odds ratio increases
(Table 1). The corrected risk
ratio, which is calculated based on the odds ratio from logistic regression
after having adjusted for the confounders, is very close to the true risk
ratio. This procedure can be applied to both unmatched and matched cohort
studies. It can further be used in cross-sectional studies, in which the prevalence
ratio rather than the risk ratio will be generated. It enables us to obtain
a corrected prevalence ratio very close to the one obtained from a complex
statistical model6 (data not shown).
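A much reduced version of this validity check can be sketched as follows. It is not the simulation reported in Table 1, and it rests on several simplifying assumptions of ours: a single confounder with 3 levels instead of 2 confounders, expected cell counts instead of randomly sampled subjects, equal group sizes, and a Mantel-Haenszel pooled odds ratio standing in for the adjusted odds ratio from logistic regression. All names and numbers are hypothetical.

```python
def corrected_risk_ratio(odds_ratio, p0):
    """RR = OR / [(1 - P0) + (P0 x OR)], with p0 the nonexposed incidence."""
    return odds_ratio / ((1 - p0) + p0 * odds_ratio)


def mantel_haenszel_or(strata):
    """Pooled odds ratio over strata of (a, b, c, d) cell counts.

    a: exposed with outcome      b: exposed without outcome
    c: nonexposed with outcome   d: nonexposed without outcome
    Used here only as a stand-in for a regression-adjusted odds ratio.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


TRUE_RR = 2.0                       # kept constant across strata
N_PER_GROUP = 1000                  # exposed and nonexposed subjects per stratum
BASELINE_RISKS = (0.10, 0.20, 0.30)  # nonexposed incidence per confounder level

strata = []
for p0 in BASELINE_RISKS:
    p1 = TRUE_RR * p0
    strata.append((
        p1 * N_PER_GROUP,        # a: expected exposed cases
        (1 - p1) * N_PER_GROUP,  # b: expected exposed noncases
        p0 * N_PER_GROUP,        # c: expected nonexposed cases
        (1 - p0) * N_PER_GROUP,  # d: expected nonexposed noncases
    ))

adjusted_or = mantel_haenszel_or(strata)
overall_p0 = (sum(c for _, _, c, _ in strata)
              / sum(c + d for _, _, c, d in strata))
approx_rr = corrected_risk_ratio(adjusted_or, overall_p0)

print(f"true RR={TRUE_RR:.2f}  adjusted OR={adjusted_or:.2f}  "
      f"corrected RR={approx_rr:.2f}")
```

In this toy cohort the pooled odds ratio lies well above the true risk ratio, while the corrected value lies much closer to it. The correction is exact only within a single stratum, so some residual difference is expected once strata with different baseline risks are pooled.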
Due to the differences in underlying assumptions between the Mantel-Haenszel
risk ratio and the logistic regression odds ratio, some discrepancy between the
Mantel-Haenszel risk ratio and the corrected risk ratio is expected (detailed
discussion of which is beyond the scope of this work). More importantly, the
validity of the corrected risk ratio relies entirely on the appropriateness
of the logistic regression model, ie, only when logistic regression yields an
appropriate odds ratio will the correction procedure provide a better estimate.
Therefore, in a cohort study, whenever feasible, the Mantel-Haenszel estimate
should be used.
In summary, in a cohort study, if the incidence of outcome is more than
10% and the odds ratio is more than 2.5 or less than 0.5, correction of the
odds ratio may be desirable to more appropriately interpret the magnitude
of an association.