Editorial

Procedure Volume as a Predictor of Surgical Outcomes

Edward H. Livingston, MD; Jing Cao, PhD

Author Affiliations: Division of Gastrointestinal and Endocrine Surgery, University of Texas Southwestern Medical Center, Dallas (Dr Livingston), and Department of Statistical Science, Southern Methodist University, Dallas (Dr Cao). Dr Livingston is also a Contributing Editor, JAMA.


JAMA. 2010;304(1):95-97. doi:10.1001/jama.2010.905

How do you get to Carnegie Hall? the old joke goes. Practice. Practice makes for great performance. Gladwell quantitated this notion, hypothesizing that it takes 10 000 hours of practice to be an elite performer in any activity requiring mechanical and technical skills.1 In a similar fashion, surgeons require a certain degree of experience to develop proficiency in performing an operation. However, problems arise when trying to define when a surgeon is competent to perform a difficult operation. How can technical competence be measured? Despite the great practical and societal importance of this question, it has yet to be answered in a satisfactory way.

Since practice makes for great performance, the number of procedures a surgeon performs should predict outcomes. More than 30 years ago, Luft et al2 promoted this concept by predicting that a surgeon's procedure volume will reflect his or her outcomes. Birkmeyer et al3,4 extended this idea and established a statistical relationship between surgical volume at either hospital level or individual surgeon level and patient outcomes.

With the development and availability of very large databases, health outcomes researchers could examine the relationship between procedure volume and outcomes easily, resulting in thousands of articles demonstrating that volumes were related to outcomes. Subsequently, many policy-generating organizations (eg, Leapfrog) recommend selective referral to high-volume hospitals for patients undergoing technically complex operations.5 Claims were made that this referral was worthwhile because even though receiving treatment at a distant facility might separate a patient from his or her social support system when that support was most needed, thousands of lives could be saved by selective referral.6

Despite numerous reports claiming that better outcomes occur at high-volume centers, little attention has been given to the quality of statistical analysis used to support these claims. Review of a large number of volume-outcome studies demonstrated numerous serious flaws in the methods used to study the volume-outcome association.7 Most studies examine the relationship between procedure volume and some predefined outcome, and if statistical significance is reached for the volume variable, the conclusion is that volume is importantly associated with outcomes. With large databases having commensurate huge sample sizes, statistical significance is almost assuredly achieved.
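The point about sample size can be illustrated with a minimal sketch. It assumes a pooled two-proportion z-test with equal group sizes and uses hypothetical mortality rates (2.0% vs 2.1%) that are not taken from any study; the function name is likewise illustrative. The same small difference is far from significant with 1000 patients per group and highly significant with 1 000 000 per group.

```python
import math


def two_proportion_p_value(rate_a: float, rate_b: float, n_per_group: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test (equal group sizes)."""
    pooled = (rate_a + rate_b) / 2
    std_err = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = abs(rate_a - rate_b) / std_err
    # Two-sided normal tail probability: P(|Z| > z) = erfc(z / sqrt(2))
    return math.erfc(z / math.sqrt(2))


# Hypothetical rates: the absolute difference is only 0.1 percentage points.
p_small = two_proportion_p_value(0.020, 0.021, n_per_group=1_000)
p_large = two_proportion_p_value(0.020, 0.021, n_per_group=1_000_000)

print(f"n = 1,000 per group:     p = {p_small:.3f}")
print(f"n = 1,000,000 per group: p = {p_large:.2e}")
```

The effect size is identical in both runs; only the sample size changes, which is why a significant P value alone says little about how much a variable matters.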

However, it is necessary to assess the relative importance of the volume variable.8 Because the volume effect is determined from a statistical model, the ability of the models to explain the phenomenon they purport to characterize must be evaluated. However, very few publications using regression analysis (eg, 3.6%) report on how well a model actually fits the data being studied.9 Procedure volume does not directly affect procedure results; rather, it is a marker for other processes that influence outcomes. As such, if volume is used in statistical models as a proxy variable, it must fit the criteria for a proxy variable: it must have a strong (ie, large effect) relationship with the outcome and must provide substantial explanation of the outcomes variance (ie, the statistical model adequately fits the data). These statistical issues have received little attention in the volume-outcome literature.

In this issue of JAMA, Thabut and colleagues10 investigate the sources of variation in survival outcomes among 61 US lung transplant centers that performed 15 642 adult lung transplantation procedures between 1987 and 2008. Although the causes for variation were not all identified, annual procedural volume was found to be statistically significantly related to outcomes but accounted for only 15% of variability among center survival rates.

Perhaps the most important aspect of the analysis by Thabut et al10 is the provision of guidance for how to quantify the degree to which procedure volume contributes to outcomes. Accordingly, several items should be considered when using regression modeling to study the relationship between procedure volumes and outcomes.

First, volume should be modeled as a continuous variable. Most volume-outcome studies use logistic regression analysis to examine how a set of variables might predict mortality or complications following an operation. One common mistake is to aggregate volumes into categories rather than capitalize on the greater explanatory power of using volume as a continuous variable. Aside from losing information that might affect regression results, categorization in this way has undesirable effects. For instance, if a variable is dichotomized, as was done in a study of bariatric surgery, there would be 2 groups: one with 2 to 149 cases per year and one with 150 cases or more per year. The “low-volume” group would include surgeons or centers that perform very few procedures along with those that perform appreciable numbers (eg, 100 per year).11 Very inexperienced surgeons are more likely to have increased complication rates and would be included in the low-volume category along with more experienced surgeons. The occasional low-volume bariatric surgeon may have a high complication rate, causing the entire low-volume group to have statistically significant bad outcomes. Very few outlying cases are required to shift this type of regression analysis to be statistically significant when volume is treated as categorical or dichotomous rather than as a continuous variable.11
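A small sketch can make the dichotomization problem concrete. The centers below are entirely hypothetical (invented volumes and death counts, not data from the bariatric study), and the 150-case cutoff is borrowed from the example above only for illustration. Two very-low-volume centers with poor results are enough to make the whole "low-volume" group look worse, even though the moderate-volume centers inside it perform as well as the high-volume group.

```python
# Hypothetical centers: (annual volume in cases, deaths per year).
# These numbers are invented for illustration only.
centers = [
    (10, 2), (20, 3),               # very low volume, poor outcomes
    (100, 2), (120, 2), (140, 3),   # moderate volume, about 2% mortality
    (150, 3), (200, 4), (300, 6),   # high volume, 2% mortality
]


def pooled_mortality(group):
    """Deaths divided by cases, pooled over all centers in the group."""
    cases = sum(volume for volume, _ in group)
    deaths = sum(died for _, died in group)
    return deaths / cases


CUTOFF = 150  # dichotomization threshold (cases per year)
low = [c for c in centers if c[0] < CUTOFF]
high = [c for c in centers if c[0] >= CUTOFF]
moderate = [c for c in low if c[0] >= 100]  # the low group minus the outliers

low_rate = pooled_mortality(low)
high_rate = pooled_mortality(high)
moderate_rate = pooled_mortality(moderate)

print(f"low-volume group (<{CUTOFF}):      {low_rate:.2%}")
print(f"high-volume group (>={CUTOFF}):    {high_rate:.2%}")
print(f"moderate centers only (100-149): {moderate_rate:.2%}")
```

Modeling volume as a continuous variable keeps the moderate-volume centers from being penalized for the results of the few outliers that share their category.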

Another problem associated with variable categorization is magnification of the apparent effect volume has on outcomes when expressed as an odds ratio.12 Because odds ratios are expressed in terms of the unit of measure, categorization makes the odds ratio appear larger than it is. When the volume is categorized, the odds ratio is expressed in terms of the entire category. In the bariatric surgery example, when volume was categorized into terciles comparing low volume (<50 cases per year) to high volume (>150 cases per year), the odds ratio for mortality in one model was 1.52 and became 2.15 when the center volumes were dichotomized to low volume (<125 cases per year) and high volume (≥125 cases per year).11 These findings would be interpreted as low volume being associated with a 152% increase in mortality in one instance and a 215% increase in another simply because the volume categories were defined differently in the 2 models. Reporting odds ratios for continuous variables would level the playing field, enabling comparison of effects across studies without introducing confusion related to volume categories of varying sizes.
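The dependence of an odds ratio on its unit of measure can be sketched numerically. In a logistic model with a continuous predictor, the log-odds are linear in that predictor, so an odds ratio per 1 case compounds to the per-case value raised to the power of the span. The per-case odds ratio of 1.005 used here is a hypothetical value chosen for illustration, and the function name is not from any library.

```python
import math


def odds_ratio_over_span(or_per_case: float, span_cases: float) -> float:
    """Odds ratio implied across a span of cases, given the per-case odds ratio.

    With a continuous predictor the log-odds are linear, so
    OR(span) = exp(beta * span) = or_per_case ** span.
    """
    beta = math.log(or_per_case)
    return math.exp(beta * span_cases)


# Hypothetical odds ratio for mortality per 1-case decrease in annual volume.
OR_PER_CASE = 1.005

for span in (1, 50, 100, 150):
    implied = odds_ratio_over_span(OR_PER_CASE, span)
    print(f"odds ratio across {span:>3} cases: {implied:.3f}")
```

The underlying effect is the same in every row; only the width of the comparison changes, which is why odds ratios from studies that use different category boundaries are not directly comparable.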

Second, model fit statistics should be reported. Statistical significance of a variable should never be the last step of data analysis. Rather, determination of a significant P value only indicates that the investigator needs to assess how important the variable is in explaining the outcome studied and test the statistical model to ensure that it adequately describes the phenomenon. Thabut et al10 displayed their univariate and multivariate regression studies in a table (see Table 2 in the article) along with results from likelihood ratio (LR) tests and Akaike information criteria (AIC) (which provides a measure of the contribution of each variable to the outcome). Displayed this way, the relative importance of all variables entered into the regression equation can be determined in terms of their ability to explain outcomes. For instance (as shown in Table 2 of the article by Thabut et al10 ), inclusion of procedure volume in a univariate analysis increased the LR by 56.6 whereas inclusion of the transplant recipient's functional status increased the LR by 115.8, suggesting that functional status has a larger contribution to outcomes than center volume. Similarly, in multivariate analysis, omission of the center volume variable from the model yields an AIC of 82 641 whereas the full model that includes all variables yields an AIC of 82 639. Center volume only reduced AIC by 2 whereas recipient functional status reduced it by 87, showing the differing magnitudes these variables play in explaining lung transplant survival. Volume has an effect on outcomes, but it is relatively small.
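The AIC comparison above reduces to simple arithmetic, sketched here. The standard definition AIC = 2k - 2 ln(L) is used, where k is the number of fitted parameters and L is the maximized likelihood. The two AIC values and the reduction of 87 for functional status are the figures quoted above; the function names and the log-likelihood in the first example are illustrative.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def aic_reduction(aic_without_variable: float, aic_full_model: float) -> float:
    """How much the AIC falls when the variable is included in the model."""
    return aic_without_variable - aic_full_model


# An extra parameter that does not improve the log-likelihood raises AIC by 2.
# The log-likelihood here is an arbitrary illustrative value.
print(aic(-100.0, n_params=4) - aic(-100.0, n_params=3))  # 2.0

# Figures quoted in the editorial for the multivariate model.
AIC_FULL = 82_639
AIC_WITHOUT_VOLUME = 82_641
FUNCTIONAL_STATUS_REDUCTION = 87

volume_reduction = aic_reduction(AIC_WITHOUT_VOLUME, AIC_FULL)
ratio = FUNCTIONAL_STATUS_REDUCTION / volume_reduction

print(volume_reduction)  # 2
print(ratio)             # 43.5
```

On this scale, the improvement attributable to functional status is more than 40 times the improvement attributable to center volume.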

Third, the volume-outcome analysis should include hierarchical modeling. Hierarchical modeling is now considered standard for volume-outcome studies to account for clustering of outcomes that occur within a hospital.13 With these models, the hospital is treated as a random effect, allowing for the differential influence a variable (such as volume or functional status) has on outcomes at any given hospital. Clustering of patient populations and hospital characteristics at individual hospitals influences regression analysis in ways that must be accounted for. Statistically, this is done by entering the hospital in the regression equation as a random variable with a mean of 0. The resultant standard deviation provides a sense for how much variation in outcomes exists between hospitals. Thabut et al10 capitalized on this by assessing how this variation was affected by accounting for the center volume. The analysis (detailed in the eAppendix accompanying the article) involved calculating how much the variance changed when the volume variable was excluded from the regression equation. When this was done, Thabut et al10 showed that only 15% of the variance in lung transplant outcomes between centers could be accounted for by annual procedure volume.
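One common way to express this kind of result is the proportional change in between-center variance when a covariate is added to the model. The sketch below assumes that formulation; it does not reproduce the calculation in the eAppendix, and the two variance values are hypothetical numbers chosen only so that the result matches the 15% figure.

```python
def proportion_of_variance_explained(var_without: float, var_with: float) -> float:
    """Share of between-center variance accounted for by a covariate.

    var_without: random-effect variance from the model that omits the covariate
    var_with:    random-effect variance from the model that includes it
    """
    if var_without <= 0:
        raise ValueError("var_without must be positive")
    return (var_without - var_with) / var_without


# Hypothetical between-center variances (on the log-hazard scale).
VAR_WITHOUT_VOLUME = 0.040
VAR_WITH_VOLUME = 0.034

share = proportion_of_variance_explained(VAR_WITHOUT_VOLUME, VAR_WITH_VOLUME)
print(f"share of between-center variance explained by volume: {share:.0%}")
```

A share of this size leaves most of the variation between centers unexplained by volume, which is the point the hierarchical analysis makes.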

How does such a small effect become statistically significant? Thabut et al10 provided a plot of the hazard ratio (HR) for death for each hospital (Figure 3 in their article). Centers with an HR more than 1 have an increased risk of death after transplantation. Among the 15 highest volume centers, 6 centers have a significant HR below 1 (the entire 95% confidence interval is less than 1) and no center has an HR significantly above 1. In contrast, among the other 46 centers with moderate or small volume, 4 centers have a significant HR less than 1 and 7 centers have a significant HR greater than 1. However, among the 15 centers ranked lowest by volume, none has a significant HR above 1 or below 1. These observations indicate that very-high-volume centers have a better performance than the other centers; that the outcomes from very-low-volume centers are comparable on average with those from the other centers; and that there is substantial variation regarding the outcome in the moderate-volume centers, which are more than 50% of all the centers considered in the study. Moreover, the outcome of these centers is quite random (with little association with center volume) and is not above the average level. Thus, the conclusion is that the overall significance of center volume in the model is mainly influenced by the 15 very-high-volume centers. For the majority of centers with a moderate or small volume, the positive association between center volume and center outcome does not necessarily hold. This demonstrates how the results of a few hospitals can disproportionally influence a regression analysis and result in statistically significant findings when the actual effect of volume on outcomes is small.

Despite the overwhelming number of publications supporting a volume-outcome relationship, the vast majority are methodologically flawed,7 precluding acceptance of volume as a basis for selective patient referral. Given the absence of agreed-on standards for the statistical approach necessary for performing this type of analysis, the following approach is suggested. Volume should only be entered into regression analysis as a continuous variable. Risk adjustment variables must have proof that they are contextually relevant to the adverse outcomes associated with a procedure and that the risk adjustment variable is quantitatively important in explaining some proportion of explained variance of the resultant statistical model. Goodness of fit of the model should be reported, along with some measure of the importance of volume (such as AIC or LR testing). The proportion of variance explained by procedure volume should be reported to demonstrate the relative importance volume has in explaining outcomes relative to other potential sources for that variation. There should be graphical representation of the association between the outcome (ie, HR) and volume. Such presentation can reveal whether the association holds for the entire volume range or only a small volume range; graphical representation of the influence procedure volume has relative to other potential explanatory variables (eg, using the rank hazard plot) can also be helpful to demonstrate the influence of the volume variable on the ability of the model to predict outcomes relative to other factors.

Including these approaches in the analysis and reporting of volume-outcome studies will help improve the quality of this literature. Providing detailed information that reveals the actual effects of volume on outcomes could help avoid adoption of policies that might needlessly direct patients away from health care facilities closer to their base of social support without a substantial benefit.

AUTHOR INFORMATION

Corresponding Author: Edward H. Livingston, MD, Division of Gastrointestinal and Endocrine Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390-9156 (edward.livingston@utsouthwestern.edu).

Financial Disclosures: Dr Livingston reported that in 2008 he received an honorarium from Allergan for attending an advisory board meeting. He reported that he has had no direct or indirect financial relationship with Allergan before that single meeting and has had none since. No other disclosures were reported.

Funding/Support: This work was supported by the Hudson-Penn Endowment Funds and grant 5 PL1 DK081183 from the National Institutes of Health.

Role of the Sponsor: The funding sources had no role in the preparation, review, or approval of the manuscript.

Editorials represent the opinions of the authors and JAMA and not those of the American Medical Association.

1. Gladwell M. Outliers: The Story of Success. New York, NY: Little Brown; 2008.
2. Luft HS, Bunker JP, Enthoven AC. Should operations be regionalized? the empirical relation between surgical volume and mortality. N Engl J Med. 1979;301(25):1364-1369.
3. Birkmeyer JD, Siewers AE, Finlayson EV, et al. Hospital volume and surgical mortality in the United States. N Engl J Med. 2002;346(15):1128-1137.
4. Birkmeyer JD, Stukel TA, Siewers AE, Goodney PP, Wennberg DE, Lucas FL. Surgeon volume and operative mortality in the United States. N Engl J Med. 2003;349(22):2117-2127.
5. Evidence-based hospital referral. Leapfrog Group. http://www.leapfroggroup.org/media/file/Leapfrog-Evidence-Based_Hospital_Referral_Fact_Sheet.pdf. Accessed June 7, 2010.
6. Dudley RA, Johansen KL, Brand R, Rennie DJ, Milstein A. Selective referral to high-volume hospitals: estimating potentially avoidable deaths. JAMA. 2000;283(9):1159-1166.
7. Halm EA, Lee C, Chassin MR. Is volume related to outcome in health care? a systematic review and methodologic critique of the literature. Ann Intern Med. 2002;137(6):511-520.
8. Livingston EH, Elliot A, Hynan L, Cao J. Effect size estimation: a necessary component of statistical analysis. Arch Surg. 2009;144(8):706-712.
9. Beyene J, Atenafu EG, Hamid JS, To T, Sung L. Determining relative importance of variables in developing and validating predictive models. BMC Med Res Methodol. 2009;9:64.
10. Thabut G, Christie JD, Kremers WK, Fournier M, Halpern SD. Survival differences following lung transplantation among US transplant centers. JAMA. 2010;304(1):53-60.
11. Livingston EH, Elliott AC, Hynan LS, Engel E. When policy meets statistics: the very real effect that questionable statistical analysis has on limiting health care access for bariatric surgery. Arch Surg. 2007;142(10):979-987.
12. Altman DG, Lausen B, Sauerbrei W, Schumacher M. Dangers of using optimal cutpoints in the evaluation of prognostic factors. J Natl Cancer Inst. 1994;86(11):829-835.
13. Panageas KS, Schrag D, Riedel E, Bach PB, Begg CB. The effect of clustering of outcomes on the association of procedure volume and surgical outcomes. Ann Intern Med. 2003;139(8):658-665.
