Commentary

Enhancing the Feasibility of Large Cohort Studies

Teri A. Manolio, MD, PhD; Rory Collins, FMedSci

Author Affiliations: Office of Population Genomics, National Human Genome Research Institute, Bethesda, Maryland (Dr Manolio); and Clinical Trial Service Unit and Epidemiological Studies Unit, University of Oxford, Oxford, United Kingdom (Dr Collins).


JAMA. 2010;304(20):2290-2291. doi:10.1001/jama.2010.1686

The identification of many hundreds of genetic variants associated with complex diseases and their potential for interactions with environmental factors have increased the need for prospective cohort studies involving several hundred thousand participants.1,2 Costs of such studies under conventional funding models are high because typically they are conducted by consortia of academic centers, each responsible for recruitment, examination, and follow-up of a subcohort of participants in its geographic area.3,4 The costs and inefficiencies of a 100-fold expansion of these standard models can be prohibitive; large studies should not be viewed simply as small studies made large. Rather, they require fundamentally different approaches in which minimizing cost is a primary consideration, and process proficiency to maximize efficiency is as important as scientific expertise.

UK Biobank is a large prospective study that relies on a centralized strategy for nearly all aspects of its conduct.5 This strategy, which UK Biobank adopted after rejecting a decentralized approach due to excessive cost, has achieved exceptional efficiencies while retaining scientific rigor. The main phase of recruitment started in April 2007 after a successful integrated pilot of the recruitment and assessment processes. The recruitment target of 500 000 individuals aged 40 to 69 years was achieved in July 2010, approximately 18 months ahead of schedule and within budget.5 The UK Medical Research Council and Wellcome Trust charity are the chief funders,5 and the total cost of recruitment and baseline assessments, along with establishing the sample and data storage infrastructure, was approximately $100 million. Annual costs for the subsequent phase of health outcome follow-up and adjudication, as well as maintenance of the sample store and development of the information technology systems to facilitate use by researchers, are estimated to be approximately $7 million. The success of the UK Biobank model may provide valuable lessons for the efficient operation of large prospective studies in the United States.
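To give a sense of scale, the approximate totals quoted above can be expressed per participant. The following is only an illustrative sketch: it divides the rounded figures from this commentary by the recruitment target, and the function and constant names are assumptions introduced here, not part of any UK Biobank accounting.

```python
def cost_per_participant(total_cost_usd: float, participants: int) -> float:
    """Return the average cost per participant for an approximate total."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    return total_cost_usd / participants


# Rounded figures quoted in the text (approximations, not audited costs).
COHORT_SIZE = 500_000
BASELINE_COST_USD = 100_000_000      # recruitment, baseline assessment, storage
ANNUAL_FOLLOW_UP_USD = 7_000_000     # follow-up, adjudication, maintenance

baseline_per_person = cost_per_participant(BASELINE_COST_USD, COHORT_SIZE)
follow_up_per_person_year = cost_per_participant(ANNUAL_FOLLOW_UP_USD, COHORT_SIZE)

print(f"Baseline: about ${baseline_per_person:,.0f} per participant")
print(f"Follow-up: about ${follow_up_per_person_year:,.0f} per participant per year")
```

On these rounded inputs, the baseline phase works out to roughly $200 per participant and follow-up to roughly $14 per participant per year.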

Centralized models of recruitment, data collection, sample processing, and follow-up can increase efficiency while promoting standardization, thus meeting both process and scientific imperatives. Rather than establishing and maintaining dozens or even hundreds of individual assessment centers, each of which must become expert in all study aspects and conduct them in (ideally) identical fashion throughout the study, participant recruitment can be focused in specific locales for a set period and then shifted to other areas. Cost efficiency can be achieved if assessment centers are inexpensive to establish and dismantle, temporary staff can be rapidly hired and trained, sizeable numbers of interested participants can easily reach the site, and economical space with good transport links is available for the period needed to “exhaust” an area of willing participants.

In UK Biobank, it was possible to meet all these conditions while maintaining good-quality data collection. About 6 assessment centers were in operation at any one time, each recruiting approximately 100 participants per day for about 6 months, and a total of about 20 centers were required to recruit the complete cohort.5 In contrast to the distributed models typical of smaller studies, daily transmission of the data to the UK Biobank coordinating center facilitated real-time monitoring based on stable estimates of expected means, variances, missing rates, etc, allowing important aberrations to be detected and corrected rapidly. Centralizing responsibility for such day-to-day operations in skilled project managers has thus freed UK Biobank investigators to focus on science rather than being mired in logistical details.
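The kind of daily monitoring described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not UK Biobank's actual system: the expected mean, standard deviation, missing-rate limit, and z-score threshold are hypothetical values chosen for the example, as are all names in the code. Each day's batch from a center is compared with the expected profile, and a flag is raised when the missing rate is too high or the batch mean lies too far from the expected mean.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ExpectedProfile:
    """Expected behaviour of one measurement (hypothetical values)."""
    mean: float
    sd: float
    max_missing_rate: float


def check_daily_batch(
    values: Sequence[Optional[float]],
    expected: ExpectedProfile,
    z_threshold: float = 3.0,
) -> List[str]:
    """Return human-readable flags for one center's daily batch."""
    if expected.sd <= 0:
        raise ValueError("expected.sd must be positive")
    if not values:
        return ["no data received"]

    flags: List[str] = []
    observed = [v for v in values if v is not None]

    # Missing values are represented as None in this sketch.
    missing_rate = 1 - len(observed) / len(values)
    if missing_rate > expected.max_missing_rate:
        flags.append(f"missing rate {missing_rate:.0%} exceeds limit")

    if observed:
        batch_mean = sum(observed) / len(observed)
        # Standard error of the mean for a batch of this size.
        standard_error = expected.sd / sqrt(len(observed))
        z = (batch_mean - expected.mean) / standard_error
        if abs(z) > z_threshold:
            flags.append(f"batch mean {batch_mean:.1f} is off target (z = {z:.1f})")
    return flags


# Hypothetical profile for systolic blood pressure in mm Hg.
profile = ExpectedProfile(mean=138.0, sd=19.0, max_missing_rate=0.05)

typical_day = [130.0, 140.0, 146.0, 136.0] * 25          # 100 participants
miscalibrated_day = [v + 10.0 for v in typical_day]      # device reads high
sparse_day = typical_day[:80] + [None] * 20              # 20% missing

for label, batch in [("typical", typical_day),
                     ("miscalibrated", miscalibrated_day),
                     ("sparse", sparse_day)]:
    print(label, check_daily_batch(batch, profile))
```

With about 100 participants per center per day, even a modest systematic shift in a measurement produces a large z score, which is why stable expected values make aberrations detectable within a day or two.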

Centralized models may also have drawbacks. Experienced investigators may have well-functioning recruitment and follow-up systems within their communities, as well as an understanding of unique local conditions. Academic centers accustomed to operational leadership in their geographic area may feel disenfranchised, risking the loss of their scientific and logistical input. Encouraging such investigators to provide procedural insights and advice for a centralized approach may make best use of their expertise, while streamlining lines of operational responsibility. Collaborating investigators can also assume primary responsibility for specific centralized aspects of study operations, such as enhancing recruitment of underrepresented groups, responding to participant or community concerns, or developing systems for assessment of health outcomes.

Identifying disease events after the baseline assessment is as critical to the success of prospective studies as recruiting the cohort in the first place. Ensuring effective follow-up has been a major reason for establishing semipermanent, localized study centers closely associated with area hospital record systems. In settings with 1 or only a few major sources of care and comprehensive medical records, follow-up may be possible with minimal participant input through record surveillance alone. Regional or national health care systems, such as the British National Health Service (NHS), can greatly simplify follow-up for clinically detected events, although this will miss asymptomatic outcomes and may misclassify those with incorrect clinical diagnoses. UK Biobank will be relying on NHS records for disease ascertainment with subsequent centralized adjudication, but US systems are fragmentary, nonstandardized, and challenging to access. If centralization is to work optimally, remote ascertainment of disease outcomes without ongoing contact with participants will almost certainly be necessary. This presents a considerable obstacle to large cohort studies in areas without central health data systems, although e-mail or Internet re-contact may become increasingly feasible. Studies in the United States will be greatly facilitated by the development of standardized electronic medical records nationwide.6 Embedding participant recruitment in an infrastructure for follow-up, as in studies conducted through established health care systems in the United States,7,8 necessarily constrains the population from which a study can draw and may limit the diversity of the resulting cohort.

A key consideration in limiting costs of large prospective studies is the vigor with which a high participation rate is pursued. Nationally representative surveys such as the National Health and Nutrition Examination Survey and disease-specific studies such as the Cardiovascular Health Study3 provide valuable population-based estimates of disease prevalence and incidence that require high response rates. Prospective cohorts need not, however, be representative of a population to be generalizable; for example, the British Doctors' study provided valuable insights on the disease risks due to smoking for the general population,9 and the Framingham study has provided information about blood pressure and cholesterol that goes well beyond that one small Massachusetts town.10 If a cohort study focusing on disease risk associations has a sufficiently large base population and captures a diversity of exposures and backgrounds, the results can still be applicable to populations with different distributions of these exposures. For these reasons, UK Biobank chose to emphasize diversity rather than participation rates and accepted yields of roughly 10%.5 If enough potential invitees are available, substantial savings can be realized by not attempting to persuade undecided individuals to join.
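The trade-off between participation rate and the size of the invited population is simple arithmetic, sketched below. The 10% yield and the 500 000 target come from the text; the other rates are hypothetical comparisons, and the function name is an assumption introduced here. The sketch says nothing about the cost per invitation, which the commentary does not report.

```python
def invitations_needed(target_cohort: int, participation_percent: int) -> int:
    """Smallest number of invitations expected to yield the target cohort.

    Uses integer ceiling division so that the result is exact.
    """
    if target_cohort < 0:
        raise ValueError("target_cohort must be non-negative")
    if not 0 < participation_percent <= 100:
        raise ValueError("participation_percent must be in (0, 100]")
    return -(-target_cohort * 100 // participation_percent)


TARGET = 500_000

# 10% is the approximate yield quoted in the text; 30% and 70% are
# hypothetical rates included only for comparison.
for percent in (10, 30, 70):
    needed = invitations_needed(TARGET, percent)
    print(f"{percent}% participation: about {needed:,} invitations")
```

At a yield of roughly 10%, a cohort of 500 000 implies on the order of 5 million invitations, which is why this strategy depends on a large pool of potential invitees.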

The need for large prospective cohorts to assess genetic and environmental factors reliably requires nearly unprecedented levels of cost efficiency. The novel approaches successfully used by UK Biobank may not be directly transferable to all settings in the United States, where infrastructures for recruitment and follow-up differ, and diversity and distances are greater. Careful assessment and piloting will be needed to determine the feasibility of such models in the United States. Several large-scale US efforts are under way, including major initiatives by Kaiser Permanente and the US Department of Veterans Affairs. If these large US cohort efforts are to be successful, lessons learned from approaches used by the UK Biobank may help point the way.

Corresponding Author: Teri A. Manolio, MD, PhD, Office of Population Genomics, National Human Genome Research Institute, Bldg 31, Room 4B-09, 31 Center Dr, MSC 2154, Bethesda, MD 20892-2154 (manolio@nih.gov).

Financial Disclosures: None reported.

Additional Information: This commentary evolved from the deliberations of a symposium convened by the National Institutes of Health on January 22, 2010, to examine new models for conducting large-scale prospective cohort studies.

References

1. Collins FS. The case for a US prospective cohort study of genes and environment. Nature. 2004;429(6990):475-477.
2. Manolio TA, Bailey-Wilson JE, Collins FS. Genes, environment and the value of prospective cohort studies. Nat Rev Genet. 2006;7(10):812-820.
3. Fried LP, Borhani NO, Enright P, et al. The Cardiovascular Health Study: design and rationale. Ann Epidemiol. 1991;1(3):263-276.
4. The Women's Health Initiative Study Group. Design of the Women's Health Initiative clinical trial and observational study. Control Clin Trials. 1998;19(1):61-109.
5. UK Biobank. Welcome to UK Biobank. http://www.ukbiobank.ac.uk. Accessed November 1, 2010.
6. US Department of Health and Human Services. Nationwide health information network: overview. Office of the National Coordinator for Health Information Technology Web site. http://www.healthit.hhs.gov/portal/server.pt?open=512&objID=1142&parentname=CommunityPage&parentid=25&mode=2&in_hi_userid=11113&cached=true. Accessed October 27, 2010.
7. McCarty CA, Wilke RA, Giampietro PF, Wesbrook SD, Caldwell MD. Marshfield Clinic Personalized Medicine Research Project (PMRP): design, methods and recruitment for a large population-based biobank. Per Med. 2005;2:49-79.
8. Roden DM, Pulley JM, Basford MA, et al. Development of a large-scale de-identified DNA biobank to enable personalized medicine. Clin Pharmacol Ther. 2008;84(3):362-369.
9. Doll R, Peto R, Boreham J, Sutherland I. Mortality from cancer in relation to smoking: 50 years observations on British doctors. Br J Cancer. 2005;92(3):426-429.
10. Kannel WB, Dawber TR, Kagan A, Revotskie N, Stokes J III. Factors of risk in the development of coronary heart disease—six year follow-up experience: the Framingham Study. Ann Intern Med. 1961;55:33-50.
