Author Affiliations: Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts (Dr Navathe); Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia (Dr Navathe); and Department of Health and Human Services, Washington, DC (Drs Navathe, Clancy, and Glied).
Patient-centered outcomes research, which aims to assist clinicians and patients in making informed decisions regarding prevention, diagnosis, and treatment, is essential for improving the delivery of quality health care. Much of patient-centered outcomes research relies on observational and quasi-experimental methods applied to data generated as a byproduct of providing care. While existing data sources have improved, there remain important data-related barriers to rapid, efficient research. Recent changes in the policy environment, coupled with significant technological progress, provide an opportunity to surmount some of these obstacles.
Achieving the goals of patient-centered outcomes research (research on priority populations, interventions, and conditions) requires an accessible, integrated infrastructure that spans clinical and administrative data systems. The current infrastructure reflects the fragmentation of care delivery: it neither allows the repurposing of clinical data to answer research questions nor enables effective and efficient linkage of clinical data with nonclinical data. To generate meaningful patient-centered outcomes research, researchers need higher-quality data, including greater clinical detail, longitudinal follow-up, and linkages among data sets. Access to these resources should be timely and cost-effective while protecting the confidentiality and privacy of patients.
Given the significant role that Medicare and Medicaid play in the US health system, it is imperative that data from these programs be available for patient-centered outcomes research. The Department of Health and Human Services is taking steps to make data more readily available to researchers and to those involved in quality improvement, including waiving fees for projects funded by the American Recovery and Reinvestment Act of 2009. Relatedly, the Centers for Medicare & Medicaid Services recently released a proposed rule that would make Medicare claims data available to qualified entities for combining with private sector data, provided those entities demonstrate specified capabilities, including protection of beneficiary privacy and confidentiality.
Moving forward, 3 key principles must be upheld in considering data infrastructure enhancements. First, patient privacy and confidentiality must be protected. Second, the data infrastructure also must ensure the confidentiality of individual data providers. Third, multiple types of users (academic, commercial, and others) must be able to use the infrastructure.
Two technological advances provide new opportunities to address challenges while upholding these key principles. The first is virtual research data access, which allows researchers to analyze data without possessing them (ie, to ask questions of the data set without physically moving the data from a secure location). Researchers submit computer programming code with statistical analyses, view the results, and generate graphs just as if the data set resided on a local computer or server. Privacy is safeguarded by software programs that scan the results and remove information that can be used to identify patients. Furthermore, these privacy settings are customizable, and institutions and researchers who are better able to ensure confidentiality will be afforded greater flexibility. For example, results for a research enterprise with sophisticated information technology security might include the names of counties, whereas a less secure user may only see in which state a particular claim took place.
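The tiered privacy settings described above can be illustrated with a minimal Python sketch. It assumes a hypothetical two-tier scheme in which the user's security tier determines the geographic level shown, and it adds small-cell suppression as one plausible way for software to remove potentially identifying results. The tier names, the threshold of 11, and the function name are illustrative assumptions, not details of any actual system.

```python
from collections import Counter

# Hypothetical tiers: the finest geographic field each kind of user may see.
GEOGRAPHY_BY_TIER = {"high_security": "county", "standard": "state"}

# Assumed suppression threshold; counts below it are withheld.
MIN_CELL_SIZE = 11


def tabulate_claims(claims, tier):
    """Count claims at the geographic level allowed for this tier,
    replacing small counts with None so they cannot identify patients."""
    field = GEOGRAPHY_BY_TIER[tier]
    counts = Counter(claim[field] for claim in claims)
    return {
        area: (n if n >= MIN_CELL_SIZE else None)
        for area, n in counts.items()
    }


# Illustrative claims; in practice these never leave the secure location.
claims = (
    [{"state": "PA", "county": "Philadelphia"}] * 15
    + [{"state": "PA", "county": "Forest"}] * 3
    + [{"state": "MA", "county": "Suffolk"}] * 12
)

state_view = tabulate_claims(claims, "standard")
county_view = tabulate_claims(claims, "high_security")
```

In this sketch the less secure user sees only state totals, while the more secure user sees county detail with the 3-claim county suppressed.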
Analytic tools can further enhance the potential of data access in virtual research. Advances in software for business intelligence (analysis of up-to-the-minute data to inform business decisions) can be applied to health data for research. These packages can facilitate the early stages of the research process by enabling investigators to visualize data trends and perform simplified analyses toward formulating research questions and generating hypotheses. These tools allow researchers to refine their data requests before applying for raw data extracts, thereby reducing the amount of data shared and the level of privacy risk.
The second key opportunity to advance research data infrastructure is data sharing through formal distributed data networks. A key challenge in providing researchers access to data across settings and sources is that this approach has traditionally required physically locating all of the data on the same server. Doing so increases privacy risk and also infringes upon the business interests of commercial entities providing data.
Distributed data networks hold great potential because they circumvent this fundamental problem. They enable access to diverse data as if they are integrated, while permitting data to remain in their original secure locations. This capability is made possible by technologies that enable secure access to the various data sources, convert the disparate data into a common format, integrate the data into one “small” database, remove information that can be used for identification, and perform all of these actions on a request-by-request basis. A central coordinating center could connect all of the data sources (health plans and providers such as hospitals and physician practices), broker agreements between data partners and researchers, ensure the security and integrity of statistical programming code, protect the identities of data partners, and serve as the access point for researchers.
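The request-by-request process described above might be sketched as follows in Python. This is a simplified illustration under stated assumptions, not a description of any existing network: the class names, field names, and sample records are all hypothetical, and real networks add authentication, governance, and auditing that are omitted here.

```python
def to_common_format(record, field_map):
    """Keep only the shared fields, renamed to the common schema.
    Local identifiers are dropped because they are not in the map."""
    return {common: record[local] for common, local in field_map.items()}


class DataPartner:
    """A health plan or provider whose records stay at its own site."""

    def __init__(self, name, records, field_map):
        self.name = name
        self._records = records
        self._field_map = field_map  # common field name -> local field name

    def answer(self, predicate):
        """Return matching rows in common format, without identifiers."""
        rows = (to_common_format(r, self._field_map) for r in self._records)
        return [row for row in rows if predicate(row)]


class CoordinatingCenter:
    """Single access point that assembles a small data set per request."""

    def __init__(self, partners):
        self._partners = partners

    def request(self, predicate):
        combined = []
        for partner in self._partners:
            # Partner names are not attached to the returned rows.
            combined.extend(partner.answer(predicate))
        return combined


plan = DataPartner(
    "plan_a",
    [
        {"member_id": "A1", "age_yrs": 70, "dx": "diabetes", "paid": 1200.0},
        {"member_id": "A2", "age_yrs": 45, "dx": "asthma", "paid": 300.0},
    ],
    {"age": "age_yrs", "diagnosis": "dx", "cost": "paid"},
)
hospital = DataPartner(
    "hospital_b",
    [{"mrn": "H9", "patient_age": 66, "primary_dx": "diabetes", "charge": 2500.0}],
    {"age": "patient_age", "diagnosis": "primary_dx", "cost": "charge"},
)

center = CoordinatingCenter([plan, hospital])
small_db = center.request(lambda row: row["diagnosis"] == "diabetes")
```

The researcher receives only the small, request-specific data set in the common format; the source records, their local identifiers, and the partners' names stay behind the coordinating center.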
While these technological opportunities offer great potential, they do involve some challenges for researchers. For example, virtual data access will magnify the importance of research protocol development because researchers will have less opportunity for trial and error in their analysis of data sets. Some have argued that observational cohorts should be registered, as randomized controlled trials are, a concept that has not yet reached its full potential. Similarly, rerunning modified data requests through distributed data networks will be costly. One way to mitigate these problems may be through construction of dummy data sets on which to test programming code. These issues highlight the synergy between virtual data access and distributed data networks. Together they enable an interactive process between the data partners and researchers in defining data sets and performing analyses, an advantage that could extend beyond data infrastructure toward greater research partnerships in many areas.
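As an illustration of the dummy data set idea, the following Python sketch generates synthetic records that share a schema with the real data and uses them to check an analysis program before it is submitted. The schema, value ranges, and function names are hypothetical assumptions chosen for the example; the synthetic values carry no clinical meaning.

```python
import random

# Hypothetical schema: each field maps to a function that draws a fake value.
SCHEMA = {
    "age": lambda rng: rng.randint(18, 95),
    "diagnosis": lambda rng: rng.choice(["diabetes", "asthma", "heart failure"]),
    "cost": lambda rng: round(rng.uniform(50, 5000), 2),
}


def make_dummy_data(n_rows, seed=0):
    """Build reproducible synthetic rows with the same fields as the real data."""
    rng = random.Random(seed)
    return [
        {name: draw(rng) for name, draw in SCHEMA.items()}
        for _ in range(n_rows)
    ]


def mean_cost_by_diagnosis(rows):
    """Example analysis code a researcher would later submit unchanged."""
    totals = {}
    for row in rows:
        total, count = totals.get(row["diagnosis"], (0.0, 0))
        totals[row["diagnosis"]] = (total + row["cost"], count + 1)
    return {dx: total / count for dx, (total, count) in totals.items()}


dummy = make_dummy_data(200)
result = mean_cost_by_diagnosis(dummy)
```

Running the program on dummy data confirms that it executes and returns output of the expected shape, so that coding errors are caught before a costly request is made against the real data.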
Virtual data access and distributed data network technologies are now transitioning from pilot phases into implementation. The National Center for Health Statistics at the US Centers for Disease Control and Prevention, for example, already makes some data available through its research data centers. Many distributed data network initiatives, such as the HMO Research Network,1 are now under way.
The Department of Health and Human Services has begun several initiatives that push forward implementation of virtual data access and distributed data networks. In one innovative model of distributed data networks (the US Food and Drug Administration's Sentinel Initiative2), all analyses are performed at data partner sites within the protection of their information technology security; this approach is made possible by the latest statistical techniques. A similar project (the Multi-payer Claims Database of the Assistant Secretary for Planning and Evaluation and the Centers for Medicare & Medicaid Services) aims to create and operate a database built on a foundation of public and private payer claims data without patient-identifiable information. This project includes a limited central database of deidentified claims data and a more expansive distributed data network. To drive its own capabilities forward and maximize the privacy of its beneficiaries, the Centers for Medicare & Medicaid Services will participate as a data partner on the distributed network. The Multi-payer Claims Database user interface will enable researchers to perform basic analyses that refine their data needs without giving them access to detailed data. This project is actively exploring release of data through a virtual research data center model and will be completed by September 2013.
Patient-centered outcomes research depends on access to high-quality data to drive improved patient care, but that access cannot come at the expense of patient privacy. The latest developments in information technology (enabling the creation of distributed data networks and virtual data access) provide avenues to address important concerns and facilitate a renewed focus on public-private partnerships in the pursuit of the public good. Researchers and leaders of health care organizations should exploit and enhance these technologies to create a data infrastructure that supports patient-centered care, research, and rapid learning.
Corresponding Author: Amol S. Navathe, MD, PhD, Brigham and Women's Hospital, PBB-B4, 75 Francis St, Boston, MA 02115 (anavathe@partners.org).
Conflict of Interest Disclosures: All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest and none were reported.