Special Communication

Implications of the Human Genome for Understanding Human Biology and Medicine

G. Subramanian, MD, PhD; Mark D. Adams, PhD; J. Craig Venter, PhD; Samuel Broder, MD
JAMA. 2001;286(18):2296-2307. doi:10.1001/jama.286.18.2296.

Clinical researchers, practicing physicians, patients, and the general public now live in a world in which the 2.9 billion nucleotide codes of the human genome are available as a resource for scientific discovery. Some of the findings from the sequencing of the human genome were expected, confirming knowledge presaged by many decades of research in both human and comparative genetics. Other findings are unexpected in their scientific and philosophical implications. In either case, the availability of the human genome is likely to have significant implications, first for clinical research and then for the practice of medicine. This article provides our reflections on what the new genomic knowledge might mean for the future of medicine and how the new knowledge relates to what we knew in the era before the availability of the genome sequence. In addition, practicing physicians in many communities are traditionally also ambassadors of science, called on to translate arcane data or the complex ramifications of biology into a language understood by the public at large. This article also may be useful for physicians who serve in this capacity in their communities. We address the following issues: the number of protein-coding genes in the human genome and certain classes of noncoding repeat elements in the genome; features of genome evolution, including large-scale duplications; an overview of the predicted protein set to highlight prominent differences between the human genome and other sequenced eukaryotic genomes; and DNA variation in the human genome. In addition, we show how this information lays the foundations for ongoing and future endeavors that will revolutionize biomedical research and our understanding of human health.

Figures

Figure 1. Example of Segmental Duplication Between Chromosomes in the Human Genome1
Schematic of a large duplicated segment between chromosomes 18 (18q22) and 20 (20q13) to show examples of the genes and their predominantly colinear distribution on both duplicated segments, with the gene names of 7 of the 56 gene pairs shown. The chromosome 18 segment represents 13 million base pairs (bp) of genomic DNA sequence, whereas the chromosome 20 segment represents 1.4 million bp of genomic DNA. These genes represent a diverse set of proteins, including nuclear transcription factors (ZNF236 and Kruppel-related: Kruppel family transcription factors; NFATC1 and NFAT-related: nuclear factor of activated T-cells; GATA6 and GATA-related: GATA transcription factors; TALE homeobox family members, involved in nuclear protein transcription) as well as potassium channel-related factors (KCNG1 and KCNG2: potassium voltage-gated channels, subfamily G); RAB31 and Ras (RAB)-related: ras oncogene superfamily, involved in protein trafficking. The precise clinical associations of these proteins with human disease remain to be ascertained, though other members of these protein classes have been implicated in developmental and cardiovascular conduction abnormalities, for example.49
Figure 2. Duplications Within the Genome
Segmental duplications comparable to those in chromosomes 18 and 20 (see Figure 1) occur throughout the human genome. Chr indicates chromosome.
Figure 3. Prominent Differentiating Features in the Domain Architectures of Representative Human Proteins
A protein domain is a structural and functional unit that shows evolutionary conservation and, by convention, is represented as a distinct geometric shape. Thus, proteins are made up of 1 or more such building blocks or "domains" and, depending on the types and numbers of domains, proteins with different biological capabilities are created. Many of these domains have seemingly arbitrary nomenclature that, in many cases, reflects the experimental nuances of their initial description. A library of curated protein domains with their biological descriptions is available through the Pfam52 and SMART53 databases.
A, The extensive domain shuffling seen in the plasma proteases of the coagulation and complement systems. The "ancient" trypsin family serine protease domain occurs in combination with a myriad of protein interaction domains. Most of these domains are evolutionarily ancient; that is, with the exception of the Gla domain (see below), they are also observed in the fly and the worm. These include: (1) AP: Apple, originally described in the coagulation factors, predicted to possess protein- and/or carbohydrate-binding functions; (2) Kr: Kringle, named after a Danish pastry, has an affinity for lysine-containing peptides; (3) E: epidermal growth factor (EGF)-like; (4) CUB: domain first described in complement proteins and a diverse group of developmental proteins; (5) CCP: complement control protein repeats, also known as "sushi" repeats, first recognized in the complement proteins; and (6) Gla: a hyaluron-binding domain, contains γ-carboxyglutamate residues, and is seen in proteins associated with the extracellular matrix. Of note is the observation that apolipoprotein (a) likely represents a primate-specific evolutionary event. There is a tremendous expansion of the Kringle domain (dashed segment represents a total of 29 copies of the Kringle domain) in a trypsin family serine protease.
B, Examples of domain accretion in nuclear regulators in the human compared with the fly.1,2 Domain accretion refers to greater numbers of a specific domain in a multidomain protein or addition of new domains to a multidomain protein. These domains include: (1) BTB: broad-complex, tramtrack, and bric-a-brac (a name that reflects its early descriptions in Drosophila), a protein interaction domain; (2) Zf: C2H2 class of DNA-binding zinc finger; (3) KRAB: Kruppel-associated box, a vertebrate-specific nuclear protein interaction domain; (4) HD: histone deacetylase, an important class of chromatin-modifying enzymes; and (5) U: ubiquitin finger, a domain that targets proteins for proteolytic degradation. There is a major expansion of the numbers of C2H2 zinc fingers in the BTB or KRAB transcription factor families in the human (dashed segment represents a total of 3 copies of the Zf domain), a feature that may reflect increased ability to mediate regulatory interactions with DNA.
Figure 4. Representative Examples of the Major Differences Between the Predicted Protein Sets of the Human Compared With the Fly and the Worm
The numbers of proteins containing the specified Pfam domain or protein family for each of the animal genomes were derived by computational analysis.1 Representative protein domains or protein families that show a 2-fold or greater expansion in the human were categorized into cellular processes (eg, developmental regulators; neural structure and function; or hemostasis, complement system, and immune response) for representation. A detailed biological description of each of these protein domains may be obtained from the Pfam52 or SMART53 databases. TGF-β indicates transforming growth factor-β; TSP, thrombospondin; CCP, complement control protein; and TIR, toll interleukin receptor.
Notable examples from this list of proteins that are unique to the human (when compared with the fly and worm) include connexins (constitutive subunits of intercellular channels, providing the structural basis for electrical coupling); neuropilin, a key mediator in axonal guidance along with the semaphorins and plexin molecules; fibronectin type 1 (FN1) domain, a fibrin-binding domain found in certain proteins of the coagulation cascade; fibronectin type 2 (FN2) domain, a collagen-binding domain found in a diverse set of hemostatic regulators; membrane-attack complex/perforin (MACPF), a domain found in certain complement proteins; C1q, a domain found in complement 1q and in many collagens; and cytokines and tumor necrosis factor (TNF), 2 of the central families of secreted proteins that mediate a wide spectrum of immune-related functions.
*Voltage-gated (VG) ion channels include VG-sodium, -calcium, and -potassium channels.
