A protein domain is a structural and functional unit that shows evolutionary
conservation and, by convention, is represented as a distinct geometric shape.
Thus, proteins are made up of 1 or more such building blocks or "domains"
and, depending on the types and numbers of domains, proteins with different
biological capabilities are created. Many of these domains have seemingly
arbitrary nomenclature that, in many cases, reflects the experimental nuances
of their initial description. A library of curated protein domains with their
biological descriptions is available through the Pfam database.52
A, The extensive domain
shuffling seen in the plasma proteases of the coagulation and complement systems.
The "ancient" trypsin family serine protease domain occurs in combination
with a myriad of protein interaction domains. Most of these domains are evolutionarily
ancient; that is, with the exception of the Gla domain (see below), they are
also observed in the fly and the worm. These include: (1) AP: Apple, originally
described in the coagulation factors, predicted to possess protein- and/or
carbohydrate-binding functions; (2) Kr: Kringle, named after a Danish pastry,
has an affinity for lysine-containing peptides; (3) E: epidermal growth factor
(EGF)-like; (4) CUB: domain first described in complement proteins and a diverse
group of developmental proteins; (5) CCP: complement control protein repeats,
also known as "sushi" repeats, first recognized in the complement proteins;
and (6) Gla: a hyaluron-binding domain that contains γ-carboxyglutamate
residues and is seen in proteins associated with the extracellular matrix.
Of note is the observation that apolipoprotein (a) likely represents a primate-specific
evolutionary event. There is a tremendous expansion of the Kringle domain
(dashed segment represents a total of 29 copies of the Kringle domain) in
a trypsin family serine protease.
B, Examples of domain accretion in nuclear
regulators in the human compared with the fly.1,2
Domain accretion refers to greater numbers of a specific domain in a multidomain
protein or addition of new domains to a multidomain protein. These domains
include: (1) BTB: broad-complex, tramtrack, and bric-a-brac (a name that reflects
its early descriptions in Drosophila), a protein
interaction domain; (2) Zf: C2H2 class of DNA-binding zinc finger; (3) KRAB:
Kruppel-associated box, a vertebrate-specific nuclear protein interaction
domain; (4) HD: histone deacetylase, an important class of chromatin-modifying
enzymes; (5) U: ubiquitin finger, a domain that targets proteins for proteolytic
degradation. There is a major expansion in the number of C2H2 zinc fingers
in the BTB or KRAB transcription factor families in the human (dashed segment
represents a total of 3 copies of the Zf domain), a feature that may reflect
increased ability to mediate regulatory interactions with DNA.
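
The definition of domain accretion given above (more copies of an existing domain, or the addition of new domains) can be made concrete by treating a protein as an ordered list of domain labels. The following is a minimal sketch under that assumption. The function name `domain_accretion` and the example architectures are hypothetical and purely illustrative; they are not taken from the figure or from any database.

```python
from collections import Counter


def domain_accretion(ancestral, derived):
    """Compare two domain architectures (lists of domain labels).

    Returns a pair (expanded, gained):
      expanded: {domain: (old_count, new_count)} for domains present in
                both architectures with more copies in `derived`
      gained:   sorted list of domains present only in `derived`
    """
    old, new = Counter(ancestral), Counter(derived)
    expanded = {
        d: (old[d], new[d])
        for d in new
        if d in old and new[d] > old[d]
    }
    gained = sorted(d for d in new if d not in old)
    return expanded, gained


# Hypothetical architectures, listed N- to C-terminus.
# The copy numbers are invented for illustration only.
fly_like = ["BTB", "Zf", "Zf"]
human_like = ["BTB", "Zf", "Zf", "Zf", "Zf", "Zf"]

expanded, gained = domain_accretion(fly_like, human_like)
print(expanded)  # {'Zf': (2, 5)}
print(gained)    # []

# A second hypothetical pair in which a new domain (KRAB) is added.
expanded2, gained2 = domain_accretion(["Zf", "Zf"], ["KRAB", "Zf", "Zf", "Zf"])
print(expanded2)  # {'Zf': (2, 3)}
print(gained2)    # ['KRAB']
```

Counting labels ignores domain order, which is enough to separate the two forms of accretion described here; a comparison that also tracks position would need a sequence alignment of the architectures instead.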