Clinical Microbiology Reviews, July 2004, p. 581-611, Vol. 17, No. 3
0893-8512/04/$08.00+0 DOI: 10.1128/CMR.17.3.581-611.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.
Department of Microbiology, School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6076 (1); Department of Medical Microbiology and Immunology, College of Medicine, Texas A & M University System Health Science Center, College Station, Texas 77843-1114 (2)
SUMMARY INTRODUCTION PHASE-VARIABLE PHENOTYPES AND MOIETIES Colony Morphology and Opacity Capsule Fimbriae and Pili Flagella Other Surface-Exposed Proteins LPS and LOS Modification: Variation in Expression of Surface Epitopes DNA Restriction-Modification Systems Regulatory Proteins Metabolism-Associated Genes Phage Genes Concluding Remarks MOLECULAR MECHANISMS OF PHASE VARIATION Genetic Regulation Short sequence repeats and slipped-strand mispairing Homologous (general) recombination. Site-specific recombination. (i) Inversion of a DNA element by CSSR. (a) Type 1 fimbrial phase variation. (b) Other CSSR-dependent types of phase variation. (ii) Insertion and excision of genetic elements from the chromosome. Epigenetic Regulation Pap phase variation Ag43 phase variation Phase Variation as Part of the Cell's Regulatory Network Cross regulation Environmental regulation Concluding Remarks GENOMICS AND PHASE VARIATION BIOLOGICAL SIGNIFICANCE OF PHASE VARIATION Persistence through Surface Variation Evasion of Cross-Immunity DNA Restriction-Modification Systems DIAGNOSTIC AND EXPERIMENTAL SIGNIFICANCE OF PHASE VARIATION ACKNOWLEDGMENTS REFERENCES
| SUMMARY |
|---|
|
|
|---|
| INTRODUCTION |
|---|
Antigenic variation refers to the expression of functionally conserved moieties within a clonal population that are antigenically distinct. The genetic information for producing a family of antigenic variants is available in the cell, but only one variant is expressed at a given time. In an excellent chapter on antigenic variation in bacteria, Barbour listed three criteria that must be fulfilled for variation to be considered as antigenic variation (12). These are (i) that the antigenic change must be involved in avoidance of immune or niche selection, (ii) that it is a multiphasic change, and (iii) that the mechanism is consistent with gene conversion. In this review, the term is used in a broader sense, including biphasic variation, antigenic variation for which the biological significance is not clear, and modification of the antigenic identity of a cell surface structure as a result of a phase-varying enzyme. Antigenic variation in eukaryotic pathogens, including Plasmodium falciparum and African trypanosomes, is not addressed here but is discussed in recent reviews (15, 37, 70, 77, 217, 358).
We will refer the reader to reviews that present in-depth discussions on specific topics relating to phase and antigenic variation, where relevant. These focus, for example, on a single bacterial species, regulatory mechanism, or biological role (29-31, 68, 85, 119, 127, 140, 160, 177, 364, 368, 410). In this review, we hope to provide the reader with awareness and basic understanding of the prevalence, mechanisms, and significance of phase variation in bacteria, with an emphasis on recent developments and insights.
| PHASE-VARIABLE PHENOTYPES AND MOIETIES |
|---|
In Haemophilus influenzae type b strains, three colony variants, opaque, intermediate, and translucent, have been found. These vary in important virulence properties like colonization in the nasopharynx and serum resistance. The colony phenotype appears to be a result of specific combinations of expression of multiple phase-variable proteins, which include variation in the level of capsule production and in the level of a cell envelope protein encoded by oapA (244, 281, 296, 388) (Table 1). A clear relation between colony phenotype and colonization or virulence also exists in Streptococcus pneumoniae (172, 387). In an animal model, opaque variants were more virulent on systemic infection whereas the translucent variants were more successful colonizers of the nasopharynx. The two variants also differed in an in vitro assay of invasion and transcytosis of endothelial cells. Opaque variants produced up to sixfold more capsule and twofold less teichoic acid compared to the transparent form (292). Similarly, Streptococcus gordonii colony morphology and virulence-associated properties, including hemolysin production, phase vary (163, 374). In contrast, in Helicobacter pylori, a variable-colony phenotype is a result of phase variation of expression of phospholipase A, which indirectly affects virulence by release of urease and VacA (347, 348) (Table 1). In the pathogen Salmonella enterica serotype Typhimurium, variable colony morphology is correlated with the coordinated control of phase variation of at least four proteins (153). However, the soil isolate Pseudomonas aeruginosa also showed variable colony morphology, which was correlated with phase variation of multiple traits including aggregation and motility (71).
Color variation in colonies grown on specific media can be caused by phase variation of proteins that interact with a dye. For example, in certain strains of Staphylococcus epidermidis, phase variation of a polysaccharide adhesin leads to variable colony color when grown on Congo red agar (419, 420) (Table 1; also see "Molecular mechanisms of phase variation" below). In summary, any reversible change in colony morphology, opacity, or color indicates that the expression of one or more proteins phase varies. Further analysis of these and related phenotypic variations may provide us with new insights into bacterial survival strategies, and the strong correlation between virulence and colony morphology in pathogens suggests that characterizing the underlying molecular basis may also provide valuable insights into bacterial pathogenesis.
A single species, or even a single isolate, can express multiple fimbriae that each can phase vary. The genomes of different pathogenic E. coli isolates encode different fimbriae, some of which phase vary, as well as type 1 fimbriae, which are common to all isolates and which phase vary (Table 1). The S. enterica serotype Typhimurium genome encodes at least 11 fimbrial operons, among which phase-variable expression has been identified for pef, lpf, and fim (53, 146, 251, 256) (Table 1). The biological role of phase variation of fimbriae is discussed in "Biological significance of phase variation" (below).
Fimbrial phase variation control mechanisms and the fimbrial structural genes may have evolved as separate modules. For example, the common feature of the pap-like family of fimbrial operons is the regulatory mechanism of phase variation, but the fimbrial structural genes within this family are not all related. Conversely, the subunit of the MR/P fimbriae in Proteus mirabilis resembles that of Pap fimbriae in E. coli, but the two phase variation control mechanisms are different (128, 202) (see "Molecular mechanisms of phase variation" below) (Table 1).
Type IV pili function as adhesins and include conjugative pili. Phase variation, antigenic variation of the structural subunits, and phase-variable modification have been described. Sequence variation in the type IV conjugative pili encoded by plasmids R64, R721, and ColI-P9 occurs as a result of incorporation of only one of a set of distinct C termini in the PilV tip proteins of the pilus in an individual cell. This sequence variation is associated with different receptor specificity, thereby dictating the species that will be preferred as a DNA recipient in a conjugation reaction (154, 177, 178). N. gonorrhoeae can theoretically produce over a million different, antigenically distinct pilin subunits for its type IV pili (99, 180, 240, 319, 346; reviewed in reference 12) (see "Molecular mechanisms of phase variation" below) (Table 1). In addition, the pilus-associated protein PilC phase varies (164) (Table 1). These pili are involved in interaction with eukaryotic cells, and thus these variations are probably important for pathogenesis (28, 312). In S. enterica serotype Typhi, phase-variable expression of the PilV subunit of the type IVB pili affects the pilus-associated property of cellular autoaggregation (243) (see "Molecular mechanisms of phase variation" below). Pili can also be modified, but whether modification occurs can vary within a clone, due to phase variation in the expression of one of the enzymes involved. This is the case for glycosylation by PgtA in N. gonorrhoeae (10) (Table 1).
In Campylobacter coli (272), Campylobacter jejuni (106, 169), and Helicobacter pylori (264), flagellar expression and motility phase vary (41, 73, 165) (Table 1). The underlying reason for this can differ. For example, C. coli expression of FlhA, which is required for the expression of flagellin, phase varies (272), whereas in H. pylori, expression of the fliP gene, encoding the flagellar basal body, phase varies (165) (Table 1). In Bordetella pertussis, phase-variable expression of the regulatory system BvgAS results in flagellar phase variation (340) (Table 1) (see "Regulatory proteins" below). In this species, flagellar synthesis is not required for virulence and may even be detrimental (2).
In the M1inv+ clone of the gram-positive group A Streptococcus (S. pyogenes), expression of the cell wall-associated surface proteins C5a peptidase, M protein, and type IIa IgG Fc receptor phase varies, as does expression of the capsule and pyrogenic exotoxin (52, 195). This is in part due to phase-variable expression of the DNA-binding regulatory protein Mga (see "Regulatory proteins" below) (36, 233-235, 329). Expression of the collagen-like surface protein SclB is under the control of a separate phase variation control mechanism (285) (see "Molecular mechanisms of phase variation" below) (Table 1).
In gram-negative N. gonorrhoeae and N. meningitidis strains, expression of various outer membrane proteins phase varies, including that of members of the family of outer membrane opacity proteins (opa) that facilitate adhesion (339) (see "Biological significance of phase variation" below) (Table 1) and the porin PorA (class I outer membrane protein) in serogroup B N. meningitidis (Table 1). PorA is one of the candidates for a protein-based vaccine, and phase variation, as well as the naturally occurring antigenic variation of this protein, may affect efficacy (16, 51, 355, 365). In addition, in Neisseria spp., the expression of outer membrane proteins that are involved in iron acquisition phase varies, including the siderophore receptor FetA in N. gonorrhoeae (42) and two hemoglobin receptors in N. meningitidis DNM2 (201) (Table 1). This may reflect a need to balance iron acquisition during growth in the host and to evade the immune system. Phase-varying colonizing factors include Ag43 in E. coli (64, 270) (Table 1) and Oap in H. influenzae (281, 388).
In Campylobacter fetus, which is a pathogen of domestic and wild animals, a class of proteins known as surface layer proteins (SLPs) are exported to the cell surface and are noncovalently attached to the lipopolysaccharide (LPS) (80). SLPs are important virulence factors, and the absence of SLP leads to increased sensitivity to complement activity and decreased infectivity (27). These SLPs undergo extensive antigenic variation, which is achieved by so-called "nested DNA inversion" (81, 83, 84, 286; reviewed in reference 82) (see "Molecular mechanisms of phase variation" below) (Table 1). Other examples of phase variation of surface proteins in gram-negative bacteria are included in Table 1.
Mycoplasma species do not have a cell wall, and lipoproteins constitute part of the surface proteins. Many of these are under the control of phase and antigenic variation (19, 20, 26, 325, 407). This includes a substrate binding component of an ABC transporter in Mycoplasma fermentans (351). Interestingly, the pMGA family of hemagglutinins phase varies in Mycoplasma gallisepticum (207, 253, 401), whereas the homologous proteins in Mycoplasma synoviae undergo antigenic variation (254). Expression of the Vlp family of lipoproteins in Mycoplasma hyorhinis undergoes both phase variation and antigenic variation (47-50, 299, 300, 408). The combination of these two regulatory systems and the fact that there is a family of six related Vlp proteins that each are subject to these controls lead to a large repertoire of Vlp proteins that can be expressed (409). In Mycoplasma species that are human commensals and pathogens, phase or antigenic variation is indicated for M. hominis (190) and M. penetrans (250, 301).
Perhaps the best-studied example of multiphasic antigenic variation is that of lipoproteins in Borrelia spirochetes that are the causative agents of relapsing fever (reviewed in references 12 and 13). These lipoproteins are divided into two groups, Vlp and Vsp for large and small proteins, respectively, and can be further divided into different families, each with about 70% sequence identity (see "Biological significance of phase variation" below). One of the mechanisms involves recombination between an extensive repertoire of silent, variable vlp and vsp loci ("archival loci") and an expression site (reviewed in reference 12). The closely related protein VlsE in B. burgdorferi, the causative agent of Lyme disease, is also under the control of antigenic variation through a similar combinatorial variation (181, 232, 411) (Table 1).
The chemical identity of LPS or LOS is defined by the addition of side groups, for example as a result of the activity of glycosyltransferases or sialyltransferases, or by the addition of phosphorylcholine (ChoP). These traits can vary within a clonal population as a result of phase variation of one or more enzymes involved in the modification. An in-depth review of the chemical nature of modification of the O antigen of LPS is presented in reference 199. LPS modifications can impact antigenicity but can also affect serum sensitivity and adhesion. This is discussed in more detail below for the LOS of N. meningitidis (see "Biological significance of phase variation"). Below, additional examples are given relating to several important pathogens. Since variable LPS modification is not easily identified, it is quite possible that it occurs in other species as well.
Ganglioside mimicry of the LOS by Campylobacter jejuni is thought to be an important factor in the development of Guillain-Barré and Miller-Fisher syndromes after infection and is associated with specific modification of the LOS. Expression of the enzymes involved in the modification can phase vary. Alternate synthesis of ganglioside GM-2-like and GM-1-like LOS in C. jejuni NCTC 11168 is a result of phase variation of expression of β-1,3-galactosyltransferase encoded by wlaN (206); in C. jejuni strain 81-176, reversible conversion between the GM-2 and GM-3 LOS correlates with phase-variable expression of the cgtA gene (111) (Table 1). Based on our understanding of these phase variation mechanisms, Linton et al. were able to show that a different C. jejuni strain, which had been characterized as producing only GM-2-like LOS, was able to convert to producing GM-1-like LOS (206). This illustrates how an understanding of phase variation at the molecular level may have a significant impact on the understanding of the pathogenicity and epidemiology of a pathogen.
Another well-studied example of apparent host mimicry occurs in the human gastric pathogen Helicobacter pylori, which can incorporate into its LPS variable carbohydrate modifications that resemble structures of the Lewis group of antigens of human blood groups. This variation occurs distinct from, and in addition to, LPS microheterogeneity (reviewed in references 8 and 384). Three fucosyl transferase genes (futA, futB, and futC, also referred to as the fucT genes), which are each under the control of a phase variation mechanism, are involved in the LPS modification (7, 385) (Table 1). Other genes involved in LPS modification may phase vary as well (309).
In both encapsulated and nonencapsulated H. influenzae, modification can occur by phase-variable Lic3A or LgtC, a sialyltransferase and a glycosyltransferase, respectively (136, 137, 226, 391) (Table 1). Interestingly, these enzymes compete for the same lactose disaccharide moiety on LOS. Thus, the activity of one enzyme affects the substrate availability for the other. Phase variation of lgt-mediated glycosyl modification in N. meningitidis and N. gonorrhoeae is discussed in more detail below (see "Biological significance of phase variation") (Table 1).
The LOS of H. influenzae can also be decorated with ChoP. Phase-variable expression of the kinase encoded by licA results in phase-variable ChoP decoration, even though other genes may also contribute (315, 394) (Table 1). The presence of ChoP appears to confer sensitivity to serum-mediated killing caused by C-reactive protein, whereas in an animal model of the nasopharynx, this may confer a competitive advantage (214, 392, 395). Genetic analysis of three genes of the lic locus was performed to determine the on or off state of genes involved in LOS modification in bacteria isolated from the nasopharynx, blood, and cerebrospinal fluid in an animal model. Tissue-specific combinations were prevalent, supporting the idea that certain combinations of LOS modification may facilitate colonization or survival in different host environments (141).
Among Neisseria species, phase-variable ChoP modification of the LPS can occur, but type IV pili also contain ChoP (389, 394). Analysis of ChoP expression in a large group of isolates suggests that the commensal isolates decorate the LPS whereas pathogenic isolates decorate the pili (321, 322, 389) (Table 1).
Other phase-variable LOS modifications in H. influenzae, as well as in the related bovine pathogen H. somnus, exist but have not been characterized further (151, 152, 155, 402). Antigenic or phase variation of LPS also occurs in certain Legionella pneumophila isolates (see "Molecular mechanisms of phase variation" below) (212, 213) and in S. enterica serotype Typhimurium (188). Modification of LOS or LPS in Francisella tularensis (61), Coxiella burnetii (98, 117), and Chlamydia spp. (211) varies, but the contribution of a phase variation control mechanism versus environmental regulation or mutation has not been determined.
Sequence analysis of other bacterial genomes suggests that phase variation of (putative) DNA R/M systems occurs in a variety of species (66, 68, 307, 309, 354) (see "Genomics and phase variation" below). Confirmation of phase variation has been obtained for expression of the mod gene in H. influenzae (66), for a type III modification system in Pasteurella haemolytica (304), for both the modification and restriction enzymes of a type III R/M system in H. pylori (70), and for a modification system in S. pneumoniae (275, 349) (Table 1).
In the gram-positive soil bacterium Streptomyces coelicolor A3(2), the phage growth limitation system determines reversible sensitivity and resistance to φC31 phage. The complete molecular basis is not clear, but it is associated with phase variation of a DNA methyltransferase, PglX (192, 344). Thus, some phase-variable DNA modification systems may have evolved as a protection mechanism against phage infection.
Phase-variable expression of operons is often the result of a mechanism that regulates the initiation of transcription at the main promoter of the operon (see "Molecular mechanisms of phase variation" below). As a result, expression of local regulators encoded by genes in the operon also phase varies. For example, in E. coli, expression of the local regulator PapB, encoded by the pap operon, phase varies (Table 1) (33). This not only affects pap expression through an autoregulatory loop, but also affects type 1 fimbrial expression (97, 403) (see "Cross regulation" below) (Table 1). Together, these examples demonstrate that phase variation of a single regulatory protein can lead to coordinated, phase-variable expression of multiple cellular proteins and can establish an interdependent network of phase-variable gene expression.
| MOLECULAR MECHANISMS OF PHASE VARIATION |
|---|
Short sequence repeats and slipped-strand mispairing. Multiple contiguous repeats of units of DNA sequence can be subject to expansion or contraction of the number of repeats. A universal slipped-strand mispairing (SSM) mechanism is invoked in which misalignment of the repeat sequences occurs between the mother and daughter strands during DNA synthesis that occurs in either DNA replication or DNA repair. Misalignment between the daughter and parent DNA strands can occur on the leading or lagging strand at the repeat region, which results in an increase or decrease in the number of repetitive units in the newly synthesized DNA (200, 360, 362). These changes in the number of unit repeats can lead to phase-variable expression of a protein, if the location of these repeats is such that either transcription or translation of a gene is affected. Phase variation has been associated with repeat units that consist of 1 to as many as 7 nucleotides (nt). Repetitive sequence units are also referred to as short sequence repeats, microsatellites, or variable number of tandem repeats (360).
Regulation at the level of transcription occurs when the repeats are located in the promoter region between the -10 and -35 sites for RNA polymerase binding (Fig. 1A, region 2). The spacing of these sites is critical for the level of transcription, and even a single-nucleotide deviation from the optimal 17-nucleotide (nt) spacing has an effect. Phase variation of the fimbriae encoded by hif in H. influenzae occurs as a result of variation between 9, 10, or 11 repeats of the dinucleotide TA located between the -10 and -35 sequences for the overlapping, divergent promoters for the hifA and hifB genes. These encode the major fimbrial subunit and chaperone, respectively. Not only does the change in strength of the promoter result in an "off" phase and an "on" phase, but also the "on" phase is represented by clones that have either a low or a high level of expression (372). A variation of this principle results in a variable level of production of the high-molecular-weight adhesins in H. influenzae, due to altered spacing between two promoters by SSM at 16 to 28 repeats of a 7-nt unit (65).
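The arithmetic behind this spacing effect can be illustrated with a minimal Python sketch. It only shows that gaining or losing one dinucleotide unit shifts the promoter spacing by 2 nt relative to the 17-nt optimum. The reference point (10 repeats corresponding to 17 nt) is a hypothetical assumption chosen for illustration; the text does not state which repeat number gives the optimal spacing, and the function name is likewise illustrative.

```python
OPTIMAL_SPACING_NT = 17  # optimal -35/-10 spacing stated in the text

def spacer_length(n_repeats, unit_len=2, ref_repeats=10, ref_spacing=OPTIMAL_SPACING_NT):
    """Spacing between the -35 and -10 sites for a given repeat number.

    ASSUMPTION: ref_repeats=10 giving ref_spacing=17 is hypothetical; it is
    used only to anchor the arithmetic, not taken from the hif locus.
    """
    return ref_spacing + unit_len * (n_repeats - ref_repeats)

def deviation_from_optimum(n_repeats, **kwargs):
    """Signed deviation (nt) of the spacing from the 17-nt optimum."""
    return spacer_length(n_repeats, **kwargs) - OPTIMAL_SPACING_NT

for n in (9, 10, 11):
    print(n, "TA repeats:", spacer_length(n), "nt spacing,",
          deviation_from_optimum(n), "nt from optimum")
```

Under this assumption, each slipped-strand event at the TA tract moves the spacing by two nucleotides, which exceeds the single-nucleotide deviation that the text describes as already affecting transcription.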
Translation of a protein can be affected by SSM if the unit repeats are located within its coding sequence (Fig. 1A, region 4). The open reading frame is disrupted if SSM results in a change in nucleotide number that is not a multiple of three. In this case, a nonfunctional, usually truncated protein is synthesized. This is, for example, the basis of phase variation of the expression of the mod gene of H. influenzae, which contains over 30 repeats of the tetranucleotide (5'-AGTC) in its coding sequence (Fig. 1B) (66). The reading frame is altered, and, in addition, a premature stop codon is formed as a result of one tetranucleotide addition within the coding sequence. To summarize, SSM can cause a change in the number of unit repeats consisting of 1 to 7 nt and can affect transcription initiation, a posttranscriptional initiation event, or translation.
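The reading-frame rule described above can be sketched in a few lines of Python. The toy coding sequence below is hypothetical (it is not the mod gene sequence); it was constructed only so that adding one 5'-AGTC unit both shifts the frame and brings a stop codon into frame, mirroring the situation described for mod. The function names are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def frame_shift(unit_len, delta_units):
    """Net reading-frame shift (0, 1, or 2 nt) after gaining (+) or losing (-) repeat units."""
    return (unit_len * delta_units) % 3

def first_stop_codon(seq):
    """Index (in codons) of the first in-frame stop codon, or None if there is none."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

def toy_gene(n_repeats):
    """HYPOTHETICAL coding sequence: start codon, AGTC repeat tract, short downstream region."""
    return "ATG" + "AGTC" * n_repeats + "GCTGAACTT"

# A tetranucleotide unit shifts the frame unless units change in multiples of three.
print(frame_shift(4, 1), frame_shift(4, 3))   # 1 0
# A trinucleotide unit never shifts the frame (size polymorphism only).
print(frame_shift(3, 1))                      # 0

print(first_stop_codon(toy_gene(3)))          # None: open reading frame intact
print(first_stop_codon(toy_gene(4)))          # 7: premature in-frame stop codon
```

In this toy example, the downstream region is read in a different frame after one unit is added, so a TGA triplet that was previously out of frame terminates translation.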
Bayliss et al. have addressed the molecular mechanisms underlying phase variation-associated SSM. The effect of specific mutations on the switching frequency was determined for the fimbrial genes hifAB and mod, encoding a DNA modification enzyme in H. influenzae (18). Mutations in either of two genes that affect mismatch repair, dam and mutH, increased the frequency of change at the dinucleotide repeat tract of hifAB but not at the tetranucleotide repeat region of mod (18). A role for the mismatch repair system is also implicated in phase variation by SSM at single-nucleotide repeats in N. meningitidis, since increased switching rates were observed in both mutS and mutL backgrounds (291). These data indicate that a functional mismatch repair system can contribute to minimizing the occurrence of SSM at mono- and dinucleotide repeats but not at tetranucleotide repeats. In contrast, a mutation in polI increased the switching frequency only at the mod tetranucleotide repeat sequence. This suggests that incorrect processing of the Okazaki fragments results in increased instability of the region, but further details are not yet known (18). During misalignment of the strands, small loops of DNA are formed that may be stabilized by the formation of H-DNA, which was shown to form at a 5-nt repeat sequence (21). These studies show that different molecular mechanisms may be involved in stabilizing repeat regions of different unit lengths in H. influenzae and, furthermore, suggest that factors or conditions that affect mismatch repair or DNA replication may also affect SSM-dependent phase variation. How much of this can be extrapolated to other species remains to be determined. Thus, the (in)stability of a given repeat sequence in a specific bacterial species cannot yet be predicted.
Even in the absence of mutations like those described above, SSM-dependent switching frequencies can vary within an isolate. The frequency of variation at mod in H. influenzae, for example, increases with increasing numbers of unit repeats at mod (66, 291), which is probably a general correlation. The switching frequency can also be modulated by active transcription, as was shown for SSM at a poly(dC) tract in the siaD gene of N. meningitidis. The formation of a premature stop codon in siaD as a result of SSM results in disruption of the coupling of transcription and translation, and this in turn facilitates Rho-dependent termination of transcription. This transcriptional termination correlated with an increase in the frequency of change in the length of the poly(dC) tract (196). It will be a challenging but important issue to determine whether regulation of the frequency of SSM is a common occurrence and whether this is biologically significant.
Expansion or contraction of the number of nucleotides in multiples of three can cause a size polymorphism of a protein if the repeat tract is located within a coding sequence. An intriguing example was described for the AhpC protein in E. coli. After a single triplet expansion in its coding sequence, the enzymatic function changed from a peroxiredoxin to a disulfide reductase. This change was observed under stress conditions that give a growth advantage to cells that had acquired this change but was nevertheless a reversible event (294). Whether phase variation by SSM at the level of translation significantly affects the biological function or antigenicity of other proteins is not known.
Additional examples of SSM-dependent phase variation are listed in Table 1. It is interesting that SSM-dependent phase variation of virulence factors has not been identified in E. coli and Salmonella sp., even though there does not appear to be a mechanistic constraint (294, 356). In these species, the potential to establish complex regulatory systems, which is facilitated by the large genome size, and a preference for stringent (environmental) regulation associated with their diverse natural habitats may have influenced the acquisition or evolution of the more complex phase variation mechanisms.
Short-sequence repeats that are not associated with phase variation, but may cause antigenic variation or other phenotypes, are discussed in an excellent review by van Belkum et al. (360). Two related topics, the identification of SSM-dependent phase-varying genes from genome sequence analyses, and the use of sequence repeats in strain identification, are discussed below in "Genomics and phase variation," and "Diagnostic and experimental significance of phase variation" respectively.
Homologous (general) recombination. Homologous, or general, recombination typically occurs at long (>50-bp) regions of homology and is dependent on numerous proteins that constitute part of the general DNA repair and maintenance machinery of the cell. Recombination between two alleles of a gene can lead to a gene conversion when this results in a unidirectional exchange of DNA. Gene conversion that is associated with antigenic variation in bacteria involves recombination between one of a repertoire of silent alleles of the gene and the gene located at the expression site. When alleles undergo constant changes as a result of recombination, this can be referred to as combinatorial variation. The mechanism(s) leading to gene conversion in bacteria may vary between species, but in general it requires the machinery of homologous recombination. However, several features distinguish it from most other RecA-dependent homologous recombination events. The frequency of this recombination is much higher, it occurs between regions of much lower homology than is usually considered necessary for RecA-mediated recombination, and additional special cis-acting factors or unidentified processes appear to be involved.
Most of our understanding of the mechanism underlying gene conversion leading to antigenic variation is a result of studies of type IV pilin antigenic variation in N. gonorrhoeae (reviewed in reference 319). The pilin proteins that form antigenic variants of the pili are conserved for two-thirds of the N terminus but vary at the remaining C terminus. This variation is a result of unidirectional transfer to the expression locus pilE of a sequence from one of the silent pilS loci. There can be one to six copies of the silent loci on the genome, and these pilS loci can be separated from pilE by as much as 900 kb. The copies at the silent pilS loci consist mainly of variable regions of the gene, whereas the gene in the expressed pilE locus contains both conserved and variable regions. Recombination appears to require only 2 bp of conserved sequence and occurs at a high frequency (>10⁻³), which are both unusual traits for RecA-dependent recombination. However, RecA is required for antigenic variation, and the RecF-like recombination pathway, in which RecA plays a role, appears to play an essential role in this unidirectional exchange (142, 182, 237, 319, 332, 413). The frequency decreases in a recX mutant (342). This is discussed in more detail, in the context of general recombination, repair, and replication pathways, in an excellent recent review (176). Efficient pilin gene conversion furthermore requires a conserved sequence located at the 3' end of all pil loci (Sma/Cla repeat) that may be a site for recombination but also appears to be a recognition sequence for an as yet unidentified DNA binding protein. Interestingly, these proteins may be present only in pathogenic Neisseria (376, 377).
An intriguing aspect of this recombination is that despite the unidirectional exchange of DNA, chromosomal fidelity is maintained and the sequence at the pilS loci is unaltered. Gene conversion can occur if the donor sequence for recombination is obtained by DNA transformation, in which case both aspects are readily resolved. However, this is not the case for the second mechanism, which appears to be predominant. In this case, DNA exchange occurs between the two copies of the genome formed by DNA replication (143, 319). The recently proposed "hybrid intermediate" model addresses how this genetic exchange occurs, and critical aspects of this model have been verified experimentally (142; reviewed in reference 176). The first step involves a RecA-independent recombination event in the donor chromosome between very short sequences of homology of a pilS locus and pilE sequence, forming an extrachromosomal circular hybrid pilE-pilS molecule but presumably also an additional, undefined intermediate molecule that is critical for the next step (Fig. 2A). This hybrid molecule donates pilS sequence to the pilE locus in the recipient chromosome in a second recombination event. This step requires RecA and involves recombination at a larger region of homology flanking the pil sequences, as well as recombination at a short region of homology within the gene (Fig. 2B) (142). The recombination events result in a unidirectional exchange of variable pilS sequence to the pilE locus without altering the donor pilS sequence. The available experimental data do not rule out other models for gene conversion in Neisseria, and identifying the predominant intermediate molecular structure in the recombination step will be critical in resolving this important recombination mechanism (176).
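The net outcome of this gene conversion (a unidirectional transfer of variable sequence to pilE that leaves the silent donor unchanged) can be illustrated with a purely conceptual Python sketch. It does not model the hybrid intermediate, RecA, or any sequence homology requirements; the class, field names, and placeholder sequences are hypothetical and serve only to make the "donor unaltered" property explicit.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PilinLocus:
    """Toy representation of a pil locus (HYPOTHETICAL structure for illustration)."""
    name: str
    conserved: str  # conserved region; empty for silent pilS copies in this toy model
    variable: str   # variable region

def gene_convert(expression_locus, silent_donor):
    """Return a new expression locus carrying the donor's variable region.

    The donor object is never modified, mirroring the observation that the
    pilS sequence is unaltered after the exchange.
    """
    return replace(expression_locus, variable=silent_donor.variable)

pil_e = PilinLocus("pilE", conserved="CONSERVED", variable="variant-0")
pil_s_loci = [PilinLocus(f"pilS{i}", conserved="", variable=f"variant-{i}")
              for i in range(1, 4)]

new_pil_e = gene_convert(pil_e, pil_s_loci[1])
print(new_pil_e)       # pilE now carries variant-2
print(pil_s_loci[1])   # the silent donor is unchanged
```

Making the loci immutable (frozen dataclasses) is a design choice that enforces the one-way nature of the exchange in the sketch: a conversion can only produce a new expressed allele, never rewrite a silent copy.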
Concerning the biological role, antigenic variation of the variable major lipoprotein (Vmp) in Borrelia hermsii (238, 288, 289) and of the VlsE surface proteins in B. burgdorferi (411, 412) is well understood (reviewed in reference 12) (see "Biological significance of phase variation" below). Less is known about the mechanisms and molecular pathways underlying this antigenic variation. Detailed analysis of the genetic exchanges underlying specific seroconversion events in B. hermsii has, however, led to the identification of four mechanisms. The first mechanism is consistent with gene conversion and involves a nonreciprocal recombination event of genes from silent (archival) loci from a linear plasmid to an expressed locus, vlp7, near the telomere on plasmid lp28-1. A second mechanism involves a less frequent occurrence of intraplasmidic recombination at a region containing duplicated sequence. This results in loss of a fragment of DNA but occurs only at the expression site. A third mechanism results in introduction of point mutations at the expressed locus, but these may also originate from archival loci. Finally, by mechanisms that are not clear, transcription of the gene at the expression site on lp28-1 can be silenced in conjunction with expression specifically of vsp33 from a site internally located on a 53-kb plasmid. More details can be found in the excellent recent review by Barbour and in the references therein (12).
Antigenic variation is also extensive in Mycoplasma species (reviewed in reference 48). This includes antigenic variation of the Vsa and Vsp lipoprotein family in M. pulmonis (26, 325) and M. bovis (216) and of the VlhA hemagglutinin in M. synoviae (254). This antigenic variation results from genomic rearrangements and is commonly associated with DNA inversion events, which may be the result of either homologous recombination or site-specific recombination (see below). The molecular details of most of these mechanisms remain to be elucidated for these important pathogens (138, 171, 215, 254, 325, 330). DNA inversion as a result of homologous recombination leads to antigenic variation of the SLPs in Campylobacter fetus. Antigenic variation of SLPs involves reassortment of eight sap genes, each encoding an antigenically distinct SLP, and a single sap promoter. The DNA inversion involves a fragment of 6.2 kb with the single promoter or, in addition, a flanking region with one or more of the variable sap cassettes. The DNA rearrangement positions the one sap promoter to transcribe one of the eight sap genes. These inversion events, also referred to as nested DNA rearrangements, decreased in a recA mutant, suggesting partial dependence on RecA (81, 84, 286).
Gene duplication by recombination is invoked in modulating the level of expression of a gene. In H. influenzae type b, a heritable variation in the level of capsule production occurs as a result of gene duplication of the cap genes, which may be enhanced by the flanking IS-like sequence. In addition, in type 1 H. influenzae type b, an irreversible switch to the nonexpressing type can occur when the bexA gene, which is essential for capsular synthesis, is lost as a result of recombination between duplicated cap sequences flanking bexA (184-186, 296; reviewed in reference 297). The latter event can be reversed only by transformation with DNA from a bex+ isolate (184, 185). Phase variation of capsule production in several Streptococcus pneumoniae serotypes is also associated with DNA duplication and excision. In this case, tandem duplication and precise excision of random fragments of 11 to 239 bp occur in genes essential for capsule production. Duplication of the sequence leads to disruption of the open reading frame and thus to a switch to a nonexpressing phenotype. The recombination mechanism has not been characterized but is likely to be RecA dependent since the repeat region is long (378, 379). This is reminiscent of RecA-dependent variation in the number of long repeat units (over 200 nt) in the coding sequence of the alpha C surface proteins in group B Streptococcus, the M proteins in group A Streptococcus, and the Esp protein in Enterococcus faecalis. However, in these genes the coding sequence remains in frame and the change in the number of unit repeats affects the antigenicity of the protein (107, 284).
Site-specific recombination. Nonhomologous, site-specific recombination requires specific enzymes that act at cognate DNA sequences that may have sequence identity, but often in a region of no more than 30 bp. Here the distinction is made between conservative site-specific recombination (CSSR) that can lead to inversion, insertion or excision of a DNA region, and transposition. These recombination events can lead to a variety of genetic rearrangements, some of which will lead to phase or antigenic variation (also reviewed in references 120, 160, and 177).
Based on biochemical properties such as sequence, structure, and mechanism of recombination, the associated CSSR recombinases can be divided into two major families, the serine and tyrosine families of recombinases, formerly designated the resolvase-invertase and integrase families, respectively. There is a significant amount of functional overlap among these enzymes, and enzymes of either group can mediate phase and antigenic variation. A third family consisting of two enzymes has recently also been identified, but little is known about the molecular mechanism of recombination (353). For additional details about the biochemical properties of these recombinases and the molecular mechanisms of transposition and site-specific recombination, see two recent reviews (120, 160).
(i) Inversion of a DNA element by CSSR. Recombination mediated by members of the serine and tyrosine families of recombinases occurs at short regions of DNA that contain some sequence similarity or identity required for enzyme recognition and results in reciprocal DNA strand exchange. This recombination is the molecular basis of many DNA inversion events that are involved in creating clonal antigenic diversity in bacteria and phages. Depending on whether the genetic information of this inverted element contains regulatory sequence or coding sequence, inversion can lead to on/off phase variation, biphasic antigenic variation, or even multiphasic antigenic variation.
The well-studied Cre recombinase can mediate recombination between two lox sequences in the absence of other factors. In contrast, in most cases where site-specific recombination leads to phenotypic variation, there is a requirement for cellular proteins in addition to the recombinase, presumably to form a recombination-proficient protein nucleocomplex. Through these factors, control of the recombination event can be exerted.
Inversion of a DNA element causes phase variation of expression of the fim and fot operons in E. coli and mrp in P. mirabilis, encoding type 1, CS18, and MR/P fimbriae, respectively (1, 135, 202, 416). In each case, the invertible element contains a promoter that is essential to transcribe the structural operon. This promoter is correctly positioned for this transcription in only one of the two orientations of the invertible element. The recombinase enzymes mediating the inversion, two for fim and one each for fot and mrp, have homology to each other and are members of the tyrosine recombinase family (135, 160). The most extensively studied system is that of fim, encoding type 1 fimbriae. Essential features of the fim system are outlined below. For a more detailed discussion, the reader is referred to an excellent review and the references therein (31).
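The geometry of such a promoter-carrying invertible element can be illustrated with a short Python sketch. All sequences below are invented toy strings, not the real fim, fot, or mrp sequences, and the function names are hypothetical. The sketch shows two properties of inversion between inverted repeats: the flanking repeats read the same after inversion, and a second inversion restores the original sequence, which is what makes the switch reversible.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert(genome: str, start: int, end: int) -> str:
    """Replace genome[start:end] by its reverse complement (a DNA inversion)."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


# Toy sequences (assumptions for illustration only).
UPSTREAM = "GGGG"
IRL = "AACCGGTAC"                      # left inverted repeat
IRR = reverse_complement(IRL)          # right inverted repeat
PROMOTER = "TTGACA"                    # toy promoter motif on the top strand
ELEMENT = "CCC" + PROMOTER + "CC"      # invertible element carrying the promoter
GENE = "ATGAAA"                        # start of the downstream structural gene

genome_on = UPSTREAM + IRL + ELEMENT + IRR + GENE
START = len(UPSTREAM)
END = START + len(IRL) + len(ELEMENT) + len(IRR)


def promoter_is_on(genome: str) -> bool:
    """The toy promoter drives the gene only when it reads left to right."""
    return PROMOTER in genome


genome_off = invert(genome_on, START, END)
genome_on_again = invert(genome_off, START, END)
```

Here `promoter_is_on(genome_on)` is true and `promoter_is_on(genome_off)` is false, while the repeats at both ends of the inverted segment are unchanged, so the same recombinase recognition sites remain available for the next inversion.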
(a) Type 1 fimbrial phase variation. Type 1 fimbriae, encoded by the fim operon, are the most common fimbrial adhesins in E. coli isolates. These fimbriae are thought to be of particular importance in mediating attachment to host tissue during bacterial colonization of the bladder, which can lead to cystitis, and can also mediate bacterial invasion into bladder epithelial cells (89, 194, 313). Expression of type 1 fimbriae phase varies as a result of the inversion of a regulatory element that contains the essential promoter for transcription of the fimbrial structural genes, the fimA promoter (Fig. 3). The invertible element consists of 296 bp flanked by two 9-bp inverted repeats (IRR and IRL). The main subunit of the fimbriae is FimA, and the fimA promoter is properly orientated for transcription of fimA when the invertible element is in the "on" orientation. In the "off" orientation, the promoter is incorrectly oriented for transcribing fimA and fimbriae are not synthesized. Thus, the inversion event is the main feature of this phase variation system. This DNA inversion is mediated by the two site-specific recombinases, FimB and FimE. These have 48% amino acid identity, but their DNA specificity and activity differ. FimB mediates inversion in both directions, whereas FimE mediates the inversion predominantly in the "on" to the "off" direction (174, 230). The FimE bias is due in part to its substrate preference for DNA in the "on" orientation (102, 187). The frequency of inversion mediated by FimB is on the order of 10⁻³ to 10⁻⁴, whereas the FimE-mediated inversion frequency is as high as 10⁻¹.
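The consequence of such asymmetric inversion frequencies can be made concrete with a minimal two-state model, sketched below in Python. The model is an assumption introduced for illustration: it treats the frequencies as per-cell, per-generation switching probabilities (taken here as 10^-3 for FimB in each direction and 10^-1 for FimE in the "on" to "off" direction), adds the two recombinase contributions, and ignores selection, growth differences, and all other regulatory inputs. The function and constant names are hypothetical.

```python
def step(frac_on: float, on_to_off: float, off_to_on: float) -> float:
    """Advance the 'on' fraction of a population by one generation."""
    return frac_on * (1.0 - on_to_off) + (1.0 - frac_on) * off_to_on


def simulate(frac_on: float, on_to_off: float, off_to_on: float, generations: int) -> float:
    """Iterate the two-state model for a number of generations."""
    for _ in range(generations):
        frac_on = step(frac_on, on_to_off, off_to_on)
    return frac_on


def steady_state_on(on_to_off: float, off_to_on: float) -> float:
    """Closed-form equilibrium 'on' fraction of the two-state model."""
    return off_to_on / (on_to_off + off_to_on)


# Illustrative switching probabilities per cell per generation (assumed values).
FIMB = 1e-3   # FimB, both directions
FIME = 1e-1   # FimE, "on" to "off" only in this simplified model

fimb_only = steady_state_on(on_to_off=FIMB, off_to_on=FIMB)
fimb_and_fime = steady_state_on(on_to_off=FIMB + FIME, off_to_on=FIMB)
```

Under these assumptions, FimB alone gives an equilibrium with half of the population in the "on" phase, whereas adding FimE drives the equilibrium "on" fraction to about 1%. The numbers matter less than the qualitative point that the ratio of the two directional frequencies, rather than either frequency alone, sets the phase distribution in the absence of selection.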
The level of FimE expression also is affected by posttranscriptional regulation. Specifically, the level of expression is higher when the orientation of the downstream invertible element is in the "on" orientation than when it is inverted. This "orientation control" is a result of differential stability of the fimE transcript. The transcript is quite stable when the invertible element is in the "on" orientation, and the transcript extends into the switch region. The fimE transcript is less stable when the element is in the "off" orientation. Differential stability may be the result of the formation of a Rho-dependent terminator in the "off" orientation or that of a secondary structure that stabilizes the transcript in the "on" orientation (166, 335). Thus, the invertible element exerts its control at two levels, first by affecting the orientation of the main fimA promoter and second by affecting the level of FimE production.
The relative amounts of FimB and FimE are important in current models of regulation, but a much more complex picture of fim regulation has emerged. For example, a bias to the "off" phase occurs as a direct result of transcription originating at the fimE promoter, even in the absence of functional FimE (260). Several lines of evidence also suggest that transcription from the fimA promoter and the DNA inversion event are mutually exclusive (260). In addition, an "off" phenotype in which fimbriae are not produced can be obtained even with the DNA in the "on" orientation, as a result of posttranscriptional regulation (229). Furthermore, various cellular factors influence the recombination reaction, including the host factors Lrp, integration host factor, and H-NS, presumably by assisting in the formation and stabilization of the recombination-proficient protein nucleocomplex (32, 78, 90, 101, 103, 260, 262). The interaction of these regulators with the fim DNA can be modulated, as illustrated by the effect of the branched-chain amino acids and alanine on the Lrp binding affinity (298). Thus, environmental factors can affect fim phase variation, not only by affecting the level of recombinase(s) but also by directly influencing the inversion reaction.
Differences in fim phase variation rates have been found among clinical isolates. One of the identified differences may lie in variable FimB expression due to its dependence on the availability of the minor tRNA LeuX (293). The leuX locus is often linked to pathogenicity islands, and its expression may vary in concert with that of virulence factors (74, 75). In addition, sequence variation of the invertible element and the putative presence of an additional, strain-specific transcriptional activator have been implicated in strain-dependent variation. These differences in fim regulation may influence the relative success of a strain in different environments.
Our understanding of the molecular mechanism underlying the fim expression system has made it possible to address the occurrence and role of fim phase variation in vivo. The percentage of bacteria that contained the element in the "on" position was determined in animal models of urinary tract infection (114, 145, 343) and in bacteria isolated from women with urinary tract infections (205). In addition, using the mouse model, the effect on bacterial colonization of preventing inversion of the fim invertible DNA element was examined (113). Both lines of research suggest that phase variation itself is an important feature during different stages of infection. With a phase-varying isolate, a bias to the "on" state was observed in bacteria in the bladder at specific times of infection, but the relative contributions of regulation of phase variation and of host-driven selection against a specific expression phase are not yet clear. Applying this general approach to other infection models should yield valuable insights into bacterium-host interactions and the role of phase variation in this interaction.
(b) Other CSSR-dependent types of phase variation. DNA inversion-mediated site-specific recombination causes an interesting combination of antigenic and phase variation of the PilV protein of the type IVB pilus in S. enterica serotype Typhi (415). Inversion is mediated by the Rci recombinase, which is a member of the tyrosine recombinase family. Rci was first determined to be involved in inversion associated with a shufflon located on plasmids pR64 and pR721. This shufflon determines which of seven C-terminal ends is incorporated into the PilV subunit of the pilus, and this determines the host range for plasmid transfer. The Rci-mediated inversion at the pilV gene of S. enterica serotype Typhi, however, results in biphasic antigenic variation of the type IVB pilus. Inversion of a 490-bp fragment causes one of two variable C termini to be fused to a constant N terminus of PilV, which is a minor component of the pilus. However, under conditions favoring very rapid inversion, specifically when DNA is highly supercoiled, neither PilV protein is expressed. It is thought that RNA polymerase becomes detached during the inversion process and that therefore during very rapid inversion there is insufficient time for RNA polymerase to synthesize a full-length transcript (243, 414). The antigenic variation affects receptor recognition of the pilus, whereas pilus-mediated autoaggregation occurs in the absence of PilV expression.
DNA inversion is also mediated by members of the serine recombinase family (also called the DNA invertase family), for example, Hin-mediated recombination leading to flagellar H1/H2 antigenic variation in S. enterica serotype Typhimurium. The main difference with tyrosine recombinase-mediated DNA inversion lies in the details of the biochemistry and mechanism of recombination (reviewed in references 120 and 160). H1/H2 antigenic variation is a result of expression of either FliC or FljB, respectively, and is a result of Hin-mediated site-specific recombination at two 26-bp inverted repeats. This causes inversion of a 995-bp region that contains the promoter for fljB and fljA (formerly designated rH1) (316, 327, 328). Thus, when the promoter is oriented toward transcription of the fljBA operon, FljB (H2) flagella are expressed concomitant with the repressor fljA. FljA represses fliC expression by affecting both transcription and translation and may interact with the 5' untranslated region of fliC (35). On Hin-mediated inversion, fljB and fljA are not transcribed, fliC repression is abrogated, and H1 flagella are expressed. Similar to Fim-mediated inversion, cellular factors facilitate the formation of the protein nucleocomplex that must be formed to allow the recombination to occur and is required to activate the recombinase (125). These factors will influence the rate of inversion, as illustrated by a 150-fold increase in the rate of Hin-mediated inversion by Fis binding to the recombination enhancer element and Fis-Hin interactions (161, 239, 364). Other recombinases in this family include the phage-encoded Gin and Cin of E. coli Mu and P1, respectively. These enzymes cause a DNA inversion event that determines the composition of the phage tail fiber and thus the host range of the bacteriophage (110, 149, 363).
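The regulatory logic of the H1/H2 switch described above can be written as a small truth table. The Python sketch below is a Boolean simplification with hypothetical function names: expression is treated as all-or-none, FljA repression of fliC is treated as complete, and the kinetics of inversion and the roles of Fis and other cellular factors are not modeled.

```python
def flagellar_expression(promoter_toward_fljBA: bool) -> dict:
    """Return which gene products are made for a given orientation of the invertible region."""
    fljB = promoter_toward_fljBA      # H2 flagellin, transcribed from the invertible promoter
    fljA = promoter_toward_fljBA      # repressor of fliC, cotranscribed with fljB
    fliC = not fljA                   # H1 flagellin, made only when FljA is absent
    return {"FljB": fljB, "FljA": fljA, "FliC": fliC}


def flagellar_antigen(promoter_toward_fljBA: bool) -> str:
    """Name the flagellar antigen displayed in this simplified model."""
    state = flagellar_expression(promoter_toward_fljBA)
    return "H2" if state["FljB"] else "H1"


def hin_inversion(promoter_toward_fljBA: bool) -> bool:
    """One Hin-mediated inversion flips the orientation of the promoter."""
    return not promoter_toward_fljBA
```

In this model a single inversion converts an H2-expressing cell into an H1-expressing cell and a second inversion restores H2, so exactly one flagellin is expressed in either orientation. This is what makes the system biphasic antigenic variation rather than on/off phase variation.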
The recombinase Piv in Moraxella bovis and M. lacunata also mediates a DNA inversion event, which causes antigenic and phase variation of type IV pili (197, 222, 223). Piv is not a member of the tyrosine or serine recombinase family but has sequence similarity to MooV transposase, which is involved in phase variation in Pseudoalteromonas atlantica (see below) (353). Piv causes DNA inversion between the coding region of tfpQ and tfpI, which determines which of these pilins is expressed and incorporated in type IV pili. In M. lacunata, TfpI contains a mutation that inactivates the gene product, and therefore inversion in this species leads to on/off phase variation of TfpQ (302).
Eight loci have been identified in B. fragilis that encode eight different capsular polysaccharides. Expression of each is under the control of on/off phase variation. At seven loci, the expression phase correlates to the orientation of an invertible DNA element in the locus. This element contains the promoter of a regulatory gene for the corresponding locus. One specific invertible element consists of 193 bp that is flanked on each side by a 19-bp inverted repeat sequence. The other invertible elements are of similar size and are flanked by similar inverted repeats. The corresponding recombinase(s) has, however, not been identified yet (183a).
(ii) Insertion and excision of genetic elements from the chromosome. Transposition can mediate reversible phase variation only if excision is precise, with restoration of the original sequence of the recipient DNA; in most transposition events, the original sequence of the recipient DNA is not restored after excision of the transposing element. Furthermore, classic transposition in general does not target a specific DNA sequence. In contrast, transposition mediated by the putative transposase MooV does lead to phase variation; indeed, this transposition requires short sequence identity between the insertion element and the target sequence (277). Therefore, transposition-mediated phase variation can occur but may be limited to a specific group of transposable elements and recombinases.
MooV appears to mediate phase variation of the extracellular polysaccharide encoded by the eps locus in certain isolates<