
Clinical Microbiology Reviews, July 2004, p. 581-611, Vol. 17, No. 3
0893-8512/04/$08.00+0     DOI: 10.1128/CMR.17.3.581-611.2004
Copyright © 2004, American Society for Microbiology. All Rights Reserved.

Phase and Antigenic Variation in Bacteria

Marjan W. van der Woude¹* and Andreas J. Bäumler²

Department of Microbiology, School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6076,¹ Department of Medical Microbiology and Immunology, College of Medicine, Texas A & M University System Health Science Center, College Station, Texas 77843-1114²

SUMMARY
INTRODUCTION
PHASE-VARIABLE PHENOTYPES AND MOIETIES
    Colony Morphology and Opacity
    Capsule
    Fimbriae and Pili
    Flagella
    Other Surface-Exposed Proteins
    LPS and LOS Modification: Variation in Expression of Surface Epitopes
    DNA Restriction-Modification Systems
    Regulatory Proteins
    Metabolism-Associated Genes
    Phage Genes
    Concluding Remarks
MOLECULAR MECHANISMS OF PHASE VARIATION
    Genetic Regulation
        Short sequence repeats and slipped-strand mispairing
        Homologous (general) recombination.
        Site-specific recombination.
        (i) Inversion of a DNA element by CSSR.
            (a) Type 1 fimbrial phase variation.
            (b) Other CSSR-dependent types of phase variation.
        (ii) Insertion and excision of genetic elements from the chromosome.
    Epigenetic Regulation
        Pap phase variation
        Ag43 phase variation
    Phase Variation as Part of the Cell's Regulatory Network
        Cross regulation
        Environmental regulation
    Concluding Remarks
GENOMICS AND PHASE VARIATION
BIOLOGICAL SIGNIFICANCE OF PHASE VARIATION
    Persistence through Surface Variation
    Evasion of Cross-Immunity
    DNA Restriction-Modification Systems
DIAGNOSTIC AND EXPERIMENTAL SIGNIFICANCE OF PHASE VARIATION
ACKNOWLEDGMENTS
REFERENCES

   SUMMARY
 
Phase and antigenic variation result in a heterogeneous phenotype of a clonal bacterial population, in which individual cells either express the phase-variable protein(s) or not, or express one of multiple antigenic forms of the protein, respectively. This form of regulation has been identified mainly, but by no means exclusively, for a wide variety of surface structures in animal pathogens and is implicated as a virulence strategy. This review provides an overview of the many bacterial proteins and structures that are under the control of phase or antigenic variation. The context is mainly within the role of the proteins and variation for pathogenesis, which reflects the main body of literature. The occurrence of phase variation in expression of genes not readily recognizable as virulence factors is highlighted as well, to illustrate that our current knowledge is incomplete. From recent genome sequence analysis, it has become clear that phase variation may be more widespread than is currently recognized, and a brief discussion is included to show how genome sequence analysis can provide novel information, as well as its limitations. The current state of knowledge of the molecular mechanisms leading to phase variation and antigenic variation is reviewed, and the way in which these mechanisms form part of the general regulatory network of the cell is addressed. Arguments both for and against a role of phase and antigenic variation in immune evasion are presented and put into new perspective by distinguishing between a role in bacterial persistence in a host and a role in facilitating evasion of cross-immunity. Finally, examples are presented to illustrate that phase-variable gene expression should be taken into account in the development of diagnostic assays and in the interpretation of experimental results and epidemiological studies.


   INTRODUCTION
 
Phenotypic variation in bacteria has long been recognized and has been a focus of study mainly in bacterial pathogens. This interest has been fueled by the observation that phenotypic variation in pathogens, most readily visible as colony variation, is often associated with the virulence of the strain. Alternating between two phenotypes in a heritable and reversible manner can be classified as phase variation or antigenic variation. These terms, phase variation and antigenic variation, however, have been used in various ways. Phase variation in general refers to a reversible switch between an "all-or-none" (on/off) expressing phase, resulting in variation in the level of expression of one or more proteins between individual cells of a clonal population. What distinguishes this variation from genetic noise and classical gene regulation is that there is a genetic or epigenetic mechanism that allows the variability to be heritable. This means that a daughter cell will inherit the expression phase of the parent. However, the phase of expression must also be reversible between generations, and the frequency of this reversion should exceed that of a random mutation. Thus, in a clonal population after cell division, the majority of daughter cells will retain the expression phase of the parent but a minority will have switched expression phase. The switch is a stochastic event, even though the chance that it occurs in some cases can be influenced by external factors; in other words, the switching frequency can be modulated. The frequency with which this occurs is characteristic for the gene, the bacterial species, and the regulatory mechanism. This can be as high as a change in 1 cell per 10 per generation but more often is on the order of 1 change per 10³ cells per generation.
The term "phase variation" is used in other contexts as well, describing phenotypic "phase variants" of a species in which the change is irreversible or variants that are a result of environmental regulation, selection, or unidirectional mutation. In this review, however, we adhere to the definition in the sense that the expression phase must be inherited by a genetic or epigenetic mechanism and that this change must be reversible. The actual switching frequencies are not reported here, because the methods that are used to determine them vary significantly and because they can be modulated by growth conditions. Differences are therefore difficult to interpret.
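
The heritable but reversible switching described above can be illustrated with a minimal two-state sketch. It assumes (this is not stated in the review) that every cell switches independently with a fixed probability per generation and that both phases grow equally fast; the function names and the example rate of 1 switch per 10³ cells per generation are illustrative choices, not measured values.

```python
def on_fraction(generations, on_to_off, off_to_on, start_on=1.0):
    """Expected fraction of cells in the ON phase after a number of generations.

    on_to_off / off_to_on: assumed per-cell, per-generation switching
    probabilities (hypothetical parameters for illustration only).
    start_on: fraction of the founding population that is ON.
    """
    fraction = start_on
    for _ in range(generations):
        # Cells that stay ON plus OFF cells that switch ON this generation.
        fraction = fraction * (1.0 - on_to_off) + (1.0 - fraction) * off_to_on
    return fraction


def equilibrium_on_fraction(on_to_off, off_to_on):
    """Long-run ON fraction of the two-state model (no selection assumed)."""
    return off_to_on / (on_to_off + off_to_on)


if __name__ == "__main__":
    rate = 1e-3  # on the order of 1 switch per 10^3 cells per generation
    for n in (1, 10, 100, 1000):
        print(n, on_fraction(n, rate, rate))
    print("equilibrium:", equilibrium_on_fraction(rate, rate))
```

In this sketch a clone founded by a single ON cell remains mostly ON for many generations yet always contains a minority of OFF cells, which matches the qualitative picture of a heterogeneous clonal population; any selection or environmental modulation of the rates is deliberately left out.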

Antigenic variation refers to the expression of functionally conserved moieties within a clonal population that are antigenically distinct. The genetic information for producing a family of antigenic variants is available in the cell, but only one variant is expressed at a given time. In an excellent chapter on antigenic variation in bacteria, Barbour listed three criteria that must be fulfilled for variation to be considered as antigenic variation (12). These are (i) that the antigenic change must be involved in avoidance of immune or niche selection, (ii) that it is a multiphasic change, and (iii) that the mechanism is consistent with gene conversion. In this review, the term is used in a broader sense, including biphasic variation, antigenic variation for which the biological significance is not clear, and modification of the antigenic identity of a cell surface structure as a result of a phase-varying enzyme. Antigenic variation in eukaryotic pathogens, including Plasmodium falciparum and African trypanosomes, is not addressed here but is discussed in recent reviews (15, 37, 70, 77, 217, 358).

We will refer the reader to reviews that present in-depth discussions on specific topics relating to phase and antigenic variation, where relevant. These focus, for example, on a single bacterial species, regulatory mechanism, or biological role (29-31, 68, 85, 119, 127, 140, 160, 177, 364, 368, 410). In this review, we hope to provide the reader with awareness and basic understanding of the prevalence, mechanisms, and significance of phase variation in bacteria, with an emphasis on recent developments and insights.


   PHASE-VARIABLE PHENOTYPES AND MOIETIES
 
The classical view of phase variation and antigenic variation is that their role is to help the bacterium evade the host immune system. This seemed to be supported by the fact that the structures that were found to phase vary were on the cell surface, where they would be exposed to the immune system. From the examples discussed below and those presented in Table 1, it is evident that the majority of identified phase-variable moieties are indeed on the surface and exposed to the environment. However, some variation occurs for which there is no evidence of association with changes in the cell surface, such as phase variation of DNA modification. Furthermore, from Table 1, it is clear that in some bacterial species many more loci have been identified that are under the control of phase or antigenic variation than in other species. Below, both well-studied and less well-studied examples are described that illustrate the diversity in moieties affected and species in which phase and antigenic variation have been identified. References are provided for systems that are not described in detail but may be of interest to the reader, and, where relevant, we refer the reader to Table 1 or other sections of this review where additional aspects of the system are discussed.


TABLE 1. Representative selection of bacterial species in which phase and/or antigenic variation occurs

 
Colony Morphology and Opacity

Historically, phase-variable structures have been recognized by their effect on colony morphology that led to descriptions like dry versus moist, ruffled versus smooth, and opaque versus translucent. These changes in colony morphology can be attributed to phase variation of a variety of surface-exposed proteins, of the capsule, and of cell wall composition. These changes most probably lead to altered packing of cells in the colony, which determines the colony morphology. This relationship was directly shown for Vibrio parahaemolyticus with variable levels of capsule production (92).

In Haemophilus influenzae type b strains, three colony variants, opaque, intermediate, and translucent, have been found. These vary in important virulence properties like colonization in the nasopharynx and serum resistance. The colony phenotype appears to be a result of specific combinations of expression of multiple phase-variable proteins, which include variation in the level of capsule production and in the level of a cell envelope protein encoded by oapA (244, 281, 296, 388) (Table 1). A clear relation between colony phenotype and colonization or virulence also exists in Streptococcus pneumoniae (172, 387). In an animal model, opaque variants were more virulent on systemic infection whereas the translucent variants were more successful colonizers of the nasopharynx. The two variants also differed in an in vitro assay of invasion and transcytosis of endothelial cells. Opaque variants produced up to sixfold more capsule and twofold less teichoic acid compared to the transparent form (292). Similarly, Streptococcus gordonii colony morphology and virulence-associated properties, including hemolysin production, phase vary (163, 374). In contrast, in Helicobacter pylori, a variable-colony phenotype is a result of phase variation of expression of phospholipase A, which indirectly affects virulence by release of urease and VacA (347, 348) (Table 1). In the pathogen Salmonella enterica serotype Typhimurium, variable colony morphology is correlated with the coordinated control of phase variation of at least four proteins (153). However, the soil isolate Pseudomonas aeruginosa also showed variable colony morphology, which was correlated with phase variation of multiple traits including aggregation and motility (71).

Color variation in colonies grown on specific media can be caused by phase variation of proteins that interact with a dye. For example, in certain strains of Staphylococcus epidermidis, phase variation of a polysaccharide adhesin leads to variable colony color when grown on Congo red agar (419, 420) (Table 1; also see "Molecular mechanisms of phase variation" below). In summary, any reversible change in colony morphology, opacity, or color indicates that the expression of one or more proteins phase varies. Further analysis of these and related phenotypic variations may provide us with new insights into bacterial survival strategies, and the strong correlation between virulence and colony morphology in pathogens suggests that characterizing the underlying molecular basis may also provide valuable insights into bacterial pathogenesis.

Capsule

The capsule can influence interactions with the host cells and host environment, including invasion, adhesion, and serum sensitivity, and is a well-recognized virulence factor. Phase variation of capsule synthesis can involve a classic on/off switch but also has been used to describe modulations in the level of production; as mentioned above, this can change colony morphology. Capsule variation has been found to occur in both gram-positive and gram-negative bacterial species, including Campylobacter jejuni (9), Citrobacter freundii (267), S. pneumoniae (379), and specific serogroups of Neisseria meningitidis (121, 122) (Table 1). In Bacteroides fragilis, eight different capsule polysaccharides can be produced per cell. The expression of each is under the control of on/off phase variation, resulting in a very diverse clonal population (183a). In H. influenzae type b cells, the level of expression of the capsule can be modulated, and an irreversible switch to a nonexpressing phenotype can occur (see the sections on recombination [below]) (reviewed in reference 297).
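
To give a sense of the diversity that eight on/off-switching capsule polysaccharides could generate in one clonal population, the following minimal sketch simply enumerates the possible expression states. It assumes that the eight loci switch independently of one another, which the text above does not state; the function name and the use of booleans to represent on/off are illustrative.

```python
from itertools import product


def expression_states(n_loci):
    """All on/off combinations for n_loci independently switching loci.

    Each state is a tuple of booleans, one per locus (True = expressed).
    Independence between loci is an assumption made for illustration.
    """
    return list(product((False, True), repeat=n_loci))


if __name__ == "__main__":
    states = expression_states(8)  # eight capsule polysaccharides
    print(len(states))  # 256 possible on/off combinations
```

Under this assumption, a single clone could in principle contain cells in any of 2⁸ = 256 capsule expression states, including the state in which no capsule polysaccharide is expressed.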

Fimbriae and Pili

Fimbriae, also referred to as pili, are proteinaceous structures that extend from the cell surface. Fimbriae assembled by the chaperone/usher-dependent and the nucleator/precipitation pathways are distinct in both structure and function from type IV pili (reviewed in references 96, 227, 306, and 381). Fimbriae mediate adhesion of the bacterial cell to host tissue through interaction with receptors located on the host cell (reviewed for Escherichia coli in reference 209). These interactions, which occur either with the main structural subunit or with a fimbrial adhesin, are often specific, so that certain chemical groups of the host proteins or lipids are recognized by the fimbrial adhesin. Fimbria-mediated attachment to inorganic, solid surfaces most probably occurs by nonspecific interactions, an important feature in biofilm formation (263, 282). Phase variation of fimbriae is regulated mostly by mechanisms that affect transcription originating at the major promoter of the operon, resulting in variable (on/off) expression of most or all genes in the fimbrial operon (209).

A single species, or even a single isolate, can express multiple fimbriae that each can phase vary. The genomes of different pathogenic E. coli isolates encode different fimbriae, some of which phase vary, as well as type 1 fimbriae, which are common to all isolates and which phase vary (Table 1). The S. enterica serotype Typhimurium genome encodes at least 11 fimbrial operons, among which phase-variable expression has been identified for pef, lpf, and fim (53, 146, 251, 256) (Table 1). The biological role of phase variation of fimbriae is discussed in "Biological significance of phase variation" (below).

Fimbrial phase variation control mechanisms and the fimbrial structural genes may have evolved as separate modules. For example, the common feature of the pap-like family of fimbrial operons is the regulatory mechanism of phase variation, but the fimbrial structural genes within this family are not all related. Conversely, the subunit of the MR/P fimbriae in Proteus mirabilis resembles that of Pap fimbriae in E. coli, but the two phase variation control mechanisms are different (128, 202) (see "Molecular mechanisms of phase variation" below) (Table 1).

Type IV pili function as adhesins and include conjugative pili. Phase variation, antigenic variation of the structural subunits, and phase-variable modification have been described. Sequence variation in the type IV conjugative pili encoded by plasmids R64, R721, and ColI-P9 occurs as a result of incorporation of only one of a set of distinct C termini in the PilV tip proteins of the pilus in an individual cell. This sequence variation is associated with different receptor specificity, thereby dictating the species that will be preferred as a DNA recipient in a conjugation reaction (154, 177, 178). N. gonorrhoeae can theoretically produce over a million different, antigenically distinct pilin subunits for its type IV pili (99, 180, 240, 319, 346; reviewed in reference 12) (see "Molecular mechanisms of phase variation" below) (Table 1). In addition, the pilus-associated protein PilC phase varies (164) (Table 1). These pili are involved in interaction with eukaryotic cells, and thus these variations are probably important for pathogenesis (28, 312). In S. enterica serotype Typhi, phase-variable expression of the PilV subunit of the type IVB pili affects the pilus-associated property of cellular autoaggregation (243) (see "Molecular mechanisms of phase variation" below). Pili can also be modified, but whether modification occurs can vary within a clone, due to phase variation in the expression of one of the enzymes involved. This is the case for glycosylation by PgtA in N. gonorrhoeae (10) (Table 1).

Flagella

Flagella mediate bacterial motility; adhesion and virulence are sometimes enhanced by flagellar expression and motility (265). Flagellin is also a pathogen-associated molecular pattern recognized by the innate immune system through Toll-like receptor 5 (204). The antigenic property of the flagellin of a bacterial isolate forms a significant part of the serological classification scheme, indicating that this is, in general, an invariant property. However, phase and antigenic variation of flagella in a clonal population does occur. As early as 1922, variation of the flagellar antigens in S. enterica serotype Typhimurium was described by Andrewes (6). This consists of variation between two forms (biphasic antigenic variation), H1 and H2, in which the flagellar subunit consists of the FliC or FljB protein, respectively (35, 326) (see "Molecular mechanisms of phase variation" below). Mutants that no longer can switch between flagellar types were altered in virulence compared to the wild-type isolate (150), and the two flagellar subunits appear to elicit different responses from eukaryotic cells (46).

In Campylobacter coli (272), Campylobacter jejuni (106, 169), and Helicobacter pylori (264), flagellar expression and motility phase vary (41, 73, 165) (Table 1). The underlying reason for this can differ. For example, in C. coli, expression of FlhA, which is required for the expression of flagellin, phase varies (272), whereas in H. pylori, expression of the fliP gene, encoding the flagellar basal body, phase varies (165) (Table 1). In Bordetella pertussis, phase-variable expression of the regulatory system BvgAS results in flagellar phase variation (340) (Table 1) (see "Regulatory proteins" below). In this species, flagellar synthesis is not required for virulence and may even be detrimental (2).

Other Surface-Exposed Proteins

Proteins that are integrated in the cell wall in gram-positive organisms or in the outer membrane in gram-negative organisms can have a variety of functions; these proteins include transporters, porins, receptors, colonizing factors, and enzymes. Antigenic or phase variation has been found for at least one member of each of these functional groups (Table 1).

In the M1inv+ clone of the gram-positive group A Streptococcus (S. pyogenes), expression of the cell wall-associated surface proteins C5a peptidase, M protein, and type IIa IgG Fc receptor phase vary, as well as expression of the capsule and pyrogenic exotoxin (52, 195). This is in part due to phase-variable expression of the DNA binding, regulatory protein Mga (see "Regulatory proteins" below) (36, 233-235, 329). Expression of the collagen-like surface protein SclB is under the control of a separate phase variation control mechanism (285) (see "Molecular mechanisms of phase variation" below) (Table 1).

In gram-negative N. gonorrhoeae and N. meningitidis strains, expression of various outer membrane proteins phase varies, including that of members of the family of outer membrane opacity proteins (opa) that facilitate adhesion (339) (see "Biological significance of phase variation" below) (Table 1) and the porin PorA (class I outer membrane protein) in serogroup B N. meningitidis (Table 1). PorA is one of the candidates for a protein-based vaccine, and phase variation, as well as the naturally occurring antigenic variation of this protein, may affect efficacy (16, 51, 355, 365). In addition, in Neisseria spp., the expression of outer membrane proteins that are involved in iron acquisition phase varies, including the siderophore receptor FetA in N. gonorrhoeae (42) and two hemoglobin receptors in N. meningitidis DNM2 (201) (Table 1). This may reflect a need to balance iron acquisition during growth in the host and to evade the immune system. Phase-varying colonizing factors include Ag43 in E. coli (64, 270) (Table 1) and OapA in H. influenzae (281, 388).

In Campylobacter fetus, which is a pathogen of domestic and wild animals, a class of proteins known as surface layer proteins (SLPs) are exported to the cell surface and are noncovalently attached to the lipopolysaccharide (LPS) (80). SLPs are important virulence factors, and the absence of SLP leads to increased sensitivity to complement activity and decreased infectivity (27). These SLPs undergo extensive antigenic variation, which is achieved by so called "nested DNA inversion" (81, 83, 84, 286; reviewed in reference 82) (see "Molecular mechanisms of phase variation" below) (Table 1). Other examples of phase variation of surface proteins in gram-negative bacteria are included in Table 1.

Mycoplasma species do not have a cell wall, and lipoproteins constitute part of the surface proteins. Many of these are under the control of phase and antigenic variation (19, 20, 26, 325, 407). This includes a substrate binding component of an ABC transporter in Mycoplasma fermentans (351). Interestingly, the pMGA family of hemagglutinins phase varies in Mycoplasma gallisepticum (207, 253, 401), whereas the homologous proteins in Mycoplasma synoviae undergo antigenic variation (254). Expression of the Vlp family of lipoproteins in Mycoplasma hyorhinis undergoes both phase variation and antigenic variation (47-50, 299, 300, 408). The combination of these two regulatory systems and the fact that there is a family of six related Vlp proteins that each are subject to these controls lead to a large repertoire of Vlp proteins that can be expressed (409). In Mycoplasma species that are human commensals and pathogens, phase or antigenic variation is indicated for M. hominis (190) and M. penetrans (250, 301).

Perhaps the best-studied example of multiphasic antigenic variation is that of lipoproteins in Borrelia spirochetes that are the causative agents of relapsing fever (reviewed in references 12 and 13). These lipoproteins are divided in two groups, Vlp and Vsp for large and small proteins, respectively, and can be further divided into different families, each with about 70% sequence identity (see "Biological significance of phase variation" below). One of the mechanisms involves recombination between an extensive repertoire of silent, variable vlp and vsp loci ("archival loci") and an expression site (reviewed in reference 12). The closely related protein VlsE in B. burgdorferi, the causative agent of Lyme disease, is also under the control of antigenic variation through a similar combinatorial variation (181, 232, 411) (Table 1).

LPS and LOS Modification: Variation in Expression of Surface Epitopes

In gram-negative bacteria, LPS is the main constituent of the outer leaflet of the outer membrane. LPS consists of a lipid A moiety, a core of polysaccharide, and an O antigen. LPS variability among species and serotypes occurs mainly in the O antigen, specifically in the identity and number of sugars in the polysaccharide chain. In some species, the core lacks the multiple O-linked saccharide units and is often therefore referred to as lipooligosaccharide (LOS) (283). LPS, also referred to as endotoxin, is a powerful stimulant of the immune system due to the lipid A moiety, which is a pathogen-associated molecular pattern recognized specifically by toll-like receptor 4 (25).

The chemical identity of LPS or LOS is defined by the addition of side groups, for example as a result of the activity of glycosyltransferases or sialyltransferases, or by the addition of phosphorylcholine (ChoP). These traits can vary within a clonal population as a result of phase variation of one or more enzymes involved in the modification. An in-depth review of the chemical nature of modification of the O antigen of LPS is presented in reference 199. LPS modifications can impact antigenicity but can also affect serum sensitivity and adhesion. This is discussed in more detail below for the LOS of N. meningitidis (see "Biological significance of phase variation"). Below, additional examples are given relating to several important pathogens. Since variable LPS modification is not easily identified, it is quite possible that it occurs in other species as well.

Ganglioside mimicry of the LOS by Campylobacter jejuni is thought to be an important factor in the development of Guillain-Barré and Miller-Fisher syndromes after infection and is associated with specific modification of the LOS. Expression of the enzymes involved in the modification can phase vary. Alternate synthesis of ganglioside GM-2-like and GM-1-like LOS in C. jejuni NCTC 11168 is a result of phase variation of expression of β-1,3-galactosyltransferase encoded by wlaN (206); in C. jejuni strain 81-176, reversible conversion between the GM-2 and GM-3 LOS correlates with phase-variable expression of the cgtA gene (111) (Table 1). Based on our understanding of these phase variation mechanisms, Linton et al. were able to show that a different C. jejuni strain, which had been characterized as producing only GM-2-like LOS, was able to convert to producing GM-1-like LOS (206). This illustrates how an understanding of phase variation at the molecular level may have a significant impact on the understanding of the pathogenicity and epidemiology of a pathogen.

Another well-studied example of apparent host mimicry occurs in the human gastric pathogen Helicobacter pylori, which can incorporate into its LPS variable carbohydrate modifications that resemble structures of the Lewis group of antigens of human blood groups. This variation occurs distinct from, and in addition to, LPS microheterogeneity (reviewed in references 8 and 384). Three fucosyl transferase genes (futA, futB, and futC, also referred to as the fucT genes), which are each under the control of a phase variation mechanism, are involved in the LPS modification (7, 385) (Table 1). Other genes involved in LPS modification may phase vary as well (309).

In both encapsulated and nonencapsulated H. influenzae, modification can occur by phase-variable Lic3A or LgtC, a sialyltransferase and a glycosyltransferase, respectively (136, 137, 226, 391) (Table 1). Interestingly, these enzymes compete for the same lactose disaccharide moiety on LOS. Thus, the activity of one enzyme affects the substrate availability for the other. Phase variation of lgt-mediated glycosyl modification in N. meningitidis and N. gonorrhoeae is discussed in more detail below (see "Biological significance of phase variation") (Table 1).

The LOS of H. influenzae can also be decorated with ChoP. Phase-variable expression of the kinase encoded by licA results in phase-variable ChoP decoration, even though other genes may also contribute (315, 394) (Table 1). The presence of ChoP appears to confer sensitivity to serum-mediated killing caused by C-reactive protein, whereas in an animal model of the nasopharynx, this may confer a competitive advantage (214, 392, 395). Genetic analysis of three genes of the lic locus was performed to determine the on or off state of genes involved in LOS modification in bacteria isolated from the nasopharynx, blood, and cerebrospinal fluid in an animal model. Tissue-specific combinations were prevalent, supporting the idea that certain combinations of LOS modification may facilitate colonization or survival in different host environments (141).

Among Neisseria species, phase-variable ChoP modification of the LPS can occur, but type IV pili also contain ChoP (389, 394). Analysis of ChoP expression in a large group of isolates suggests that the commensal isolates decorate the LPS whereas pathogenic isolates decorate the pili (321, 322, 389) (Table 1).

Other phase-variable LOS modifications in H. influenzae, as well as in the related bovine pathogen H. somnus, exist but have not been characterized further (151, 152, 155, 402). Antigenic or phase variation of LPS also occurs in certain Legionella pneumophila isolates (see "Molecular mechanisms of phase variation" below) (212, 213) and in S. enterica serotype Typhimurium (188). Modification of LOS or LPS in Francisella tularensis (61), Coxiella burnetii (98, 117), and Chlamydia spp. (211) varies, but the contribution of a phase variation control mechanism versus environmental regulation or mutation has not been determined.

DNA Restriction-Modification Systems

DNA restriction-modification (R/M) systems allow a bacterium to recognize and restrict foreign DNA that is not appropriately modified. Expression of some of these systems phase varies, but protein sequence variation analogous to antigenic variation also occurs (also see "Biological significance of phase variation"). This was first identified in Mycoplasma pulmonis, a rodent pathogen. Recombination-dependent rearrangement occurs within the hsdS genes at two loci, each containing two copies of hsdS (85). The variable HsdS proteins that are synthesized from these recombinant genes each determine a different DNA sequence specificity for the DNA R/M system (84). In addition, inversion can result in incorrect orientation of the hsdMR genes relative to the promoter, and therefore on/off phase variation also occurs (86, 87, 331).

Sequence analysis of other bacterial genomes suggests that phase variation of (putative) DNA R/M systems occurs in a variety of species (66, 68, 307, 309, 354) (see "Genomics and phase variation" below). Confirmation of phase variation has been obtained for expression of the mod gene in H. influenzae (66), for a type III modification system in Pasteurella haemolytica (304), for both the modification and restriction enzymes of a type III R/M system in H. pylori (70), and for a modification system in S. pneumoniae (275, 349) (Table 1).

In the gram-positive soil bacterium Streptomyces coelicolor A3(2), the phage growth limitation system determines reversible sensitivity and resistance to φC31 phage. The complete molecular basis is not clear, but it is associated with phase variation of a DNA methyltransferase, PglX (192, 344). Thus, some phase-variable DNA modification systems may have evolved as a protection mechanism against phage infection.

Regulatory Proteins

DNA binding proteins that function as activators or repressors can be categorized as "global" regulators with genome-wide target sequences or as "operon-specific" or "local" regulators. The expression of multiple regulatory proteins is now known to phase vary and includes representatives of both groups. The expression state of all genes that are dependent on the regulator, under either positive or negative control, will depend on the expression state of the phase-varying regulator itself. Examples are the global, virulence-associated regulatory protein in S. pyogenes, Mga (formerly Mry or VirR) (36, 329), and the BvgS protein of the global, two-component BvgAS regulatory system in Bordetella pertussis (340; reviewed in reference 228) (Table 1). The Bvg+, Bvg–, or Bvgi phase variants of B. pertussis described in later studies are not related to this phase variation mechanism. Rather, these phase variants result from the environmental modulation of BvgAS-dependent regulation (60). Expression of conjugation-related genes phase varies in Enterococcus faecalis, also as a result of phase-variable expression of a regulator, TraE (124, 280).

Phase-variable expression of operons is often the result of a mechanism that regulates the initiation of transcription at the main promoter of the operon (see "Molecular mechanism of phase variation" below). As a result, expression of local regulators encoded by genes in the operon also phase varies. For example, in E. coli, expression of the local regulator PapB, encoded by the pap operon, phase varies (Table 1) (33). This not only affects pap expression through an autoregulatory loop, but also affects type 1 fimbrial expression (97, 403) (see "Cross regulation" below) (Table 1). Together, these examples demonstrate that phase variation of a single regulatory protein can lead to coordinated, phase-variable expression of multiple cellular proteins and can establish an interdependent network of phase-variable gene expression.

Metabolism-Associated Genes

Recently, phase variation of metabolism-associated proteins was identified in the human pathogen Streptococcus pneumoniae. A comparison of protein expression patterns between two colony variants showed that at least three proteins were differentially expressed, including pyruvate oxidase (SpxB), a putative elongation factor, and a proteinase maturation protein (268). The significance of SpxB phase variation appears to be related to the hydrogen peroxide that is produced in the pyruvate oxidase-mediated conversion of pyruvate to acetyl-phosphate. The level is sufficiently high to be lethal to other species and may provide a SpxB+ isolate with a competitive advantage in a mixed-species environment (275, 276). This example shows that identification of differentially expressed proteins between colony phase variants can also yield insight into bacterial virulence strategies.

Phage Genes

The composition of the tail fiber of E. coli phage Mu determines the bacterial host range of the phage, presumably in conjunction with strain- or species-specific bacterial LPS. The composition alternates between two forms as a result of a DNA inversion event mediated by the site-specific recombinase Gin (110, 279, 363). Phage Mu is not associated with virulence of E. coli, but some virulence factors are phage encoded, for example cholera toxin (reviewed in reference 39). It is therefore conceivable that phase variation will be identified in phage proteins that (indirectly) affect the spread of virulence traits among natural bacterial populations and therefore will be an important feature from an epidemiological standpoint.

Concluding Remarks

Most of the genes that are known to undergo phase-variable expression encode surface-exposed proteins or proteins that modify or regulate surface proteins, and most of the putative phase-varying genes identified in genome-wide screens also belong to these classes (Table 1). However, in this section we have presented numerous examples showing that phase variation is not limited to proteins with a specific function or specific cellular location. Most phase-varying proteins were identified in bacterial pathogens of mammalian hosts, and many are proteins that affect virulence. This bias may reflect the significant focus of the research community on these organisms and does not rule out the occurrence of phase variation in commensal species or species that do not reside in or on a host. It is thus tempting to speculate that phase variation is more prevalent than is currently recognized. Novel phase-varying proteins may be identified as a result of our increasing understanding of the role(s) of phase variation and through the facilitated identification of some of the underlying mechanisms using genomics. Conversely, as more phase-varying genes are identified, we will be challenged to determine their biological significance.


   MOLECULAR MECHANISMS OF PHASE VARIATION
 
As is evident from Table 1, there is no correlation between the phase-varying phenotype and the regulatory mechanism. Specific mechanisms, however, appear to be more prevalent in certain species than others, and some, like epigenetic regulation, have been identified in only a few species. Understanding the molecular mechanisms that lead to phase and antigenic control is a significant part of understanding how these systems contribute to the overall success of the bacterium. For example, it can help determine whether signals can be incorporated into the system that can modulate the switch frequency. This, in turn, ultimately determines the composition of the population, which impacts the evolution and dynamics of the population, the ability to adapt to new environments, and host-bacterium interactions. This section provides an overview of essential features. Recent developments in particular are highlighted. The reader is referred to other reviews that may include different examples (30, 119, 127, 140, 160, 177) and to specialized reviews when available.

Genetic Regulation

In this section the mechanisms are discussed in which the difference in expression phase between an "on" cell and an "off" cell can be attributed to a DNA sequence change at a specific locus. All antigenic variation is a result of DNA sequence change and is due to one of several genetic mechanisms; this is also briefly discussed in this section. The change can be minor, with a single nucleotide change in the case of slipped-strand mispairing (SSM), or extensive, involving DNA rearrangements of fragments up to several kilobases. Phenotypic variation as a result of DNA rearrangement, irrespective of molecular mechanism involved, has been referred to as a shufflon (177).

Short sequence repeats and slipped-strand mispairing. Multiple contiguous repeats of units of DNA sequence can be subject to expansion or contraction of the number of repeats. A universal SSM mechanism is invoked in which misalignment of the repeat sequences occurs between the mother and daughter strands during DNA synthesis that occurs in either DNA replication or DNA repair. Misalignment between the daughter and parent DNA strands can occur on the leading or lagging strand at the repeat region, which results in an increase or decrease in the number of repetitive units in the newly synthesized DNA (200, 360, 362). These changes in the number of unit repeats can lead to phase-variable expression of a protein, if the location of these repeats is such that either transcription or translation of a gene is affected. Phase variation has been associated with repeat units that consist of 1 to as many as 7 nucleotides (nt). Repetitive sequence units are also referred to as short sequence repeats, microsatellites or variable number of tandem repeats (360).
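
The expansion and contraction of a repeat tract described above can be pictured as a random walk in the number of repeat units along a cell lineage. The following Python sketch is only an illustration under stated assumptions: the slippage probability (p_slip), the equal chance of gaining or losing a unit, and the one-unit step size are hypothetical choices and are not values taken from the studies cited here.

```python
import random

def tract(unit, n_repeats):
    """Return the DNA sequence of a tract made of n_repeats copies of unit."""
    return unit * n_repeats

def slip(n_repeats, rng, p_slip=0.01, min_repeats=1):
    """One round of DNA synthesis.

    With probability p_slip (a hypothetical value) the newly synthesized
    strand gains or loses exactly one repeat unit; otherwise the tract is
    copied faithfully. The tract never shrinks below min_repeats.
    """
    if rng.random() < p_slip:
        n_repeats += rng.choice((-1, 1))
    return max(n_repeats, min_repeats)

def simulate_lineage(n_start, generations, seed=0, p_slip=0.01):
    """Follow the repeat number along a single lineage of cells."""
    rng = random.Random(seed)
    history = [n_start]
    for _ in range(generations):
        history.append(slip(history[-1], rng, p_slip))
    return history

# Example: a dinucleotide (TA) tract that starts with 10 repeat units.
history = simulate_lineage(n_start=10, generations=1000, seed=1)
final_tract = tract("TA", history[-1])
```

Whether a given change in repeat number switches expression depends on where the tract lies relative to the gene, which the sketch deliberately leaves out.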

Regulation at the level of transcription occurs when the repeats are located in the promoter region between the –10 and –35 sites for RNA polymerase binding (Fig. 1A, region 2). The spacing of these sites is critical for the level of transcription, and even a single-nucleotide deviation from the optimal 17-nucleotide (nt) spacing has an effect. Phase variation of the fimbriae encoded by hif in H. influenzae occurs as a result of variation between 9, 10, or 11 repeats of the dinucleotide TA located between the –10 and –35 sequences for the overlapping, divergent promoters for the hifA and hifB genes. These encode the major fimbrial subunit and chaperone, respectively. Not only does the change in strength of the promoter result in an "off" phase and an "on" phase, but also the "on" phase is represented by clones that have either a low or a high level of expression (372). A variation of this principle results in a variable level of production of the high-molecular-weight adhesins in H. influenzae, due to altered spacing between two promoters by SSM at 16 to 28 repeats of a 7-nt unit (65).



FIG. 1. Phase variation as a result of SSM at short sequence repeats. (A) Schematic of the four positions, relative to a gene, at which short sequence repeats can cause phase variation. Indicated are a coding sequence (open rectangle), promoter (–10, –35) with RNA polymerase (RNA pol), the +1 transcription start site, the Shine-Dalgarno sequence for ribosome binding (SD), and the ATG translation start codon. Repeat sequences at regions 1 through 4 can lead to phase variation by affecting transcription initiation (regions 1 and 2), translation (region 4), and as yet unidentified means (region 3) (see the text). (B) Effect on the translation product of a one-unit insertion due to SSM at the tetranucleotide repeat sequence (AGTC) in the coding sequence of mod of H. influenzae (HI056). Partial nucleotide and amino acid sequences and numbering are indicated for 31 (on) and 32 (off) tetranucleotide repeats. Note that as a result of the insertion, the reading frame changes at amino acid 177, which leads to the formation of a premature stop codon (*) following amino acid 194.

 
Transcription can also be affected by changes in repeat sequences located outside of the promoter. The change in nucleotide number can potentially affect the binding of a regulatory protein or can lead to a difference in a posttranscriptional initiation event such as mRNA stability (Fig. 1A, regions 1 and 3). Phase variation of individual fimbrial genes in B. pertussis is proposed to occur as a result of a change in a poly(C) tract that alters the distance between the binding sites of an activator and RNA polymerase (400). Similarly, in N. meningitidis strain MC58, a change in the unit repeat number in the sequence upstream of the –35 sequence of the promoter of the adhesin encoding nadA gene affects its promoter strength (225). Furthermore, in certain isolates of Moraxella catarrhalis, the length of a poly(G) tract that is located downstream of the promoter for the adhesin gene uspA but upstream of the translation initiation site also correlates with the level of gene expression (192).

Translation of a protein can be affected by SSM if the unit repeats are located within its coding sequence (Fig. 1A, region 4). The open reading frame is disrupted if SSM results in a change in nucleotide number that is not a multiple of three. In this case, a nonfunctional, usually truncated protein is synthesized. This is, for example, the basis of phase variation of the expression of the mod gene of H. influenzae, which contains over 30 repeats of the tetranucleotide (5'-AGTC) in its coding sequence (Fig. 1B) (66). The reading frame is altered, and, in addition, a premature stop codon is formed as a result of one tetranucleotide addition within the coding sequence. To summarize, SSM can cause a change in the number of unit repeats consisting of 1 to 7 nt and can affect transcription initiation, a posttranscriptional initiation event, or translation.
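
The reading-frame argument above can be checked with a few lines of Python. This is a toy sketch: the construct below (start codon, AGTC tract, downstream sequence) is hypothetical and is not the mod sequence shown in Fig. 1B, and the helper names are introduced here for illustration only. It shows that a change of one tetranucleotide unit shifts the frame and, in this construct, exposes a premature stop codon, whereas a change of three units restores the frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def frame_change(delta_units, unit_len):
    """Shift of the downstream reading frame (0, 1, or 2 nt) caused by
    gaining (positive) or losing (negative) delta_units repeat units."""
    return (delta_units * unit_len) % 3

def first_stop_codon(seq):
    """Index of the first in-frame stop codon, reading from position 0,
    or None if no in-frame stop codon is present."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

# Hypothetical toy construct: ATG + (AGTC)n + an in-frame downstream region.
UNIT = "AGTC"
DOWNSTREAM = "GCTGAAGGCAAACTGGTTCCGTAA"

def toy_gene(n_repeats):
    return "ATG" + UNIT * n_repeats + DOWNSTREAM

on_stop = first_stop_codon(toy_gene(3))    # tract length is a multiple of 3
off_stop = first_stop_codon(toy_gene(4))   # one extra tetranucleotide unit
```

In this construct the stop codon is read at codon index 12 with three repeats and at index 7 with four repeats, so the longer gene yields the shorter product. Because frame_change is 0 for any trinucleotide unit, the same function also covers in-frame size polymorphisms.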

Bayliss et al. have addressed the molecular mechanisms underlying phase variation-associated SSM. The effect of specific mutations on the switching frequency was determined for the fimbrial genes hifAB and mod, encoding a DNA modification enzyme in H. influenzae (18). Mutations in either of two genes that affect mismatch repair, dam and mutH, increased the frequency of change at the dinucleotide repeat tract of hifAB but not at the tetranucleotide repeat region of mod (18). A role for the mismatch repair system is also implicated in phase variation by SSM at single-nucleotide repeats in N. meningitidis, since increased switching rates were observed in both mutS and mutL backgrounds (291). These data indicate that a functional mismatch repair system can contribute to minimizing the occurrence of SSM at mono- and dinucleotide repeats but not at tetranucleotide repeats. In contrast, a mutation in polI increased the switching frequency only at the mod tetranucleotide repeat sequence. This suggests that incorrect processing of the Okazaki fragments results in increased instability of the region, but further details are not yet known (18). During misalignment of the strands, small loops of DNA are formed that may be stabilized by the formation of H-DNA, which was shown to form at a 5-nt repeat sequence (21). These studies show that different molecular mechanisms may be involved in stabilizing repeat regions of different unit lengths in H. influenzae and, furthermore, suggest that factors or conditions that affect mismatch repair or DNA replication may also affect SSM-dependent phase variation. How much of this can be extrapolated to other species remains to be determined. Thus, the (in)stability of a given repeat sequence in a specific bacterial species cannot yet be predicted.

Even in the absence of mutations like those described above, SSM-dependent switching frequencies can vary within an isolate. The frequency of variation at mod in H. influenzae, for example, increases with increasing numbers of unit repeats at mod (66, 291), which is probably a general correlation. The switching frequency can also be modulated by active transcription, as was shown for SSM at a poly(dC) tract in the siaD gene of N. meningitidis. The formation of a premature stop codon in siaD as a result of SSM results in disruption of the coupling of transcription and translation, and this in turn facilitates Rho-dependent termination of transcription. This transcriptional termination correlated with an increase in the frequency of change in the length of the poly(dC) tract (196). It will be a challenging but important issue to determine whether regulation of the frequency of SSM is a common occurrence and whether it is biologically significant.

Expansion or contraction of the number of nucleotides in multiples of three can cause a size polymorphism of a protein if this is located within a coding sequence. An intriguing example was described for the AhpC protein in E. coli. After a single triplet expansion in its coding sequence, the enzymatic function changed from a peroxiredoxin to a disulfide reductase. This change was observed under stress conditions that give a growth advantage to cells that had acquired this change but was nevertheless a reversible event (294). Whether phase variation by SSM at the level of translation significantly affects the biological function or antigenicity of other proteins is not known.

Additional examples of SSM-dependent phase variation are listed in Table 1. It is interesting that SSM-dependent phase variation of virulence factors has not been identified in E. coli and Salmonella sp., even though there does not appear to be a mechanistic constraint (294, 356). In these species, the potential to establish complex regulatory systems, which is facilitated by the large genome size, and a preference for stringent (environmental) regulation associated with their diverse natural habitats may have influenced the acquisition or evolution of the more complex phase variation mechanisms.

Short-sequence repeats that are not associated with phase variation, but may cause antigenic variation or other phenotypes, are discussed in an excellent review by van Belkum et al. (360). Two related topics, the identification of SSM-dependent phase-varying genes from genome sequence analyses, and the use of sequence repeats in strain identification, are discussed below in "Genomics and phase variation," and "Diagnostic and experimental significance of phase variation" respectively.

Homologous (general) recombination. Homologous, or general, recombination typically occurs at long (>50-bp) regions of homology and is dependent on numerous proteins that constitute part of the general DNA repair and maintenance machinery of the cell. Recombination between two alleles of a gene can lead to a gene conversion when this results in a unidirectional exchange of DNA. Gene conversion that is associated with antigenic variation in bacteria involves recombination between one of a repertoire of silent alleles of the gene and the gene located at the expression site. When alleles undergo constant changes as a result of recombination, this can be referred to as combinatorial variation. The mechanism(s) leading to gene conversion in bacteria may vary between species, but in general it requires the machinery of homologous recombination. However, several features distinguish it from most other RecA-dependent homologous recombination events. The frequency of this recombination is much higher, it occurs between regions of much lower homology than is usually considered necessary for RecA-mediated recombination, and additional special cis-acting factors or unidentified processes appear to be involved.

Most of our understanding of the mechanism underlying gene conversion leading to antigenic variation is a result of studies of type IV pilin antigenic variation in N. gonorrhoeae (reviewed in reference 319). The pilin proteins that form antigenic variants of the pili are conserved across the N-terminal two-thirds of the protein but vary in the remaining C-terminal region. This variation is a result of unidirectional transfer to the expression locus pilE of a sequence from one of the silent pilS loci. There can be one to six copies of the silent loci on the genome, and these pilS loci can be separated from pilE by as much as 900 kb. The copies at the silent pilS loci consist mainly of variable regions of the gene, whereas the gene in the expressed pilE locus contains both conserved and variable regions. Recombination appears to require only 2 bp of conserved sequence and occurs at a high frequency (>10⁻³), which are both unusual traits for RecA-dependent recombination. However, RecA is required for antigenic variation, and the RecF-like recombination pathway, in which RecA plays a role, appears to play an essential role in this unidirectional exchange (142, 182, 237, 319, 332, 413). The frequency decreases in a recX mutant (342). This is discussed in more detail, in the context of general recombination, repair, and replication pathways, in an excellent recent review (176). Efficient pilin gene conversion furthermore requires a conserved sequence located at the 3' end of all pil loci (Sma/Cla repeat) that may be a site for recombination but also appears to be a recognition sequence for an as yet unidentified DNA binding protein. Interestingly, such proteins may be present only in pathogenic Neisseria (376, 377).

An intriguing aspect of this recombination is that despite the unidirectional exchange of DNA, chromosomal fidelity is maintained and the sequence at the pilS loci is unaltered. Gene conversion can occur if the donor sequence for recombination is obtained by DNA transformation, in which case both aspects are readily resolved. However, this is not the case for the second mechanism, which appears to be predominant. In this case, DNA exchange occurs between the two copies of the genome formed by DNA replication (143, 319). The recently proposed "hybrid intermediate" model addresses how this genetic exchange occurs, and critical aspects of this model have been verified experimentally (142; reviewed in reference 176). The first step involves a RecA-independent recombination event in the donor chromosome between very short sequences of homology of a pilS locus and pilE sequence, forming an extrachromosomal circular hybrid pilE-pilS molecule but presumably also an additional, undefined intermediate molecule that is critical for the next step (Fig. 2A). This hybrid molecule donates pilS sequence to the pilE locus in the recipient chromosome in a second recombination event. This step requires RecA and involves recombination at a larger region of homology flanking the pil sequences, as well as recombination at a short region of homology within the gene (Fig. 2B) (142). The recombination events result in a unidirectional exchange of variable pilS sequence to the pilE locus without altering the donor pilS sequence. The available experimental data do not rule out other models for gene conversion in Neisseria, and identifying the predominant intermediate molecular structure in the recombination step will be critical in resolving this important recombination mechanism (176).



FIG. 2. Intermediate hybrid model for gene conversion at pilE in N. gonorrhoeae as a result of homologous recombination (142). Open rectangles designate the conserved region of the pil gene, patterns designate the different variable sequences, and the thick bar indicates some of the very short conserved sequences. (A) DNA exchange occurs between a silent pilS locus and the pilE locus of the donor chromosome at a short region of homology. This RecA-independent recombination is indicated by the light cross and results in the formation of an intermediate pilE-pilS hybrid molecule, depicted here as a circular extrachromosomal molecule. The predominant intermediate hybrid molecule may have a different structure (176). (B) The intermediate hybrid structure donates sequence to the pilE locus of the recipient chromosome, involving two crossover events, a RecA-dependent one at a larger region of homology (heavy cross) flanking the pil sequence and one at a short region of homology, depicted here within the pil sequence.

 
The same mechanism that leads to antigenic variation can also cause on/off phase variation of expression of type IV pili. This occurs when a nonfunctional gene is created by the recombination reaction or if, for example, a pilS sequence that contains a premature stop codon is transferred to the pilE locus. An irreversible switch to a nonexpressing phenotype can also occur as a result of recombination-dependent deletion of complete regions of pil containing DNA (221, 317).
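
The logic of unidirectional gene conversion, and of the on/off switching it can cause, can be summarized in a short Python sketch. It is a strong simplification under stated assumptions: whole cassettes are copied (real events can transfer partial sequences), the hybrid intermediate is not modeled, and all sequences and names below are hypothetical toy values rather than Neisseria sequences.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}

@dataclass(frozen=True)
class PilLoci:
    conserved: str      # conserved part of the expressed pilE gene
    pile_variable: str  # variable part currently at the expression locus
    pils: tuple         # silent cassettes (variable sequence only)

def gene_conversion(loci, donor_index):
    """Copy one silent cassette into pilE. The donor cassette itself is
    left unchanged, which is what makes the exchange unidirectional."""
    return PilLoci(loci.conserved, loci.pils[donor_index], loci.pils)

def is_expressed(loci):
    """'On' only if the first in-frame stop codon is the final codon."""
    gene = loci.conserved + loci.pile_variable
    codons = [gene[i:i + 3] for i in range(0, len(gene) - 2, 3)]
    stops = [i for i, codon in enumerate(codons) if codon in STOP_CODONS]
    return len(gene) % 3 == 0 and bool(stops) and stops[0] == len(codons) - 1

# Hypothetical toy loci: the second silent cassette starts with a stop codon.
start = PilLoci(conserved="ATGAAAGCT",
                pile_variable="GGTCTGTAA",
                pils=("GCACCGTAA", "TGACCGTAA"))

variant = gene_conversion(start, 0)      # new antigenic variant, still "on"
off_variant = gene_conversion(start, 1)  # premature stop codon, "off"
```

Because the silent cassettes are retained after every conversion, an "off" variant produced in this way can return to an "on" state through a later conversion with an intact cassette.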

Concerning the biological role, antigenic variation of the variable major lipoprotein (Vmp) in Borrelia hermsii (238, 288, 289) and of the VlsE surface proteins in B. burgdorferi (411, 412) is well understood (reviewed in reference 12) (see "Biological significance of phase variation" below). Less is known about the mechanisms and molecular pathways underlying this antigenic variation. Detailed analysis of the genetic exchanges underlying specific seroconversion events in B. hermsii has, however, led to the identification of four mechanisms. The first mechanism is consistent with gene conversion and involves a nonreciprocal recombination event of genes from silent (archival) loci from a linear plasmid to an expressed locus, vlp7, near the telomere on plasmid lp28-1. A second mechanism involves a less frequent occurrence of intraplasmidic recombination at a region containing duplicated sequence. This results in loss of a fragment of DNA but occurs only at the expression site. A third mechanism results in introduction of point mutations at the expressed locus, but these may also originate from archival loci. Finally, by mechanisms that are not clear, transcription of the gene at the expression site on lp28-1 can be silenced in conjunction with expression specifically of vsp33 from a site internally located on a 53-kb plasmid. More details can be found in the excellent recent review by Barbour and in the references therein (12).

Antigenic variation is also extensive in Mycoplasma species (reviewed in reference 48). This includes antigenic variation of the Vsa and Vsp lipoprotein family in M. pulmonis (26, 325) and M. bovis (216) and of the VlhA hemagglutinin in M. synoviae (254). This antigenic variation results from genomic rearrangements and is commonly associated with DNA inversion events, which may be the result of either homologous recombination or site-specific recombination (see below). Details of most of the underlying molecular mechanisms remain to be elucidated for these important pathogens (138, 171, 215, 254, 325, 330). DNA inversion as a result of homologous recombination leads to antigenic variation of the SLPs in Campylobacter fetus. Antigenic variation of SLPs involves reassortment of eight sap genes, each encoding an antigenically distinct SLP, and a single sap promoter. The DNA inversion involves a fragment of 6.2 kb with the single promoter or, in addition, a flanking region with one or more of the variable sap cassettes. The DNA rearrangement positions the one sap promoter to transcribe one of the eight sap genes. These inversion events, also referred to as nested DNA rearrangements, decreased in a recA mutant, suggesting partial dependence on RecA (81, 84, 286).

Gene duplication by recombination is invoked in modulating the level of expression of a gene. In H. influenzae type b, a heritable variation in the level of capsule production occurs as a result of gene duplication of the cap genes, which may be enhanced by the flanking IS-like sequence. In addition, in type 1 H. influenzae type b, an irreversible switch to the nonexpressing type can occur when the bexA gene, which is essential for capsular synthesis, is lost as a result of recombination between duplicated cap sequences flanking bexA (184-186, 296; reviewed in reference 297). The latter event can be reversed only by transformation with DNA from a bex+ isolate (184, 185). Phase variation of capsule production in several Streptococcus pneumoniae serotypes is also associated with DNA duplication and excision. In this case, tandem duplication and precise excision of random fragments of 11 to 239 bp occur in genes essential for capsule production. Duplication of the sequence leads to disruption of the open reading frame and thus to a switch to a nonexpressing phenotype. The recombination mechanism has not been characterized but is likely to be RecA dependent since the repeat region is long (378, 379). This is reminiscent of RecA-dependent variation in the number of long repeat units (over 200 nt) in the coding sequence of the alpha C surface proteins in group B Streptococcus, the M proteins in group A Streptococcus, and the Esp protein in Enterococcus faecalis. However, in these genes the coding sequence remains in frame and the change in the number of unit repeats affects the antigenicity of the protein (107, 284).

Site-specific recombination. Nonhomologous, site-specific recombination requires specific enzymes that act at cognate DNA sequences that may have sequence identity, but often in a region of no more than 30 bp. Here the distinction is made between conservative site-specific recombination (CSSR) that can lead to inversion, insertion or excision of a DNA region, and transposition. These recombination events can lead to a variety of genetic rearrangements, some of which will lead to phase or antigenic variation (also reviewed in references 120, 160, and 177).

Based on biochemical properties like sequence, structure, and mechanism of recombination, the associated CSSR recombinases can be divided into two major families, which are the serine and tyrosine families of recombinases, formerly designated the resolvase-invertase and λ integrase families, respectively. There is a significant amount of functional overlap among these enzymes, and enzymes of either group can mediate phase and antigenic variation. A third family consisting of two enzymes has recently also been identified, but little is known about the molecular mechanism of recombination (353). For additional details about the biochemical properties of these recombinases and the molecular mechanisms of transposition and site-specific recombination, see two recent reviews (120, 160).

(i) Inversion of a DNA element by CSSR. Recombination mediated by members of the serine and tyrosine families of recombinases occurs at short regions of DNA that contain some sequence similarity or identity required for enzyme recognition and results in reciprocal DNA strand exchange. This recombination is the molecular basis of many DNA inversion events that are involved in creating clonal antigenic diversity in bacteria and phages. Depending on whether the genetic information of this inverted element contains regulatory sequence or coding sequence, inversion can lead to on/off phase variation, biphasic antigenic variation, or even multiphasic antigenic variation.
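
At the sequence level, inversion replaces the segment between the two recombination sites with its reverse complement, so that a promoter carried on the element faces the opposite way. The Python sketch below illustrates this for an on/off switch. All sequences are hypothetical toy values (the 9-bp repeats, the fused promoter motif, and the flanking bases are not taken from any of the systems discussed here), and treating the repeats themselves as fixed is a simplification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def invert_segment(genome, start, end):
    """Replace genome[start:end] with its reverse complement."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]

def promoter_faces_gene(genome, promoter, gene_start):
    """True if the promoter motif reads 5' to 3' on the top strand
    upstream of gene_start, i.e. it points toward the gene."""
    pos = genome.find(promoter)
    return pos != -1 and pos + len(promoter) <= gene_start

# Hypothetical toy locus: flank, left repeat, element, right repeat, gene.
LEFT_REPEAT = "TTGGGGCCA"
RIGHT_REPEAT = reverse_complement(LEFT_REPEAT)   # inverted repeat
PROMOTER = "TTGACATATAAT"                        # toy -35/-10 motif
ELEMENT = "AC" + PROMOTER + "GG"

locus_on = "CCC" + LEFT_REPEAT + ELEMENT + RIGHT_REPEAT + "ATGAAA"
element_start = len("CCC" + LEFT_REPEAT)
element_end = element_start + len(ELEMENT)
gene_start = element_end + len(RIGHT_REPEAT)

locus_off = invert_segment(locus_on, element_start, element_end)
restored = invert_segment(locus_off, element_start, element_end)
```

Applying the inversion twice restores the original sequence, which is why this type of switch is reversible without loss of genetic information.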

The well-studied Cre recombinase can mediate recombination between two lox sequences in the absence of other factors. In contrast, in most cases where site-specific recombination leads to phenotypic variation, there is a requirement for cellular proteins in addition to the recombinase, presumably to form a recombination-proficient protein nucleocomplex. Through these factors, control of the recombination event can be exerted.

Inversion of a DNA element causes phase variation of expression of the fim and fot operons in E. coli and mrp in P. mirabilis, encoding type 1, CS18, and MR/P fimbriae, respectively (1, 135, 202, 416). In each case, the invertible element contains a promoter that is essential to transcribe the structural operon. This promoter is correctly positioned for this transcription in only one of the two orientations of the invertible element. The recombinase enzymes mediating the inversion, two for fim and one each for fot and mrp, have homology to each other and are members of the tyrosine recombinase family (135, 160). The most extensively studied system is that of fim, encoding type 1 fimbriae. Essential features of the fim system are outlined below. For a more detailed discussion, the reader is referred to an excellent review and the references therein (31).

(a) Type 1 fimbrial phase variation. Type 1 fimbriae, encoded by the fim operon, are the most common fimbrial adhesins in E. coli isolates. These fimbriae are thought to be of particular importance in mediating attachment to host tissue during bacterial colonization of the bladder, which can lead to cystitis, and can also mediate bacterial invasion into bladder epithelial cells (89, 194, 313). Expression of type 1 fimbriae phase varies as a result of the inversion of a regulatory element that contains the essential promoter for transcription of the fimbrial structural genes, the fimA promoter (Fig. 3). The invertible element consists of 296 bp flanked by two 9-bp inverted repeats (IRR and IRL). The main subunit of the fimbriae is FimA, and the fimA promoter is properly oriented for transcription of fimA when the invertible element is in the "on" orientation. In the "off" orientation, the promoter is incorrectly oriented for transcribing fimA and fimbriae are not synthesized. Thus, the inversion event is the main feature of this phase variation system. This DNA inversion is mediated by the two site-specific recombinases, FimB and FimE. These have 48% amino acid identity, but their DNA specificity and activity differ. FimB mediates inversion in both directions, whereas FimE mediates the inversion predominantly in the "on" to the "off" direction (174, 230). The FimE bias is due in part to its substrate preference for DNA in the "on" orientation (102, 187). The frequency of inversion mediated by FimB is on the order of 10⁻³ to 10⁻⁴, whereas the FimE-mediated inversion frequency is as high as 10⁻¹.
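
The consequence of these unequal inversion frequencies for the composition of a population can be estimated with a two-state model. The Python sketch below is an illustration under stated assumptions, not a description of measured population dynamics: the frequencies are treated as per-cell, per-generation switching probabilities, the FimB and FimE contributions in the "on"-to-"off" direction are simply added, both phases are assumed to grow equally fast, and the particular values are picked from the ranges quoted above.

```python
def steady_state_on_fraction(k_on_to_off, k_off_to_on):
    """Equilibrium fraction of 'on' cells in a two-state switch."""
    return k_off_to_on / (k_on_to_off + k_off_to_on)

def step(on_fraction, k_on_to_off, k_off_to_on):
    """Advance the 'on' fraction by one generation."""
    return on_fraction * (1 - k_on_to_off) + (1 - on_fraction) * k_off_to_on

def simulate(on_fraction, generations, k_on_to_off, k_off_to_on):
    """Iterate the switch for a number of generations."""
    for _ in range(generations):
        on_fraction = step(on_fraction, k_on_to_off, k_off_to_on)
    return on_fraction

# Assumed values chosen from the ranges given in the text.
FIMB_RATE = 1e-3   # FimB, either direction
FIME_RATE = 1e-1   # FimE, "on" to "off" only

fimb_only = steady_state_on_fraction(FIMB_RATE, FIMB_RATE)
fimb_and_fime = steady_state_on_fraction(FIMB_RATE + FIME_RATE, FIMB_RATE)
after_500 = simulate(1.0, 500, FIMB_RATE + FIME_RATE, FIMB_RATE)
```

Under these assumptions, FimB alone would give an even mix of "on" and "off" cells, whereas adding FimE lowers the equilibrium "on" fraction to roughly 1%. This is one way to see why the relative amount of the two recombinases matters for the net phase variation rate.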



FIG. 3. Phase variation of type 1 fimbrial expression, encoded by the fim operon, in E. coli as a result of DNA inversion mediated by CSSR. The relative positions of the promoters (open arrows), genes (open rectangles), and inverted repeats IRR and IRL (triangles) at fim are shown. The invertible DNA sequence and its orientation are depicted by a shaded bar. IRR and IRL are within the binding sites for the recombinases FimB and FimE. Binding sites for other regulatory proteins (Lrp and integration host factor) are not shown (see the text). The drawing is not to scale and is not meant to convey protein size or other biochemical properties.

 
The relative amounts of the two recombinases affect the net phase variation rate of type 1 fimbriae. The fimB and fimE genes are each transcribed from their own promoters (Fig. 3), and several factors have been identified that regulate the transcription of these genes. For example, H-NS affects both fimB and fimE transcription (260, 261), whereas DNA supercoiling affects fimB transcription (79, 130). Recently, it was determined that N-acetylneuraminic acid suppresses FimB expression as well as recombination. This regulation requires the presence of a newly identified, cis-acting regulatory element that is located over 600 bp upstream from the fimB promoter. This element contains regions that function as a silencer and regions involved in antirepression (91). Thus, regulation of the transcription of the recombinase genes affects phase variation rates. The significance of incorporating such regulatory signals may be, in part, to indicate to the bacterium that it is present in the intestine of the host.

The level of FimE expression is also affected by posttranscriptional regulation. Specifically, the level of expression is higher when the downstream invertible element is in the "on" orientation than when it is in the "off" orientation. This "orientation control" is a result of differential stability of the fimE transcript. The transcript is quite stable when the invertible element is in the "on" orientation, and the transcript extends into the switch region. The fimE transcript is less stable when the element is in the "off" orientation. Differential stability may be the result of the formation of a Rho-dependent terminator in the "off" orientation or of a secondary structure that stabilizes the transcript in the "on" orientation (166, 335). Thus, the invertible element exerts its control at two levels, first by determining the orientation of the main fimA promoter and second by affecting the level of FimE production.

The relative amounts of FimB and FimE are important in current models of regulation, but a much more complex picture of fim regulation has emerged. For example, a bias to the "off" phase occurs as a direct result of transcription originating at the fimE promoter, even in the absence of functional FimE (260). Several lines of evidence also suggest that transcription from the fimA promoter and the DNA inversion event are mutually exclusive (260). In addition, an "off" phenotype in which fimbriae are not produced can be obtained even with the DNA in the "on" orientation, as a result of posttranscriptional regulation (229). Furthermore, various cellular factors influence the recombination reaction, including the host factors Lrp, integration host factor, and H-NS, presumably by assisting in the formation and stabilization of the recombination-proficient protein nucleocomplex (32, 78, 90, 101, 103, 260, 262). The interaction of these regulators with the fim DNA can be modulated, as illustrated by the effect of the branched-chain amino acids and alanine on the Lrp binding affinity (298). Thus, environmental factors can affect fim phase variation, not only by affecting the level of recombinase(s) but also by directly influencing the inversion reaction.

Differences in fim phase variation rates have been found among clinical isolates. One of the identified differences may lie in variable FimB expression due to its dependence on the availability of the minor tRNA LeuX (293). The leuX locus is often linked to pathogenicity islands, and its expression may vary in concert with that of virulence factors (74, 75). In addition, sequence variation of the invertible element and the putative presence of an additional, strain-specific transcriptional activator have been implicated in strain-dependent variation. These differences in fim regulation may influence the relative success of a strain in different environments.

Our understanding of the molecular mechanism underlying the fim expression system has made it possible to address the occurrence and role of fim phase variation in vivo. The percentage of bacteria that contained the element in the "on" position was determined in animal models of urinary tract infection (114, 145, 343) and in bacteria isolated from women with urinary tract infections (205). In addition, using the mouse model, the effect on bacterial colonization of preventing inversion of the fim invertible DNA element was examined (113). Both lines of research suggest that phase variation itself is an important feature during different stages of infection. With a phase-varying isolate, a bias to the "on" state was observed in bacteria in the bladder at specific times of infection, but the relative contributions of regulation of phase variation and of host-driven selection against a specific expression phase are not yet clear. Applying this general approach to other infection models should yield valuable insights into bacterium-host interactions and the role of phase variation in this interaction.

(b) Other CSSR-dependent types of phase variation. DNA inversion mediated by site-specific recombination causes an interesting combination of antigenic and phase variation of the PilV protein of the type IVB pilus in S. enterica serotype Typhi (415). Inversion is mediated by the Rci recombinase, which is a member of the tyrosine recombinase family. Rci was first determined to be involved in inversion associated with a shufflon located on plasmids pR64 and pR721. This shufflon determines which of seven C-terminal ends is incorporated into the PilV subunit of the pilus, and this determines the host range for plasmid transfer. The Rci-mediated inversion at the pilV gene of S. enterica serotype Typhi, however, results in biphasic antigenic variation of the type IVB pilus. Inversion of a 490-bp fragment causes one of two variable C termini to be fused to a constant N terminus of PilV, which is a minor component of the pilus. However, under conditions favoring very rapid inversion, specifically when DNA is highly supercoiled, neither PilV protein is expressed. It is thought that RNA polymerase becomes detached during the inversion process and that, during very rapid inversion, there is therefore insufficient time for RNA polymerase to synthesize a full-length transcript (243, 414). The antigenic variation affects receptor recognition of the pilus, whereas pilus-mediated autoaggregation occurs in the absence of PilV expression.

DNA inversion is also mediated by members of the serine recombinase family (also called the DNA invertase family), for example, Hin-mediated recombination leading to flagellar H1/H2 antigenic variation in S. enterica serotype Typhimurium. The main difference from tyrosine recombinase-mediated DNA inversion lies in the details of the biochemistry and mechanism of recombination (reviewed in references 120 and 160). H1/H2 antigenic variation reflects expression of either FliC or FljB, respectively, and results from Hin-mediated site-specific recombination at two 26-bp inverted repeats. This causes inversion of a 995-bp region that contains the promoter for fljB and fljA (formerly designated rH1) (316, 327, 328). Thus, when the promoter is oriented toward transcription of the fljBA operon, FljB (H2) flagella are expressed concomitant with the repressor FljA. FljA represses fliC expression by affecting both transcription and translation and may interact with the 5' untranslated region of fliC (35). After Hin-mediated inversion, fljB and fljA are not transcribed, fliC repression is abrogated, and H1 flagella are expressed. As with inversion at fim, cellular factors facilitate the formation of the protein nucleocomplex that is needed for recombination to occur and to activate the recombinase (125). These factors influence the rate of inversion, as illustrated by a 150-fold increase in the rate of Hin-mediated inversion caused by Fis binding to the recombination enhancer element and by Fis-Hin interactions (161, 239, 364). Other recombinases in this family include the phage-encoded Gin and Cin of the E. coli phages Mu and P1, respectively. These enzymes cause a DNA inversion event that determines the composition of the phage tail fiber and thus the host range of the bacteriophage (110, 149, 363).
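The regulatory logic of the Hin switch described above can be restated as a small truth table. The sketch below only re-expresses that logic; the function and field names are hypothetical, and the model deliberately ignores inversion rates, Fis, and any partial repression.

```python
from typing import NamedTuple


class FlagellarState(NamedTuple):
    fljB_transcribed: bool   # H2 flagellin gene
    fljA_transcribed: bool   # repressor of fliC
    fliC_expressed: bool     # H1 flagellin gene
    antigen: str             # "H1" or "H2"


def flagellar_phase(promoter_toward_fljBA: bool) -> FlagellarState:
    """Map the orientation of the invertible promoter to the expressed flagellin."""
    # fljB and fljA share one promoter, so they are transcribed together or not at all.
    fljBA_on = promoter_toward_fljBA
    # FljA represses fliC; FliC is made only when FljA is absent.
    fliC_on = not fljBA_on
    return FlagellarState(
        fljB_transcribed=fljBA_on,
        fljA_transcribed=fljBA_on,
        fliC_expressed=fliC_on,
        antigen="H2" if fljBA_on else "H1",
    )


phase_2 = flagellar_phase(promoter_toward_fljBA=True)
phase_1 = flagellar_phase(promoter_toward_fljBA=False)
```

In this simplified picture, coupling the repressor to the same promoter as fljB means that a single inversion event gives mutually exclusive expression of the two flagellins.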

The recombinase Piv in Moraxella bovis and M. lacunata also mediates a DNA inversion event, which causes antigenic and phase variation of type IV pili (197, 222, 223). Piv is not a member of the tyrosine or serine recombinase family but has sequence similarity to MooV transposase, which is involved in phase variation in Pseudoalteromonas atlantica (see below) (353). Piv causes DNA inversion between the coding regions of tfpQ and tfpI, which determines which of these pilins is expressed and incorporated in type IV pili. In M. lacunata, TfpI contains a mutation that inactivates the gene product, and therefore inversion in this species leads to on/off phase variation of TfpQ (302).

Eight loci have been identified in B. fragilis that encode eight different capsular polysaccharides. Expression of each is under the control of on/off phase variation. At seven loci, the expression phase correlates to the orientation of an invertible DNA element in the locus. This element contains the promoter of a regulatory gene for the corresponding locus. One specific invertible element consists of 193 bp that is flanked on each side by a 19-bp inverted repeat sequence. The other invertible elements are of similar size and are flanked by similar inverted repeats. The corresponding recombinase(s) has, however, not been identified yet (183a).

(ii) Insertion and excision of genetic elements from the chromosome. Transposition can mediate reversible phase variation only if excision is precise, with restoration of the original sequence of the recipient DNA; in most transposition events, the original sequence of the recipient DNA is not restored after excision of the transposing element. Furthermore, classic transposition in general does not target a specific DNA sequence. In contrast, transposition mediated by the putative transposase MooV does lead to phase variation; indeed, this transposition requires short sequence identity between the insertion element and the target sequence (277). Therefore, transposition-mediated phase variation can occur but may be limited to a specific group of transposable elements and recombinases.

MooV appears to mediate phase variation of the extracellular polysaccharide encoded by the eps locus in certain isolates<