Clinical Microbiology Reviews, April 2007, p. 205-229, Vol. 20, No. 2
0893-8512/07/$08.00+0     doi:10.1128/CMR.00036-06
Copyright © 2007, American Society for Microbiology. All Rights Reserved.

Mycobacterium avium in the Postgenomic Era

Christine Y. Turenne,1 Richard Wallace Jr.,2 and Marcel A. Behr1*

McGill University Health Centre Research Institute, Montreal H3G 1A4, Canada,1 The University of Texas Health Center at Tyler, Department of Microbiology, Tyler, Texas 75708,2

SUMMARY
INTRODUCTION
TAXONOMY AND CLINICAL SIGNIFICANCE
    Classical Definition of MAC Species
        M. avium subsp. avium.
        M. avium subsp. paratuberculosis.
        M. avium subsp. silvaticum.
        M. intracellulare.
    New Designations and Related Species
        M. avium subsp. hominissuis.
        M. lepraemurium.
        Undefined species and novel designations.
    MAC Terminology Used for This Review
LABORATORY ASPECTS OF THE MAC
    Serotyping and Other Traditional Methods
    Genetic Methods To Detect IS Elements
        IS900.
        IS901.
        IS1311.
        IS1245.
        Other insertion elements described for the MAC.
        (i) IS elements of rare distribution.
        (ii) IS elements of partially known distribution.
        (iii) IS elements newly discovered via genome sequencing projects.
    Non-IS-Based PCR Differentiation of MAC
    Sequence-Based Classification
        The ribosomal operon.
        The hsp65 gene.
        Other housekeeping genes.
COMPARATIVE GENOMICS OF THE MAC
    Genetic Variability in the Pregenomic Era
    Genomes of Strains M. avium subsp. paratuberculosis K-10 and M. avium 104
    Comparative Genomics of the MAC
    Genetic Variability in the Postgenomic Era
    Diagnostics and Taxonomy Based on LSPs
    Immunodiagnostics of M. avium subsp. paratuberculosis
    Comparative Genomics: Current Appreciation and Diagnostic Implications
UNRESOLVED ISSUES
    Does M. avium subsp. silvaticum Really Exist?
    Is M. avium subsp. hominissuis the Only True Environmental M. avium Subspecies?
    Considerations in Exploring M. avium subsp. paratuberculosis as a Human Pathogen
FUTURE RESEARCH
CONCLUSION
ACKNOWLEDGMENTS
REFERENCES

   SUMMARY
The past several years have witnessed an upsurge of genomic data pertaining to the Mycobacterium avium complex (MAC). Despite clear advances, problems with the detection of MAC persist, spanning the tests that can be used, samples required for their validation, and the use of appropriate nomenclature. Additionally, the amount of genomic variability documented to date greatly outstrips the functional understanding of epidemiologically different subsets of the organism. In this review, we discuss how postgenomic insights into the MAC have helped to clarify the relationships between MAC organisms, highlighting the distinction between environmental and pathogenic subsets of M. avium. We discuss the availability of various genetic targets for accurate classification of organisms and how these results provide a framework for future studies of MAC variability. The results of postgenomic M. avium study provide optimism that a functional understanding of these organisms will soon emerge, with genomically defined subsets that are epidemiologically distinct and possess different survival mechanisms for their various niches. Although the status quo has largely been to study different M. avium subsets in isolation, it is expected that attention to the similarities and differences between M. avium organisms will provide greater insight into their fundamental differences, including their propensity to cause disease.


   INTRODUCTION
Past reviews on the species Mycobacterium avium have typically focused on two distinct aspects. The first examines organisms classically called M. avium and their role in human disease, such as disseminated disease in AIDS and pulmonary disease (87, 124). This focus has also included other genetically distinct species, such as M. intracellulare and related species that are grouped together as the M. avium complex (MAC). The other focus has been on the Johne's bacillus, previously known as M. paratuberculosis, in the context of veterinary medicine (36, 46, 112). For a number of reasons, spanning from tradition to tools, these two organisms are still usually studied as separate entities, although by genetic criteria they have been classified as subsets of the same species for over a decade (267). As a result, clinical and epidemiologic studies of human exposure, infection, and disease have largely ignored the now-renamed M. avium subsp. paratuberculosis. In parallel, research on M. avium subsp. paratuberculosis often overlooks the existence of other closely related M. avium organisms and the potential impact of these other M. avium organisms on diagnostic and epidemiologic findings.

The advent of genome sequencing projects and comparative genomic tools has provided a renewed opportunity to firmly classify mycobacteria. In the case of the M. tuberculosis complex (MTBC), comparative genomics has provided genomic signatures that define members of the complex (14, 25, 102, 183), and these genomic signatures now serve in diagnostic laboratories to assign identity to clinical isolates (198). Phenotypically ambiguous organisms can now be classified confidently based on their genomic signatures (185), leading to the recognition that certain organisms previously grouped together due to insufficiently discriminatory methods (184, 245) in fact consist of genetically distinct host-associated variants or ecotypes (adapted to a specific habitat), such as the vole bacillus, seal bacillus, dassie bacillus, oryx bacillus, and M. caprae in goats (2, 55, 182). With the recent availability of complete genome sequences for the two principal M. avium subspecies (152) (The Institute for Genomic Research [TIGR] [http://www.tigr.org/]) and results from comparative genomic studies (236, 240, 304), it is now possible to reconsider M. avium in a similar manner. The existence of natural variants of M. avium is expected to initially pose new challenges in taxonomy and diagnostics. However, once the nomenclature is resolved, a postgenomic phylogenetic framework should serve towards improved diagnostics and strain tracking and, additionally, provide a context for studies of disease pathogenesis. The aim of this review is to address current misconceptions and confusion in M. avium taxonomy, to place emphasis on the importance of recognizing the diversity within M. avium strains, and to highlight the opportunities to study M. avium by exploiting the existence of phenotypically variant members of the same species. 
In addition, we examine how genomic data provide opportunities and challenges for the derivation of novel diagnostic tools, noting in particular the distinction between elements specific by in silico analysis of genome sequence data and those specific by validated laboratory assays. Finally, in the face of accumulating reviews and rebuttals about the potential role of M. avium subsp. paratuberculosis in human Crohn's disease (CD), we consider it especially valuable to reassess the definition of this organism, the methods used for its detection, and the applicability of these methods for epidemiologic investigation of this association.


   TAXONOMY AND CLINICAL SIGNIFICANCE
Mycobacteria are defined by their acid-fast properties, cell walls containing mycolic acids, and high (~61 to 71%) genomic C+G contents (149). There are now over 130 established and validated species and subspecies of mycobacteria (J. P. Euzéby, List of Prokaryotic Names with Standing in Nomenclature [http://www.bacterio.cict.fr]), with the most commonly isolated species in clinical laboratories consisting of members of the MTBC and members of the MAC. Originally described in two separate veterinary settings, MAC organisms have long been recognized as professional pathogens of birds and ruminants. Based on their source of isolation and pathology in animal models, two distinct organisms, namely, the avian tubercle bacillus, the agent of tuberculosis (TB) in birds, and Johne's bacillus, agent of Johne's disease in ruminants, were recognized. With the recognition that the avian tubercle bacillus could occasionally be isolated from human diseases, MAC organisms were also considered opportunistic pathogens of humans. In order to determine the potential sources of human exposure, environmental surveys were undertaken, revealing viable or culturable MAC organisms in a number of sources, including water (reviewed in reference 208). The latter observation led to the concept or belief that MAC organisms are fundamentally environmental mycobacteria. While it appears that some MAC organisms reside primarily in the environment, other subsets are veterinary pathogens with a limited capacity to survive in the environment (143, 300). Therefore, to best appreciate the natural variability among MAC organisms, it is safest to consider the MAC as a microcosm of the mycobacterial genus including both environmental mycobacteria and host-associated pathogens with their own distinct genomic identities.

The definition of MAC varies with the context in which it is discussed (Table 1). Clinicians and health care workers consider MAC to include M. avium, M. intracellulare, and miscellaneous related species. In veterinary medicine, MAC may be recognized the same way but, notably, is distinct from "M. paratuberculosis." The taxonomist may consider the MAC to contain only the subspecies of M. avium, as the designation implies, including M. avium subsp. paratuberculosis, and recognize that M. intracellulare is a related but clearly distinct species from M. avium. The scientist may or may not adopt any of the definitions described above, depending on the research question being addressed. Confusion sets in when new advances redefine the classic nomenclature. With this being said, we believe that sufficient data now exist to provide clarity in M. avium taxonomy and that a revised taxonomic approach will benefit research into the epidemiology and pathogenesis of diseases due to M. avium.


TABLE 1. Nomenclature applied to MAC organisms

 
Classical Definition of MAC Species

A milestone in the characterization and definition of M. avium occurred in 1990 with a publication by Thorel et al. which defined three principal subsets of M. avium, as revealed by prior molecular analyses, such as DNA-DNA hybridization (122, 231, 307), on the basis of growth characteristics and biochemical tests (numerical taxonomy analysis) (267). These three subsets consist of M. avium subsp. avium, M. avium subsp. paratuberculosis, and M. avium subsp. silvaticum.

M. avium subsp. avium. Before the establishment of the M. avium subsp. avium designation, this organism was simply referred to as M. avium and was recognized to be the cause of avian TB and occasional infections in other animals. The type strain, ATCC 25291, was isolated from a diseased hen. The designation includes the standard M. avium subspecies causing disease in birds but also includes agents of disseminated disease in patients with AIDS, cervical lymphadenitis in children, and chronic lung disease in several settings, such as in adolescents with cystic fibrosis and in older adults. Classically, the designation M. avium subsp. avium has not distinguished avian from human or environmental isolates, and hence, sensitization to M. avium is used as a proxy of exposure to environmental mycobacteria, even though avian purified protein derivative (PPD) was derived from a bird isolate (237). As discussed in greater detail in this review, the failure to distinguish between the environmental and host-associated ecotypes of M. avium is especially problematic for interpreting and comparing data from past studies.

M. avium subsp. paratuberculosis. M. avium subsp. paratuberculosis refers to the etiologic agent of Johne's disease or paratuberculosis, a chronic granulomatous enteric disease of ruminant livestock and wildlife (112). Difficulties surrounding paratuberculosis control lie primarily in aspects of diagnosis; assays are most accurate when the disease is well established, but detection of subclinical infection is hampered by poor sensitivity (251, 294, 295) and specificity (168). M. avium subsp. paratuberculosis is one of the slowest growing mycobacterial species, such that primary isolation from specimens can take several months (173, 298). The distinguishing phenotype of M. avium subsp. paratuberculosis has classically been an in vitro growth dependency on mycobactin, an iron-chelating agent first obtained from M. phlei (90, 173) and subsequently replaced by mycobactin J, which is obtained from a strain of M. avium and remains in use today (41, 174). Notably, the type strain of the species, ATCC 19698, isolated from the feces of a cow with paratuberculosis (172), has lost its mycobactin dependency. From phenotypic analysis, the M. avium subsp. paratuberculosis group has been subdivided into two main types, bovine and ovine, that vary in hosts, diseases caused, and growth phenotypes (260, 297, 298).

M. avium subsp. silvaticum. M. avium subsp. silvaticum applies to the organism previously named the wood pigeon bacillus, an acid-fast organism that causes TB-like lesions in wood pigeons and that initially could not be cultured in vitro (44, 167). Cultures were obtained for the first time when medium for "M. paratuberculosis" was used and the cultures were observed for 5 months (249). Subsequently, the organisms were recognized by their mycobactin dependency upon primary isolation, gradually losing this phenotype upon subculture (165). Conflicting experimental data in attempting to classify the organisms led to the performance of DNA-DNA homology studies, ultimately revealing that they belonged to the same species as M. avium and M. avium subsp. paratuberculosis (231, 307). Support for the distinctiveness of M. avium subsp. silvaticum, however, was advanced by distinct patterns obtained with genetic tools such as pulsed-field gel electrophoresis (150), although this type of method is typically used for epidemiological purposes, not to delineate species. Finally, a thorough phenotypic evaluation of the M. avium species revealed that only M. avium subsp. silvaticum was distinct from classical M. avium subsp. avium and M. avium subsp. paratuberculosis based on an inability to grow on egg media and the stimulation of growth at pH 5.5 (267). The type strain, ATCC 48898, represents strain 6409, isolated from the liver and spleen of a wood pigeon and characterized in the numerical taxonomy study (267).

M. intracellulare. Unlike the M. avium subsets, for which the type strains were isolated from nonhuman hosts, the type strain of M. intracellulare (ATCC 13950) was isolated from a human, specifically a child who died from disseminated disease (63). This organism was initially named Nocardia intracellularis, until Runyon made the link between an atypical mycobacterium called the "Battey bacillus" and "N. intracellularis" based on similarities with M. avium and subsequently established the M. intracellulare species (223). Since then, M. intracellulare organisms have been isolated from a variety of animal hosts and environmental sources (225, 266, 269). In general, M. intracellulare has been subject to less study than M. avium, as the latter is more prevalent in clinical and environmental samples, has a wider apparent host range, and contributes almost exclusively to disseminated MAC disease in human immunodeficiency virus patients (276, 305). However, when identification to the species level is performed, M. intracellulare is an important contributor to MAC-associated pulmonary infections in immunocompetent or non-human immunodeficiency virus patients (108, 166, 207, 269, 290). M. intracellulare also appears to have a distinct environmental niche, as it has been found to be more prevalent in biofilms and at significantly higher CFU numbers than M. avium (88). The clinical designation MAC or MAI, used to group M. avium and M. intracellulare, largely reflects the conventional inability of the diagnostic laboratory to distinguish these organisms and the use of the same therapeutic regimens. Sequence-based analysis reveals M. intracellulare as a distinct out-group for resolving subsets of M. avium (277) (for example, see Fig. 1). The implications of blurring the species barrier in clinical, epidemiological, immunological, or bacteriologic studies are unknown but clearly important.


FIG. 1. hsp65 gene phylogeny based on nucleotide differences (277), superimposed with genetic variation based on LSPs (239). IS901+ strains cluster in one lineage, and all lack LSPA17. IS900+ strains cluster in a lineage identified by codes 5 and 6, and all lack LSPA8. M. intracellulare serves as the outgroup for the M. avium subspecies. M. avium subsp. silvaticum presents with an identical genetic profile to that of code 4. Bar, 5 nucleotides.

 
New Designations and Related Species

Many strains or groups of strains have been described that share similarity with the MAC, which often results in frustration, confusion, and at times, misleading data and results (137, 147, 250, 292, 293). However, as more sophisticated molecular tools become available, important (or less important) subsets can be identified with greater confidence. Examples of newer and/or less-well-recognized members of M. avium or species associated with MAC include the following.

M. avium subsp. hominissuis. M. avium subsp. hominissuis was proposed to distinguish organisms found in humans and pigs from those isolated from birds. The hypothesis that there might be host-specific differences within M. avium was suggested when laboratories using genotypic methods noted that M. avium isolates from humans rarely shared the genetic profiles of organisms found in birds (22, 107, 139, 219). To study this further, Mijs et al. undertook a first comprehensive study that encompassed phenotypic assessment as well as several genetic tools (IS1245 restriction fragment length polymorphism (RFLP) analysis, commercial assays, and sequencing) previously used for MAC against a large set of isolates from different hosts and geographical origins (176). This study confirmed that classical avian strains are distinct from human, other mammalian, and environmental MAC isolates. The distinguishing features of M. avium subsp. hominissuis are (i) a multiple copy number of IS1245, (ii) a variable 16S-23S internal transcribed spacer (ITS) sequence (the sequence in avian strains is invariant), and (iii) the ability to grow at a wider temperature range (24 to 45°C) (176). The M. avium strain chosen for genome sequencing, strain 104, is of the M. avium subsp. hominissuis subtype (277). Another important distinguishing feature of M. avium subsp. hominissuis from M. avium subsp. avium is that it does not possess the IS901 insertion sequence (IS) (10, 178, 277), which is occasionally used as a marker of the M. avium species (243, 254). No type strain has been designated to represent M. avium subsp. hominissuis, and consequently, this designation has yet to be formally validated. However, reference strains do exist that represent this subset (Table 2).


TABLE 2. Commonly used reference M. avium strains

 
M. lepraemurium. M. lepraemurium refers to the agent of rodent leprosy, which was later suspected of causing skin disease in cats and dogs. To date, this organism is generally considered unculturable and can be identified reliably only by sequencing methods (119, 120). The association between M. lepraemurium and MAC stems from a likeness in serological groupings (103) and genetic relatedness by DNA-DNA hybridization methods (3, 123). The organism is characterized by only two single-nucleotide polymorphisms (SNPs) in the 16S rRNA gene from that of M. avium but bears a highly divergent hsp65 sequence (170). While this species is genetically closely related to MAC organisms, it is not typically considered part of the MAC.

Undefined species and novel designations. Over the years, the MAC has had other taxonyms, including M. avium-intracellulare (MAI), M. avium-intracellulare-scrofulaceum (MAIS), and MAIX, where "X" represents the ambiguous MAC species that could not be assigned as M. avium or M. intracellulare. These isolates typically have similar characteristics to MAC organisms but lack features that typify the species. For example, isolates positive in the MAC AccuProbe (GenProbe) test but not in the species-specific M. avium or M. intracellulare AccuProbe test would fit within this rubric (12, 93, 147, 247, 285). M. scrofulaceum was once grouped with M. avium and M. intracellulare on the basis of phenotypic similarity and its ability to be serotyped, but it has long been accepted as a separate entity from MAC. The recently described species M. palustre (270) and M. saskatchewanense (278) may be confused with MAC due to their positive reactions with the MAC AccuProbe test, but they are otherwise genetically distant from MAC. Other recently described species that are genetically and phenotypically related to MAC, such as M. chimaera (274) and M. colombiense (186), highlight the fact that outliers or MAC-like organisms continue to be isolated and characterized, defying simple classification. While it is tempting to consider the latter to be the same as other MAC organisms, such a simplification risks overlooking potentially informative differences between organisms that may not yet be apparent. Therefore, when faced with such an organism, the clinical or reference laboratory may best report a MAC-like organism.

MAC Terminology Used for This Review

For the remainder of this review, we use the more encompassing term MAC to include the species and subspecies described above but focus mostly on the species M. avium. When discussing subsets of M. avium, we use the following terminology. M. avium subsp. avium refers to the avian subtype, including the type strain ATCC 25291 or TMC724. M. avium subsp. hominissuis refers to human, porcine, and environmental isolates, including the strain used for the genome sequence, which we refer to as M. avium 104. M. avium subsp. paratuberculosis refers to bovine and ovine subtypes of Johne's bacillus, including the strain used for the genome sequence, known as M. avium subsp. paratuberculosis K-10. Designations for commonly studied organisms, including type strains, are presented in Table 2.


   LABORATORY ASPECTS OF THE MAC

Serotyping and Other Traditional Methods

Prior to the era of molecular diagnostics, identification of mycobacteria to the species level was based on morphology and a set of in vitro biochemical tests (135). These tests were not useful in subidentifying members of the MAC, since MAC organisms are generally nonreactive or produce variable results with most tests used to differentiate between species. Morphologically, the MAC presents with a wide range of colony variability, from smooth to rough and from nonpigmented to cream-colored to bright yellow, and can appear like many other mycobacterial species. High-performance liquid chromatography (HPLC) of mycolic acids became popular as a diagnostic method for species identification of mycobacteria and can distinguish MAC organisms from other mycobacteria (100). HPLC patterns of M. avium and M. intracellulare are very similar, although differentiation may be achieved by using precise interpretive criteria (32). M. avium subsp. paratuberculosis cannot be differentiated from the other subspecies of M. avium by this method (65). Serotyping, one of the earliest typing tools for the MAC (232), was based on differences in the sugar residue compositions of surface glycopeptidolipids (GPLs) and became the preferred method of MAC identification in the premolecular era. More than 30 serovars have been described (reviewed in reference 38). Based on a variety of tests, including DNA probing, serotype numbers could generally be assigned to the following MAC species: serotypes 1 to 6, 8 to 11, and 21 are classical M. avium; serotypes 7 and 12 to 20 are M. intracellulare; and serotypes 26, 27, and 41 to 43 are M. scrofulaceum (225, 293). Interlaboratory reproducibility in serotype numbers, however, was poor, and issues with autoagglutination, failure to react with any serum, or agglutination with two or more antisera were common (276, 293). 
Also, many serotypes could not be assigned confidently to a MAC species due to a poor consensus or to nonreaction with species-specific antisera, resulting in the ambiguous designation "MAC" for these strains. It was concerning that serotyping in tandem with RFLP methods used for epidemiological purposes was noted to generate different serotypes for isolates with identical RFLP profiles (79, 110). Conversely, the same serotype can be represented across the two subgroups of M. avium subsp. avium and M. avium subsp. hominissuis. Multilocus enzyme electrophoresis could differentiate strains of the MAC into many electrophoretic types on the basis that the enzymes chosen had multiple alleles (291, 306). This technique likely reflected variability at the geographic or ecotype level but did not appear to provide the level of resolution desirable for epidemiological tracking of isolates.
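
The serotype assignments quoted above can be written as a simple lookup table. The following is a minimal sketch in Python; the function and constant names are illustrative assumptions and are not taken from any cited study. Serotypes outside the listed ranges fall back to the ambiguous designation "MAC", as described in the text, and the poor interlaboratory reproducibility noted above means such a table should not be read as a validated identification scheme.

```python
# Serotype-to-species lookup transcribed from the ranges quoted in the text.
# All names here (functions, constants) are hypothetical and for illustration only.

def _expand(*ranges):
    """Expand inclusive (start, end) pairs into a set of serotype numbers."""
    numbers = set()
    for start, end in ranges:
        numbers.update(range(start, end + 1))
    return numbers

_ASSIGNMENTS = [
    ("M. avium", _expand((1, 6), (8, 11), (21, 21))),
    ("M. intracellulare", _expand((7, 7), (12, 20))),
    ("M. scrofulaceum", _expand((26, 27), (41, 43))),
]

SEROTYPE_TO_SPECIES = {
    number: species
    for species, serotypes in _ASSIGNMENTS
    for number in serotypes
}

def species_for_serotype(number):
    """Return the species generally assigned to a serotype.

    Serotypes with no confident assignment get the ambiguous label "MAC".
    """
    return SEROTYPE_TO_SPECIES.get(number, "MAC")
```

For example, `species_for_serotype(7)` gives "M. intracellulare", whereas a serotype absent from the table, such as 22, gives "MAC".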

Genetic Methods To Detect IS Elements

When well characterized and used in the proper context, the species-specific IS elements described below can serve as a useful classification tool to distinguish subsets of the MAC (10, 49, 84). However, two problems have consistently hampered their utility for this purpose. First, a number of IS elements have been uncovered in strains considered to be MAC organisms, but without adequate strain characterization, it is difficult to judge which organisms harbor such elements. Second, IS elements are by nature mobile elements, so there is a risk that similar elements are found in unrelated bacteria because of mobility to or from MAC organisms. Therefore, while studies may report on the specificity of these elements across MAC organisms, this degree of specificity is not assured in diagnostic laboratories classifying unknown clinical isolates unless the organisms have first been shown to be MAC organisms by other methods. For instance, a newly discovered element may be found only in M. intracellulare among a panel of MAC strains and therefore appear to be a promising target for PCR-based detection directly from broth culture. However, until it can be ascertained that this element, or something genetically similar, is not found among the over 130 mycobacteria that may present to a reference laboratory, a positive PCR for this element should not on its own be considered sufficient evidence to state that M. intracellulare has been detected.

IS900. IS900 was the first IS characterized within the Mycobacterium genus (51, 104). It was identified from a pMB22 clone derived from a genomic library from a human M. avium subsp. paratuberculosis isolate from a CD patient and was found to be specific to M. avium subsp. paratuberculosis. Just as IS6110 has been used successfully for genotyping M. tuberculosis strains (280), RFLP analysis of the IS900 element has been used as a molecular tool to type M. avium subsp. paratuberculosis isolates. Based on IS900 RFLP patterns, M. avium subsp. paratuberculosis has been divided into two main groups, namely, those isolates represented by a cattle-associated profile (C) and those represented by a sheep-associated profile (S) (11, 52, 57, 202, 297). A third RFLP genotype, called intermediate (I), was also identified from sheep (11, 67). The IS900 element is by far the most widely used target for the molecular detection of M. avium subsp. paratuberculosis and has been used in the form of direct PCR (161, 235, 284), in situ PCR (228), sequence/hybridization capture PCR (109, 159, 161, 177), nested PCR (28, 187, 224), and real-time PCR (89, 214), with the references listed representing only a small portion of what is available in the literature. However, other similar elements found across other mycobacteria, including M. terrae, M. xenopi, M. scrofulaceum and related strains, M. chelonae, and strain 2333 (related to M. cookii), have been shown to cross-react with IS900 primers used for detection of M. avium subsp. paratuberculosis (56, 85, 258). In these cases, the elements were not 100% identical with IS900, with different regions of the elements showing variable sequence identity. Sequencing of the amplified product for IS900 is therefore necessary to confirm that the amplicon is truly IS900. A few studies have reported SNPs in the IS900 element (19, 187), posing a problem for sequence-based verification of IS900-PCR results for molecular detection of M. avium subsp. paratuberculosis. Because sequencing directly from a single-round PCR product (i.e., not from cloned PCR products or nested PCR products) revealed only two specific SNPs, dividing ovine and bovine forms of M. avium subsp. paratuberculosis, the relevance of these other reported polymorphisms is presently unclear (238).

IS901. Kunze et al. discovered the IS901 element by performing a Southern blot with the pMB22 probe containing the IS900 element across various MAC isolates under low-stringency conditions (142). This element shows ~60% sequence identity to IS900. Screening across a larger panel of isolates revealed that most isolates from birds and some animals contained the element, whereas isolates obtained from AIDS patients or the environment did not. Furthermore, it was found that most bird isolates had similar IS901 patterns. These isolates were also shown to be strikingly more virulent than AIDS patient isolates in BALB/c mice (142, 204). Evidence that a more pathogenic subset of M. avium exists has been advanced numerous times since, leading some to simply divide M. avium isolates into those that are IS901+ and those that are IS901- (68). In general, isolates from diseased birds and animals with macroscopic lesions are IS901+, while those from humans, swine, or other animals without lesions are IS901- (22, 49, 189, 203). The virulence of IS901+ strains has also been confirmed experimentally (79, 203).

Concurrently with the publication of the IS901 element, Moss et al., who were also screening for IS900 under low-stringency conditions, observed cross-hybridization with a strain of M. avium subsp. silvaticum and designated the related element IS902 (181). They determined that the element was present in all M. avium subsp. silvaticum isolates they tested, although no other MAC strains were included in the study set. Sequence alignment of the IS901 (X59272) and IS902 (X58030) sequences indicates 99% sequence identity, and upon closer inspection, their differences consist of several sequence gaps and four pairs of GC switches, suggestive of editing errors. IS901 and IS902 are most likely the same element, in which case the existence of an IS902 element specific for M. avium subsp. silvaticum would not be a valid distinction. Consequently, claims that M. avium subsp. silvaticum has been detected in samples based on the presence of IS902 should be interpreted with caution, with a more likely scenario being the detection of a strain containing IS901 or related elements.
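
To illustrate how a figure such as 99% sequence identity and a tally of gaps and substitutions can be derived from a pairwise alignment, a minimal Python sketch follows. It assumes that the two sequences have already been aligned (with "-" marking gaps) by an external tool; the function name and the short toy sequences are hypothetical and do not represent the IS901 or IS902 entries.

```python
def alignment_differences(aligned_a, aligned_b, gap="-"):
    """Count matches, mismatches, and gap columns in a pairwise alignment.

    Both inputs must be pre-aligned strings of equal length. Columns that
    are gaps in both sequences are ignored. Percent identity is computed
    over all remaining columns, so gap columns count against identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = mismatches = gaps = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # uninformative column
        if a == gap or b == gap:
            gaps += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1

    compared = matches + mismatches + gaps
    identity = 100.0 * matches / compared if compared else 0.0
    return {
        "matches": matches,
        "mismatches": mismatches,
        "gaps": gaps,
        "percent_identity": identity,
    }

# Toy alignment (not real IS sequences): one adjacent G/C swap and one gap.
toy_result = alignment_differences("ACGTGCAT-ACGT", "ACGTCGATTACGT")
```

In the toy alignment, the adjacent G/C swap is counted as two mismatched columns and the single gap as one gap column. Other conventions for percent identity exist (for example, excluding gap columns from the denominator), so the convention used should be stated when such values are reported.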

IS1311. IS1311 was first reported as a GenBank entry in 1994 (U16276) and was subsequently used for RFLP analyses (73, 220). The element is present in all members of the M. avium subspecies, including M. avium subsp. avium, M. avium subsp. hominissuis, and M. avium subsp. paratuberculosis (49), and is not present in M. intracellulare (73, 296). The element itself has 85% sequence identity to IS1245 (described below) and therefore results in cross-hybridization with the conventional IS1245 probe (130). With the wide range of M. avium hosts for this element, it is possible that IS1311 represents an "older" IS element which may have been present prior to subspecies divergence. A longer evolutionary time span is consistent with the presence of mutations in some of the IS1311 elements among distinct subsets within the MAC. This was first observed by Whittington et al., who noted one polymorphism specific to the M. avium subsp. paratuberculosis cow or "C" type (a C-to-T change at bp 223 of the U16276 sequence) and other polymorphisms common to both the "C" and "S" types of M. avium subsp. paratuberculosis compared to other M. avium organisms (296). RFLP analysis of IS1311 also revealed distinct pattern types corresponding to cattle and sheep strains of M. avium subsp. paratuberculosis (49). A simple PCR and restriction enzyme analysis (PCR-REA) using the restriction enzyme HinfI was then developed as a rapid diagnostic tool to distinguish bovine M. avium subsp. paratuberculosis isolates from the ovine type (158). Distinct growth characteristics of M. avium subsp. paratuberculosis isolates from bison in Montana prompted investigation using IS1311 PCR-REA and revealed a third IS1311 genotype, "B" (299), and M. avium subsp. paratuberculosis strains obtained from armadillos in Wisconsin were reported to have yet another IS1311 PCR-REA allele (54). In agreement with IS900 RFLP analysis (reviewed in reference 297), cattle and goats have predominantly the C type and sheep have predominantly the S type, while the B type has been found not only in American bison but also in goats and sheep in India (241, 296). Other animals, when tested, generally have the C type (296).
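
The principle behind PCR-REA, namely that a single base change can create or destroy a restriction site and so change the fragment pattern, can be sketched in silico. HinfI recognizes GANTC and cuts after the G. The two amplicons below are invented toy sequences that differ by one C-to-T change; they are not the IS1311 sequence and make no claim about which genotype carries the site. The function name hinfi_fragments is hypothetical.

```python
import re

# HinfI recognition sequence: G^ANTC (N = any base); the cut falls after the G.
HINFI_SITE = re.compile(r"GA[ACGT]TC")


def hinfi_fragments(amplicon):
    """Return fragment lengths after a complete HinfI digest of a linear amplicon.

    Only the top strand is modelled, which is sufficient for fragment sizes.
    """
    seq = amplicon.upper()
    cuts = [m.start() + 1 for m in HINFI_SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy amplicons, invented for illustration: identical except for one C/T base.
allele_with_site = "ACGT" * 5 + "GAATC" + "ACGT" * 5     # contains GAATC
allele_without_site = "ACGT" * 5 + "GAATT" + "ACGT" * 5  # C-to-T change removes the site

print(hinfi_fragments(allele_with_site))     # two fragments
print(hinfi_fragments(allele_without_site))  # one uncut fragment
```

On a gel, the two toy alleles would therefore be distinguished by the presence of two smaller bands versus a single full-length band.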

IS1245. First described in 1995 (107), IS1245 was presented as having a more restricted range than IS1311, being limited to the subspecies of M. avium, i.e., M. avium subsp. avium (that would include M. avium subsp. hominissuis), M. avium subsp. paratuberculosis, and M. avium subsp. silvaticum. By PCR analysis, this element was not found in M. intracellulare or 17 other mycobacterium species. This element, however, has high DNA sequence identity with IS1311, with both belonging to the IS256 family, and it was shown that cross-hybridization of IS1245 probes with IS1311 is widespread; for instance, M. avium subsp. paratuberculosis does not contain the IS1245 element (130). Nonetheless, from this first publication on IS1245, Guerrero et al. observed that human and swine strains contained an elevated number of copies (more than eight; "multicopy"), whereas bird strains, including M. avium subsp. silvaticum, presented a three-band pattern (107). The observation that human and swine strains (now called M. avium subsp. hominissuis) differ from avian strains has since been confirmed numerous times (139, 178, 190, 219, 263), also with the added dimension that environmental strains have similar characteristics to those of the M. avium subset from humans and swine (78, 164). Standardization of IS1245 RFLP analysis was proposed in 1998 as a tool for MAC molecular epidemiology (283). To eliminate cross-hybridization with IS1311, the method was modified, leading to the recognition that the three-band bird type IS1245 RFLP profile in fact consisted of a single IS1245 copy and two copies of IS1311 (130). It remains to be seen if epidemiological value would be added by using an IS1245-specific protocol instead of the standard protocol.

As for any widely tested insertion element, the "presence" of the IS1245 element in species outside its typical host has been documented, although this was not confirmed by sequencing and may have been related elements, such as IS1311 (12, 134). Some M. avium isolates have been documented as being IS1245 negative, but only a few such reports have presented further documentation of strain identity by a sequence-based method (12). In some reports, IS1245-negative isolates have been described that contain an hsp65 sequence identical to that of M. avium but that differ from M. avium in other taxonomic targets, such as the 16S rRNA gene and the ITS sequence (147, 277).

Other insertion elements described for the MAC. Many other IS elements have been described or detected in various members of the MAC. In most cases, their distribution is either unknown or only partially known. We attempt to put some emphasis on their most likely distribution or lack thereof. When stating that some elements are similar to others, we refer to similarity on the order of 80 to 85% identity at the nucleotide level.

(i) IS elements of rare distribution. The IS1110 element was identified from a single strain, designated M. avium LR541 (116). It has some similarity to IS900 and IS901 and was found in only a small proportion of M. avium strains. However, which subset of M. avium contains this element is not clear, and screening was done by Southern hybridization, where the signal could have resulted from other related elements. IS1110-like elements have been reported for many species of mycobacteria, but these either have not been sequence confirmed (117) or have been confirmed but do not correspond to IS1110 per se (194, 253). The only information available for the IS1141 element is a GenBank entry dated 1995 (L10239). It was found in a strain identified as M. intracellulare strain Va14. Since then, no new data have been presented on this element or on the strain of M. intracellulare in which it was found. Unfortunately, no IS element has been identified to date that is present in all strains of M. intracellulare or even in any single well-known strain representative of the species. IS1626 was discovered in the same manner as IS901 (and IS902), by Southern blotting of 66 MAC isolates with an IS900 probe (210), and has some similarity to IS900 and to IS1613 (below). Strong hybridization occurred for only one strain, subsequently characterized as M. avium by a variety of molecular tests. It is unclear, however, if this strain or any of the others screened were M. avium subsp. avium or M. avium subsp. hominissuis. This element appears to be uncommon in MAC organisms in general. IS1613 is another element for which very little information is available: a GenBank submission exists (AJ011837), and one publication mentions that it was isolated from an AIDS patient (28), indicating a probable M. avium subsp. hominissuis strain. It is similar to both IS1626 and IS900. None of these elements is present in the genome sequence of strain 104 or K-10.

(ii) IS elements of partially known distribution. The element IS1612, identified in a strain of M. avium subsp. silvaticum and in M. avium subsp. avium TMC724 (30), corresponds to IS2534 (80), similarly found in strain TMC724. Proper IS nomenclature (244) was eventually assigned to this element, now referred to as ISMav1 (ISFinder [http://www-is.biotoul.fr/]). ISMav1 is present in at least one M. avium subsp. hominissuis strain, the M. avium subsp. avium type strain, and also the M. avium 104 genome sequence. The distribution of this element across a panel of MAC isolates is undetermined. The element IS666 was identified in M. avium isolates from humans (36%), pigs (5%), cattle (12%), and the environment (78%) but not from avian strains (227). Therefore, IS666 is likely present only in some subsets of M. avium subsp. hominissuis, as it was present in 21% of M. avium strains tested. The IS1601 element was identified during a study of the genetic mechanisms behind the variable morphology of M. avium and was implicated in the smooth-to-rough switch in some strains (81). The IS1348 element was uncovered upon further sequencing of the ser2 GPL gene cluster (81). Both IS1601 and IS1348 are present in the M. avium 104 genome but not in M. avium subsp. paratuberculosis K-10. On this basis, these elements appear to be present in at least a subset of M. avium subsp. hominissuis strains and not in M. avium subsp. paratuberculosis strains. ISMav2 is a potentially M. avium subsp. paratuberculosis-specific element, as it was detected in all M. avium subsp. paratuberculosis strains but not in strains of M. avium subsp. avium (243, 254). Unfortunately, IS901-negative strains were not evaluated, and therefore the distribution of ISMav2 in M. avium subsp. hominissuis isolates is unknown. The IS999 element was found in isolates presumed to be M. avium subsp. hominissuis since they were from human clinical samples and was absent from one strain known to be M. avium subsp. avium (144). While its distribution in M. avium subsp. paratuberculosis was not evaluated, it is not present in the K-10 genome sequence. The element ISMpa1, with 80% sequence identity with IS1601, was found in all M. avium subsp. paratuberculosis strains tested, 2 of 13 MAC organisms tested, and no other mycobacterial species (191). The true distribution of this element within the MAC is unknown since only a small panel of isolates was evaluated.

(iii) IS elements newly discovered via genome sequencing projects. The M. avium subsp. paratuberculosis K-10 genome sequence contains three insertion elements that were previously described, namely, the M. avium subsp. paratuberculosis-specific elements IS900 and ISMav2 and the pan-M. avium element IS1311. Sixteen additional insertion elements were identified and named IS_MAP01 through -16 (152). Of note, IS_MAP12 corresponds to the previously described ISMpa1 (191). The most abundant IS family represented in the K-10 genome is the IS110 family, which includes IS900, ISMpa1, and IS_MAP14 to -16 (152) but also the IS1110, IS901, IS1613, and IS1626 elements described for other MAC strains. A few K-10 IS elements correspond to some found in the M. avium 104 genome, while others have low or no similarity to other bacteria, including mycobacteria, and may potentially serve in the specific diagnosis of M. avium subsp. paratuberculosis. With the 14 IS elements described for the MAC in the pregenomic era, 15 novel IS elements identified in the genome sequence of M. avium subsp. paratuberculosis K-10, and more to be found through the genome sequence of M. avium 104, it is clear that MAC organisms contain a very large number of IS elements, many of which are related to each other. As new genome sequences become available at an increasing pace, the number of known related insertion elements will grow as well. For example, IS1110, which was first identified because of its similarity with IS900 and IS901, is now known to have much higher similarity to one of the new insertion elements in M. avium subsp. paratuberculosis K-10 (IS_MAP15) and to also share high similarity with an element in the recently sequenced Mycobacterium sp. strain MCS (GenBank accession no. CP0003841; M. monacense by 16S rRNA gene sequencing). This example illustrates an important limitation of targeting IS for diagnostics, as cross-hybridization with closely related elements has been documented by both PCR and Southern hybridization.

Non-IS-Based PCR Differentiation of MAC

Apart from insertion elements, other genes have been used as diagnostics or to differentiate MAC organisms and can be a more attractive option due to concerns of nonspecificity associated with IS elements. A single-copy sequence named F57 was identified as specific for M. avium subsp. paratuberculosis (206) and later used in a duplex PCR that differentiated the MTBC, M. avium, and M. avium subsp. paratuberculosis (47, 101). More recently, a real-time PCR assay based on the F57 element was developed for the detection of M. avium subsp. paratuberculosis in milk, feces, and tissue (23, 259). Another M. avium subsp. paratuberculosis-specific genetic target showed similarity to the dnaJ family of heat shock protein genes and was designated hspX (83). The specificity of hspX for M. avium subsp. paratuberculosis was subsequently confirmed across a large panel of MAC strains with various genetic and host characteristics (84). To distinguish bovine and ovine M. avium subsp. paratuberculosis strains, a three-primer PCR assay was developed that yields PCR products of different sizes in a single reaction tube (50, 66).

In an attempt to find an M. intracellulare- and M. avium-specific target for use in clinical laboratory diagnostics, Southern hybridization of genomic fragments cloned from the avian type strain revealed two fragments specific for the MAC that were not found in other mycobacterial species (265). Fragment DT1 was specific for all isolates of M. intracellulare and M. avium strains of serotypes 2 and 3, while DT6 was specific for all M. avium isolates and M. avium subsp. paratuberculosis (265). Although it was not considered at the time, most M. avium isolates tested were human clinical isolates (276), therefore most likely consisting of M. avium subsp. hominissuis isolates, which could explain why few M. avium strains were positive for DT1. While DT1 appears to be a marker of M. intracellulare (71, 248) and may possibly be a marker of M. avium subsp. avium, DT1 was also found in several other mycobacterial species closely related to the MAC (72) and was lacking in some M. intracellulare isolates (97). DT1 reveals no similarity to M. avium 104 (the highest match is 61%) and no similarity to M. avium subsp. paratuberculosis K-10 or other sequences available in GenBank to date. Conversely, DT6 is found in both sequenced genomes and the avian type strain and therefore may serve as a marker for subspecies of M. avium.

Sequence-Based Classification

The ribosomal operon. The 132 established mycobacterial species at last count are known to present almost as many different 16S rRNA gene sequences. Additionally, many other mycobacterial 16S rRNA genotypes are thought to exist for which the organisms are not yet recognized as species (199, 272). Yet the M. avium subspecies (M. avium subsp. avium, M. avium subsp. paratuberculosis, and M. avium subsp. hominissuis) share an identical 16S rRNA gene sequence and hence cannot be differentiated by 16S rRNA gene sequencing. The unvalidated species "M. brunense" (ATCC 23434) also shares 100% identity with the M. avium 16S rRNA gene sequence (279), although it is unclear to which subspecies it corresponds. The closest relatives to M. avium by 16S rRNA gene sequence vary by 6 bp (M. colombiense), 9 bp (M. intracellulare), and 10 bp (M. chimaera) and are considered part of the MAC in a clinical setting. Restricting analysis to the type strains of validated species, the next closest species are M. bohemicum (13 bp) and M. malmoense (17 bp).

Commercial molecular diagnostic assays offer a user-friendly, rapid method of classifying mycobacteria and are typically based on the ribosomal operon. The first such available assay was the AccuProbe test (GenProbe, Inc., San Diego, CA), developed for the most common mycobacteria in human clinical samples, including the MTBC, MAC, M. kansasii, and M. gordonae. In addition to the pan-MAC probe, species-specific probes against M. avium and M. intracellulare are also available. However, cross-reaction of other mycobacterial species with the MAC probe is not uncommon (72, 247), and only one probe can be tested at a time. Two kits based on the technology of reverse hybridization and on line probe assays have recently been developed which can identify several mycobacterial species at once, with the number increasing with new kit versions. The Inno-LiPA Mycobacteria test (Innogenetics, Belgium) is based on the 16S-23S ITS region (146, 256, 273), and the GenoType Mycobacteria test (Hain Lifescience GmbH, Nehren, Germany) is based on the 23S rRNA gene (217, 229). Both of these contain probes that identify M. avium and M. intracellulare, while the Inno-LiPA kit contains additional probes for the "MAIS complex" and M. intracellulare II. Isolates which hybridize to the MAIS complex probe but not the species-specific probes are common and may pose a diagnostic dilemma, but they do emphasize the complexity of strains that resemble MAC organisms and prevent their undue assignment to either species (147). None of these systems was designed to distinguish between the subspecies of M. avium based on the targets chosen.

The 16S-23S ITS is a highly variable genetic region that has been used extensively to study the variability within MAC organisms. To date, 35 MAC sequevars have been identified, including MavA to -H for the species M. avium, MinA to -D for M. intracellulare, and MAC-A to -X for strains which could not be assigned to either species (70, 92, 176, 186) (GenBank no. AY701784 to -86 [unpublished]). Mav and Min sequevars vary by only 1 to 4 bp, while the MAC sequevars present with significant variability and are candidates for new species. To date, three such new species have been described. M. chimaera, characterized by the MAC-A sequevar, and M. colombiense, characterized by the MAC-X sequevar, are genetically related to the MAC and are considered as such in the clinical setting. In contrast, the species M. parascrofulaceum, characterized by the MAC-G sequevar, is a distant species from MAC organisms and should therefore not be considered as such. With that being said, most clinical MAC isolates present with a MavA, MavB, or MinA sequevar (70, 92, 188), and M. avium subsp. avium, M. avium subsp. silvaticum, and M. avium subsp. paratuberculosis belong to the MavA sequevar (277). Therefore, subspecies of M. avium cannot be distinguished from each other by this method, and the majority of the many ITS sequevars are rare and of unknown epidemiological significance.

The hsp65 gene. Housekeeping genes offer a higher level of sequence variation than do ribosomal genes but are nonetheless useful for taxonomic purposes due to the relative sequence conservation imposed to maintain function. In this category, the stress protein gene hsp65 is a preferred target for mycobacterial identification to the species level, having routinely been used in diagnostics since the development of a rapid PCR-restriction enzyme analysis (PRA) method using a 441-bp section of the ~1,600-bp gene (262). However, the PRA method, which is dependent on band size interpretation, shows poor interlaboratory correlation in band size designations. Also, since protein-encoding genes generally have higher mutation rates, as little as one SNP in a restriction enzyme site can result in a different PRA pattern, complicating interpretation (145). With more access to sequencing technology, some laboratories perform hsp65 gene sequencing in the same manner as that done for 16S rRNA gene sequencing (170). The use of this target as an epidemiological tool for closely related mycobacteria, including MAC organisms, has been investigated (72, 86, 190, 288, 289, 303). Single SNPs in this region exist among various subsets of M. avium subsp. hominissuis, and great variability can be observed in M. intracellulare (246). However, as is the case for the ITS, the fragment targeted cannot distinguish between M. avium subsp. avium (and M. avium subsp. silvaticum) isolates, M. avium subsp. paratuberculosis isolates, and a great proportion of M. avium subsp. hominissuis isolates. Conversely, the hsp65 sequence outside this 441-bp fragment (the Telenti fragment) offers unique sequence signatures that can help to identify the various subspecies of M. avium. PRA of a 960-bp fragment was shown to differentiate M. avium subsp. avium from M. avium subsp. paratuberculosis (86). Based on this, comparative sequencing of the nearly complete hsp65 gene was performed on a large panel of isolates, revealing that polymorphisms in the 3' end, beyond the Telenti region, can unambiguously distinguish between M. avium subsp. avium strains, M. avium subsp. paratuberculosis strains of bovine and ovine types, and six sequevars of M. avium subsp. hominissuis (277).

Other housekeeping genes. Housekeeping genes other than the hsp65 gene have been evaluated, though to a lesser extent, for mycobacterial identification to the species level. Unfortunately, these studies typically do not include all members of the MAC, omitting at least one of the main subtypes from test panels, and therefore the true utility of these sequence-based tools, at least in the detection and epidemiology of MAC organisms, remains unknown. The manganese superoxide dismutase gene (sodA) revealed several distinct sequevars among MAC organisms, including a unique sequevar for M. avium subsp. paratuberculosis (29), compared to human M. avium isolates. However, SNPs present among three sequences submitted to GenBank representing the identical type strain of M. avium subsp. avium (X81384, U11550, and AY544802) make it difficult to establish whether avian strains do or do not have a unique sodA sequevar.

A 236-bp fragment of the dnaJ gene has been reported to produce variable sequences across a panel of MAC strains, but the utility of this target requires further evaluation (179). The reference panel of isolates tested included the 28 MAC serotypes and a set of clinical strains but did not include any samples of M. avium subsp. paratuberculosis. Notably, the degree of genetic diversity observed among M. avium isolates was relatively limited compared to that for strains of M. intracellulare (179). Therefore, further study of this gene target across a broader sample, and perhaps including a larger fragment, is required to determine the utility of dnaJ variability for characterization of MAC organisms.

Several other genes have been assessed for their diagnostic potential for mycobacteria, including gyrB (132), recA (21), the 32-kDa protein gene (247), rpoB (98, 136), and a combination of these in a multigene approach (74). However, these studies generally evaluated only a few strains belonging to the MAC, often limited to the type strains of M. avium subsp. avium and M. intracellulare. Therefore, the utility of these genes in distinguishing between epidemiologically important subsets of MAC is largely unknown.

The observed variability in a number of genes across MAC isolates suggests that a multilocus approach may provide greater discrimination than analysis of each target on its own. For other pathogenic bacteria, such as Neisseria meningitidis, Streptococcus pneumoniae, and Staphylococcus aureus, the combination of high-throughput sequencing technologies and recognized variations in housekeeping genes has enabled the emergence of multilocus sequence typing (MLST) (154, 155) as a powerful tool for typing and taxonomic purposes. An important advantage of this method is the existence of more than 30 (and increasing) curated MLST sequence databases freely available on the Internet, permitting direct comparisons with existing data (154). MLST has not formally been initiated in mycobacteriology to date, but given the number of variable genes noted above, this method could easily be implemented and serve as an epidemiologic or phylogenetic tool to characterize MAC organisms on the subspecies, geographic, or ecotype level.
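
The bookkeeping behind a multilocus scheme can be sketched briefly: each distinct sequence at a locus receives an allele number, and each distinct combination of allele numbers (the allelic profile) receives a sequence type. The sketch below is an illustration only. The choice of loci simply reuses three genes mentioned above, the isolate names and sequences are invented, the function name mlst_types is hypothetical, and the numbering is local to the data set rather than taken from any curated MLST database.

```python
def mlst_types(isolates, loci):
    """Assign allele numbers, allelic profiles, and sequence types (STs).

    isolates: dict mapping isolate name -> dict mapping locus -> sequence.
    Numbers are assigned in order of first appearance (a local convention,
    not the numbering of any curated database).
    """
    allele_ids = {locus: {} for locus in loci}  # per-locus: sequence -> allele number
    st_ids = {}                                 # allelic profile -> ST number
    result = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in loci:
            known = allele_ids[locus]
            # A sequence not seen before at this locus gets the next number.
            profile.append(known.setdefault(seqs[locus], len(known) + 1))
        profile = tuple(profile)
        st = st_ids.setdefault(profile, len(st_ids) + 1)
        result[name] = (profile, st)
    return result


# Hypothetical loci and toy sequences, invented for illustration only.
loci = ["hsp65", "sodA", "dnaJ"]
isolates = {
    "isolate_1": {"hsp65": "ACGT", "sodA": "GGCC", "dnaJ": "TTAA"},
    "isolate_2": {"hsp65": "ACGT", "sodA": "GGCT", "dnaJ": "TTAA"},
    "isolate_3": {"hsp65": "ACGT", "sodA": "GGCC", "dnaJ": "TTAA"},
}

typed = mlst_types(isolates, loci)
for name, (profile, st) in typed.items():
    print(name, profile, f"ST{st}")
```

In this toy set, isolates 1 and 3 share a profile and hence a sequence type, while a single base difference at one locus places isolate 2 in a separate type.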


   COMPARATIVE GENOMICS OF THE MAC

Genetic Variability in the Pregenomic Era

Prior to the availability of genome sequence data, and hence the capacity to perform microarray studies, genetic methods such as restriction mapping, Southern blotting, suppression subtractive hybridization (SSH), and representational difference analysis (RDA) were employed to identify regions of difference among study strains. These methods and others were applied to MAC organisms in several studies, as described below.

The earliest genetic variability studies of the MAC set out to identify a genetic basis for differences in GPL composition between rough colony variants of M. avium and their smooth counterparts (15, 16, 81). Belisle et al. (15) performed Southern blot analysis against morphological variants of an M. avium serovar 2 strain, using restricted fragments from a plasmid probe containing the complete ser2 gene cluster responsible for biosynthesis of the serovar 2-specific sugar residue characterized earlier (17). Striking differences were observed between rough morphotypes and their parent smooth strains, and it was determined that this was due to genetic deletions, one of which resulted from IS element-mediated recombination causing the loss of the complete ser2 cluster (81). Some genetic differences were also observed between the two serotype 2 strains, i.e., TMC724 from a bird and M. avium 2151, isolated from human sputa (15) and later characterized as having multicopy IS1245 and therefore being M. avium subsp. hominissuis (140). Part of the ser2 gene cluster was also identified by RDA and characterized in an independent study (30, 268), although the association with the ser2 GPL cluster was not confirmed until recently (80). The region was designated "GS," was described as "genetic island-like" with a lower G+C content, and was found in M. avium subsp. paratuberculosis, M. avium subsp. silvaticum, and some M. avium subsp. avium strains of serotype 2 but not in other strains of M. avium subsp. avium. Although the following was not specifically addressed, MAC strains not containing the GS element could be inferred to be M. avium subsp. hominissuis. In an attempt to identify genetic differences between a virulent M. avium strain (M. avium subsp. avium strain 724) and a less virulent human strain of M. avium (M. avium subsp. hominissuis strain A5), RDA analysis was performed and also revealed genes belonging to the ser2 GPL cluster in strain 724 (141).

SSH of M. avium subsp. paratuberculosis against strains of M. avium subsp. avium identified 42 short genetic regions, 24 of which were deemed specific for M. avium subsp. paratuberculosis on the basis that they also revealed low sequence similarity by BLAST searching of genome sequence data from M. avium strain 104 (138). To identify genetic differences between M. avium subsp. paratuberculosis variants of type I/S/ovine and type II/C/bovine (75), the avian type strain of M. avium subsp. avium (ATCC 25291) served to identify M. avium subsp. paratuberculosis-specific sequences by RDA, and one each of the M. avium subsp. paratuberculosis bovine and ovine types was tested against the other to uncover their unique genetic signatures. Three small genetic regions were identified as unique to M. avium subsp. paratuberculosis versus M. avium subsp. avium. Three genetic regions present in type I but not type II strains were also present in M. avium subsp. avium, suggesting that type I M. avium subsp. paratuberculosis is an evolutionary intermediate between M. avium subsp. avium and bovine M. avium subsp. paratuberculosis. No genetic regions unique to type II strains and not present in type I strains were identified in the study. In contrast, Marsh and Whittington did uncover an 11.5-kb region missing from the S/I type and present in the C/II type by RDA (162), although comparison with other M. avium strains was not carried out. RDA also uncovered a 7-kb section which was further characterized as part of a putative 38-kb pathogenicity island specific for M. avium subsp. paratuberculosis (253). For this region, described as an ABC transporter operon (mpt), no similarity was found in sequences available from GenBank or in the sequenced genome of M. avium strain 104 from TIGR. Since the driver used in the RDA experiments was M. avium subsp. avium (ATCC 25291T), it is likely that this region is also not found in avian strains.

While this review is focused on chromosomal genetics, it is important to note that many strains of MAC organisms are known to harbor a variety of plasmids (58, 60, 163, 171). This aspect of MAC research is significantly understudied at present but has revealed important findings in terms of strain epidemiology (171), virulence (95, 96), adaptability and resistance (91, 94, 205), and other mechanisms attributed to the presence of plasmids (61). While most M. avium strains isolated from AIDS patients in the United States were found to contain plasmids (59, 114, 180), the sequenced strain of M. avium 104 is not known to contain any and was chosen in part due to its rare ability among M. avium isolates to be genetically manipulated (197, 271). Furthermore, to our knowledge, no plasmids have been discovered or identified in strains of M. avium subsp. paratuberculosis.

Genomes of Strains M. avium subsp. paratuberculosis K-10 and M. avium 104

MAC genetics and genomics have advanced exponentially in the past decade, with but one paragraph on this topic in the last comprehensive MAC review (124). The turn of the century brought us the first genome sequences of MAC organisms, for (i) M. avium strain 104 from the blood of an AIDS patient (TIGR), a representative of M. avium subsp. hominissuis; and (ii) M. avium subsp. paratuberculosis strain K-10, isolated from a cow with Johne's disease (152). Several other MAC genome sequencing projects are under way and are expected to generate more data in the coming 3 to 5 years (John Bannantine and Vivek Kapur, personal communication).

The K-10 genome is 4.83 Mb long, has a G+C content of 69.3%, and contains 4,350 open reading frames (ORFs) (152). For comparison, M. avium 104 has an additional ~700 kb of DNA, for a total genome size of 5.48 Mb, but has a similar G+C content of 69.0%. Comparison of genes orthologous between the two genomes reveals 98 to 99% sequence identity. Approximately 75% of the K-10 genome has homologs in the published M. tuberculosis H37Rv genome sequence (48, 152). Since little is known about the actual virulence mechanisms of M. avium subsp. paratuberculosis, the representation of H37Rv genes associated with virulence was investigated (152). The K-10 genome has significantly fewer PE/PPE genes (i.e., with Pro Glu and Pro Pro Glu motifs) (1% versus 10% of total genes in H37Rv). As well, K-10 has additional mammalian cell entry (mce) gene homologs or operons but also a lack of some of the specific mce genes described for H37Rv. Additionally, K-10 has a larger number of genes possibly involved in lipid metabolism. K-10 is notably lacking two important operons, for polyketide and mycocerosic acid synthesis, that together result in the production of the cell wall component phthiocerol dimycocerosate. However, additional genes were identified which may possibly play a role in phthiocerol dimycocerosate synthesis. Since several of these are also found in M. avium 104, they alone cannot explain the pathogenic nature of M. avium subsp. paratuberculosis.

The raw genome sequence of M. avium 104 has been available from TIGR since about 2003. However, the TIGR annotation was just released in late 2006 (GenBank accession no. CP000479). In the interim, publications on the M. avium 104 genome were based on the in-house annotation efforts of individual groups (240, 304). These predicted the presence of between 4,480 (240) and 4,987 (304) ORFs. The present TIGR annotation includes 5,313 genes, 5,120 of which code for proteins. The differences in gene content reflect differences in annotation methods and improvements in the identification of short genes. A formal published annotation of M. avium 104 will be a useful resource for helping to resolve these differences and opening the door to further postgenomic study of the MAC.

Comparative Genomics of the MAC

Prior to the completion of the two MAC genome sequences, comparative analyses using contig fragments representing the partial genome of K-10 already revealed 27 genes unique to it compared to M. avium strain 104 (6). Some of these were found to be present in other mycobacteria, such as M. intracellulare and other strains of M. avium, when screened by PCR, emphasizing caution in the interpretation of what is truly subset specific and foreshadowing the genetic variability in the complex. Additionally, examination of the origin of replication (oriC) site of M. avium subsp. paratuberculosis compared to that of M. avium subsp. avium as well as random genomic regions beyond the oriC site did not reveal any notable differences that could explain their divergent phenotypes. Nucleotide identity values were no different from those observed for other bacteria belonging to the same species and with identical phenotypes (8).

Based on the availability of genome sequence data, three groups have assembled DNA microarrays, two based on the genome sequence of M. avium 104 (240, 304) and one starting with the M. avium subsp. paratuberculosis K-10 genome as a template (201). These microarrays then served to evaluate genomic variability in the MAC as a whole. The different groups employed similar experimental approaches, beginning with a relatively small panel of MAC strains of different types to identify large sequence polymorphisms (LSPs) or regions of difference by microarray analysis and then confirming the presence/absence of these regions by PCR and sequencing across a larger panel of isolates. In part because the taxonomic issues detailed above were not yet widely recognized, discrimination between M. avium subsp. hominissuis and "true" M. avium subsp. avium isolates was not taken into consideration, which may have complicated and/or oversimplified the interpretation of results from these experiments.

The first of these experiments used the first available genome data set, albeit not an annotated set, namely, that of M. avium 104 from TIGR (240), and included 93% of the putative ORFs, representing a first-generation array. Microarray analyses were performed against strains representative of other M. avium subsets, including M. avium subsp. paratuberculosis K-10 (bovine type) and LN20 (ovine type) and the type strain of M. avium subsp. silvaticum. Fourteen LSPs, defined as representing six or more contiguous ORFs missing in test strains, were identified as present in strain 104 but not in the others. These were first labeled LSP1 to LSP14 and subsequently renamed LSPA1 to LSPA14 to distinguish them from the LSPPs described in a subsequent study as present in M. avium subsp. paratuberculosis K-10 but missing from M. avium 104 (236). LSPAs were 21 kb to 197 kb long, for a total of 727 kb (13.5% of the strain 104 genome). Only LSPA11, representing part of the mce2 operon, appeared to be variably present in M. avium subsp. paratuberculosis. Upon further investigation, it was determined to be missing specifically from an ovine strain of a specific IS900 RFLP type (unpublished data), which happened to have been the only ovine strain tested by microarray in this study. Notably, only 3 of the 14 LSPAs were observed as uniformly present in non-M. avium subsp. paratuberculosis MAC strains, indicating that a small minority of the variability observed represented differences between M. avium subsp. hominissuis and M. avium subsp. paratuberculosis. Additionally, no LSPA could serve to distinguish M. avium subsp. silvaticum from strains designated M. avium subsp. avium. Principal observations made from this comparative analysis included a high conservation of PE/PPE genes in all tested strains and variability in the distribution of mce genes. Genes of the mycobactin synthesis operon (mbtA to mbtJ), which is characterized for M. tuberculosis (212), are all present in strains 104 and K-10. However, mbtA, believed to be the initiator of mycobactin synthesis, is truncated in M. avium subsp. paratuberculosis strain K-10 (240), as confirmed by others (152). The hypothesis that this may be the cause of mycobactin dependency in M. avium subsp. paratuberculosis remains to be confirmed formally by functional studies and may be technically hampered by the existence of a number of smaller mutations in other genes of the mbt operon (C. Y. Turenne and M. A. Behr, unpublished).

More recently, another group took a similar approach where smaller regions of deletion, defined as three or more consecutive ORFs, were considered in comparative analyses. Twenty-four LSPs were identified as missing from M. avium subsp. paratuberculosis strains, which included most of those described by Semret et al. plus an additional 96 ORFs distributed among 11 LSPs. Altogether, these ranged from 3 to 196 kb long, totaling 846 kb (17% of the strain 104 genome) (304). In addition to mce operons, genome plasticity was also observed in TetR transcriptional regulators. Finally, three large genetic inversions were described between the 104 and K-10 genomes.
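The two operational definitions of an LSP used in these microarray studies (six or more contiguous missing ORFs in the first study, three or more consecutive ORFs in the second) amount to a run-length threshold applied to per-ORF presence/absence calls. The following is only an illustrative sketch of that idea, not the analysis pipeline of either group: the function name find_lsps, the Boolean input format, and the example calls are all assumptions made here, and the sketch ignores ORFs not represented on the array and the circularity of the chromosome.

```python
from typing import List, Sequence, Tuple


def find_lsps(present: Sequence[bool], min_run: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs for runs of absent ORFs.

    present[i] is True when ORF i of the reference genome (in gene order)
    was detected in the test strain. Only runs of at least min_run
    consecutive absent ORFs are reported.
    """
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current run of absent ORFs began
    for i, is_present in enumerate(present):
        if not is_present:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last ORF.
    if start is not None and len(present) - start >= min_run:
        runs.append((start, len(present) - 1))
    return runs


# Hypothetical calls for 20 ORFs: runs of 7, 4, and 1 absent ORFs.
calls = [True] * 3 + [False] * 7 + [True] * 2 + [False] * 4 + [True] * 3 + [False]

lsps_six = find_lsps(calls, min_run=6)    # stricter definition
lsps_three = find_lsps(calls, min_run=3)  # more permissive definition
print(lsps_six, lsps_three)
```

With these made-up calls, lowering the threshold from six to three ORFs adds a second, shorter region, which illustrates how a more permissive definition can yield additional, smaller LSPs from the same kind of data.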

Working in the converse sense, the availability of the M. avium subsp. paratuberculosis K-10 genome has facilitated in silico (236, 304) and microarray-based (201) approaches to determine which genes are present in K-10 but missing from M. avium 104 and other MAC organisms. Semret et al. identified 17 LSPPs spanning 230 kb of sequence in sections of 3 to 66 kb (236). Comparably, Wu et al. identified 18 LSPPs (GI MAP-1 to -18) spanning 240 kb, 16 of which were shared exactly between the two studies, with the only differences being due to short genetic regions spanning a few genes (304). Paustian et al. presented their microarray data according to any individual genes differentially present in MAC strains versus K-10 (201), not restricted to runs of genes. Not surprisingly, many of these consisted of transposase genes specific to M. avium subsp. paratuberculosis. They also identified seven large regions of difference that corroborated the larger LSPPs described in the other two studies, as well as several single genes or genes in small groups that corresponded mostly to the smaller LSPPs.
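Reconciling region lists reported by different studies, as described above, is essentially a comparison of genomic intervals. The sketch below shows one simple way this could be done, under stated assumptions: regions are given as inclusive (start, end) coordinates on the same reference genome, two regions are treated as the same element when they overlap by at least half the length of the shorter one, and all coordinates are invented for illustration. None of the names, the threshold, or the values come from the cited studies.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # inclusive (start, end) coordinates, hypothetical


def overlap_bp(a: Region, b: Region) -> int:
    """Number of base pairs shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def match_regions(study_a: List[Region], study_b: List[Region],
                  min_fraction: float = 0.5) -> List[Tuple[Region, Region]]:
    """Pair each region in study_a with the first region in study_b that
    overlaps it by at least min_fraction of the shorter region's length."""
    pairs = []
    for a in study_a:
        for b in study_b:
            shorter = min(a[1] - a[0] + 1, b[1] - b[0] + 1)
            if overlap_bp(a, b) >= min_fraction * shorter:
                pairs.append((a, b))
                break
    return pairs


# Invented coordinates: one identical region, one with shifted boundaries,
# and one region unique to each list.
study_a = [(1000, 4999), (20000, 29999), (50000, 52999)]
study_b = [(1000, 4999), (21000, 30499), (80000, 81999)]

pairs = match_regions(study_a, study_b)
identical = [p for p in pairs if p[0] == p[1]]
print(len(pairs), "matched;", len(identical), "with identical boundaries")
```

Separating matched regions from those with identical boundaries mirrors the distinction drawn in the text between regions shared exactly and regions that differ only by a few genes at their edges.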

Studies to date have found that little genomic variability exists among M. avium subsp. paratuberculosis strains. However, the level of variability between M. avium subsp. paratuberculosis and the other MAC organisms is >1 log (more than 10-fold) greater than that observed through nearly a decade of genomic studies on the MTBC. As with efforts for the MTBC, although different laboratories report different numbers of elements, LSPs noted across different papers are typically recognizable as the same genomic regions, providing valuable independent confirmation of the findings presented.

Genetic Variability in the Postgenomic Era

Evolutionary events among the MAC organisms can be speculated upon by the use of LSP analysis. LSPs can be the result of (i) horizontal gene transfer (HGT) or genetic insertion events or (ii) deletion events. Which event occurred is not always evident by simple comparative genomics of two strains against each other. Events in intergenic regions reveal no directionality by themselves. One can best assume an insertion event if the flanking regions represent a single gene split in two. Conversely, a deletion event can be inferred if a gene of known function or homology to a closely related species is truncated or if