Similar articles
 20 similar articles found
1.
Comparative genomic hybridizations have been used to examine genetic relationships among bacteria. The microarrays used in these experiments may have open reading frames from one or more reference strains (whole-genome microarrays), or they may be composed of random DNA fragments from a large number of strains (mixed-genome microarrays [MGMs]). In this work both experimental and virtual arrays are analyzed to assess the validity of genetic inferences from these experiments with a focus on MGMs. Empirical data are analyzed from an Enterococcus MGM, while a virtual MGM is constructed in silico using sequenced genomes (Streptococcus). On average, a small MGM is capable of correctly deriving phylogenetic relationships between seven species of Enterococcus with accuracies of 100% (n = 100 probes) and 95% (n = 46 probes); more probes are required for intraspecific differentiation. Compared to multilocus sequence methods and whole-genome microarrays, MGMs provide additional discrimination between closely related strains and offer the possibility of identifying unique strain or lineage markers. Representational bias can have mixed effects. Microarrays composed of probes from a single genome can be used to derive phylogenetic relationships, although branch length can be exaggerated for the reference strain. We describe a case where disproportional representation of different strains used to construct an MGM can result in inaccurate phylogenetic inferences, and we illustrate an algorithm that is capable of correcting this type of bias. The bias correction algorithm automatically provides bootstrap confidence values and can provide multiple bias-corrected trees with high confidence values.  相似文献   

2.
3.

Background  

Classification microarrays are used for purposes such as identifying strains of bacteria and determining genetic relationships to understand the epidemiology of an infectious disease. For these cases, mixed microarrays, which are composed of DNA from more than one organism, are more effective than conventional microarrays composed of DNA from a single organism. Selection of probes is a key factor in designing successful mixed microarrays because redundant sequences are inefficient and limited representation of diversity can restrict application of the microarray. We have developed a Java-based software tool, called PLASMID, for use in selecting the minimum set of probe sequences needed to classify different groups of plasmids or bacteria.  相似文献   

4.
Whole genomic DNA-DNA hybridization has been a cornerstone of bacterial species determination but is not widely used because it is not easily implemented. We have developed a method based on random genome fragments and DNA microarray technology that overcomes the disadvantages of whole-genome DNA-DNA hybridization. Reference genomes of four fluorescent Pseudomonas species were fragmented, and 60 to 96 genome fragments of approximately 1 kb from each strain were spotted on microarrays. Genomes from 12 well-characterized fluorescent Pseudomonas strains were labeled with Cy dyes and hybridized to the arrays. Cluster analysis of the hybridization profiles revealed taxonomic relationships between bacterial strains tested at species to strain level resolution, suggesting that this approach is useful for the identification of bacteria as well as determining the genetic distance among bacteria. Since arrays can contain thousands of DNA spots, a single array has the potential for broad identification capacity. In addition, the method does not require laborious cross-hybridizations and can provide an open database of hybridization profiles, avoiding the limitations of traditional DNA-DNA hybridization.  相似文献   

5.
Chenuil A  Anne C 《Genetica》2006,127(1-3):101-120
The use of molecular genetic markers (MGMs) has become widespread among evolutionary biologists, and the methods of analysis of genetic data improve rapidly, yet an organized framework in which scientists can work is lacking. Elements of molecular evolution are summarized to explain the origin of variation at the DNA level, its measures, and the relationships linking genetic variability to the biological parameters of the studied organisms. MGMs are defined by two components: the DNA region(s) screened, and the technique used to reveal their variation. Criteria of choice belong to three categories: (1) the level of variability, (2) the nature of the information (e.g. dominance vs. codominance, ploidy, ...), which must be determined according to the biological question, and (3) some practical criteria which mainly depend on the equipment of the laboratory and the experience of the scientist. A three-step procedure is proposed for drawing up MGMs suitable to answer given biological questions, and compiled data are organized to guide the choice at each step: (1) choice, determined by the biological question, of the level of variability and of the criteria of the nature of information, (2) choice of the DNA region and (3) choice of the technique.

7.
Zhou Y  Call DR  Broschat SL 《Plasmid》2012,68(2):133-141
Plasmids are mosaic in composition with a maintenance "backbone" as well as "accessory" genes obtained via horizontal gene transfer. This horizontal gene transfer complicates the study of their genetic relationships. We describe a method for relating a large number of Gram-negative (GN) bacterial plasmids based on their genetic sequences. Complete coding gene sequences of 527 GN bacterial plasmids were obtained from NCBI. Initial classification of their genetic relationships was accomplished using a computational approach analogous to hybridization of "mixed-genome microarrays." Because of this similarity, the phrase "virtual hybridization" is used to describe this approach. Protein sequences generated from the gene sequences were randomly chosen to serve as "probes" for the virtual arrays, and virtual hybridization for each GN plasmid was achieved using BLASTp. Each resulting intensity matrix was used to generate a distance matrix from which an initial tree was constructed. Relationships were refined for several clusters by identifying conserved proteins within a cluster. Multiple-sequence alignment was applied to the concatenated conserved proteins, and maximum likelihood was used to generate relationships from the results of the alignment. While it is not possible to prove that the genetic relationships among the 527 GN bacterial plasmids obtained in this study are correct, replication of identical results produced in a separate study for a small group of IncA/C plasmids provides evidence that the approach used can correctly predict genetic relationships. In addition, results obtained for clusters of Borrelia plasmids are consistent with the expected exclusivity for plasmids from this genus. Finally, the 527-plasmid tree was used to study the distribution of four common antibiotic resistance genes.  相似文献   
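The step from an intensity matrix to a distance matrix described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the BLASTp results have already been binarized into hit (1) or no hit (0) per probe, and it uses a plain mismatch fraction as the distance. The function names and the toy plasmid names are hypothetical.

```python
from itertools import combinations


def distance_matrix(intensity):
    """Pairwise distances from binarized probe hits.

    intensity: dict mapping a plasmid name to a list of 0/1 values,
    one per probe (assumed binarization of virtual hybridization).
    The distance is the fraction of probes on which two plasmids differ.
    """
    names = sorted(intensity)
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        va, vb = intensity[a], intensity[b]
        mismatches = sum(x != y for x, y in zip(va, vb))
        dist[a][b] = dist[b][a] = mismatches / len(va)
    return dist


def closest_pair(dist):
    """Return the pair of names with the smallest distance."""
    names = sorted(dist)
    return min(combinations(names, 2), key=lambda p: dist[p[0]][p[1]])


# Toy data: three hypothetical plasmids scored against four probes.
hits = {
    "pA": [1, 1, 0, 0],
    "pB": [1, 1, 0, 1],
    "pC": [0, 0, 1, 1],
}
d = distance_matrix(hits)
```

A tree-building method (for example neighbor joining) would then take `d` as input; that step is left out here to keep the sketch short.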

8.
A phylogenetic analysis of 14 complete simian virus 40 (SV40) genomes was conducted in order to determine strain relatedness and the extent of genetic variation. This analysis included infectious isolates recovered between 1960 and 1999 from primary cultures of monkey kidney cells, from contaminated poliovaccines and an adenovirus seed stock, from human malignancies, and from transformed human cells. Maximum-parsimony and distance methods revealed distinct SV40 clades. However, no clear patterns of association between genotype and viral source were apparent. One clade (clade A) is derived from strain 776, the reference strain of SV40. Clade B contains isolates from poliovaccines (strains 777 and Baylor), from monkeys (strains N128, Rh911, and K661), and from human tumors (strains SVCPC and SVMEN). Thus, adaptation is not essential for SV40 survival in humans. The C terminus of the T-antigen (T-ag-C) gene contains the highest proportion of variable sites in the SV40 genome. An analysis based on just the T-ag-C region was highly congruent with the whole-genome analysis; hence, sequencing of just this one region is useful in strain identification. Analysis of an additional 16 strains for which only the T-ag-C gene was sequenced indicated that further SV40 genetic diversity is likely, resulting in a provisional clade (clade C) that currently contains strains associated with human tumors and human strain PML-1. Four other polymorphic regions in the genome were also identified. If these regions were analyzed in conjunction with the T-ag-C region, most of the phylogenetic signal could be captured without complete genome sequencing. This report represents the first whole-genome approach to establishing phylogenetic relatedness among different strains of SV40. 
It will be important in the future to develop a more complete catalog of SV40 variation in its natural monkey host, to determine if SV40 strains from different clades vary in biological or pathogenic properties, and to identify which SV40 strains are transmissible among humans.  相似文献   

9.
Comparative genomic hybridizations (CGH) using microarrays are performed with bacteria in order to determine the level of genomic similarity between various strains. The microarrays applied in CGH experiments are constructed on the basis of the genome sequence of one strain, which is used as a control, or reference, in each experiment. A strain being compared with the known strain is called the unknown strain. The ratios of fluorescent intensities obtained from the spots on the microarrays can be used to determine which genes are divergent in the unknown strain, as well as to predict the copy number of actual genes in the unknown strain. In this paper, we focus on the prediction of gene copy number based on data from CGH experiments. We assumed a linear connection between the log2 of the copy number and the observed log2-ratios, then predictors based on the factor analysis model and the linear random model were proposed in an attempt to identify the copy numbers. These predictors were compared to using the ratio of the intensities directly. Simulations indicated that the proposed predictors improved the prediction of the copy number in most situations. The predictors were applied on CGH data obtained from experiments with Enterococcus faecalis strains in order to determine copy number of relevant genes in five different strains.  相似文献   
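The assumed linear link between the log2 copy number and the observed log2-ratio can be illustrated as follows. This sketch replaces the abstract's factor analysis and linear random model predictors with an ordinary least-squares line fitted to calibration genes of known copy number, which is a simplification made here for illustration only. All names and numbers are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_copy_number(log2_ratio, intercept, slope):
    """Invert the fitted line to estimate the copy number."""
    return 2 ** ((log2_ratio - intercept) / slope)


def naive_copy_number(log2_ratio):
    """Baseline: use the intensity ratio directly as the copy number."""
    return 2 ** log2_ratio


# Hypothetical calibration genes with known copy numbers 1, 2 and 4,
# whose observed log2-ratios are compressed (slope 0.8) and shifted (0.1).
known_log2_copies = [math.log2(c) for c in (1, 2, 4)]
observed_log2_ratios = [0.1, 0.9, 1.7]
intercept, slope = fit_line(known_log2_copies, observed_log2_ratios)
```

With these toy values, the calibrated predictor recovers a copy number close to 4 for an observed log2-ratio of 1.7, while the naive ratio underestimates it.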

10.
Specific identification of microorganisms in the environment is important but challenging, especially at the species/strain level. Here, we have developed a novel k-mer-based approach to select strain/species-specific probes for microbial identification with diagnostic microarrays. Application of this approach to human microbiome genomes showed that multiple (≥10 probes per strain) strain-specific 50-mer oligonucleotide probes could be designed for 2,012 of 3,421 bacterial strains of the human microbiome, and species-specific probes could be designed for most of the other strains. The method can also be used to select strain/species-specific probes for sequenced genomes in any environment, such as soil and water.
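The core idea of k-mer-based probe selection can be sketched as a set difference: a k-mer is a candidate strain-specific probe if it occurs in one genome and in no other. This is a minimal illustration under strong assumptions, not the published method: it uses exact matching only, ignores reverse complements, near matches and thermodynamic filters, and uses k = 4 on toy sequences instead of 50-mers. The function names are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def strain_specific_kmers(genomes, k):
    """Map each strain to the k-mers found in it and in no other strain.

    genomes: dict mapping a strain name to its sequence string.
    """
    kmer_sets = {name: kmers(seq, k) for name, seq in genomes.items()}
    specific = {}
    for name, own in kmer_sets.items():
        # Union of the k-mers of every other strain.
        others = set().union(
            *(s for other, s in kmer_sets.items() if other != name)
        )
        specific[name] = own - others
    return specific


# Toy genomes that share the prefix "ACGT" and differ afterwards.
toy = {"s1": "ACGTAC", "s2": "ACGTTT"}
result = strain_specific_kmers(toy, 4)
```

For real genomes the sets become very large, so a practical tool would likely rely on a dedicated k-mer counter or an index rather than in-memory Python sets.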

11.
Genome-level analysis of bacterial strains provides information on genetic composition and resistance mechanisms to clinically relevant antibiotics. To date, whole-genome characterization of linezolid (LZD)-resistant Enterococcus faecalis isolated in the clinic is lacking. In this study, we report the entire genome sequence, genomic characteristics and virulence factors of a pathogenic E. faecalis strain, DENG1. Our results showed considerable differences in genomic characteristics and virulence factors compared with other E. faecalis strains (V583 and OG1RF). The genome of this LZD-resistant E. faecalis strain can be used as a reference to study the mechanism of LZD resistance and the phylogenetic relationship of E. faecalis strains worldwide.

12.
The genetic characterization of hepatitis A virus (HAV) strains is commonly accomplished by sequencing subgenomic regions, such as the VP1/P2B junction. The HAV genome is not extensively variable, thus presenting opportunity for sharing sequences of subgenomic regions among genetically unrelated isolates. The degree of misrepresentation of phylogenetic relationships by subgenomic regions is especially important for tracking transmissions. Here, we analyzed whole-genome (WG) sequences of 101 HAV strains identified from 4 major multi-state, food-borne outbreaks of hepatitis A in the United States and from 14 non-outbreak-related HAV strains that shared identical VP1/P2B sequences with the outbreak strains. Although HAV strains with an identical VP1/P2B sequence were specific to each outbreak, their WG sequences were different, with genetic diversity reaching 0.31% (mean 0.09%). Evaluation of different subgenomic regions did not identify any other section of the HAV genome that could accurately represent phylogenetic relationships observed using WG sequences. The identification of 2–3 dominant HAV strains in 3 out of 4 outbreaks indicates contamination of the implicated food items with a heterogeneous HAV population. However, analysis of intra-host HAV variants from eight patients involved in one outbreak showed that only a single sequence variant established infection in each patient. Four non-outbreak strains were found closely related to strains from 2 outbreaks, whereas ten were genetically different from the outbreak strains. Thus, accurate tracking of HAV strains can be accomplished using HAV WG sequences, while short subgenomic regions are useful for identification of transmissions only among cases with known epidemiological association.

13.
《Genomics》2021,113(4):1952-1961
Background

Plague is a highly dangerous vector-borne infectious disease that has left a significant mark on history of humankind. There are 13 natural plague foci in the Caucasus, located on the territory of the Russian Federation, Azerbaijan, Armenia and Georgia. We performed whole-genome sequencing of Y. pestis strains, isolated in the natural foci of the Caucasus and Transcaucasia. Using the data of whole-genome SNP analysis and Bayesian phylogeny methods, we carried out an evolutionary-phylogeographic analysis of modern population of the plague pathogen in order to determine the phylogenetic relationships of Y. pestis strains from the Caucasus with the strains from other countries.

Results

We used 345 Y. pestis genomes to construct a global evolutionary phylogenetic reconstruction of species based on whole-genome SNP analysis. The genomes of 16 isolates were sequenced in this study, the remaining 329 genomes were obtained from the GenBank database. Analysis of the core genome revealed 3315 SNPs that allow differentiation of strains. The evolutionary phylogeographic analysis showed that the studied Y. pestis strains belong to the genetic lineages 0.PE2, 2.MED0, and 2.MED1. It was shown that the Y. pestis strains isolated on the territory of the East Caucasian high-mountain, the Transcaucasian high-mountain and the Priaraksinsky low-mountain plague foci belong to the most ancient of all existing genetic lineages - 0.PE2.

Conclusions

On the basis of the whole-genome SNP analysis of 345 Y. pestis strains, we describe the modern population structure of the plague pathogen and specify the place of the strains isolated in the natural foci of the Caucasus and Transcaucasia in the structure of the global population of Y. pestis. As a result of the retrospective evolutionary-phylogeographic analysis of the current population of the pathogen, we determined the probable time frame of the divergence of the genetic lineages of Y. pestis, as well as suggested the possible paths of the historical spread of the plague pathogen.

14.
Phylogenomics refers to the inference of historical relationships among species using genome-scale sequence data and to the use of phylogenetic analysis to infer protein function in multigene families. With rapidly decreasing sequencing costs, phylogenomics is becoming synonymous with evolutionary analysis of genome-scale and taxonomically densely sampled data sets. In phylogenetic inference applications, this translates into very large data sets that yield evolutionary and functional inferences with extremely small variances and high statistical confidence (P value). However, reports of highly significant P values are increasing even for contrasting phylogenetic hypotheses depending on the evolutionary model and inference method used, making it difficult to establish true relationships. We argue that the assessment of the robustness of results to biological factors, that may systematically mislead (bias) the outcomes of statistical estimation, will be a key to avoiding incorrect phylogenomic inferences. In fact, there is a need for increased emphasis on the magnitude of differences (effect sizes) in addition to the P values of the statistical test of the null hypothesis. On the other hand, the amount of sequence data available will likely always remain inadequate for some phylogenomic applications, for example, those involving episodic positive selection at individual codon positions and in specific lineages. Again, a focus on effect size and biological relevance, rather than the P value, may be warranted. Here, we present a theoretical overview and discuss practical aspects of the interplay between effect sizes, bias, and P values as it relates to the statistical inference of evolutionary truth in phylogenomics.  相似文献   

15.
Whole-genome regression methods are being increasingly used for the analysis and prediction of complex traits and diseases. In human genetics, these methods are commonly used for inferences about genetic parameters, such as the amount of genetic variance among individuals or the proportion of phenotypic variance that can be explained by regression on molecular markers. This is so even though some of the assumptions commonly adopted for data analysis are at odds with important quantitative genetic concepts. In this article we develop theory that leads to a precise definition of parameters arising in high dimensional genomic regressions; we focus on the so-called genomic heritability: the proportion of variance of a trait that can be explained (in the population) by a linear regression on a set of markers. We propose a definition of this parameter that is framed within the classical quantitative genetics theory and show that the genomic heritability and the trait heritability parameters are equal only when all causal variants are typed. Further, we discuss how the genomic variance and genomic heritability, defined as quantitative genetic parameters, relate to parameters of statistical models commonly used for inferences, and indicate potential inferential problems that are assessed further using simulations. When a large proportion of the markers used in the analysis are in LE with QTL the likelihood function can be misspecified. This can induce a sizable finite-sample bias and, possibly, lack of consistency of likelihood (or Bayesian) estimates. This situation can be encountered if the individuals in the sample are distantly related and linkage disequilibrium spans over short regions. This bias does not negate the use of whole-genome regression models as predictive machines; however, our results indicate that caution is needed when using marker-based regressions for inferences about population parameters such as the genomic heritability.  相似文献   

16.

Background

A low genetic diversity in Francisella tularensis has been documented. Current DNA based genotyping methods for typing F. tularensis offer a limited and varying degree of subspecies, clade and strain level discrimination power. Whole genome sequencing is the most accurate and reliable method to identify, type and determine phylogenetic relationships among strains of a species. However, lower cost typing schemes are necessary in order to enable typing of hundreds or even thousands of isolates.

Results

We have generated a high-resolution phylogenetic tree from 40 Francisella isolates, including 13 F. tularensis subspecies holarctica (type B) strains, 26 F. tularensis subsp. tularensis (type A) strains and a single F. novicida strain. The tree was generated from global multi-strain single nucleotide polymorphism (SNP) data collected using a set of six Affymetrix GeneChip® resequencing arrays with the non-repetitive portion of LVS (type B) as the reference sequence complemented with unique sequences of SCHU S4 (type A). Global SNP based phylogenetic clustering was able to resolve all non-related strains. The phylogenetic tree was used to guide the selection of informative SNPs specific to major nodes in the tree for development of a genotyping assay for identification of F. tularensis subspecies and clades. We designed and validated an assay that uses these SNPs to accurately genotype 39 additional F. tularensis strains as type A (A1, A2, A1a or A1b) or type B (B1 or B2).

Conclusion

Whole-genome SNP based clustering was shown to accurately identify SNPs for differentiation of F. tularensis subspecies and clades, emphasizing the potential power and utility of this methodology for selecting SNPs for typing of F. tularensis to the strain level. Additionally, whole genome sequence based SNP information gained from a representative population of strains may be used to perform evolutionary or phylogenetic comparisons of strains, or selection of unique strains for whole-genome sequencing projects.  相似文献   

17.
Amplified fragment length polymorphism (AFLP) analysis allows a rapid, relatively simple analysis of a large portion of a microbial genome, providing information about the species and its phylogenetic relationship to other microbes (Vos et al. 1995). The method simply surveys the genome for length and sequence polymorphisms. The AFLP pattern identified can be used for comparison to the genomes of other species. Unlike other methods, it does not rely on analysis of a single genetic locus that may bias the interpretation of results and does not require any prior knowledge of the targeted organism. Moreover, a standard set of reagents can be applied to any species without using species-specific information or molecular probes. We are using AFLP analysis to rapidly identify different bacterial species. A comparison of AFLP profiles generated from a large battery of Bacillus anthracis strains shows very little variability among different isolates (Keim et al. 1997). By contrast, there is a significant difference between AFLP profiles generated for any B. anthracis strain and even the most closely related Bacillus species. Sufficient variability is apparent among all known microbial species to allow phylogenetic analysis based on large numbers of genetically unlinked loci. These striking differences among AFLP profiles allow unambiguous identification of previously identified species and phylogenetic placement of newly characterized isolates relative to known species based on a large number of independent genetic loci. Data generated thus far show that the method provides phylogenetic analyses that are consistent with other widely accepted phylogenetic methods. However, AFLP analysis provides a more detailed analysis of the targets and samples a much larger portion of the genome. Consequently, it provides an inexpensive, rapid means of characterizing microbial isolates to further differentiate among strains and closely related microbial species. 
Such information cannot be rapidly generated by other means. AFLP sample analysis quickly generates a very large amount of molecular information about microbial genomes. However, this information cannot be analysed rapidly using manual methods. We are developing a large archive of electronic AFLP signatures that is being used to identify isolates collected from medical, veterinary, forensic and environmental samples. We are also developing the computational packages necessary to rapidly and unambiguously analyse the AFLP profiles and conduct a phylogenetic comparison of these data relative to information already in our database. We will use this archive and the associated algorithms to determine the species identity of previously uncharacterized isolates and place them phylogenetically relative to other microbes based on their AFLP signatures. This study provides significant new information about microbes with environmental, veterinary and medical significance. This information can be used in further studies to understand the relationships among these species and the factors that distinguish them from one another. It should also allow the identification of unique factors that contribute to important microbial traits, including pathogenicity and virulence. We are also using AFLP data to identify, isolate and sequence DNA fragments that are unique to particular microbial species and strains. The fragment patterns and sequence information provide insights into the complexity and organization of bacterial genomes relative to one another. They also provide the information necessary for the development of species-specific polymerase chain reaction primers that can be used to interrogate complex samples for the presence of B. anthracis, other microbial pathogens or their remnants.  相似文献   

18.
Characterization of Trichomonad Species and Strains by PCR Fingerprinting
ABSTRACT. The random amplified polymorphic DNA (RAPD) technique was used for phylogenetic analysis of trichomonads, for intraspecies genealogical study of Trichomonas vaginalis strains, and for assessment of intrastrain polymorphism in Trichomonas vaginalis. The phylogenetic tree for 12 trichomonad species showed certain discrepancies with current models of trichomonad evolution. However, it shows that RAPD traits retain phylogenetically relevant information. The results of intraspecies analyses of 18 Trichomonas vaginalis strains suggested some concordance between the genetic relationship of strains and their geographic origin. They also suggested a concordance between the strain genetic relationships and the resistance to metronidazole. A concordance was also found with respect to the severity of disease observed in donor patients but not with the results of laboratory virulence assays. No concordance was found between genetic relationship of strains and strain infection with a dsRNA Trichomonas vaginalis virus (TVV). The latter suggests that TVV might be transmitted horizontally among Trichomonas vaginalis populations. The identity of RAPD patterns of clones isolated from in vitro cultures and those of the cultures reisolated independently from the same patient within a period of six weeks suggests that individual Trichomonas vaginalis strains are not polymorphic and that the RAPD patterns are stable. Therefore, the RAPD technique seems useful for addressing various clinically relevant issues.

19.
Although resolving phylogenetic relationships and establishing species limits are primary goals of systematics, these tasks remain challenging at both conceptual and analytical levels. Here, we integrated genomic and phenotypic data and employed a comprehensive suite of coalescent‐based analyses to develop and evaluate competing phylogenetic and species delimitation hypotheses in a recent evolutionary radiation of grasshoppers (Chorthippus binotatus group) composed of two species and eight putative subspecies. To resolve the evolutionary relationships within this complex, we first evaluated alternative phylogenetic hypotheses arising from multiple schemes of genomic data processing and contrasted genetic‐based inferences with different sources of phenotypic information. Second, we examined the importance of number of loci, demographic priors, number and kind of phenotypic characters and sex‐based trait variation for developing alternative species delimitation hypotheses. The best‐supported topology was largely compatible with phenotypic data and showed the presence of two clades corresponding to the nominative species groups, one including three well‐resolved lineages and the other comprising a four‐lineage polytomy and a well‐differentiated sister taxon. Integrative species delimitation analyses indicated that the number of employed loci had little impact on the obtained inferences but revealed the higher power provided by an increasing number of phenotypic characters and the usefulness of assessing their phylogenetic information content and differences between sexes in among‐taxa trait variation. Overall, our study highlights the importance of integrating multiple sources of information to test competing phylogenetic hypotheses and elucidate the evolutionary history of species complexes representing early stages of divergence where conflicting inferences are more prone to appear.  相似文献   

20.
A single Bacillus thuringiensis strain can harbor numerous different insecticidal crystal protein (cry) genes from 46 known classes or primary ranks. The cry1 primary rank is the best known and contains the highest number of cry genes which currently totals over 130. We have designed an oligonucleotide-based DNA microarray (cryArray) to test the feasibility of using microarrays to identify the cry gene content of B. thuringiensis strains. Specific 50-mer oligonucleotide probes representing the cry1 primary and tertiary ranks were designed based on multiple cry gene sequence alignments. To minimize false-positive results, a consentaneous approach was adopted in which multiple probes against a specific gene must unanimously produce positive hybridization signals to confirm the presence of a particular gene. In order to validate the cryArray, several well-characterized B. thuringiensis strains including isolates from a Mexican strain collection were tested. With few exceptions, our probes performed in agreement with known or PCR-validated results. In one case, hybridization of primary- but not tertiary-ranked cry1I probes indicated the presence of a novel cry1I gene. Amplification and partial sequencing of the cry1I gene in strains IB360 and IB429 revealed the presence of a cry1Ia gene variant. Since a single microarray hybridization can replace hundreds of individual PCRs, DNA microarrays should become an excellent tool for the fast screening of new B. thuringiensis isolates presenting interesting insecticidal activity.  相似文献   
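The unanimity rule described in this abstract, where a gene is called present only if every probe against it gives a positive signal, can be sketched as below. The signal values, the threshold and the function name are hypothetical; the abstract does not state how a positive hybridization signal was defined.

```python
def call_genes(signals, threshold):
    """Call a gene present only if all of its probes are positive.

    signals: dict mapping a gene name to a list of probe intensities.
    threshold: assumed cutoff above which a probe counts as positive.
    A gene with no probes is never called present.
    """
    return {
        gene: bool(values) and all(v >= threshold for v in values)
        for gene, values in signals.items()
    }


# Toy intensities: one probe for cry1Ab fails, so the gene is not called.
toy_signals = {
    "cry1Aa": [5.0, 6.2, 7.1],
    "cry1Ab": [5.5, 0.4, 6.0],
}
calls = call_genes(toy_signals, threshold=2.0)
```

Requiring agreement from all probes trades sensitivity for specificity: a single weak probe suppresses the call, which is the intended guard against false positives.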
