Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Controversy exists over the origins of photosynthetic organelles in that contradictory trees arise from different sequence, biochemical and ultrastructural data sets. We propose a testable hypothesis which explains this inconsistency as a result of the differing GC contents of sequences. We report that current methods of tree reconstruction tend to group sequences with similar GC contents irrespective of whether the similar GC content is due to common ancestry or is independently acquired. Nuclear encoded sequences (high GC) give different trees from chloroplast encoded sequences (low GC). We find that current data are consistent with the hypothesis of multiple origins for photosynthetic organelles and single origins for each type of light harvesting complex.
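
To make the GC-content argument in item 1 concrete, here is a minimal Python sketch of how the GC fraction of a sequence might be computed before sequences are compared. The function name `gc_content` and the decision to ignore ambiguous bases are assumptions made for illustration; the abstract does not describe the authors' own procedure.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T, U).

    Ambiguity codes such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    # Avoid division by zero for empty or fully ambiguous input.
    return gc / total if total else 0.0


# Hypothetical example sequences, not taken from the study.
high_gc = "GCGCGGCCAT"
low_gc = "ATATTAAAGC"
print(gc_content(high_gc), gc_content(low_gc))
```

Comparing such values across a data set is one simple way to check whether sequences that cluster together in a tree also share similar base composition.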

2.
Quantitative real-time PCR (QPCR) with TaqMan probes is increasingly popular in environmental studies for detecting and quantifying a specific microorganism or a group of target microorganisms. Although many aspects of conducting a QPCR assay have become very easy to perform, proper design of the oligonucleotide sequences comprising the primers and probe is still considered one of the most important aspects of a QPCR application. This work was conducted to design group-specific primer and probe sets for the detection of ammonia oxidizing bacteria (AOB) using real-time PCR with a TaqMan system. The genera Nitrosomonas and Nitrosospira were grouped into five clusters based on similarity of their 16S rRNA gene sequences. Five group-specific AOB primer and probe sets were designed. These sets separately detect four subgroups of Nitrosomonas (Nitrosomonas europaea-, Nitrosococcus mobilis-, Nitrosomonas nitrosa-, and Nitrosomonas cryotolerans-clusters) along with the genus Nitrosospira. Target-group specificity of each primer and probe set was initially investigated by analyzing potential false results in silico, followed by a series of experimental tests for QPCR efficiency and detection limit. In general, each primer and probe set was very specific to the target group and sensitive enough to detect as few as two 16S rRNA gene copies per reaction mixture. QPCR efficiencies higher than 93.5% were achieved for all primer and probe sets. The primer and probe sets designed in this study can be used to detect and quantify the beta-proteobacterial AOB in biological nitrification processes and various environments.
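
The abstract in item 2 does not say how QPCR efficiency was calculated. A widely used convention derives it from the slope of a standard curve (threshold cycle, Ct, plotted against log10 of template copies) as E = 10^(-1/slope) - 1. The sketch below assumes that convention; the function name `qpcr_efficiency` and the example data are hypothetical.

```python
import math


def qpcr_efficiency(log10_copies, ct_values):
    """Estimate amplification efficiency from a standard curve.

    Fits Ct = slope * log10(copies) + intercept by ordinary least squares
    and returns E = 10 ** (-1 / slope) - 1 (1.0 means perfect doubling).
    """
    n = len(log10_copies)
    if n < 2 or n != len(ct_values):
        raise ValueError("need at least two paired points")
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    if sxx == 0:
        raise ValueError("dilution points must differ")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("Ct should decrease as template increases")
    return 10 ** (-1 / slope) - 1


# Idealized (hypothetical) dilution series: Ct drops by log2(10) cycles
# per tenfold increase in template, which corresponds to perfect doubling.
dilutions = [1, 2, 3, 4, 5]
cts = [40 - math.log2(10) * x for x in dilutions]
print(qpcr_efficiency(dilutions, cts))
```

Under this convention, a reported efficiency above 93.5% would correspond to a standard-curve slope between roughly -3.49 and -3.32 cycles per decade.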

3.

Background  

Traditional genome alignment methods consider sequence alignment as a variation of the string edit distance problem, and perform alignment by matching characters of the two sequences. They are often computationally expensive and unable to deal with low information regions. Furthermore, they lack a well-principled objective function to measure the performance of sets of parameters. Since genomic sequences carry genetic information, this article proposes that the information content of each nucleotide in a position should be considered in sequence alignment. An information-theoretic approach for pairwise genome local alignment, namely XMAligner, is presented. Instead of comparing sequences at the character level, XMAligner considers a pair of nucleotides from two sequences to be related if their mutual information in context is significant. The information content of nucleotides in sequences is measured by a lossless compression technique.

4.
A novel combinatorial approach to synthesize oligonucleotides on fluorescently encoded microspheres based on flow sorting and segmental solid-phase synthesis is described. BODIPY dyes were covalently attached to polystyrene (8.8 µm, 55% DVB) microsphere particles to generate four fluorescently encoded sets. 20-mer oligonucleotide sequences can be synthesized on these microspheres with yields comparable to conventional CPG supports (80% overall yield, average stepwise yield = 99%). The concept of segmental solid-phase synthesis by flow sorting was demonstrated by synthesizing unique 20-mer oligonucleotide sequences on each of four fluorescently encoded microsphere sets by including a flow sorting step (after the first eight base additions) and flow cytometric detection of the sequences synthesized on each microsphere set by hybridization with a fluorescently labeled complementary sequence.
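
As a rough consistency check on the yields quoted in item 4, assume the overall yield is simply the product of n identical stepwise yields y. This is an idealization, and the abstract does not state how many coupling steps are counted, so both 19 and 20 are shown:

```latex
Y_{\text{overall}} = y^{\,n}, \qquad 0.99^{19} \approx 0.826, \qquad 0.99^{20} \approx 0.818
```

Either value is close to the reported overall yield of about 80%.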

5.
Biological sequence families contain many sequences that are very similar to each other because they are related by evolution, so the strategy for splitting data into separate training and test sets is a nontrivial choice in benchmarking sequence analysis methods. A random split is insufficient because it will yield test sequences that are closely related or even identical to training sequences. Adapting ideas from independent set graph algorithms, we describe two new methods for splitting sequence data into dissimilar training and test sets. These algorithms input a sequence family and produce a split in which each test sequence is less than p% identical to any individual training sequence. These algorithms successfully split more families than a previous approach, enabling construction of more diverse benchmark datasets.
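
The constraint described in item 5 (every test sequence less than p% identical to every training sequence) can be illustrated with a simple baseline. The sketch below is not either of the two algorithms the abstract refers to: it links any two sequences that are at least p% identical, finds connected components, and assigns whole components to one side. The function names and the assumption that sequences are pre-aligned to equal length are hypothetical simplifications.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def split_by_components(seqs, p):
    """Split seqs so that no train/test pair is >= p percent identical.

    Sequences with identity >= p are linked; each connected component is
    kept together. The largest component becomes the training set and all
    remaining components form the test set.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if percent_identity(seqs[i], seqs[j]) >= p:
            parent[find(i)] = find(j)

    components = {}
    for i, s in enumerate(seqs):
        components.setdefault(find(i), []).append(s)

    groups = sorted(components.values(), key=len, reverse=True)
    train = groups[0] if groups else []
    test = [s for g in groups[1:] for s in g]
    return train, test


# Hypothetical toy family of pre-aligned sequences.
family = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "CCCCCCCCCG", "GGGGGGGGGG"]
train, test = split_by_components(family, p=80)
print(train, test)
```

Keeping components intact guarantees the identity constraint but can fail to produce a useful split when a family forms one large component, which is the kind of limitation that motivates more refined approaches.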

6.
Two approaches to the understanding of biological sequences are confronted. While the recognition of particular signals in sequences relies on complex physical interactions, the problem is often analysed in terms of the presence or absence of literal motifs (strings) in the sequence. We present here a test-case for evaluating the potential of this approach. We classify DNA sequences as positive or negative depending on whether they contain a single melted domain in the middle of the sequence, which is a global physical property. Two sets of positive "biological" sequences were generated by a computer simulation of evolutionary divergence along the branches of a phylogenetic tree, under the constraint that each intermediate sequence be positive. These two sets and a set of random positive sequences were subjected to pattern analysis. The observed local patterns were used to construct expert systems to discriminate positive from negative sequences. The experts achieved 79% to 90% success on random positive sequences and up to 99% on the biological sets, while making fewer than 2% errors on negative sequences. Thus, the global constraints imposed on sequences by a physical process may generate local patterns that are sufficient to predict, with a reasonable probability, the behaviour of the sequences. However, rather large sets of biological sequences are required to generate patterns free of illegitimate constraints. Furthermore, depending upon the initial sequence, the sets of sequences generated on a phylogenetic tree may be amenable or refractory to string analysis, while obeying identical physical constraints. Our study clarifies the relationship between experts' errors on positive and negative sequences, and the contributions of legitimate and illegitimate patterns to these errors. The test-case appears suitable both for further investigations of problems in the theory of sequence evolution and for further testing of pattern analysis techniques.

7.
We performed two sets of in vitro selections to dissect the role of the -10 base sequence in determining the rate and efficiency with which Escherichia coli RNA polymerase-sigma(70) forms stable complexes with a promoter. We identified sequences that (i) rapidly form heparin-resistant complexes with RNA polymerase or (ii) form heparin-resistant complexes at very low RNA polymerase concentrations. The sequences selected under the two conditions differ from each other and from the consensus -10 sequence. The selected promoters have the expected enhanced binding and kinetic properties and are functionally better than the consensus promoter sequence in directing RNA synthesis in vitro. Detailed analysis of the selected promoter functions shows that each step in this multistep pathway may have different sequence requirements, meaning that the sequence of a strong promoter does not contain the optimal sequence for each step but instead is a compromise sequence that allows all steps to proceed with minimal constraint.

8.
Human mitochondrial DNA (mtDNA) sequences reveal an abundance of polymorphic sites in which the frequencies of the segregating bases are very different. A typical polymorphism involves one base at low frequency and the other base at high frequency. In contrast, nuclear gene data sets tend to show an excess of polymorphisms in which both segregating bases are at intermediate frequencies. A new statistical test of this difference finds significant differences between mtDNA and nuclear gene data sets reported in the literature. However, differences in the polymorphism patterns could be caused by different sample origins for the different data sets. To examine the mtDNA-nuclear difference more closely, DNA sequences were generated from a portion of the X-linked pyruvate dehydrogenase E1 alpha subunit (PDHA1) locus and from a portion of mitochondrial control region I (CRI) from each of eight individuals, four from sub-Saharan Africa. The two genes revealed a significant difference in the site frequency distribution of polymorphic sites. PDHA1 revealed an excess of intermediate-frequency polymorphisms, while CRI showed an excess of sites with the low-high frequency pattern. The discrepancy suggests that mitochondrial variation has been shaped by natural selection, and may not be ideal for some questions on human origins.

9.
Three particulate methane monooxygenase PCR primer sets (A189-A682, A189-A650, and A189-mb661) were investigated for their ability to assess methanotroph diversity in soils from three sites, i.e., heath, oak, and sitka, each of which was capable of oxidizing atmospheric concentrations of methane. Each PCR primer set was used to construct a library containing 50 clones from each soil type. The clones from each library were grouped by restriction fragment length polymorphism, and representatives from each group were sequenced and analyzed. Libraries constructed with the A189-A682 PCR primer set were dominated by amoA-related sequences or nonspecific PCR products with nonsense open reading frames. The primer set could not be used to assess methanotroph diversity in these soils. A new pmoA-specific primer, A650, was designed in this study. The A189-A650 primer set demonstrated distinct biases both in clone library analysis and when incorporated into denaturing gradient gel electrophoresis analysis. The A189-mb661 PCR primer set demonstrated the largest retrieval of methanotroph diversity of all of the primer sets. However, this primer set did not retrieve sequences linked with novel high-affinity methane oxidizers from the soil libraries, which were detected using the A189-A650 primer set. A combination of all three primer sets appears to be required to examine both methanotroph diversity and the presence of novel methane monooxygenase sequences.

10.
We present theoretical considerations that suggest that synonymous-codon usage might be expected to be close to an equilibrium distribution given a very homogeneous process of silent substitution. By homogeneous we mean that substitution depends only on the two bases involved, so that 12 base-substitution rates completely describe the silent substitution process. We have developed a method of statistically testing for such homogeneous equilibrium and applied it to reported data on the codon usages of different classes of organisms. Weakly expressed bacterial sequences and both mammalian and nonmammalian eukaryotic sequences deviate significantly from a random pattern of codon usage, in the direction of homogeneous equilibrium. On the other hand, highly expressed bacterial sequences do not exhibit homogeneous equilibrium, which may be correlated with recent experimental results showing that they are optimized to accept the most abundant tRNAs. To examine the effect of amino acid replacements on the homogeneous model of silent substitution, we divided the amino acids with degenerate codes into two classes, those with high mutabilities and those with low, and performed the same analysis on bacterial and eukaryotic data sets. The codon sets of the highly mutable class of amino acids are not further from homogeneous equilibrium than are the codon sets of the class with low mutabilities. We also found for the eukaryotic data that these independent classes of codon sets show very similar equilibrium patterns. The various results suggest a high level of uniformity in the process of silent fixation in the different synonymous-codon sets, especially in eukaryotes.

11.
Summary: We present theoretical considerations that suggest that synonymous-codon usage might be expected to be close to an equilibrium distribution given a very homogeneous process of silent substitution. By homogeneous we mean that substitution depends only on the two bases involved, so that 12 base-substitution rates completely describe the silent substitution process. We have developed a method of statistically testing for such homogeneous equilibrium and applied it to reported data on the codon usages of different classes of organisms. Weakly expressed bacterial sequences and both mammalian and nonmammalian eukaryotic sequences deviate significantly from a random pattern of codon usage, in the direction of homogeneous equilibrium. On the other hand, highly expressed bacterial sequences do not exhibit homogeneous equilibrium, which may be correlated with recent experimental results showing that they are optimized to accept the most abundant tRNAs. To examine the effect of amino acid replacements on the homogeneous model of silent substitution, we divided the amino acids with degenerate codes into two classes, those with high mutabilities and those with low, and performed the same analysis on bacterial and eukaryotic data sets. The codon sets of the highly mutable class of amino acids are not further from homogeneous equilibrium than are the codon sets of the class with low mutabilities. We also found for the eukaryotic data that these independent classes of codon sets show very similar equilibrium patterns. The various results suggest a high level of uniformity in the process of silent fixation in the different synonymous-codon sets, especially in eukaryotes.

12.
This study makes use of three sources of data, morphology and two chloroplast DNA sequences, ndhF and rbcL, to resolve relationships in Gesneriaceae. Cladograms from each of the three data sets separately are not topologically congruent. Statistical indices suggest that each data set is congruent with the ndhF data although rbcL and morphology are themselves incongruent. Consensus methods provide no resolution of taxonomic relationships when trees from the different data sets are combined. Combining data sets generally results in cladograms that are more fully resolved than each of the data sets analyzed separately and support for the clades increases based on higher decay index and bootstrap values. These results indicate that there is a phylogenetic signal common to each of the data sets; however, the noise (errors due to homoplasy, mis-scoring, etc.) unique to each data source masks this signal. In combining the data, the evidence for the common evolutionary history in each data set overcomes the noise and is apparent in the resulting trees.

13.
Three particulate methane monooxygenase PCR primer sets (A189-A682, A189-A650, and A189-mb661) were investigated for their ability to assess methanotroph diversity in soils from three sites, i.e., heath, oak, and sitka, each of which was capable of oxidizing atmospheric concentrations of methane. Each PCR primer set was used to construct a library containing 50 clones from each soil type. The clones from each library were grouped by restriction fragment length polymorphism, and representatives from each group were sequenced and analyzed. Libraries constructed with the A189-A682 PCR primer set were dominated by amoA-related sequences or nonspecific PCR products with nonsense open reading frames. The primer set could not be used to assess methanotroph diversity in these soils. A new pmoA-specific primer, A650, was designed in this study. The A189-A650 primer set demonstrated distinct biases both in clone library analysis and when incorporated into denaturing gradient gel electrophoresis analysis. The A189-mb661 PCR primer set demonstrated the largest retrieval of methanotroph diversity of all of the primer sets. However, this primer set did not retrieve sequences linked with novel high-affinity methane oxidizers from the soil libraries, which were detected using the A189-A650 primer set. A combination of all three primer sets appears to be required to examine both methanotroph diversity and the presence of novel methane monooxygenase sequences.

14.
We have analyzed a total of 12 different global and local multiple protein-sequence alignment methods. The purpose of this study is to evaluate each method's ability to correctly identify the ordered series of motifs found among all members of a given protein family. Four phylogenetically distributed sets of sequences from the hemoglobin, kinase, aspartic acid protease, and ribonuclease H protein families were used to test the methods. The performance of all 12 methods was affected by (1) the number of sequences in the test sets, (2) the degree of similarity among the sequences, and (3) the number of indels required to produce a multiple alignment. Global methods generally performed better than local methods in the detection of motif patterns.

15.
Transfer of sequence tagged site PCR markers between wheat and barley.   (Cited by: 6 total; 0 self-citations; 6 by others)
Transfer of mapping information between related species has facilitated the development of restriction fragment length polymorphism (RFLP) maps in the cereals. Sequence tagged site (STS) primer sets for use in the polymerase chain reaction may be developed from mapped RFLP clones. For this study, we mapped 97 STS primer sets to chromosomes in wheat and barley to determine the potential transferability of the primer sets and the degree of correspondence between RFLP and STS locations. STS products mapped to the same chromosome group in wheat and barley 75% of the time. RFLP location predicted STS location 69% of the time in wheat and 56% of the time in barley. Southern hybridizations showed that most primer sets amplified sequences homologous to the RFLP clone, although additional sequences were often amplified that did not hybridize to the RFLP clone. Nontarget sequences were often amplified when primer sets were transferred across species. In general, results suggest a good probability of success in transferring STSs between wheat and barley, and that RFLP location can be used to predict STS location. However, transferability of STSs cannot be assumed, suggesting a need for recombinational mapping of STS markers in each species as new primer sets are developed. Key words: sequence tagged sites, PCR, wheat, barley.

16.
Two primer sets for direct sequence determination of all seven rRNA operons (rrn) of Escherichia coli have been developed; one is for specific amplification of each rrn operon and the other is for direct sequencing of the amplified operons. Using these primer sets, we determined the nucleotide sequences of seven rrn operons, including promoter and terminator regions, of an enterohemorrhagic E. coli (EHEC) O157:H7 Sakai strain. To elucidate the intercistronic or intraspecific variation of rrn operons, their sequences were compared with those for the K-12 rrn operons. The rrn genes and the internal transcribed spacer regions showed a higher similarity to each other in each strain than between the corresponding operons of the two strains. However, the degree of intercistronic homogeneity was much higher in the EHEC strain than in K-12. In contrast, promoter and terminator regions in each operon were conserved between the corresponding operons of the two strains, which exceeded intercistronic similarity.

17.
An algorithm is presented for the generation of sets of non-interacting DNA sequences, employing existing thermodynamic models for the prediction of duplex stabilities and secondary structures. A DNA ‘word’ structure is employed in which individual DNA ‘words’ of a given length (e.g. 12mer and 16mer) may be concatenated into longer sequences (e.g. four tandem words and six tandem words). This approach, where multiple word variants are used at each tandem word position, allows very large sets of non-interacting DNA strands to be assembled from combinations of the individual words. Word sets were generated and their figures of merit are compared to sets as described previously in the literature (e.g. 4, 8, 12, 15 and 16mer). The predicted hybridization behavior was experimentally verified on selected members of the sets using standard UV hyperchromism measurements of duplex melting temperatures (Tms). Additional experimental validation was obtained by using the sequences in formulating and solving a small example of a DNA computing problem.
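
The combinatorial point in item 17 (several word variants at each tandem position multiply into a large strand set) can be sketched as follows. This shows only the concatenation step; it does not perform the thermodynamic screening the abstract describes. The function name `assemble_strands` and the short placeholder words are hypothetical and have not been checked for non-interaction.

```python
from itertools import product


def assemble_strands(word_variants):
    """Concatenate one word from each tandem position in every combination.

    word_variants[i] is the list of allowed words at tandem position i.
    The number of strands is the product of the list lengths.
    """
    return ["".join(words) for words in product(*word_variants)]


# Placeholder 4-mers for brevity; the abstract mentions 12mer and 16mer words.
variants = [
    ["ACGT", "TGCA"],          # position 1: two variants
    ["GGAA", "CCTT", "AATT"],  # position 2: three variants
]
strands = assemble_strands(variants)
print(len(strands), strands[0])
```

With v variants at each of k positions the set grows as v to the power k, so even modest word sets can yield very many distinct strands.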

18.
The use of partial covariance models to search for RNA family members in genomic sequence databases is explored. The partial models are formed from contiguous subranges of the overall RNA family multiple alignment columns. A binary decision-tree framework is presented for choosing the order to apply the partial models and the score thresholds on which to make the decisions. The decision trees are chosen to minimize computation time subject to the constraint that all of the training sequences are passed to the full covariance model for final evaluation. Computational intelligence methods are suggested to select the decision tree since the tree can be quite complex and there is no obvious method to build the tree in these cases. Experimental results from seven RNA families show execution times of 0.066-0.268 relative to using the full covariance model alone. Tests on the full sets of known sequences for each family show that at least 95 percent of these sequences are found for two families and 100 percent for five others. Since the full covariance model is run on all sequences accepted by the partial model decision tree, the false alarm rate is at least as low as that of the full model alone.

19.
The most popular way of comparing the performance of multiple sequence alignment programs is to use empirical testing on sets of test sequences. Several such test sets now exist, each with potential strengths and weaknesses. We apply several different alignment packages to 6 benchmark datasets, and compare their relative performances. HOMSTRAD, a collection of alignments of homologous proteins, is regularly used as a benchmark for sequence alignment though it is not designed as such, and lacks annotation of reliable regions within the alignment. We introduce this annotation into HOMSTRAD using protein structural superposition. Results on each database show that method performance is dependent on the input sequences. Alignment benchmarks are regularly used in combination to measure performance across a spectrum of alignment problems. Through combining benchmarks, it is possible to detect whether a program has been over-optimised for a single dataset, or alignment problem type.

20.
Increasing evidence of the fungal diversity in deep-sea sediments has come from amplification of environmental DNA with fungal-specific or eukaryote primer sets. In order to assess the fungal diversity in deep-sea sediments of the Central Indian Basin (CIB) at ~5,000 m depth, we amplified sediment DNA with four different primer sets. These were the fungal-specific primer pair ITS1F/ITS4 (internal transcribed spacers) and the universal 18S rDNA primers NS1/NS2, Euk18S-42F/Euk18S-1492R and Euk18S-555F/Euk18S-1269R. One environmental library was constructed with each of the primer pairs, and 48 clones were sequenced per library. These sequences resulted in 8 fungal Operational Taxonomic Units (OTUs) with ITS and 19 OTUs with 18S rDNA primer sets, respectively, using a 2% sequence divergence cut-off for species delineation. These OTUs belonged to 20 distinct fungal genera of the phyla Ascomycota and Basidiomycota. Seven sequences were found to be divergent by 79–97% from the known sequences of the existing database and may be novel. A majority of the sequences clustered with known sequences of the existing taxa. The phylogenetic affiliation of a few fungal sequences with known environmental sequences from marine and hypersaline habitats suggests their autochthonous nature or adaptation to marine habitat. The amplification of sequences belonging to Exobasidiomycetes and Cystobasidiomycetes from the deep sea is reported for the first time in this study. Amplification of fungal sequences with eukaryotic as well as fungal-specific primers indicates that, among eukaryotes, fungi appear to be a dominant group at the sampling site of the CIB.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号