Similar literature
 Found 20 similar articles; search took 15 ms
1.
Sequencing by hybridization is a method for reconstructing a DNA sequence based on its k-mer content. This content, called the spectrum of the sequence, can be obtained from hybridization with a universal DNA chip. However, even with a sequencing chip containing all $4^9$ 9-mers and assuming no hybridization errors, only sequences about 400 bases long can be reconstructed unambiguously. Drmanac et al. (1989) suggested sequencing long DNA targets by obtaining spectra of many short overlapping fragments of the target, inferring their relative positions along the target, and then computing spectra of subfragments that are short enough to be uniquely recoverable. Drmanac et al. do not treat the realistic case of errors in the hybridization process. In this paper, we study the effect of such errors. We show that the probability of ambiguous reconstruction in the presence of (false negative) errors is close to the probability in the errorless case. More precisely, the ratio between these probabilities is $1 + O\left(\frac{p}{(1-p)^4} \cdot \frac{1}{d}\right)$, where d is the average length of subfragments and p is the probability of a false negative. We also obtain lower and upper bounds for the probability of unambiguous reconstruction based on an errorless spectrum. For realistic chip sizes, these bounds are tighter than those given by Arratia et al. (1996). Finally, we report results on simulations with real DNA sequences, showing that even in the presence of 50% false negative errors, a target of cosmid length can be recovered with less than 0.1% miscalled bases.
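
The notion of a spectrum used in this abstract can be illustrated with a minimal Python sketch. The function names `spectrum` and `drop_false_negatives`, the example sequence, and the modelling of false negatives as independent losses with probability p are assumptions made for illustration; they are not taken from the cited paper.

```python
import random

def spectrum(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def drop_false_negatives(spec: set[str], p: float, rng: random.Random) -> set[str]:
    """Simulate false negatives: each k-mer is lost independently with probability p."""
    return {kmer for kmer in spec if rng.random() >= p}

# A universal chip carrying every 9-mer over {A, C, G, T} has 4**9 features.
chip_size = 4 ** 9

# Hypothetical toy target; real targets are far longer.
example = spectrum("ACGTAC", 3)
noisy = drop_false_negatives(example, 0.5, random.Random(0))
```

The noisy spectrum is always a subset of the errorless one, which is what makes the false-negative-only model simpler than a model that also allows false positives.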

2.
Sequencing by hybridization (SBH) is a method for sequencing DNA. The Watson-Crick complementarity of DNA can be used to determine whether the DNA contains an oligonucleotide substring. A large number of oligonucleotides can be arranged on an array (SBH chip). A combinatorial method is used to construct the sequence from the collection of probes that occur in it. We develop an idea of Margaritis and Skiena and propose an algorithm that uses a series of small SBH chips to sequence long strings. The total number of probes used by our method matches the information theoretical lower bound up to a constant factor.

3.
The SHOM method (Sequencing by Hybridization with Oligonucleotide Matrix) developed in 1988 is a new approach to nucleic acid sequencing by hybridization to an oligonucleotide matrix composed of an array of immobilized oligonucleotides. The original matrix proposed for sequencing by SHOM had to contain at least 65,536 octanucleotides. The present work describes a new family of matrices, which allows one to reduce the number of synthesized oligonucleotides 5-15 times without essentially decreasing the resolving power of the method.
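
The figure of 65,536 is the number of distinct octanucleotides over a four-letter alphabet. The short Python sketch below checks that arithmetic and shows roughly what a 5-fold or 15-fold reduction would leave; the variable names are illustrative, and the reduced sizes are simple divisions rather than matrix sizes reported by the authors.

```python
# Number of distinct octanucleotides (8-mers) over the alphabet {A, C, G, T}.
full_matrix = 4 ** 8

# Approximate matrix sizes after the 5-fold and 15-fold reductions quoted in the abstract.
reduced = {factor: full_matrix // factor for factor in (5, 15)}
```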

4.
DNA sequencing by hybridization using semi-degenerate bases.   (Cited by: 1; self-citations: 0; citations by others: 1)
One way to enhance the performance of hybridization microarrays for de novo DNA sequencing is the use of probing patterns with gaps of unsampled positions. Ideally, such gaps could be realized by the inclusion into microarray oligos (probes) of wild-card compounds, referred to as universal bases (which bind nonspecifically to natural bases). The suggested alternative is to deploy in the gap positions degenerate bases, i.e., uniform mixtures of the four natural bases, with ensuing deterioration of the hybridization signal. In this paper, we show that such signal loss is a minor shortcoming, compared with the fact that degenerate bases cannot be treated as universal. Indeed, the substantial spread of hybridization energies at any microarray feature is such that an overwhelming number of mismatches bind more strongly than legal matches. We observed, however, that much narrower energy spreads are exhibited by pairs of bases in the same strength class (A-T and C-G). We call semi-degenerate a gap position realized with bases in the same energy class and show that well-known sequence reconstruction algorithms can be modified to achieve substantial improvements in sequencing effectiveness. For example, with a $4^9$-feature microarray and an acceptable weakening of the hybridization signal, one may achieve lengths of about 4,000 bases (compared with < 250 of the standard uniform method). Our approach also incorporates the use of a spectrum expressed in terms of observed feature melting temperatures (analog spectrum), rather than binary decisions made directly at the biochemical level (digital spectrum). While universal bases represent the ultimate goal of sequencing by hybridization, semi-degenerate natural bases are the most effective known substitute.
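
As a rough illustration of a gapped probe with semi-degenerate positions, the Python sketch below matches probes written with the standard IUPAC codes W (A or T), S (C or G), and N (any base). The function names, the example probe, and the purely combinatorial matching are assumptions for illustration; the sketch ignores strand complementarity and the hybridization energies and melting temperatures that the abstract is actually concerned with.

```python
# IUPAC nucleotide codes: W = A or T (weak class), S = C or G (strong class), N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "S": "CG", "N": "ACGT"}

def probe_matches(probe: str, window: str) -> bool:
    """True if every window base is allowed by the probe symbol at that position."""
    return len(probe) == len(window) and all(
        base in IUPAC[symbol] for symbol, base in zip(probe, window)
    )

def hits(probe: str, target: str) -> list[int]:
    """Start positions in target where the probe matches."""
    n = len(probe)
    return [i for i in range(len(target) - n + 1) if probe_matches(probe, target[i:i + n])]

# Hypothetical probe: two specified bases, a weak-class gap, a strong-class gap, one specified base.
positions = hits("ACWSG", "TTACAGGAA")
```

A semi-degenerate position (W or S) accepts two of the four bases, whereas a fully degenerate or universal position (N) accepts all four, so the semi-degenerate probe retains one bit of information per gap position.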

5.
MOTIVATION: It is widely recognized that the hybridization process is prone to errors and that the future of DNA sequencing by hybridization is predicated on the ability to successfully cope with such errors. However, hybridization errors make reconstructing a DNA sequence from hybridization data computationally difficult. The reconstruction problem for DNA sequencing by hybridization with errors is strongly NP-hard, and so far it has not been solved satisfactorily. RESULTS: In this paper, a new approach is presented to solve the reconstruction problem of DNA sequencing by hybridization, which realizes the computational part of the SBH experiment. The proposed algorithm tolerates both negative and positive errors. The computational experiments show that the algorithm behaves satisfactorily, especially for the case with k-tuple repetitions and positive errors.

6.
MOTIVATION: A realistic approach to sequencing by hybridization must deal with realistic sequencing errors. The results of such a method can surely be applied to similar sequencing tasks. RESULTS: We provide the first algorithms for interactive sequencing by hybridization which are robust in the presence of hybridization errors. Under a strong error model allowing both positive and negative hybridization errors without repeated queries, we demonstrate accurate and efficient reconstruction with error rates up to 7%. Under the weaker traditional error model of Shamir and Tsur (Proceedings of the Fifth International Conference on Computational Molecular Biology (RECOMB-01), pp 269-277, 2000), we obtain accurate reconstructions with up to 20% false negative hybridization errors. Finally, we establish theoretical bounds on the performance of the sequential probing algorithm of Skiena and Sundaram (J. Comput. Biol., 2, 333-353, 1995) under the strong error model. AVAILABILITY: Freely available upon request. CONTACT: skiena@cs.sunysb.edu.

7.
Sequencing by hybridization (SBH) is a DNA sequencing technique, in which the sequence is reconstructed using its k-mer content. This content, which is called the spectrum of the sequence, is obtained by hybridization to a universal DNA array. Standard universal arrays contain all k-mers for some fixed k, typically 8 to 10. Currently, in spite of its promise and elegance, SBH is not competitive with standard gel-based sequencing methods. This is due to two main reasons: lack of tools to handle realistic levels of hybridization errors and an inherent limitation on the length of uniquely reconstructible sequence by standard universal arrays. In this paper, we deal with both problems. We introduce a simple polynomial reconstruction algorithm which can be applied to spectra from standard arrays and has provable performance in the presence of both false negative and false positive errors. We also propose a novel design of chips containing universal bases that differs from the one proposed by Preparata et al. (1999). We give a simple algorithm that uses spectra from such chips to reconstruct with high probability random sequences of length lower only by a squared log factor compared to the information theoretic bound. Our algorithm is very robust to errors and has a provable performance even if there are both false negative and false positive errors. Simulations indicate that its sensitivity to errors is also very small in practice.
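
The limitation on uniquely reconstructible sequences can be seen on toy inputs. The Python sketch below enumerates, by plain backtracking, every sequence of a given length whose k-mer set equals an errorless spectrum. It is not the polynomial algorithm of the paper: it takes exponential time in the worst case, assumes k >= 2 and no hybridization errors, and the names `spectrum` and `reconstructions` are hypothetical.

```python
def spectrum(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def reconstructions(spec: set[str], n: int) -> list[str]:
    """All length-n sequences whose k-mer set equals spec (errorless case, k >= 2)."""
    k = len(next(iter(spec)))
    results = []

    def extend(seq: str) -> None:
        if len(seq) == n:
            # Accept only candidates that use exactly the observed k-mers.
            if spectrum(seq, k) == spec:
                results.append(seq)
            return
        for base in "ACGT":
            # Extend by one base only if the new trailing k-mer was observed.
            if seq[-(k - 1):] + base in spec:
                extend(seq + base)

    for start in sorted(spec):
        extend(start)
    return sorted(results)

unique = reconstructions(spectrum("ACGTT", 3), 5)
ambiguous = reconstructions(spectrum("ACGACTAC", 3), 8)
```

In the second call the 2-mer AC occurs three times in the target, so several different sequences share the same spectrum, including both ACGACTAC and ACTACGAC; repeats of this kind are what bound the length of uniquely reconstructible sequences.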

8.
The concept of 'organismal complexity' has had a chequered career in genetics, with no rigorous operational definition available for the term. The recent finding that Drosophila melanogaster has more than four thousand fewer genes than the nematode forces a re-examination of whether gene number, in itself, can be taken as any real guide to complexity.

9.
A method for DNA sequencing by hybridization with oligonucleotide matrix.   (Cited by: 12; self-citations: 0; citations by others: 12)
A new technique of DNA sequencing by hybridization with oligonucleotide matrix (SHOM), which could also be applied to DNA mapping and fingerprinting, mutant diagnostics, etc., has been tested in model experiments. A dot matrix was prepared which contained 9 overlapping octanucleotides (8-mers) complementary to a common 17-mer. Each of the 8-mers was immobilized as an individual dot in a thin layer of polyacrylamide gel fixed on a glass plate. The matrix was hybridized with the 32P-labeled 17-mer and three other 17-mers differing from the first one by a single base change. The hybridization enabled us to distinguish perfect duplexes from those containing mismatches in 32 out of 35 cases. These results are discussed with respect to the applicability of the approach for sequencing. It was shown that hybridization of DNA with an immobilized 8-mer in the presence of a labeled 5-mer led to the formation of a stable duplex with the 5-mer only if the 5- and the 8-mers were in continuous stacking, making a perfect nicked duplex 13 (5+8) base pairs long. These experiments and computer simulations suggest that continuous stacking hybridization may increase the efficiency of sequencing so that random or natural coding DNA fragments about 1000 bases long could be sequenced in more than 97% of cases. Miniaturized matrices or sequencing chips were designed, where oligonucleotides were immobilized within 100 x 100 micron dots spaced at 100 micron intervals. Hybridization of fluorescently labeled DNA fragments with microchips may simplify sequencing and ensure sensitivity of at least 10 attomoles per dot. The prospects and limitations of SHOM are discussed.

10.
The efficiency of sequencing by hybridization to an oligonucleotide microchip grows with an increase in the number and in the length of the oligonucleotides; however, such increases raise enormously the complexity of the microchip and decrease the accuracy of hybridization. We have been developing the technique of contiguous stacking hybridization (CSH) to circumvent these shortcomings. Stacking interactions between adjacent bases of two oligonucleotides stabilize their contiguous duplex with DNA. The use of such stacking increases the effective length of microchip oligonucleotides, enhances sequencing accuracy and allows the sequencing of longer DNA. The effects of mismatches, base composition, length and other factors on the stacking are evaluated. Contiguous stacking hybridization of DNA with immobilized 8mers and one or two 5mers labeled with two different fluorescent dyes increases the effective length of sequencing oligonucleotides from 8 to 13 and 18 bases, respectively. The incorporation of all four bases or 5-nitroindole as a universal base into different positions of the 5mers permitted a decrease in the number of additional rounds of hybridization. Contiguous stacking hybridization appears to be a promising approach to significantly increasing the efficiency of sequencing by hybridization.

11.
The homogeneity of DNA complementary to the 35S RNA subunit of avian myeloblastosis virus (AMV) has been demonstrated by single or multistep hybridization. For multistep hybridizations, 35S AMV RNA was preselected for its ability to hybridize either to unfractionated leukemic DNA or to leukemic DNA enriched for unique or for reiterated sequences. These experiments indicate that the viral genome is complementary to DNA sequences with a low reiteration frequency. Competition experiments confirm the absence of fast-hybridizing sequences in viral DNA. Computer analyses of the data reveal that there are two to four copies of viral DNA in infected cells.

12.
MOTIVATION: To develop a new method of assembling short sequences from sequencing-by-hybridization data that contain many positive and negative errors. First, the assembly task is interpreted as a generic traveling salesman problem (i.e. finding the shortest route for visiting many cities) and approached using genetic algorithms. Second, positive errors are excluded before assembly by a sanitization process. RESULTS: The present method outperforms those described in previous studies, in terms of both time and accuracy. AVAILABILITY: http://kamit.med.u-tokai.ac.jp/~takaho/sbh/index.html

13.

Background

Genome sequences, now available for most pathogens, hold promise for the rational design of new therapies. However, biological resources for genome-scale identification of gene function (notably genes involved in pathogenesis) and/or genes essential for cell viability, which are necessary to achieve this goal, are often sorely lacking. This holds true for Neisseria meningitidis, one of the most feared human bacterial pathogens that causes meningitis and septicemia.

Results

By determining and manually annotating the complete genome sequence of a serogroup C clinical isolate of N. meningitidis (strain 8013) and assembling a library of defined mutants in up to 60% of its non-essential genes, we have created NeMeSys, a biological resource for Neisseria meningitidis systematic functional analysis. To further enhance the versatility of this toolbox, we have manually (re)annotated eight publicly available Neisseria genome sequences and stored all these data in a publicly accessible online database. The potential of NeMeSys for narrowing the gap between sequence and function is illustrated in several ways, notably by performing a functional genomics analysis of the biogenesis of type IV pili, one of the most widespread virulence factors in bacteria, and by identifying through comparative genomics a complete biochemical pathway (for sulfur metabolism) that may potentially be important for nasopharyngeal colonization.

Conclusions

By improving our capacity to understand gene function in an important human pathogen, NeMeSys is expected to contribute to the ongoing efforts aimed at understanding a prokaryotic cell comprehensively and eventually to the design of new therapies.

14.
15.
16.
The complexity of the overlap method for sequencing biopolymers   (Cited by: 1; self-citations: 0; citations by others: 1)
The problem of trying to reconstruct the sequence of a biopolymer by using overlapping fragments obtained from cleaving agents is shown to be computationally intractable. This strongly suggests that any computer program for overlap sequencing, even though it may work well for a limited number of inputs, will not work sufficiently for all inputs. However, if the problem is restricted so that certain crucial fragments are known, called prime strings, a sequence can be found efficiently in all cases. Graph theory techniques for doing so can also be used to count the number of sequences consistent with the fragment data to determine whether a unique sequence has been obtained.
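
One graph-theoretic way to test uniqueness is to count the Eulerian paths of a directed multigraph whose edges stand for fragments. The Python sketch below does this by brute-force backtracking on a toy graph. The encoding of fragment data as a list of directed edges, the function name `count_eulerian_paths`, and the example graph are assumptions for illustration; the abstract does not say which graph construction or counting technique the paper uses.

```python
from collections import Counter

def count_eulerian_paths(edges: list[tuple[str, str]], start: str) -> int:
    """Count walks from start that use every directed edge exactly once.

    Parallel edges are treated as interchangeable, so each distinct
    sequence of vertices is counted once.
    """
    remaining = Counter(edges)
    total = len(edges)

    def walk(node: str, used: int) -> int:
        if used == total:
            return 1
        count = 0
        for (u, v), multiplicity in list(remaining.items()):
            if u == node and multiplicity > 0:
                remaining[(u, v)] -= 1
                count += walk(v, used + 1)
                remaining[(u, v)] += 1
        return count

    return walk(start, 0)

# Hypothetical toy graph: two cycles sharing the vertex "AC".
two_cycles = [("AC", "CG"), ("CG", "GA"), ("GA", "AC"),
              ("AC", "CT"), ("CT", "TA"), ("TA", "AC")]
n_paths = count_eulerian_paths(two_cycles, "AC")
```

A count of 1 would indicate a unique sequence under this encoding; here the two cycles can be traversed in either order, so the count is larger and the data are ambiguous.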

17.
18.
Oligonucleotide microchips are manufactured by immobilizing presynthesized oligonucleotides within 0.1 x 0.1 x 0.02 mm or 1 x 1 x 0.02 mm polyacrylamide gel pads arranged on the surface of a microscope slide. The gel pads are separated from each other by hydrophobic glass spacers and serve as a kind of 'microtest tube' of 200 pl or 20 nl volume, respectively. Fractionation of single-stranded DNAs is carried out by their hybridization with chip pads containing immobilized 10mers. DNA extracted separately from each pad is transferred onto a sequencing chip and analyzed thereon. The chip, containing a set of 10mers, was enzymatically phosphorylated, then hybridized with DNA and ligated in a site-directed manner with a contiguously stacked 5mer. Several cycles of successive hybridization-ligation of the chip-bound 10mers with different contiguously stacked 5mers and hybridized with DNA were carried out to sequence DNA containing tetranucleotide repeats. Combined use of these techniques shows significant promise for sequence comparison of homologous regions in different genomes and for sequence analysis of comparatively long DNA fragments or DNA containing internal repeats.

19.
20.

Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  ICP filing no. 京ICP备09084417号