1.

DNA sequencing with positive and negative errors. Total citations: 7 (self-citations: 0, citations by others: 7)

J Błażewicz P Formanowicz M Kasprzak W T Markiewicz J Weglarz 《Journal of computational biology》1999,6(1):113-123

The problem addressed in this paper is concerned with DNA sequencing by hybridization. An algorithm is proposed that solves a computational phase of this approach in the presence of both positive and negative errors resulting from the hybridization experiment. No a priori knowledge of the nature and source of these errors is required. An extensive set of computational experiments showed that the algorithm behaves surprisingly well if only positive errors appear. The general case, where positive and negative errors occur, can also be solved satisfactorily for an error rate up to 10%.

2.

A heuristic managing errors for DNA sequencing

Błażewicz J Formanowicz P Guinand F Kasprzak M 《Bioinformatics (Oxford, England)》2002,18(5):652-660

MOTIVATION: A new heuristic algorithm for solving the DNA sequencing by hybridization problem with positive and negative errors. RESULTS: A heuristic algorithm providing better solutions than algorithms known from the literature based on the tabu search method.

3.

Improved DNA sequencing quality and efficiency using an optimized fast cycle sequencing protocol. Total citations: 1 (self-citations: 0, citations by others: 1)

Platt AR Woodhall RW George AL 《BioTechniques》2007,43(1):58, 60, 62

4.

Recombination values and their errors.

L Butler 《Canadian journal of genetics and cytology》1977,19(3):521-529

Two four-point testcrosses comprising 87,000 tomato plants were grown and the data collected from 28 subgroups. Each subgroup consisted of 2,000 or 5,000 plants and should give a valid estimate of the three recombination values. The 28 values for each interval give more outliers (23% are outside the 95% limits set by the standard deviation calculated by the binomial formula √(pq/n)) than would be expected by chance. If each subgroup was regarded as the control and the other groups tested against this, then 42% of the time the two subgroups would be significantly different. It is suggested that there are many cases in the literature where this comparison has been made and the significant difference wrongly ascribed to treatment. While the causes of these changes in recombination value are unknown and therefore uncontrollable, they must be anticipated in all such studies. Control and treatment must be replicated enough that chance extreme values will not be attributed to treatment.
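
The binomial formula mentioned in this abstract can be illustrated with a minimal sketch. The function names, the example recombination value of 0.20, and the 1.96 multiplier for the 95% limits are assumptions made for illustration; the abstract does not state how the limits were computed beyond naming the formula.

```python
import math


def binomial_sd(p: float, n: int) -> float:
    """Standard deviation of a proportion p estimated from n plants: sqrt(p*q/n)."""
    q = 1.0 - p
    return math.sqrt(p * q / n)


def limits_95(p: float, n: int) -> tuple[float, float]:
    """Approximate 95% limits, taken here (an assumption) as p +/- 1.96 standard deviations."""
    sd = binomial_sd(p, n)
    return (p - 1.96 * sd, p + 1.96 * sd)


# Hypothetical example: a recombination value of 0.20 in a subgroup of 2,000 plants.
sd = binomial_sd(0.20, 2000)
low, high = limits_95(0.20, 2000)
print(f"sd = {sd:.4f}, 95% limits = ({low:.4f}, {high:.4f})")
```

A subgroup whose observed value falls outside these limits would count as an outlier in the sense used above.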

5.

A quality control algorithm for DNA sequencing projects. Total citations: 2 (self-citations: 0, citations by others: 2)

O White T Dunning G Sutton M Adams J C Venter C Fields 《Nucleic acids research》1993,21(16):3829-3838

Heterologous DNA sequences from rearrangements with the genomes of host cells, genomic fragments from hybrid cells, or impure tissue sources can threaten the purity of libraries that are derived from RNA or DNA. Hybridization methods can only detect contaminants from known or suspected heterologous sources, and whole library screening is technically very difficult. Detection of contaminating heterologous clones by sequence alignment is only possible when related sequences are present in a known database. We have developed a statistical test to identify heterologous sequences that is based on the differences in hexamer composition of DNA from different organisms. This test does not require that sequences similar to potential heterologous contaminants are present in the database, and can in principle detect contamination by previously unknown organisms. We have applied this test to the major public expressed sequence tag (EST) data sets to evaluate its utility as a quality control measure and a peer evaluation tool. There is detectable heterogeneity in most human and C. elegans EST data sets but it is not apparently associated with cross-species contamination. However, there is direct evidence for both yeast and bacterial sequence contamination in some public database sequences annotated as human. Results obtained with the hexamer test have been confirmed with similarity searches using sequences from the relevant data sets.
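
As a rough illustration of what "hexamer composition" means, the sketch below counts overlapping 6-mers in a sequence and converts them to relative frequencies. It is not the authors' statistical test, which the abstract does not specify; the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def hexamer_frequencies(sequence: str) -> dict[str, float]:
    """Relative frequency of every overlapping 6-mer in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + 6] for i in range(len(seq) - 5))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than six bases
        return {}
    return {hexamer: count / total for hexamer, count in counts.items()}


# Toy sequence (hypothetical); real data sets would pool many reads per organism.
freqs = hexamer_frequencies("ACGTACGTAC")
for hexamer, freq in sorted(freqs.items()):
    print(hexamer, round(freq, 2))
```

Frequency profiles of this kind, computed per organism, are the sort of input a composition-based test could compare.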

6.

Evaluation of an automated DNA sequencing system developed in RIKEN

I. Endo S. Katsura Y. Murakami M. Yohda 《Bioprocess and biosystems engineering》1995,13(5):223-229

For the advancement of the Human Genome Project, we have developed an automated DNA sequencing system, HUGA-I. It is composed of several automated instruments and the transfer robots connecting them. In this paper we describe the results of the performance evaluation test of HUGA-I. Although some of the system units performed well, the total performance of HUGA-I was about 1/6 of the designed value. By revealing the principal reasons for this poor performance, we would like to contribute to automation in genome analysis, particularly in human genome analysis. Since sequencing technology has advanced remarkably in recent years, the system units of HUGA-I have become older than those now commercially available, and its throughput falls short of our expectations. Nevertheless, we believe that it is meaningful to report the exact performance of HUGA-I and to present the bottlenecks in automating the sequencing process, because automation in gene analysis is ultimately important, in particular for the analysis of large genomes such as the human genome. The aims of this paper are to present the results of the performance evaluation of HUGA-I and to elucidate the bottlenecks in the automation of sequencing processes. The authors express their sincere thanks to Mr. Morisada Hayakawa and Mrs. Nobuko Kato for their technical assistance.

7.

Analysis of context-dependent errors for Illumina sequencing

Abnizova I Leonard S Skelly T Brown A Jackson D Gourtovaia M Qi G Te Boekhorst R Faruque N Lewis K Cox T 《Journal of bioinformatics and computational biology》2012,10(2):1241005

The new generation of short-read sequencing technologies requires reliable measures of data quality. Such measures are especially important for variant calling. However, in the particular case of SNP calling, a great number of false-positive SNPs may be obtained. One needs to distinguish putative SNPs from sequencing or other errors. We found that not only the probability of sequencing errors (i.e. the quality value) is important to distinguish an FP-SNP but also the conditional probability of "correcting" this error (the "second best call" probability, conditional on that of the first call). Surprisingly, around 80% of mismatches can be "corrected" with this second call. Another way to reduce the rate of FP-SNPs is to retrieve DNA motifs that seem to be prone to sequencing errors, and to attach a corresponding conditional quality value to these motifs. We have developed several measures to distinguish between sequence errors and candidate SNPs, based on a base call's nucleotide context and its mismatch type. In addition, we suggested a simple method to correct the majority of mismatches, based on conditional probability of their "second" best intensity call. We attach a corresponding second call confidence (quality value) of being corrected to each mismatch.

8.

Quake: quality-aware detection and correction of sequencing errors

Kelley DR Schatz MC Salzberg SL 《Genome biology》2010,11(11):R116

We introduce Quake, a program to detect and correct errors in DNA sequencing reads. Using a maximum likelihood approach incorporating quality values and nucleotide specific miscall rates, Quake achieves the highest accuracy on realistically simulated reads. We further demonstrate substantial improvements in de novo assembly and SNP detection after using Quake. Quake can be used for any size project, including more than one billion human reads, and is freely available as open source software from .

9.

A rapid method for isolating high quality plasmid DNA suitable for DNA sequencing. Total citations: 19 (self-citations: 4, citations by others: 19)

D S Jones J P Schofield 《Nucleic acids research》1990,18(24):7463-7464

10.

PCR and DNA sequencing. Total citations: 5 (self-citations: 0, citations by others: 5)

U B Gyllensten 《BioTechniques》1989,7(7):700-708

Specific DNA segments defined by the sequence of two oligonucleotides can be enzymatically amplified up to a millionfold using the polymerase chain reaction (PCR). One of the most significant uses of this technique is for generation of sequencing templates, either from cloned inserts or directly from genomic DNA. To avoid the problem of reassociation of the linear DNA strands in the sequencing reaction, ssDNA templates can be produced directly in the PCR or generated directly from dsDNA by enzymatic treatment, electrophoretic separation or affinity purification. By combining PCR with direct sequencing, both the amplification and the sequencing reaction can be performed in the same vial. Finally, use of fluorescently labeled terminators or sequencing primers will allow the whole procedure to be amenable to complete automation.

11.

VSQual: a visual system to assist DNA sequencing quality control

Binneck E Silva JF Neumaier N Farias JR Nepomuceno AL 《Genetics and molecular research : GMR》2004,3(4):474-482

A lack of pliant software tools that support small- to medium-scale DNA sequencing efforts is a major hindrance for recording and using laboratory workflow information to monitor the overall quality of data production. Here we describe VSQual, a set of Perl programs intended to provide simple and powerful tools to check several quality features of the sequencing data generated by automated DNA sequencing machines. The core program of VSQual is a flexible Perl-based pipeline, designed to be accessible and useful for both programmers and non-programmers. This pipeline directs the processing steps and can be easily customized for laboratory needs. Basically, the raw DNA sequencing trace files are processed by Phred and Cross_match, then the outputs are parsed, reformatted into Web-based graphical reports, and added to a Web site structure. The result is a set of real time sequencing reports easily accessible to and understood by general laboratory staff. These reports facilitate the monitoring of DNA sequencing as well as the management of laboratory workflow, significantly reducing operational costs and ensuring high quality and scientifically reliable results.

12.

Statistical modeling of sequencing errors in SAGE libraries

Beissbarth T Hyde L Smyth GK Job C Boon WM Tan SS Scott HS Speed TP 《Bioinformatics (Oxford, England)》2004,20(Z1):i31-i39

13.

Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform

Melanie Schirmer Umer Z. Ijaz Rosalinda D'Amore Neil Hall William T. Sloan Christopher Quince 《Nucleic acids research》2015,43(6):e37

With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs, Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%.

14.

DNA sequencing and gene structure. Total citations: 11 (self-citations: 0, citations by others: 11)

Walter Gilbert 《Bioscience reports》1981,1(5):353-375

15.

DNA sequencing and helix–coil transition. III. DNA sequencing

M. Ya. Azbel 《Biopolymers》1980,19(1):95-109

We show that the fine oscillatory structure of the DNA melting curve can be used to determine explicitly the nucleotide composition and the order of certain domains within the DNA. If DNA is specifically fragmented, the order of fragments can be learned directly from a comparison of the differential melting curves of the nonfragmented and fragmented DNA. The indicated information may complement exact methods of DNA sequencing. The proposed analysis is applied to bacteriophage ΦX-174, whose melting curve is known. Compared to the known ΦX-174 DNA sequence, the results of the analysis are found to be very accurate.

16.

BiQ Analyzer: visualization and quality control for DNA methylation data from bisulfite sequencing. Total citations: 11 (self-citations: 0, citations by others: 11)

Bock C Reither S Mikeska T Paulsen M Walter J Lengauer T 《Bioinformatics (Oxford, England)》2005,21(21):4067-4068

SUMMARY: Manual processing of DNA methylation data from bisulfite sequencing is a tedious and error-prone task. Here we present an interactive software tool that provides start-to-end support for this process. In an easy-to-use manner, the tool helps the user to import the sequence files from the sequencer, to align them, to exclude or correct critical sequences, to document the experiment, to perform basic statistics and to produce publication-quality diagrams. Emphasis is put on quality control: The program automatically assesses data quality and provides warnings and suggestions for dealing with critical sequences. The BiQ Analyzer program is implemented in the Java programming language and runs on any platform for which a recent Java virtual machine is available. AVAILABILITY: The program is available without charge for non-commercial users and can be downloaded from http://biq-analyzer.bioinf.mpi-inf.mpg.de/

17.

Detecting the impact of sequencing errors on SAGE data

Colinge J Feger G 《Bioinformatics (Oxford, England)》2001,17(9):840-842

SAGE data are obtained by sequencing short DNA tags. Due to mistakes in DNA sequencing, SAGE data contain errors. We propose a new approach to identify tags whose abundance is biased by sequencing errors. This approach is based on a concept of neighbourhood: abundant tags can contaminate tags whose sequence is very close. The application of our approach reveals that moderately abundant tags can be generated solely by sequencing errors. It also allows for detecting correct rare tags. AVAILABILITY: Software is available only to non-profit entities and for non-commercial purposes upon request.
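
The neighbourhood idea can be sketched as follows, under simplifying assumptions that are not taken from the paper: two tags are treated as "very close" when they differ by a single substitution, and a tag is flagged when a neighbour is at least `ratio` times more abundant. The function names, the threshold, and the tag counts are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length tags."""
    if len(a) != len(b):
        raise ValueError("tags must have the same length")
    return sum(x != y for x, y in zip(a, b))


def suspect_tags(counts: dict[str, int], ratio: float = 50.0) -> dict[str, list[str]]:
    """Map each flagged tag to its much more abundant one-mismatch neighbours."""
    suspects = {}
    for tag, n in counts.items():
        neighbours = [
            other
            for other, m in counts.items()
            if other != tag
            and len(other) == len(tag)
            and hamming(tag, other) == 1
            and m >= ratio * n
        ]
        if neighbours:
            suspects[tag] = sorted(neighbours)
    return suspects


# Hypothetical tag counts: the second tag is one substitution away from the first.
counts = {"ACGTACGTAC": 1000, "ACGTACGTAA": 3, "TTTTGGGGCC": 2}
print(suspect_tags(counts))
```

In this toy input the rare tag with an abundant neighbour is flagged, while the equally rare tag with no close neighbour is not, which mirrors the distinction between error-generated tags and correct rare tags.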

18.

Handling long targets and errors in sequencing by hybridization.

Eran Halperin Shay Halperin Tzvika Hartman Ron Shamir 《Journal of computational biology》2003,10(3-4):483-497

Sequencing by hybridization (SBH) is a DNA sequencing technique, in which the sequence is reconstructed using its k-mer content. This content, which is called the spectrum of the sequence, is obtained by hybridization to a universal DNA array. Standard universal arrays contain all k-mers for some fixed k, typically 8 to 10. Currently, in spite of its promise and elegance, SBH is not competitive with standard gel-based sequencing methods. This is due to two main reasons: lack of tools to handle realistic levels of hybridization errors and an inherent limitation on the length of uniquely reconstructible sequence by standard universal arrays. In this paper, we deal with both problems. We introduce a simple polynomial reconstruction algorithm which can be applied to spectra from standard arrays and has provable performance in the presence of both false negative and false positive errors. We also propose a novel design of chips containing universal bases that differs from the one proposed by Preparata et al. (1999). We give a simple algorithm that uses spectra from such chips to reconstruct with high probability random sequences of length lower only by a squared log factor compared to the information theoretic bound. Our algorithm is very robust to errors and has a provable performance even if there are both false negative and false positive errors. Simulations indicate that its sensitivity to errors is also very small in practice.
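
The notion of a spectrum, and of false negative and false positive errors in it, can be made concrete with a small sketch. It only builds the k-mer set of a known sequence and perturbs it; it does not implement the reconstruction algorithm described in the abstract. The function names, the short example sequence, and k = 3 are assumptions chosen for readability.

```python
def spectrum(sequence: str, k: int) -> set[str]:
    """The set of all k-mers occurring in the sequence (its ideal spectrum)."""
    if k <= 0 or k > len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def apply_errors(ideal: set[str], false_negatives: set[str], false_positives: set[str]) -> set[str]:
    """Drop k-mers that failed to hybridize and add spurious ones."""
    return (ideal - false_negatives) | false_positives


# Hypothetical example with k = 3; real arrays typically use k of 8 to 10.
ideal = spectrum("ACGTAC", 3)
observed = apply_errors(ideal, false_negatives={"CGT"}, false_positives={"TTT"})
print(sorted(ideal))
print(sorted(observed))
```

A reconstruction algorithm receives only the observed set and must recover the original sequence despite the missing and spurious k-mers.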

19.

Evaluation of genetic diversity of Chinese Pleurotus ostreatus cultivars using DNA sequencing technology

Yu Liu Shouxian Wang Yonggang Yin Feng Xu 《Annals of microbiology》2013,63(2):571-576

Pleurotus spp. are well-known and economically important cultivated mushrooms in China. Knowledge of the genetic relationship between the Chinese cultivars is essential to the improvement of P. ostreatus strains. Sequence analysis of the internal transcribed spacers (ITS), translation elongation factor (EF1α) and the second largest subunit of RNA polymerase II (RPB2) was performed to assess the genetic diversity of Pleurotus ostreatus strains cultivated in China. The phylogenetic tree constructed using the combined results of the ITS, EF1α and RPB2 sequence analyses showed the genetic relationships between the studied strains. Our phylogenetic analyses therefore provided valuable information on the relationships among the P. ostreatus strains used in this study, which was useful for examining genetic diversity among these strains.

20.

A new and fast method for preparing high quality lambda DNA suitable for sequencing. Total citations: 15 (self-citations: 6, citations by others: 15)

G Manfioletti C Schneider 《Nucleic acids research》1988,16(7):2873-2884

A method is described for the rapid purification of high quality lambda DNA. The method can be used with either liquid or plate lysates and on a small scale or a large scale. It relies on the preadsorption of all polyanions present in the lysate to an insoluble anion-exchange matrix (DEAE or TEAE). Phage particles are then disrupted by combined treatment with EDTA/proteinase K and the resulting DNA is precipitated by the addition of the cationic detergent cetyl (or hexadecyl)-trimethyl ammonium bromide-CTAB (soluble anion-exchange matrix). The precipitated CTAB-DNA complex is then exchanged to Na-DNA and ethanol precipitated. The resultant purified DNA is suitable for enzymatic reactions and provides a high quality template for dideoxy-sequence analysis.