Found 20 similar articles.
1.
O White T Dunning G Sutton M Adams J C Venter C Fields 《Nucleic acids research》1993,21(16):3829-3838
Heterologous DNA sequences from rearrangements with the genomes of host cells, genomic fragments from hybrid cells, or impure tissue sources can threaten the purity of libraries that are derived from RNA or DNA. Hybridization methods can only detect contaminants from known or suspected heterologous sources, and whole library screening is technically very difficult. Detection of contaminating heterologous clones by sequence alignment is only possible when related sequences are present in a known database. We have developed a statistical test to identify heterologous sequences that is based on the differences in hexamer composition of DNA from different organisms. This test does not require that sequences similar to potential heterologous contaminants are present in the database, and can in principle detect contamination by previously unknown organisms. We have applied this test to the major public expressed sequence tag (EST) data sets to evaluate its utility as a quality control measure and a peer evaluation tool. There is detectable heterogeneity in most human and C. elegans EST data sets but it is not apparently associated with cross-species contamination. However, there is direct evidence for both yeast and bacterial sequence contamination in some public database sequences annotated as human. Results obtained with the hexamer test have been confirmed with similarity searches using sequences from the relevant data sets.
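As a rough illustration of the hexamer-composition idea in the abstract above, the sketch below scores a query sequence against two composition models. The abstract does not specify the statistical test, so the log-odds score, the pseudocount smoothing, and every function name here are assumptions for illustration, not the authors' method.

```python
import math
from collections import Counter
from itertools import product

K = 6  # hexamers, as in the abstract


def kmer_counts(seq, k=K):
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def kmer_model(training_seqs, k=K, pseudocount=1.0):
    """Estimate smoothed k-mer probabilities from training sequences."""
    counts = Counter()
    for s in training_seqs:
        counts.update(kmer_counts(s, k))
    total = sum(counts.values()) + pseudocount * 4 ** k
    return {
        "".join(p): (counts["".join(p)] + pseudocount) / total
        for p in product("ACGT", repeat=k)
    }


def log_odds(seq, host_model, other_model, k=K):
    """Sum of log(P_host / P_other) over the k-mers of seq.

    Positive values favour the host model, negative values the other model.
    """
    return sum(
        n * math.log(host_model[kmer] / other_model[kmer])
        for kmer, n in kmer_counts(seq, k).items()
    )
```

Under these assumptions, a strongly negative score would flag a read annotated as host as compositionally closer to the second organism. Any real use would need far larger training sets and a calibrated significance threshold.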
2.
Otto TD Vasconcellos EA Gomes LH Moreira AS Degrave WM Mendonça-Lima L Alves-Ferreira M 《Genetics and molecular research : GMR》2008,7(3):861-871
Optimizing and monitoring the data flow in high-throughput sequencing facilities is important for data input and output, for tracking the status of results for the users of the facility, and to guarantee a good, high-quality service. In a multi-user system environment with different throughputs, each user wants to access his/her data easily, track his/her sequencing history, analyze sequences and their quality, and apply some basic post-sequencing analysis, without the necessity of installing further software. Recently, Fiocruz established such a core facility as a "technological platform". Infrastructure includes a 48-capillary 3730 DNA Sequence Analyzer (Applied Biosystems) and supporting equipment. The service includes running samples for large-scale users, performing DNA sequencing reactions and runs for medium and small users, and participation in partial or full genome projects. We implemented a workflow that fulfills these requirements for small and high throughput users. Our implementation also includes the monitoring of data for continuous quality improvement (reports by plate, month and user) by the sequencing staff. For the user, different analyses of the chromatograms, such as visualization of good quality regions, as well as processing, such as comparisons or assemblies, are available. So far, 180 users have made use of the service, generating 155,000 sequences, 35% of which were produced for the BCG Moreau-RJ genome project. The pipeline (named ChromaPipe for Chromatogram Pipeline) is available for download by the scientific community at the URL http://bioinfo.pdtis.fiocruz.br/ChromaPipe/. The support for assembly is also configured as a web service: http://bioinfo.pdtis.fiocruz.br/Assembly/.
3.
4.
BiQ Analyzer: visualization and quality control for DNA methylation data from bisulfite sequencing (Total citations: 11, self-citations: 0, citations by others: 11)
Bock C Reither S Mikeska T Paulsen M Walter J Lengauer T 《Bioinformatics (Oxford, England)》2005,21(21):4067-4068
SUMMARY: Manual processing of DNA methylation data from bisulfite sequencing is a tedious and error-prone task. Here we present an interactive software tool that provides start-to-end support for this process. In an easy-to-use manner, the tool helps the user to import the sequence files from the sequencer, to align them, to exclude or correct critical sequences, to document the experiment, to perform basic statistics and to produce publication-quality diagrams. Emphasis is put on quality control: The program automatically assesses data quality and provides warnings and suggestions for dealing with critical sequences. The BiQ Analyzer program is implemented in the Java programming language and runs on any platform for which a recent Java virtual machine is available. AVAILABILITY: The program is available without charge for non-commercial users and can be downloaded from http://biq-analyzer.bioinf.mpi-inf.mpg.de/
5.
Todesco S Campagna D Levorin F D'Angelo M Schiavon R Valle G Vezzi A 《BioTechniques》2008,44(1):60, 62, 64
Genome sequencing projects are either based on whole genome shotgun (WGS) or on a BAC-by-BAC strategy. Although WGS is in most cases the preferred choice, sometimes the BAC-by-BAC approach may be better because it requires a much simpler assembly process. Furthermore, when the study is limited to specific regions of the genome, the WGS would require an unjustified effort, making the BAC-by-BAC the only feasible strategy. In this paper we describe an informatics pipeline called PABS (Platform Assisted BAC-by-BAC Sequencing) that we developed to provide a tool to optimize the BAC-by-BAC sequencing strategy. PABS has two main functions: (i) PABS-Select, to choose suitable overlapping clones; and (ii) PABS-Validate, to verify whether a BAC under analysis is actually overlapping the neighboring BAC.
6.
Background
Better automation, lower cost per reaction and a heightened interest in comparative genomics have led to a dramatic increase in DNA sequencing activities. Although the large sequencing projects of specialized centers are supported by in-house bioinformatics groups, many smaller laboratories face difficulties managing the appropriate processing and storage of their sequencing output. The challenges include documentation of clones, templates and sequencing reactions, and the storage, annotation and analysis of the large number of generated sequences.
7.
We have developed an automated system for management of DNA sequencing projects. The system, named GEL, can handle data from both random sequences and from fragments whose relative positions are known. The system is highly interactive, self-documenting, and forgiving; it is designed for use by computer-naive molecular biologists. An editor designed specifically for sequences allows simple entry of data. Special functions allow direct checking and immediate editing of paired readings of the same gel. Merging of new random fragment sequences into the project as a whole is semi-automated. The user is shown probable overlaps if they exist, and can edit either the sequences or the consensus. Heuristic approaches to limiting the kinds of searches made in the merging process reduce the problem of combinatoric data overload as sequencing projects grow large. Complete histories of all entries, editing changes, and generation of consensus sequences are automatically prepared.
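The "probable overlaps" step mentioned above can be pictured with a simple suffix-prefix comparison. This is a minimal sketch under stated assumptions: the abstract does not describe GEL's actual search heuristics, so the function names, the minimum overlap length, and the mismatch tolerance are all hypothetical, and reverse-complement matches are ignored for brevity.

```python
def suffix_prefix_overlap(a, b, min_len=20, max_mismatch_frac=0.05):
    """Length of the longest suffix of a that matches a prefix of b, or 0.

    A candidate is accepted when its mismatches do not exceed
    max_mismatch_frac of the overlap length (an assumed tolerance).
    """
    longest = min(len(a), len(b))
    for length in range(longest, min_len - 1, -1):
        tail, head = a[-length:], b[:length]
        mismatches = sum(x != y for x, y in zip(tail, head))
        if mismatches <= max_mismatch_frac * length:
            return length
    return 0


def probable_overlaps(new_read, contigs, min_len=20):
    """List (contig_name, direction, overlap_length) candidates for review."""
    hits = []
    for name, contig in contigs.items():
        right = suffix_prefix_overlap(contig, new_read, min_len)
        left = suffix_prefix_overlap(new_read, contig, min_len)
        if right:
            hits.append((name, "extends_right", right))
        if left:
            hits.append((name, "extends_left", left))
    # Longest overlaps first, so the most convincing candidates are shown first.
    return sorted(hits, key=lambda h: -h[2])
```

Returning a ranked candidate list rather than merging automatically mirrors the semi-automated workflow the abstract describes, in which the user confirms or edits each proposed overlap.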
8.
Rapid advances in second- and even third-generation sequencing technologies have made whole-genome sequencing a routine procedure. However, the methods for assembling the obtained sequences, and the results of assembly, require special consideration. Modern assemblers are based on heuristic algorithms, which lead to fragmented genome assemblies composed of scaffolds and contigs of different lengths, whose order along the chromosome and chromosomal assignment often remain unknown. In this regard, the resulting genome sequence can only be considered a draft assembly. The quality and reliability of a draft assembly can be improved substantially by targeted sequencing of genome elements of different sizes, e.g., chromosomes, chromosomal regions, and DNA fragments cloned in different vectors, as well as by using a reference genome, optical mapping, and Hi-C technology. This approach, in addition to simplifying the assembly of the genome draft, will more accurately identify numerical and structural chromosomal variations and abnormalities in the genomes of the studied species. In this review, we discuss the key technologies for genome sequencing and de novo assembly, as well as different approaches to improving the quality of existing draft genome sequences.
9.
Next generation sequencing (NGS) technologies provide a high-throughput means to generate large amounts of sequence data. However, quality control (QC) of sequence data generated from these technologies is extremely important for meaningful downstream analysis. Further, highly efficient and fast processing tools are required to handle the large volume of datasets. Here, we have developed an application, NGS QC Toolkit, for quality check and filtering of high-quality data. This toolkit is a standalone and open source application freely available at http://www.nipgr.res.in/ngsqctoolkit.html. All the tools in the application have been implemented in the Perl programming language. The toolkit comprises user-friendly tools for QC of sequencing data generated using Roche 454 and Illumina platforms, and additional tools to aid QC (sequence format converter and trimming tools) and analysis (statistics tools). A variety of options have been provided to facilitate the QC at user-defined parameters. The toolkit is expected to be very useful for the QC of NGS data to facilitate better downstream analysis.
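To make the idea of quality filtering at user-defined parameters concrete, here is a minimal Python sketch of a read-level filter. It is not code from NGS QC Toolkit (which is written in Perl). The function names, the Phred+33 encoding, and the default thresholds (quality 20, 70% of bases) are assumptions chosen for illustration.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_qc(quality_string, min_q=20, min_fraction=0.7, offset=33):
    """True if at least min_fraction of the bases have quality >= min_q."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        return False
    good = sum(q >= min_q for q in scores)
    return good / len(scores) >= min_fraction


def filter_fastq(lines, **kwargs):
    """Yield (header, sequence, quality) for 4-line FASTQ records that pass QC."""
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, _, qual = records[i:i + 4]
        if passes_qc(qual, **kwargs):
            yield header, seq, qual
```

For example, `list(filter_fastq(open("reads.fastq"), min_q=30))` would keep only reads in which at least 70% of bases reach Q30. The file name is hypothetical, and older Illumina data may need `offset=64`.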
10.
Adriana Malena Boris Pantic Doriana Borgia Gianluca Sgarbi Giancarlo Solaini Ian J. Holt 《Autophagy》2016,12(11):2098-2112
Pathological mutations in the mitochondrial DNA (mtDNA) produce a diverse range of tissue-specific diseases and the proportion of mutant mitochondrial DNA can increase or decrease with time via segregation, dependent on the cell or tissue type. Previously we found that adenocarcinoma (A549.B2) cells favored wild-type (WT) mtDNA, whereas rhabdomyosarcoma (RD.Myo) cells favored mutant (m3243G) mtDNA. Mitochondrial quality control (mtQC) can purge the cells of dysfunctional mitochondria via mitochondrial dynamics and mitophagy and appears to offer the perfect solution to the human diseases caused by mutant mtDNA. In A549.B2 and RD.Myo cybrids, with various mutant mtDNA levels, mtQC was explored together with macroautophagy/autophagy and bioenergetic profile. The 2 types of tumor-derived cell lines differed in bioenergetic profile and mitophagy, but not in autophagy. A549.B2 cybrids displayed upregulation of mitophagy, increased mtDNA removal, mitochondrial fragmentation and mitochondrial depolarization on incubation with oligomycin, parameters that correlated with mutant load. Conversely, heteroplasmic RD.Myo lines had lower mitophagic markers that negatively correlated with mutant load, combined with a fully polarized and highly fused mitochondrial network. These findings indicate that pathological mutant mitochondrial DNA can modulate mitochondrial dynamics and mitophagy in a cell-type dependent manner and thereby offer an explanation for the persistence and accumulation of deleterious variants.
11.
An improved strategy for fluorescence-labeled dideoxy chain termination sequencing involving restriction enzyme-digested DNA fragments as primers, which are prepared from the DNA to be sequenced, is described. By using modified nucleoside triphosphates for strand protection in chain termination reactions, newly synthesized chains were detached from a primer at the regenerated recognition site by means of suitable restriction enzyme digestion. The digests could be analyzed with commercial automated DNA sequencers. Thus, by using restriction DNA fragments (double-stranded) as primers, sequence information was obtained from both "minus" and "plus" single-stranded DNA templates without subcloning. Nor is the synthesis of oligonucleotide primers needed. This method, named "Multi-Priming Sequencing," was proven to be time-saving, economical, and effective compared to conventional methods.
12.
A multipurpose cloning site has been introduced into the gene for beta-galactosidase (beta-D-galactoside galactohydrolase, EC 3.2.1.23) on the single-stranded DNA phage M13mp2 (Gronenborn, B. and Messing, J., (1978) Nature 272, 375-377) with the use of synthetic DNA. The site contributes 14 additional codons and does not affect the ability of the lac gene product to undergo intracistronic complementation. Two restriction endonuclease cleavage sites in the viral gene II were removed by single base-pair mutations. Using the new phage M13mp7, DNA fragments generated by cleavage with a variety of different restriction endonucleases can be cloned directly. The nucleotide sequences of the cloned DNAs can be determined rapidly by DNA synthesis using chain terminators and a synthetic oligonucleotide primer complementary to 15 bases preceding the new array of restriction sites.
13.
14.
Double stranded DNA sequencing as a choice for DNA sequencing. (Total citations: 6, self-citations: 0, citations by others: 6)
15.
DNA sequencing: bench to bedside and beyond (Total citations: 3, self-citations: 1, citations by others: 3)
Hutchison CA 《Nucleic acids research》2007,35(18):6227-6237
Fifteen years elapsed between the discovery of the double helix (1953) and the first DNA sequencing (1968). Modern DNA sequencing began in 1977, with development of the chemical method of Maxam and Gilbert and the dideoxy method of Sanger, Nicklen and Coulson, and with the first complete DNA sequence (phage ϕX174), which demonstrated that sequence could give profound insights into genetic organization. Incremental improvements allowed sequencing of molecules >200 kb (human cytomegalovirus) leading to an avalanche of data that demanded computational analysis and spawned the field of bioinformatics. The US Human Genome Project spurred sequencing activity. By 1992 the first ‘sequencing factory’ was established, and others soon followed. The first complete cellular genome sequences, from bacteria, appeared in 1995 and other eubacterial, archaebacterial and eukaryotic genomes were soon sequenced. Competition between the public Human Genome Project and Celera Genomics produced working drafts of the human genome sequence, published in 2001, but refinement and analysis of the human genome sequence will continue for the foreseeable future. New ‘massively parallel’ sequencing methods are greatly increasing sequencing capacity, but further innovations are needed to achieve the ‘thousand dollar genome’ that many feel is prerequisite to personalized genomic medicine. These advances will also allow new approaches to a variety of problems in biology, evolution and the environment.
16.
Planet E Attolini CS Reina O Flores O Rossell D 《Bioinformatics (Oxford, England)》2012,28(4):589-590
We provide a Bioconductor package with quality assessment, processing and visualization tools for high-throughput sequencing data, with emphasis on ChIP-seq and RNA-seq studies. It includes detection of outliers and biases, inefficient immuno-precipitation and overamplification artifacts, de novo identification of read-rich genomic regions and visualization of the location and coverage of genomic region lists. AVAILABILITY: www.bioconductor.org.
17.
Badrick T 《The Clinical biochemist. Reviews / Australian Association of Clinical Biochemists》2008,29(Z1):S67-S70
Quality Control System: an understanding of analytical error; synthetic QC material; a set of QC rules; a process to follow if the rules signal. Quality Control (QC) Sera: reconstitution - staff trained; stability tested - post reconstitution and frozen. QC Rules: rules documented - basis of adoption; action to follow in case of failure documented; evidence of this procedure being used in place; are QC rules defined for both batch and continuous analysis - how is a 'run' defined for a continuous analytical process; means and standard deviations (SDs) of controls based on sufficient data points and reflects true state of system; evidence of staff training in the interpretation of QC rules; process documented; evidence of training of staff; evidence of regular review of Internal QC results. Patient-based QC Procedures in place: if delta check/anion gap/rerun of samples used, then a documented procedure to describe the process and evidence of it being in use; critical values - documented and evidence of use and documentation. Action on QC Rule Failure: documented process to follow with patient samples if control failure occurs; evidence that procedure has been followed in instances of control failure. External Quality Assessment (EQA) Program: Integration of Internal and External QC data.
18.
Background
Originally a binary classifier, Lot Quality Assurance Sampling (LQAS) has proven to be a useful tool for classification of the prevalence of Schistosoma mansoni into multiple categories (≤10%, >10 and <50%, ≥50%), and semi-curtailed sampling has been shown to effectively reduce the number of observations needed to reach a decision. To date the statistical underpinnings for Multiple Category-LQAS (MC-LQAS) have not received full treatment. We explore the analytical properties of MC-LQAS, and validate its use for the classification of S. mansoni prevalence in multiple settings in East Africa.
Methodology
We outline MC-LQAS design principles and formulae for operating characteristic curves. In addition, we derive the average sample number for MC-LQAS when utilizing semi-curtailed sampling and introduce curtailed sampling in this setting. We also assess the performance of MC-LQAS designs with maximum sample sizes of n = 15 and n = 25 via a weighted kappa-statistic using S. mansoni data collected in 388 schools from four studies in East Africa.
Principal Findings
Overall performance of MC-LQAS classification was high (kappa-statistic of 0.87). In three of the studies, the kappa-statistic for a design with n = 15 was greater than 0.75. In the fourth study, where these designs performed poorly (kappa-statistic less than 0.50), the majority of observations fell in regions where potential error is known to be high. Employment of semi-curtailed and curtailed sampling further reduced the sample size by as many as 0.5 and 3.5 observations per school, respectively, without increasing classification error.
Conclusion/Significance
This work provides the needed analytics to understand the properties of MC-LQAS for assessing the prevalence of S. mansoni and shows that in most settings a sample size of 15 children provides a reliable classification of schools.
19.
DNA sequencing separations in capillary gels on a modified commercial DNA sequencing instrument (Total citations: 2, self-citations: 0, citations by others: 2)
DNA sequencing separations of standard DNA fragments of known sequence have been achieved in small diameter capillary gels electrophoresed and analyzed in parallel in a modified commercial DNA sequencer instrument. DNA sequencing in terms of base-calling accuracy is comparable to conventional slab gels; however, the separations in the capillary were performed somewhat faster and required less sample than those in the slab gel. Advantages of this approach vs. separations on conventional slab gels are discussed.
20.
The rapid DNA sequencing system based on the single-stranded bacteriophage M13 and the chain-terminator method has been used to look directly for mutational alterations. A small DNA fragment that primes DNA synthesis through the N-terminal 200 base pairs of the beta-galactosidase gene was prepared, and used to detect changes in base sequence among phages that give white plaques after treatment of the host cells with bleomycin. Bleomycin treatment of E. coli in which M13 mp2 was growing gave an increase in white plaque frequency. DNA sequence analysis of phage from 7 independent mutant plaques showed them all to have a frameshift mutation.