Similar literature
 20 similar articles found (search time: 15 ms)
1.
DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies.

2.
DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%–20%), contamination-adjusted calls eliminate 48%–77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%.

3.
4.
Modified purine and pyrimidine bases constitute one of the major classes of hydroxyl-radical-mediated DNA damage together with oligonucleotide strand breaks, DNA-protein cross-links and abasic sites. A comprehensive survey of the main available data on both structural and mechanistic aspects of •OH-induced decomposition pathways of both purine and pyrimidine bases of isolated DNA and model compounds is presented. In this respect, detailed information is provided on both thymine and guanine whereas data are not as complete for adenine and cytosine. The second part of the overview is dedicated to the formation of •OH-induced base lesions within cellular DNA and in vivo situations. Before addressing this major point, the main available methods aimed at singling out •OH-mediated base modifications are critically reviewed. Unfortunately, it is clear that the bulk of the chemical and biochemical assays with the exception of the high performance liquid chromatographic-electrochemical detection (HPLC/ECD) method have suffered from major drawbacks. This explains why there are only a few available accurate data concerning both the qualitative and quantitative aspects of the •OH-induced formation of base damage within cellular DNA. Therefore, major efforts should be devoted to the reassessment of the level of oxidative base damage in cellular DNA using appropriate assays including suitable conditions of DNA extraction.

5.
The contamination of cell cultures by mycoplasmas remains a major problem in cell culture. Mycoplasmas can produce a virtually unlimited variety of effects in the cultures they infect. These organisms are resistant to most antibiotics commonly employed in cell cultures. Here we provide a concise overview of the current knowledge on: (1) the incidence and sources of mycoplasma contamination in cell cultures, the mycoplasma species most commonly detected in cell cultures, and the effects of mycoplasmas on the function and activities of infected cell cultures; (2) the various techniques available for the detection of mycoplasmas with particular emphasis on the most reliable detection methods; (3) the various methods available for the elimination of mycoplasmas highlighting antibiotic treatment; and (4) the recommended procedures and working protocols for the detection, elimination and prevention of mycoplasma contamination. The availability of accurate, sensitive and reliable detection methods and the application of robust and successful elimination methods provide powerful means for overcoming the problem of mycoplasma contamination in cell cultures. This revised version was published online in August 2006 with corrections to the Cover Date.

6.
Endotoxins liberated by gram-negative bacteria are frequent contaminants of protein solutions derived from bioprocesses. Because of their high toxicity in vivo and in vitro, their removal is essential for safe parenteral administration. A general method for the removal of endotoxins from protein solutions is not available. Methods used for decontamination of water, such as ultrafiltration, have little effect on endotoxin levels in protein solutions. Various techniques described in the patent literature are not broadly applicable, as they are tailored to meet specific product requirements. Besides ion-exchangers and two-phase extraction, affinity techniques are applied with varying success. Tailor-made endotoxin-selective adsorber matrices for the prevention of endotoxin contamination and for endotoxin removal are also discussed. After giving an overview of the properties of endotoxins and the significance of endotoxin contamination, this review intends to provide an overall picture of the various methods employed for their removal. Ways to optimise a method with regard to the specific properties of endotoxins in aqueous solution are pointed out.

7.
Bifidobacteria are an important group of the human intestinal microbiota that have been shown to exert a number of beneficial probiotic effects on the health status of their host. Due to these effects, bifidobacteria have attracted strong interest in health care and food industries for probiotic applications and several species are listed as so-called "generally recognized as safe" (GRAS) microorganisms. Moreover, recent studies have pointed out their potential as an alternative or supplementary strategy in tumor therapy or as live vaccines. In order to study the mechanisms by which these organisms exert their beneficial effects and to generate recombinant strains that can be used as drug delivery vectors or live vaccines, appropriate molecular tools are indispensable. This review provides an overview of the currently available methods and tools to generate recombinant strains of bifidobacteria. The currently used protocols for transformation of bifidobacteria, as well as replicons, selection markers, and determinants of expression, will be summarized. We will further discuss promoters, terminators, and localization signals that have been used for successful generation of expression vectors.

8.
9.
MOTIVATION: Protein families evolve a multiplicity of functions through gene duplication, speciation and other processes. As a number of studies have shown, standard methods of protein function prediction produce systematic errors on these data. Phylogenomic analysis--combining phylogenetic tree construction, integration of experimental data and differentiation of orthologs and paralogs--has been proposed to address these errors and improve the accuracy of functional classification. The explicit integration of structure prediction and analysis in this framework, which we call structural phylogenomics, provides additional insights into protein superfamily evolution. RESULTS: Results of protein functional classification using phylogenomic analysis show fewer expected false positives overall than when pairwise methods of functional classification are employed. We present an overview of the motivations and fundamental principles of phylogenomic analysis, new methods developed for the key tasks, benchmark datasets for these tasks (when available) and suggest procedures to increase accuracy. We also discuss some of the methods used in the Celera Genomics high-throughput phylogenomic classification of the human genome. AVAILABILITY: Software tools from the Berkeley Phylogenomics Group are available at http://phylogenomics.berkeley.edu

10.
MOTIVATION: Ranking gene feature sets is a key issue for both phenotype classification, for instance, tumor classification in a DNA microarray experiment, and prediction in the context of genetic regulatory networks. Two broad methods are available to estimate the error (misclassification rate) of a classifier. Resubstitution fits a single classifier to the data, and applies this classifier in turn to each data observation. Cross-validation (in leave-one-out form) removes each observation in turn, constructs the classifier, and then computes whether this leave-one-out classifier correctly classifies the deleted observation. Resubstitution typically underestimates classifier error, severely so in many cases. Cross-validation has the advantage of producing an effectively unbiased error estimate, but the estimate is highly variable. In many applications it is not the misclassification rate per se that is of interest, but rather the construction of gene sets that have the potential to classify or predict. Hence, one needs to rank feature sets based on their performance. RESULTS: A model-based approach is used to compare the ranking performances of resubstitution and cross-validation for classification based on real-valued feature sets and for prediction in the context of probabilistic Boolean networks (PBNs). For classification, a Gaussian model is considered, along with classification via linear discriminant analysis and the 3-nearest-neighbor classification rule. Prediction is examined in the steady-distribution of a PBN. Three metrics are proposed to compare feature-set ranking based on error estimation with ranking based on the true error, which is known owing to the model-based approach. In all cases, resubstitution is competitive with cross-validation relative to ranking accuracy. This is in addition to the enormous savings in computation time afforded by resubstitution.
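
The two error estimators described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the one-dimensional features, the class means of 0 and 1, the sample size, and all function names are assumptions chosen for illustration. Only the 3-nearest-neighbor rule and the definitions of resubstitution and leave-one-out cross-validation come from the abstract.

```python
import random


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query` (1-D features)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - query))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if 2 * votes_for_one > k else 0


def resubstitution_error(data, k=3):
    """Build the classifier from all data, then test it on the same data."""
    wrong = sum(knn_predict(data, x, k) != y for x, y in data)
    return wrong / len(data)


def leave_one_out_error(data, k=3):
    """Remove each observation in turn and classify it using the remaining ones."""
    wrong = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        wrong += knn_predict(rest, x, k) != y
    return wrong / len(data)


def make_sample(n_per_class, rng):
    """Two unit-variance Gaussian classes with means 0 and 1 (illustrative assumption)."""
    class0 = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    class1 = [(rng.gauss(1.0, 1.0), 1) for _ in range(n_per_class)]
    return class0 + class1


rng = random.Random(0)
sample = make_sample(20, rng)
resub = resubstitution_error(sample)
loo = leave_one_out_error(sample)
print(f"resubstitution: {resub:.3f}  leave-one-out: {loo:.3f}")
```

With the 3-nearest-neighbor rule, each point votes for its own label under resubstitution, so the resubstitution estimate can never exceed the leave-one-out estimate on the same sample (barring ties in distance). This is one concrete view of the optimistic bias the abstract mentions.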

11.
Human toxocariasis (HT) is a zoonotic disease caused by infection with the larval stage of Toxocara canis, the intestinal roundworm of dogs. Infection can be associated with a wide clinical spectrum varying from asymptomatic to severe organ injury. While the incidence of symptomatic human toxocariasis appears to be low, infection of the human population is widespread. In Cuba, a clear overview on the status of the disease is lacking. Here, we review the available information on toxocariasis in Cuba as a first step to estimate the importance of the disease in the country. Findings are discussed and put in a broader perspective. Data gaps are identified and suggestions on how to address these are presented. The available country data suggest that Toxocara infection of the definitive dog host and environmental contamination with Toxocara spp. eggs is substantial, but information on HT is less conclusive. The availability of adequate diagnostic tools in the country should be guaranteed. Dedicated studies are needed for a reliable assessment of the impact of toxocariasis in Cuba and the design of prevention or control strategies.

12.
Hilgers LJ, Herr C. Theriogenology, 1993, 40(5): 923-932
Commonly used reagents in the culture and transfer of embryos are isolated from blood and tissue samples and thus have the potential for chromosomal and/or mitochondrial DNA contamination. In this study, we evaluated the results obtained from PCR analysis of bovine trypsin, bovine sera, and bovine albumin precipitates. Bovine sera samples that were tested yielded minor to heavy DNA contamination signals depending on the manufacturer and specific type of sera. Bovine albumin precipitates showed very little DNA contamination or none at all. Bovine trypsin samples yielded moderate DNA contamination signals depending on the ability of the trypsin to be inactivated prior to PCR analysis.

13.
The use of fluorescent nucleic acid hybridization probes that generate a fluorescence signal only when they bind to their target enables real-time monitoring of nucleic acid amplification assays. Real-time nucleic acid amplification assays markedly improve the ability to obtain qualitative and quantitative results. Furthermore, these assays can be carried out in sealed tubes, eliminating carryover contamination. Fluorescent nucleic acid hybridization probes are available in a wide range of different fluorophore and quencher pairs. Multiple hybridization probes, each designed for the detection of a different nucleic acid sequence and each labeled with a differently colored fluorophore, can be added to the same nucleic acid amplification reaction, enabling the development of high-throughput multiplex assays. In order to develop robust, highly sensitive and specific real-time nucleic acid amplification assays it is important to carefully select the fluorophore and quencher labels of hybridization probes. Selection criteria are based on the type of hybridization probe used in the assay, the number of targets to be detected, and the type of apparatus available to perform the assay. This article provides an overview of different aspects of choosing appropriate labels for the different types of fluorescent hybridization probes used with different types of spectrofluorometric thermal cyclers currently available.

14.
Cytosine methylation is the quintessential epigenetic mark. Two well-established methods, bisulfite sequencing and methyl-DNA immunoprecipitation (MeDIP), lend themselves to the genome-wide analysis of DNA methylation by high throughput sequencing. Here we provide an overview and brief review of these methods. We summarize our experience with MeDIP followed by high throughput Illumina/Solexa sequencing, exemplified by the analysis of the methylated fraction of the Neurospora crassa genome ("methylome"). We provide detailed methods for DNA isolation, processing and the generation of in vitro libraries for Illumina/Solexa sequencing. We discuss potential problems in the generation of sequencing libraries. Finally, we provide an overview of software that is appropriate for the analysis of high throughput sequencing data generated by Illumina/Solexa-type sequencing by synthesis, with a special emphasis on approaches and applications that can generate more accurate depictions of sequence reads that fall in repeated regions of a chosen reference genome.

15.
We study the problem of selecting control clones in DNA array hybridization experiments. The problem arises in the OFRG method for analyzing microbial communities. The OFRG method performs classification of rRNA gene clones using binary fingerprints created from a series of hybridization experiments, where each experiment consists of hybridizing a collection of arrayed clones with a single oligonucleotide probe. This experiment produces analog signals, one for each clone, which then need to be classified, that is, converted into binary values 1 and 0 that represent hybridization and non-hybridization events. In addition to the sample rRNA gene clones, the array contains a number of control clones needed to calibrate the classification procedure of the hybridization signals. These control clones must be selected with care to optimize the classification process. We formulate this as a combinatorial optimization problem called Balanced Covering. We prove that the problem is NP-hard, and we show some results on hardness of approximation. We propose approximation algorithms based on randomized rounding, and we show that, with high probability, our algorithms approximate the optimum solution well. The experimental results confirm that the algorithms find high-quality control clones. The algorithms have been implemented and are publicly available as part of the software package called CloneTools.
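
The signal-classification step mentioned in this abstract (turning one analog hybridization signal per clone into a 1 or a 0, calibrated by control clones) can be pictured with a small sketch. The abstract does not state the calibration rule, so the midpoint threshold below is purely an assumption for illustration. It is not the OFRG procedure or the CloneTools implementation, and all names and signal values are hypothetical.

```python
from statistics import mean


def calibrate_threshold(positive_controls, negative_controls):
    """Assumed rule: midpoint between the mean signals of the two control groups."""
    return (mean(positive_controls) + mean(negative_controls)) / 2.0


def binarize(signals, threshold):
    """Map each analog signal to 1 (hybridization) or 0 (non-hybridization)."""
    return [1 if signal >= threshold else 0 for signal in signals]


# Hypothetical signals for one probe: controls known to hybridize or not,
# followed by four sample clones whose status is unknown.
positive_controls = [0.9, 0.8, 1.0]
negative_controls = [0.1, 0.2, 0.0]
sample_signals = [0.85, 0.15, 0.6, 0.3]

threshold = calibrate_threshold(positive_controls, negative_controls)
bits = binarize(sample_signals, threshold)
print(threshold, bits)
```

Repeating this for every probe and concatenating the bits per clone would give the kind of binary fingerprint the abstract refers to. Under this assumed rule, the threshold depends entirely on which control clones are chosen, which suggests why their selection matters.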

16.
17.
In order to assess occupational exposure of hospital personnel involved in the preparation and administration of antineoplastic drugs, biological and environmental monitoring are essential to identify the main exposure routes and to quantify potential health risks. If workplace contamination cannot be completely avoided, it is of utmost importance to reduce exposure to the lowest possible levels. To this aim, not only do education and training of the exposed subjects play an important role, but accurate standardized sampling techniques and analytical methods are also required. A critical overview of the most significant methods available in the literature is presented and their value is discussed, especially with respect to their sensitivity and specificity. In addition, attention is given to validation procedures and, consequently, to their reliability. The results from the most important surveys carried out at hospital departments are also discussed, with a view to improving both monitoring strategies and working conditions.

18.
A typical small-sample biomarker classification paper discriminates between types of pathology based on, say, 30,000 genes and a small labeled sample of fewer than 100 points. Some classification rule is used to design the classifier from this data, but we are given no good reason or conditions under which this algorithm should perform well. An error estimation rule is used to estimate the classification error on the population using the same data, but once again we are given no good reason or conditions under which this error estimator should produce a good estimate, and thus we do not know how well the classifier should be expected to perform. In fact, in virtually all such papers the error estimate is expected to be highly inaccurate. In short, we are given no justification for any claims. Given the ubiquity of vacuous small-sample classification papers in the literature, one could easily conclude that scientific knowledge is impossible in small-sample settings. It is not that thousands of papers overtly claim that scientific knowledge is impossible in regard to their content; rather, it is that they utilize methods that preclude scientific knowledge. In this paper, we argue to the contrary that scientific knowledge in small-sample classification is possible provided there is sufficient prior knowledge. A natural way to proceed, discussed herein, is via a paradigm for pattern recognition in which we incorporate prior knowledge in the whole classification procedure (classifier design and error estimation), optimize each step of the procedure given available information, and obtain theoretical measures of performance for both classifiers and error estimators, the latter being the critical epistemological issue. In sum, we can achieve scientific validation for a proposed small-sample classifier and its error estimate.

19.
Estimation of structure predictability for a particular protein is difficult. Many methods estimate it in an a posteriori system evaluating the final, native protein structure. The SPI scale is intended to estimate the structure predictability of a particular amino acid sequence in an a priori system. A sequence-to-structure library was created based on the complete Protein Data Bank. The tetrapeptide was selected as a unit representing a well-defined structural motif. The early-stage folding structure (a model of which was presented elsewhere) was taken as the object for protein structure classification. Seven structural forms were distinguished for structure classification. The degree of determinability was estimated for the sequence-to-structure and structure-to-sequence relations particularly interesting for threading methods. A comparative analysis of the SPI and Q7 scales with the commonly used SOV and Q3 scales is presented. The complete contingency table, supplementary materials and all the programs used are available on request.

20.