Found 20 similar documents (search time: 15 ms)
1.
There is increasing interest in studying the molecular mechanisms of recent adaptations caused by positive selection in the genomics era. Such endeavors to detect recent positive selection, however, have been severely handicapped by false positives due to the confounding impact of demography and population structure. To reduce false positives, it is critical to conduct a functional analysis to identify the true candidate genes/mutations from those that are filtered through neutrality tests. However, the extremely high cost of such functional analysis may restrict studies to a small number of model species. In particular, when the false positive rate of neutrality tests is high, the efficiency of the functional analysis will also be very low. Therefore, although recent improvements have been made in the (joint) inference of demography and selection, our ultimate goal, which is to understand the mechanism of adaptation generally in a wide variety of natural populations, may not be achieved using the currently available approaches. More attention should thus be paid to the development of more reliable tests that could not only free themselves from the confounding impact of demography and population structure but also have reasonable power to detect selection.
2.
OBJECTIVES: To examine the implications of false positive results of mammography in terms of the time lag from screening and complete mammography to the point when women with false positive results are declared free of cancer; the extra examinations, biopsies, and check ups required; and the cost of these extra procedures. DESIGN: Review of women with false positive results from the Stockholm mammography screening trial. SETTING: Department of Oncology, South Hospital, Stockholm. SUBJECTS: 352 and 150 women with false positive results of mammography from the first and second screening rounds of the Stockholm trial. MAIN OUTCOME MEASURES: Extra examinations and investigations required and the cost of these procedures. RESULTS: The 352 women from the first screening round made 1112 visits to the physician and had 397 fine needle aspiration biopsies, 187 mammograms, and 90 surgical biopsies before being declared free of cancer. After six months 64% of the women (219/342) were declared cancer free. The 150 women in the second round made 427 visits to the physician and had 145 fine needle aspiration biopsies, 70 mammograms, and 28 surgical biopsies, and after six months 73% (107/147) were declared cancer free. The follow up costs of the false positive screening results were Kr2.54m (250,000 pounds) in the first round and Kr0.85m (84,000 pounds) in the second round. Women under 50 accounted for about 41% of these costs. CONCLUSIONS: The examinations and investigations carried out after false positive mammography, especially in women under 50, and the cost of these procedures are a neglected but substantial problem.
3.
False discovery rate control has become an essential tool in any study that has a very large multiplicity problem. False discovery rate-controlling procedures have also been found to be very effective in QTL analysis, ensuring reproducible results with few falsely discovered linkages and offering increased power to discover QTL, although their acceptance has been slower than in microarray analysis, for example. This is partly because the methodological aspects of applying the false discovery rate to QTL mapping are not well developed. Our aim in this work is to lay a solid foundation for the use of the false discovery rate in QTL mapping. We review the false discovery rate criterion, the appropriate interpretation of the FDR, and alternative formulations of the FDR that appeared in the statistical and genetics literature. We discuss important features of the FDR approach, some stemming from new developments in FDR theory and methodology, which make it especially useful in linkage analysis. We review false discovery rate-controlling procedures (the BH, the resampling procedure, and the adaptive two-stage procedure) and discuss the validity of these procedures in single- and multiple-trait QTL mapping. Finally we argue that the control of the false discovery rate has an important role in suggesting, indicating the significance of, and confirming QTL and present guidelines for its use.
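As an illustration of the BH (Benjamini-Hochberg) procedure named in the abstract above, the following is a minimal Python sketch of the standard step-up rule. It is not taken from the paper; the function name `benjamini_hochberg`, the parameter `q`, and the example p-values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Standard BH step-up rule: sort the m p-values, find the largest
    rank k with p_(k) <= k * q / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value falls under its BH threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. from ten marker tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example, q=0.05))
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank threshold, as long as some larger-ranked p-value passes. The resampling and adaptive two-stage procedures mentioned in the abstract are not covered by this sketch.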
4.
5.
6.
Four closely related species of yeast possess multicopy nuclear plasmids whose shared molecular architecture demonstrates a common ancestor, despite their lack of discernible DNA sequence homology. Each plasmid encodes three proteins which have equivalent essential functions in plasmid maintenance. These three groups of proteins show markedly different degrees of conservation, so that although we have successfully aligned sequences for two groups, members of the third group have diverged to such an extent that they cannot be aligned. All the proteins are sufficiently different that they function only in conjunction with their encoding plasmid. These proteins have therefore conserved their functional interactions with the relevant DNA sequences of their particular plasmids, despite lack of amino acid sequence conservation. The maintenance of function in the face of DNA sequence divergence is analogous to the coevolution of ribosomal DNA promoters and RNA polymerase I, and suggests that molecular drive may be an important force in the evolution of these plasmids. This view is reinforced by the inconsistent phylogenetic relationships determined from the two alignment sets, and by the contradiction that the two plasmids known to be the closest related taxonomically and by their host interchangeability are suggested to be the most distant by their sequences.
7.
Robust estimation of the false discovery rate (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: Presently available methods that use p-values to estimate or control the false discovery rate (FDR) implicitly assume that p-values are continuously distributed and based on two-sided tests. Therefore, it is difficult to reliably estimate the FDR when p-values are discrete or based on one-sided tests. RESULTS: A simple and robust method to estimate the FDR is proposed. The proposed method does not rely on implicit assumptions that tests are two-sided or yield continuously distributed p-values. The proposed method is proven to be conservative and have desirable large-sample properties. In addition, the proposed method was among the best performers across a series of 'real data simulations' comparing the performance of five currently available methods. AVAILABILITY: Libraries of S-plus and R routines to implement the method are freely available from www.stjuderesearch.org/depts/biostats.
8.
Data preprocessing including proper normalization and adequate quality control before complex data mining is crucial for studies using the cDNA microarray technology. We have developed a simple procedure that integrates data filtering and normalization with quantitative quality control of microarray experiments. Previously we have shown that data variability in a microarray experiment can be very well captured by a quality score q(com) that is defined for every spot, and the ratio distribution depends on q(com). Utilizing this knowledge, our data-filtering scheme allows the investigator to decide on the filtering stringency according to desired data variability, and our normalization procedure corrects the q(com)-dependent dye biases in terms of both the location and the spread of the ratio distribution. In addition, we propose a statistical model for false positive rate determination based on the design and the quality of a microarray experiment. The model predicts that a lower limit of 0.5 for the replicate concordance rate is needed in order to be certain of true positives. Our work demonstrates the importance and advantages of having a quantitative quality control scheme for microarrays.
9.
A mixture model for estimating the local false discovery rate in DNA microarray analysis (total citations: 3; self-citations: 0; citations by others: 3)
MOTIVATION: Statistical methods based on controlling the false discovery rate (FDR) or positive false discovery rate (pFDR) are now well established in identifying differentially expressed genes in DNA microarray. Several authors have recently raised the important issue that FDR or pFDR may give misleading inference when specific genes are of interest because they average the genes under consideration with genes that show stronger evidence for differential expression. The paper proposes a flexible and robust mixture model for estimating the local FDR, which quantifies how plausible it is that each specific gene is differentially expressed. RESULTS: We develop a special mixture model tailored to multiple testing by requiring the P-value distribution for the differentially expressed genes to be stochastically smaller than the P-value distribution for the non-differentially expressed genes. A smoothing mechanism is built in. The proposed model gives robust estimation of local FDR for any reasonable underlying P-value distributions. It also provides a single framework for estimating the proportion of differentially expressed genes, pFDR, negative predictive values, sensitivity and specificity. A cervical cancer study shows that the local FDR gives more specific and relevant quantification of the evidence for differential expression that can be substantially different from pFDR. AVAILABILITY: An R function implementing the proposed model is available at http://www.geocities.com/jg_liao/software
10.
TM or not TM: transmembrane protein prediction with low false positive rate using DAS-TMfilter (total citations: 2; self-citations: 0; citations by others: 2)
Web-based servers implementing the DAS-TMfilter algorithm have been launched at three mirror sites and their usage is described. The underlying computer program is an upgraded and modified version of the DAS-prediction method. The new server has a low false positive rate (approximately 1 among 100 unrelated queries) while the high efficiency of the original algorithm locating TM segments in queries is preserved (sensitivity of approximately 95% among documented proteins with helical TM regions). AVAILABILITY: The server operates at three mirror sites: http://mendel.imp.univie.ac.at/sat/DAS/DAS.html, http://wooster.bip.bham.ac.uk/DAS.html and http://www.enzim.hu/DAS/DAS.html. The program is available on request.
11.
The ability of Yarrowia lipolytica to produce ammonia from urea was found to be variable on some media. The colour change of the indicator in Christensen's urea agar was not due to the urease activity of this species but was a non-specific alkalization reaction. Rapid urea broth was reliable, giving no false positive results. It was found that Y. lipolytica is a urease negative yeast species.
12.
Background
High-throughput technologies, such as DNA microarray, have significantly advanced biological and biomedical research by enabling researchers to carry out genome-wide screens. One critical task in analyzing genome-wide datasets is to control the false discovery rate (FDR) so that the proportion of false positive features among those called significant is restrained. Recently a number of FDR control methods have been proposed and widely practiced, such as the Benjamini-Hochberg approach, the Storey approach and Significance Analysis of Microarrays (SAM).
Methods
This paper presents a straightforward yet powerful FDR control method termed miFDR, which aims to minimize FDR when calling a fixed number of significant features. We theoretically proved that the strategy used by miFDR is able to find the optimal number of significant features when the desired FDR is fixed.
Results
We compared miFDR with the BH approach, the Storey approach and SAM on both simulated datasets and public DNA microarray datasets. The results demonstrated that miFDR outperforms the others by identifying more significant features under the same FDR cut-offs. A literature search showed that many genes called only by miFDR are indeed relevant to the underlying biology of interest.
Conclusions
FDR control has been widely applied to the analysis of high-throughput datasets, allowing for rapid discoveries. Under the same FDR threshold, miFDR is capable of identifying more significant features than its competitors at a comparable level of complexity. Therefore, it can potentially have a great impact on biological and biomedical research.
Availability
If interested, please contact the authors to obtain miFDR.
13.
14.
L H Oliver, R S Poulsen, G T Toussaint. The Journal of Histochemistry and Cytochemistry, 1977, 25(7): 696-701
The performance of a cell recognition system on unknown data is often estimated in terms of its error rates on a test set. This paper investigates methods for producing estimates of error rates in cervical cell classification. Classification performance curves calculated using these methods are given for several classification schemes used to classify 1500 cervical cells.
15.
Acetylcholinesterase (AChE) is an important enzyme in the nervous system. It terminates signal transmission at chemical synapses by degrading the neurotransmitter acetylcholine and was found to play a role in plaque formation in Alzheimer's disease. Several functional parts of its structure have been identified in the past. Here, we use a coarse-grained anisotropic network model approach based on structure data to analyze protein mechanics of AChE. Single contacts in the protein are "switched off" and the change in the intrinsic dynamics is measured. We correlate the gained insight with information about coevolution within the molecule derived from multiple sequence alignments. More than 300 AChE sequences were aligned and the mutual information of the positions was calculated. From these structural, biophysical, and evolutionary data we could reveal sites of coevolutionary signatures in AChE, annotate them by the selective pressure induced for biophysical reasons, and further pave the way for a more detailed understanding of evolutionary boundary conditions for AChE.
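The abstract above mentions computing the mutual information of alignment positions. A minimal Python sketch of that quantity for two alignment columns follows. It uses the textbook definition with plain frequency estimates. The function name `column_mutual_information` and the toy columns are assumptions for illustration; the authors' actual pipeline (gap handling, corrections for finite sample size or phylogeny) is not described in the abstract and is not reproduced here.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a and col_b hold one residue per aligned sequence, in the same
    sequence order. Probabilities are estimated by simple counting.
    """
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")

    count_a = Counter(col_a)               # residue counts at position A
    count_b = Counter(col_b)               # residue counts at position B
    count_ab = Counter(zip(col_a, col_b))  # joint residue-pair counts

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy columns: the first pair co-varies perfectly, the second is independent.
print(column_mutual_information("AAGG", "CCTT"))
print(column_mutual_information("AAGG", "CTCT"))
```

High mutual information between two positions is commonly read as a coevolution signal, although with few sequences the raw estimate is biased upward, which is why published analyses usually apply a correction.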
16.
We propose a Dirichlet process mixture model (DPMM) for the P-value distribution in a multiple testing problem. The DPMM allows us to obtain posterior estimates of quantities such as the proportion of true null hypotheses and the probability of rejection of a single hypothesis. We describe a Markov chain Monte Carlo algorithm for computing the posterior and the posterior estimates. We propose an estimator of the positive false discovery rate based on these posterior estimates and investigate the performance of the proposed estimator via simulation. We also apply our methodology to analyze a leukemia data set.
17.
18.
Molecular beacons (MBs) have shown great potential in measurement of enzyme activities. However, currently available methods for monitoring of phosphatases only use MBs as a signal reporter. Extra substrates for the phosphatases are needed to hybridize to the MB either as a primer or as a template. Moreover, few MB-based methods have been used to detect enzyme activities in real biological samples due to insufficient sensitivity or false positive interference signals caused by nonspecific nucleases. In this work, a novel type of fluorescent probe was designed and synthesized for monitoring of phosphatases by integrating the DNA substrate and the signaling structures into a single molecule. Such a new design not only significantly simplified the probing system and greatly enhanced the sensitivity, but also offered a practical way to guard against the false-positive signal problems in the application to real samples. The unique design of the assay format should be widely applicable to many other enzymatic assays using oligonucleotide fluorescent probes.
19.
MOTIVATION: There is no widely applicable method to determine the sample size for experiments basing statistical significance on the false discovery rate (FDR). RESULTS: We propose and develop the anticipated FDR (aFDR) as a conceptual tool for determining sample size. We derive mathematical expressions for the aFDR and anticipated average statistical power. These expressions are used to develop a general algorithm to determine sample size. We provide specific details on how to implement the algorithm for k-group (k ≥ 2) comparisons. The algorithm performs well for k-group comparisons in a series of traditional simulations and in a real-data simulation conducted by resampling from a large, publicly available dataset. AVAILABILITY: Documented S-plus and R code libraries are freely available from www.stjuderesearch.org/depts/biostats.
20.
Given a set of microarray data, the problem is to detect differentially expressed genes, using a false discovery rate (FDR) criterion. As opposed to common procedures in the literature, we do not base the selection criterion on statistical significance only, but also on the effect size. Therefore, we select only those genes that are significantly more differentially expressed than some f-fold (e.g., f = 2). This corresponds to the use of an interval null domain for the effect size. Based on a simple error model, we discuss a naive estimator for the FDR, interpreted as the probability that the parameter of interest lies in the null domain (e.g., mu < log2(2) = 1) given that the test statistic exceeds a threshold. We improve the naive estimator by using deconvolution. That is, the density of the parameter of interest is recovered from the data. We study performance of the methods using simulations and real data.