Similar articles
20 similar articles found.
1.
In comparative genomics, gene order data is often modeled as signed permutations. A classical problem for genome comparison is to detect common intervals in permutations, that is, genes that are colocalized in several species, indicating that they remained grouped during evolution. A second widely studied problem related to gene order is to compute a minimum scenario of reversals that transforms one signed permutation into another. Several studies began to mix the two problems, and it was observed that their results are not always compatible: parsimonious scenarios of reversals often break common intervals. If a scenario does not break any common interval, it is called perfect. In two recent studies, Berard et al. defined a class of permutations for which a perfect scenario of reversals sorting a permutation can be built in polynomial time, and stated as an open question whether it is possible to decide, given a permutation, if there exists a minimum scenario of reversals that is perfect. In this paper, we give a solution to this problem and prove that this widens the class of permutations addressed by the aforementioned studies. We implemented and tested this algorithm on gene order data of chromosomes from several mammal species and compared it to other methods. The algorithm helps to choose among several possible scenarios of reversals and indicates that the minimum scenario of reversals is not always the most plausible.
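
As a rough illustration of the notion of a common interval used in this abstract, the following Python sketch lists the common intervals of two gene orders by brute force. It is not the algorithm of the paper: the function name `common_intervals` and the example gene orders are hypothetical, signs are ignored, and single genes are not reported.

```python
def common_intervals(perm_a, perm_b):
    """Return the common intervals of two gene orders as frozensets of genes.

    A common interval is a set of genes that is contiguous in both orders.
    Signs (orientations) are ignored. Naive O(n^2) scan, for illustration only.
    """
    genes_a = [abs(g) for g in perm_a]
    # Position of each gene in the second order.
    pos_b = {abs(g): i for i, g in enumerate(perm_b)}
    found = []
    for start in range(len(genes_a)):
        lo = hi = pos_b[genes_a[start]]
        for end in range(start + 1, len(genes_a)):
            p = pos_b[genes_a[end]]
            lo, hi = min(lo, p), max(hi, p)
            # genes_a[start..end] is contiguous in perm_b exactly when the
            # positions of its genes there span end - start + 1 slots.
            if hi - lo == end - start:
                found.append(frozenset(genes_a[start:end + 1]))
    return found


# Hypothetical example: two orders of five genes, one with an inverted gene.
intervals = common_intervals([1, 2, 3, 4, 5], [3, -1, 2, 5, 4])
```

In the abstract's terms, a scenario of reversals is perfect when none of its reversals breaks one of these gene sets.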

2.
MOTIVATION: In analyses of microarray data with a design of different biological conditions, ranking genes by their differential 'importance' is often desired so that biologists can focus research on a small subset of genes that are most likely related to the experimental conditions. Permutation methods are often recommended and used, in place of their parametric counterparts, due to the small sample sizes of microarray experiments and possible non-normality of the data. The recommendations, however, are based on classical knowledge in the hypothesis-testing setting. RESULTS: We explore the relationship between hypothesis testing and gene ranking. We indicate that the permutation method does not provide a metric for the distance between two underlying distributions. In our simulation studies, permutation methods tend to be equally or less accurate than parametric methods in ranking genes. This is partially due to the discreteness of the permutation distributions, as well as the non-metric property. In data analysis, the variability in ranking genes can be assessed by bootstrap. It turns out that the variability is much lower for permutation than for parametric methods, which agrees with the known robustness of permutation methods to individual outliers in the data.

3.
4.
Spatial extent inference (SEI) is widely used across neuroimaging modalities to adjust for multiple comparisons when studying brain-phenotype associations that inform our understanding of disease. Recent studies have shown that Gaussian random field (GRF)-based tools can have inflated family-wise error rates (FWERs). This has led to substantial controversy as to which processing choices are necessary to control the FWER using GRF-based SEI. The failure of GRF-based methods is due to unrealistic assumptions about the spatial covariance function of the imaging data. A permutation procedure is the most robust SEI tool because it estimates the spatial covariance function from the imaging data. However, the permutation procedure can fail because its assumption of exchangeability is violated in many imaging modalities. Here, we propose the (semi-) parametric bootstrap joint (PBJ; sPBJ) testing procedures that are designed for SEI of multilevel imaging data. The sPBJ procedure uses a robust estimate of the spatial covariance function, which yields consistent estimates of standard errors, even if the covariance model is misspecified. We use the methods to study the association between performance and executive functioning in a working memory functional magnetic resonance imaging study. The sPBJ has power similar to or greater than that of the PBJ and permutation procedures while maintaining the nominal type 1 error rate in reasonable sample sizes. We provide an R package to perform inference using the PBJ and sPBJ procedures.

5.
MOTIVATION: False discovery rate (FDR) is defined as the expected percentage of false positives among all the claimed positives. In practice, with the true FDR unknown, an estimated FDR can serve as a criterion to evaluate the performance of various statistical methods, provided that the estimated FDR approximates the true FDR well or, at least, does not improperly favor or disfavor any particular method. Permutation methods have become popular for estimating FDR in genomic studies. The purpose of this paper is twofold. First, we investigate theoretically and empirically whether the standard permutation-based FDR estimator is biased, and if so, whether the bias inappropriately favors or disfavors any method. Second, we propose a simple modification of the standard permutation to yield a better FDR estimator, which can in turn serve as a fairer criterion to evaluate various statistical methods. RESULTS: Both simulated and real data examples are used for illustration and comparison. Three commonly used test statistics, the sample mean, SAM statistic and Student's t-statistic, are considered. The results show that the standard permutation method overestimates FDR. The overestimation is most severe for the sample mean statistic and least severe for the t-statistic, with the SAM statistic lying between the two extremes, suggesting that one has to be cautious when using the standard permutation-based FDR estimates to evaluate various statistical methods. In addition, our proposed FDR estimation method is simple and outperforms the standard method.
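
The abstract does not spell out the estimator, so the following Python sketch shows only one commonly used form of a permutation-based FDR estimate, under stated assumptions: the average number of permuted statistics that pass a threshold, divided by the number of observed statistics that pass it. The function name `permutation_fdr` and its inputs are hypothetical, and the permuted statistics are assumed to have been computed elsewhere. It does not implement the modification the authors propose.

```python
def permutation_fdr(observed_stats, permuted_stats, threshold):
    """Estimate FDR at a threshold from precomputed permutation statistics.

    observed_stats: one test statistic per gene from the real labels.
    permuted_stats: list of statistic vectors, one per label permutation.
    A gene is "called" when |statistic| >= threshold.
    """
    called = sum(1 for s in observed_stats if abs(s) >= threshold)
    if called == 0:
        return 0.0  # nothing claimed positive, so no false positives
    # Number of genes passing the threshold in each permuted data set.
    false_counts = [
        sum(1 for s in perm if abs(s) >= threshold) for perm in permuted_stats
    ]
    expected_false = sum(false_counts) / len(permuted_stats)
    # An estimated proportion cannot exceed 1.
    return min(1.0, expected_false / called)


# Hypothetical numbers: four genes, two permutations.
fdr = permutation_fdr(
    [3.0, 2.5, 0.1, -0.2],
    [[0.5, -2.1, 0.3, 0.0], [0.1, 0.2, -0.3, 0.4]],
    threshold=2.0,
)
```

According to the abstract, an estimate of this standard kind tends to overestimate the true FDR, to a degree that depends on the test statistic.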

6.
We investigate rank-based studentized permutation methods for the nonparametric Behrens–Fisher problem, that is, inference methods for the area under the ROC curve. We prove that the studentized permutation distribution of the Brunner-Munzel rank statistic is asymptotically standard normal, even under the alternative, thus incidentally providing the hitherto missing theoretical foundation for the Neubert and Brunner studentized permutation test. In particular, we show not only its consistency but also that confidence intervals for the underlying treatment effects can be computed by inverting this permutation test. In addition, we derive permutation-based range-preserving confidence intervals. Extensive simulation studies show that the permutation-based confidence intervals appear to maintain the preassigned coverage probability quite accurately (even for rather small sample sizes). For a convenient application of the proposed methods, a freely available software package for the statistical software R has been developed. A real data example illustrates the application.

7.
Condorcet (1785) proposed that a majority vote drawn from individual, independent and fallible (but not totally uninformed) opinions provides near-perfect accuracy if the number of voters is adequately large. Research in social psychology has since then repeatedly demonstrated that collectives can and do fail more often than expected by Condorcet. Since human collective decisions often follow from exchange of opinions, these failures provide an exquisite opportunity to understand human communication of metacognitive confidence. This question can be addressed by recasting collective decision-making as an information-integration problem similar to multisensory (cross-modal) perception. Previous research in systems neuroscience shows that one brain can integrate information from multiple senses nearly optimally. Inverting the question, we ask: under what conditions can two brains integrate information about one sensory modality optimally? We review recent work that has taken this approach and report discoveries about the quantitative limits of collective perceptual decision-making, and the role of the mode of communication and feedback in collective decision-making. We propose that shared metacognitive confidence conveys the strength of an individual's opinion and its reliability inseparably. We further suggest that a functional role of shared metacognition is to provide substitute signals in situations where outcome is necessary for learning but unavailable or impossible to establish.
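
Condorcet's claim can be checked numerically with the binomial distribution. The sketch below is only an illustration under the idealized assumptions named in the abstract (independent voters, each correct with the same probability above one half); the function name `majority_accuracy` and the chosen numbers are hypothetical.

```python
from math import comb


def majority_accuracy(n_voters, p_correct):
    """Probability that a strict majority of independent voters is correct.

    Assumes an odd number of voters, each correct with probability p_correct.
    """
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1  # smallest strict majority
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


# Accuracy of the majority for growing groups of modestly informed voters.
accuracies = [majority_accuracy(n, 0.6) for n in (1, 3, 11, 101)]
```

With individual accuracy fixed at 0.6, the values in `accuracies` increase with group size, which is the idealized behaviour that, according to the abstract, real collectives often fail to match.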

8.

Background  

The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility.

9.
10.
Rich clubs arise when nodes that are ‘rich’ in connections also form an elite, densely connected ‘club’. In brain networks, rich clubs incur high physical connection costs but also appear to be especially valuable to brain function. However, little is known about the selection pressures that drive their formation. Here, we take two complementary approaches to this question: firstly we show, using generative modelling, that the emergence of rich clubs in large-scale human brain networks can be driven by an economic trade-off between connection costs and a second, competing topological term. Secondly we show, using simulated neural networks, that Hebbian learning rules also drive the emergence of rich clubs at the microscopic level, and that the prominence of these features increases with learning time. These results suggest that Hebbian learning may provide a neuronal mechanism for the selection of complex features such as rich clubs. The neural networks that we investigate are explicitly Hebbian, and we argue that the topological term in our model of large-scale brain connectivity may represent an analogous connection rule. This putative link between learning and rich clubs is also consistent with predictions that integrative aspects of brain network organization are especially important for adaptive behaviour.
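
The abstract describes rich clubs qualitatively. One common way to quantify them, assumed here rather than taken from the paper, is the rich-club coefficient: among the nodes with degree greater than k, the fraction of possible connections that actually exist, 2E/(N(N-1)) for E edges among N such nodes. The Python sketch below computes it for a simple undirected graph given as an edge list; the function name `rich_club_coefficient` and the toy graph are hypothetical.

```python
def rich_club_coefficient(edges, k):
    """Density of connections among nodes whose degree exceeds k.

    edges: iterable of (u, v) pairs of a simple undirected graph
           (no self-loops, no duplicate edges).
    Returns None when fewer than two nodes have degree > k.
    """
    edges = list(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    rich = {node for node, d in degree.items() if d > k}
    if len(rich) < 2:
        return None
    # Edges with both endpoints inside the rich set.
    internal = sum(1 for u, v in edges if u in rich and v in rich)
    possible = len(rich) * (len(rich) - 1) / 2
    return internal / possible


# Toy graph: a triangle of hubs (a, b, c), each with one peripheral neighbour.
toy = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("b", "e"), ("c", "f")]
phi = rich_club_coefficient(toy, k=1)
```

In practice this raw value is usually compared against randomized networks with the same degree sequence, a step this sketch leaves out.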

11.
A stepped-wedge cluster randomized trial (CRT) is a unidirectional crossover study in which timings of treatment initiation for clusters are randomized. Because the timing of treatment initiation is different for each cluster, an emerging question is whether the treatment effect depends on the exposure time, namely, the time duration since the initiation of treatment. Existing approaches for assessing exposure-time treatment effect heterogeneity either assume a parametric functional form of exposure time or model the exposure time as a categorical variable, in which case the number of parameters increases with the number of exposure-time periods, leading to a potential loss in efficiency. In this article, we propose a new model formulation for assessing treatment effect heterogeneity over exposure time. Rather than a categorical term for each level of exposure time, the proposed model includes a random effect to represent varying treatment effects by exposure time. This allows for pooling information across exposure-time periods and may result in more precise average and exposure-time-specific treatment effect estimates. In addition, we develop an accompanying permutation test for the variance component of the heterogeneous treatment effect parameters. We conduct simulation studies to compare the proposed model and permutation test to alternative methods to elucidate their finite-sample operating characteristics, and to generate practical guidance on model choices for assessing exposure-time treatment effect heterogeneity in stepped-wedge CRTs.

12.
The etiology of chronic Inflammatory Bowel Diseases (IBD) remains unknown, with both genetic and environmental risk factors having been implicated. A recent collaborative study of IBD provides clinical data from families with three or more affected first-degree relatives. The scientific question is whether specific clinical characteristics aggregate among affected individuals within families. Gastroenterological researchers have examined the number of concordant familial pairs in familial aggregation studies, but methods and results have been discrepant. This article investigates concepts of concordance and gives a comprehensive statistical treatment for testing concordance of various clinical traits in familial studies. For dichotomous traits, the distribution of this statistic under the null hypothesis of no familial aggregation is obtained by three methods: asymptotic, probability generating function, and permutation. The permutation method is extended to analyze aggregation for non-dichotomous traits and co-aggregations between two traits. We apply the permutation method to analyze the aforementioned multiply-affected IBD family data. Evidence is found for familial clustering of various traits, some of which are not revealed in existing studies. Such analyses provide a basis for investigating the dependence of trait aggregation upon genetic or environmental risk factors.

13.
Many traits of evolutionary interest, when placed in their developmental, physiological, or environmental contexts, are function-valued. For instance, gene expression during development is typically a function of the age of an organism and physiological processes are often a function of environment. In comparative and experimental studies, a fundamental question is whether the function-valued trait of one group is different from another. To address this question, evolutionary biologists have several statistical methods available. These methods can be classified into one of two types: multivariate and functional. Multivariate methods, including univariate repeated-measures analysis of variance (ANOVA), treat each trait as a finite list of data. Functional methods, such as repeated-measures regression, view the data as a sample of points drawn from an underlying function. A key difference between multivariate and functional methods is that functional methods retain information about the ordering and spacing of a set of data values, information that is discarded by multivariate methods. In this study, we evaluated the importance of that discarded information in statistical analyses of function-valued traits. Our results indicate that functional methods tend to have substantially greater statistical power than multivariate approaches to detect differences in a function-valued trait between groups.

14.
The lexical gap in community question answering (cQA) search, which results from the variability of language, has been recognized as an important and widespread phenomenon. To address the problem, this paper presents a question reformulation scheme that enhances the question retrieval model by exploiting paraphrases at the phrase level. It complements existing paraphrasing research, which operates at either the fine-grained lexical level or the coarse-grained sentence level, by working at an intermediate granularity. Given a question in natural language, our scheme first detects the key phrases involved by jointly integrating corpus-dependent knowledge and question-aware cues. Next, it automatically extracts paraphrases for each identified key phrase using multiple online translation engines, and then selects the most relevant reformulations from a large group of question rewrites, which is formed by exhaustively combining the generated paraphrases. Extensive evaluations on a real-world data set demonstrate that our model is able to characterize complex questions and achieves promising performance compared with state-of-the-art methods.

15.
Zhang X, Huang S, Sun W, Wang W. Genetics, 2012, 190(4): 1511-1520
Genome-wide expression quantitative trait loci (eQTL) studies have emerged as a powerful tool to understand the genetic basis of gene expression and complex traits. In a typical eQTL study, the huge number of genetic markers and expression traits and their complicated correlations present a challenging multiple-testing correction problem. The resampling-based test using permutation or bootstrap procedures is a standard approach to address the multiple-testing problem in eQTL studies. A brute force application of the resampling-based test to large-scale eQTL data sets is often computationally infeasible. Several computationally efficient methods have been proposed to calculate approximate resampling-based P-values. However, these methods rely on certain assumptions about the correlation structure of the genetic markers, which may not be valid for certain studies. We propose a novel algorithm, rapid and exact multiple testing correction by resampling (REM), to address this challenge. REM calculates the exact resampling-based P-values in a computationally efficient manner. The computational advantage of REM lies in its strategy of pruning the search space by skipping genetic markers whose upper bounds on test statistics are small. REM does not rely on any assumption about the correlation structure of the genetic markers. It can be applied to a variety of resampling-based multiple-testing correction methods including permutation and bootstrap methods. We evaluate REM on three eQTL data sets (yeast, inbred mouse, and human rare variants) and show that it achieves accurate resampling-based P-value estimation with much less computational cost than existing methods. The software is available at http://csbio.unc.edu/eQTL.

16.
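Zhao Mengmeng (赵萌萌), Xue Lingui (薛林贵). 《微生物学通报》 (Microbiology China), 2021, 48(11): 4432-4443
To better respond to the Ministry of Education's call for "golden course" construction, we took student-centered teaching as the fundamental goal of our teaching reform and the full cultivation of students' abilities and qualities as its fundamental task. Drawing on modern educational concepts and methods, we reformed the teaching of the microbiology course using a blended online and offline model. Online: (1) The Chaoxing Fanya (超星泛雅) online teaching platform provides all course-related resources. These mainly comprise basic, supplementary, and extended course resources for students' self-study outside class; chapter-level teaching designs, including self-study outlines, MOOCs by distinguished teachers, teaching interactions, and homework exercises, for pre-class self-study, in-class interaction, and after-class review; and periodic questionnaires and suggestions, for students' evaluation of the teaching and teachers' reflection on it. (2) The "Xuexitong" (学习通) mobile app enables real-time interaction between teachers and students in and out of class, including task assignment, learning checks, answering and raising questions, teaching interaction, and course feedback. Offline: With classroom teaching as the main vehicle, and based on students' online learning and the teaching design, different types of knowledge are taught in different ways. A variety of interactive teaching methods are used to consolidate self-study, reinforce key points, and explain difficult points, laying a solid foundation while cultivating students' abilities and qualities in multiple respects. The results of the reform show that students strongly endorsed both its methods and its outcomes. They considered its goals clear, its priorities well defined, and its methods well chosen, and felt that it inspired thinking, stimulated interest, effectively increased classroom participation and initiative, and comprehensively improved their own abilities and qualities.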

17.
Cambridge mathematician and philosopher W. K. Clifford (1879/1999) concluded his famous essay, "The Ethics of Belief," with the bold claim that "it is wrong always, everywhere, and for anyone to believe anything upon insufficient evidence" (p. 77). Clifford's enthusiasm for evidentialism (the principle that one should proportion one's belief to the strength of the evidence) may have been overzealous, but a plausible interpretation of his view is this: Because beliefs often have serious moral consequences, one should base one's beliefs on the evidence, and it is intellectually and morally irresponsible not to do so. This perspective motivates recent so-called "evidence-based" methods in the fields of medicine and education. Balcombe's (2000, 2001) case for replacing learning methods that require pain, suffering, and death for animals with methods that do not (computer-assisted learning, three-dimensional models, videotapes, and other alternatives) can be seen as motivated by this evidentialist perspective. Balcombe provided a wealth of empirical evidence from educational studies to show that in most contexts animal dissection is not necessary (and is even counterproductive) to achieve valid educational goals, especially higher-order goals (concept learning and problem solving). He demonstrated that no sound defense of dissection has been given. Can we learn as effectively without hurting or killing another being? If so, why do we not try? Many of the studies Balcombe cites have supported sufficiently the adequacy and, often, superiority of learning methods that do not harm animals or students. The first of the aforementioned questions is being answered; we can learn effectively with these non-detrimental methods. Those who seek to educate (and accept the principle of "do no harm") must seize the second question because they see, in the big picture, the benefit for themselves, their students, their society, and other sentient beings. (p. 132)

18.
Estimating p-values in small microarray experiments
MOTIVATION: Microarray data typically have small numbers of observations per gene, which can result in low power for statistical tests. Test statistics that borrow information from data across all of the genes can improve power, but these statistics have non-standard distributions, and their significance must be assessed using permutation analysis. When sample sizes are small, the number of distinct permutations can be severely limited, and pooling the permutation-derived test statistics across all genes has been proposed. However, the null distribution of the test statistics under permutation is not the same for equally and differentially expressed genes. This can have a negative impact on both p-value estimation and the power of information-borrowing statistics. RESULTS: We investigate permutation-based methods for estimating p-values. One of the methods, which pools from a selected subset of the data, is shown to have the correct type I error rate and to provide accurate estimates of the false discovery rate (FDR). We provide guidelines for selecting an appropriate subset. We also demonstrate that information-borrowing statistics have substantially increased power compared to the t-test in small experiments.
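
The limitation mentioned in this abstract, that small samples allow only a few distinct permutations, can be made concrete with a single-gene example. The Python sketch below enumerates every relabeling of two small groups and computes an exact two-sided permutation p-value for a difference in means. It is an illustration only, not the pooling method of the paper; the function name `exact_permutation_pvalue` and the data are hypothetical.

```python
from itertools import combinations
from math import comb


def exact_permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        # |mean of relabeled group A - mean of relabeled group B|
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    hits = 0
    # Every way of choosing which observations are labeled as group A.
    for idx in combinations(range(len(pooled)), n_a):
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / comb(len(pooled), n_a)


# Three observations per group: only comb(6, 3) = 20 distinct relabelings.
p_value = exact_permutation_pvalue([1, 2, 3], [10, 11, 12])
```

With three observations per group there are only 20 relabelings, so even perfectly separated groups cannot produce a two-sided p-value below 2/20 = 0.1. This is the kind of granularity that motivates pooling permutation statistics across genes.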

19.
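Objective: To study the application of a dual-track teaching method that combines problem-based learning (PBL) and team-based learning (TBL) in the teaching of professional-degree graduate students in thoracic surgery. Methods: Sixty master's and doctoral students who entered the department at our hospital for clinical work were selected and randomly divided into a teaching-reform group and a traditional-teaching group, with 30 students in each group. The teaching-reform group was taught with the combined PBL and TBL method. At the end of the 1-year period, the learning outcomes of the two groups were compared using several evaluation modes, including closed-book examinations, peer evaluation among group members, and supervisors' evaluations of the students. Closed-book examinations and questionnaires were also used to evaluate teaching effectiveness. Results: The teaching-reform group outperformed the traditional-teaching group in objective scores and learning outcomes, namely end-of-rotation examination scores, additional examination scores, and objective scores for mastery of professional knowledge, and the differences were statistically significant. The dual-track PBL and TBL method was also superior to traditional teaching in improving students' clinical reasoning, expression, and communication skills, and it increased students' enthusiasm during their internship. Conclusion: The dual-track teaching method combining PBL and TBL is effective in the teaching of professional-degree graduate students in thoracic surgery.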

20.
MOTIVATION: Small non-coding RNA (ncRNA) genes play important regulatory roles in a variety of cellular processes. However, detection of ncRNA genes is a great challenge to both experimental and computational approaches. In this study, we describe a new approach called positive sample only learning (PSoL) to predict ncRNA genes in the Escherichia coli genome. Although PSoL is a machine learning method for classification, it requires no negative training data, which, in general, is hard to define properly and affects the performance of machine learning dramatically. In addition, using the support vector machine (SVM) as the core learning algorithm, PSoL can integrate many different kinds of information to improve the accuracy of prediction. Besides the application of PSoL for predicting ncRNAs, PSoL is applicable to many other bioinformatics problems as well. RESULTS: The PSoL method is assessed by 5-fold cross-validation experiments which show that PSoL can achieve about 80% accuracy in recovery of known ncRNAs. We compared PSoL predictions with five previously published results. The PSoL method has the highest percentage of predictions overlapping with those from other methods.
