Similar articles
Found 20 similar articles; search time 0 ms
1.
The inefficiency of least squares   Total citations: 22, self-citations: 0, citations by others: 22

2.
Prediction and the efficiency of least squares   Total citations: 1, self-citations: 0, citations by others: 1
WATSON, G. S. Biometrika, 1972, 59(1): 91-98

3.
On the minimum efficiency of least squares   Total citations: 7, self-citations: 0, citations by others: 7
KNOTT, M. Biometrika, 1975, 62(1): 129-132

4.
Current methods of identifying positively selected regions in the genome are limited in two key ways: the underlying models cannot account for the timing of adaptive events and the comparison between models of selective sweeps and sequence data is generally made via simple summaries of genetic diversity. Here, we develop a tractable method of describing the effect of positive selection on the genealogical histories in the surrounding genome, explicitly modeling both the timing and context of an adaptive event. In addition, our framework allows us to go beyond analyzing polymorphism data via the site frequency spectrum or summaries thereof and instead leverage information contained in patterns of linked variants. Tests on both simulations and a human data example, as well as a comparison to SweepFinder2, show that even with very small sample sizes, our analytic framework has higher power to identify old selective sweeps and to correctly infer both the time and strength of selection. Finally, we derived the marginal distribution of genealogical branch lengths at a locus affected by selection acting at a linked site. This provides a much-needed link between our analytic understanding of the effects of sweeps on sequence variation and recent advances in simulation and heuristic inference procedures that allow researchers to examine the sequence of genealogical histories along the genome.

5.

Background  

The least squares (LS) method for constructing confidence sets of trees is closely related to LS tree building methods, in which the goodness of fit of the distances measured on the tree (patristic distances) to the observed distances between taxa is the criterion used for selecting the best topology. The generalized LS (GLS) method for topology testing is often frustrated by the computational difficulties in calculating the covariance matrix and its inverse, which in practice requires approximations. The weighted LS (WLS) allows for a more efficient albeit approximate calculation of the test statistic by ignoring the covariances between the distances.
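To make the contrast between the two statistics concrete, the following Python sketch computes a GLS quadratic form and its WLS counterpart for observed distances and tree-fitted (patristic) distances. The function names, the small linear solver, and the toy covariance matrix are assumptions made for illustration and are not taken from the paper.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gls_statistic(observed, fitted, cov):
    """Quadratic form r' V^{-1} r using the full covariance matrix V."""
    resid = [o - f for o, f in zip(observed, fitted)]
    weighted = solve_linear_system(cov, resid)  # V^{-1} r without forming V^{-1}
    return sum(r * w for r, w in zip(resid, weighted))


def wls_statistic(observed, fitted, cov):
    """Same residuals, but only the variances (diagonal of V) are used."""
    return sum((o - f) ** 2 / cov[i][i]
               for i, (o, f) in enumerate(zip(observed, fitted)))


# Hypothetical toy data: two distances with correlated errors.
toy_cov = [[1.0, 0.5], [0.5, 1.0]]
gls_value = gls_statistic([1.0, 2.0], [0.0, 0.0], toy_cov)
wls_value = wls_statistic([1.0, 2.0], [0.0, 0.0], toy_cov)
```

When the covariance matrix is diagonal the two functions agree; they differ only through the off-diagonal terms that WLS ignores.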

6.
Molecular phylogenies contribute to the study of the patterns and processes of macroevolution even though past events (fossils) are not recorded in these data. In this article, I consider the general time-dependent birth-death model to fit any model of temporal variation in speciation and extinction to phylogenies. I establish formulae to compute the expected cumulative distribution function of branching times for any model, and, building on previously published work, I derive maximum likelihood estimators. Some limitations of the likelihood approach are described, and a fitting procedure based on least squares is developed that alleviates the shortcomings of maximum likelihood in the present context. Parametric and nonparametric bootstrap procedures are developed to assess uncertainty in the parameter estimates, the latter version giving narrower confidence intervals and being faster to compute. I also present several general algorithms of tree simulation in continuous time. I illustrate the application of this approach with the analysis of simulated datasets, and two published phylogenies of primates (Catarrhinae) and lizards (Agamidae).

7.
This paper deals with the synthesis of information from different studies when there is lack of independence in some of the contrasts to be combined. This problem can arise in several different situations in both case-control studies and clinical trials. For efficient estimation we appeal to the method of generalized least squares to estimate the summary effect and its standard error. This method requires estimates of the covariances between those contrasts that are not independent. Although it is not possible to estimate the covariance between effects that have been adjusted for confounding factors, we present a method for finding upper and lower bounds for this covariance. In the simplest discussion, homogeneity of the relative risks is assumed, but the method is then extended to allow for heterogeneity in an overall estimate. We then illustrate the method with several examples from an analysis in which case-control studies of cervical cancer and oral contraceptive use are synthesized.
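A minimal sketch of the generalized least squares idea for the simplest case, two correlated contrasts pooled into one summary effect, is given below. The function name and the numbers are hypothetical; the paper's bounds on the covariance and its heterogeneity extension are not reproduced here.

```python
def gls_pooled_estimate(y1, y2, var1, var2, cov12):
    """Pool two correlated effect estimates by generalized least squares.

    Returns (summary_effect, standard_error). With cov12 = 0 this reduces
    to the usual inverse-variance weighted mean.
    """
    det = var1 * var2 - cov12 ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Elements of the inverse of the 2x2 covariance matrix.
    w11 = var2 / det
    w22 = var1 / det
    w12 = -cov12 / det
    total = w11 + w22 + 2.0 * w12          # 1' V^{-1} 1
    effect = ((w11 + w12) * y1 + (w22 + w12) * y2) / total  # 1' V^{-1} y / total
    return effect, (1.0 / total) ** 0.5


# Hypothetical log relative risks sharing a control group (positive covariance).
effect, std_error = gls_pooled_estimate(1.0, 3.0, 1.0, 1.0, 0.5)
```

Because the covariance is only known to lie between two bounds, one way to use such a function is to evaluate it at both bounds and report the resulting range of standard errors.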

8.
Methods of least squares and SIRT in reconstruction.   Total citations: 1, self-citations: 0, citations by others: 1
In this paper we show that a particular version of the Simultaneous Iterative Reconstruction Technique (SIRT) proposed by Gilbert in 1972 strongly resembles the Richardson least-squares algorithm. By adopting the adjustable parameters of the general Richardson algorithm, we have been able to produce generalized SIRT algorithms with improved convergence. A particular generalization of the SIRT algorithm, GSIRT, has an adjustable parameter σ and the starting picture ρ0 as input. A value 12 for σ and a weighted back-projection for ρ0 produce a stable algorithm. We call the SIRT-like algorithms for the solution of the weighted least-squares problems LSIRT and present two such algorithms, LSIRT1 and LSIRT2, which have definite computational advantages over SIRT and GSIRT. We have tested these methods on mathematically simulated phantoms and find that the new SIRT methods converge faster than Gilbert's SIRT but are more sensitive to noise present in the data. However, the faster convergence rates allow termination before the noise contribution degrades the reconstructed image excessively.
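For orientation, a generic Richardson iteration for a least-squares problem can be sketched in a few lines of Python. This is the textbook form x ← x + step · Aᵀ(b − Ax), not the GSIRT or LSIRT algorithms of the paper; the function name, the `step` parameter, and the toy system are assumptions for illustration.

```python
def richardson_least_squares(A, b, step, iterations, x0=None):
    """Iterate x <- x + step * A^T (b - A x) toward a least-squares solution.

    Converges when 0 < step < 2 / (largest eigenvalue of A^T A).
    A is a list of rows; b is the measurement vector.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = list(x0) if x0 is not None else [0.0] * n_cols
    for _ in range(iterations):
        # Residual between measured and re-projected data.
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n_cols))
                    for i in range(n_rows)]
        # Back-project the residual (multiply by A^T).
        back_projection = [sum(A[i][j] * residual[i] for i in range(n_rows))
                           for j in range(n_cols)]
        x = [x[j] + step * back_projection[j] for j in range(n_cols)]
    return x


# Hypothetical 3-ray, 2-pixel system whose exact solution is (1, 2).
rays = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
picture = richardson_least_squares(rays, [1.0, 2.0, 3.0], step=0.5, iterations=100)
```

Stopping after fewer iterations acts as a crude regularizer, which is the same trade-off the abstract describes between faster convergence and noise sensitivity.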

9.
10.
11.
It is shown that a recently published least squares method for the estimation of the average center of rotation is biased. Consequently, a correction term is proposed, and an iterative algorithm is derived for finding a bias-compensated solution to the least squares problem. The accuracy of the proposed bias-compensated least squares method is compared to the previously proposed least squares method by Monte Carlo simulations. The tests show that the new method gives a substantial improvement in accuracy.

12.
When it comes to fitting simple allometric slopes through measurement data, evolutionary biologists have been torn between regression methods. On the one hand, there is the ordinary least squares (OLS) regression, which is commonly used across many disciplines of biology to fit lines through data, but which has a reputation for underestimating slopes when measurement error is present. On the other hand, there is the reduced major axis (RMA) regression, which is often recommended as a substitute for OLS regression in studies of allometry, but which has several weaknesses of its own. Here, we review statistical theory as it applies to evolutionary biology and studies of allometry. We point out that the concerns that arise from measurement error for OLS regression are small and straightforward to deal with, whereas RMA has several key properties that make it unfit for use in the field of allometry. The recommended approach for researchers interested in allometry is to use OLS regression on measurements taken with low (but realistically achievable) measurement error. If measurement error is unavoidable and relatively large, it is preferable to correct for slope attenuation rather than to turn to RMA regression, or to take the expected amount of attenuation into account when interpreting the data.
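The three slope estimates discussed above can be compared with a short Python sketch. It uses the standard textbook formulas (OLS slope = Sxy/Sxx, RMA slope = sign(Sxy)·sqrt(Syy/Sxx), and an attenuation correction that divides the OLS slope by the reliability of x); the function names, the toy data, and the reliability value of 0.8 are assumptions for illustration, not values from the paper.

```python
import math


def _sums_of_squares(x, y):
    """Return (Sxx, Syy, Sxy) about the sample means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxx, syy, sxy


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    sxx, _, sxy = _sums_of_squares(x, y)
    return sxy / sxx


def rma_slope(x, y):
    """Reduced major axis slope: ratio of standard deviations, signed by Sxy."""
    sxx, syy, sxy = _sums_of_squares(x, y)
    return math.copysign(math.sqrt(syy / sxx), sxy)


def attenuation_corrected_slope(x, y, reliability):
    """OLS slope divided by the reliability of x.

    reliability = var(true x) / var(observed x), assumed known from
    repeated measurements; it must lie in (0, 1].
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return ols_slope(x, y) / reliability


# Hypothetical log-transformed trait measurements.
log_body = [1.0, 2.0, 3.0, 4.0, 5.0]
log_trait = [2.0, 4.0, 5.0, 4.0, 5.0]
b_ols = ols_slope(log_body, log_trait)
b_rma = rma_slope(log_body, log_trait)
b_corrected = attenuation_corrected_slope(log_body, log_trait, reliability=0.8)
```

Note that the RMA slope depends only on the two standard deviations and the sign of the covariance, so it does not shrink toward zero as the correlation weakens.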

13.
14.
We modified the phylogenetic program MrBayes 3.1.2 to incorporate the compound Dirichlet priors for branch lengths proposed recently by Rannala, Zhu, and Yang (2012. Tail paradox, partial identifiability and influential priors in Bayesian branch length inference. Mol. Biol. Evol. 29:325-335.) as a solution to the problem of branch-length overestimation in Bayesian phylogenetic inference. The compound Dirichlet prior specifies a fairly diffuse prior on the tree length (the sum of branch lengths) and uses a Dirichlet distribution to partition the tree length into branch lengths. Six problematic data sets originally analyzed by Brown, Hedtke, Lemmon, and Lemmon (2010. When trees grow too long: investigating the causes of highly inaccurate Bayesian branch-length estimates. Syst. Biol. 59:145-161) are reanalyzed using the modified version of MrBayes to investigate properties of Bayesian branch-length estimation using the new priors. While the default exponential priors for branch lengths produced extremely long trees, the compound Dirichlet priors produced posterior estimates that are much closer to the maximum likelihood estimates. Furthermore, the posterior tree lengths were quite robust to changes in the parameter values in the compound Dirichlet priors, for example, when the prior mean of tree length changed over several orders of magnitude. Our results suggest that the compound Dirichlet priors may be useful for correcting branch-length overestimation in phylogenetic analyses of empirical data sets.

15.
The constant rate birth–death process is a popular null model for speciation and extinction. If one removes extinct and non-sampled lineages, this process induces ‘reconstructed trees’ which describe the relationship between extant lineages. We derive the probability density of the length of a randomly chosen pendant edge in a reconstructed tree. For the special case of a pure-birth process with complete sampling, we also provide the probability density of the length of an interior edge, of the length of an edge descending from the root, and of the diversity (which is the sum of all edge lengths). We show that the results depend on whether the reconstructed trees are conditioned on the number of leaves, the age, or both.

16.
MOTIVATION: Gene association/interaction networks provide vast amounts of information about essential processes inside the cell. A complete picture of gene-gene associations/interactions would open new horizons for biologists, ranging from pure appreciation to successful manipulation of biological pathways for therapeutic purposes. Therefore, identification of important biological complexes whose members (genes and their protein products) interact with each other is of prime importance. Numerous experimental methods exist but, for the most part, they are costly and labor intensive. Computational techniques, such as the one proposed in this work, provide a quick 'budget' solution that can be used as a screening tool before more expensive techniques are attempted. Here, we introduce a novel computational method based on the partial least squares (PLS) regression technique for reconstruction of genetic networks from microarray data. RESULTS: The proposed PLS method is shown to be an effective screening procedure for the detection of gene-gene interactions from microarray data. Both simulated and real microarray experiments show that the PLS-based approach is superior to its competitors both in terms of performance and applicability. AVAILABILITY: R code is available from the supplementary web-site whose URL is given below.

17.
18.
Microarray experiments generate data sets with information on the expression levels of thousands of genes in a set of biological samples. Unfortunately, such experiments often produce multiple missing expression values, normally due to various experimental problems. As many algorithms for gene expression analysis require a complete data matrix as input, the missing values have to be estimated in order to analyze the available data. Alternatively, genes and arrays can be removed until no missing values remain. However, for genes or arrays with only a small number of missing values, it is desirable to impute those values. For the subsequent analysis to be as informative as possible, it is essential that the estimates for the missing gene expression values are accurate. A small number of badly estimated missing values in the data might be enough for clustering methods, such as hierarchical clustering or K-means clustering, to produce misleading results. Thus, accurate methods for missing value estimation are needed. We present novel methods for estimation of missing values in microarray data sets that are based on the least squares principle, and that utilize correlations between both genes and arrays. For this set of methods, we use the common reference name LSimpute. We compare the estimation accuracy of our methods with the widely used KNNimpute on three complete data matrices from public data sets by randomly knocking out data (labeling as missing). From these tests, we conclude that our LSimpute methods produce estimates that consistently are more accurate than those obtained using KNNimpute. Additionally, we examine a more classic approach to missing value estimation based on expectation maximization (EM). We refer to our EM implementations as EMimpute, and the estimate errors using the EMimpute methods are compared with those our novel methods produce. The results indicate that on average, the estimates from our best performing LSimpute method are at least as accurate as those from the best EMimpute algorithm.
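The least squares principle behind this kind of imputation can be illustrated with a deliberately simplified Python sketch: a missing value in one gene's expression profile is predicted from a single correlated gene by simple linear regression over the arrays where both are observed. This is an assumption-laden toy, not the LSimpute implementation; the function name and data are hypothetical, and the real methods combine many genes and arrays.

```python
def impute_by_regression(target, neighbor):
    """Fill None entries of `target` using a least-squares line fitted
    on `neighbor` over the arrays where both profiles are observed."""
    pairs = [(x, y) for x, y in zip(neighbor, target)
             if x is not None and y is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two jointly observed arrays")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("neighbor profile is constant; slope undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    filled = []
    for x, y in zip(neighbor, target):
        if y is not None:
            filled.append(y)                      # keep observed values
        elif x is not None:
            filled.append(intercept + slope * x)  # regression prediction
        else:
            filled.append(None)                   # cannot predict without x
    return filled


# Hypothetical expression profiles across four arrays; None marks a missing value.
completed = impute_by_regression([1.0, 2.0, 3.0, None], [2.0, 4.0, 6.0, 8.0])
```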

19.

Background

The conventional superposition methods use an ordinary least squares (LS) fit for structural comparison of two different conformations of the same protein. The main problem of the LS fit is that it is sensitive to outliers, i.e. large displacements of the original structures superimposed.

Results

To overcome this problem, we present a new algorithm to overlap two protein conformations by their atomic coordinates using a robust statistics technique: least median of squares (LMS). In order to effectively approximate the LMS optimization, the forward search technique is utilized. Our algorithm can automatically detect and superimpose the rigid core regions of two conformations with small or large displacements. In contrast, most existing superposition techniques strongly depend on the initial LS estimate for the entire atom sets of proteins. They may fail on structural superposition of two conformations with large displacements. The presented LMS fit can be considered as an alternative and complementary tool for structural superposition.

Conclusion

The proposed algorithm is robust and does not require any prior knowledge of the flexible regions. Furthermore, we show that the LMS fit can be extended to multiple level superposition between two conformations with several rigid domains. Our fit tool has produced successful superpositions when applied to proteins for which two conformations are known. The binary executable program for Windows platform, tested examples, and database are available from https://engineering.purdue.edu/PRECISE/LMSfit.
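Why LMS resists outliers where LS does not can be seen in one dimension, far simpler than the three-dimensional superposition problem treated in the paper. In the Python sketch below, the LS location estimate is the mean, and the LMS location estimate is the midpoint of the shortest interval containing h = n // 2 + 1 of the points (the h-th smallest squared residual plays the role of the median). The function names and data are hypothetical, and the forward search used by the authors is not reproduced.

```python
def least_squares_location(values):
    """The value minimizing the sum of squared residuals: the mean."""
    return sum(values) / len(values)


def lms_location(values):
    """The value minimizing the h-th smallest squared residual,
    with h = n // 2 + 1: the midpoint of the shortest 'half' of the data."""
    ordered = sorted(values)
    n = len(ordered)
    h = n // 2 + 1
    # Slide a window of h consecutive order statistics and keep the narrowest.
    best_start = min(range(n - h + 1),
                     key=lambda i: ordered[i + h - 1] - ordered[i])
    return (ordered[best_start] + ordered[best_start + h - 1]) / 2.0


# Hypothetical displacements: five small values (rigid core) and two large outliers.
displacements = [0.9, 1.0, 1.05, 1.1, 1.3, 10.0, 12.0]
ls_estimate = least_squares_location(displacements)
lms_estimate = lms_location(displacements)
```

Here the two large displacements pull the mean well away from the cluster, while the LMS estimate stays inside it, which is the behaviour the abstract relies on to find the rigid core.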

20.
Continuous estimation of time-varying respiratory mechanical parameters is required to fully characterize the time course of bronchoconstriction. To achieve such estimation, we developed an estimator that uses the recursive linear least-squares algorithm to fit the equation $P_{tr} = R\dot{V} + EV + K$ to measurements of tracheal pressure ($P_{tr}$) and flow ($\dot{V}$). The volume (V) is obtained by numerical integration of $\dot{V}$. The estimator has a finite memory with length into the past at each point in time that varies inversely with the difference between the current measurement of $P_{tr}$ and that predicted by the model, to allow the algorithm to track rapidly varying parameters (R, E, and K). V usually exhibits significant drift and must be corrected. Of the several correction methods investigated, subtraction of the recursively weighted average of $\dot{V}$ before integration to V was found to perform best. The estimator was tested on simulated noisy data where it successfully followed a fivefold increase in R and a twofold increase in E occurring over 10 s. Three dogs and two cats were anesthetized, paralyzed, tracheostomized, and challenged with a bolus of methacholine (approximately 13 mg/kg iv). Increases of 3- to 10-fold were observed in R and 2- to 3-fold in E, beginning within 10-40 s after the bolus injection. In some animals we found that the increase in E occurred more slowly than that in R, which the V signal suggested was due to dynamic hyperinflation of the lungs. These results demonstrate that our recursive estimator is able to track rapid changes in respiratory mechanical parameters during bronchoconstrictor challenge.
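A minimal Python sketch of recursive least squares for the model above (tracheal pressure = R × flow + E × volume + K) is shown below. It uses the standard update with a fixed exponential forgetting factor, which is an assumption made for simplicity: the paper's estimator varies its memory length with the prediction error, and its drift correction of the volume signal is not included. The function and parameter names are hypothetical.

```python
def rls_update(theta, P, regressors, measurement, forgetting=0.98):
    """One recursive least-squares step with exponential forgetting.

    theta       current parameter estimates [R, E, K]
    P           current (symmetric) covariance-like matrix, list of rows
    regressors  [flow, volume, 1.0] for this sample
    measurement tracheal pressure for this sample
    Returns the updated (theta, P).
    """
    n = len(theta)
    # P * phi, reused for the gain and for the P update (P is symmetric).
    P_phi = [sum(P[i][j] * regressors[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(regressors[i] * P_phi[i] for i in range(n))
    gain = [value / denom for value in P_phi]
    # Prediction error of the current model on the new sample.
    error = measurement - sum(regressors[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * error for i in range(n)]
    new_P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return new_theta, new_P


def initial_state(n_params=3, scale=1e6):
    """Zero estimates and a large diagonal P (a vague starting guess)."""
    theta = [0.0] * n_params
    P = [[scale if i == j else 0.0 for j in range(n_params)]
         for i in range(n_params)]
    return theta, P
```

A typical use is to call `initial_state()` once and then `rls_update` for every new sample of flow, volume, and pressure; a forgetting factor closer to 1 gives smoother but slower-tracking estimates.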
