Similar literature
1.
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of Explorations in Statistics explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection. Regression helps us answer three questions: does some variable Y depend on another variable X; if so, what is the nature of the relationship between Y and X; and for some value of X, what value of Y do we predict? Residual plots are an essential component of a thorough regression analysis: they help us decide if our statistical regression model of the relationship between Y and X is appropriate.
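The three questions in this abstract can be made concrete with a minimal sketch of ordinary least-squares regression. The helper name `fit_line` and the sample data are illustrative assumptions, not taken from the article; the residuals it returns are the values a residual plot would display.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Hypothetical helper for illustration only.
    Returns (slope, intercept, residuals).
    """
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squares of X and sum of cross-products of X and Y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed Y minus the Y predicted by the fitted line.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return slope, intercept, residuals


# Made-up sample data (an assumption, not data from the article).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, residuals = fit_line(xs, ys)

# Third question: for X = 6, what value of Y do we predict?
predicted = intercept + slope * 6.0
```

A slope that differs clearly from zero speaks to the first question, the fitted line to the second, and `predicted` to the third. Residuals that show a trend when plotted against X would suggest that a straight-line model is not appropriate.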

2.
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This sixth installment of Explorations in Statistics explores correlation, a familiar technique that estimates the magnitude of a straight-line relationship between two variables. Correlation is meaningful only when the two variables are true random variables: for example, if we restrict in some way the variability of one variable, then the magnitude of the correlation will decrease. Correlation cannot help us decide if changes in one variable result in changes in the second variable, if changes in the second variable result in changes in the first variable, or if changes in a third variable result in concurrent changes in the first two variables. Correlation can help provide us with evidence that study of the nature of the relationship between x and y may be warranted in an actual experiment in which one of them is controlled.
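The claim that restricting the variability of one variable lowers the correlation can be checked with a small simulation. This is a sketch under stated assumptions: the simulated data, the seed, the cutoff of 0.5, and the helper name `pearson_r` are all illustrative choices, not part of the article.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Simulated data (an assumption): y depends linearly on x plus noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [xi + rng.gauss(0, 1) for xi in x]

r_full = pearson_r(x, y)

# Restrict the variability of x by keeping only values near its mean.
kept = [(xi, yi) for xi, yi in zip(x, y) if abs(xi) < 0.5]
r_restricted = pearson_r([p[0] for p in kept], [p[1] for p in kept])
```

The underlying relationship between x and y is identical in both calculations; only the range of x differs, which is why `r_restricted` comes out smaller than `r_full`.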

3.
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This eighth installment of Explorations in Statistics explores permutation methods, empiric procedures we can use to assess an experimental result (that is, to test a null hypothesis) when we are reluctant to trust statistical theory alone. Permutation methods operate on the observations (the data) we get from an experiment. A permutation procedure answers this question: out of all the possible ways we can rearrange the observations we got, in what proportion of those arrangements is the sample statistic we care about at least as extreme as the one we got? The answer to that question is the P value.
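The question a permutation procedure answers can be sketched directly in code. The example below assumes a two-group design and uses the absolute difference in group means as the sample statistic; both choices, and the function name `permutation_p_value`, are illustrative assumptions rather than details from the article.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation P value for a difference in means.

    Enumerates every way of splitting the pooled observations into
    groups of the original sizes, so it is practical only for small
    samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))

    total = 0
    at_least_as_extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        stat = abs(mean(a) - mean(b))
        # Small tolerance guards against floating-point round-off.
        if stat >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


p = permutation_p_value([1, 2, 3], [4, 5, 6])
```

With three observations per group there are 20 possible arrangements, and only two of them (the observed split and its mirror image) give a difference as large as the observed one, so `p` is 2/20 = 0.1. For larger samples, a random subset of arrangements is usually sampled instead of enumerating all of them.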

4.

Background  

The optimal score for ungapped local alignments of infinitely long random sequences is known to follow a Gumbel extreme value distribution. Less is known about the important case where gaps are allowed. For this case, the distribution is known only empirically in the high-probability region, which is biologically less relevant.

5.
《Fly》2013,7(4):327-332
While many quantifiable biological phenomena can be described by making use of an assumption of normality in the distribution of individual values, many biological phenomena are not accurately described by the normal distribution. An unquestioned assumption of normality of distribution of possible outcomes can lead to misinterpretation of data, which could have serious consequences. Thus it is extremely important to test the validity of an assumption of normality of possible outcomes. As it turns out, the logarithmic-normal (log-normal) distribution pattern is often far more accurate in describing statistical biological phenomena. Herein I examine large samples of values for circulating blood cell (hemocyte) concentration (CHC) among both wild-type and mutant Drosophila larvae, and demonstrate in both cases that the distribution of individual values does not conform to normality, but does conform to log-normality.

6.
7.
8.
Statistics on Markov chains are widely used for the study of patterns in biological sequences. Statistics on these models can be computed through several approaches. Gaussian approximations derived from the central limit theorem (CLT) are among the most popular. Unfortunately, in order to find a pattern of interest, these methods have to deal with tail distribution events, where the CLT performs especially poorly. In this paper, we propose a new approach based on large deviations theory to assess pattern statistics. We first recall theoretical results for empiric mean (level 1) as well as empiric distribution (level 2) large deviations on Markov chains. Then, we present the applications of these results, focusing on numerical issues. LD-SPatt is the name of the GPL software implementing these algorithms. We compare this approach to several existing ones in terms of complexity and reliability and show that the large deviations are more reliable than the Gaussian approximations in absolute values as well as in terms of ranking, and are at least as reliable as compound Poisson approximations. Finally, we discuss some possible further improvements and applications of this new method.

9.
While many quantifiable biological phenomena can be described by making use of an assumption of normality in the distribution of individual values, many biological phenomena are not accurately described by the normal distribution. An unquestioned assumption of normality of distribution of possible outcomes can lead to misinterpretation of data, which could have serious consequences. Thus it is extremely important to test the validity of an assumption of normality of possible outcomes. As it turns out, the logarithmic-normal (log-normal) distribution pattern is often far more accurate in describing statistical biological phenomena. Herein I examine large samples of values for circulating blood cell (hemocyte) concentration (CHC) among both wild-type and mutant Drosophila larvae, and demonstrate in both cases that the distribution of individual values does not conform to normality, but does conform to log-normality. Key words: hemocyte, CHC, concentration, logarithm, normal, log-normal, blood

10.
Durban M, Hackett CA, Currie ID 《Biometrics》1999,55(3):699-703
We consider semiparametric models with p regressor terms and q smooth terms. We obtain an explicit expression for the estimate of the regression coefficients given by the back-fitting algorithm. The calculation of the standard errors of these estimates based on this expression is a considerable computational exercise. We present an alternative, approximate method of calculation that is less demanding. With smoothing splines, the method is exact, while with loess, it gives good estimates of standard errors. We assess the adequacy of our approximation and of another approximation with the help of two examples.

11.
12.
13.
Markovian models of ion channels have proven useful in the reconstruction of experimental data and prediction of cellular electrophysiology. We present the stochastic Galerkin method as an alternative to Monte Carlo and other stochastic methods for assessing the impact of uncertain rate coefficients on the predictions of Markovian ion channel models. We extend and study two different ion channel models: a simple model with only a single open and a closed state and a detailed model of the cardiac rapidly activating delayed rectifier potassium current. We demonstrate the efficacy of stochastic Galerkin methods for computing solutions to systems with random model parameters. Our studies illustrate the characteristic changes in distributions of state transitions and electrical currents through ion channels due to random rate coefficients. Furthermore, the studies indicate the applicability of the stochastic Galerkin technique for uncertainty and sensitivity analysis of bio-mathematical models.

14.
We reviewed written and audio records of paramedic-base hospital radio contact to determine whether care differed from that suggested in standard prehospital care protocols. Records of all 659 contacts for seizure, syncope, abdominal pain, or altered mental state during 1987 (28.4% of all contacts) were scored for the use of standard therapies (such as intravenous access, oxygen, naloxone hydrochloride) and unanticipated therapies (intubation, nitroglycerin). Cases that involved unanticipated treatments were reviewed to determine whether they could have been prospectively identified by simple clinical findings. Standard therapies were used in the majority of patients. Unanticipated therapies were administered to 13 patients, all of whom had abnormal vital signs, diaphoresis, respiratory distress, or a second prominent symptom. The data suggest that protocols could replace radio contact for most patients and that the few who might benefit from radio contact can be easily identified. A 90% reduction in radio contacts in Los Angeles County could save $3 million each year.

15.
Microsatellite genotyping from samples with varying quality can result in an uneven distribution of errors. Previous studies reporting error rates have focused on estimating the effects of both randomly distributed and locus-specific errors. Sample-specific errors, however, can also significantly affect results in population studies despite a large sample size. From two studies including six microsatellite markers genotyped from 272 sperm whale DNA samples, and 33 microsatellites genotyped from 213 bowhead whales, we investigated the effects of sample- and locus-specific errors on calculations of Hardy–Weinberg equilibrium. The results of a jackknife analysis in these two studies identified seven individuals that were highly influential on estimates of Hardy–Weinberg equilibrium for six different markers. In each case, the influential individual was homozygous for a rare allele. Our results demonstrate that Hardy–Weinberg P values are very sensitive to homozygosity in rare alleles for single individuals, and that > 50% of these cases involved genotype errors likely due to low sample quality. This raises the possibility that even small, normal levels of laboratory errors can result in an overestimate of the degree to which markers are out of Hardy–Weinberg equilibrium and hence overestimate population structure. To avoid such bias, we recommend routine identification of influential individuals and multiple replications of those samples.
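The idea of finding influential individuals with a jackknife can be sketched as a leave-one-out loop. This is not the authors' procedure: it assumes a single biallelic locus (microsatellites usually have many alleles), uses a chi-square goodness-of-fit statistic in place of an exact Hardy–Weinberg P value, and the function names and genotype counts are hypothetical.

```python
def hwe_chi_square(genotypes):
    """Chi-square statistic for Hardy-Weinberg proportions.

    `genotypes` is a list of 'AA', 'AB', or 'BB' strings for one
    biallelic locus (a simplifying assumption).
    """
    n = len(genotypes)
    observed = {g: genotypes.count(g) for g in ("AA", "AB", "BB")}
    # Frequency of allele B, estimated from the sample.
    q = (2 * observed["BB"] + observed["AB"]) / (2 * n)
    p = 1.0 - q
    expected = {"AA": n * p * p, "AB": 2 * n * p * q, "BB": n * q * q}
    return sum(
        (observed[g] - expected[g]) ** 2 / expected[g]
        for g in expected
        if expected[g] > 0
    )


def jackknife_influence(genotypes):
    """Change in the statistic when each individual is left out in turn."""
    full = hwe_chi_square(genotypes)
    return [
        hwe_chi_square(genotypes[:i] + genotypes[i + 1:]) - full
        for i in range(len(genotypes))
    ]


# Made-up sample: one individual is homozygous for the rare allele B.
sample = ["AA"] * 44 + ["AB"] * 5 + ["BB"]
influence = jackknife_influence(sample)
most_influential = max(range(len(sample)), key=lambda i: abs(influence[i]))
```

In this made-up sample the individual whose removal changes the statistic most is the lone rare-allele homozygote, which mirrors the pattern the abstract describes. Such an individual would be a candidate for repeat genotyping.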

16.
17.
Dickson RJ, Gloor GB 《PLoS ONE》2012,7(6):e37645
The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High quality alignments are difficult to build and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criterion: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files/.

18.
Estimating haplotype frequencies becomes increasingly important in the mapping of complex disease genes, as millions of single nucleotide polymorphisms (SNPs) are being identified and genotyped. When genotypes at multiple SNP loci are gathered from unrelated individuals, haplotype frequencies can be accurately estimated using expectation-maximization (EM) algorithms (Excoffier and Slatkin, 1995; Hawley and Kidd, 1995; Long et al., 1995), with standard errors estimated using bootstraps. However, because the number of possible haplotypes increases exponentially with the number of SNPs, handling data with a large number of SNPs poses a computational challenge for the EM methods and for other haplotype inference methods. To solve this problem, Niu and colleagues, in their Bayesian haplotype inference paper (Niu et al., 2002), introduced a computational algorithm called progressive ligation (PL). But their Bayesian method has a limitation on the number of subjects (no more than 100 subjects in the current implementation of the method). In this paper, we propose a new method in which we use the same likelihood formulation as in Excoffier and Slatkin's EM algorithm and apply the estimating equation idea and the PL computational algorithm with some modifications. Our proposed method can handle data sets with large numbers of SNPs as well as large numbers of subjects. At the same time, our method estimates standard errors efficiently, using the sandwich estimate from the estimating equation rather than the bootstrap method. Additionally, our method admits missing data and produces valid estimates of parameters and their standard errors under the assumption that the missing genotypes are missing at random in the sense defined by Rubin (1976).

19.
J M Geramita, J T Smith 《Biometrics》1985,41(1):281-285
We give estimates of variances and covariances of survival rates for subgroups of a wild population surveyed by mark-recapture methods. These are used in forming test statistics to compare subgroup survival rates. We show that using correct standard errors based on the appropriate approximate sampling model and incorporating important covariances of observed frequencies avoids spurious significance of survival rate differences.

20.
Explorations in Anthropology and Theology. Frank A. Salamone and Walter Randolph Adams, eds. Lanham, MD: University Press of America, 1997. 279 pp.
