Similar Literature
20 similar documents found (search time: 7 ms)
1.
Under certain assumptions, the expectation of a product of functions of a random variable is greater (or smaller) than the product of the expectations. Furthermore, the multivariate distribution function of m independent random variables, evaluated at a random point, is greater than the product of the distribution functions of the m variables.
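The first claim is a Chebyshev-type association inequality: for two functions of the same random variable that are both increasing (or both decreasing), the expectation of the product is at least the product of the expectations. A minimal Monte Carlo sketch of this inequality, with an illustrative exponential variable and two arbitrary increasing functions (none of these choices come from the paper), is:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)  # one random variable X

# Two increasing functions of X (illustrative choices, not from the paper).
f = np.sqrt(x)
g = np.log1p(x)

lhs = np.mean(f * g)              # E[f(X) g(X)]
rhs = np.mean(f) * np.mean(g)     # E[f(X)] E[g(X)]
print(f"E[f(X)g(X)] = {lhs:.4f} >= E[f(X)]E[g(X)] = {rhs:.4f}: {lhs >= rhs}")
```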

2.
Variation of biological populations is required for evolution by natural selection, and variance is a fundamental component in the quantitative characterization of evolutionary differences and rates of change. Biological variation is widely understood to be normally distributed because of a general theoretical law of error. The law of error has two forms, and the resulting normality may be arithmetic, where equivalent positive and negative deviations from expectation differ by equal amounts, or geometric, where equivalent deviations differ by equal proportions. Which law of error applies in biology can only be determined empirically, and this is surprisingly difficult. A new likelihood approach is developed here using data from anthropometric surveys of humans in two states in India: Maharashtra and Uttar Pradesh. Each state sample is large, but more importantly, each includes a large number of smaller subsamples. Likelihood support is additive, and subsamples are advantageous because (1) they are more homogeneous, (2) they yield probabilities and support scores in every case, and (3) significance can be evaluated first by tracing signs of the subsample support scores and then by comparing subsample support sums. Sign traces that fluctuate randomly show arithmetic and geometric normality to be indistinguishable. Two of 14 measurement variables studied here have subsample support sign traces differing from random, and one is significant in having a subsample support sum falling outside a 95% prediction interval for the 12 fluctuating traces: geometric normality is favored by a factor of ca. 10^60. Six of 14 index variables have support sign traces differing from random, and all are significant in having subsample support sums falling outside a 95% prediction interval for the 8 fluctuating traces: geometric normality is favored by factors of 10^8 or more. Arithmetic and geometric normality cannot be distinguished for 21 of 28 variables studied here, but whenever the alternatives are distinguishable, geometric normality is consistently and strongly favored. This means that the applicable law of error is proportional. In practical terms, arithmetic measurements must be transformed using logarithms to represent appropriately both the geometric normality of biological variation and the relative functional significance of measurements.
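The paper's subsample support bookkeeping is not reproduced here; the sketch below only illustrates the core comparison for a single simulated sample, scoring support for geometric over arithmetic normality as the difference between the maximized log-likelihoods of a lognormal fit (a normal fit on the log scale, with the Jacobian term) and a normal fit. All data and numbers are synthetic.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated stature-like measurements; the real analysis used survey data.
x = rng.lognormal(mean=5.1, sigma=0.04, size=500)

# Arithmetic normality: fit a normal distribution to the raw measurements.
mu_a, sd_a = stats.norm.fit(x)
ll_arith = stats.norm.logpdf(x, mu_a, sd_a).sum()

# Geometric normality: fit a normal distribution to the log-measurements and
# add the Jacobian term so both likelihoods refer to the original scale.
logx = np.log(x)
mu_g, sd_g = stats.norm.fit(logx)
ll_geom = stats.norm.logpdf(logx, mu_g, sd_g).sum() - logx.sum()

support = ll_geom - ll_arith  # > 0 favours geometric (lognormal) normality
print(f"log-likelihood support for geometric normality: {support:.2f}")
print(f"likelihood factor: ~10^{support / np.log(10):.1f}")
```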

3.
Genome–environment association methods aim to detect genetic markers associated with environmental variables. The detected associations are usually analysed separately to identify the genomic regions involved in local adaptation. However, a recent study suggests that single-locus associations can be combined and used in a predictive way to estimate environmental variables for new individuals on the basis of their genotypes. Here, we introduce an original approach to predict the environmental range (values and upper and lower limits) of species genotypes from the genetic markers significantly associated with those environmental variables in an independent set of individuals. We illustrate this approach by predicting aridity in a database of 950 wild beet and 299 cultivated beet individuals genotyped at 14,409 random single nucleotide polymorphisms (SNPs). We detected 66 alleles associated with aridity and used them to calculate the fraction (I) of aridity-associated alleles in each individual. The fraction I correctly predicted the values of aridity in an independent validation set of wild individuals and was then used to predict aridity in the 299 cultivated individuals. Wild individuals had higher median values and a wider range of aridity values than the cultivated individuals, suggesting that wild individuals have a greater ability to resist aridity stress and could be used to improve the resistance of cultivated varieties to aridity.
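A rough sketch of the predictive step described above, with entirely hypothetical genotype and aridity data: count the aridity-associated alleles carried by each individual, compute the fraction I, calibrate aridity against I in a training set, and predict aridity for new genotypes. The simple least-squares calibration is an assumption for illustration, not necessarily the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical genotype matrices coded 0/1/2 (copies of the aridity-associated
# allele at each associated locus); the real study used 66 associated alleles.
n_assoc = 66
train_geno = rng.integers(0, 3, size=(200, n_assoc))   # wild training individuals
new_geno = rng.integers(0, 3, size=(50, n_assoc))      # e.g. cultivated individuals
train_aridity = 0.8 * train_geno.mean(axis=1) + rng.normal(0, 0.05, 200)

def assoc_fraction(geno):
    """Fraction I: share of aridity-associated alleles carried by each individual."""
    return geno.sum(axis=1) / (2 * geno.shape[1])

I_train = assoc_fraction(train_geno)
I_new = assoc_fraction(new_geno)

# Calibrate aridity against I on the training set (simple least squares),
# then predict aridity for the new individuals from their genotypes alone.
slope, intercept = np.polyfit(I_train, train_aridity, deg=1)
pred_new = intercept + slope * I_new
print("predicted aridity range:", pred_new.min(), "to", pred_new.max())
```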

4.
Missing outcomes and irregularly timed multivariate longitudinal data frequently occur in clinical trials and biomedical studies. The multivariate t linear mixed model (MtLMM) has been shown to be a robust approach to modeling multi-outcome continuous repeated measures in the presence of outliers or heavy-tailed noise. This paper presents a framework for fitting the MtLMM with an arbitrary missing-data pattern embodied within multiple outcome variables recorded at irregular occasions. To address the serial correlation among the within-subject errors, a damped exponential correlation structure is considered in the model. Under the missing-at-random mechanism, an efficient alternating expectation-conditional maximization (AECM) algorithm is used to carry out estimation of parameters and imputation of missing values. Techniques for the estimation of random effects and the prediction of future responses are also investigated. Applications to an HIV/AIDS study and a pregnancy study involving multivariate longitudinal data with missing outcomes, together with a simulation study, highlight the superiority of the MtLMM in providing more adequate estimation, imputation, and prediction performance.
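The damped exponential family is commonly parameterized as corr(e_j, e_k) = φ^(|t_j − t_k|^θ) for within-subject measurement times t_j and t_k; a minimal sketch of building such a correlation matrix for one subject's irregular occasions, with illustrative parameter values, is:

```python
import numpy as np

def damped_exponential_corr(times, phi, theta):
    """Damped exponential correlation: corr(e_j, e_k) = phi ** (|t_j - t_k| ** theta).

    theta = 1 gives a continuous-time AR(1) structure; theta = 0 gives
    compound symmetry off the diagonal. Both phi and theta are illustrative.
    """
    t = np.asarray(times, dtype=float)
    lag = np.abs(t[:, None] - t[None, :])
    corr = phi ** (lag ** theta)
    np.fill_diagonal(corr, 1.0)   # correlation at lag 0 is always 1
    return corr

# Irregular within-subject measurement occasions (e.g. months), as in the abstract.
occasions = [0.0, 1.5, 4.0, 9.0, 15.0]
R = damped_exponential_corr(occasions, phi=0.6, theta=0.8)
print(np.round(R, 3))
```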

5.
Microarrays provide a valuable tool for the quantification of gene expression. Usually, however, the limited number of replicates leads to unsatisfying variance estimates in a gene-wise mixed model analysis. As thousands of genes are available, it is desirable to combine information across genes. When more than two tissue types or treatments are to be compared, it may be advisable to treat the array effect as random. Information between arrays can then be recovered, which can increase accuracy of estimation. We propose a method of variance component estimation across genes for a linear mixed model with two random effects; the method may be extended to models with more than two random effects. We assume that the variance components follow a log-normal distribution. Assuming that the sums of squares from the gene-wise analysis, given the true variance components, follow a scaled χ²-distribution, we adopt an empirical Bayes approach: the variance components are estimated by the expectation of their posterior distribution. The new method is evaluated in a simulation study. Differentially expressed genes are more likely to be detected by tests based on these variance estimates than by tests based on gene-wise variance estimates, an effect that is most visible in studies with small numbers of arrays. An analysis of a real data set on maize endosperm shows that the method works well.
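A minimal sketch of the empirical Bayes step for a single gene, assuming SS | σ² ~ σ²·χ²_df and a log-normal prior on σ² whose hyperparameters are taken as given (in practice they would be estimated from all genes); the posterior mean is computed by numerical quadrature, and all numbers are illustrative.

```python
import numpy as np
from scipy import integrate, stats

def eb_variance_estimate(ss, df, mu, tau):
    """Posterior mean of a gene's variance component sigma^2, assuming
    SS | sigma^2 ~ sigma^2 * chi^2_df and log(sigma^2) ~ N(mu, tau^2).
    mu and tau would in practice be estimated across genes; here they are inputs.
    """
    def integrand(s, power):
        sigma2 = np.exp(s)
        lik = stats.chi2.pdf(ss / sigma2, df) / sigma2   # density of SS given sigma^2
        prior = stats.norm.pdf(s, loc=mu, scale=tau)     # prior on s = log(sigma^2)
        return sigma2 ** power * lik * prior

    num, _ = integrate.quad(integrand, mu - 8 * tau, mu + 8 * tau, args=(1,))
    den, _ = integrate.quad(integrand, mu - 8 * tau, mu + 8 * tau, args=(0,))
    return num / den

# One gene's residual sum of squares on df = 4, with illustrative prior parameters.
print(eb_variance_estimate(ss=3.2, df=4, mu=np.log(0.5), tau=0.6))
```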

6.
A new family of distributions for circular random variables is proposed. It is based on nonnegative trigonometric sums and can be used to model data sets that present skewness and/or multimodality. In this family of distributions, the trigonometric moments are easily expressed in terms of the parameters of the distribution. The proposed family is applied to two data sets, one on the directions taken by ants and the other on the directions taken by turtles, and its goodness of fit is compared with that of distributions commonly used in the literature.
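Fernández-Durán's nonnegative trigonometric sum (NNTS) densities can be written as the squared modulus of a complex trigonometric polynomial, normalized to integrate to one; a small sketch with arbitrary illustrative coefficients (not fitted to the ant or turtle data) is:

```python
import numpy as np

def nnts_density(theta, c):
    """Nonnegative trigonometric sum density on the circle.

    f(theta) = |sum_k c_k exp(i k theta)|^2 / (2*pi*sum_k |c_k|^2),
    which is nonnegative by construction and integrates to one.
    The complex coefficients c here are illustrative, not fitted values.
    """
    c = np.asarray(c, dtype=complex)
    k = np.arange(len(c))
    poly = np.exp(1j * np.outer(theta, k)) @ c
    return np.abs(poly) ** 2 / (2 * np.pi * np.sum(np.abs(c) ** 2))

theta = np.linspace(0.0, 2 * np.pi, 10_000, endpoint=False)
c = [1.0, 0.6 + 0.3j, 0.2 - 0.4j]             # M = 2 terms: skewed and/or bimodal
f = nnts_density(theta, c)
print("min density :", f.min())               # nonnegative
print("integral    :", f.mean() * 2 * np.pi)  # ~1 (numerical check)
```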

7.
Indocyanine green (ICG) dye angiography has been used by ophthalmologists for routine examination of the choroidal vasculature in human eyes for more than 20 years. In this study, a new approach is developed to extract information from ICG dye angiograms about the blood velocity distribution in the choriocapillaris and its feeding blood vessels. ICG dye fluorescence intensity rise and decay curves are constructed for each pixel location in each image of the choriocapillaris in an ICG angiogram. It is shown that, at each instant of time, the magnitude of the local instantaneous dye velocity in the choriocapillaris is proportional to both the slope of the ICG dye fluorescence intensity curve and the dye concentration. This approach leads to determination of the absolute value of blood velocity in the choriocapillaris, provided an appropriate scaling, or conversion, factor can be determined. It also enables comparison of velocities in different regions of the choriocapillaris, since the conversion factor is independent of the vessel location. The computer algorithm developed in this study can be used in clinical applications for diagnostic purposes and for assessment of the efficacy of laser therapy in human eyes.

8.
The problem of testing for a treatment effect when some subjects in the treatment group may be unaffected by the treatment is considered. A form of the Lehmann alternative suggested by Conover and Salsburg is used that assumes each control score has the same distribution as the minimum of a known number of responses in the treatment group. It is shown that the locally most powerful test leads to a test statistic that, under the hypothesis of no treatment effect, is the sum of independent Pareto random variables, whereas under the alternative hypothesis it is the sum of independent random variables from a mixture of two Pareto distributions. The limiting distribution of the test statistic under both hypotheses is in the domain of attraction of a stable distribution, whose indices are derived. The power of the test is given and its properties are discussed. A set of data from clinical research on the development of a new drug is used to show the application of the procedure and demonstrate its usefulness.
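The exact Pareto shape implied by the Conover-Salsburg alternative is not reproduced here; the sketch below merely simulates the null distribution of a sum of i.i.d. Pareto variables with an illustrative shape parameter to obtain a Monte Carlo critical value, whereas the paper works with the stable-law limit.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, shape = 30, 1.5        # sample size and an illustrative Pareto tail index
n_sim = 100_000

# Null distribution of the test statistic: a sum of n i.i.d. Pareto variables.
sums = stats.pareto.rvs(shape, size=(n_sim, n), random_state=rng).sum(axis=1)
crit = np.quantile(sums, 0.95)
print(f"Monte Carlo 5% critical value for the sum: {crit:.2f}")

# A hypothetical observed statistic would be compared against this value:
observed = 115.0
print("reject H0 (no treatment effect):", observed > crit)
```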

9.
Proteomics uses tandem mass spectrometers and correlation algorithms to match peptides and their fragment spectra to amino acid sequences. The replication of multiple liquid chromatography experiments with electrospray ionization of peptides and tandem mass spectrometry (LC–ESI–MS/MS) produces large sets of MS/MS spectra. There is a need to assess the quality of large sets of experimental results by statistical comparison with random expectation. Classical frequency-based statistics, such as goodness-of-fit tests for peptide-to-protein distributions, can be used to calculate the probability that an entire set of experimental results has arisen by random chance. The frequency distributions of authentic MS/MS spectra from human blood were compared with those of false-positive MS/MS spectra generated by a computer, or by instrument noise, using the chi-square test. Here, the mechanics of the chi-square test for comparing the results, in toto, from a set of LC–ESI–MS/MS experiments with those of random expectation are detailed. The chi-square analysis of authentic spectra demonstrates unambiguously that, for blood proteins separated by partition chromatography prior to tryptic digestion, there is a low probability that the cumulative peptide-to-protein distribution is the same as that of random or noise false-positive spectra.
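A minimal sketch of the chi-square comparison with made-up counts: the observed peptide-to-protein distribution from authentic spectra is tested against the distribution expected from noise or computer-generated false-positive spectra, rescaled to the same total.

```python
import numpy as np
from scipy import stats

# Number of proteins identified by exactly 1, 2, 3, 4, 5+ peptides (illustrative counts).
observed_authentic = np.array([420, 180, 95, 60, 45])
observed_noise = np.array([700, 70, 20, 7, 3])   # false-positive / noise spectra

# Expected counts under the "random" model, rescaled to the authentic total.
expected = observed_noise / observed_noise.sum() * observed_authentic.sum()

chi2, p = stats.chisquare(f_obs=observed_authentic, f_exp=expected)
print(f"chi-square = {chi2:.1f}, p = {p:.3g}")
```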

10.
Whether bacterial drug resistance is drug-induced or results from the rapid propagation of random spontaneous mutations present in the flora before exposure remains a long-standing and much-debated question in both genetics and medicine. In a pioneering study, Luria and Delbrück exposed E. coli to the T1 phage to investigate whether the number of resistant colonies followed a Poisson distribution; they deduced that the development of resistant colonies is independent of the presence of the phage. Similar results have since been obtained on solid media containing antibacterial agents, and Luria and Delbrück's conclusions were long considered a gold standard for analyzing drug-resistance mutations. More recently, the concept of adaptive mutation has triggered controversy over this approach. Microbiological observation shows that, following exposure to drugs of various concentrations, drug-resistant cells emerge and multiply in a time-dependent manner, inconsistent with the assumptions underlying a Poisson distribution (which require that resistance be independent of drug quantity and follow no specific time course). At the same time, since cells tend to aggregate after division rather than separating, colonies growing on drug plates arise from the multiplication of resistant cells of various initial population sizes, so statistical analysis that assumes equivalent initial populations will yield erroneous results. In this paper, 310 data points from the Luria-Delbrück fluctuation experiment were reanalyzed from this perspective. In most cases a high-end outlying value, resulting from the non-synchronous variation of the two time variables mentioned above, was observed; the mean therefore cannot be regarded as an unbiased estimate of the expectation. The ratio between the mean and the variance was likewise not comparable, because two different sampling methods were used. In fact, the Luria-Delbrück data appear to follow an aggregated, rather than a Poisson, distribution. In summary, the statistical analysis of Luria and Delbrück is insufficient to describe the rules by which resistant mutants develop and multiply. Correcting this historical misunderstanding will enable new insight into bacterial resistance mechanisms.
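The 310 original counts are not reproduced here; the sketch below only illustrates the basic diagnostic at issue, the variance-to-mean ratio of colony counts, which is about 1 for a Poisson distribution and much larger for an aggregated one. The negative binomial used to generate counts is a stand-in for clumping, not the mechanistic Luria-Delbrück distribution.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulated resistant-colony counts per culture. A negative binomial is used
# purely as a stand-in for an aggregated (clumped) distribution; it is not
# the mechanistic Luria-Delbruck distribution.
counts = rng.negative_binomial(n=0.5, p=0.2, size=310)

mean, var = counts.mean(), counts.var(ddof=1)
ratio = var / mean
print(f"mean = {mean:.2f}, variance = {var:.2f}, variance/mean = {ratio:.1f}")
# Under a Poisson distribution the ratio would be ~1; large values indicate
# the aggregated behaviour discussed in the reanalysis.
```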

11.
Ewens' sampling formula, the probability distribution of a configuration of alleles in a sample of genes under the infinitely-many-alleles model of mutation, is proved by a direct combinatorial argument. The distribution is extended to a model where the population size may vary back in time. The distribution of age-ordered frequencies in the population is also derived in the model, extending the GEM distribution of age-ordered frequencies in a model with a constant-sized population. The genealogy of a rare allele is studied using a combinatorial approach. A connection is explored between the distribution of age-ordered frequencies and ladder indices and heights in a sequence of random variables. In a sample of n genes the connection is with ladder heights and indices in a sequence of draws from an urn containing balls labelled 1,2,...,n; and in the population the connection is with ladder heights and indices in a sequence of independent uniform random variables.
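For reference, a direct implementation of the standard form of Ewens' sampling formula, P(a) = n!/θ_(n) · Π_j (θ/j)^{a_j}/a_j! with θ_(n) = θ(θ+1)···(θ+n−1), together with a brute-force check that the probabilities of all configurations of a small sample sum to one (the value of θ is illustrative):

```python
import math
from itertools import product

def ewens_probability(a, theta):
    """Ewens' sampling formula: probability of the allele configuration
    a = (a_1, ..., a_n), where a_j is the number of allele types represented
    exactly j times in a sample of n = sum(j * a_j) genes.
    """
    n = sum(j * a_j for j, a_j in enumerate(a, start=1))
    rising = math.prod(theta + i for i in range(n))          # theta_(n)
    prob = math.factorial(n) / rising
    for j, a_j in enumerate(a, start=1):
        prob *= (theta / j) ** a_j / math.factorial(a_j)
    return prob

theta = 1.3
n = 4
# Sum over all configurations of a sample of n = 4 genes, as a sanity check.
total = 0.0
for a in product(range(n + 1), repeat=n):
    if sum(j * a_j for j, a_j in enumerate(a, start=1)) == n:
        total += ewens_probability(a, theta)
print("sum over all configurations of size 4:", total)   # should be ~1
print("P(four singleton alleles):", ewens_probability((4, 0, 0, 0), theta))
```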

12.
Measures of reproductive output in turtles are generally positively correlated with female body size. However, a full understanding of reproductive allometry in turtles requires logarithmic transformation of the reproductive and body-size variables prior to regression analysis. This allows fitted slopes to be compared with the expected linear or cubic relationships for linear-to-linear and linear-to-volumetric variables, respectively. We compiled scaling data using this approach from published and unpublished turtle studies (46 populations of 25 species from eight families) to quantify patterns among taxa. Our results suggest that, for log–log comparisons of clutch size, egg width, egg mass, clutch mass, and pelvic aperture width to shell length, all scale hypoallometrically despite theoretical predictions of isometry. Clutch size generally scaled at ~1.7 to 2.0 (compared with an isometric expectation of 3.0), egg width at ~0.5 (expectation 1.0), egg mass at ~1.1 to 1.3 (3.0), clutch mass at ~2.5 to 2.8 (3.0), and pelvic aperture width at ~0.8 to 0.9 (1.0). We also found preliminary evidence that scaling may differ across years and clutches even within the same population, as well as across populations of the same species. Future investigators should aspire to collect data on all of these reproductive parameters and to report log–log allometric analyses to test our preliminary conclusions regarding reproductive allometry in turtles.
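A small sketch of the recommended analysis with synthetic data: regress the log of a reproductive trait on the log of shell length and compare the fitted slope (with its standard error) to the isometric expectation, here 3.0 for clutch size against a linear body-size measure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Synthetic example: carapace lengths (mm) and clutch sizes for one population.
length = rng.uniform(150, 300, size=60)
clutch = np.exp(1.8 * np.log(length) - 7.5 + rng.normal(0, 0.15, size=60))

# Log-log regression of the reproductive trait on body size.
res = stats.linregress(np.log(length), np.log(clutch))
print(f"allometric slope = {res.slope:.2f} +/- {res.stderr:.2f}")
print("isometric expectation for clutch size vs shell length = 3.0")
print("hypoallometric (slope clearly < 3):", res.slope + 2 * res.stderr < 3.0)
```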

13.
A very practical application of Bayes's theorem to the analysis of binomial random variables is presented. Previous papers (Walters, 1985; Walters, 1986a) have already demonstrated the reliability of the technique for one or two random variables, and the extension of the approach to several random variables is described here. Two biometrical examples are used to illustrate the method.
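Walters' specific formulation is not reproduced here; the sketch below shows the standard conjugate Beta-binomial analysis that such an approach rests on, with an illustrative data set and a uniform prior.

```python
from scipy import stats

# Illustrative data: 14 successes in 40 Bernoulli trials, with a Beta(1, 1)
# (uniform) prior on the success probability p.
successes, n = 14, 40
a0, b0 = 1.0, 1.0

# Conjugate update: the posterior of p is Beta(a0 + successes, b0 + failures).
posterior = stats.beta(a0 + successes, b0 + n - successes)
lo, hi = posterior.interval(0.95)
print(f"posterior mean of p  : {posterior.mean():.3f}")
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```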

14.
It is of great practical interest to simultaneously identify the important predictors that correspond to both the fixed and random effects components in a linear mixed-effects (LME) model. Typical approaches perform selection separately on each of the fixed and random effects components; however, changing the structure of one set of effects can lead to different choices of variables for the other set. We propose simultaneous selection of the fixed and random factors in an LME model using a modified Cholesky decomposition. Our method is based on a penalized joint log-likelihood with an adaptive penalty for the selection and estimation of both the fixed and random effects. It performs model selection by allowing fixed effects or standard deviations of random effects to be exactly zero. A constrained expectation-maximization algorithm is then used to obtain the final estimates. It is further shown that the proposed penalized estimator enjoys the oracle property, in that, asymptotically, it performs as well as if the true model were known beforehand. We demonstrate the performance of our method in a simulation study and a real data example.

15.
Connell DJ. EcoHealth 2010, 7(3):351–360.
Using ecohealth as a transdisciplinary lens to explore the connections among overlapping domains of inquiry, this article examines methodological relations between Sustainable Livelihoods and Ecosystem Health, two approaches for improving rural health and well-being. The experience of working on a project tasked with developing an integrated, systems-based approach for understanding the nature of rural livelihoods and ecosystems provides the basis for the analysis. Several key insights are discussed: the overarching goals of health and sustainability facilitate collaboration among disciplines; differences arise from how each approach operationalizes systems as variables and indicators; and the dependent variables for one approach can be used as the independent variables for the other. In summary, while broad concepts like health and sustainability help transcend differences across disciplines and scales of analysis, variables and indicators cannot, as they are bound to how an observed system is operationalized. An advantage of using an ecohealth lens is that it creates conceptual and analytical spaces in which differences can be reconciled and used as sources of synergy. One source of synergy revealed in this article is the interdependence of the variables used by each approach.

16.
Johnson and Wehrly (1978, Journal of the American Statistical Association 73, 602-606) and Wehrly and Johnson (1980, Biometrika 67, 255-256) show one way to construct the joint distribution of a circular and a linear random variable, or the joint distribution of a pair of circular random variables, from their marginal distributions and the density of a circular random variable, referred to in this article as the joining circular density. To construct flexible models, it is necessary that the joining circular density be able to present multimodality and/or skewness in order to model different dependence patterns. Fernández-Durán (2004, Biometrics 60, 499-503) constructed circular distributions based on nonnegative trigonometric sums that can present multimodality and/or skewness; furthermore, they can be conveniently used as a model for circular-linear or circular-circular joint distributions. In the current work, joint distributions for circular-linear and circular-circular data constructed from circular distributions based on nonnegative trigonometric sums are presented and applied to two data sets, one of circular-linear data related to air pollution patterns in Mexico City and the other of circular-circular data related to the pairs of dihedral angles between consecutive amino acids in a protein.

17.
Some results of conditional expectation minimization theory and canonical correlation theory are used to give a unified approach to the derivation of several known results of correlation maximization theory involving prediction of one vector variate using linear functions of another, correlated vector variate. The present unified approach is more direct and straightforward than the varied methods used by several authors to study population linear relationships between two vector variables within the framework of canonical correlation theory. The basis of the unified approach is the single underlying principle of correlation maximization.
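The paper's derivation is population-level and theoretical; as a numerical companion, the sketch below computes sample canonical correlations by the usual whitening-plus-SVD route, which solves the same correlation-maximization problem for two vector variates (simulated here).

```python
import numpy as np

rng = np.random.default_rng(6)

# Two correlated vector variates sharing a latent factor (illustrative data).
n = 500
z = rng.normal(size=(n, 2))
x = z @ rng.normal(size=(2, 3)) + 0.5 * rng.normal(size=(n, 3))
y = z @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(n, 4))

def canonical_correlations(x, y):
    """Sample canonical correlations: whiten each block with a Cholesky factor
    and take the singular values of the whitened cross-covariance, i.e. the
    maxima of corr(a'x, b'y) over linear combinations a'x and b'y."""
    xc, yc = x - x.mean(0), y - y.mean(0)
    sxx = xc.T @ xc / (len(x) - 1)
    syy = yc.T @ yc / (len(y) - 1)
    sxy = xc.T @ yc / (len(x) - 1)
    whiten = lambda s: np.linalg.inv(np.linalg.cholesky(s)).T  # W with W'SW = I
    m = whiten(sxx).T @ sxy @ whiten(syy)
    return np.linalg.svd(m, compute_uv=False)

print(np.round(canonical_correlations(x, y), 3))
```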

18.
Johnson DS, Hoeting JA. Biometrics 2003, 59(2):341–350.
In this article, we incorporate an autoregressive time-series framework into models for animal survival using capture-recapture data. Researchers modeling animal survival probabilities as the realization of a random process have typically considered survival to be independent from one time period to the next, which may not be realistic for some populations. Using a Gibbs sampling approach, we estimate covariate coefficients and autoregressive parameters for survival models. The procedure is illustrated with a waterfowl band-recovery dataset for northern pintails (Anas acuta). The analysis shows that the second-lag autoregressive coefficient is significantly less than 0, suggesting a triennial relationship between survival probabilities and emphasizing that modeling survival rates as independent random variables may be unrealistic in some cases. Software to implement the methodology is available at no charge on the Internet.

19.
Recently, a random breakage model has been proposed to explain the negative correlation between mean chromosome length and chromosome number that is found in many groups of species and is consistent with the Menzerath-Altmann law, a statistical law from quantitative linguistics that defines the dependency between the mean size of the parts and the number of parts in a whole. Here, the central assumption of the model, namely that genome size is independent of chromosome number, is reviewed. This assumption is shown to be unrealistic from the perspective of chromosome structure and from the statistical analysis of real genomes. A general class of random models, including that random breakage model, is analyzed. For any model within this class, a power law with an exponent of -1 is predicted for the expectation of the mean chromosome size as a function of chromosome number, a functional dependency that is not supported by real genomes. The random breakage model and variants that keep genome size and chromosome number independent raise no serious objection to the relevance of correlations consistent with the Menzerath-Altmann law across taxonomic groups, or to the possibility of a connection between human language and genomes through that law.
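A short simulation illustrating the power-law prediction: if genome size G is drawn independently of chromosome number n, the expected mean chromosome size is E[G]/n, so a log-log regression of mean chromosome size on chromosome number recovers a slope close to -1. The distributions used below are illustrative, not fitted to real genomes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

n_species = 2000
chrom_number = rng.integers(2, 50, size=n_species)                # chromosome number n
genome_size = rng.lognormal(mean=8.0, sigma=0.8, size=n_species)  # independent of n

mean_chrom_size = genome_size / chrom_number

# Mean chromosome size as a function of n, on log-log axes.
res = stats.linregress(np.log(chrom_number), np.log(mean_chrom_size))
print(f"fitted log-log slope: {res.slope:.2f}  (prediction under independence: -1)")
```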

20.
Liu M, Taylor JM, Belin TR. Biometrics 2000, 56(4):1157–1163.
This paper outlines a multiple imputation method for handling missing data in designed longitudinal studies. A random coefficients model is developed to accommodate incomplete multivariate continuous longitudinal data. Multivariate repeated measures are jointly modeled: an i.i.d. normal model is assumed for the time-independent variables, and a hierarchical random coefficients model is assumed for the time-dependent variables in a regression model conditional on the time-independent variables and time, with heterogeneous error variances across variables and time points. Gibbs sampling is used to draw model parameters and to impute missing observations. An application to data from a study of startle reactions illustrates the model. A simulation study compares the multiple imputation procedure with the weighting approach of Robins, Rotnitzky, and Zhao (1995, Journal of the American Statistical Association 90, 106-121), which can be used to address similar data structures.
