Similar Literature
20 similar documents found (search time: 31 ms)
1.
Most existing statistical methods for mapping quantitative trait loci (QTL) are not suitable for analyzing survival traits, which typically have skewed distributions and are subject to censoring. As a result, researchers have incorporated parametric and semi-parametric models from survival analysis into the interval-mapping framework for QTL controlling survival traits. In survival analysis, the accelerated failure time (AFT) model is considered a de facto standard and fundamental model for data analysis. Based on the AFT model, we propose a parametric approach for mapping survival traits, using the EM algorithm to obtain maximum likelihood estimates of the parameters. With the Bayesian information criterion (BIC) as a model selection criterion, an optimal mapping model is constructed by choosing the error distribution that combines maximum likelihood with parsimonious parameters. Two real datasets were analyzed by the proposed method for illustration. The results show that, among the five commonly used survival distributions, the Weibull distribution is the optimal survival function for mapping heading time in rice, while the log-logistic distribution is optimal for hyperoxic acute lung injury.
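The distribution-selection step can be sketched outside the QTL framework. This is not the authors' code, only an illustration of choosing among candidate survival distributions by maximum likelihood and BIC, assuming uncensored failure times and scipy's parametrizations (scipy's name for the log-logistic is `fisk`); the synthetic data are illustrative:

```python
# Sketch: pick an error distribution for an AFT-style survival model by
# maximum likelihood + BIC, assuming complete (uncensored) failure times.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
times = rng.weibull(2.0, size=200) * 30.0  # synthetic failure times

candidates = {
    "weibull": stats.weibull_min,
    "log-logistic": stats.fisk,   # scipy's name for the log-logistic
    "lognormal": stats.lognorm,
    "gamma": stats.gamma,
    "exponential": stats.expon,
}

def bic(dist, data):
    params = dist.fit(data, floc=0)           # fix the location at 0
    loglik = np.sum(dist.logpdf(data, *params))
    k = len(params) - 1                       # floc=0 is not estimated
    return k * np.log(len(data)) - 2 * loglik

scores = {name: bic(d, times) for name, d in candidates.items()}
best = min(scores, key=scores.get)            # smallest BIC wins
```

The same scoring loop carries over to any other candidate family scipy supports; only the censoring mechanism of real survival data requires a hand-written likelihood.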

2.
Yin G  Li Y  Ji Y 《Biometrics》2006,62(3):777-787
A Bayesian adaptive design is proposed for dose-finding in phase I/II clinical trials to incorporate the bivariate outcomes, toxicity and efficacy, of a new treatment. Without specifying any parametric functional form for the drug dose-response curve, we jointly model the bivariate binary data to account for the correlation between toxicity and efficacy. After observing all the responses of each cohort of patients, the dosage for the next cohort is escalated, de-escalated, or kept unchanged according to the proposed odds-ratio criteria constructed from the posterior toxicity and efficacy probabilities. A novel class of prior distributions is proposed through logit transformations, which implicitly imposes a monotonic constraint on dose-toxicity probabilities and correlates the probabilities of the bivariate outcomes. We conduct simulation studies to evaluate the operating characteristics of the proposed method. Under various scenarios, the new Bayesian design based on toxicity-efficacy odds-ratio trade-offs exhibits good properties and treats most patients at the desirable dose levels. The method is illustrated with a real trial design for a breast medical oncology study.

3.
With the increasing use of survival models in animal breeding to address the genetic aspects of longevity of livestock and of disease traits, methods to infer genetic correlations and to carry out multivariate evaluations of survival traits together with other types of traits have become increasingly important. In this study we derived and implemented a bivariate quantitative genetic model for a linear Gaussian trait and a survival trait that are genetically and environmentally correlated. For the survival trait, we considered the Weibull log-normal animal frailty model. A Bayesian approach using Gibbs sampling was adopted, and model parameters were inferred from their marginal posterior distributions. The required fully conditional posterior distributions are derived and implementation issues are discussed. The two Weibull baseline parameters were updated jointly using a Metropolis-Hastings step; the remaining model parameters with non-normalized fully conditional distributions were updated univariately using adaptive rejection sampling. Simulation results showed that the estimated marginal posterior distributions covered the true parameter values well and placed high density on them. In conclusion, the proposed method allows inference of additive genetic and environmental correlations, and multivariate genetic evaluation of a linear Gaussian trait and a survival trait.

4.
An interactive computer program, SWELL, displays and analyzes bivariate distributions generated by flow cytometers. SWELL is modular, with options available via a menu; it is written in Fortran and utilizes a video color display system. Data are accumulated as a bivariate distribution that is transferred to the computer as a 64 x 64 matrix. For ease of visualization, matrices are displayed in pseudocolor: the distribution values are broken into eight ranges, each range is represented by a color, and each element of the matrix is displayed in its assigned color. To allow pooling and comparison, distributions are aligned, edited, and standardized. Unknown samples are pooled or analyzed singly and compared to the normal pool by subtraction. Differences are displayed as pseudocolor matrices of sign, magnitude, or statistical magnitude in units of standard deviation. This last display, scaled to tolerance limits, readily reveals regions of significant difference between normal and abnormal samples. Counts within such regions can be compared to diagnose samples automatically.

5.
A fundamental problem in bioinformatics is to characterize the secondary structure of a protein, traditionally carried out by examining a scatterplot (Ramachandran plot) of the conformational angles. We examine two natural bivariate von Mises distributions, referred to as the Sine and Cosine models, which have five parameters and, for concentrated data, tend to a bivariate normal distribution. These are analyzed and their main properties derived. Conditions on the parameters are established that result in bimodal behavior for the joint density and the marginal distribution, and we note an interesting situation in which the joint density is bimodal but the marginal distributions are unimodal. We compare the two models and find that the Cosine model may be preferred. Mixture distributions of the Cosine model are fitted to two representative protein datasets using the expectation-maximization algorithm, which results in an objective partition of the scatterplot into a number of components. Our results are consistent with empirical observations; new insights are discussed.

6.
In most quantitative trait loci (QTL) mapping studies, phenotypes are assumed to follow normal distributions. Deviations from this assumption may affect the accuracy of QTL detection and lead to detection of false positive QTL. To improve the robustness of QTL mapping methods, we replace the normal distribution assumption for residuals in a multiple-QTL model with a Student-t distribution that is able to accommodate residual outliers. A robust Bayesian mapping strategy is proposed on the basis of Bayesian shrinkage analysis for QTL effects. Simulations show that the robust Bayesian mapping approach can substantially increase the power of QTL detection when the normality assumption does not hold, while applying it to data that are already normally distributed does not influence the result. The proposed method is applied to mapping QTL for traits associated with physicochemical characteristics and quality in rice. As in the simulation study, in the real-data case the robust approach detected additional QTL compared to the traditional approach. The program implementing the method is available on request from the first or the corresponding author. Xin Wang and Zhongze Piao contributed equally to this study.

7.
The selection of a specific statistical distribution as a model for describing the population behavior of a given variable is seldom a simple problem. One strategy consists in testing different distributions (normal, lognormal, Weibull, etc.) and selecting the one that provides the best fit to the observed data while being the most parsimonious. Alternatively, one can make a choice based on theoretical arguments and simply fit the corresponding parameters to the observed data. In either case, different distributions can give similar results and provide almost equivalent models for a given data set. Model selection can be more complicated when the goal is to describe a trend in the distribution of a given variable, because changes in shape and skewness are difficult to represent with a single distributional form. As an alternative to complicated families of distributions as models for data, the S-distribution [Voit, E. O. (1992) Biom. J. 7, 855-878] provides a highly flexible mathematical form in which the density is defined as a function of the cumulative. S-distributions can accurately approximate many known continuous and unimodal distributions, preserving the well-known limit relationships between them. Besides representing well-known distributions, S-distributions provide an infinity of new possibilities that do not correspond to known classical distributions. Although the utility and performance of this general form have been clearly proved in different applications, its definition as a differential equation is a potential drawback for some problems. In this paper we obtain an analytical solution for the quantile equation that greatly simplifies the use of S-distributions, and we show the utility of this solution in different applications. After classifying the different qualitative behaviors of the S-distribution in parameter space, we show how to obtain S-distributions that satisfy specific constraints. One of the most interesting cases is the possibility of obtaining distributions that satisfy P(X ≤ Xc) = 0. We then demonstrate that the quantile solution facilitates the use of S-distributions in Monte Carlo experiments through the generation of random samples. Finally, we show how to fit an S-distribution to actual data, so that the resulting distribution can be used as a statistical model for them.
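The defining ODE makes the quantile-based sampling idea easy to sketch. The S-distribution satisfies dF/dX = alpha * (F^g - F^h) with anchor F(X0) = F0; inverting this gives the quantile function X(F), which yields random samples by inverse-CDF sampling. The sketch below integrates dX/dF numerically rather than using the paper's analytical solution, and the parameter values are illustrative only:

```python
# Sketch: quantile function of an S-distribution from its defining ODE,
# dF/dX = alpha * (F**g - F**h), then inverse-CDF sampling from it.
import numpy as np
from scipy.integrate import quad

alpha, g, h = 1.0, 0.5, 2.0     # assumed illustrative parameters (g < h)
X0, F0 = 0.0, 0.5               # anchor point: the median sits at X0

def quantile(F):
    """X such that the CDF equals F, by integrating dX/dF from F0 to F."""
    val, _ = quad(lambda f: 1.0 / (alpha * (f**g - f**h)), F0, F)
    return X0 + val

# inverse-CDF sampling: feed uniform draws through the quantile function
rng = np.random.default_rng(1)
u = rng.uniform(0.05, 0.95, size=1000)   # avoid the extreme tails numerically
samples = np.array([quantile(f) for f in u])
```

The analytical quantile solution derived in the paper removes the need for the numerical integration shown here, which is exactly why it speeds up Monte Carlo work.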

8.
The copula of a bivariate distribution, constructed by making marginal transformations of each component, captures all the information in the bivariate distribution about the dependence between the two variables. For frailty models for bivariate data, the choice of a family of distributions for the random frailty corresponds to the choice of a parametric family for the copula. A class of tests of the hypothesis that the copula is in a given parametric family, with unspecified association parameter, based on bivariate right-censored data is proposed. These tests first make marginal Kaplan-Meier transformations of the data and then compare a non-parametric estimate of the copula to an estimate based on the assumed family of models. A number of options are available for choosing the scale and the distance measure for this comparison. Significance levels of the test are found by a modified bootstrap procedure. The procedure is used to check the appropriateness of a gamma or a positive stable frailty model in a set of survival data on Danish twins.
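The core comparison can be sketched in the uncensored case (the paper's Kaplan-Meier step handles censoring, which is omitted here for brevity): transform each margin to pseudo-uniform scale, compute an empirical copula, and compare it to the parametric copula implied by a gamma frailty, which is the Clayton copula. The theta value and the sup-type distance below are illustrative choices:

```python
# Sketch: empirical copula vs. a fitted/assumed Clayton copula (the copula
# implied by a gamma frailty), ignoring censoring for simplicity.
import numpy as np

rng = np.random.default_rng(2)
n, theta = 500, 2.0
# simulate Clayton(theta) data via the Marshall-Olkin frailty construction
w = rng.gamma(1.0 / theta, size=n)                       # gamma frailty
u1, u2 = (1.0 - np.log(rng.uniform(size=(2, n))) / w) ** (-1.0 / theta)

def emp_copula(s, t):
    """Empirical copula at (s, t) on the pseudo-uniform scale."""
    return np.mean((u1 <= s) & (u2 <= t))

def clayton(s, t, theta):
    """Clayton copula CDF, the copula of a gamma frailty model."""
    return (s**-theta + t**-theta - 1.0) ** (-1.0 / theta)

# sup-type distance over a grid (one of several possible distance measures)
grid = np.linspace(0.1, 0.9, 9)
dist = max(abs(emp_copula(s, t) - clayton(s, t, theta))
           for s in grid for t in grid)
```

In the paper's test the null distribution of such a distance is obtained by a modified bootstrap; here the data were generated from the assumed copula, so the distance is small.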

9.
MOTIVATION: We present a new approach to the analysis of images for complementary DNA microarray experiments. Image segmentation and intensity estimation are performed simultaneously by adopting a two-component mixture model. One component of this mixture corresponds to the distribution of the background intensity, while the other corresponds to the distribution of the foreground intensity. The intensity measurement is a bivariate vector consisting of red and green intensities. The background intensity component is modeled by the bivariate gamma distribution, whose marginal densities for the red and green intensities are independent three-parameter gamma distributions with different parameters. The foreground intensity component is taken to be the bivariate t distribution, with the constraint that the mean of the foreground is greater than that of the background for each of the two colors. The degrees of freedom of this t distribution are inferred from the data but can be specified in advance to reduce the computation time. Also, the covariance matrix is not restricted to being diagonal, so it allows for nonzero correlation between the red and green foreground intensities. This gamma-t mixture model is fitted by maximum likelihood via the EM algorithm. In a final step, nonparametric (kernel) smoothing is applied to the posterior probabilities of component membership. The main advantages of this approach are: (1) it enjoys the well-known strengths of a mixture model, namely flexibility and adaptability to the data; (2) it considers segmentation and intensity estimation simultaneously, not separately as in commonly used existing software, and works with the red and green intensities in a bivariate framework rather than estimating them separately via univariate methods; (3) the three-parameter gamma distribution for the background red and green intensities provides a much better fit than the normal (log-normal) or t distributions; (4) the bivariate t distribution for the foreground intensity provides a model that is less sensitive to extreme observations; (5) as a consequence of the aforementioned properties, it allows segmentation to be undertaken for a wide range of spot shapes, including doughnuts, sickle shapes, and artifacts. RESULTS: We apply our method for gridding, segmentation, and estimation to real cDNA microarray images and artificial data. Our method provides better segmentation results for spot shapes, as well as better intensity estimation, than the Spot and spotSegmentation R packages. It detected blank spots as well as bright artifacts in the real data, and estimated spot intensities with high accuracy for the synthetic data. AVAILABILITY: The algorithms were implemented in Matlab. The Matlab code implementing both the gridding and the segmentation/estimation is available upon request. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.

10.
A popular product-testing procedure is to obtain sensory intensity and liking ratings from the same consumers. Consumers are instructed to attend to a sensory attribute, such as sweetness, when generating their liking response. We propose a new model of this concurrent ratings task that conjoins a unidimensional Thurstonian model of the ratings on the sensory dimension with a probabilistic version of Coombs' (1964) unfolding model for the liking dimension. The model assumes that the sensory characteristic of the product has a normal distribution over consumers. An individual consumer selects a sensory rating by comparing the perceived value on the sensory dimension to a set of criteria that partitions the axis into intervals, with each value on the rating scale associated with a unique interval. To rate liking, the consumer imagines an ideal product, then computes the discrepancy, or distance, between the product as perceived by the consumer and this imagined ideal. A set of criteria constructed on this discrepancy dimension partitions the axis into intervals, each associated with a unique liking rating. The ideal product is assumed to have a univariate normal distribution over consumers on the sensory attribute evaluated. The model is shown to account for 94.2% of the variance in a set of sample data and to fit these data significantly better than a bivariate normal model. Keywords: concurrent ratings; Thurstonian scaling; Coombs' unfolding model; sensory and liking ratings.

11.
In dose-finding clinical studies, it is common for multiple endpoints to be of interest; for instance, efficacy and toxicity are both primary endpoints in clinical trials. In this article, we propose a joint model for correlated efficacy-toxicity outcomes constructed with an Archimedean copula, and extend the continual reassessment method (CRM) to a bivariate trial design in which the optimal dose for phase III is based on both efficacy and toxicity. In particular, since drug studies often observe a mix of continuous and discrete outcomes, we extend our joint model to mixed correlated outcomes. We demonstrate through simulations that our algorithm based on the Archimedean copula model has excellent operating characteristics.

12.
Based on Hutchinson's concept of the n-dimensional hypervolume and the relationship between species and resource use, we constructed a bio-geographical model of Picea crassifolia (Qinghai spruce) in a three-dimensional environmental resource space, and used the model to simulate the potential distribution of Picea crassifolia and its use of environmental resources. The results show that the optimal configuration for Picea crassifolia in the three-dimensional resource space of growing-season mean temperature, mean annual precipitation, and direct solar radiation is 9 °C, 360 mm, and 1.9 × 10^3 kW·h·m^-2, respectively; the fitted trivariate equation predicts the potential distribution area of Picea crassifolia over a large region and gives its growth status at the corresponding geographic locations.

13.
We use bootstrap simulation to characterize uncertainty in parametric distributions, including Normal, Lognormal, Gamma, Weibull, and Beta, commonly used to represent variability in probabilistic assessments. Bootstrap simulation enables one to estimate sampling distributions for sample statistics, such as distribution parameters, even when analytical solutions are not available. Using a two-dimensional framework for both uncertainty and variability, uncertainties in cumulative distribution functions were simulated. The mathematical properties of uncertain frequency distributions were evaluated in a series of case studies in which the parameters of each type of distribution were varied for sample sizes of 5, 10, and 20. For positively skewed distributions such as the Lognormal, Weibull, and Gamma, the range of uncertainty is widest at the upper tail of the distribution. For symmetric unbounded distributions, such as the Normal, the uncertainties are widest at both tails of the distribution. For bounded distributions, such as the Beta, the uncertainties are typically widest in the central portion of the distribution. Bootstrap simulation enables complex dependencies between sampling distributions to be captured. The effects of uncertainty, variability, and parameter dependencies were studied for several generic functional forms of models, including models in which two-dimensional random variables are added, multiplied, and divided, to show the sensitivity of model results to different assumptions regarding model input distributions, ranges of variability, and ranges of uncertainty, and to show the types of errors that may be obtained from mis-specification of parameter dependence. A total of 1,098 case studies were simulated. In some cases, counter-intuitive results were obtained. For example, the point value of the 95th percentile of uncertainty for the 95th percentile of variability of the product of four Gamma or Weibull distributions decreases as the coefficient of variation of each model input increases, and therefore may not provide a conservative estimate. Failure to properly characterize parameter uncertainties and their dependencies can lead to orders-of-magnitude mis-estimates of both variability and uncertainty. In many cases, the numerical stability of two-dimensional simulation results was found to decrease as the coefficient of variation of the inputs increased. We discuss the strengths and limitations of bootstrap simulation as a method for quantifying uncertainty due to random sampling error.
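The two-dimensional (uncertainty over variability) idea can be sketched for one of the distributions discussed. The sketch below, not the authors' code, bootstraps a small Lognormal sample and summarizes the sampling distribution of the 95th percentile of variability; sample size, seed, and number of bootstrap replicates are illustrative:

```python
# Sketch: bootstrap uncertainty about the 95th percentile of variability
# for a small sample assumed to come from a Lognormal distribution.
import numpy as np

rng = np.random.default_rng(3)
sample = rng.lognormal(mean=1.0, sigma=0.5, size=10)   # small sample, n = 10

B = 2000
p95 = np.empty(B)
for b in range(B):
    boot = rng.choice(sample, size=sample.size, replace=True)
    # Lognormal MLE: mean and std of the log-transformed resample
    mu, sigma = np.log(boot).mean(), np.log(boot).std(ddof=0)
    p95[b] = np.exp(mu + 1.6449 * sigma)   # 95th percentile of variability

# uncertainty interval for that percentile of variability
lo, hi = np.percentile(p95, [2.5, 97.5])
```

For a positively skewed input like this, the interval `(lo, hi)` is wide relative to intervals for central percentiles, matching the observation above that uncertainty is widest in the upper tail.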

14.
Lin M  Wu R 《Genetics》2005,170(2):919-928
Almost all drugs that produce a favorable response (efficacy) may also produce adverse effects (toxicity). The relative strengths of drug efficacy and toxicity, which vary across human populations, are controlled by the combined influence of multiple genes and environmental factors. Genetic mapping has proven to be a powerful tool for detecting and identifying specific DNA sequence variants on the basis of the haplotype map (HapMap) constructed from single-nucleotide polymorphisms (SNPs). In this article, we present a novel statistical model for sequence mapping of two different but related drug responses. The model incorporates mathematical functions of drug response to varying doses or concentrations, together with a statistical device used to model the correlated structure of the residual (co)variance matrix. We implement a closed-form solution for the EM algorithm to estimate the population genetic parameters of SNPs, and the simplex algorithm to estimate the curve parameters describing the pharmacodynamic changes of different genetic variants and the matrix-structuring parameters. Extensive simulations are performed to investigate the statistical properties of our model. The implications of our model for pharmacogenetic and pharmacogenomic research are discussed.

15.
In most QTL mapping studies, phenotypes are assumed to follow normal distributions. Deviations from this assumption may lead to detection of false positive QTL. To improve the robustness of Bayesian QTL mapping methods, the normal distribution for residuals is replaced with a skewed Student-t distribution. The latter distribution is able to account for both heavy tails and skewness, each controlled by a single parameter. The Bayesian QTL mapping method using a skewed Student-t distribution is evaluated with simulated data sets under five different scenarios of residual error distributions and QTL effects.
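Why a t residual model is more robust than a normal one can be seen with a quick fit comparison. The sketch below uses scipy's plain (symmetric) Student-t rather than the skewed variant in the abstract, and the outlier values are illustrative: with a few extreme residuals, the t fit keeps a scale close to the bulk's spread while the normal fit inflates its standard deviation.

```python
# Sketch: normal vs. Student-t fit to residuals contaminated by outliers.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
resid = np.concatenate([rng.normal(0, 1, 300),
                        [15.0, -12.0, 20.0]])      # three gross outliers

mu_n, sd_n = stats.norm.fit(resid)                 # normal MLE
df_t, loc_t, scale_t = stats.t.fit(resid)          # t MLE (df estimated)

# the t scale tracks the N(0,1) bulk; the normal sd is pulled up by outliers
print(sd_n, scale_t)
```

In a QTL model the same effect means outlying phenotypes no longer masquerade as (or mask) QTL signals, because they are absorbed by the heavy tails rather than by inflated residual variance.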

16.
Many of the functional traits considered in animal breeding can be analyzed as threshold traits or survival traits, with examples including disease traits, conformation scores, calving difficulty, and longevity. In this paper we derive and implement a bivariate quantitative genetic model for a threshold character and a survival trait that are genetically and environmentally correlated. For the survival trait, we considered the Weibull log-normal animal frailty model. A Bayesian approach using Gibbs sampling was adopted, in which model parameters were augmented with unobserved liabilities associated with the threshold trait. The fully conditional posterior distributions associated with parameters of the threshold trait reduced to well-known distributions. For the survival trait, the two baseline Weibull parameters were updated jointly by a Metropolis-Hastings step; the remaining model parameters with non-normalized fully conditional distributions were updated univariately using adaptive rejection sampling. The Gibbs sampler was tested in a simulation study and illustrated in a joint analysis of calving difficulty and longevity of dairy cattle. The simulation study showed that the estimated marginal posterior distributions covered the true values well and placed high density on them. The analysis of calving difficulty and longevity showed that genetic variation exists for both traits. The additive genetic correlation was moderately favorable, with a marginal posterior mean of 0.37 and a 95% central posterior credibility interval ranging from 0.11 to 0.61. This study therefore suggests that selection for improving one of the two traits will be beneficial for the other trait as well.

17.
MOTIVATION: In most quantitative trait locus (QTL) mapping studies, phenotypes are assumed to follow normal distributions. Deviations from this assumption may affect the accuracy of QTL detection and lead to detection of spurious QTL. To improve the robustness of QTL mapping methods, we replaced the normal distribution for residuals in multiple interacting QTL models with the normal/independent distributions, a class of symmetric, long-tailed distributions able to accommodate residual outliers. We then developed a Bayesian robust analysis strategy for dissecting the genetic architecture of quantitative traits and for mapping genome-wide interacting QTL in line crosses. RESULTS: Through computer simulations, we showed that our strategy has power for QTL detection similar to that of traditional methods assuming normally distributed traits, but substantially increased power for non-normal phenotypes. When the strategy was applied to a group of traits associated with physical/chemical characteristics and quality in rice, more main-effect and epistatic QTL were detected than with traditional Bayesian model analyses under the normality assumption.

18.
Quantitative trait loci (QTL) are usually searched for using classical interval mapping methods, which assume that the trait of interest follows a normal distribution. However, these methods cannot accommodate features typical of survival data, such as a non-normal distribution and the presence of censored records. We propose two new QTL detection approaches that allow for censored data. One interval mapping method uses a Weibull model (W), which is popular in parametric modelling of survival traits; the other uses a Cox model (C), which avoids any assumption about the trait distribution. Data were simulated following the structure of a published experiment. Using the simulated data, we compare W, C, and a classical interval mapping method using a Gaussian model on uncensored data (G) or on all data (G' = censored data analysed as though records were uncensored). An appropriate mathematical transformation was used for all parametric methods (G, G', and W). When data were not censored, the four methods gave similar results. However, when some data were censored, the power of QTL detection and the accuracy of QTL location and of estimated QTL effects for G decreased considerably with censoring, particularly when censoring occurred at a fixed date. This decrease was also observed with G', but was less severe. Censoring had a negligible effect on results obtained with the W and C methods.
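The ingredient that lets method W handle censoring is the censored Weibull likelihood: observed failures contribute log f(t), right-censored records contribute log S(t). A minimal sketch of that estimation step (without the QTL scan itself; data, censoring date, and parametrization are illustrative):

```python
# Sketch: maximum likelihood for a Weibull survival model with
# right-censoring at a fixed date, as used inside interval mapping (W).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
true_shape, true_scale = 2.0, 10.0
t = true_scale * rng.weibull(true_shape, size=400)   # latent failure times
c = np.minimum(t, 12.0)                 # censor at a fixed date (12)
delta = (t <= 12.0).astype(float)       # 1 = event observed, 0 = censored

def negloglik(params):
    k, lam = np.exp(params)             # log-parametrization keeps k, lam > 0
    z = c / lam
    log_f = np.log(k / lam) + (k - 1) * np.log(z) - z**k   # log density
    log_S = -z**k                                          # log survival
    return -np.sum(delta * log_f + (1 - delta) * log_S)

res = minimize(negloglik, x0=np.log([1.0, 5.0]), method="Nelder-Mead")
k_hat, lam_hat = np.exp(res.x)          # recover shape and scale estimates
```

Because the censored records still contribute through log S(t), the estimates stay close to the true parameters, which is why W loses little power under censoring while the Gaussian methods do.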

19.
In this paper we describe and test a new method for characterizing the space-use patterns of individual animals on the basis of successive locations of marked individuals. Existing methods either do not describe space use in probabilistic terms (e.g., the maximum distance between locations or the area of the convex hull of all locations) or they assume a priori knowledge of the probabilistic shape of each individual's use pattern (e.g., bivariate or circular normal distributions). We develop a method for calculating a probability-of-location distribution for an average individual member of a population that requires no assumptions about the shape of the distribution; we call this the population utilization distribution, or PUD. Using nine different sets of location data, we demonstrate that these distributions accurately characterize the space-use patterns of the populations from which they were derived. The assumption of normality is found to result in a consistent and significant overestimate of the area of use. We then describe a function relating probability of location to area (termed the MAP index) that has a number of advantages over existing size indices. Finally, we show how quantities such as the MAP index derived from our average distributions can be subjected to standard statistical tests of significance.
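The probability-to-area relationship can be sketched with a modern nonparametric stand-in for the PUD: a kernel density estimate of the locations, from which one reads off the area of the region holding a given fraction of the probability. This is an illustration of the idea, not the paper's estimator, and the two-patch synthetic data are illustrative:

```python
# Sketch: kernel utilization distribution and the area enclosing 95% of
# the probability of location (the probability-vs-area idea behind MAP).
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(6)
# synthetic locations: a decidedly non-normal, two-patch use pattern
locs = np.vstack([rng.normal([0, 0], 0.5, (150, 2)),
                  rng.normal([4, 1], 0.8, (150, 2))]).T   # shape (2, 300)

kde = gaussian_kde(locs)
# evaluate on a grid and find the density threshold enclosing 95% probability
xx, yy = np.mgrid[-3:8:200j, -4:5:200j]
dens = kde(np.vstack([xx.ravel(), yy.ravel()]))
cell = (xx[1, 0] - xx[0, 0]) * (yy[0, 1] - yy[0, 0])      # grid-cell area
order = np.sort(dens)[::-1]
cum = np.cumsum(order) * cell                             # prob. vs. area
idx = min(np.searchsorted(cum, 0.95), order.size - 1)
area95 = np.sum(dens >= order[idx]) * cell                # 95% use area
```

Fitting a single bivariate normal to these two-patch data would sweep in the empty space between the patches, reproducing the overestimate of use area that the paper reports for the normality assumption.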

20.
Univariate and bivariate analyses of cholesterol and triglycerides are performed, after appropriate age adjustment, on 247 individuals in 33 families in which the probands have elevations of cholesterol, low-density lipoprotein, and triglycerides, and the type IIb lipoprotein phenotype. Mixtures of lognormal distributions are fitted by maximum likelihood to the data. The best-fitting single lognormal distribution and mixtures of lognormal distributions are compared with empirical cumulative plots, and the likelihood-ratio criterion is used to test for significance. A mixture of two lognormal distributions fits significantly better than one lognormal distribution for cholesterol but not for triglycerides. When a mixture of bivariate lognormals is fitted to the data, only one local maximum is found, suggesting the action of a single genetic determinant in this sample. The best cutoff line is almost parallel to the triglyceride axis, indicating the relatively high involvement of cholesterol compared to triglycerides in separating the normal and abnormal groups. Using the best linear function, the difference in the two bivariate means is found to account for 61% of the total variation in log cholesterol and log triglycerides. To determine whether the results are due to enrichment of the sample with the familial hypercholesterolemia syndrome, seven families in which the proband and/or any relative has tendon xanthomas are removed and the analyses repeated on the remaining 26 kindreds. The results of these analyses are virtually the same as those for the total sample. Also, a subsample of 21 families in which the proband and at least one additional kindred member are affected is analyzed in the same manner, with similar results. For comparison, data from a study of families with combined hyperlipidemia [1] are analyzed in an analogous manner, bearing in mind that the populations sampled are probably different. Fitting a mixture of two bivariate distributions and finding the best cutoff for these data indicate that triglycerides are more involved in separating the two groups. Probably because of major differences in ascertainment, the distribution of lipid levels in our patient group is practically indistinguishable from that of hypercholesterolemia, while the Seattle data [1] are more nearly similar to hypertriglyceridemia. It may be premature to consider familial combined hyperlipidemia as an entity distinct from both hypercholesterolemia and hypertriglyceridemia. We hope it will eventually be possible to analyze these data using a refined genetic model that includes both major-gene and polygenic effects, and to combine this form of analysis with quantitative tissue culture methods.
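The univariate mixture fit described above can be sketched directly: a two-component lognormal mixture is a two-component normal mixture on the log scale, which a plain EM algorithm fits, and the likelihood-ratio against a single lognormal follows. This is a generic illustration on synthetic data, not the study's data or code, and the component parameters are illustrative:

```python
# Sketch: EM fit of a two-component lognormal mixture (normal mixture on
# the log scale) and the likelihood-ratio against a single lognormal.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)
x = np.log(np.concatenate([rng.lognormal(5.0, 0.15, 180),
                           rng.lognormal(5.7, 0.20, 60)]))  # log "cholesterol"

# EM for a 1-D two-component Gaussian mixture
pi, mu, sd = 0.5, np.array([x.min(), x.max()]), np.array([x.std(), x.std()])
for _ in range(200):
    r1 = pi * norm.pdf(x, mu[0], sd[0])
    r2 = (1 - pi) * norm.pdf(x, mu[1], sd[1])
    g = r1 / (r1 + r2)                          # E-step: responsibilities
    pi = g.mean()                               # M-step: weight, means, sds
    mu = np.array([np.average(x, weights=g), np.average(x, weights=1 - g)])
    sd = np.sqrt([np.average((x - mu[0])**2, weights=g),
                  np.average((x - mu[1])**2, weights=1 - g)])

ll2 = np.sum(np.log(pi * norm.pdf(x, mu[0], sd[0])
                    + (1 - pi) * norm.pdf(x, mu[1], sd[1])))
ll1 = np.sum(norm.logpdf(x, x.mean(), x.std()))   # single lognormal fit
lr = 2 * (ll2 - ll1)                              # likelihood-ratio statistic
```

As in the paper, a clearly positive likelihood-ratio statistic favors the two-component mixture; calibrating its significance is nonstandard because the single-component model sits on the boundary of the mixture family.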


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号