Similar literature
 Found 20 similar articles (search time: 593 ms)
1.
Mapping quantitative trait loci (QTL) underlying complex discrete traits, which usually show discontinuous distributions and carry less information, is challenging with conventional statistical methods. The Bayesian-Markov chain Monte Carlo (Bayesian-MCMC) approach is a key procedure for mapping QTL for complex binary traits; it provides a complete posterior distribution for the QTL parameters using all prior information. As a consequence, Bayesian estimates of all variables of interest can be obtained straightforwardly from the posterior samples simulated by the MCMC algorithm. In our study, the utility of Bayesian-MCMC is demonstrated using several simulated outbred animal full-sib families with different family structures for a complex binary trait underlain by both a QTL and a polygene. Under the identity-by-descent-based variance component random model, three MCMC-based samplers (Gibbs sampling, the Metropolis algorithm and reversible jump MCMC) were implemented to generate the joint posterior distribution of all unknowns, so that the QTL parameters were obtained by Bayesian statistical inference. The results showed that the Bayesian-MCMC approach works well and is robust under different family structures and QTL effects. As family size increases and the number of families decreases, the accuracy of the parameter estimates improves. When the true QTL has a small effect, an outbred population experimental design with large family size is the optimal mapping strategy.

2.
Xie J B  Liu T  Wei P  Jia Y M  Luo C 《农业工程》2007,27(7):2704-2714
Ecological experiments are usually conducted on small scales, whereas ecological and environmental issues usually arise on large scales. Hence, there is a clear need for scaling: to deal with patterns and processes on larger scales, a connection must be established with the small scales that we are familiar with. Here we present a wavelet analysis method that builds relationships between spatial distribution patterns on different scales. With this method, we also studied how spatial heterogeneity and distribution patterns changed with scale. We investigated the distribution and habitat of C. ewersmanniana in two plots (200 m × 200 m; the distance between the two plots is 15 km) in the Mosuowan desert. The results demonstrated that spatial heterogeneity and distribution patterns were incorporated into larger scales when the wavelet scale varied from one (5 m) to four (20 m). However, when the wavelet scale was above five (25 m), the spatial distribution patterns varied smoothly, the oscillation frequency of landforms stabilized at 110 m, and the dynamic quantity period of C. ewersmanniana stabilized at 115–125 m. We also identified signal mutation points with wavelet analysis and verified the degree of local spatial heterogeneity with position variance. We found that position variance decomposed the distribution patterns on large scales into small sampling plots, and that the position with the largest variance also had the strongest heterogeneity. In short, the wavelet analysis method can scale up spatial distribution patterns and habitat heterogeneity. With this method and the tools derived from it, such as wavelet scale, wavelet variance, position variance and highly intuitive graphs, wavelet analysis could be widely applied to the scaling problem in ecological and environmental studies.

3.
4.
Comparison of different methods for estimating genetic parameters   Total citations: 6 (self-citations: 0; citations by others: 6)
Five methods for estimating (co)variance components (ANOVA, Henderson-III, MLMT, REMLMT and MIVQUE) were compared by computer simulation. The results showed that REMLMT gave good parameter estimates with relatively high accuracy for all data sets, whereas the ANOVA estimates deviated most from the true values, especially for small and unbalanced data sets. Population size and structure also affected the performance of every method: when the population was small or the data came from too few sires, the estimation error of the genetic parameters increased and accuracy decreased.

5.
MIXED MODEL APPROACHES FOR ESTIMATING GENETIC VARIANCES AND COVARIANCES   Total citations: 62 (self-citations: 4; citations by others: 58)
The limitations of analysis of variance (ANOVA) methods in estimating genetic variances are discussed. Among the three methods for mixed linear models (maximum likelihood, ML; restricted maximum likelihood, REML; and minimum norm quadratic unbiased estimation, MINQUE), the MINQUE method is presented with formulae for estimating variance and covariance components and for predicting genetic effects. Several genetic models that cannot be appropriately analyzed by ANOVA methods are introduced in the form of mixed linear models. Genetic models with independent random effects can be analyzed by the MINQUE(1) method, which is a MINQUE method with all prior values set to 1. The MINQUE(1) method can give unbiased estimates of variance and covariance components and linear unbiased prediction (LUP) of genetic effects. There are more complicated genetic models for plant seeds that involve correlated random effects. The MINQUE(0/1) method, which is a MINQUE method with all prior covariances set to 0 and all prior variances set to 1, is suitable for estimating variance and covariance components in these models. Mixed model approaches have an advantage over ANOVA methods in their capacity to analyze unbalanced data and complicated models. Some problems of estimation and hypothesis testing with the MINQUE method are discussed.

6.
7.
Nonlinear mixed-effects (NLME) models have become popular in various disciplines over the past several decades. However, the existing methods for parameter estimation implemented in standard statistical packages such as SAS and R/S-Plus are generally limited to single- or multi-level NLME models that only allow nested random effects and are unable to cope with crossed random effects within the framework of NLME modeling. In this study, we propose a general formulation of NLME models that can accommodate both nested and crossed random effects, and then develop a computational algorithm for parameter estimation based on normal assumptions. The maximum likelihood estimation is carried out using the first-order conditional expansion (FOCE) for NLME model linearization and sequential quadratic programming (SQP) for computational optimization while ensuring positive-definiteness of the estimated variance-covariance matrices of both random effects and error terms. The FOCE-SQP algorithm is evaluated using the height and diameter data measured on trees from Korean larch (L. olgensis var. changpaiensis) experimental plots as well as simulation studies. We show that the FOCE-SQP method converges fast with high accuracy. Applications of the general formulation of NLME models are illustrated with an analysis of the Korean larch data.

8.
Microarrays have become a popular biotechnology in biological and medical research. However, systematic and stochastic variability in microarray data is expected and unavoidable, so the raw measurements carry inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed directly by various clustering methods, which may lead to biased interpretations when identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches is proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrate the application of our method with a gene expression dataset and demonstrate the utility of our approach using an external validation.

9.
Unsteady aerodynamic characteristics of a seagull wing in level flight are investigated using a boundary element method. A new no-penetration boundary condition is imposed on the surface of the wing by considering its deformation. The geometry and kinematics of the seagull wing are reproduced using the functions and data in the previously published literature. The proposed method is validated by comparing the computed results with the published data in the literature. The unsteady aerodynamic characteristics of the seagull wing are investigated by changing flapping frequency and advance ratio. It is found that the peak values of aerodynamic coefficients increase with the flapping frequency. The thrust and drag generations are complicated functions of frequency and wing stroke motions. The lift is inversely proportional to the advance ratio. The effects of several flapping modes on the lift and induced drag (or thrust) generation are also investigated. Among three single modes (flapping, folding and lead & lag), flapping generates the largest lift and can produce thrust alone. For three combined modes, both flapping/folding and flapping/lead & lag can produce lift and thrust larger than the flapping-alone mode can. Folding is shown to increase thrust when combined with flapping, whereas lead & lag has an effect of increasing the lift when also combined with flapping. When three modes are combined together, the bird can obtain the largest lift among the investigated modes. Even though the proposed method is limited to the inviscid flow assumption, it is believed that this method can be used in the design of flapping micro aerial vehicles.

10.
Protein domains are conserved and functionally independent structures that play an important role in interactions among related proteins. Domain-domain interactions have been recently used to predict protein-protein interactions (PPI). In general, the interaction probability of a pair of domains is scored using a trained scoring function. When the score satisfies a threshold, the protein pairs carrying those domains are regarded as "interacting". In this study, the signature contents of proteins were utilized to predict PPI pairs in Saccharomyces cerevisiae, Caenorhabditis elegans, and Homo sapiens. Similarity between protein signature patterns was scored and PPI predictions were drawn based on the binary similarity scoring function. Results show that the true positive rate of prediction by the proposed approach is approximately 32% higher than that using the maximum likelihood estimation method when compared with a test set, resulting in a 22% increase in the area under the receiver operating characteristic (ROC) curve. When proteins containing one or two signatures were removed, the sensitivity of the predicted PPI pairs increased significantly. The predicted PPI pairs are on average 11 times more likely to interact than the random selection at a confidence level of 0.95, and on average 4 times better than those predicted by either phylogenetic profiling or gene expression profiling.

11.
The case-cohort study involves two-phase samplings: simple random sampling from an infinite superpopulation at phase one and stratified random sampling from a finite cohort at phase two. Standard analyses of case-cohort data involve solution of inverse probability weighted (IPW) estimating equations, with weights determined by the known phase two sampling fractions. The variance of parameter estimates in (semi)parametric models, including the Cox model, is the sum of two terms: (i) the model-based variance of the usual estimates that would be calculated if full data were available for the entire cohort; and (ii) the design-based variance from IPW estimation of the unknown cohort total of the efficient influence function (IF) contributions. This second variance component may be reduced by adjusting the sampling weights, either by calibration to known cohort totals of auxiliary variables correlated with the IF contributions or by their estimation using these same auxiliary variables. Both adjustment methods are implemented in the R survey package. We derive the limit laws of coefficients estimated using adjusted weights. The asymptotic results suggest practical methods for construction of auxiliary variables that are evaluated by simulation of case-cohort samples from the National Wilms Tumor Study and by log-linear modeling of case-cohort data from the Atherosclerosis Risk in Communities Study. Although not semiparametric efficient, estimators based on adjusted weights may come close to achieving full efficiency within the class of augmented IPW estimators.
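
The weighting idea in the abstract above can be illustrated with a minimal sketch: each phase-two subject is weighted by the inverse of its known sampling fraction to estimate a cohort total (a Horvitz-Thompson-type estimator). The function name `ipw_total` and the toy numbers are hypothetical; they are not taken from the paper or from the R survey package, and the weight adjustments (calibration, estimation) discussed in the abstract are not shown.

```python
from typing import Iterable, Tuple


def ipw_total(sample: Iterable[Tuple[float, float]]) -> float:
    """Estimate a cohort total from a phase-two sample.

    Each item is (value, sampling_fraction), where sampling_fraction is the
    known probability that a subject in that stratum was drawn at phase two.
    """
    total = 0.0
    for value, fraction in sample:
        if not 0.0 < fraction <= 1.0:
            raise ValueError("sampling fraction must lie in (0, 1]")
        total += value / fraction  # weight = 1 / sampling fraction
    return total


# Hypothetical data: one fully sampled case and two subsampled controls.
toy_sample = [(2.0, 1.0), (3.0, 0.5), (1.0, 0.25)]
estimate = ipw_total(toy_sample)
```

Subjects drawn with a small sampling fraction stand in for many unsampled cohort members, which is why they receive large weights.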

12.
Liu Wenzhong  Wang Qinde 《遗传学报》(Acta Genetica Sinica) 2004,31(7):695-700
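This study explored methods for computing confidence intervals for genetic parameter estimates obtained by Method R and the effect of the number of replicate estimates (NORE) on the parameter estimates. Data sets were generated by simulation under four models. The base population contained 200 sires and 2000 dams, and selection on BLUP breeding values was carried out for five generations. Variance components were estimated using the multivariate multiplicative iteration (MMI) method, with the mixed model equations solved by the preconditioned conjugate gradient (PCG) method. The means, standard errors and confidence intervals of the parameter estimates were computed by the classical method, the classical method after Box-Cox transformation, and the bootstrap. The results showed that all three methods are adequate when the number of replicate estimates is large, whereas the bootstrap is recommended when it is small. Simple models require fewer replicate estimates, while complex models require more. As the number of random effects in the model increased, direct heritability was overestimated. As the numbers of PCG and MMI rounds increased, the parameter estimates tended to be underestimated.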

13.
Eventing competitions in Great Britain (GB) comprise three disciplines, each split into four grades, yielding 12 discipline-grade traits. As there is a demand for tools to estimate (co)variance matrices with a large number of traits, the aim of this work was to investigate different methods to produce large (co)variance matrices using GB eventing data. Data from 1999 to 2008 were used and penalty points were converted to normal scores. A sire model was utilised to estimate fixed effects of gender, age and class, and random effects of sire, horse and rider. Three methods were used to estimate (co)variance matrices. Method 1 used a method based on Gibbs sampling and data augmentation and imputation. Methods 2a and 2b combined sub-matrices from bivariate analyses; one took samples from a multivariate Normal distribution defined by the covariance matrix from each bivariate analysis, then analysed these data in a 12-trait multivariate analysis; the other replaced negative eigenvalues in the matrix with positive values to obtain a positive definite (co)variance matrix. A formal comparison of models could not be conducted; however, estimates from all methods, particularly Methods 2a/2b, were in reasonable agreement. The computational requirements of Method 1 were much less compared with Methods 2a or 2b. Method 2a heritability estimates were as follows: for dressage 7.2% to 9.0%, for show jumping 8.9% to 16.2% and for cross-country 1.3% to 1.4%. Method 1 heritability estimates were higher for the advanced grades, particularly for dressage (17.1%) and show jumping (22.6%). Irrespective of the model, genetic correlations between grades, for dressage and show jumping, were positive, high and significant, ranging from 0.59 to 0.99 for Method 2a and 0.78 to 0.95 for Method 1. 
For cross-country, using Method 2a, genetic correlations were only significant between novice and pre-novice (0.75); however, using Method 1 estimates were all significant and low to moderate (0.36 to 0.70). Between-discipline correlations were all low and of mixed sign. All methods produced positive definite 12 × 12 (co)variance matrices, suitable for the prediction of breeding values. Method 1 benefits from much reduced computational requirements, and by performing a true multivariate analysis.

14.
Summary: Effects of data imbalance on bias, sampling variance and mean square error of heritability estimated with variance components were examined using a random two-way nested classification. Four designs, ranging from zero imbalance (balanced data) to low, medium and high imbalance, were considered for each of four combinations of heritability (h² = 0.2 and 0.4) and sample size (N = 120 and 600). Observations were simulated for each design by drawing independent pseudo-random deviates from normal distributions with zero means, and variances determined by heritability. There were 100 replicates of each simulation; the same design matrix was used in all replications. Variance components were estimated by analysis of variance (Henderson's Method 1) and by maximum likelihood (ML). For the design and model used in this study, bias in heritability based on Method 1 and ML estimates of variance components was negligible. Effect of imbalance on variance of heritability was smaller for ML than for Method 1 estimation, and was smaller for heritability based on estimates of sire-plus-dam variance components than for heritability based on estimates of sire or dam variance components. Mean square error for heritability based on estimates of sire-plus-dam variance components appears to be less sensitive to data imbalance than heritability based on estimates of sire or dam variance components, especially when using Method 1 estimation. Estimation of heritability from sire-plus-dam components was insensitive to differences in data imbalance, especially for the larger sample size. Supported by grants from the Illinois Agricultural Experiment Station and the University of Illinois Research Board. Charles Smith, H. W. Norton and D. Gianola contributed valuable suggestions.
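
As a rough illustration of how heritability is derived from the sire, dam and within-family variance components of a two-way nested design, the sketch below applies the usual textbook relations (sire or dam component times 4, or sire-plus-dam sum times 2, divided by the phenotypic variance). These relations are an assumption here, since the abstract does not state its formulas, and the function name and numbers are hypothetical.

```python
def heritabilities(var_sire: float, var_dam: float, var_within: float) -> dict:
    """Heritability estimates from nested sire/dam variance components.

    Assumes the textbook half-sib/full-sib relations; real analyses may
    need corrections (e.g. dominance or maternal effects in the dam term).
    """
    var_total = var_sire + var_dam + var_within  # phenotypic variance
    if var_total <= 0.0:
        raise ValueError("total variance must be positive")
    return {
        "sire": 4.0 * var_sire / var_total,
        "dam": 4.0 * var_dam / var_total,
        "sire_plus_dam": 2.0 * (var_sire + var_dam) / var_total,
    }


# Hypothetical components chosen so that every estimate equals 0.2.
h2 = heritabilities(var_sire=0.05, var_dam=0.05, var_within=0.90)
```

The sire-plus-dam estimate pools two components, which is consistent with the abstract's finding that it is less sensitive to imbalance than either single-component estimate.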

15.
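Recently, interest in estimating deleterious gene mutation (DGM) parameters from mutation accumulation (MA) experiments has grown considerably. Two DGM estimation methods are commonly used in MA experiments: maximum likelihood (ML) and the method of moments (MM). The two methods were compared by computer simulation and by application to real data. The conclusions are as follows: maximum likelihood estimates (MLEs) are difficult to obtain with the ML method, so ML is less efficient than MM; even when MLEs can be obtained, they are biased because of serious small-sample error (in terms of bias and sampling variance); and the likelihood function is rather flat, which makes it difficult to distinguish leptokurtic from platykurtic distributions.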

16.
Data from a litter-matched tumorigenesis experiment are analysed using a generalised linear mixed model (GLMM) approach to the analysis of clustered survival data in which there is a dependence of failure time observations within the same litter. Maximum likelihood (ML) and residual maximum likelihood (REML) estimates of risk variable parameters, variance component parameters and the prediction of random effects are given. Estimation of the treatment effect parameter (carcinogen effect) is in good agreement with previous analyses in the literature, though the dependence structure within a litter is modelled in different ways. The variance component estimation provides the estimated dispersion of the random effects. The prediction of random effects is useful, for instance, in identifying high-risk litters and individuals. The present analysis illustrates its wider application to detecting increased risk of occurrence of disease in particular families of a study population.

17.
The value of an ecological indicator is no better than the uncertainty associated with its estimate. Nevertheless, indicator uncertainty is seldom estimated, even though legislative frameworks such as the European Water Framework Directive stress that the confidence of an assessment should be quantified. We introduce a general framework for quantifying uncertainties associated with indicators employed to assess ecological status in waterbodies. The framework is illustrated with two examples: eelgrass shoot density and chlorophyll a in coastal ecosystems. Aquatic monitoring data vary over time and space; variations that can only partially be described using fixed parameters, and remaining variations are deemed random. These spatial and temporal variations can be partitioned into uncertainty components operating at different scales. Furthermore, different methods of sampling and analysis as well as people involved in the monitoring introduce additional uncertainty. We have outlined 18 different sources of variation that affect monitoring data to a varying degree and are relevant to consider when quantifying the uncertainty of an indicator calculated from monitoring data. However, in most cases it is not possible to estimate all relevant sources of uncertainty from monitoring data from a single ecosystem, and those uncertainty components that can be quantified will not be well determined due to the lack of replication at different levels of the random variations (e.g. number of stations, number of years, and number of people). For example, spatial variations cannot be determined from datasets with just one station. Therefore, we recommend that random variations are estimated from a larger dataset, by pooling observations from multiple ecosystems with similar characteristics. 
We also recommend accounting for predictable patterns in time and space using parametric approaches in order to reduce the magnitude of the unpredictable random components and reduce potential bias introduced by heterogeneous monitoring across time. We propose to use robust parameter estimates for both fixed and random variations, determined from a large pooled dataset and assumed common across the range of ecosystems, and estimate a limited subset of parameters from ecosystem-specific data. Partitioning the random variation onto multiple uncertainty components is important to obtain correct estimates of the ecological indicator variance, and the magnitude of the different components provide useful information for improving methods applied and design of monitoring programs. The proposed framework allows comparing different indicators based on their precision relative to the cost of monitoring.

18.
Z. B. Zeng  D. Houle    C. C. Cockerham 《Genetics》1990,126(1):235-247
S. Wright suggested an estimator, m̂, of the number of loci, m, contributing to the difference in a quantitative character between two differentiated populations, which is calculated from the phenotypic means and variances in the two parental populations and their F1 and F2 hybrids. The same method can also be used to estimate m contributing to the genetic variance within a single population, by using divergent selection to create differentiated lines from the base population. In this paper we systematically examine the utility and problems of this technique under the influences of unequal allelic effects and initial allele frequencies, and linkage, which are known to lead m̂ to underestimate m. In addition, we examine the effects of population size and selection intensity during the generations of selection. During selection, the estimator m̂ rapidly approaches its expected value at the selection limit. With reasonable assumptions about unequal allelic effects and initial allele frequencies, the expected value of m̂ without linkage is likely to be on the order of one-third of the number of genes. The estimates suffer most seriously from linkage. The practical maximum expectation of m̂ is just about the number of chromosomes, considerably less than the "recombination index" which has been assumed to be the upper limit. The estimates are also associated with large sampling variances. An estimator of the variance of m̂ derived by R. Lande substantially underestimates the actual variance. Modifications to the method can ameliorate some of the problems. These include using F3 or later generation variances or the genetic variance in the base population, and replicating the experiments and estimation procedure. However, even in the best of circumstances, information from m̂ is very limited and can be misleading.
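
For orientation, the sketch below computes one common textbook form of this kind of estimator, often called the Castle-Wright estimator: the squared difference of the parental means divided by eight times the segregation variance, with the segregation variance approximated as Var(F2) minus Var(F1). The abstract does not give the formula, so this particular form, the function name and the numbers are assumptions for illustration only.

```python
def castle_wright(mean_p1: float, mean_p2: float,
                  var_f1: float, var_f2: float) -> float:
    """Textbook-style estimate of the number of loci from line crosses.

    Assumed form: (mean_p1 - mean_p2)^2 / (8 * (var_f2 - var_f1)).
    """
    var_segregation = var_f2 - var_f1  # variance released by segregation in F2
    if var_segregation <= 0.0:
        raise ValueError("F2 variance must exceed F1 variance")
    return (mean_p1 - mean_p2) ** 2 / (8.0 * var_segregation)


# Hypothetical cross: parental means 20 and 12, Var(F1) = 1, Var(F2) = 3.
m_hat = castle_wright(20.0, 12.0, var_f1=1.0, var_f2=3.0)
```

Because the denominator is a difference of two noisy variance estimates, small sampling errors can change the result substantially, in line with the large sampling variances the abstract reports.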

19.
K. R. Koots  J. P. Gibson 《Genetics》1996,143(3):1409-1416
A data set of 1572 heritability estimates and 1015 pairs of genetic and phenotypic correlation estimates, constructed from a survey of published beef cattle genetic parameter estimates, provided a rare opportunity to study realized sampling variances of genetic parameter estimates. The distribution of both heritability estimates and genetic correlation estimates, when plotted against estimated accuracy, was consistent with random error variance being some three times the sampling variance predicted from standard formulae. This result was consistent with the observation that the variance of estimates of heritabilities and genetic correlations between populations were about four times the predicted sampling variance, suggesting few real differences in genetic parameters between populations. Except where there was a strong biological or statistical expectation of a difference, there was little evidence for differences between genetic and phenotypic correlations for most trait combinations or for differences in genetic correlations between populations. These results suggest that, even for controlled populations, estimating genetic parameters specific to a given population is less useful than commonly believed. A serendipitous discovery was that, in the standard formula for theoretical standard error of a genetic correlation estimate, the heritabilities refer to the estimated values and not, as seems generally assumed, the true population values.

20.
Aims: In ecology and conservation biology, the number of species counted in a biodiversity study is a key metric but is usually a biased underestimate of total species richness because many rare species are not detected. Moreover, comparing species richness among sites or samples is a statistical challenge because the observed number of species is sensitive to the number of individuals counted or the area sampled. For individual-based data, we treat a single, empirical sample of species abundances from an investigator-defined species assemblage or community as a reference point for two estimation objectives under two sampling models: estimating the expected number of species (and its unconditional variance) in a random sample of (i) a smaller number of individuals (multinomial model) or a smaller area sampled (Poisson model) and (ii) a larger number of individuals or a larger area sampled. For sample-based incidence (presence–absence) data, under a Bernoulli product model, we treat a single set of species incidence frequencies as the reference point to estimate richness for smaller and larger numbers of sampling units. Methods: The first objective is a problem in interpolation that we address with classical rarefaction (multinomial model) and Coleman rarefaction (Poisson model) for individual-based data and with sample-based rarefaction (Bernoulli product model) for incidence frequencies. The second is a problem in extrapolation that we address with sampling-theoretic predictors for the number of species in a larger sample (multinomial model), a larger area (Poisson model) or a larger number of sampling units (Bernoulli product model), based on an estimate of asymptotic species richness. Although published methods exist for many of these objectives, we bring them together here with some new estimators under a unified statistical and notational framework. 
This novel integration of mathematically distinct approaches allowed us to link interpolated (rarefaction) curves and extrapolated curves to plot a unified species accumulation curve for empirical examples. We provide new, unconditional variance estimators for classical, individual-based rarefaction and for Coleman rarefaction, long missing from the toolkit of biodiversity measurement. We illustrate these methods with datasets for tropical beetles, tropical trees and tropical ants. Important findings: Surprisingly, for all datasets we examined, the interpolation (rarefaction) curve and the extrapolation curve meet smoothly at the reference sample, yielding a single curve. Moreover, curves representing 95% confidence intervals for interpolated and extrapolated richness estimates also meet smoothly, allowing rigorous statistical comparison of samples not only for rarefaction but also for extrapolated richness values. The confidence intervals widen as the extrapolation moves further beyond the reference sample, but the method gives reasonable results for extrapolations up to about double or triple the original abundance or area of the reference sample. We found that the multinomial and Poisson models produced indistinguishable results, in units of estimated species, for all estimators and datasets. For sample-based abundance data, which allow the comparison of all three models, the Bernoulli product model generally yields lower richness estimates for rarefied data than either the multinomial or the Poisson models because of the ubiquity of non-random spatial distributions in nature.
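
The interpolation step can be sketched with the widely used closed form for classical individual-based rarefaction: the expected number of species in a random subsample of m individuals equals the observed richness minus, for each species, the probability that it is missed entirely. The formula is the standard textbook one and is assumed rather than quoted from the paper; the function name and toy abundances are hypothetical, and the variance estimators and extrapolation described above are not shown.

```python
from math import comb
from typing import Sequence


def rarefied_richness(abundances: Sequence[int], m: int) -> float:
    """Expected species count in a random subsample of m individuals.

    Uses S_obs - sum_i C(n - x_i, m) / C(n, m), where x_i is the abundance
    of species i and n is the total number of individuals in the reference
    sample.
    """
    counts = [x for x in abundances if x > 0]
    n = sum(counts)
    if not 0 <= m <= n:
        raise ValueError("m must lie between 0 and the reference sample size")
    # comb(n - x, m) is 0 when m > n - x: the species cannot be missed.
    missed = sum(comb(n - x, m) for x in counts) / comb(n, m)
    return len(counts) - missed


# Hypothetical reference sample: four species, ten individuals.
reference = [5, 3, 1, 1]
curve = [rarefied_richness(reference, m) for m in range(0, 11)]
```

Evaluating the function for every m from 0 to n traces the rarefaction curve up to the reference sample, where it reaches the observed richness.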
