Similar literature
1.
Tests of applicability of several substitution models for DNA sequence data   Total citations: 8 (self-citations: 3; citations by others: 5)
Using linear invariants for various models of nucleotide substitution, we developed test statistics for examining the applicability of a specific model to a given dataset in phylogenetic inference. The models examined are those developed by Jukes and Cantor (1969), Kimura (1980), Tajima and Nei (1984), Hasegawa et al. (1985), Tamura (1992), Tamura and Nei (1993), and a new model called the eight-parameter model. The first six models are special cases of the last model. The test statistics developed are independent of evolutionary time and phylogeny, although the variances of the statistics contain phylogenetic information. Therefore, these statistics can be used before a phylogenetic tree is estimated. Our objective is to find the simplest model that is applicable to a given dataset, keeping in mind that a simple model usually gives an estimate of evolutionary distance (number of nucleotide substitutions per site) with a smaller variance than a complicated model when the simple model is correct. We have also developed a statistical test of the homogeneity of nucleotide frequencies of a sample of several sequences that takes into account possible phylogenetic correlations. This test is used to examine the stationarity in time of the base frequencies in the sample. For Hasegawa et al.'s and the eight-parameter models, analytical formulas for estimating evolutionary distances are presented. Application of the above tests to several sets of real data has shown that the assumption of stationarity of base composition is usually acceptable when the sequences studied are closely related but otherwise it is rejected. Similarly, the simple models of nucleotide substitution are almost always rejected when actual genes are distantly related and/or the total number of nucleotides examined is large.
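To make the notion of a model-based evolutionary distance concrete, here is a minimal sketch of the two simplest models named above, Jukes and Cantor (1969) and Kimura (1980). It does not implement the invariant-based tests of the abstract; the function names are illustrative assumptions, and the formulas are the standard textbook distance corrections.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor distance from the proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k2p_distance(transitions, transversions):
    """Kimura two-parameter distance.

    transitions   -- proportion of sites differing by a transition (P)
    transversions -- proportion of sites differing by a transversion (Q)
    """
    a = 1.0 - 2.0 * transitions - transversions
    b = 1.0 - 2.0 * transversions
    if a <= 0.0 or b <= 0.0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)
```

When transversions are exactly twice as frequent as transitions (Q = 2P), the two formulas agree, which illustrates that the simpler model is a special case of the richer one.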

2.

Background

Genomic prediction requires estimation of variances of effects of single nucleotide polymorphisms (SNPs), which is computationally demanding, and uses these variances for prediction. We have developed models with separate estimation of SNP variances, which can be applied infrequently, and genomic prediction, which can be applied routinely.

Methods

SNP variances were estimated with Bayes Stochastic Search Variable Selection (BSSVS) and BayesC. Genome-enhanced breeding values (GEBV) were estimated with RR-BLUP (ridge regression best linear unbiased prediction), using either variances obtained from BSSVS (BLUP-SSVS) or BayesC (BLUP-C), or assuming equal variances for each SNP. Datasets used to estimate SNP variances comprised (1) all animals, (2) 50% random animals (RAN50), (3) 50% best animals (TOP50), or (4) 50% worst animals (BOT50). Traits analysed were protein yield, udder depth, somatic cell score, interval between first and last insemination, direct longevity, and longevity including information from predictors.

Results

BLUP-SSVS and BLUP-C yielded GEBV similar to those from the equivalent Bayesian models that simultaneously estimated SNP variances. Reliabilities of these GEBV were consistently higher than from RR-BLUP, although only significantly for direct longevity. Across scenarios that used data subsets to estimate GEBV, observed reliabilities were generally higher for TOP50 than for RAN50, and much higher than for BOT50. Reliabilities of TOP50 were higher because the training data contained more ancestors of selection candidates. Using estimated SNP variances based on random or non-random subsets of the data, while using all data to estimate GEBV, did not affect reliabilities of the BLUP models. A convergence criterion of 10⁻⁸ instead of 10⁻¹⁰ for BLUP models yielded similar GEBV, while the required number of iterations decreased by 71 to 90%. Including a separate polygenic effect consistently improved reliabilities of the GEBV, but also substantially increased the required number of iterations to reach convergence with RR-BLUP. SNP variances converged faster for BayesC than for BSSVS.

Conclusions

Combining Bayesian variable selection models that re-estimate SNP variances with BLUP models that use those SNP variances yields GEBV that are similar to those from full Bayesian models. Moreover, these combined models yield predictions with higher reliability and less bias than the commonly used RR-BLUP model.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-014-0052-x) contains supplementary material, which is available to authorized users.
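The idea of a BLUP model that takes previously estimated SNP variances as input, and is iterated until a convergence criterion is met, can be sketched as follows. This is a hedged illustration only: it assumes a centred phenotype vector, no fixed or polygenic effects, a Gauss-Seidel solver, and a relative-change convergence criterion. None of these details, nor the function name, is taken from the article.

```python
def blup_snp_effects(Z, y, lambdas, tol=1e-8, max_iter=10000):
    """Solve (Z'Z + diag(lambdas)) a = Z'y by Gauss-Seidel iteration.

    Z       -- genotype matrix as a list of rows (animals x SNPs)
    y       -- centred phenotypes, one per animal
    lambdas -- per-SNP shrinkage, assumed to equal residual variance
               divided by that SNP's (pre-estimated) variance
    Returns the SNP effects and the number of iterations used.
    """
    n, m = len(Z), len(Z[0])
    effects = [0.0] * m
    resid = list(y)  # residuals y - Z a, kept up to date
    zz = [sum(Z[i][j] ** 2 for i in range(n)) for j in range(m)]
    for iteration in range(1, max_iter + 1):
        change = 0.0
        for j in range(m):
            # Right-hand side for SNP j, adjusted for all other SNPs.
            rhs = sum(Z[i][j] * resid[i] for i in range(n)) + zz[j] * effects[j]
            new = rhs / (zz[j] + lambdas[j])
            delta = new - effects[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Z[i][j] * delta
            effects[j] = new
            change += delta * delta
        norm = sum(a * a for a in effects)
        # Relative squared change of the solution between iterations.
        if norm == 0.0 or change / norm < tol:
            return effects, iteration
    raise RuntimeError("no convergence within max_iter iterations")
```

Setting all entries of `lambdas` to the same value corresponds to the equal-variance case (RR-BLUP); SNP-specific values mimic the use of variances estimated in a separate step.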

3.
Nei and Gojobori (1986) developed a simple method to estimate the numbers of synonymous (dS) and nonsynonymous (dN) substitutions per site. In the present paper, we have developed a method for computing variances and covariances of dS and dN values and of the proportions of synonymous (pS) and nonsynonymous (pN) differences. We also have developed a method for computing the variances of mean dS, dN, pS, and pN without constructing a phylogenetic tree of the genes. We have conducted computer simulations based on simple evolutionary models and have shown that the new method gives good estimates of variances and covariances.
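As a point of reference for the quantities named above, the sketch below converts a proportion of synonymous differences (pS) into a substitution estimate (dS) with the Jukes-Cantor correction and attaches the usual large-sample variance obtained from binomial sampling of sites. This is the simple textbook approximation, not the variance and covariance method developed in the paper; the function names are assumptions made for illustration.

```python
import math


def jc_corrected_distance(p):
    """Substitutions per site from a proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_distance_variance(p, n_sites):
    """Approximate variance of the corrected distance.

    Treats p as a binomial proportion over n_sites and propagates
    its variance p(1-p)/n through the log transformation.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return p * (1.0 - p) / (n_sites * (1.0 - 4.0 * p / 3.0) ** 2)


def synonymous_distance(syn_diffs, syn_sites):
    """Return (pS, dS, approximate Var(dS)) for one sequence pair."""
    p_s = syn_diffs / syn_sites
    return p_s, jc_corrected_distance(p_s), jc_distance_variance(p_s, syn_sites)
```

The same functions apply to nonsynonymous counts by passing the number of nonsynonymous differences and sites instead.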

4.
Keightley PD, Bataillon TM. Genetics 2000, 154(3): 1193-1201
We develop a maximum-likelihood (ML) approach to estimate genomic mutation rates (U) and average homozygous mutation effects (s) from mutation-accumulation (MA) experiments in which phenotypic assays are carried out in several generations. We use simulations to compare the procedure's performance with the method of moments traditionally used to analyze MA data. Similar precision is obtained if mutation effects are small relative to the environmental standard deviation, but ML can give estimates of mutation parameters that have lower sampling variances than those obtained by the method of moments if mutations with large effects have accumulated. The inclusion of data from intermediate generations may improve the precision. We analyze life-history trait data from two Caenorhabditis elegans MA experiments. Under a model with equal mutation effects, the two experiments provide similar estimates for U of approximately 0.005 per haploid, averaged over traits. Estimates of s are more divergent and average at -0.51 and -0.13 in the two studies. Detailed analysis shows that changes of mean and variance of genetic values of MA lines in both C. elegans experiments are dominated by mutations with large effects, but the analysis does not rule out the presence of a large class of deleterious mutations with very small effects.

5.
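Comparison of SSR genetic diversity in cultivated rice from Yunnan   Total citations: 13 (self-citations: 0; citations by others: 13)
Sixty-four SSR markers were used to compare the genetic diversity of 96 Yunnan rice (Oryza sativa) landraces and improved varieties. All 64 markers were polymorphic, and a total of 741 alleles were detected; the number of alleles per polymorphic locus ranged from 2 to 29, with a mean of 11.57. Nei's gene diversity index (He) ranged from 0.345 (RM321) to 0.932 (RM1), with a mean of 0.56. The genetic diversity of the rice varieties was not evenly distributed by geographic location; instead, at a similarity coefficient of 0.17 the varieties separated clearly into two groups, an indica group and a japonica group. The difference in SSR diversity between the indica and japonica subspecies was not marked: the mean number of alleles (Ap) and Nei's gene diversity index of indica (Ap = 10.6, He = 0.46) were very close to those of japonica (Ap = 10.7, He = 0.48), possibly because of a certain frequency of gene exchange among these varieties. Glutinous and non-glutinous rice occurred in both the indica and japonica groups, with no particular distribution pattern. Improved Yunnan rice varieties were closely related to the landraces, and their genetic base may derive from Yunnan rice landraces. These results indicate that SSR markers discriminate well among Yunnan cultivated rice varieties, and that Yunnan rice landraces are rich in genetic diversity, offering many favourable traits for selection in breeding practice.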
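For readers unfamiliar with the He statistic reported above, the following minimal sketch computes Nei's gene diversity at a single locus as one minus the sum of squared allele frequencies. It assumes the uncorrected form of the index (no small-sample adjustment); the abstract does not say which form was used, and the function name is illustrative.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's gene diversity, 1 - sum(p_i^2), for one locus.

    alleles -- list of allele labels observed across the sampled varieties
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele observation is required")
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())
```

A locus with two equally frequent alleles gives 0.5, and a monomorphic locus gives 0; averaging the per-locus values over markers gives a mean diversity of the kind quoted in the abstract.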

7.
L. Ollivier, L.L.G. Janss. Genetics 1993, 135(3): 907-909
A method of estimating the number of loci contributing to quantitative variation was proposed by S. Wright in 1921. The method makes use of the means of inbred lines and the variances of their F1, F2 and backcrosses. The method was extended to crosses between outbreeding populations by R. Lande in 1981. Additive gene action is one of the major assumptions required for obtaining valid estimates. It is shown here that this assumption may be relaxed. One can estimate both a total number of effective loci and a number of dominant loci (the latter only when the parents are inbred) by comparing the variances of the F1, F2 and backcrosses. Numerical illustrations are given, based on crossbreeding data.
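The classical estimator referred to above can be sketched in its simplest textbook form: the squared difference between parental means divided by eight times the segregation variance, with the segregation variance taken as the F2 variance minus the F1 variance. This sketch assumes additive gene action and does not reproduce the relaxation of that assumption described in the abstract; the function name is an assumption.

```python
def wright_effective_loci(mean_p1, mean_p2, var_f1, var_f2):
    """Classical estimate of the effective number of loci.

    Uses (difference of parental means)^2 / (8 * segregation variance),
    where the segregation variance is estimated as Var(F2) - Var(F1).
    """
    segregation_var = var_f2 - var_f1
    if segregation_var <= 0.0:
        raise ValueError("F2 variance must exceed F1 variance")
    return (mean_p1 - mean_p2) ** 2 / (8.0 * segregation_var)
```

For example, parental means of 20 and 10 with F1 and F2 variances of 1.0 and 3.5 give an estimate of 5 effective loci.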

8.
General and craniofacial measurements from young marmosets in a recently established Australian colony are presented and compared with similar data published for a long-established colony in the United Kingdom. Statistically significant differences between the colonies were those associated with large error variances, suggesting difficulties in locating certain landmarks rather than real differences in growth. Thus, direct comparison can be made between craniofacial growth studies conducted at these two different marmoset colonies.

9.
In well-known methods of estimating rates of irreversible disposal (utilization) in vivo the rates are calculated from the areas to infinity under specific radioactivity-time (S-t) or quantity-of-label-time (q-t) curves obtained by measurements on samples of plasma after intravenous injection of labelled substrate. The errors in the calculated rates are mostly those of the estimates of the areas. These errors are of two kinds: random, caused by the variances of the values of S or q, and systematic, caused by differences between the curves used to interpolate between these values and the true curves. A rigorous method is given for calculating the random errors from the variances of the values of S or q, and is applied to choosing the best times to sample the plasma from small animals from which few plasma samples can be taken. A procedure for estimating systematic errors is also given. Programs in BASIC language to carry out the calculations are deposited as Supplementary Publication SUP 50058 (5 pages) at the British Library (Lending Division), Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies can be obtained on the terms given in Biochem. J. (1975) 145, 5.
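The area-based calculation described above can be illustrated with a short sketch. It assumes linear (trapezoidal) interpolation between samples and a single-exponential tail fitted to the last two samples; both are choices made here for illustration and are exactly the kind of interpolation assumption the abstract identifies as a source of systematic error. The deposited BASIC programs are not reproduced, and the function names are hypothetical.

```python
import math


def area_to_infinity(times, values):
    """Area under a sampled curve from the first sample to infinity.

    Trapezoids between samples, plus an exponential tail whose rate
    constant is taken from the last two samples.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    if min(values) <= 0.0:
        raise ValueError("all sampled values must be positive")
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (values[i - 1] + values[i]) * (times[i] - times[i - 1])
    k = math.log(values[-2] / values[-1]) / (times[-1] - times[-2])
    if k <= 0.0:
        raise ValueError("curve must be declining at the last two samples")
    return area + values[-1] / k


def irreversible_disposal_rate(dose, times, specific_activity):
    """Injected dose divided by the area under the S-t curve."""
    return dose / area_to_infinity(times, specific_activity)
```

Because the rate is the dose divided by the area, any relative error in the estimated area carries over directly to the calculated rate.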

10.
Detecting quantitative trait loci (QTL) and estimating QTL variances (represented by the squared QTL effects) are two main goals of QTL mapping and genome-wide association studies (GWAS). However, there are issues associated with estimated QTL variances and such issues have not attracted much attention from the QTL mapping community. Estimated QTL variances are usually biased upwards due to estimation being associated with significance tests. The phenomenon is called the Beavis effect. However, estimated variances of QTL without significance tests can also be biased upwards, which cannot be explained by the Beavis effect; rather, this bias is due to the fact that QTL variances are often estimated as the squares of the estimated QTL effects. The parameters are the QTL effects and the estimated QTL variances are obtained by squaring the estimated QTL effects. This square transformation failed to incorporate the errors of estimated QTL effects into the transformation. The consequence is biases in estimated QTL variances. To correct the biases, we can either reformulate the QTL model by treating the QTL effect as random and directly estimate the QTL variance (as a variance component) or adjust the bias by taking into account the error of the estimated QTL effect. A moment method of estimation has been proposed to correct the bias. The method has been validated via Monte Carlo simulation studies. The method has been applied to QTL mapping for the 10-week-body-weight trait from an F2 mouse population.
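The source of the upward bias can be shown with a small simulation. If the estimated effect is unbiased with sampling variance v, its square has expectation equal to the true squared effect plus v, so subtracting v removes the bias on average. The sketch below assumes a normally distributed estimate with known standard error; it illustrates the general moment argument rather than the specific estimator proposed in the paper, and all names are hypothetical.

```python
import random


def corrected_qtl_variance(effect_hat, se_effect):
    """Squared effect estimate minus its squared standard error."""
    return effect_hat ** 2 - se_effect ** 2


def simulate_mean_estimates(true_effect, se_effect, n_draws=200000, seed=1):
    """Average naive and corrected estimates over simulated effect estimates."""
    rng = random.Random(seed)
    naive_sum = 0.0
    corrected_sum = 0.0
    for _ in range(n_draws):
        # One simulated estimate of the QTL effect.
        effect_hat = rng.gauss(true_effect, se_effect)
        naive_sum += effect_hat ** 2
        corrected_sum += corrected_qtl_variance(effect_hat, se_effect)
    return naive_sum / n_draws, corrected_sum / n_draws
```

With a true effect of 0.5 and a standard error of 1, the naive average settles near 1.25 while the corrected average settles near the true squared effect of 0.25. Individual corrected values can be negative, which is one reason a variance-component formulation may be preferred.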

11.
New methods for estimating the numbers of synonymous and nonsynonymous substitutions per site were developed. The methods are unweighted pathway methods based on Kimura's two-parameter model. Computer simulations were conducted to evaluate the accuracies of the new methods, Nei and Gojobori's (NG) method, Miyata and Yasunaga's (MY) method, Li, Wu, and Luo's (LWL) method, and Pamilo, Bianchi, and Li's (PBL) method. The following results were obtained: (1) The NG, MY, and LWL methods give overestimates of the number of synonymous substitutions and underestimates of the number of nonsynonymous substitutions. The major cause for the biased estimation is that these three methods underestimate the number of synonymous sites and overestimate the number of nonsynonymous sites. (2) The PBL method gives better estimates of the numbers of synonymous and nonsynonymous substitutions than those obtained by the NG, MY, and LWL methods. (3) The new methods also give better estimates of the numbers of synonymous and nonsynonymous substitutions than those obtained by the NG, MY, and LWL methods. In addition, estimates of the numbers of synonymous and nonsynonymous sites obtained by the new methods are reasonably accurate. (4) In some cases, the new methods and the PBL method give biased estimates of substitution numbers. However, from the number of nucleotide substitutions at the third position of codons, we can examine whether estimates obtained by the new methods are good or not, whereas we cannot make an examination of estimates obtained by the PBL method. (5) When there are strong transition/transversion and nucleotide-frequency biases like mitochondrial genes, all of the above methods give biased estimates of substitution numbers. In such cases, Kondo et al.'s method is recommended to be used for estimating the number of synonymous substitutions, although their method cannot estimate the number of nonsynonymous substitutions and is time-consuming. 
These results, particularly result (1), call for reexaminations of some genes. This is because evolutionary pictures of genes have often been discussed on the basis of results obtained by the NG, MY, and LWL methods, which are favorable for the neutral theory of molecular evolution.

12.
M. I. Chiu, T. L. Mason, G. R. Fink. Genetics 1992, 132(4): 987-1001
Wright's method of estimating the number of genes contributing to the difference in a quantitative character between two populations involves observing the means and variances of the two parental populations and their hybrid populations. Although simple, Wright's method provides seriously biased estimates, largely due to linkage and unequal effects of alleles. A method is suggested to evaluate the bias of Wright's estimate, which relies on estimation of the mean recombination frequency between a pair of loci and a composite parameter of variability of allelic effects and frequencies among loci. Assuming that the loci are uniformly distributed in the genome, the mean recombination frequency can be calculated for some organisms. Theoretical analysis and an analysis of the Drosophila data on distributions of effects of P element inserts on bristle numbers indicate that the value of the composite parameter is likely to be about three or larger for many quantitative characters. There are, however, some serious problems with the current method, such as the irregular behavior of the statistic and large sampling variances of estimates. Because of that, the method is generally not recommended for use unless several favorable conditions are met. These conditions are: the two parental populations are many phenotypic standard deviations apart, linkage is not tight, and the sample size is very large. An example is given on the fruit weight of tomato from a cross with parental populations differing in means by more than 14 phenotypic standard deviations. It is estimated that the number of loci which account for 95% of the genic variance in the F2 population is 16, with a 95% confidence interval of 7-28, and the effect of the leading locus is 13% of the parental difference, with 95% confidence interval 8.5-25.7%.

13.
J L Jinks, P Towey. Heredity 1976, 37(1): 69-81
A new method, genotype assay, is described for estimating k the number of genes or more strictly the number of effective factors responsible for variation of a continuous kind. The central feature is the determination of the proportion of individuals in the Fn generation of a cross between two pure breeding lines that are heterozygous at, at least, one locus by an assay of their Fn+2 grand progeny families. The observed proportion is then equated to a theoretical expectation which is a function of the number of genes involved. Expectations generalised to cover any generation n for experimental designs in which every Fn individual is assayed by comparing two Fn+2 grand progeny families have been derived for two limiting cases; one in which all genotypic differences are expressed as phenotypic differences and the other where the expression is minimised by imposing the maximum and relational balancing out of the contributions of individual gene loci. Equating the observed proportion of heterozygotes to these expectations therefore, leads to an upper and a lower estimate of k corresponding with these two limiting conditions. The reliability and sensitivity of the estimates depends primarily on n the generation chosen for study, the number of individuals (m) assayed from that generation and the number of individuals (l) raised in each Fn+2 grand progeny family. The two variables m and l being the principal determinants of the variances of the family means set the lower limit to the size of the gene effects that can be detected. The method is illustrated by assays of the F3 and F5 generations of two crosses between conditioned lines of Nicotiana rustica for three characters. The estimates are, without exception, as great as or greater than those obtained by alternative procedures. They show large, consistent increases between the F3 and F5 that cannot be traced to greater sensitivity of the latter generation and hence are presumably genuine.

14.
The field size at which a bone is read affects the results obtained when using Kerley's histological method for age estimation, even after applying the recommended correction factor. Whereas there is no tendency for any one of three field sizes tested to consistently underestimate or overestimate age, a field size closest to that used by Kerley in his original study had significantly lower variances for its age estimates, and thus provides greater reliability. This particular field size yields more precise estimates because it is sampling a pattern and number of structures more similar to that of Kerley. Correction factors cannot equalize the counts of osteons and osteon fragments because of spatial variations in the distributions of these histological structures. A field size similar to that used by Kerley in gathering the data from which he developed his regression equations must be used to assure that the same pattern and number of structures is being sampled. For this reason, we suggest a field size as close to 2.06 mm² as possible be used when employing Kerley's method.

15.
A system of p interacting species whose behaviour can be approximated by a Markovian model is considered. Estimates for system parameters are obtained by the method of moments, when the means, variances and covariances can be estimated from observed population sizes over a period of time. Further, approximate standard errors of these estimates are obtained using the δ-technique.

16.
The genetic variabilities of sternopleural and abdominal bristle numbers existing in local natural populations were assessed. Using second chromosome lines of Drosophila melanogaster extracted from three natural populations in Japan (the Ishigakijima, Ogasawara and Aomori populations), experiments were conducted to estimate the components of genetic variances, additive and dominance variances. The following results were obtained: For both sternopleural and abdominal bristle numbers, the additive genetic variances (σ_A²) were much larger than the dominance variances (σ_D²), especially in the southern populations. For example, in the Ishigakijima population, for female sternopleural bristle numbers of the inversion-free chromosome group, the additive and dominance variances were estimated to be 1.255 ± 0.2034 and 0.0552 ± 0.0180, respectively. The magnitudes of the estimates of additive genetic variances were nearly equal from north to south. By comparing the additive genetic variances of the inversion-free chromosome group with those of the In(2L)t-carrying chromosome group, it was inferred that a sufficient number of generations to achieve the equilibrium state has not passed since the introduction of a single or a small number of the In(2L)t-carrying chromosomes to the Ishigakijima population.

17.
Z. B. Zeng, D. Houle, C. C. Cockerham. Genetics 1990, 126(1): 235-247
S. Wright suggested an estimator, m̂, of the number of loci, m, contributing to the difference in a quantitative character between two differentiated populations, which is calculated from the phenotypic means and variances in the two parental populations and their F1 and F2 hybrids. The same method can also be used to estimate m contributing to the genetic variance within a single population, by using divergent selection to create differentiated lines from the base population. In this paper we systematically examine the utility and problems of this technique under the influences of unequal allelic effects and initial allele frequencies, and linkage, which are known to lead m̂ to underestimate m. In addition, we examine the effects of population size and selection intensity during the generations of selection. During selection, the estimator m̂ rapidly approaches its expected value at the selection limit. With reasonable assumptions about unequal allelic effects and initial allele frequencies, the expected value of m̂ without linkage is likely to be on the order of one-third of the number of genes. The estimates suffer most seriously from linkage. The practical maximum expectation of m̂ is just about the number of chromosomes, considerably less than the "recombination index" which has been assumed to be the upper limit. The estimates are also associated with large sampling variances. An estimator of the variance of m̂ derived by R. Lande substantially underestimates the actual variance. Modifications to the method can ameliorate some of the problems. These include using F3 or later generation variances or the genetic variance in the base population, and replicating the experiments and estimation procedure. However, even in the best of circumstances, information from m̂ is very limited and can be misleading.

18.
Summary: An analysis is derived for a diallel experiment in which each cross is represented by a number of homozygous lines developed by the doubled haploid method. Both additive and additive × additive genetic variances can be estimated with this analysis. A population-improvement scheme involving the doubled haploid or single seed descent methods is also proposed.

19.
A method for partitioning genetic variance estimated from twin data into additive and dominance variances was presented using Falconer's variance component model. The effects of dominance and environmental variances on a number of heritability estimates were also reviewed. A heritability estimate, based on the analysis of variance and the genetic variance estimates presented by HASEMAN and ELSTON and CHRISTIAN et al. which utilizes all available information from twin data, was proposed and discussed. This estimate seems to be the least affected by fluctuations in the magnitudes of dominance and environmental variances.
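One common way to split twin resemblance into additive and dominance parts is sketched below. It assumes the standard expectations that the monozygotic correlation equals the additive plus dominance proportions, that the dizygotic correlation equals half the additive plus a quarter of the dominance proportion, and that there is no shared-environment component. Whether the paper uses exactly these expectations is not stated in the abstract, so the sketch and its function name should be read as assumptions.

```python
def twin_variance_components(r_mz, r_dz):
    """Additive, dominance and residual proportions from twin correlations.

    Assumes r_mz = a2 + d2 and r_dz = a2/2 + d2/4 (no shared environment).
    Returns (a2, d2, e2) as proportions of phenotypic variance.
    """
    a2 = 4.0 * r_dz - r_mz
    d2 = 2.0 * r_mz - 4.0 * r_dz
    e2 = 1.0 - r_mz
    return a2, d2, e2
```

For instance, correlations of 0.6 (monozygotic) and 0.25 (dizygotic) give roughly 0.4 additive, 0.2 dominance and 0.4 residual. Negative component estimates signal that the assumed model does not fit the data.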

20.
Endonuclease VIII (Nei) is one of three enzymes in Escherichia coli that are involved in base-excision repair of oxidative damage to DNA. We investigated the substrate specificity and excision kinetics of this DNA glycosylase for bases in DNA that have been damaged by free radicals. Two different DNA substrates were prepared by gamma-irradiation of DNA solutions under N2O or air, such that they contained a multiplicity of modified bases. Although previous studies on the substrate specificity of Nei had demonstrated activity on several pyrimidine derivatives, this present study demonstrates excision of additional pyrimidine derivatives and a purine-derived lesion, 4,6-diamino-5-formamidopyrimidine, from DNA containing multiple modified bases. Excision was dependent on enzyme concentration, incubation time, and substrate concentration, and followed Michaelis-Menten kinetics. The kinetic parameters also depended on the identity of the individual modified base being removed. Substrates and excision kinetics of Nei and a naturally arising mutant form involving Leu-90→Ser (L90S-Nei) were compared to those of Escherichia coli endonuclease III (Nth), which had previously been determined under experimental conditions similar to those in this study. This comparison showed that Nei and Nth significantly differ from each other in terms of excision rates, although they have common substrates. The present work extends the substrate specificity of Nei and shows the effect of a single mutation in the nei gene on the specificity of Nei.
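Michaelis-Menten kinetics, mentioned above, relate the excision rate to substrate concentration through two parameters, the maximum rate and the Michaelis constant. The sketch below shows the rate law and one simple way to recover the parameters from rate measurements, a double-reciprocal (Lineweaver-Burk) straight-line fit. The abstract does not say how the kinetic parameters were estimated, so the fitting method and the function names are assumptions for illustration.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Reaction rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)


def lineweaver_burk_fit(substrates, rates):
    """Estimate (Vmax, Km) from a least-squares line through 1/v vs 1/[S].

    The line has slope Km/Vmax and intercept 1/Vmax.
    """
    xs = [1.0 / s for s in substrates]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km
```

The double-reciprocal fit weights low-concentration points heavily, so with noisy data a direct nonlinear fit of the rate law is usually the safer choice.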
