Similar articles
20 similar articles found (search time: 15 ms)
1.
We have developed a rapid parsimony method for reconstructing ancestral nucleotide states that allows calculation of initial branch lengths that are good approximations to optimal maximum-likelihood estimates under several commonly used substitution models. Use of these approximate branch lengths (rather than fixed arbitrary values) as starting points significantly reduces the time required for iteration to a solution that maximizes the likelihood of a tree. These branch lengths are close enough to the optimal values that they can be used without further iteration to calculate approximate maximum-likelihood scores that are very close to the "exact" scores found by iteration. Several strategies are described for using these approximate scores to substantially reduce times needed for maximum-likelihood tree searches.
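
The abstract does not give the formula used to turn parsimony reconstructions into branch lengths. As a minimal sketch of the general idea, assuming the Jukes-Cantor model (one commonly used substitution model) and a hypothetical helper name `jc_branch_length`, the number of changes that parsimony assigns to a branch can be converted to an approximate starting length like this:

```python
import math


def jc_branch_length(n_changes, n_sites):
    """Approximate branch length (expected substitutions per site).

    Assumption: Jukes-Cantor correction applied to the proportion of
    sites that a parsimony reconstruction says changed on the branch.
    This is an illustration, not the authors' published procedure.
    """
    p = n_changes / n_sites  # observed proportion of changed sites
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Example: 10 inferred changes across 100 aligned sites.
start_length = jc_branch_length(10, 100)
print(f"starting branch length: {start_length:.4f}")
```

A value obtained this way would serve only as a starting point for likelihood iteration, in place of a fixed arbitrary value.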

2.
Allele frequency estimation from data on relatives. Cited by: 34 (self-citations: 18; citations by others: 16)
Given genetic marker data on unrelated individuals, maximum-likelihood allele-frequency estimates and their standard errors are easily calculated from sample proportions. When marker phenotypes are observed on relatives, this method cannot be used without either discarding a subset of the data or incorrectly assuming that all individuals are unrelated. Here, I describe a method for allele frequency estimation for data on relatives that is based on standard methods of pedigree analysis. This method makes use of all available marker information while correctly taking into account the dependence between relatives. I illustrate use of the method with family data for a VNTR polymorphism near the apolipoprotein B locus.
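
The unrelated-individuals case mentioned in the first sentence can be sketched in a few lines. This assumes a codominant marker whose alleles can be counted directly in diploid individuals, so that the 2n sampled genes are treated as a binomial sample; the function name and the example counts are hypothetical. It does not implement the pedigree-based method for relatives that the abstract describes.

```python
import math


def allele_frequency(allele_count, n_individuals):
    """ML estimate and standard error of an allele frequency.

    Assumes unrelated diploid individuals and direct allele counting,
    so the estimate is the sample proportion among 2n genes.
    """
    n_genes = 2 * n_individuals
    p_hat = allele_count / n_genes
    se = math.sqrt(p_hat * (1.0 - p_hat) / n_genes)
    return p_hat, se


# Example: 60 copies of the allele observed in 100 individuals.
p_hat, se = allele_frequency(allele_count=60, n_individuals=100)
print(f"p = {p_hat:.3f}, SE = {se:.4f}")
```

With relatives in the sample the genes are no longer independent draws, which is why this simple standard error is not valid for family data.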

3.
Yang R, Tian Q, Xu S. Genetics 2006, 173(4): 2339-2356
Quantitative traits whose phenotypic values change over time are called longitudinal traits. Genetic analyses of longitudinal traits can be conducted using any of the following approaches: (1) treating the phenotypic values at different time points as repeated measurements of the same trait and analyzing the trait under the repeated measurements framework, (2) treating the phenotypes measured at different time points as different traits and analyzing the traits jointly on the basis of the theory of multivariate analysis, and (3) fitting a growth curve to the phenotypic values across time points and analyzing the fitted parameters of the growth trajectory under the theory of multivariate analysis. The third approach has been used in QTL mapping for longitudinal traits by fitting the data to a logistic growth trajectory. This approach applies only to the particular S-shaped growth process. In practice, a longitudinal trait may show a trajectory of any shape. We demonstrate that one can describe a longitudinal trait with orthogonal polynomials, which are sufficiently general to fit a curve of any shape. We develop a mixed-model methodology for QTL mapping of longitudinal traits and a maximum-likelihood method for parameter estimation and statistical tests. The expectation-maximization (EM) algorithm is applied to search for the maximum-likelihood estimates of parameters. The method is verified with simulated data and demonstrated with experimental data from a pseudobackcross family of Populus (poplar) trees.

4.
Many methods are available for estimating ancestral values of continuous characteristics, but little is known about how well these methods perform. Here we compare six methods: linear parsimony, squared-change parsimony, one-parameter maximum likelihood (Brownian motion), two-parameter maximum likelihood (Ornstein-Uhlenbeck process), and independent comparisons with and without branch-length information. We apply these methods to data from 20 morphospecies of Pleistocene planktic Foraminifera in order to estimate ancestral size and shape variables, and compare these estimates with measurements on fossils close to the phylogenetic position of 13 ancestors. No method produced accurate estimates for any variable: the estimates were consistently poorer predictors of the observed values than were the averages of the observed values. The two-parameter maximum-likelihood model consistently produces the most accurate size estimates overall. Estimation of ancestral sizes is confounded by an evolutionary trend towards increasing size. Shape showed no trend but was still estimated very poorly: we consider possible reasons. We discuss the implications of our results for the use of estimates of ancestral characteristics.

5.
Enjalbert J, David JL. Genetics 2000, 156(4): 1973-1982
Using multilocus individual heterozygosity, a method is developed to estimate the outcrossing rates of a population over a few previous generations. Considering that individuals originate either from outcrossing or from n successive selfing generations from an outbred ancestor, a maximum-likelihood (ML) estimator is described that gives estimates of past outcrossing rates in terms of proportions of individuals with different n values. Heterozygosities at several unlinked codominant loci are used to assign n values to each individual. This method also allows a test of whether populations are in inbreeding equilibrium. The estimator's reliability was checked using simulations for different mating histories. We show that this ML estimator can provide estimates of the final-generation outcrossing rate (t(0)) and of the mean of the preceding rates (t(p)), and can detect major temporal variation in the mating system. The method is most efficient for low to intermediate outcrossing levels. Applied to nine populations of wheat, this method gave estimates of t(0) and t(p). These estimates confirmed the absence of outcrossing (t(0) = 0) in the two populations subjected to manual selfing. For free-mating wheat populations, it detected lower final-generation outcrossing rates (t(0) = 0-0.06) than those expected from global heterozygosity (t = 0.02-0.09). This estimator appears to be a new and efficient way to describe the multilocus heterozygosity of a population, complementary to Fis and progeny analysis approaches.

6.
Analytic approaches to twin data using structural equation models. Cited by: 5 (self-citations: 0; citations by others: 5)
The classical twin study is the most popular design in behavioural genetics. It has strong roots in biometrical genetic theory, which allows predictions to be made about the correlations between observed traits of identical and fraternal twins in terms of underlying genetic and environmental components. One can infer the relative importance of these 'latent' factors (model parameters) by structural equation modelling (SEM) of observed covariances of both twin types. SEM programs estimate model parameters by minimising a goodness-of-fit function between observed and predicted covariance matrices, usually by the maximum-likelihood criterion. Likelihood ratio statistics also allow the comparison of fit of different competing models. The program Mx, specifically developed to model genetically sensitive data, is now widely used in twin analyses. The flexibility of Mx allows the modelling of multivariate data to examine the genetic and environmental relations between two or more phenotypes and the modelling of categorical traits under liability-threshold models.

7.
The previously reported maximum-likelihood pairwise relatedness (r) estimator of Thompson and Milligan (M) was extended to allow for negative r estimates under the regression interpretation of r. This was achieved by establishing the equivalency of the likelihoods used in the kinship program and the likelihoods of Thompson. The new maximum-likelihood (ML) estimator was evaluated by Monte Carlo simulations. It was found that the new ML estimator became unbiased significantly faster than the original M estimator when the amount of genotype information was increased. The effects of allele frequency estimation errors on the new and existing relatedness estimators were also considered.

8.
Segregating quantitative trait loci can be detected via linkage to genetic markers. By selectively genotyping individuals with extreme phenotypes for the quantitative trait, the power per individual genotyped is increased at the expense of the power per individual phenotyped, but linear-model estimates of the quantitative-locus effect will be biased. The properties of single- and multiple-trait maximum-likelihood estimates of quantitative-loci parameters derived from selectively genotyped samples were investigated using Monte-Carlo simulations of backcross populations. All individuals with trait records were included in the analyses. All quantitative-locus parameters and the residual correlation were unbiasedly estimated by multiple-trait maximum-likelihood methodology. With single-trait maximum-likelihood, unbiased estimates for quantitative-locus effect and location, and the residual variance, were obtained for the trait under selection, but biased estimates were derived for a correlated trait that was analyzed separately. When an effect of the QTL was simulated only on the trait under selection, a “ghost” effect was also found for the correlated trait. Furthermore, if an effect was simulated only for the correlated trait, then the statistical power was less than that obtained with a random sample of equal size. With multiple-trait analyses, the power of quantitative-trait locus detection was always greater with selective genotyping. Received: 23 February 1998 / Accepted: 15 May 1998

9.
In the reconstruction of a large phylogenetic tree, the most difficult part is usually the problem of how to explore the topology space to find the optimal topology. We have developed a "divide-and-conquer" heuristic algorithm in which an initial neighbor-joining (NJ) tree is divided into subtrees at internal branches having bootstrap values higher than a threshold. The topology search is then conducted by using the maximum-likelihood method to reevaluate all branches with a bootstrap value lower than the threshold while keeping the other branches intact. Extensive simulation showed that our simple method, the neighbor-joining maximum-likelihood (NJML) method, is highly efficient in improving NJ trees. Furthermore, the performance of the NJML method is nearly equal to or better than existing time-consuming heuristic maximum-likelihood methods. Our method is suitable for reconstructing relatively large molecular phylogenetic trees (number of taxa ≥ 16).

10.
Beerli P, Felsenstein J. Genetics 1999, 152(2): 763-773
A new method for the estimation of migration rates and effective population sizes is described. It uses a maximum-likelihood framework based on coalescence theory. The parameters are estimated by Metropolis-Hastings importance sampling. In a two-population model this method estimates four parameters: the effective population size and the immigration rate for each population relative to the mutation rate. Summarizing over loci can be done by assuming either that the mutation rate is the same for all loci or that the mutation rates are gamma distributed among loci but the same for all sites of a locus. The estimates are as good as or better than those from an optimized FST-based measure. The program is available on the World Wide Web at http://evolution.genetics.washington.edu/lamarc.html/.

11.
Kirkpatrick M, Meyer K. Genetics 2004, 168(4): 2295-2306
Estimating the genetic and environmental variances for multivariate and function-valued phenotypes poses problems for estimation and interpretation. Even when the phenotype of interest has a large number of dimensions, most variation is typically associated with a small number of principal components (eigenvectors or eigenfunctions). We propose an approach that directly estimates these leading principal components; these then give estimates for the covariance matrices (or functions). Direct estimation of the principal components reduces the number of parameters to be estimated, uses the data efficiently, and provides the basis for new estimation algorithms. We develop these concepts for both multivariate and function-valued phenotypes and illustrate their application in the restricted maximum-likelihood framework.

12.
The present article discusses the use of computational methods based on generalized estimating equations (GEE), as a potential alternative to full maximum-likelihood methods, for performing segregation analysis of continuous phenotypes by using randomly selected family data. The method that we propose can estimate effect and degree of dominance of a major gene in the presence of additional nongenetic or polygenetic familial associations, by relating sample moments to their expectations calculated under the genetic model. It is known that not all parameters in basic major-gene models can be identified, for estimation purposes, solely in terms of the first two sample moments of data from randomly selected families. Thus, we propose the use of higher (third order) sample moments to resolve this identifiability problem, in a pseudo-profile likelihood estimation scheme. In principle, our methods may be applied to fitting genetic models by using complex pedigrees and for estimation in the presence of missing phenotype data for family members. In order to assess its statistical efficiency, we compare several variants of the method with each other and with maximum-likelihood estimates provided by the SAGE computer package in a simulation study.

13.
Estimation of the constants a, b, and λ0 by means of the standard Moffitt-Yang plot is evaluated. It is found that the method is very insensitive as an estimation procedure and that large errors in b may be expected. Expressions for the maximum-likelihood estimates of the constants are derived.

14.
For evolutionary studies of polyploid species, estimates of the genetic identity between species with different degrees of ploidy are particularly required because gene counting in samples of polyploid individuals often cannot be done; e.g., in triploids the phenotype AB can be genotypically either ABB or AAB. We recently suggested a genetic distance measure that is based on phenotype counting and made available the computer program POPDIST. The program provides maximum-likelihood estimates of the genetic identities and distances between polyploid populations, but this approach is not informative for populations within species that only differ in their allele frequencies. We now close this gap by applying the frequencies of shared 'bands' in both populations to Nei's identity measure. Our simulation study demonstrates the close correlation between the band-sharing identity and the genetic identity calculated on the basis of gene frequencies for any degree of ploidy. The new extended version of POPDIST (version 1.2.0) provides the option of choosing either the maximum-likelihood estimator or the band-sharing measure.
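
The abstract does not spell out how POPDIST computes the band-sharing identity. As a rough sketch only, the following plugs per-band frequencies from two populations into the standard single-locus form of Nei's normalized identity, I = Σ x_i y_i / sqrt(Σ x_i² · Σ y_i²). The function name and the example frequencies are hypothetical.

```python
import math


def nei_identity(freqs_x, freqs_y):
    """Nei's normalized identity between two frequency vectors.

    Assumption: both lists are ordered by the same bands (or alleles).
    Whether POPDIST weights or averages bands this way is not stated
    in the abstract.
    """
    j_xy = sum(x * y for x, y in zip(freqs_x, freqs_y))
    j_x = sum(x * x for x in freqs_x)
    j_y = sum(y * y for y in freqs_y)
    return j_xy / math.sqrt(j_x * j_y)


# Hypothetical band frequencies in two populations.
pop_a = [0.8, 0.5, 0.1]
pop_b = [0.6, 0.7, 0.2]
identity = nei_identity(pop_a, pop_b)
print(f"identity = {identity:.3f}")
```

The identity equals 1 when both populations have the same frequencies and 0 when they share no bands.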

15.
The temporal and spatial population genetic structure of ayu Plecoglossus altivelis (Salmoniformes: Plecoglossidae), an amphidromous fish, was examined using analysis of variation at six microsatellite DNA loci. Intracohort genetic diversities, as measured by the number of alleles and heterozygosity, were similar among six cohorts (2001–2006) within a population (Nezugaseki River), with the mean number of alleles per cohort ranging from 11·0 to 12·5 and the expected heterozygosity ranging from 0·74 to 0·77. Intrapopulational genetic diversities were also similar across the three studied populations along the 50 km coast, with the mean number of alleles and the expected heterozygosity ranging from 11·33 to 11·67 and from 0·75 to 0·76, respectively. The authors observed only one significant difference in pair-wise population differentiation (FST value) between the cohorts within a population and among three populations. Estimates of the effective population size (Ne) based on a maximum-likelihood method yielded small values (ranging from 94·8 to 135·5), whereas census population size ranged from c. 4800 to 24 000. As a result, the ratio of annual effective population size to census population size (Ne/N) ranged from 0·004 to 0·023. These estimates of Ne/N agree more closely with estimates for marine fishes than with the larger estimates for freshwater fishes. The present study suggests that ayu, which is highly fecund and shows low survival during the early life stages, is also characterized by a low value of Ne/N, similar to marine species with a pelagic life cycle.

16.
Inference of haplotypes is important for many genetic approaches, including the process of assigning a phenotype to a genetic region. Usually, the population frequencies of haplotypes, as well as the diplotype configuration of each subject, are estimated from a set of genotypes of the subjects in a sample from the population. We have developed an algorithm to infer haplotype frequencies and the combination of haplotype copies in each pool by using pooled DNA data. The input data are the genotypes in pooled DNA samples, each of which contains the quantitative genotype data from one to six subjects. The algorithm infers by the maximum-likelihood method both frequencies of the haplotypes in the population and the combination of haplotype copies in each pool by an expectation-maximization algorithm. The algorithm was implemented in the computer program LDPooled. We also used the bootstrap method to calculate the standard errors of the estimated haplotype frequencies. Using this program, we analyzed the published genotype data for the SAA (n=156), MTHFR (n=80), and NAT2 (n=116) genes, as well as the smoothelin gene (n=102). Our study has shown that the frequencies of major (frequency >0.1 in a population) haplotypes can be inferred rather accurately from the pooled DNA data by the maximum-likelihood method, although with some limitations. The estimated D and D' values had large variations except when the |D| values were >0.1. The estimated linkage-disequilibrium measure ρ² for 36 linked loci of the smoothelin gene when one- and two-subject pool protocols were used suggested that the gross pattern of the distribution of the measure can be reproduced using the two-subject pool data.

17.
To evaluate how environmental and genetic factors influence mating-system evolution, accurate estimates of outcrossing rates of individual plants (families) are required. Using isozyme markers, we observed wide variation in family outcrossing rates in three natural populations of Asclepias incarnata using three statistical methods: (1) a multilocus maximum-likelihood procedure (t_m); (2) a multilocus method-of-moments procedure (t_a); and (3) a direct comparison of progeny phenotypes against maternal phenotypes (t_d). Neighborhood floral-display size was positively correlated with t_a in one population, but showed no relationship with any of the other estimates of outcrossing for any population. Monte-Carlo simulations revealed that statistical variation associated with these estimation procedures can be large enough to explain all of the observed variation in outcrossing. We also found that significant, spurious correlations with neighborhood floral display could arise, on average, 7% of the time by chance alone. Our observations suggest that it is difficult to obtain accurate estimates of outcrossing in naturally pollinated plants using the estimation procedures currently available. Moreover, we caution that attempts to interpret observed variation in family outcrossing estimates by observing variation in ecological parameters could be misleading. Received: 28 September 1998 / Accepted: 27 October 1998

18.
Selection can have a significant effect on sequence evolution and this will be reflected in the information contained within the phylogenetic relationships between species. Selection will reduce the frequency of any deleterious nucleotides, and this can be used to test for the presence of selection. The frequencies of different nucleotides can be predicted theoretically and compared to observed values. If a sample of sequences has an unusually low frequency of a particular nucleotide then selection might be inferred to have acted upon these sequences. This conclusion can be true only if the sequences are not too closely related and if sufficient mutations have occurred during their evolution. Otherwise, the unusual pattern of nucleotides in the sequences may be caused by recent common ancestry. An algorithm is presented to obtain maximum-likelihood estimates of selection coefficients using the phylogenetic information contained within sequence data. A k-allele model is developed that uses the phylogeny to measure relative mutation rates and degrees of relatedness and to evaluate the likelihood in the presence of selection. The method is illustrated with examples from the NS2 genes of influenza viruses and the MHC genes of mice. It is shown that the maximum-likelihood estimates of mutation rates are very large for influenza viruses and that statistically significant selection acts to maintain a specific coding sequence. Overall, the MHC genes also have significant selection to preserve the coding sequence, but at the antigen recognition site, this selection is reversed to promote genetic variation. Maximum-likelihood estimates of these selection coefficients are provided.

19.
Pybus OG, Rambaut A, Harvey PH. Genetics 2000, 155(3): 1429-1437
We describe a unified set of methods for the inference of demographic history using genealogies reconstructed from gene sequence data. We introduce the skyline plot, a graphical, nonparametric estimate of demographic history. We discuss both maximum-likelihood parameter estimation and demographic hypothesis testing. Simulations are carried out to investigate the statistical properties of maximum-likelihood estimates of demographic parameters. The simulations reveal that (i) the performance of exponential growth model estimates is determined by a simple function of the true parameter values and (ii) under some conditions, estimates from reconstructed trees perform as well as estimates from perfect trees. We apply our methods to HIV-1 sequence data and find strong evidence that subtypes A and B have different demographic histories. We also provide the first (albeit tentative) genetic evidence for a recent decrease in the growth rate of subtype B.
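
The abstract does not state the skyline formula. The sketch below uses the form in which the classic skyline estimate is usually written: for the interval during which the genealogy has k lineages and which lasts γ_k, the estimated scaled population size is γ_k · k(k − 1)/2. The function name and the interval values are hypothetical.

```python
def classic_skyline(coalescent_intervals):
    """Per-interval scaled population size estimates.

    coalescent_intervals: list of (n_lineages, duration) pairs, one per
    interval between successive coalescent events. Durations are in the
    same units as the branch lengths of the genealogy (an assumption of
    this sketch), so the estimates are scaled accordingly.
    """
    return [
        (k, duration * k * (k - 1) / 2.0)
        for k, duration in coalescent_intervals
    ]


# Hypothetical genealogy of four sequences: intervals with 4, 3 and 2 lineages.
intervals = [(4, 0.01), (3, 0.02), (2, 0.05)]
for k, size in classic_skyline(intervals):
    print(f"{k} lineages: estimated scaled size {size:.3f}")
```

Plotting these piecewise-constant estimates against time gives the stepwise skyline curve.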

20.
Phenotypes in an ABO-like system are assumed to have been observed for a number of genetically independent persons from a number of populations. The program, which is written in FORTRAN, calculates maximum-likelihood estimates of gene frequencies and their standard errors in each population and in the populations taken together. Furthermore, the program calculates expected values and performs likelihood-ratio and goodness-of-fit chi-square tests of Hardy-Weinberg equilibrium. If several subpopulations are pooled together, a likelihood-ratio test of homogeneity is performed.
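
The abstract does not say how the FORTRAN program maximizes the likelihood. One standard route for an ABO-like system is gene counting, a form of the EM algorithm. The sketch below shows that approach for a single population under Hardy-Weinberg proportions. The function names, the iteration count and the example counts are assumptions, and the standard errors and tests mentioned in the abstract are not covered.

```python
def _homozygote_share(f, r):
    """Share of a dominant phenotype expected to be homozygous.

    Equals f^2 / (f^2 + 2 f r), simplified to f / (f + 2 r).
    """
    denom = f + 2.0 * r
    return f / denom if denom > 0 else 0.0


def abo_gene_counting(n_a, n_b, n_ab, n_o, iterations=200):
    """Gene-counting (EM) estimates of ABO allele frequencies (p, q, r)."""
    n = n_a + n_b + n_ab + n_o
    p = q = r = 1.0 / 3.0  # arbitrary starting values
    for _ in range(iterations):
        # E-step: split phenotypes A and B into expected genotype counts.
        n_aa = n_a * _homozygote_share(p, r)
        n_ao = n_a - n_aa
        n_bb = n_b * _homozygote_share(q, r)
        n_bo = n_b - n_bb
        # M-step: count alleles among the 2n genes.
        p = (2 * n_aa + n_ao + n_ab) / (2 * n)
        q = (2 * n_bb + n_bo + n_ab) / (2 * n)
        r = (n_ao + n_bo + 2 * n_o) / (2 * n)
    return p, q, r


# Hypothetical phenotype counts: A, B, AB, O.
p, q, r = abo_gene_counting(4500, 1300, 600, 3600)
print(f"p = {p:.4f}, q = {q:.4f}, r = {r:.4f}")
```

A fixed iteration count keeps the sketch short; a convergence check on the change in the estimates would be the more usual stopping rule.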
