Found 20 similar articles; search took 15 ms
1.
Effects of genotyping errors, missing values and segregation distortion in molecular marker data on the construction of linkage maps (total citations: 12; self-citations: 0; citations by others: 12)
A simulation study was performed to investigate the effects of missing values, typing errors and distorted segregation ratios in molecular marker data on the construction of genetic linkage maps, and to compare the performance of three locus-ordering criteria (weighted least squares, maximum likelihood and minimum sum of adjacent recombination fractions) in the presence of such effects. The study was based upon three linkage groups of 10 loci at 2, 6, and 10 cM spacings simulated from a doubled-haploid population of size 150. Criterion performance was assessed using the number of replicates with correctly estimated orders, the mean rank correlation between the estimated and the true order and the mean total map length. Bootstrap samples from replicates in the maximum likelihood analysis produced a measure of confidence in the estimated locus order. Missing values and/or typing errors in the data reduce the proportion of correctly ordered maps, and this problem worsens as the distances between loci decrease. The maximum likelihood criterion is most successful at ordering loci correctly, but gives estimated map lengths that are substantially inflated when typing errors are present. The presence of missing values in the data produces shorter map lengths for more widely spaced markers, especially under the weighted least-squares criterion. Overall, the presence of segregation distortion has little effect on this population.
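As a rough illustration of the quantities this abstract refers to, the sketch below estimates the recombination fraction between two markers in a doubled-haploid population and converts it to a map distance. It is not the study's code: the function names, the `-` missing-value symbol and the choice of Haldane's map function are all assumptions made for this example.

```python
import math


def recombination_fraction(marker_a, marker_b, missing="-"):
    """Estimate r between two loci in a doubled-haploid population.

    Each string holds one parental allele code per line (e.g. 'A' or 'B').
    Lines with a missing genotype at either locus are skipped, so missing
    values shrink the effective sample size. (Hypothetical helper.)
    """
    informative = [
        (a, b) for a, b in zip(marker_a, marker_b)
        if a != missing and b != missing
    ]
    if not informative:
        raise ValueError("no individuals typed at both loci")
    # A line is recombinant when the two loci carry different parental alleles.
    recombinants = sum(1 for a, b in informative if a != b)
    return recombinants / len(informative)


def haldane_cm(r):
    """Convert a recombination fraction to centimorgans (Haldane, assumed)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


r_hat = recombination_fraction("AABBAB-ABA", "AABBBBAABA")
print(round(r_hat, 4), round(haldane_cm(r_hat), 2))
```

Under this simple estimator, a single mistyped genotype at a non-recombinant line adds one apparent recombinant, which is one way to see why typing errors tend to lengthen estimated maps.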
2.
Maximum likelihood techniques for the mapping and analysis of quantitative trait loci with the aid of genetic markers (total citations: 20; self-citations: 0; citations by others: 20)
J I Weller 《Biometrics》1986,42(3):627-640
A method is presented to estimate the biometric parameters of a quantitative trait locus linked to a genetic marker when both loci are segregating in the F2 generation of a cross between two inbred lines. The method, which assumes underlying normal distributions, is a combination of maximum likelihood and moments methods and uses the statistics of the genetic marker genotype samples for the quantitative trait to estimate the recombination frequency between the two loci and the means and variances of the genotypes of the quantitative trait locus. With this method, the genetic parameters of a locus affecting plant height linked to an electrophoretic marker for esterase were accurately estimated from a sample of 1596 F2 progeny of a cross between two species of Lycopersicon (tomato). Linkage distance between the two loci was 38 map units and the effect of the quantitative trait locus was 1.6 phenotypic standard deviation units. Accurate estimates of the genetic parameters and linkage distance for populations of 2000 individuals simulated with a segregating codominant locus with an effect of 1.63 standard deviations linked to a genetic marker with 0.2 recombination were also derived by this method. The method is not effective in distinguishing between complete and partial linkage in samples of only 500 individuals or for quantitative loci with effects less than a phenotypic standard deviation. The method is more effective for codominant than for dominant loci.
3.
Kang Huang Kermit Ritland Songtao Guo Milena Shattuck Baoguo Li 《Molecular ecology resources》2014,14(4):734-744
Studies in genetics and ecology often require estimates of relatedness coefficients based on genetic marker data. Many diploid estimators have been developed using either method‐of‐moments or maximum‐likelihood estimates. However, there are no relatedness estimators for polyploids. The development of a moment estimator for polyploids with polysomic inheritance, which simultaneously incorporates the two‐gene relatedness coefficient and various ‘higher‐order’ coefficients, is described here. The performance of the estimator is compared to other estimators under a variety of conditions. When using a small number of loci, the estimator is biased because of an increase in ill‐conditioned matrices. However, the estimator becomes asymptotically unbiased with large numbers of loci. The ambiguity of polyploid heterozygotes (when balanced heterozygotes cannot be distinguished from unbalanced heterozygotes) is also considered; as with low numbers of loci, genotype ambiguity leads to bias. A software package, PolyRelatedness, implementing this method and supporting a maximum ploidy of 8 is provided.
4.
As a result of previous large, multipoint linkage studies there is a substantial amount of existing marker data. Due to the increased sample size, genetic maps estimated from these data could be more accurate than publicly available maps. However, current methods for map estimation are restricted to data sets containing pedigrees with a small number of individuals, or cannot make full use of marker data that are observed at several loci on members of large, extended pedigrees. In this article, a maximum likelihood (ML) method for map estimation that can make full use of the marker data in a large, multipoint linkage study is described. The method is applied to replicate sets of simulated marker data involving seven linked loci, and pedigree structures based on the real multipoint linkage study of Abkevich et al. (2003, American Journal of Human Genetics 73, 1271-1281). The variance of the ML estimate is accurately estimated, and tests of both simple and composite null hypotheses are performed. An efficient procedure for combining map estimates over data sets is also suggested.
5.
Abdallah JM Mangin B Goffinet B Cierco-Ayrolles C Pérez-Enciso M 《Genetical research》2004,83(1):41-47
We present a maximum likelihood method for mapping quantitative trait loci that uses linkage disequilibrium information from single and multiple markers. We made paired comparisons between analyses using a single marker, two markers and six markers. We also compared the method to single marker regression analysis under several scenarios using simulated data. In general, our method outperformed regression (smaller mean square error and confidence intervals of location estimate) for quantitative trait loci with dominance effects. In addition, the method provides estimates of the frequency and additive and dominance effects of the quantitative trait locus.
6.
Estimation of the marker gene frequency and linkage disequilibrium from conditional marker data. (total citations: 9; self-citations: 9; citations by others: 0)
A method is proposed to calculate the maximum likelihood estimate of gene frequency and linkage disequilibrium from disease-codominant marker conditional data. The method is illustrated using data on sickle-cell anemia and Duchenne muscular dystrophy and linked polymorphic restriction endonuclease cleavage sites.
7.
Estimating the power of a proposed linkage study for a complex genetic trait. (total citations: 27; self-citations: 15; citations by others: 12)
Many genetic traits have complex modes of inheritance; they may exhibit incomplete or age-dependent penetrance or fail to show any clear Mendelian inheritance pattern. As primary linkage maps for the human genome near completion, it is becoming increasingly possible to map these traits. Prior to undertaking a linkage study, it is important to consider whether the pedigrees available for the proposed study are likely to provide sufficient information to demonstrate linkage, assuming a linked marker is tested. In the current paper, we describe a computer simulation method to estimate the power of a proposed study to detect linkage for a complex genetic trait, given a hypothesized genetic model for the trait. Our method simulates trait locus genotypes consistent with observed trait phenotypes, in such a way that the probability of detecting linkage can be estimated by sample statistics of the maximum lod score distribution. The method uses terms available when calculating the likelihood of the trait phenotypes for the pedigree and is applicable to any trait determined by one or a few genetic loci; individual-specific environmental effects can also be dealt with. Our method provides an objective answer to the question: will these pedigrees provide sufficient information to map this complex genetic trait?
8.
Tightly linked markers for the neurofibromatosis type 1 gene (total citations: 15; self-citations: 0; citations by others: 15)
Ray White Yusuke Nakamura Peter O'Connell Mark Leppert Jean-Marc Lalouel David Barker David Goldgar Mark Skolnick John Carey C. E. Wallis C. P. Slater Chris Mathew Bruce Ponder 《Genomics》1987,1(4):364-367
Relationships among genetic markers in the region of the neurofibromatosis type 1 (NF1) gene on chromosome 17 were investigated by linkage studies in a large sample set of affected families and in a panel of 58 normal families. A new marker, pHHH202 (D17S33), was included along with two markers known to be closely linked to NF. The maximum likelihood estimate of the recombination rate between the pHHH202 and NF1 loci was found to be 0. Multilocus analysis suggested the following marker order: pA10-41-(p3-6, pHHH202); the NF1 gene fell with equal likelihood between either pA10-41-p3-6 or p3-6-pHHH202. The odds against NF1 being outside this cluster of tightly linked markers were greater than 15:1.
9.
Chathurika K. H. Hettiarachchige Richard M. Huggins 《Biometrical journal. Biometrische Zeitschrift》2018,60(3):463-479
Accurate estimation of the size of animal populations is an important task in ecological science. Recent advances in molecular genetics research allow the use of genetic data to estimate the size of a population from a single capture occasion rather than repeated occasions as in the usual capture–recapture experiments. Estimating the population size using genetic data has also sometimes led to estimates that differ markedly from each other and also from classical capture–recapture estimates. Here, we develop a closed form estimator that uses genetic information to estimate the size of a population consisting of mothers and daughters, focusing on estimating the number of mothers, using data from a single sample. We demonstrate the estimator is consistent and propose a parametric bootstrap to estimate the standard errors. The estimator is evaluated in a simulation study and applied to real data. We also consider maximum likelihood in this setting and discover problems that preclude its general use.
10.
Bink MC Anderson AD van de Weg WE Thompson EA 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2008,117(6):843-855
Several estimators have been proposed that use molecular marker data to infer the degree of relatedness for pairs of individuals.
The objective of this study was to evaluate the performance of seven estimators when applied to marker data of a set of 33
key individuals from a large complex apple pedigree. The evaluation considered different scenarios of allele frequencies and
different numbers of marker loci. The method of moments estimators were Similarity, Queller-Goodnight, Lynch-Ritland and
Wang. The maximum likelihood estimators were Thompson, Anderson-Weir and Jacquard. The pedigree-based coancestry coefficients
were taken as the point of reference in calculating correlations and root mean square error (RMSE). The marker data comprised
86 multi-allelic SSR markers on 17 linkage groups, covering 11 Morgans. Additionally, we simulated 10 datasets conditional
on the real pedigree to support the results on the real dataset. None of the estimators outperformed the others. Knowledge
of allele frequencies appeared to be the most influential, i.e., the highest correlations and lowest RMSE were found when
frequencies from the founder population were available. When equal allele frequencies were used, all estimators resulted in
very similar, but on average lower, correlations. The use of allele frequencies estimated from the set of 33 individuals gave,
on average, the poorest results. The maximum likelihood estimators and the Lynch-Ritland estimator were the most sensitive
to allele frequencies. The results from the simulation study fully supported the trends in results of the real dataset. This
study indicated that high correlations (up to 0.90) and small RMSE (below 0.03), may be obtained when population allelic frequencies
are available. In this scenario, the performances of the various estimators were similar, but seemed to favor the maximum
likelihood estimators. In the absence of reliable allele frequencies the method of moments estimators were shown to be more
robust. The number of marker loci influenced the average performance of the estimators; however, the ranking was not affected.
Correlations up to 0.80 were obtained when two markers per chromosome and appropriate allele frequencies were available. Adding
more markers to the current dataset may lead to marginal improvements.
11.
S. Maenhout B. De Baets G. Haesaert 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》2009,118(6):1181-1192
Molecular markers allow the pairwise relatedness between the members of a breeding pool to be estimated when their selection history
is no longer available or has become too complex for a classical pedigree analysis. The field of population genetics has several
estimation procedures at its disposal, but when the genotyped individuals are highly selected inbred lines, their application
is not warranted as the theoretical assumptions on which these estimators were built, usually linkage equilibrium between
marker loci or even Hardy–Weinberg equilibrium, are not met. An alternative approach requires the availability of a genotyped
reference set of inbred lines, which allows the observed marker similarities to be corrected for their inherent upward bias when
used as a coancestry measure. However, this approach does not guarantee that the resulting coancestry matrix is at least positive
semi-definite (psd), a necessary condition for its use as a covariance matrix. In this paper we present the weighted alikeness
in state (WAIS) estimator. This marker-based coancestry estimator is compared to several other commonly applied relatedness
estimators under realistic hybrid breeding conditions in a number of simulations. We also fit a linear mixed model to phenotypical
data from a commercial maize breeding programme and compare the likelihood of the different variance structures. WAIS is shown
to be psd which makes it suitable for modelling the covariance between genetic components in linear mixed models involved
in breeding value estimation or association studies. Results indicate that it generally produces a low root mean squared error
under different breeding circumstances and provides a fit to the data that is comparable to that of several other marker-based
alternatives. Recommendations for each of the examined coancestry measures are provided.
12.
S. J. Knapp W. C. Bridges Jr. D. Birkes 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》1990,79(5):583-592
Summary: High-density restriction fragment length polymorphism (RFLP) and allozyme linkage maps have been developed in several plant species. These maps make it technically feasible to map quantitative trait loci (QTL) using methods based on flanking marker genetic models. In this paper, we describe flanking marker models for doubled haploid (DH), recombinant inbred (RI), backcross (BC), F1 testcross (F1TC), DH testcross (DHTC), recombinant inbred testcross (RITC), F2, and F3 progeny. These models are functions of the means of quantitative trait locus genotypes and recombination frequencies between marker and quantitative trait loci. In addition to the genetic models, we describe maximum likelihood methods for estimating these parameters using linear, nonlinear, and univariate or multivariate normal distribution mixture models. We defined recombination frequency estimators for backcross and F2 progeny group genetic models using the parameters of linear models. In addition, we found a genetically unbiased estimator of the QTL heterozygote mean using a linear function of marker means. In nonlinear models, recombination frequencies are estimated less efficiently than the means of quantitative trait locus genotypes. Recombination frequency estimation efficiency decreases as the distance between markers decreases, because the number of progeny in recombinant marker classes decreases. Mean estimation efficiency is nearly equal for these methods.
13.
Although a major component of fitness, male reproductive success is generally extremely difficult to estimate. As a result, genetic methods and maximum likelihood models have been developed to estimate male parentage, but all are limited in practice by the degree of genetic variation observable. Scoring individuals phenotypically at a large number of random loci exhibiting dominance (e.g. RAPD markers) may provide a means of detecting sufficient genetic variation. Dominance, however, represents a loss of information and therefore greater variation in the estimate of paternity. A mixture model describing mating in a population is presented to quantify the trade-off between marker types when estimates of male fertility are sought. A sample size 1.5-2.0 times greater is required for dominant markers under some conditions to obtain the same confidence in fertility estimates as for codominant markers, although with large sample sizes the fertility estimates are similar for either marker type. Since the number of dominant DNA markers is not limited in the same manner as is the number of codominant protein markers, one's confidence in the estimates can be increased above that possible from proteins by surveying additional loci. However, for a fixed sample size a trade-off exists between the number of progeny assayed per female and the number of loci surveyed. In many cases more progeny per female provide better estimates of fertility than more loci.
14.
How Informative Is Wright's Estimator of the Number of Genes Affecting a Quantitative Character? (total citations: 3; self-citations: 1; citations by others: 2)
S. Wright suggested an estimator, m̂, of the number of loci, m, contributing to the difference in a quantitative character between two differentiated populations, which is calculated from the phenotypic means and variances in the two parental populations and their F1 and F2 hybrids. The same method can also be used to estimate m contributing to the genetic variance within a single population, by using divergent selection to create differentiated lines from the base population. In this paper we systematically examine the utility and problems of this technique under the influences of unequal allelic effects and initial allele frequencies, and linkage, which are known to lead m̂ to underestimate m. In addition, we examine the effects of population size and selection intensity during the generations of selection. During selection, the estimator m̂ rapidly approaches its expected value at the selection limit. With reasonable assumptions about unequal allelic effects and initial allele frequencies, the expected value of m̂ without linkage is likely to be on the order of one-third of the number of genes. The estimates suffer most seriously from linkage. The practical maximum expectation of m̂ is just about the number of chromosomes, considerably less than the "recombination index" which has been assumed to be the upper limit. The estimates are also associated with large sampling variances. An estimator of the variance of m̂ derived by R. Lande substantially underestimates the actual variance. Modifications to the method can ameliorate some of the problems. These include using F3 or later generation variances or the genetic variance in the base population, and replicating the experiments and estimation procedure. However, even in the best of circumstances, information from m̂ is very limited and can be misleading.
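The abstract does not state the formula, so the following is only a sketch of the commonly cited Castle-Wright form of the estimator: the squared difference of the parental means divided by eight times the segregation variance. Taking the segregation variance as Var(F2) minus Var(F1) is one of several conventions and is an assumption here, as are the function name and the toy data.

```python
from statistics import mean, variance


def castle_wright(p1, p2, f1, f2):
    """Sketch of a Castle-Wright style estimate of the number of loci.

    p1, p2: phenotypes from the two parental populations.
    f1, f2: phenotypes from their F1 and F2 hybrids.
    Segregation variance is taken as Var(F2) - Var(F1) (assumed convention).
    """
    segregation_var = variance(f2) - variance(f1)
    if segregation_var <= 0:
        raise ValueError("segregation variance must be positive")
    return (mean(p1) - mean(p2)) ** 2 / (8.0 * segregation_var)


# Toy data, invented for illustration only.
estimate = castle_wright(
    p1=[9, 10, 11], p2=[1, 2, 3], f1=[5, 6, 7], f2=[3, 6, 9]
)
print(estimate)
```

Because the estimate is a ratio with a difference of sample variances in the denominator, small samples can make it unstable, which is consistent with the large sampling variances the abstract reports.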
15.
Joint modeling of linkage and association: identifying SNPs responsible for a linkage signal
Once genetic linkage has been identified for a complex disease, the next step is often association analysis, in which single-nucleotide polymorphisms (SNPs) within the linkage region are genotyped and tested for association with the disease. If a SNP shows evidence of association, it is useful to know whether the linkage result can be explained, in part or in full, by the candidate SNP. We propose a novel approach that quantifies the degree of linkage disequilibrium (LD) between the candidate SNP and the putative disease locus through joint modeling of linkage and association. We describe a simple likelihood of the marker data conditional on the trait data for a sample of affected sib pairs, with disease penetrances and disease-SNP haplotype frequencies as parameters. We estimate model parameters by maximum likelihood and propose two likelihood-ratio tests to characterize the relationship of the candidate SNP and the disease locus. The first test assesses whether the candidate SNP and the disease locus are in linkage equilibrium so that the SNP plays no causal role in the linkage signal. The second test assesses whether the candidate SNP and the disease locus are in complete LD so that the SNP or a marker in complete LD with it may account fully for the linkage signal. Our method also yields a genetic model that includes parameter estimates for disease-SNP haplotype frequencies and the degree of disease-SNP LD. Our method provides a new tool for detecting linkage and association and can be extended to study designs that include unaffected family members.
16.
D E Goldgar 《American journal of human genetics》1990,47(6):957-967
A unique method of partitioning human quantitative genetic variation into effects due to specific chromosomal regions is presented. This method is based on estimating the proportion of genetic material, R, shared identical by descent (IBD) by sibling pairs in a specified chromosomal region, on the basis of their marker genotypes at a set of marker loci spanning the region. The mean and variance of the distribution of R conditional on IBD status and recombination pattern between two marker loci are derived as a function of the distance between the two loci. The distribution of the estimates of R is exemplified using data on 22 loci on chromosome 7. A method of using the estimated R values and observed values of a quantitative trait in a set of sibships to estimate the proportion of total genetic variance explained by loci in the region of interest is presented. Monte Carlo simulation techniques are used to show that this method is more powerful than existing methods of quantitative linkage analysis based on sib pairs. It is also shown through simulation studies that the proposed method is sensitive to genetic variation arising from both a single locus of large effect as well as from several loosely linked loci of moderate phenotypic effect.
17.
Two-locus models of disease: comparison of likelihood and nonparametric linkage methods. (total citations: 12; self-citations: 10; citations by others: 2)
The power to detect linkage for likelihood and nonparametric (Haseman-Elston, affected-sib-pair, and affected-pedigree-member) methods is compared for the case of a common, dichotomous trait resulting from the segregation of two loci. Pedigree data for several two-locus epistatic and heterogeneity models have been simulated, with one of the loci linked to a marker locus. Replicate samples of 20 three-generation pedigrees (16 individuals/pedigree) were simulated and then ascertained for having at least 6 affected individuals. The power of linkage detection calculated under the correct two-locus model is only slightly higher than that under a single locus model with reduced penetrance. As expected, the nonparametric linkage methods have somewhat lower power than does the lod-score method, the difference depending on the mode of transmission of the linked locus. Thus, for many pedigree linkage studies, the lod-score method will have the best power. However, this conclusion depends on how many times the lod score will be calculated for a given marker. The Haseman-Elston method would likely be preferable to calculating lod scores under a large number of genetic models (i.e., varying both the mode of transmission and the penetrances), since such an analysis requires an increase in the critical value of the lod criterion. The power of the affected-pedigree-member method is lower than the other methods, which can be shown to be largely due to the fact that marker genotypes for unaffected individuals are not used.
18.
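Maximum likelihood estimation of the location and effect of sterility genes in distant hybridization (total citations: 2; self-citations: 1; citations by others: 1)
A statistical method is proposed that uses the distorted segregation of marker loci to estimate the location and effect of sterility gene loci in distant hybridization. In a backcross population, the recombination fraction between the sterility gene and the marker locus and the gamete survival rate are estimated by maximum likelihood. Converting a continuously distributed fertility index into the segregation of discontinuously varying genetic markers avoids the instability in recombination fraction estimates that arises from observing fertility directly, and also allows the survival rates of female and male gametes to be estimated simultaneously.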
19.
Arindam RoyChoudhury 《Journal of mathematical biology》2011,62(1):65-80
The structure of dependence between neighboring genetic loci is intractable under some models that treat each locus as a single
data-point. Composite likelihood-based methods present a simple approach under such models by treating the data as if they
are independent. A maximum composite likelihood estimator (MCLE) is not easy to find numerically, as in most cases we do not
have a way of knowing if a maximum is global. We study the local maxima of the composite likelihood (ECLE, the efficient composite
likelihood estimators), which is straightforward to compute. We establish desirable properties of the ECLE and provide an
estimator of the variance of MCLE and ECLE. We also modify two proper likelihood-based tests to be used with composite likelihood.
We modify our methods to make them applicable to datasets where some loci are excluded.
20.
We propose a new method for simultaneously detecting linkage disequilibrium and genetic structure in subdivided populations. Taking subpopulation structure into account with a hierarchical model, we estimate the magnitude of genetic differentiation and linkage disequilibrium in a metapopulation on the basis of geographical samples, rather than decompose a population into a finite number of random-mating subpopulations. We assume that Hardy-Weinberg equilibrium is satisfied in each locality, but do not assume independence between marker loci. Linkage states remain unknown. Genetic differentiation and linkage disequilibrium are expressed as hyperparameters describing the prior distribution of genotypes or haplotypes. We estimate related parameters by maximizing marginal-likelihood functions and detect linkage equilibrium or disequilibrium by the Akaike information criterion. Our empirical Bayesian model analyzes genotype and haplotype frequencies regardless of haploid or diploid data, so it can be applied to most commonly used genetic markers. The performance of our procedure is examined via numerical simulations in comparison with classical procedures. Finally, we analyze isozyme data of ayu, a severely exploited fish species, and single-nucleotide polymorphisms in human ALDH2.
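To make the model-selection step concrete, the sketch below compares a linkage-equilibrium model with a linkage-disequilibrium model for two biallelic loci using the Akaike information criterion. It is a deliberately simplified stand-in, not the hierarchical empirical Bayes procedure of the abstract: it assumes a single population, phase-known haplotype counts and plain maximum likelihood, and the function name and counts are invented for illustration.

```python
import math


def aic_equilibrium_vs_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return AIC values for two models of two-locus haplotype counts.

    Equilibrium: haplotype frequency = product of allele frequencies
    (2 free parameters). Disequilibrium: unconstrained haplotype
    frequencies (3 free parameters). AIC = 2k - 2 ln L.
    """
    counts = [n_AB, n_Ab, n_aB, n_ab]
    n = sum(counts)
    if n == 0:
        raise ValueError("no haplotypes observed")

    # Disequilibrium model: the MLEs are the observed haplotype proportions.
    ll_ld = sum(c * math.log(c / n) for c in counts if c > 0)

    # Equilibrium model: the MLEs are the marginal allele frequencies.
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    expected = [p_A * p_B, p_A * (1 - p_B), (1 - p_A) * p_B, (1 - p_A) * (1 - p_B)]
    ll_le = sum(c * math.log(e) for c, e in zip(counts, expected) if c > 0)

    return {"equilibrium": 2 * 2 - 2 * ll_le, "disequilibrium": 2 * 3 - 2 * ll_ld}


aic = aic_equilibrium_vs_disequilibrium(40, 10, 10, 40)
print(min(aic, key=aic.get))  # model with the lower AIC is preferred
```

The extra parameter in the disequilibrium model is penalised by 2 AIC units, so it is preferred only when the haplotype counts depart enough from the product of allele frequencies.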