Similar literature
20 similar articles found (search time: 31 ms)
1.
The development of molecular genotyping techniques makes it possible to analyze quantitative traits on the basis of individual loci. With marker information, the classical theory of estimating the genetic covariance between relatives can be reformulated to improve the accuracy of estimation. In this study, an algorithm was derived for computing the conditional covariance between relatives given genetic markers. Procedures for calculating the conditional relationship coefficients for additive, dominance, additive by additive, additive by dominance, dominance by additive and dominance by dominance effects were developed. The relationship coefficients were computed based on conditional QTL allelic transmission probabilities, which were inferred from the marker allelic transmission probabilities. An example data set with pedigree and linked markers was used to demonstrate the methods developed. Although this study dealt with two QTLs coupled with linked markers, the same principle can be readily extended to the situation of multiple QTLs. The treatment of missing marker information and unknown linkage phase between markers for calculating the covariance between relatives was discussed.
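For orientation, the classical (marker-free) decomposition that this abstract reformulates can be written as below. This is the standard textbook form for noninbred relatives x and y with unlinked loci in linkage equilibrium, not a formula taken from the abstract. The symbols are assumptions introduced here: \theta_{xy} is the coefficient of kinship and \Delta_{xy} is the probability that both alleles at a locus are identical by descent.

```latex
\operatorname{Cov}(x, y)
  = 2\theta_{xy}\,\sigma_A^2
  + \Delta_{xy}\,\sigma_D^2
  + (2\theta_{xy})^2\,\sigma_{AA}^2
  + 2\theta_{xy}\,\Delta_{xy}\,\sigma_{AD}^2
  + \Delta_{xy}^2\,\sigma_{DD}^2
```

One plausible reading of the abstract is that, once markers are observed, each coefficient is replaced by a locus-specific conditional value, which would explain why additive by dominance and dominance by additive terms are listed separately for the two QTLs.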

2.
Conditional probability methods for haplotyping in pedigrees   Total citations: 3 (self-citations: 0, by others: 3)
Gao G, Hoeschele I, Sorensen P, Du F. Genetics 2004, 167(4): 2055-2065
Efficient haplotyping in pedigrees is important for the fine mapping of quantitative trait loci (QTL) or complex disease genes. To reconstruct haplotypes efficiently for a large pedigree with a large number of linked loci, two algorithms based on conditional probabilities and likelihood computations are presented. The first algorithm (the conditional probability method) produces a single, approximately optimal haplotype configuration, with computing time increasing linearly in the number of linked loci and the pedigree size. The other algorithm (the conditional enumeration method) identifies a set of haplotype configurations with high probabilities conditional on the observed genotype data for a pedigree. Its computing time increases less than exponentially with the size of a subset of the set of person-loci with unordered genotypes and linearly with its complement. The size of the subset is controlled by a threshold parameter. The set of identified haplotype configurations can be used to estimate the identity-by-descent (IBD) matrix at a map position for a pedigree. The algorithms have been tested on published and simulated data sets. The new haplotyping methods are much faster and provide more information than several existing stochastic and rule-based methods. The accuracies of the new methods are equivalent to or better than those of these existing methods.

3.
Druet T, Farnir FP. Genetics 2011, 188(2): 409-419
Identity-by-descent probabilities are important for many applications in genetics. Here we propose a method for modeling the transmission of the haplotypes from the closest genotyped relatives along an entire chromosome. The method relies on a hidden Markov model where hidden states correspond to the set of all possible origins of a haplotype within a given pedigree. Initial state probabilities are estimated from average genetic contribution of each origin to the modeled haplotype while transition probabilities are computed from recombination probabilities and pedigree relationships between the modeled haplotype and the various possible origins. The method was tested on three simulated scenarios based on real data sets from dairy cattle, Arabidopsis thaliana, and maize. The mean identity-by-descent probabilities estimated for the truly inherited parental chromosome ranged from 0.94 to 0.98 according to the design and the marker density. The lowest values were observed in regions close to crossing over or where the method was not able to discriminate between several origins due to their similarity. It is shown that the estimated probabilities were correctly calibrated. For marker imputation (or QTL allele prediction for fine mapping or genomic selection), the method was efficient, with 3.75% allelic imputation error rates on a dairy cattle data set with a low marker density map (1 SNP/Mb). The method should prove useful for situations we are facing now in experimental designs and in plant and animal breeding, where founders are genotyped with relatively high marker densities and last generation(s) genotyped with a lower-density panel.
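The hidden Markov model described in this abstract can be illustrated with a small forward-backward sketch. It is not the authors' implementation: the function name `origin_posteriors`, the biallelic 0/1 coding, the fixed mismatch rate `err`, and the simplified transition rule (after a recombination the haplotype jumps to an origin drawn from the initial probabilities) are all assumptions made for illustration. The abstract only states that transitions depend on recombination probabilities and pedigree relationships.

```python
def origin_posteriors(target, origins, init, recomb, err=0.01):
    """Posterior probability of each candidate origin at every marker.

    target  : alleles (0/1) of the modeled haplotype, one per marker
    origins : candidate origin haplotypes, each the same length as target
    init    : initial probability of each origin (sums to 1)
    recomb  : recombination probability between markers m and m+1
    err     : assumed probability that an allele mismatches its true origin
    """
    K, M = len(origins), len(target)

    def emit(k, m):
        # Probability of the observed allele if origin k is the true source.
        return 1.0 - err if origins[k][m] == target[m] else err

    def trans(i, j, m):
        # Stay on origin i without recombination, or jump to origin j
        # drawn from the initial probabilities (illustrative assumption).
        return (1.0 - recomb[m]) * (i == j) + recomb[m] * init[j]

    # Forward pass, rescaled at each marker to avoid underflow.
    prev = [init[k] * emit(k, 0) for k in range(K)]
    total = sum(prev)
    fwd = [[x / total for x in prev]]
    for m in range(1, M):
        cur = [emit(j, m) * sum(fwd[-1][i] * trans(i, j, m - 1) for i in range(K))
               for j in range(K)]
        total = sum(cur)
        fwd.append([x / total for x in cur])

    # Backward pass, also rescaled.
    bwd = [[1.0] * K for _ in range(M)]
    for m in range(M - 2, -1, -1):
        cur = [sum(trans(i, j, m) * emit(j, m + 1) * bwd[m + 1][j] for j in range(K))
               for i in range(K)]
        total = sum(cur)
        bwd[m] = [x / total for x in cur]

    # Combine and normalize per marker.
    post = []
    for m in range(M):
        w = [fwd[m][k] * bwd[m][k] for k in range(K)]
        total = sum(w)
        post.append([x / total for x in w])
    return post
```

Calling `origin_posteriors([0, 1, 0, 1], [[0, 1, 0, 1], [1, 0, 1, 0]], [0.5, 0.5], [0.01] * 3)` returns one row per marker, with almost all of the probability on the first origin because it matches the target at every marker.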

4.
The accurate estimation of the probability of identity by descent (IBD) at loci or genome positions of interest is paramount to the genetic study of quantitative and disease resistance traits. We present a Monte Carlo Markov Chain method to compute IBD probabilities between individuals conditional on DNA markers and on pedigree information. The IBDs can be obtained in a completely general pedigree at any genome position of interest, and all marker and pedigree information available is used. The method can be split into two steps at each iteration. First, phases are sampled using current genotypic configurations of relatives and second, crossover events are simulated conditional on phases. Internal track is kept of all founder origins and crossovers such that the IBD probabilities averaged over replicates are rapidly obtained. We illustrate the method with some examples. First, we show that all pedigree information should be used to obtain line origin probabilities in F2 crosses. Second, the distribution of genetic relationships between half and full sibs is analysed in both simulated data and in real data from an F2 cross in pigs.

5.
Pedigree data can be evaluated, and subsequently corrected, by analysis of the distribution of genetic markers, taking account of the possibility of mistyping. Using a model of pedigree error developed previously, we obtained the maximum likelihood estimates of error parameters in pedigree data from Tokelau. Posterior probabilities for the possible true relationships in each family, conditional on the putative relationships and the marker data, are calculated using the parameter estimates. These probabilities are used as a basis for discriminating between pedigree error and genetic marker errors in families where inconsistencies have been observed. When applied to the Tokelau data and compared with the results of retyping inconsistent families, these statistical procedures are able to discriminate between pedigree and marker error, with approximately 90% accuracy, for families with two or more offspring. The large proportion of inconsistencies inferred to be due to marker error (61%) indicates the importance of discriminating between error sources when judging the reliability of putative relationship data. Application of our model of pedigree error has proved to be an efficient way of determining and subsequently correcting sources of error in extensive pedigree data collected in large surveys.

6.
E A Thompson, R G Shaw. Biometrics 1990, 46(2): 399-413
Recent developments in the animal breeding literature facilitate estimation of the variance components in quantitative genetic models. However, computation remains intensive, and many of the procedures are restricted to specialized designs and models, unsuited to data arising from studies of natural populations. We develop algorithms that allow maximum likelihood estimation of variance components for data on arbitrary pedigree structures. The proposed methods can be implemented on microcomputers, since no intensive matrix computations or manipulations are involved. Although parts of our procedures have been previously presented, we unify these into an overall scheme whose intuitive justification clarifies the approach. Two examples are analyzed: one of data on a natural population of Salvia lyrata and the other of simulated data on an extended pedigree.

7.
Detection and Integration of Genotyping Errors in Statistical Genetics   Total citations: 15 (self-citations: 0, by others: 15)
Detection of genotyping errors and integration of such errors in statistical analysis are relatively neglected topics, given their importance in gene mapping. A few inopportunely placed errors, if ignored, can tremendously affect evidence for linkage. The present study takes a fresh look at the calculation of pedigree likelihoods in the presence of genotyping error. To accommodate genotyping error, we present extensions to the Lander-Green-Kruglyak deterministic algorithm for small pedigrees and to the Markov-chain Monte Carlo stochastic algorithm for large pedigrees. These extensions can accommodate a variety of error models and refrain from simplifying assumptions, such as allowing, at most, one error per pedigree. In principle, almost any statistical genetic analysis can be performed taking errors into account, without actually correcting or deleting suspect genotypes. Three examples illustrate the possibilities. These examples make use of the full pedigree data, multiple linked markers, and a prior error model. The first example is the estimation of genotyping error rates from pedigree data. The second, and currently most useful, example is the computation of posterior mistyping probabilities. These probabilities cover both Mendelian-consistent and Mendelian-inconsistent errors. The third example is the selection of the true pedigree structure connecting a group of people from among several competing pedigree structures. Paternity testing and twin zygosity testing are typical applications.

8.
Programs using parallel tasks can be represented by task graphs so that scheduling algorithms can be used to find an efficient execution order of the parallel tasks. This article proposes a flexible, component-based and extensible scheduling framework called SEParAT that supports the scheduling of a parallel program in multiple ways. The article describes the functionality and the software architecture of SEParAT. The flexible interfaces enable the cooperation with other programming tools, e.g., tools exploiting a specification of the parallel task structure of an application. The core component of SEParAT is an extensible scheduling algorithm library that provides an infrastructure to determine efficient schedules for task graphs. Homogeneous as well as heterogeneous platforms can be handled. The article also includes detailed experimental results comprising the evaluation of SEParAT as well as the evaluation of a variety of scheduling algorithms.

9.
Stephens and Donnelly have introduced a simple yet powerful importance sampling scheme for computing the likelihood in population genetic models. Fundamental to the method is an approximation to the conditional probability of the allelic type of an additional gene, given those currently in the sample. As noted by Li and Stephens, the product of these conditional probabilities for a sequence of draws that gives the frequency of allelic types in a sample is an approximation to the likelihood, and can be used directly in inference. The aim of this note is to demonstrate the high level of accuracy of "product of approximate conditionals" (PAC) likelihood when used with microsatellite data. Results obtained on simulated microsatellite data show that this strategy leads to a negligible bias over a wide range of the scaled mutation parameter theta. Furthermore, the sampling variance of likelihood estimates as well as the computation time are lower than those obtained with importance sampling on the whole range of theta. It follows that this approach represents an efficient substitute for IS algorithms in computer-intensive (e.g. MCMC) inference methods in population genetics.

10.
A novel gene selection algorithm based on the gene regulation probability is proposed. In this algorithm, a probabilistic model is established to estimate gene regulation probabilities using the maximum likelihood estimation method and then these probabilities are used to select key genes related to class distinction. The application on the leukemia data set suggests that the defined gene regulation probability can identify the key genes for the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) class distinction and that the result of our proposed algorithm is competitive with those of the previous algorithms.

11.
This paper describes a non-iterative, recursive method to compute the likelihood for a pedigree without loops, and hence an efficient way to compute genotype probabilities for every member of the pedigree. The method can be used with multiple mates and large sibships. Scaling is used in calculations to avoid numerical problems in working with large pedigrees.

12.
13.
In this study, historical phenotypic data from a potato breeding programme were used with an association mapping approach to identify alleles of candidate genes associated with cold‐induced sweetening of potato. Molecular marker analysis was used to determine allelic variation of candidate genes potentially involved in cold‐induced sweetening. Variations in the UDP‐glucose pyrophosphorylase (UGPase, EC 2.7.7.9) and apoplastic invertase genes (EC 3.2.1.26) were significantly associated with cold‐induced sweetening, and a possible interaction of apoplastic invertase and apoplastic invertase inhibitor was identified. This demonstrates that breeding programme phenotypic data collected over multiple years and environments can be used successfully with pedigree information for association mapping. It also confirms that the UGPase and apoplastic invertase markers are transferable across breeding programmes with distinct germplasm.

14.
Next-generation sequencing (NGS) provides a powerful tool for discovery of domestication genes in crop plants and their wild relatives. The accelerated domestication of new plant species as crops may be facilitated by this knowledge. Re-sequencing of domesticated genotypes can identify regions of low diversity associated with domestication. Species-specific data can be obtained from related wild species by whole-genome shotgun sequencing. These sequence data can be used to design species-specific polymerase chain reaction (PCR) primers. Sequencing of the products of PCR amplification of target genes can be used to explore genetic variation in large numbers of genes and gene families. Novel allelic variation in close or distant relatives can be characterized by NGS. Examples of recent applications of NGS to capture of genetic diversity for crop improvement include rice, sugarcane and eucalypts. Populations of large numbers of individuals can be screened rapidly. NGS supports the rapid domestication of new plant species and the efficient identification and capture of novel genetic variation from related species.

15.
In populations with a known pedigree, exact joint probability distributions of numbers of surviving genes from each founder can now be calculated for moderately large complex pedigrees (1,000–2,000 individuals and much inbreeding). The usefulness of such calculations is shown by our analysis of gene survival in the Asian wild horse (Equus przewalskii), a species now extinct in the wild with a captive population with 1,516 individuals in the known pedigree (12 generations). We calculate the genetic diversity of subsets of the current population interesting to the North American Species Survival Plan, trace the loss of genetic diversity in this species through its history in captivity, and determine genetically important individuals in the North American population: those with relatively high probabilities of having unique copy genes (genes not found in any other living individual in North America).

16.
A potential bias in estimation of inbreeding depression when using pedigree relationships to assess the degree of homozygosity for loci under selection is indicated. A comparison of inbreeding coefficients based on either pedigree or genotypic frequencies indicated that, as a result of selection, the inbreeding coefficient based on pedigree might not correspond with the random drift of allelic frequencies. Apparent differences in average levels of both inbreeding coefficients were obtained depending on the genetic model (additive versus dominance, initial allelic frequencies, heritability) and the selection system assumed (no versus mass selection). In the absence of selection, allelic frequencies within a small population change over generations due to random drift, and the pedigree-based inbreeding coefficient gives a proper assessment of the accompanying probability of increased homozygosity within a replicate by indicating the variance of allelic frequencies over replicates. With selection, in addition to random drift, directional change in allelic frequencies is not accounted for by the pedigree-based inbreeding coefficient. This result implies that estimation of inbreeding depression for traits under either direct or indirect selection, estimated by a regression of performance on pedigree-based coefficients, should be carefully interpreted.
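The pedigree-based inbreeding coefficient discussed in this abstract is the kinship between an individual's parents. A minimal sketch of the standard recursive kinship calculation follows. The function name `inbreeding_coefficients` and the pedigree layout (a dict mapping each individual to a sire and dam, with None for unknown parents and parents listed before their offspring) are assumptions made for illustration, not details from the paper.

```python
from functools import lru_cache


def inbreeding_coefficients(pedigree):
    """Pedigree-based inbreeding coefficient F for every individual.

    pedigree: dict mapping individual -> (sire, dam); use None for an
    unknown parent. Every individual, founders included, must be a key,
    and parents must be listed before their offspring.
    """
    # Position in the dict serves as a birth order: a later individual
    # can never be an ancestor of an earlier one.
    order = {ind: k for k, ind in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0  # unknown parents are treated as unrelated founders
        if a == b:
            sire, dam = pedigree[a]
            return 0.5 * (1.0 + kinship(sire, dam))
        if order[a] > order[b]:
            a, b = b, a  # always expand the younger individual
        sire, dam = pedigree[b]
        return 0.5 * (kinship(a, sire) + kinship(a, dam))

    # F of an individual equals the kinship between its parents.
    return {ind: kinship(*pedigree[ind]) for ind in pedigree}
```

For the offspring of a full-sib mating between two unrelated founders' children, this returns F = 0.25. The recursion is only practical for small pedigrees; large ones would need a tabular or sparse method.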

17.
The Bayesian method for estimating species phylogenies from molecular sequence data provides an attractive alternative to maximum likelihood with nonparametric bootstrap due to the easy interpretation of posterior probabilities for trees and to availability of efficient computational algorithms. However, for many data sets it produces extremely high posterior probabilities, sometimes for apparently incorrect clades. Here we use both computer simulation and empirical data analysis to examine the effect of the prior model for internal branch lengths. We found that posterior probabilities for trees and clades are sensitive to the prior for internal branch lengths, and priors assuming long internal branches cause high posterior probabilities for trees. In particular, uniform priors with high upper bounds bias Bayesian clade probabilities in favor of extreme values. We discuss possible remedies to the problem, including empirical and full Bayesian methods and subjective procedures suggested in Bayesian hypothesis testing. Our results also suggest that the bootstrap proportion and Bayesian posterior probability are different measures of accuracy, and that the bootstrap proportion, if interpreted as the probability that the clade is true, can be either too liberal or too conservative.

18.
Gene content is the number of copies of a particular allele in a genotype of an animal. Gene content can be used to study additive gene action of a candidate gene. Usually genotype data are available only for a part of the population, and for the rest gene contents have to be calculated based on typed relatives. Methods to calculate expected gene content for animals on large complex pedigrees are relatively complex. In this paper we propose a practical method to calculate gene content using a linear regression. The method does not estimate genotype probabilities but these can be approximated from gene content assuming Hardy-Weinberg proportions. The approach was compared with other methods on multiple simulated data sets for real bovine pedigrees of 1 082 and 907 903 animals. Different allelic frequencies (0.4 and 0.2) and proportions of the missing genotypes (90, 70, and 50%) were considered in simulation. The simulation showed that the proposed method has similar capability to predict gene content as the iterative peeling method; however, it requires less time and can be more practical for large pedigrees. The method was also applied to real data on the bovine myostatin locus on a large dual-purpose Belgian Blue pedigree of 235 133 animals. It was demonstrated that the proposed method can be easily adapted for particular pedigrees.
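The abstract notes that genotype probabilities can be approximated from gene content under Hardy-Weinberg proportions but does not give the formula. One natural reading, sketched below as an assumption, treats half of the predicted gene content as an individual-specific allele probability p and then uses p², 2p(1-p) and (1-p)². The function name and genotype labels are hypothetical.

```python
def genotype_probs_from_gene_content(gene_content):
    """Approximate genotype probabilities from an expected gene content.

    gene_content: expected number of copies of allele A, between 0 and 2.
    Returns probabilities for genotypes AA, Aa and aa, assuming
    Hardy-Weinberg proportions with p = gene_content / 2 (an assumption
    made for this sketch, not a formula quoted from the paper).
    """
    if not 0.0 <= gene_content <= 2.0:
        raise ValueError("gene content must lie between 0 and 2")
    p = gene_content / 2.0
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}
```

A useful property of this choice is that the implied expected gene content, 2·P(AA) + P(Aa), equals the input, so the approximation stays consistent with the regression prediction it starts from.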

19.
An algorithm for automatic genotype elimination.   Total citations: 13 (self-citations: 4, by others: 9)
Automatic genotype elimination algorithms for a single locus play a central role in making likelihood computations on human pedigree data feasible. We present a simple algorithm that is fully efficient in pedigrees without loops. This algorithm can be easily coded and has been instrumental in greatly reducing computing times for pedigree analysis. A contrived counter-example demonstrates that some superfluous genotypes cannot be excluded for inbred pedigrees.
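The abstract does not spell out the algorithm, so the following is only a sketch of the general idea of single-locus genotype elimination on nuclear families: keep a parental genotype pair only if every child retains at least one compatible genotype, keep only the child genotypes compatible with some surviving pair, and repeat until nothing changes. The function names and data layout (families as father, mother, children tuples; genotypes as sorted allele tuples) are assumptions made for illustration.

```python
from itertools import product


def offspring_genotypes(father_geno, mother_geno):
    """All unordered genotypes a child can inherit from the two parents."""
    return {tuple(sorted((a, b))) for a in father_geno for b in mother_geno}


def eliminate_genotypes(families, candidates):
    """Iteratively remove genotypes that are incompatible with relatives.

    families  : list of (father, mother, [children]) nuclear families
    candidates: dict person -> set of candidate genotypes, each a sorted
                tuple of two alleles (untyped people list every genotype)
    """
    candidates = {person: set(genos) for person, genos in candidates.items()}
    changed = True
    while changed:
        changed = False
        for father, mother, children in families:
            keep = {person: set() for person in (father, mother, *children)}
            for gf, gm in product(candidates[father], candidates[mother]):
                possible = offspring_genotypes(gf, gm)
                compatible = [candidates[child] & possible for child in children]
                # The parental pair survives only if no child is left empty.
                if all(compatible):
                    keep[father].add(gf)
                    keep[mother].add(gm)
                    for child, genos in zip(children, compatible):
                        keep[child] |= genos
            for person, genos in keep.items():
                if genos != candidates[person]:
                    candidates[person] = genos
                    changed = True
    return candidates
```

With a mother typed 1/1, two children typed 1/2 and 1/3, and an untyped father who starts with all six genotypes over alleles 1, 2 and 3, the only genotype left for the father is 2/3.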

20.
Important methods for calculating likelihoods of genealogical relationships and mapping genes are based on hidden Markov models for the process of identity by descent along chromosomes. The computational time for the algorithms depends critically on the size of the state space of the hidden Markov model. We describe the maximal grouping together of states of the model to reduce the size of the state space. This grouping is based on pedigree symmetries. We also present an efficient algorithm for finding the maximal grouping.
