Similar documents
 Found 20 similar documents (search time: 15 ms)
1.
Summary: An Expectation-Maximization (EM) algorithm procedure is presented that extends the method of Cheliak et al. (1983) for maximum-likelihood estimation of the parameters of mixed mating system models. The extension permits estimation of the rate of self-fertilization (s) and of allele frequencies (Pi) in outcrossing pollen at marker loci having recessive null alleles. The algorithm makes use of maternal and filial genotypic arrays obtained by the electrophoretic analysis of cohorts of progeny. The genotypes of maternal plants must be known. Explicit equations are given for cases when the genotype of the maternal gamete inherited by a seed can (gymnosperms) or cannot (angiosperms) be determined. The procedure can accommodate any number of codominant alleles, but only one recessive null allele at each locus. An example, using actual data from Pinus banksiana, is presented to illustrate the application of this EM algorithm to the estimation of mating system parameters using marker loci having both codominant and recessive alleles. Issued as AECL-8745.

2.
Summary: To maximize parameter estimation efficiency and statistical power and to estimate epistasis, the parameters of multiple quantitative trait loci (QTLs) must be simultaneously estimated. If multiple QTLs affect a trait, then estimates of means of QTL genotypes from individual locus models are statistically biased. In this paper, I describe methods for estimating means of QTL genotypes and recombination frequencies between marker and quantitative trait loci using multilocus backcross, doubled haploid, recombinant inbred, and testcross progeny models. Expected values of marker genotype means were defined using no double or multiple crossover frequencies and flanking markers for linked and unlinked quantitative trait loci. The expected values for a particular model comprise a system of nonlinear equations that can be solved using an iterative algorithm, e.g., the Gauss-Newton algorithm. The solutions are maximum likelihood estimates when the errors are normally distributed. A linear model for estimating the parameters of unlinked quantitative trait loci was found by transforming the nonlinear model. Recombination frequency estimators were defined using this linear model. Certain means of linked QTLs are less efficiently estimated than means of unlinked QTLs.

3.
Linkage analysis is commonly used to find marker-trait associations within the full-sib families of forest tree and other species. Study of marker-trait associations at the population level is termed linkage-disequilibrium (LD) mapping. A female-tester design comprising 200 full-sib families generated by crossing 40 pollen parents with five female parents was used to assess the relationship between the marker-allele frequency classes obtained from parental genotypes at SSR marker loci and the full-sib family performance (average predicted breeding value of two parents) in radiata pine (Pinus radiata D. Don). For alleles (at a marker locus) that showed significant association, the copy number of that allele in the parents was significantly correlated, either positively or negatively, with the full-sib family performance for various economic traits. Regression of parental breeding value on its genotype at marker loci revealed that most of the markers that showed significant association with full-sib family performance were not significantly associated with the parental breeding values. This suggests that over-representation of the female parents in our sample of 200 full-sib families could have biased the process of detecting marker-trait associations. The evidence for the existence of marker-trait LD in the population studied is rather weak and would require further testing. The exact test for genotypic disequilibrium between pairs of linked or unlinked marker loci revealed non-significant LD. Observed genotypic frequencies at several marker loci were significantly different from the expected Hardy-Weinberg equilibrium. The possibilities of utilising marker-trait associations for early selection, among-family selection and selecting parents for the next generation of breeding are also discussed.

4.
Autopolyploid taxa present numerous challenges for population genetic analyses due to difficulties determining allele dosage. Dosage ambiguity hinders accurate assessment of allele frequencies, multilocus genotypes (MLGTs), as well as levels and patterns of clonality. The pervasiveness of polyploidy in the evolutionary history of plant taxa makes this a recurring problem. Whereas diploidization of loci may occur over time, duplication of at least some loci is still frequently evident. Fortunately, with high-quality allozyme gels, it is possible to accurately infer allele dosage and, thus, determine exact MLGTs. However, accurately assessing dosage of microsatellite peaks is nearly impossible when studying wild populations with a large number of alleles per locus. Even if precise knowledge of genotypes is not required, for comparable numbers of alleles per locus and loci, the number of "phenotypes" is always lower with microsatellites than with allozymes due to the inability to assess allele dosage. Microsatellite loci typically have more alleles per locus relative to allozymes, although fewer loci are generally employed. Here, we present a mathematical model for comparing the relative utility of simple sequence repeat (SSR) versus allozyme markers to discriminate MLGTs. For example, the average plant allozyme study (2.6 alleles per locus, 10 polymorphic loci) has better discriminating power than SSR markers with 10 alleles at each of 3 loci, 9 alleles at 4 loci, 6 alleles at 5 loci, 5 alleles at 6 loci, and 4 alleles at 8 loci, demonstrating the value of assessing the relative discriminating power of these markers.

5.
Sample size considerations in genetic polymorphism studies.   Total citations: 6, self-citations: 0, citations by others: 6
C B-Rao 《Human heredity》2001,52(4):191-200
OBJECTIVES: Molecular studies for genetic polymorphisms are being carried out for a number of different applications, such as genetic disorders in different populations, pharmacogenomics, genetic identification of ethnic groups for forensic and legal applications, genetic identification of breed/stock in animals and plants for commercial applications and conservation of germ plasm. In this paper, for a random sampling scheme, we address two questions: (A) What should be the minimum size of the sample so that, with a prespecified probability, all alleles at a given locus (or haplotypes at a given set of loci) are detected? (B) What should be the sample size so that the allele frequency distribution at a given locus (or haplotype frequency distribution at a given set of loci) is estimated reliably within permissible error limits? METHODS: We have used combinatorial probabilistic arguments and Monte Carlo simulations to answer these questions. RESULTS: We found that the minimum sample size required in case A depends mainly on the prespecified probability of detecting all alleles, while in case B, it varies greatly depending on the permissible error in estimation (which will vary with the application). We have obtained the minimum sample sizes for different degrees of polymorphism at a locus under high stringency, as well as a relaxed level of permissible error. We present a detailed sampling procedure for estimating allele frequencies at a given locus, which will be of use in practical applications. CONCLUSION: Since the sample size required for reliable estimation of allele frequency distribution increases with the number of alleles at the locus, there is a strong case for using biallelic markers (like single nucleotide polymorphisms) when the available sample size is about 800 or less.
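Question (A) in the abstract of entry 5 can be illustrated with a small calculation. The sketch below is not the author's procedure. It assumes a simple model that the abstract does not spell out: allele frequencies are known, and the sample consists of independently drawn gene copies (two per diploid individual). The function names and the example frequencies are hypothetical.

```python
from itertools import combinations


def prob_all_alleles_detected(freqs, n_individuals, ploidy=2):
    """Probability that every allele appears at least once in the sample.

    Assumption: gene copies are independent draws from `freqs`.
    Computed by inclusion-exclusion over the sets of alleles that
    could be absent from the sample (cost grows as 2**len(freqs)).
    """
    copies = ploidy * n_individuals
    k = len(freqs)
    total = 0.0
    for m in range(k + 1):
        for missing in combinations(range(k), m):
            p_missing = sum(freqs[i] for i in missing)
            total += (-1) ** m * (1.0 - p_missing) ** copies
    return total


def min_sample_size(freqs, target=0.95, ploidy=2, n_max=100000):
    """Smallest number of individuals reaching the target detection probability."""
    for n in range(1, n_max + 1):
        if prob_all_alleles_detected(freqs, n, ploidy) >= target:
            return n
    raise ValueError("target not reached within n_max individuals")
```

For example, `min_sample_size([0.5, 0.3, 0.15, 0.05], target=0.99)` searches for the smallest sample under these hypothetical frequencies. Under this model the rarest allele dominates the answer, which fits the abstract's remark that the required size depends mainly on the prespecified detection probability.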

6.
S. Xu  W. R. Atchley 《Genetics》1995,141(3):1189-1197
Mapping quantitative trait loci in outbred populations is important because many populations of organisms are noninbred. Unfortunately, information about the genetic architecture of the trait may not be available in outbred populations. Thus, the allelic effects of genes cannot be estimated with ease. In addition, under linkage equilibrium, marker genotypes provide no information about the genotype of a QTL (our terminology for a single quantitative trait locus is QTL while multiple loci are referred to as QTLs). To circumvent this problem, an interval mapping procedure based on a random model approach is described. Under a random model, instead of estimating the effects, segregating variances of QTLs are estimated by a maximum likelihood method. Estimation of the variance component of a QTL depends on the proportion of genes identical-by-descent (IBD) shared by relatives at the locus, which is predicted by the IBD of two markers flanking the QTL. The marker IBD shared by two relatives is inferred from the observed marker genotypes. The procedure offers an advantage over the regression interval mapping in terms of high power and small estimation errors and provides flexibility for large sibships, irregular pedigree relationships and incorporation of common environmental and fixed effects.

7.
Using exact expected likelihoods, we have computed the average number of phase-unknown nuclear families needed to detect linkage and heterogeneity. We have examined the case of both dominant and recessive inheritance with reduced penetrance and phenocopies. Most of our calculations have been carried out under the assumption that 50% of families are linked to a marker locus. We have varied both the number of offspring per family and the sampling scheme. We have also investigated the increased power when the disease locus is midway between two marker loci 10 cM apart. For recessive inheritance, both linkage and heterogeneity can be detected in clinically feasible sample sizes. For dominant inheritance, linkage can be detected but heterogeneity cannot be detected unless larger sibships (four offspring) are sampled or two linked markers are available. As expected, if penetrance is reduced, sampling families with all sibs affected is most efficient. Our results provide a basis for estimating the amount of resources needed to find genes for complex disorders under conditions of heterogeneity.

8.
Deterministic paternity exclusion using RAPD markers   Total citations: 5, self-citations: 0, citations by others: 5
The Random Amplified Polymorphic DNA (RAPD) technique can potentially provide hundreds of polymorphic markers for use by ecologists studying mating systems in natural populations. We consider here the implications of the dominance displayed by RAPD markers for deterministic paternity assignment. Our goal was to provide a means for assessing the costs associated with such a study for ecologists who might be considering the use of RAPD markers for paternity analysis. The theoretical expected proportion of offspring for which all males except the true father can be excluded (P(ET)) is calculated for both dominant and codominant marker systems. The ability to assign paternity unambiguously generally increases with the number of loci and the frequency of the recessive allele (but only up to a point), and decreases with increasing sample size (number of individuals surveyed). The gain in P(ET) with decreasing sample size is unexpectedly slight. Not surprisingly, the performance of dominant markers at paternity exclusion is, in general, greatly exceeded by codominant markers, with the exception of the case in which the frequency of the recessive allele is high at all loci. In this case, codominant markers perform only slightly better than do dominant markers. Thus, a researcher should expect to score more than 50 RAPD loci for each offspring for most applications of paternity exclusion analysis. (ABSTRACT TRUNCATED AT 250 WORDS)

9.
Sibship reconstruction from genetic data with typing errors   Total citations: 13, self-citations: 0, citations by others: 13
Wang J 《Genetics》2004,166(4):1963-1979
Likelihood methods have been developed to partition individuals in a sample into full-sib and half-sib families using genetic marker data without parental information. They invariably make the critical assumption that marker data are free of genotyping errors and mutations and are thus completely reliable in inferring sibships. Unfortunately, however, this assumption is rarely tenable for virtually all kinds of genetic markers in practical use and, if violated, can severely bias sibship estimates as shown by simulations in this article. I propose a new likelihood method with simple and robust models of typing error incorporated into it. Simulations show that the new method can be used to infer full- and half-sibships accurately from marker data with a high error rate and to identify typing errors at each locus in each reconstructed sib family. The new method also improves previous ones by adopting a fresh iterative procedure for updating allele frequencies with reconstructed sibships taken into account, by allowing for the use of parental information, and by using efficient algorithms for calculating the likelihood function and searching for the maximum-likelihood configuration. It is tested extensively on simulated data with a varying number of marker loci, different rates of typing errors, and various sample sizes and family structures and applied to two empirical data sets to demonstrate its usefulness.

10.
Y X Fu  R Chakraborty 《Genetics》1998,150(1):487-497
Minisatellites and microsatellites are short tandemly repetitive sequences dispersed in eukaryotic genomes, many of which are highly polymorphic due to copy number variation of the repeats. Because mutation changes copy numbers of the repeat sequences in a generalized stepwise fashion, stepwise mutation models are widely used for studying the dynamics of these loci. We propose a minimum chi-square (MCS) method for simultaneous estimation of all the parameters in a stepwise mutation model and the ancestral allelic type of a sample. The MCS estimator requires knowing the mean number of alleles of a certain size in a sample, which can be estimated using Monte Carlo samples generated by a coalescent algorithm. The method is applied to samples of seven (CA)n repeat loci from eight human populations and one chimpanzee population. The estimated values of parameters suggest that there is a general tendency for microsatellite alleles to expand in size, because (1) each mutation has a slight tendency to cause size increase and (2) the mean size increase is larger than the mean size decrease for a mutation. Our estimates also suggest that most of these CA-repeat loci evolve according to multistep mutation models rather than single-step mutation models. We also introduced several quantities for measuring the quality of the estimation of ancestral allelic type, and it appears that the majority of the estimated ancestral allelic types are reasonably accurate. Implications of our analysis and potential extensions of the method are discussed.
Since the discovery that a large number of loci with tandemly repeated sequences in human and many eukaryote species are highly polymorphic because of copy number variation of the repeats in different individuals (Jeffreys 1985; Litt and Luty 1989; Weber and May 1989), allele size data from such loci are rapidly becoming the dominant source of genetic markers for genome mapping, forensic testing, and population studies.
Loci with repeat sequences longer than 5 bp are generally referred to as minisatellite or variable number tandem repeat loci, and those with repeat sequences between 2 and 5 bp are referred to as microsatellite or short tandem repeat loci (Tautz 1993). Because mutations change the copy number of such loci in a stepwise fashion, rapid accumulation of population samples from minisatellite and microsatellite loci has revived interest in the stepwise mutation model (SMM), which was popular in the 1970s.

11.
A PCR-based codominant marker has been developed which is tightly linked to Mi, a dominant genetic locus in tomato that confers resistance to several species of root-knot nematode. DNA from tomato lines differing in nematode resistance was screened for random amplified polymorphic DNA markers linked to Mi using decamer primers. Several markers were identified. One amplified product, REX-1, obtained using a pair of decamer primers, was present as a dominant marker in all nematode-resistant tomato lines tested. REX-1 was cloned and the DNA sequences of its ends were determined and used to develop 20-mer primers. PCR amplification with the 20-mer primers produced a single amplified band in both susceptible and resistant tomato lines. The amplified bands from susceptible and resistant lines were distinguishable after cleavage with the restriction enzyme Taq I. The linkage of REX-1 to Mi was verified in an F2 population. This marker is more tightly linked to Mi than is Aps-1, the currently-used isozyme marker, and allows screening of germplasm where the linkage between Mi and Aps-1 has been lost. Homozygous and heterozygous individuals can be distinguished and the procedure can be used for rapid, routine screening. The strategy used to obtain REX-1 is applicable to obtaining tightly-linked markers to other genetic loci. Such markers would allow rapid, concurrent screening for the segregation of several loci of interest.

12.
An estimator for pairwise relatedness using molecular markers   Total citations: 21, self-citations: 0, citations by others: 21
Wang J 《Genetics》2002,160(3):1203-1215
I propose a new estimator for jointly estimating two-gene and four-gene coefficients of relatedness between individuals from an outbreeding population with data on codominant genetic markers and compare it, by Monte Carlo simulations, to previous ones in precision and accuracy for different distributions of population allele frequencies, numbers of alleles per locus, actual relationships, sample sizes, and proportions of relatives included in samples. In contrast to several previous estimators, the new estimator is well behaved and applies to any number of alleles per locus and any allele frequency distribution. The estimates for two- and four-gene coefficients of relatedness from the new estimator are unbiased irrespective of the sample size and have sampling variances decreasing consistently with an increasing number of alleles per locus to the minimum asymptotic values determined by the variation in identity-by-descent among loci per se, regardless of the actual relationship. The new estimator is also robust for small sample sizes and for unknown relatives being included in samples for estimating allele frequencies. Compared to previous estimators, the new one is generally advantageous, especially for highly polymorphic loci and/or small sample sizes.

13.
Sugar beet (Beta vulgaris L.) is a biennial species. Shoot elongation (bolting) starts after a period of low temperature. The dominant allele of locus B causes early bolting without cold treatment. This allele is abundant in wild beets whereas cultivated beets carry the recessive allele. Fifteen AFLP markers, tightly linked to the bolting locus, have been identified using bulked segregant analysis. The F2-population consisted of 2,134 individuals derived after selfing a single F1-plant (Bb). In a first step, a linkage map was established with 249 markers based on 775 F2-individuals with a coverage of 822.3 cM. The loci are dispersed over nine linkage groups corresponding to the haploid chromosome number of Beta species. Seventeen marker loci were placed at a distance less than 3.2 cM around the bolting gene. In a second step, four of those markers most closely linked to B were mapped with the entire F2-population. Two of the markers were mapped flanking the B gene at distances of 0.14 and 0.23 cM. The other two markers were mapped at a distance of 0.5 cM from the gene. The tight linkage could be verified by testing 88 unrelated plants from a breeding program. The closely linked markers will enable breeders to select for the non-bolting character without laborious test crossings. Moreover, these markers are being used for map-based cloning of the bolting gene.

14.
Summary: It is shown that when an exotic strain and a commercial strain differ genetically at a quantitative locus and at an adjoining marker locus, repeated backcrosses to the commercial strain, retaining only backcross progeny carrying the exotic marker allele, will allow the effective introgression of the linked quantitative allele from the exotic to the commercial strain. The introgression procedure will be particularly effective when exotic and commercial strains differ at two nearby marker loci with the quantitative locus bracketed between them. The simultaneous introgression of a number of quantitative alleles from different exotic strains, and appropriate selection procedures in the intercross generations that follow are also considered.
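The advantage of bracketing the quantitative locus between two markers, as described in entry 14, can be made concrete with a rough calculation. This sketch is not taken from the paper. It assumes no crossover interference, selection only on the exotic marker allele(s) in each backcross generation, and hypothetical recombination fractions; the function names are illustrative.

```python
def retain_single_marker(r, generations):
    """Probability that the exotic quantitative allele is still carried
    after `generations` backcrosses, selecting on one marker at
    recombination fraction r from the quantitative locus."""
    return (1.0 - r) ** generations


def retain_flanking_markers(r1, r2, generations):
    """Same probability when both flanking exotic marker alleles are
    required in every generation. r1 and r2 are the recombination
    fractions between the quantitative locus and each marker.

    Assumption: crossovers in the two intervals are independent, so the
    quantitative allele is lost only through a double crossover.
    """
    no_crossover = (1.0 - r1) * (1.0 - r2)
    double_crossover = r1 * r2
    per_generation = no_crossover / (no_crossover + double_crossover)
    return per_generation ** generations
```

With r = 0.1 on each side, three generations of single-marker selection retain the allele with probability 0.9 ** 3 = 0.729, whereas `retain_flanking_markers(0.1, 0.1, 3)` stays close to 1 because only double crossovers separate the allele from both markers.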

15.
Swartz MD  Kimmel M  Mueller P  Amos CI 《Biometrics》2006,62(2):495-503
Mapping the genes for a complex disease, such as diabetes or rheumatoid arthritis (RA), involves finding multiple genetic loci that may contribute to the onset of the disease. Pairwise testing of the loci leads to the problem of multiple testing. Looking at haplotypes, or linear sets of loci, avoids multiple tests but results in a contingency table with sparse counts, especially when using marker loci with multiple alleles. We propose a hierarchical Bayesian model for case-parent triad data that uses a conditional logistic regression likelihood to model the probability of transmission to a diseased child. We define hierarchical prior distributions on the allele main effects to model the genetic dependencies present in the human leukocyte antigen (HLA) region of chromosome 6. First, we add a hierarchical level for model selection that accounts for both locus and allele selection. This allows us to cast the problem of identifying genetic loci relevant to the disease into a problem of Bayesian variable selection. Second, we attempt to include linkage disequilibrium as a covariance structure in the prior for model coefficients. We evaluate the performance of the procedure with some simulated examples and then apply our procedure to identifying genetic markers in the HLA region that influence risk for RA. Our software is available on the website http://www.epigenetic.org/Linkage/ssgs-public/.

16.
Historically, most methods for detecting linkage disequilibrium were designed for use with diallelic marker loci, for which the analysis is straightforward. With the advent of polymorphic markers with many alleles, the normal approach to their analysis has been either to extend the methodology for two-allele systems (leading to an increase in df and to a corresponding loss of power) or to select the allele believed to be associated and then collapse the other alleles, reducing, in a biased way, the locus to a diallelic system. I propose a likelihood-based approach to testing for linkage disequilibrium, an approach that becomes more conservative as the number of alleles increases, and as the number of markers considered jointly increases in a multipoint test for linkage disequilibrium, while maintaining high power. Properties of this method for detecting associations and fine mapping the location of disease traits are investigated. It is found to be, in general, more powerful than conventional methods, and it provides a tractable framework for the fine mapping of new disease loci. Application to the cystic fibrosis data of Kerem et al. is included to illustrate the method.

17.
For a linked marker locus to be useful for genetic counseling, the counselee must be heterozygous for both disease and marker loci and his or her linkage phase must be known. It is shown that when the phenotypes of the counselee's previous children for the disease and marker loci are known, the linkage phase can often be inferred with a high probability, and thus it is possible to conduct genetic counseling. To evaluate the utility of linked marker genes for genetic counseling, the accuracy of prediction of the risk for a prospective child with a given marker gene to develop the genetic disease and the proportion of families in which a particular marker locus can be used for genetic counseling are studied for X-linked recessive, autosomal dominant, and autosomal recessive diseases. In the case of X-linked genetic diseases, information from children is very useful for determining the linkage phase of the counselee and predicting the genetic disease. In the case of autosomal dominant diseases, not all children are informative, but if the number of children is large, the phenotypes of children are often more informative than the information from grandparents. In the case of autosomal recessive diseases, information from grandparents is usually useless, since they show a normal phenotype for the disease locus. If we use information on the phenotypes of children, however, the linkage phase of the counselee and the risk of a prospective child can be inferred with a high probability. The proportion of informative families depends on the dominance relationship and frequencies of marker alleles, and the number of children. In general, codominant markers are more useful than are dominant markers, and a locus with high heterozygosity is more useful than is a locus with low heterozygosity.
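The idea in entry 17 of inferring linkage phase from previous children can be sketched as a simple Bayesian update. This is an illustration under assumptions the abstract does not state: every counted child is fully informative, the recombination fraction r is known, and the two phases are equally likely a priori unless a different prior is supplied. The function and parameter names are hypothetical.

```python
def phase_posterior(k_phase1_type, n_children, r, prior_phase1=0.5):
    """Posterior probability of phase 1 in a doubly heterozygous parent.

    k_phase1_type: number of informative children whose transmitted
        gamete is non-recombinant under phase 1 (and therefore
        recombinant under phase 2).
    n_children: total number of informative children.
    r: recombination fraction between disease and marker loci.
    """
    if not 0.0 < r <= 0.5:
        raise ValueError("r must satisfy 0 < r <= 0.5")
    if not 0 <= k_phase1_type <= n_children:
        raise ValueError("k_phase1_type must lie between 0 and n_children")
    other = n_children - k_phase1_type
    like_phase1 = (1.0 - r) ** k_phase1_type * r ** other
    like_phase2 = r ** k_phase1_type * (1.0 - r) ** other
    numerator = prior_phase1 * like_phase1
    return numerator / (numerator + (1.0 - prior_phase1) * like_phase2)
```

With r = 0.1, three concordant children already give a posterior of 0.729 / 0.730 for phase 1, which illustrates why a few informative children can settle the phase when linkage is tight. When the children split evenly between the two types, the posterior returns to the prior.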

18.
Tan YD  Fu YX 《Genetics》2007,175(2):923-931
Although most high-density linkage maps have been constructed from codominant markers such as single-nucleotide polymorphisms (SNPs) and microsatellites due to their high linkage information, dominant markers can be expected to be even more significant as proteomic techniques become widely applicable to generate protein polymorphism data from large samples. However, for dominant markers, two possible linkage phases between a pair of markers complicate the estimation of recombination fractions between markers and consequently the construction of linkage maps. The low linkage information of the repulsion phase and high linkage information of the coupling phase have led geneticists to construct two separate but related linkage maps. To circumvent this problem, we proposed a new method for estimating the recombination fraction between markers, which greatly improves the accuracy of estimation through distinction between the coupling phase and the repulsion phase of the linked loci. The results obtained from both real and simulated F2 dominant marker data indicate that the recombination fractions estimated by the new method contain a large amount of linkage information for constructing a complete linkage map. In addition, the new method is also applicable to data with mixed types of markers (dominant and codominant) with unknown linkage phase.

19.
Slatkin M  Bertorelle G 《Genetics》2001,158(2):865-874
To better understand the forces affecting individual alleles, we introduce a method for finding the joint distribution of the frequency of a neutral allele and the extent of variability at closely linked marker loci (the intraallelic variability). We model three types of intraallelic variability: (a) the number of nonrecombinants at a linked biallelic marker locus, (b) the length of a conserved haplotype, and (c) the number of mutations at a linked marker locus. If the population growth rate is known, the joint distribution provides the basis for a test of neutrality by testing whether the observed level of intraallelic variability is consistent with the observed allele frequency. If the population growth rate is unknown but neutrality can be assumed, the joint distribution provides the likelihood of the growth rate and leads to a maximum-likelihood estimate. We apply the method to data from published data sets for four loci in humans. We conclude that the Delta32 allele at CCR5 and a disease-associated allele at MLH1 arose recently and have been subject to strong selection. Alleles at PAH appear to be neutral and we estimate the recent growth rate of the European population to be approximately 0.027 per generation with a support interval of (0.017-0.037). Four of the relatively common alleles at CFTR also appear to be neutral but DeltaF508 appears to be significantly advantageous to heterozygous carriers.

20.
Salicornia spp. are among the most salt-tolerant vascular plants and are native to salt marshes and estuaries. We developed expressed sequence tag derived-simple sequence repeat (EST-SSR) markers for estimating genetic diversity and marker-assisted Salicornia breeding. Six polymorphic EST-SSRs of 40 detected 27 alleles, ranging from three to five alleles per locus. The average number of alleles per locus was 4.33 and 4.17, and the major allele frequency at locus DY529765 was high, being 0.859 and 0.857 in S. bigelovii and S. europaea, respectively. Gene diversity, heterozygosity and polymorphism information content were highest at locus DY529950 and similar in these two species. Gene diversity increased with increase in the number of alleles that had a low major allele frequency at a locus. Six polymorphic loci effectively discriminated 46 taxa into three clusters via different analyses. Significant deviation of F(ST) from zero in three suggested populations for six loci indicated population differentiation and limited gene flow among them. A reduced median network established that taxon SB65 is primitive. SMART (simple modular architecture research tool) analysis of peptide sequences of six EST-SSRs showed that loci DY529765, DY529950 and EC906203 contained transmembrane, TLC, AgrB and NTR domains and might be involved in salinity stress tolerance. These EST-SSRs are a valuable resource for marker development and may be useful in marker-assisted Salicornia breeding.
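The summary statistics named in entry 20 (major allele frequency, gene diversity, polymorphism information content) follow from allele frequencies by standard formulas. The sketch below uses the usual textbook definitions without any sample-size correction, which may differ from what the software used in the study applies. The allele frequencies in the usage note are hypothetical, chosen only so that the major allele frequency matches the 0.859 reported for locus DY529765.

```python
def major_allele_frequency(freqs):
    """Frequency of the most common allele at a locus."""
    return max(freqs)


def gene_diversity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i < j of 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    sum_sq = sum(squares)
    # (sum p_i^2)^2 - sum p_i^4 equals 2 * sum over i < j of p_i^2 * p_j^2
    cross_term = sum_sq * sum_sq - sum(s * s for s in squares)
    return 1.0 - sum_sq - cross_term
```

Calling these functions on a skewed locus such as `[0.859, 0.100, 0.041]` and on an even three-allele locus such as `[0.4, 0.3, 0.3]` shows the pattern the abstract describes: with the same number of alleles, the locus with the lower major allele frequency has the higher gene diversity and PIC.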
