Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
GENEHUNTER and SimWalk2 are among the most commonly used programs for parametric multipoint linkage analysis. In the analysis of extended kindreds, GENEHUNTER is limited in the number of individuals it can handle; one solution is to manually split the kindred into smaller pedigrees. SimWalk2 can handle a much larger number of individuals, but its major drawback is the time it takes to process the data compared with GENEHUNTER. Aside from the limitations of each program, researchers studying extended kindreds are typically confronted with missing data. In this work we used simulated genotype data based on the structure of a real extended pedigree to compare the results obtained with GENEHUNTER and SimWalk2, to evaluate the effect of discarding individuals and splitting the kindred on the logarithm of odds (lod) score, and to assess how missing data affect the performance of each program. Our results show that (1) for pedigrees of moderate size, GENEHUNTER and SimWalk2 produce nearly the same results; (2) when using GENEHUNTER, either splitting the kindred into smaller sub-pedigrees or discarding individuals has an adverse effect compared with the results obtained using SimWalk2 with the whole pedigree; and (3) the performance of both programs is qualitatively similar in the missing data scenario. These conclusions are based on the sample distributions of the lod score values and of the estimates of the recombination fraction.
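
The lod score and the recombination fraction estimate mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest textbook setting (two-point linkage with phase-known, fully informative meioses), which is far simpler than the multipoint calculations GENEHUNTER and SimWalk2 perform; the function names are hypothetical and are not part of either program.

```python
import math

def lod_score(recombinants, meioses, theta):
    """Two-point lod score for phase-known meioses (illustrative only).

    Z(theta) = log10[ theta^k * (1 - theta)^(n - k) / 0.5^n ],
    where k is the number of recombinant meioses out of n.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    k, n = recombinants, meioses

    def log_term(count, prob):
        # Treat 0 * log10(0) as 0 so that theta = 0 with k = 0 is defined.
        if count == 0:
            return 0.0
        if prob == 0.0:
            return -math.inf
        return count * math.log10(prob)

    log_l_linked = log_term(k, theta) + log_term(n - k, 1.0 - theta)
    log_l_unlinked = n * math.log10(0.5)  # likelihood under free recombination
    return log_l_linked - log_l_unlinked

def max_lod(recombinants, meioses):
    """Return (theta_hat, lod at theta_hat), with theta_hat capped at 0.5."""
    theta_hat = min(recombinants / meioses, 0.5)
    return theta_hat, lod_score(recombinants, meioses, theta_hat)

theta_hat, z_max = max_lod(recombinants=2, meioses=10)
print(f"theta_hat = {theta_hat:.2f}, max lod = {z_max:.3f}")
```

In this toy setting the estimate of the recombination fraction is simply the observed proportion of recombinants, which is why the abstract can summarize program performance through the sampling distributions of both quantities.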

2.
Founder-origin probability methods are used to trace specific chromosomal segments in individual offspring. A haplotypic method was developed for calculating founder-origin probabilities in three-generation outbred pedigrees suited to quantitative trait locus (QTL) analysis. Estimators for expected founder-origin proportions were derived for a linkage group segment, an entire linkage group and a complete haplotype. If the founders are truly outbred, the haplotypic method gives a close approximation when compared with the Haley et al. (1994) method that simultaneously uses all marker information for QTL analysis, and it is less computationally demanding. The chief limitation of the haplotypic method is that some information in two-allele intercross marker-type configurations is ignored. Informativeness of marker arrays is discussed in the framework of founder-origin probabilities and proportions. The haplotypic method can be extended to more complex pedigrees with additional generations.  相似文献   

3.
QTL analysis in arbitrary pedigrees with incomplete marker information   Total citations: 3; self-citations: 0; citations by others: 3
Vogl C, Xu S. Heredity, 2002, 89(5): 339-345
Mapping quantitative trait loci (QTL) in arbitrary outbred pedigrees is complicated by the combinatorial possibilities of allele flow relationships and of the founder allelic configurations. Exact methods are only available for rather short and simple pedigrees. Stochastic simulation using Markov chain Monte Carlo (MCMC) integration offers more flexibility. MCMC methods are less natural in a frequentist than in a Bayesian context, which we therefore adopt. Among the MCMC algorithms for updating marker locus genotypes, we implement the descent-graph algorithm. It can be used to update marker locus allele flow relationships and can handle arbitrarily complex pedigrees and missing marker information. Compared with updating marker genotypic information, updating QTL parameters, such as position, effects, and the allele flow relationships, is relatively easy with MCMC. We treat the effect of each diploid combination of founder alleles as a random variable and only estimate the variance of these effects, i.e., we model diploid genotypic effects instead of the usual partition into additive and dominance effects. This is a variant of the random model approach. The number of QTL alleles is generally unknown. In the Bayesian context, the number of QTL present on a linkage group can be treated as variable. Computer simulations suggest that the algorithm can indeed handle complex pedigrees and detect two QTL on a linkage group, but that the size of a single extended family is limited to about 50 to 100 individuals.

4.
The goal of this study is to evaluate and compare several standard and new linkage analysis methods. First, we compare a recently proposed confidence set approach with MAPMAKER/SIBS. Then, we evaluate a new Bayesian approach that accounts for heterogeneity. Finally, the newly developed software SIMPLE is compared with GENEHUNTER. We apply these methods to several replicates of the Genetic Analysis Workshop 13 simulated data to assess their ability to detect the high blood pressure genes on chromosome 21, whose positions were known to us prior to the analyses. In contrast to the standard methods, most of the new approaches are able to identify at least one of the disease genes in all the replicates considered.

5.
We have evaluated the power for detecting a common trait determined by two loci, using seven statistics, of which five are implemented in the computer program SimWalk2 and two are implemented in GENEHUNTER. Unlike most previous reports, which evaluate the power of allele-sharing statistics for a single disease locus, we have used a simulated data set of general pedigrees in which a two-locus disease is segregating and evaluated several nonparametric linkage statistics implemented in the two programs. We found that the power for detecting linkage using the S_all statistic in GENEHUNTER (GH, version 2.1), implemented as statistic E in SimWalk2 (version 2.82), differs between the two programs. The P values associated with statistic E output by SimWalk2 are consistently more conservative than those from GENEHUNTER, except when the underlying model includes heterogeneity at a level of 50%, where the P values output are very comparable. On the other hand, when the thresholds are determined empirically under the null hypothesis, S_all in GENEHUNTER and statistic E have similar power.
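
The idea of determining thresholds empirically under the null hypothesis can be sketched as follows. This is a generic illustration, not the procedure used by the authors: it assumes a list of linkage statistics has already been computed on replicates simulated without linkage, and the function names are hypothetical.

```python
import math

def empirical_p_value(observed, null_stats):
    """Empirical P value: share of null replicates at least as extreme.

    The +1 in numerator and denominator counts the observed statistic
    itself, so the estimate is never exactly zero.
    """
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)

def empirical_threshold(null_stats, alpha):
    """Threshold reached or exceeded by about a fraction alpha of null replicates.

    With tied null statistics the realized proportion can exceed alpha.
    """
    ordered = sorted(null_stats, reverse=True)
    # Number of null replicates allowed to reach the threshold.
    k = math.floor(alpha * len(ordered) + 1e-9)
    if k == 0:
        raise ValueError("too few null replicates for this alpha")
    return ordered[k - 1]

# Toy null distribution: statistics 0, 1, ..., 99 from 100 null replicates.
null_stats = list(range(100))
print(empirical_threshold(null_stats, alpha=0.05))
print(empirical_p_value(95, null_stats))
```

Calibrating each statistic against its own null distribution in this way puts the two programs on the same type I error footing, which is the condition under which the abstract reports similar power.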

6.
In a previous study we found evidence for an X-linked genetic component for familial typical migraine in two large Australian white pedigrees, designated MF7 and MF14. Significant excess allele sharing was indicated by nonparametric linkage (NPL) analysis using GENEHUNTER (P=0.031 and P=0.012, respectively), with a combined analysis of the two pedigrees showing further increased evidence for linkage, producing a maximum NPL score of 2.87 (P=0.011) at DXS1123 on Xq27. The present study was aimed at refining the localization of the migraine X-chromosomal component by typing additional markers, performing haplotype analysis and applying a more powerful technique in the analysis of linkage data from these two pedigrees. Results from the haplotype analyses, coupled with linkage analyses that produced a peak GENEHUNTER-PLUS LOD* score of 2.388 (P=0.0005), provide compelling evidence for the presence of a migraine susceptibility locus on chromosome Xq24-28.

7.
When the mode of inheritance of a disease is unknown, the LOD-score method of linkage analysis must take into account uncertainties in model parameters. We have previously proposed a parametric linkage test called "MFLOD," which does not require specification of disease model parameters. In the present study, we introduce two new model-free parametric linkage tests, known as "MLOD" and "MALOD." These tests are defined, respectively, as the LOD score and the admixture LOD score, maximized (subject to the same constraints as MFLOD) over disease-model parameters. We compared the power of these three parametric linkage tests and that of two nonparametric linkage tests, NPLall and NPLpairs, which are implemented in GENEHUNTER. With the use of small pedigrees and a fully informative marker, we found the powers of MLOD, NPLall, and NPLpairs to be almost equivalent to each other and not far below that of a LOD-score analysis performed under the assumption of the correct genetic parameters. Thus, linkage analysis is not much hindered by uncertain mode of inheritance. The results also suggest that both parametric and nonparametric methods are suitable for linkage analysis of complex disorders in small pedigrees. However, whether these results apply to large pedigrees remains to be answered.
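
The admixture LOD score underlying MALOD can be illustrated with a small sketch. It assumes the standard admixture model, in which a proportion alpha of families is linked, and it maximizes only over alpha on a grid for fixed per-family LOD scores; the additional maximization over disease-model parameters described in the abstract is not reproduced here. The function names and the grid search are illustrative assumptions.

```python
import math

def admixture_lod(family_lods, alpha):
    """Admixture (heterogeneity) LOD for a given proportion of linked families.

    HLOD(alpha) = sum_i log10( alpha * 10^Z_i + (1 - alpha) ),
    where Z_i is the LOD score of family i at a fixed position.
    """
    return sum(math.log10(alpha * 10.0 ** z + (1.0 - alpha)) for z in family_lods)

def max_admixture_lod(family_lods, steps=100):
    """Grid search over alpha in [0, 1]; returns (best alpha, best HLOD)."""
    best_alpha, best_hlod = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for i in range(1, steps + 1):
        alpha = i / steps
        hlod = admixture_lod(family_lods, alpha)
        if hlod > best_hlod:
            best_alpha, best_hlod = alpha, hlod
    return best_alpha, best_hlod

# One family supports linkage (LOD = 2), the other argues against it (LOD = -2).
alpha_hat, hlod = max_admixture_lod([2.0, -2.0])
print(f"alpha_hat = {alpha_hat:.2f}, HLOD = {hlod:.3f}")
```

In this toy example the ordinary summed LOD is zero, whereas allowing for heterogeneity recovers positive evidence for linkage from the one family that supports it.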

8.
An offspring genome can be viewed as a mosaic of chromosomal segments or haplotypes contributed by multiple founders in any quantitative trait locus (QTL) detection study but tracing these is especially complex to achieve for outbred pedigrees. QTL haplotypes can be traced from offspring back to individual founders in outbred pedigrees by combining founder-origin probabilities with fully informative flanking markers. This haplotypic method was illustrated for QTL detection using a three-generation pedigree for a woody perennial plant, Pinus taeda L. Growth rate was estimated using height measurements from ages 2 to 10 years. Using simulated and actual datasets, power of the experimental design was shown to be efficient for detecting QTLs of large effect. Using interval mapping and fully informative markers, a large QTL accounting for 11.3% of the phenotypic variance in the growth rate was detected. This same QTL was expressed at all ages for height, accounting for 7.9-12.2% of the phenotypic variance. A mixed-model inheritance was more appropriate for describing genetic architecture of growth curves in P. taeda than a strictly polygenic model. The positive QTL haplotype was traced from the offspring to its contributing founder, GP3, then the haplotypic phase for GP3 was determined by assaying haploid megagametophytes. The positive QTL haplotype was a recombinant haplotype contributed by GP3. This study illustrates the combined power of fully informative flanking markers and founder origin probabilities for (1) estimating QTL haplotype magnitude, (2) tracing founder origin and (3) determining haplotypic transmission frequency.  相似文献   

9.
Three variants of the confidence set inference (CSI) procedure were proposed and applied to both the simulated and the Collaborative Study on the Genetics of Alcoholism (COGA) data. For each of the two applications, we first performed a preliminary genome scan study based on the microsatellite markers using the GENEHUNTER+ software to identify regions that potentially harbor disease loci. For each such region, we estimated the sibling identity-by-descent sharing probability distribution at the putative disease locus. Based on these estimated probabilities, the CSI procedures were employed to further localize the disease loci using the single-nucleotide polymorphism markers, leading to confidence intervals/regions for their locations. For our analysis with the simulated data, we had knowledge of the simulating models at the time we performed the analysis.  相似文献   

10.
Computational constraints currently limit exact multipoint linkage analysis to pedigrees of moderate size. We introduce new algorithms that allow analysis of larger pedigrees by reducing the time and memory requirements of the computation. We use the observed pedigree genotypes to reduce the number of inheritance patterns that need to be considered. The algorithms are implemented in a new version (version 2.1) of the software package GENEHUNTER. Performance gains depend on marker heterozygosity and on the number of pedigree members available for genotyping, but typically are 10-1,000-fold, compared with the performance of the previous release (version 2.0). As a result, families with up to 30 bits of inheritance information have been analyzed, and further increases in family size are feasible. In addition to computation of linkage statistics and haplotype determination, GENEHUNTER can also perform single-locus and multilocus transmission/disequilibrium tests. We describe and implement a set of permutation tests that allow determination of empirical significance levels in the presence of linkage disequilibrium among marker loci.  相似文献   
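
The "bits of inheritance information" quoted above can be made concrete with a short sketch. It assumes the usual convention in the Lander-Green and GENEHUNTER literature that a pedigree with n nonfounders and f founders requires 2n - f bits; the pedigree representation and function name are hypothetical and do not correspond to GENEHUNTER's input format.

```python
def inheritance_bits(pedigree):
    """Count inheritance bits as 2n - f (assumed convention).

    `pedigree` maps each individual to a (father, mother) tuple,
    or to None for founders. Each nonfounder contributes two meioses
    (one paternal, one maternal); one bit per founder is saved because
    founder phase is arbitrary.
    """
    founders = sum(1 for parents in pedigree.values() if parents is None)
    nonfounders = len(pedigree) - founders
    return 2 * nonfounders - founders

# Nuclear family: two founders and three children.
nuclear_family = {
    "father": None,
    "mother": None,
    "child1": ("father", "mother"),
    "child2": ("father", "mother"),
    "child3": ("father", "mother"),
}

bits = inheritance_bits(nuclear_family)
print(bits, "bits,", 2 ** bits, "inheritance vectors")
```

Because the number of inheritance vectors doubles with every bit, a 30-bit family corresponds to roughly a billion vectors, which is why reducing the set of inheritance patterns considered yields such large performance gains.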

11.
Recent studies have suggested that a high-density single-nucleotide polymorphism (SNP) marker set could provide equivalent or even superior information compared with currently used microsatellite (STR) marker sets for gene mapping by linkage. The focus of this study was to compare results obtained from linkage analyses involving extended pedigrees with STR and SNP marker sets. We also wanted to compare the performance of current linkage programs in the presence of high marker density and extended pedigree structures. One replicate of the Genetic Analysis Workshop 14 (GAW14) simulated extended pedigrees (n = 50) from New York City was analyzed to identify the major gene D2. Four marker sets with varying information content and density on chromosome 3 (STR [7.5 cM]; SNP [3 cM, 1 cM, 0.3 cM]) were analyzed to detect two traits: the original affection status and a redefined trait more closely correlated with D2. Multipoint parametric and nonparametric linkage (NPL) analyses were performed using the programs GENEHUNTER, MERLIN, SIMWALK2, and S.A.G.E. SIBPAL. Our results suggested that the densest SNP map (0.3 cM) had the greatest power to detect linkage for the original trait (genetic heterogeneity), with the highest LOD score/NPL score and mapping precision. However, no significant improvement in linkage signals was observed with the densest SNP map compared with the STR or SNP 1 cM maps for the redefined affection status (genetic homogeneity), possibly due to the extremely high information content of all maps. Finally, our results suggested that each linkage program had limitations in handling the large, complex pedigrees as well as a high-density SNP marker set.

12.
This paper is concerned with efficient strategies for gene mapping using pedigrees containing small numbers of affecteds and identity-by-descent data from closely spaced markers throughout the genome. Particular attention is paid to additive traits involving phenocopies and/or locus heterogeneity. For a sample of pedigrees containing a particular configuration of affecteds, e.g., pairs of siblings together with a first cousin, we use a likelihood analysis to find 1-df statistics that are very efficient over a broad range of penetrances and allele frequencies. We identify configurations of affecteds that are particularly powerful for detecting linkage, and we show how pedigrees containing different numbers and configurations of affecteds can be efficiently combined in an overall test statistic.  相似文献   

13.
Wang K, Peng Y. BMC Genetics, 2003, 4(Z1): S77
A genome-wide linkage analysis was conducted on systolic blood pressure using a score statistic. The randomly selected Replicate 34 of the simulated data was used. The score statistic was applied to the sibships derived from the general pedigrees. An add-on R program to GENEHUNTER was developed for this analysis and is freely available.

14.
The performance of optimization algorithms, including those based on swarm intelligence, depends on the values assigned to their parameters. To obtain high performance, these parameters must be fine-tuned. Since many parameters can take real values or integer values from a large domain, it is often possible to treat the tuning problem as a continuous optimization problem. In this article, we study the performance of a number of prominent continuous optimization algorithms for parameter tuning using various case studies from the swarm intelligence literature. The continuous optimization algorithms that we study are enhanced to handle the stochastic nature of the tuning problem. In particular, we introduce a new post-selection mechanism that uses F-Race in the final phase of the tuning process to select the best among elite parameter configurations. We also examine the parameter space of the swarm intelligence algorithms that we consider in our study, and we show that by fine-tuning their parameters one can obtain substantial improvements over default configurations.  相似文献   

15.
Gao G, Hoeschele I. Genetics, 2005, 171(1): 365-376
Identity-by-descent (IBD) matrix calculation is an important step in quantitative trait loci (QTL) analysis using variance component models. To calculate IBD matrices efficiently for large pedigrees with large numbers of loci, an approximation method based on the reconstruction of haplotype configurations for the pedigrees is proposed. The method uses a subset of haplotype configurations with high likelihoods identified by a haplotyping method. The new method is compared with a Markov chain Monte Carlo (MCMC) method (Loki) in terms of QTL mapping performance on simulated pedigrees. Both methods yield almost identical results for the estimation of QTL positions and variance parameters, while the new method is much more computationally efficient than the MCMC approach for large pedigrees and large numbers of loci. The proposed method is also compared with an exact method (Merlin) in small simulated pedigrees, where both methods produce nearly identical estimates of position-specific kinship coefficients. The new method can be used for fine mapping with joint linkage disequilibrium and linkage analysis, which improves the power and accuracy of QTL mapping.  相似文献   

16.
The power provided by several sampling designs to detect segregation at a major locus was investigated in a simulation study using phenotypes constructed from a major-locus genotypic mean, a background polygenic effect, and an individual-specific environmental effect. Questions of which relatives, how many relatives, and how many independent pedigrees to collect were considered, using configurations ranging from nuclear families of size 5 to 4-generation pedigrees of size 45. Each configuration contained a single proband whose phenotype exceeded the 95th percentile in a population where 2.5% carry the disease susceptibility allele. Results suggest that, under the conditions simulated, when total sample size is fixed, samples composed of 3-generation pedigrees of intermediate size provide a greater magnitude of support for the presence of a major locus than do samples composed of nuclear families or 4-generation pedigrees. This study is the first to consider both the discriminatory power and estimation efficiency in comparing alternative sampling strategies for pedigree data.  相似文献   

17.
Having found evidence for segregation at a major locus for a quantitative trait, a logical next step is to identify those pedigrees in which major-locus segregation is occurring. If the quantitative trait is a risk factor for an associated disease, identifying such segregating pedigrees can be important in classifying families by etiology, in risk assessment, and in suggesting treatment modalities. Identifying segregating pedigrees can also be helpful in selecting pedigrees to include in a subsequent linkage study to map the major locus. Here, we describe a strategy to identify pedigrees segregating at a major locus for a quantitative trait. We apply this pedigree selection strategy to simulated data generated under a major-locus or mixed model with a rare dominant allele and sampled according to one of several fixed-structure or sequential sampling designs. We demonstrate that for the situations considered, the pedigree selection strategy is sensitive and specific and that a linkage study based only on the pedigrees classified as segregating extracts essentially all the linkage information in the entire sample of pedigrees. Our results suggest that for large-scale linkage studies involving many genetic markers, the savings from this strategy can be substantial and that, compared with fixed-structure sampling, sequential sampling of pedigrees can greatly improve the efficiency for linkage analysis of a quantitative trait.  相似文献   

18.
Errors in genotype calling can have perverse effects on genetic analyses, confounding association studies, and obscuring rare variants. Analyses now routinely incorporate error rates to control for spurious findings. However, reliable estimates of the error rate can be difficult to obtain because of their variance between studies. Most studies also report only a single estimate of the error rate even though genotypes can be miscalled in more than one way. Here, we report a method for estimating the rates at which different types of genotyping errors occur at biallelic loci using pedigree information. Our method identifies potential genotyping errors by exploiting instances where the haplotypic phase has not been faithfully transmitted. The expected frequency of inconsistent phase depends on the combination of genotypes in a pedigree and the probability of miscalling each genotype. We develop a model that uses the differences in these frequencies to estimate rates for different types of genotype error. Simulations show that our method accurately estimates these error rates in a variety of scenarios. We apply this method to a dataset from the whole-genome sequencing of owl monkeys (Aotus nancymaae) in three-generation pedigrees. We find significant differences between estimates for different types of genotyping error, with the most common being homozygous reference sites miscalled as heterozygous and vice versa. The approach we describe is applicable to any set of genotypes where haplotypic phase can reliably be called and should prove useful in helping to control for false discoveries.  相似文献   

19.
Patients with inherited retinal dystrophies (IRDs) were recruited from two understudied populations, Mexico and Pakistan, as well as a third, well-studied population of European Americans, to define the genetic architecture of IRD by whole-genome sequencing (WGS). Whole-genome analysis was performed on 409 individuals from 108 unrelated pedigrees with IRDs. All patients underwent an ophthalmic evaluation to establish the retinal phenotype. Although the 108 pedigrees in this study had previously been examined for mutations in known IRD genes using a wide range of methodologies, including targeted gene(s) or mutation(s) screening, linkage analysis and exome sequencing, the gene mutations responsible for IRD in these 108 pedigrees had not been determined. WGS was performed on these pedigrees using Illumina X10 at a minimum of 30X depth. The sequence reads were mapped against hg19, followed by variant calling using GATK. The genome variants were annotated using SnpEff, PolyPhen2, and CADD score; the structural variants (SVs) were called using GenomeSTRiP and LUMPY. We identified potential causative sequence alterations in 61 pedigrees (57%), including 39 novel and 54 reported variants in IRD genes. For 57 of these pedigrees the observed genotype was consistent with the initial clinical diagnosis; the remaining 4 had the clinical diagnosis reclassified based on our findings. In seven pedigrees (12%) we observed atypical causal variants, i.e. unexpected genotype(s), including 4 pedigrees with causal variants in more than one IRD gene within all affected family members, one pedigree with intrafamilial genetic heterogeneity (different affected family members carrying causal variants in different IRD genes), one pedigree carrying a dominant causative variant present in pseudo-recessive form due to consanguinity and one pedigree with a de novo variant in the affected family member. Combined atypical and large structural variants contributed to about 20% of cases.
Among the novel mutations, 75% were detected in Mexican and 50% were found in European American pedigrees and have not been reported in any other population, while only 20% were detected in Pakistani pedigrees and were not previously reported. The remaining novel IRD causative variants were listed in gnomAD but were found to be very rare and population specific. Mutations in known IRD-associated genes contributed to pathology in 63% of Mexican, 60% of Pakistani and 45% of European American pedigrees analyzed. Overall, the contribution of known IRD gene variants to disease pathology in these three populations was similar to that observed in other populations worldwide. This study revealed a spectrum of mutations contributing to IRD in three populations, identified a large proportion of novel potentially causative variants that are specific to the corresponding population or not reported in gnomAD, and shed light on the genetic architecture of IRD in these diverse global populations.

20.
A recent approach for gene mapping based on confidence set inference (CSI) promises several advantages, including avoidance of corrections for multiple tests, availability of confidence intervals with known statistical properties, and sufficient localizations of disease genes. This paper proposes an extended CSI procedure that can handle markers with incomplete polymorphism, thereby increasing the applicability of the set of CSI methods in practical situations. Simulation studies show that the new procedure retains the main advantages of the original CSI. Although it generally requires more data to achieve a similar power, this increase is moderate for markers with 80% heterozygosity or higher. We also investigate the effects of relative risk estimates and disease models. Our analyses show that perturbation from actual relative risks or multilocus disease models generally leads to reduction in power or inflation in type I error, as expected. Nevertheless, for certain classes of two-locus disease models, CSI can still perform well, with reasonably high actual coverage probabilities for at least one of the disease loci. Application of CSI to the data provided by the Genetic Analysis Workshop 13 yields encouraging results, as they compare favorably to those obtained from GENEHUNTER using its NPL sib-pair method.  相似文献   
