Similar articles
 20 similar articles found
1.
Computations for genome scans need to adapt to the increasing use of dense diallelic markers as well as of full-chromosome multipoint linkage analysis with either diallelic or multiallelic markers. Whereas suitable exact-computation tools are available for use with small pedigrees, equivalent exact computation for larger pedigrees remains infeasible. Markov chain Monte Carlo (MCMC)-based methods currently provide the only computationally practical option. To date, no systematic comparison of the performance of MCMC-based programs is available, nor have these programs been systematically evaluated for use with dense diallelic markers. Using simulated data, we evaluate the performance of two MCMC-based linkage-analysis programs, lm_markers from the MORGAN package and SimWalk2, under a variety of analysis conditions. Pedigrees consisted of 14, 52, or 98 individuals in 3, 5, or 6 generations, respectively, with increasing amounts of missing data in larger pedigrees. One hundred replicates of markers and trait data were simulated on a 100-cM chromosome, with up to 10 multiallelic and up to 200 diallelic markers used simultaneously for computation of multipoint LOD scores. Exact computation was available for comparison in most situations, and comparison with a perfectly informative marker or interprogram comparison was available in the remaining situations. Our results confirm the accuracy of both programs in multipoint analysis with multiallelic markers on pedigrees of varied sizes and missing-data patterns, but there are some computational differences. In contrast, for large numbers of dense diallelic markers, only the lm_markers program was able to provide accurate results within a computationally practical time. Thus, programs in the MORGAN package are the first available to provide a computationally practical option for accurate linkage analyses in genome scans with both large numbers of diallelic markers and large pedigrees.

2.
In complex disease studies, it is crucial to perform multipoint linkage analysis with many markers and to use robust nonparametric methods that take account of all pedigree information. Currently available methods fall short in both regards. In this paper, we describe how to extract complete multipoint inheritance information from general pedigrees of moderate size. This information is captured in the multipoint inheritance distribution, which provides a framework for a unified approach to both parametric and nonparametric methods of linkage analysis. Specifically, the approach includes the following: (1) Rapid exact computation of multipoint LOD scores involving dozens of highly polymorphic markers, even in the presence of loops and missing data. (2) Non-parametric linkage (NPL) analysis, a powerful new approach to pedigree analysis. We show that NPL is robust to uncertainty about mode of inheritance, is much more powerful than commonly used nonparametric methods, and loses little power relative to parametric linkage analysis. NPL thus appears to be the method of choice for pedigree studies of complex traits. (3) Information-content mapping, which measures the fraction of the total inheritance information extracted by the available marker data and points out the regions in which typing additional markers is most useful. (4) Maximum-likelihood reconstruction of many-marker haplotypes, even in pedigrees with missing data. We have implemented NPL analysis, LOD-score computation, information-content mapping, and haplotype reconstruction in a new computer package, GENEHUNTER. The package allows efficient multipoint analysis of pedigree data to be performed rapidly in a single user-friendly environment.

3.
George AW. Genetics 2005, 171(2): 791-801
Mapping markers from linkage data continues to be a task performed in many genetic epidemiological studies. Data collected in a study may be used to refine published map estimates and a study may use markers that do not appear in any published map. Furthermore, inaccuracies in meiotic maps can seriously bias linkage findings. To make best use of the available marker information, multilocus linkage analyses are performed. However, two computational issues greatly limit the number of markers currently mapped jointly: the number of candidate marker orders increases exponentially with marker number, and computing exact multilocus likelihoods on general pedigrees is computationally demanding. In this article, a new Markov chain Monte Carlo (MCMC) approach that solves both these computational problems is presented. The MCMC approach allows many markers to be mapped jointly, using data observed on general pedigrees with unobserved individuals. The performance of the new mapping procedure is demonstrated through the analysis of simulated and real data. The MCMC procedure performs extremely well, even when there are millions of candidate orders, and gives results superior to those of CRI-MAP.

4.
Paget disease of bone (PDB) is characterized by increased osteoclast activity and localized abnormal bone remodeling. PDB has a significant genetic component, with evidence of linkage to chromosomes 6p21.3 (PDB1) and 18q21-22 (PDB2) in some pedigrees. There is evidence of genetic heterogeneity, with other pedigrees showing negative linkage to these regions. TNFRSF11A, a gene that is essential for osteoclast formation and that encodes receptor activator of nuclear factor-kappa B (RANK), has been mapped to the PDB2 region. TNFRSF11A mutations that segregate in pedigrees with either familial expansile osteolysis or familial PDB have been identified; however, linkage studies and mutation screening have excluded the involvement of RANK in the majority of patients with PDB. We have excluded linkage, both to PDB1 and to PDB2, in a large multigenerational pedigree with multiple family members affected by PDB. We have conducted a genomewide scan of this pedigree, followed by fine mapping and multipoint analysis in regions of interest. The peak two-point LOD scores from the genomewide scan were 2.75, at D7S507, and 1.76, at D18S70. Multipoint and haplotype analysis of markers flanking D7S507 did not support linkage to this region. Haplotype analysis of markers flanking D18S70 demonstrated a haplotype segregating with PDB in a large subpedigree. This subpedigree had a significantly lower age at diagnosis than the rest of the pedigree (51.2 ± 8.5 vs. 64.2 ± 9.7 years; P = .0012). Linkage analysis of this subpedigree demonstrated a peak two-point LOD score of 4.23, at marker D18S1390 (θ = 0), and a peak multipoint LOD score of 4.71, at marker D18S70. Our data are consistent with genetic heterogeneity within the pedigree and indicate that 18q23 harbors a novel susceptibility gene for PDB.
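
For readers unfamiliar with the two-point LOD scores quoted above, the following is a minimal sketch of the textbook special case in which every meiosis is phase-known and fully informative. That is an assumption made here for illustration; the study itself analyzed a large pedigree, where likelihoods are computed by dedicated linkage software. The function name `two_point_lod` and its parameters are hypothetical.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """LOD score at recombination fraction theta for phase-known meioses.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n ),
    where k is the number of recombinant meioses out of n in total.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    k, n = recombinants, meioses
    if theta == 0.0:
        # Any observed recombinant is impossible under complete linkage.
        if k > 0:
            return float("-inf")
        return n * math.log10(2.0)
    return (k * math.log10(theta)
            + (n - k) * math.log10(1.0 - theta)
            - n * math.log10(0.5))
```

Under this simplification, each non-recombinant informative meiosis contributes log10(2), about 0.30, to the LOD score at θ = 0, so roughly ten such meioses are needed to reach the conventional threshold of 3.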

5.
Computation of LOD scores is a valuable tool for mapping disease-susceptibility genes in the study of Mendelian and complex diseases. However, computation of exact multipoint likelihoods of large inbred pedigrees with extensive missing data is often beyond the capabilities of a single computer. We present a distributed system called "SUPERLINK-ONLINE," for the computation of multipoint LOD scores of large inbred pedigrees. It achieves high performance via the efficient parallelization of the algorithms in SUPERLINK, a state-of-the-art serial program for these tasks, and through the use of the idle cycles of thousands of personal computers. The main algorithmic challenge has been to efficiently split a large task for distributed execution in a highly dynamic, nondedicated running environment. Notably, the system is available online, which allows computationally intensive analyses to be performed with no need for either the installation of software or the maintenance of a complicated distributed environment. As the system was being developed, it was extensively tested by collaborating medical centers worldwide on a variety of real data sets, some of which are presented in this article.

6.
Dense sets of hundreds of thousands of markers have been developed for genome-wide association studies. These marker sets are also beneficial for linkage analysis of large, deep pedigrees containing distantly related cases. It is impossible to analyse jointly all genotypes in large pedigrees using the Lander–Green algorithm; however, as marker density increases, it becomes less crucial to analyse all individuals' genotypes simultaneously. In this report, an approximate multipoint non-parametric technique is described, where large pedigrees are split into many small pedigrees, each containing just two cases. This technique is demonstrated, using phased data from the International HapMap Project to simulate sets of 10,000, 50,000 and 250,000 markers, showing that it becomes increasingly accurate as more markers are genotyped. This method allows routine linkage analysis of large families with dense marker sets and represents a more easily applied alternative to Monte Carlo Markov Chain methods.

7.
Linkage disequilibrium (LD) content was calculated for the Genetic Analysis Workshop 14 Affymetrix and Illumina single-nucleotide polymorphism (SNP) genome scans of the Collaborative Study on the Genetics of Alcoholism (COGA) samples. Pair-wise LD was measured as both D' and r² on 505 pedigree founder individuals. The r² estimates were then used to correct the multipoint identity by descent matrix (MIBD) calculation to account for LD, and LOD scores on chromosomes 3 and 18 were calculated for COGA's ttdt3 electrophysiological trait using those MIBDs. Extensive LD was observed throughout both marker sets, and it was higher in Affymetrix's denser SNP map. However, SNP density did not solely account for Affymetrix's higher LD. MIBD estimation procedures assume linkage equilibrium to construct genotypes of non-genotyped pedigree founder individuals, and dense SNP genotyping maps are likely to contain moderate to high LD between markers. LOD score plots calculated after correction for LD followed the same general pattern as uncorrected ones. Since in our study almost half of the pedigree founders were genotyped, it is possible that LD had a minor impact on the LOD scores. Caution should probably be taken when using high density SNP maps when many non-genotyped founders are present in the study pedigrees.
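
The two pairwise LD measures named above, D' and r², have standard definitions for a pair of diallelic markers. The sketch below computes them from haplotype and allele frequencies. It assumes those frequencies are already known; estimating them from unphased founder genotypes, as the study would have required, is not shown. The function name `pairwise_ld` and its arguments are illustrative, not taken from any package.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, |D'|, r2) for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A at the first
          marker and allele B at the second marker
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic")
    # D is the excess of the AB haplotype over its equilibrium frequency.
    d = p_ab - p_a * p_b
    # D' rescales D by the largest magnitude it can take given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max
    # r2 is the squared correlation between alleles at the two markers.
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2
```

Both measures are 0 under linkage equilibrium. D' reaches 1 whenever at least one of the four haplotypes is absent, whereas r² reaches 1 only when the two markers are perfectly correlated, which is one reason r² is the usual choice for correcting or pruning marker maps.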

8.
Dense SNP maps can be highly informative for linkage studies. But when parental genotypes are missing, multipoint linkage scores can be inflated in regions with substantial marker-marker linkage disequilibrium (LD). Such regions were observed in the Affymetrix SNP genotypes for the Genetic Analysis Workshop 14 (GAW14) Collaborative Study on the Genetics of Alcoholism (COGA) dataset, providing an opportunity to test a novel simulation strategy for studying this problem. First, an inheritance vector (with or without linkage present) is simulated for each replicate, i.e., locations of recombinations and transmission of parental chromosomes are determined for each meiosis. Then, two sets of founder haplotypes are superimposed onto the inheritance vector: one set that is inferred from the actual data and which contains the pattern of LD; and one set created by randomly selecting parental alleles based on the known allele frequencies, with no correlation (LD) between markers. Applying this strategy to a map of 176 SNPs (66 Mb of chromosome 7) for 100 replicates of 116 sibling pairs, significant inflation of multipoint linkage scores was observed in regions of high LD when parental genotypes were set to missing, with no linkage present. Similar inflation was observed in analyses of the COGA data for these affected sib pairs with parental genotypes set to missing, but not after reducing the marker map until r² between any pair of markers was

9.
Sobel E, Sengul H, Weeks DE. Human Heredity 2001, 52(3): 121-131
OBJECTIVES: To describe, implement, and test an efficient algorithm to obtain multipoint identity-by-descent (IBD) probabilities at arbitrary positions among marker loci for general pedigrees. Unlike existing programs, our algorithm can analyze data sets with large numbers of people and markers. The algorithm has been implemented in the SimWalk2 computer package. METHODS: Using a rigorous testing regimen containing five pedigrees of various sizes with realistic marker data, we compared several widely used IBD computation programs: Allegro, Aspex, GeneHunter, MapMaker/Sibs, Mendel, Sage, SimWalk2, and Solar. RESULTS: The testing revealed a few discrepancies, particularly on consanguineous pedigrees, but overall excellent results in the deterministic multipoint packages. SimWalk2 was also found to be in good agreement with the deterministic multipoint programs, usually matching to two decimal places the kinship coefficient that ranges from 0 to 1. However, the packages based on single-point IBD estimation, while consistent with each other, often showed poor results, disagreeing with the multipoint kinship results by as much as 0.5. CONCLUSIONS: Our testing has clearly shown that multipoint IBD estimation is much better than single-point estimation. In addition, our testing has validated our algorithm for estimating IBD probabilities at arbitrary positions on general pedigrees.

10.
Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.

11.
Genotype data from the Illumina Linkage III SNP panel (n = 4,720 SNPs) and the Affymetrix 10 k mapping array (n = 11,120 SNPs) were used to test the effects of linkage disequilibrium (LD) between SNPs in a linkage analysis in the Collaborative Study on the Genetics of Alcoholism pedigree collection (143 pedigrees; 1,614 individuals). The average r² between adjacent markers across the genetic map was 0.099 ± 0.003 in the Illumina III panel and 0.17 ± 0.003 in the Affymetrix 10 k array. In order to determine the effect of LD between marker loci in a nonparametric multipoint linkage analysis, markers in strong LD with another marker (r² > 0.40) were removed (n = 471 loci in the Illumina panel; n = 1,804 loci in the Affymetrix panel) and the linkage analysis results were compared to the results using the entire marker sets. In all analyses using the ALDX1 phenotype, 8 linkage regions on 5 chromosomes (2, 7, 10, 11, X) were detected (peak markers p < 0.01), and the Illumina panel detected an additional region on chromosome 6. Analysis of the same pedigree set and ALDX1 phenotype using short tandem repeat markers (STRs) resulted in 3 linkage regions on 3 chromosomes (peak markers p < 0.01). These results suggest that in this pedigree set, LD between loci with spacing similar to the SNP panels tested may not significantly affect the overall detection of linkage regions in a genome scan. Moreover, since the data quality and information content are greatly improved in the SNP panels over STR genotyping methods, new linkage regions may be identified due to higher information content and data quality in a dense SNP linkage panel.
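
The abstract states that markers in strong LD with another marker (r² > 0.40) were removed but does not say which member of each pair was dropped. The sketch below shows one simple possibility, a greedy pass in map order that keeps a marker only if it is not in strong LD with any marker already kept. This rule, the function name `prune_by_ld`, and the dictionary layout for the r² values are all assumptions made for illustration, not the authors' procedure.

```python
def prune_by_ld(markers, r2, threshold=0.40):
    """Greedily thin a marker map so no kept pair exceeds the r2 threshold.

    markers:   marker names in map order
    r2:        dict mapping frozenset({m1, m2}) to the pairwise r2;
               pairs that are absent are treated as r2 = 0
    threshold: pairs with r2 above this value count as strong LD
    """
    kept = []
    for marker in markers:
        in_strong_ld = any(
            r2.get(frozenset((marker, other)), 0.0) > threshold
            for other in kept
        )
        if not in_strong_ld:
            kept.append(marker)
    return kept
```

With this rule the earlier marker on the map always survives, so the result depends on marker order. A real analysis might instead prefer to keep the more informative marker of each pair.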

12.
Under additive inheritance, the Henderson mixed model equations (HMME) provide an efficient approach to obtaining genetic evaluations by marker assisted best linear unbiased prediction (MABLUP) given pedigree relationships, trait and marker data. For large pedigrees with many missing markers, however, it is not feasible to calculate the exact gametic variance covariance matrix required to construct HMME. The objective of this study was to investigate the consequences of using approximate gametic variance covariance matrices on response to selection by MABLUP. Two methods were used to generate approximate variance covariance matrices. The first method (Method A) completely discards the marker information for individuals with an unknown linkage phase between two flanking markers. The second method (Method B) makes use of the marker information at only the most polymorphic marker locus for individuals with an unknown linkage phase. Data sets were simulated with and without missing marker data for flanking markers with 2, 4, 6, 8 or 12 alleles. Several missing marker data patterns were considered. The genetic variability explained by marked quantitative trait loci (MQTL) was modeled with one or two MQTL of equal effect. Response to selection by MABLUP using Method A or Method B were compared with that obtained by MABLUP using the exact genetic variance covariance matrix, which was estimated using 15 000 samples from the conditional distribution of genotypic values given the observed marker data. For the simulated conditions, the superiority of MABLUP over BLUP based only on pedigree relationships and trait data varied between 0.1% and 13.5% for Method A, between 1.7% and 23.8% for Method B, and between 7.6% and 28.9% for the exact method. The relative performance of the methods under investigation was not affected by the number of MQTL in the model.

13.
Stewart WC, Thompson EA. Biometrics 2006, 62(3): 728-734
As a result of previous large, multipoint linkage studies there is a substantial amount of existing marker data. Due to the increased sample size, genetic maps estimated from these data could be more accurate than publicly available maps. However, current methods for map estimation are restricted to data sets containing pedigrees with a small number of individuals, or cannot make full use of marker data that are observed at several loci on members of large, extended pedigrees. In this article, a maximum likelihood (ML) method for map estimation that can make full use of the marker data in a large, multipoint linkage study is described. The method is applied to replicate sets of simulated marker data involving seven linked loci, and pedigree structures based on the real multipoint linkage study of Abkevich et al. (2003, American Journal of Human Genetics 73, 1271-1281). The variance of the ML estimate is accurately estimated, and tests of both simple and composite null hypotheses are performed. An efficient procedure for combining map estimates over data sets is also suggested.

14.
Hereditary neuralgic amyotrophy (HNA) is a rare autosomal dominant disorder on chromosome 17q, associated with recurrent, episodic, painful brachial plexus neuropathy. Dysmorphic features, including hypotelorism, long nasal bridge and facial asymmetry, are frequently associated with HNA. To assess genetic homogeneity, determine the cytogenetic location, and identify flanking markers for the HNA locus, six pedigrees were studied with multiple DNA markers from distal chromosome 17q. The results in all pedigrees supported linkage of the HNA locus to chromosome 17. A maximum combined lod score (Z = 10.94, θ = 0.05) was obtained with marker D17S939 and the maximum multipoint lod score was 22.768 in the interval defined by D17S802–D17S939. An analysis of crossovers placed the HNA locus within an approximate 4.0-cM interval flanked by D17S1603 and D17S802. Analysis of DNA from a human/mouse somatic cell hybrid with linked markers suggests that band 17q25 harbors the HNA locus. These results support genetic homogeneity within HNA and define a specific interval and a precise cytogenetic location in chromosome 17q25 for this disorder. Received: 24 June 1997 / Accepted: 21 August 1997

15.
Most linkage programs assume linkage equilibrium among multiple linked markers. This assumption may lead to bias for tightly linked markers where strong linkage disequilibrium (LD) exists. We used simulated data from Genetic Analysis Workshop 14 to examine the possible effect of LD on multipoint linkage analysis. Single-nucleotide polymorphism packets from a non-disease-related region that was generated with LD were used for both model-free and parametric linkage analyses. Results showed that high LD among markers can induce false-positive evidence of linkage for affected sib-pair analysis when parental data are missing. Bias can be eliminated with parental data and can be reduced when additional markers not in LD are included in the analyses.

16.
Homozygosity mapping is a powerful strategy for mapping rare recessive traits in children of consanguineous marriages. Practical applications of this strategy are currently limited by the inability of conventional linkage analysis software to compute, in reasonable time, multipoint LOD scores for pedigrees with inbreeding loops. We have developed a new algorithm for rapid multipoint likelihood calculations in small pedigrees, including those with inbreeding loops. The running time of the algorithm grows, at most, linearly with the number of loci considered simultaneously. The running time is not sensitive to the presence of inbreeding loops, missing genotype information, and highly polymorphic loci. We have incorporated this algorithm into a software package, MAPMAKER/HOMOZ, that allows very rapid multipoint mapping of disease genes in nuclear families, including homozygosity mapping. Multipoint analysis with dozens of markers can be carried out in minutes on a personal workstation.

17.
OBJECTIVE: To define the region on human chromosome 19 carrying the gene for malignant hyperthermia susceptibility and to evaluate the use of flanking DNA markers in diagnosing susceptibility. DESIGN: Prospective molecular genetic linkage studies in a large malignant hyperthermia pedigree. SETTING: Irish malignant hyperthermia testing centre. SUBJECTS: A large Irish malignant hyperthermia pedigree. MAIN OUTCOME MEASURES: Routine diagnosis of susceptibility to malignant hyperthermia with in vitro contracture test on muscle biopsy specimens and genetic linkage between susceptibility and polymorphic DNA markers in a malignant hyperthermia family. RESULTS: Genetic typing of polymorphic DNA markers in a large Irish malignant hyperthermia pedigree generated a lod score of greater than 3 for the marker D19S9 and showed that the gene for susceptibility is flanked by the markers D19S9 and D19S16. These tightly linked flanking markers allowed non-invasive presymptomatic diagnosis of susceptibility in five untested subjects in the large pedigree with an accuracy of greater than 99.7%. CONCLUSIONS: DNA markers flanking the gene for susceptibility to malignant hyperthermia can be used with high accuracy to diagnose susceptibility in subjects in large known malignant hyperthermia pedigrees and may replace the previous in vitro contracture test for diagnosing this inherited disorder in large families with malignant hyperthermia.

18.
Gene mapping and genetic epidemiology require large-scale computation of likelihoods based on human pedigree data. Although computation of such likelihoods has become increasingly sophisticated, fast calculations are still impeded by complex pedigree structures, by models with many underlying loci and by missing observations on key family members. The current paper introduces a new method of array factorization that substantially accelerates linkage calculations with large numbers of markers. This method is not limited to nuclear families or to families with complete phenotyping. Vectorization and parallelization are two general-purpose hardware techniques for accelerating computations. These techniques can assist in the rapid calculation of genetic likelihoods. We describe our experience using both of these methods with the existing program MENDEL. A vectorized version of MENDEL was run on an IBM 3090 supercomputer. A parallelized version of MENDEL was run on parallel machines of different architectures and on a network of workstations. Applying these revised versions of MENDEL to two challenging linkage problems yields substantial improvements in computational speed.

19.
Since little is known about chromosomal locations harboring type 2 diabetes-susceptibility genes, we conducted a genomewide scan for such genes in a Mexican American population. We used data from 27 low-income extended Mexican American pedigrees consisting of 440 individuals for whom genotypic data are available for 379 markers. We used a variance-components technique to conduct multipoint linkage analyses for two phenotypes: type 2 diabetes (a discrete trait) and age at onset of diabetes (a truncated quantitative trait). For the multipoint analyses, a subset of 295 markers was selected on the basis of optimal spacing and informativeness. We found significant evidence that a susceptibility locus near the marker D10S587 on chromosome 10q influences age at onset of diabetes (LOD score 3.75) and is also linked with type 2 diabetes itself (LOD score 2.88). This susceptibility locus explains 63.8% ± 9.9% (P = .000016) of the total phenotypic variation in age at onset of diabetes and 65.7% ± 10.9% (P = .000135) of the total variation in liability to type 2 diabetes. Weaker evidence was found for linkage of diabetes and of age at onset to regions on chromosomes 3p, 4q, and 9p. In conclusion, our strongest evidence for linkage to both age at onset of diabetes and type 2 diabetes itself in the Mexican American population was for a region on chromosome 10q.

20.
Members of a large pedigree of Irish origin presenting with early onset Type I autosomal dominant retinitis pigmentosa (ADRP) have been typed for polymorphic DNA markers from chromosomes 6, 13, 20, and 21. For each marker close linkage to ADRP has been excluded by pairwise analyses. Using distances fixed from well-established genetic maps of these chromosomes and multipoint analyses with two or three contiguous markers, exclusion of ADRP was extended to the areas between markers, resulting in the exclusion of ADRP from extensive regions of each chromosome, totaling approximately 500 cM or 15% of the genome. The study indicates the large quantity of linkage/exclusion data obtainable using well-spaced highly polymorphic markers.
