Found 20 similar articles (search time: 12 ms)
1.
Patterns of linkage disequilibrium in the human genome (cited by 2: 0 self-citations, 2 by others)
Particular alleles at neighbouring loci tend to be co-inherited. For tightly linked loci, this might lead to associations between alleles in the population, a property known as linkage disequilibrium (LD). LD has recently become the focus of intense study in the hope that it might facilitate the mapping of complex disease loci through whole-genome association studies. This approach depends crucially on the patterns of LD in the human genome. In this review, we draw on empirical studies in humans and Drosophila, as well as simulation studies, to assess the current state of knowledge about patterns of LD, and consider the implications for the use of LD as a mapping tool.
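The association between alleles described in this abstract is usually quantified with the statistics D, D' and r^2. The following is a minimal sketch of the standard textbook definitions for two biallelic loci; it is not taken from the article, and the function name `ld_stats` and its parameters are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A at locus 1
           and allele B at locus 2 (assumed to be known, i.e. phased data)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    # D is the deviation of the observed haplotype frequency from the
    # product expected under independence.
    d = p_ab - p_a * p_b

    # The largest |D| that the allele frequencies allow depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two allelic states.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

With both allele frequencies at 0.5, a haplotype frequency of 0.5 corresponds to complete association, and a haplotype frequency of 0.25 corresponds to linkage equilibrium.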
2.
There is great interest in the patterns and extent of linkage disequilibrium (LD) in humans and other species. Characterizing LD is of central importance for gene-mapping studies and can provide insights into the biology of recombination and human demographic history. Here, we review recent developments in this field, including the recently proposed 'haplotype-block' model of LD. We describe some of the recent data in detail and compare the observed patterns to those seen in simulations.
3.
Increased gene coverage and Alu frequency in large linkage disequilibrium blocks of the human genome
The human genome has linkage disequilibrium (LD) blocks, within which single-nucleotide polymorphisms show strong association with each other. We examined data from the International HapMap Project to define LD blocks and to detect DNA sequence features inside of them. We used permutation tests to determine the empirical significance of the association of LD blocks with genes and Alu repeats. Very large LD blocks (>200 kb) have significantly higher gene coverage and Alu frequency than the outcome obtained from permutation-based simulation, whereas there was no significant positive correlation between gene density and block size. We also observed a reduced frequency of Alu repeats at the gaps between large LD blocks, indicating that their enrichment in large LD blocks does not introduce recombination hotspots that would cause these gaps.
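The abstract above does not describe its permutation scheme in detail. As a generic illustration only, the sketch below computes a one-sided empirical p-value by repeatedly drawing random subsets of blocks and comparing their mean feature value (for example gene coverage) with the mean observed in the flagged blocks. The function name, its parameters and the subset-resampling design are assumptions, not the authors' procedure.

```python
import random


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """One-sided empirical p-value for 'flagged items have a higher mean'.

    values : one feature value per block (hypothetical, e.g. gene coverage)
    labels : True for blocks in the group of interest (e.g. very large blocks)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = sum(labels)            # size of the group of interest
    observed = sum(v for v, flag in zip(values, labels) if flag) / k

    hits = 0
    for _ in range(n_perm):
        # Draw a random group of the same size, ignoring the real labels.
        sample = rng.sample(values, k)
        if sum(sample) / k >= observed:
            hits += 1

    # Adding 1 to both counts keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value means that random groups of the same size rarely reach the observed mean.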
4.
A scan for linkage disequilibrium across the human genome. (cited by 17: 0 self-citations, 17 by others)
5.
Pe'er I Chretien YR de Bakker PI Barrett JC Daly MJ Altshuler DM 《American journal of human genetics》2006,78(4):588-603
Genetic association studies of common disease often rely on linkage disequilibrium (LD) along the human genome and in the population under study. Although understanding the characteristics of this correlation has been the focus of many large-scale surveys (culminating in genomewide haplotype maps), the results of different studies have yielded wide-ranging estimates. Since understanding these differences (and whether they can be reconciled) has important implications for whole-genome association studies, in this article we dissect biases in these estimations that are due to known aspects of study design and analytic methodology. In particular, we document in the empirical data that the long-known complicating effects of allele frequency, marker density, and sample size largely reconcile all large-scale surveys. Two exceptions are an underappraisal of redundancy among single-nucleotide polymorphisms (SNPs) when evaluation is limited to short regions (as in candidate-gene resequencing studies) and an inflation in the extent of LD in HapMap phase I, which is likely due to oversampling of specific haplotypes in the creation of the public SNP map. Understanding these factors can guide the understanding of empirical LD surveys and has implications for genetic association studies.
6.
Sun P Zhang R Jiang Y Wang X Li J Lv H Tang G Guo X Meng X Zhang H Zhang R 《The FEBS journal》2011,278(19):3748-3755
We used the genotyping data generated by the International HapMap Project to study the patterns of linkage disequilibrium (LD) in human genic regions. LD patterns for 11,998 genes from 11 HapMap populations were identified by analyzing the distribution of haplotype blocks. The genes were prioritized using LD levels. The results showed that there were significant differences in the degree of LD between genes. Genes with high or low LD (the upper and lower quartiles of the LD levels) fell into different Gene Ontology functional categories. The high LD genes clustered preferentially in the metabolic process, macromolecule localization and cell-cycle categories, whereas the low LD genes clustered in the developmental process, ion transport, and immune and regulation system categories. Furthermore, we subdivided the genic region into 3'-UTR, 5'-UTR and CDS (coding region), and compared the different LD patterns in these subregions. We found that the LD patterns in low LD genes had a more interspersed block structure compared with the high LD genes. This was especially true in the CDS and 5'-UTR. The extent of LD was somewhat higher in 5'-UTRs compared with 3'-UTRs for both high and low LD genes. In addition, we assessed the overlap for the intragenic LD regions and found that the LD regions in high LD genes were more consistent among populations. Comprehensive information about the distribution of LD patterns in gene regions in populations may provide insights into the evolutionary history of humans and help in the selection of biomarkers for disease association studies.
7.
Reuben J. Pengelly William Tapper Jane Gibson Marcin Knut Rick Tearle Andrew Collins Sarah Ennis 《BMC genomics》2015,16(1)
Background
An understanding of linkage disequilibrium (LD) structures in the human genome underpins much of medical genetics and provides a basis for disease gene mapping and investigating biological mechanisms such as recombination and selection. Whole genome sequencing (WGS) provides the opportunity to determine LD structures at maximal resolution.
Results
We compare LD maps constructed from WGS data with LD maps produced from the array-based HapMap dataset, for representative European and African populations. WGS provides up to 5.7-fold greater SNP density than array-based data and achieves much greater resolution of LD structure, allowing for identification of up to 2.8-fold more regions of intense recombination. The absence of ascertainment bias in variant genotyping improves the population representativeness of the WGS maps, and highlights the extent of uncaptured variation using array genotyping methodologies. The complete capture of LD patterns using WGS allows for higher genome-wide association study (GWAS) power compared to array-based GWAS, with WGS also allowing for the analysis of rare variation. The impact of marker ascertainment issues in arrays has been greatest for Sub-Saharan African populations where larger sample sizes and substantially higher marker densities are required to fully resolve the LD structure.
Conclusions
WGS provides the best possible resource for LD mapping due to the maximal marker density and lack of ascertainment bias. WGS LD maps provide a rich resource for medical and population genetics studies. The increasing availability of WGS data for large populations will allow for improved research utilising LD, such as GWAS and recombination biology studies.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1854-0) contains supplementary material, which is available to authorized users.
8.
Molecular linkage maps of the Populus genome. (cited by 7: 0 self-citations, 7 by others)
Tongming Yin Xinye Zhang Minren Huang Minxiu Wang Qiang Zhuge Shengming Tu Li-Huang Zhu Rongling Wu 《Génome》2002,45(3):541-555
We report molecular genetic linkage maps for an interspecific hybrid population of Populus, a model system in forest-tree biology. The hybrids were produced by crosses between P. deltoides (mother) and P. euramericana (father), which is a natural hybrid of P. deltoides (grandmother) and P. nigra (grandfather). Linkage analysis from 93 of the 450 backcross progeny grown in the field for 15 years was performed using random amplified polymorphic DNAs (RAPDs), amplified fragment length polymorphisms (AFLPs), and inter-simple sequence repeats (ISSRs). Of a total of 839 polymorphic markers identified, 560 (67%) were testcross markers heterozygous in one parent but null in the other (segregating 1:1), 206 (25%) were intercross dominant markers heterozygous in both parents (segregating 3:1), and the remaining 73 (9%) were 19 non-parental RAPD markers (segregating 1:1) and 54 codominant AFLP markers (segregating 1:1:1:1). A mixed set of the testcross markers, non-parental RAPD markers, and codominant AFLP markers was used to construct two linkage maps, one based on the P. deltoides (D) genome and the other based on P. euramericana (E). The two maps showed nearly complete coverage of the genome, spanning 3801 and 3452 cM, respectively. The availability of non-parental RAPD and codominant AFLP markers as orthologous genes allowed for a direct comparison of the rate of meiotic recombination between the two different parental species. Generally, the rate of meiotic recombination was greater for males than females in our interspecific poplar hybrids. The confounded effect of sexes and species causes the mean recombination distance of orthologous markers to be 11% longer for the father (P. euramericana; interspecific hybrid) than for the mother (P. deltoides; pure species). 
The linkage maps constructed and the interspecific poplar hybrid population in which clonal replicates for individual genotypes are available present a comprehensive foundation for future genomic studies and quantitative trait locus (QTL) identification.
9.
Background
During the lifetime of a fermenter culture, the soil bacterium S. coelicolor undergoes a major metabolic switch from exponential growth to antibiotic production. We have studied gene expression patterns during this switch, using a specifically designed Affymetrix genechip and a high-resolution time-series of fermenter-grown samples.
Results
Surprisingly, we find that the metabolic switch actually consists of multiple finely orchestrated switching events. Strongly coherent clusters of genes show drastic changes in gene expression already many hours before the classically defined transition phase where the switch from primary to secondary metabolism was expected. The main switch in gene expression takes only 2 hours, and changes in antibiotic biosynthesis genes are delayed relative to the metabolic rearrangements. Furthermore, global variation in morphogenesis genes indicates an involvement of cell differentiation pathways in the decision phase leading up to the commitment to antibiotic biosynthesis.
Conclusions
Our study provides the first detailed insights into the complex sequence of early regulatory events during and preceding the major metabolic switch in S. coelicolor, which will form the starting point for future attempts at engineering antibiotic production in a biotechnological setting.
10.
Technology and genetics have advanced to the point where genotyping thousands of individuals at thousands of marker locations around the whole human genome is possible. The whole-genome scan for detection of complex disease genes is a widely discussed topic. We review some of the recent high-density genotyping experiments and discuss related details, particularly the extent and variability of linkage disequilibrium. We also discuss the quality of single nucleotide polymorphisms (SNPs) in public databases and its consequences to the number of SNPs required for large-scale genotyping projects.
11.
12.
Exome sequencing identifies thousands of DNA variants and a proportion of these are involved in disease. Genotypes derived from exome sequences provide particularly high-resolution coverage enabling study of the linkage disequilibrium structure of individual genes. The extent and strength of linkage disequilibrium reflects the combined influences of mutation, recombination, selection and population history. By constructing linkage disequilibrium maps of individual genes, we show that genes containing OMIM-listed disease variants are significantly under-represented amongst genes with complete or very strong linkage disequilibrium (P = 0.0004). In contrast, genes with disease variants are significantly over-represented amongst genes with levels of linkage disequilibrium close to the average for genes not known to contain disease variants (P = 0.0038). Functional clustering reveals, amongst genes with particularly strong linkage disequilibrium, significant enrichment of essential biological functions (e.g. phosphorylation, cell division, cellular transport and metabolic processes). Strong linkage disequilibrium, corresponding to reduced haplotype diversity, may reflect selection in utero against deleterious mutations which have profound impact on the function of essential genes. Genes with very weak linkage disequilibrium show enrichment of functions requiring greater allelic diversity (e.g. sensory perception and immune response). This category is not enriched for genes containing disease variation. In contrast, there is significant enrichment of genes containing disease variants amongst genes with more average levels of linkage disequilibrium. Mutations in these genes may be less likely to lead to in utero lethality and may be subject to less intense selection.
13.
Two linkage maps were constructed for the model plant Petunia. Mapping populations were obtained by crossing the wild species Petunia axillaris subsp. axillaris with Petunia inflata, and Petunia axillaris subsp. parodii with Petunia exserta. Both maps cover the seven chromosomes of Petunia, and span 970 centimorgans (cM) and 700 cM of the genomes, respectively. In total, 207 markers were mapped. Of these, 28 are multilocus amplified fragment length polymorphism (AFLP) markers and 179 are gene-derived markers. For the first time we report on the development and mapping of 83 Petunia microsatellites. The two maps retain the same marker order, but display significant differences of recombination frequencies at orthologous mapping intervals. A complex pattern of genomic rearrangements was detected with the related genome of tomato (Solanum lycopersicum), indicating that synteny between Petunia and other Solanaceae crops has been considerably disrupted. The newly developed markers will facilitate the genetic characterization of mutants and ecological studies on genetic diversity and speciation within the genus Petunia. The maps will provide a powerful tool to link genetic and genomic information and will be useful to support sequence assembly of the Petunia genome.
14.
Evolutionary forces like Hill-Robertson interference and negative epistasis can lead to deleterious mutations being found on distinct haplotypes. However, the extent to which these forces depend on the selection and dominance coefficients of deleterious mutations and shape genome-wide patterns of linkage disequilibrium (LD) in natural populations with complex demographic histories has not been tested. In this study, we first used forward-in-time simulations to predict how negative selection impacts LD. Under models where deleterious mutations have additive effects on fitness, deleterious variants less than 10 kb apart tend to be carried on different haplotypes relative to pairs of synonymous SNPs. In contrast, for recessive mutations, there is no consistent ordering of how selection coefficients affect LD decay, due to the complex interplay of different evolutionary effects. We then examined empirical data of modern humans from the 1000 Genomes Project. LD between derived alleles at nonsynonymous SNPs is lower compared to pairs of derived synonymous variants, suggesting that nonsynonymous derived alleles tend to occur on different haplotypes more than synonymous variants. This result holds when controlling for potential confounding factors by matching SNPs for frequency in the sample (allele count), physical distance, magnitude of background selection, and genetic distance between pairs of variants. Lastly, we introduce a new statistic HR(j) which allows us to detect interference using unphased genotypes. Application of this approach to high-coverage human genome sequences confirms our finding that nonsynonymous derived alleles tend to be located on different haplotypes more often than are synonymous derived alleles. Our findings suggest that interference may play a pervasive role in shaping patterns of LD between deleterious variants in the human genome, and consequently influences genome-wide patterns of LD.
15.
Jungerius BJ Gu J Crooijmans RP van der Poel JJ Groenen MA van Oost BA te Pas MF 《Animal biotechnology》2005,16(1):41-54
Linkage disequilibrium (LD) refers to the correlation among neighboring alleles, reflecting non-random patterns of association between alleles at (nearby) loci. A better understanding of LD in the porcine genome is of direct relevance for identification of genes and mutations with a certain effect on the traits of interest. Here, 215 SNPs in seven genomic regions were genotyped in individuals of three breeds. Pairwise linkage disequilibrium was calculated for all marker pairs. To estimate the extent of LD, all pairwise LD values were plotted against the distance between the markers. Based on SNP markers in four genomic regions analyzed in three panels from populations of Large White, Dutch Landrace, and Meishan origin, useful LD is estimated to extend for approximately 40 to 60 kb in the porcine genome.
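The procedure outlined in this abstract (pairwise LD for all marker pairs, then LD against inter-marker distance) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the study's pipeline: it assumes phased haplotypes coded 0/1, uses r^2 as the LD measure, and summarises by distance bin rather than plotting. The names `r_squared`, `ld_by_distance` and `bin_size` are hypothetical.

```python
from itertools import combinations


def r_squared(x, y):
    """r^2 between two markers given as 0/1 allele vectors over the same haplotypes."""
    n = len(x)
    px = sum(x) / n
    py = sum(y) / n
    pxy = sum(a * b for a, b in zip(x, y)) / n
    denom = px * (1 - px) * py * (1 - py)
    if denom == 0:
        return None  # at least one marker is monomorphic, r^2 is undefined
    d = pxy - px * py
    return d * d / denom


def ld_by_distance(positions, haplotypes, bin_size):
    """Mean r^2 per distance bin.

    positions  : base-pair position of each marker
    haplotypes : haplotypes[i] is the 0/1 allele vector of marker i
    bin_size   : width of each distance bin in base pairs
    Returns a dict mapping the lower edge of each bin to the mean r^2.
    """
    bins = {}
    for i, j in combinations(range(len(positions)), 2):
        r2 = r_squared(haplotypes[i], haplotypes[j])
        if r2 is None:
            continue
        b = abs(positions[j] - positions[i]) // bin_size
        bins.setdefault(b, []).append(r2)
    return {b * bin_size: sum(v) / len(v) for b, v in sorted(bins.items())}
```

The distance at which the binned mean falls below a chosen threshold gives one rough estimate of the extent of useful LD.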
16.
Allele frequency matching between SNPs reveals an excess of linkage disequilibrium in genic regions of the human genome
Significant interest has emerged in mapping genetic susceptibility for complex traits through whole-genome association studies. These studies rely on the extent of association, i.e., linkage disequilibrium (LD), between single nucleotide polymorphisms (SNPs) across the human genome. LD describes the nonrandom association between SNP pairs and can be used as a metric when designing maximally informative panels of SNPs for association studies in human populations. Using data from the 1.58 million SNPs genotyped by Perlegen, we explored the allele frequency dependence of the LD statistic r² both empirically and theoretically. We show that average r² values between SNPs unmatched for allele frequency are always limited to much less than 1 (theoretical approximately 0.46 to 0.57 for this dataset). Frequency matching of SNP pairs provides a more sensitive measure for assessing the average decay of LD and generates average r² values across nearly the entire informative range (from 0 to 0.89 through 0.95). Additionally, we analyzed the extent of perfect LD (r² = 1.0) using frequency-matched SNPs and found significant differences in the extent of LD in genic regions versus intergenic regions. The SNP pairs exhibiting perfect LD showed a significant bias for derived, nonancestral alleles, providing evidence for positive natural selection in the human genome.
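The allele frequency dependence mentioned in this abstract follows from the definition of r^2: for fixed allele frequencies, the haplotype frequencies constrain how large D can be, so r^2 can only reach 1 when the frequencies at the two SNPs match (or are complementary). A minimal sketch of that upper bound, using the standard two-locus definitions rather than anything specific to the article (the function name `r2_max` is hypothetical):

```python
def r2_max(p_a, p_b):
    """Largest r^2 attainable between two biallelic SNPs with allele
    frequencies p_a and p_b (both strictly between 0 and 1)."""
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")

    # Largest positive and largest negative D permitted by the frequencies.
    d_pos = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    d_neg = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d = max(d_pos, d_neg)

    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
```

For example, a SNP with allele frequency 0.1 paired with one at 0.5 cannot exceed r^2 = 1/9, whereas two SNPs that both have frequency 0.2 can reach r^2 = 1.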
17.
The current pace of the generation of sequence data requires the development of software tools that can rapidly provide full annotation of the data. We have developed a new method for rapid sequence comparison using the exact match algorithm without repeat masking. As a demonstration, we have identified all perfect simple tandem repeats (STR) within the draft sequence of the human genome. The STR elements (chromosome, position, length and repeat subunit) have been placed into a relational database. Repeat flanking sequence is also publicly accessible at http://grid.abcc.ncifcrf.gov. To illustrate the utility of this complete set of STR elements, we documented the increased density of potentially polymorphic markers throughout the genome. The new STR markers may be useful in disease association studies because so many STR elements manifest multiallelic polymorphism. Also, because triplet repeat expansions are important for human disease etiology, we identified trinucleotide repeats that exist within exons of known genes. This resulted in a list that includes all 14 genes known to undergo polynucleotide expansion, and 48 additional candidates. Several of these are non-polyglutamine triplet repeats. Other examinations of the STR database demonstrated repeats spanning splice junctions and identified SNPs within repeat elements.
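To make the notion of a perfect STR record (position, length, repeat subunit) concrete, here is a small sketch that finds perfect tandem repeats in a single sequence. It uses a regular expression with a backreference, which is a stand-in technique and not the exact-match algorithm the abstract refers to. The function name, the maximum subunit length of 6 and the minimum of 3 copies are all assumptions.

```python
import re

# A subunit of 1-6 bases (lazy, so the shortest subunit is tried first),
# followed by at least two further exact copies of itself.
_STR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{2,}")


def find_perfect_strs(sequence):
    """Return a list of (start, length, subunit) for perfect tandem repeats.

    start is a 0-based offset into the sequence; matches do not overlap.
    """
    hits = []
    for match in _STR_PATTERN.finditer(sequence.upper()):
        hits.append((match.start(), len(match.group(0)), match.group(1)))
    return hits
```

For instance, `find_perfect_strs("TTACACACACTTCAGCAGCAGT")` reports an `AC` repeat of length 8 at offset 2 and a `CAG` repeat of length 9 at offset 12. A regular expression like this is convenient for short sequences but is not designed for whole-genome scale.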
18.
Assessment of linkage disequilibrium in potato genome with single nucleotide polymorphism markers (cited by 6: 0 self-citations, 6 by others)
The extent of linkage disequilibrium (LD) is an important factor in designing association mapping experiments. Unlike other plant species that have been analyzed so far for the extent of LD, cultivated potato (Solanum tuberosum L.), an outcrossing species, is a highly heterozygous autotetraploid. The favored genotypes of modern cultivars are maintained by vegetative propagation through tubers. As a first step in the LD analysis, we surveyed both coding and noncoding regions of 66 DNA fragments from 47 accessions for single nucleotide polymorphism (SNP). In the process, we combined information from the potato SNP database with experimental SNP detection. The total length of all analyzed fragments was >25 kb, and the number of screened sequence bases reached almost 1.4 million. Average nucleotide polymorphism (θ = 11.5 × 10⁻³) and diversity (π = 14.6 × 10⁻³) were high compared to the other plant species. The overall Tajima's D value (0.5) was not significant, but indicates a deficit of low-frequency alleles relative to expectation. To eliminate the possibility that an elevated D value occurs due to population subdivision, we assessed the population structure with probabilistic statistics. The analysis did not reveal any significant subdivision, indicating a relatively homogeneous population structure. However, the analysis of individual fragments revealed the presence of subgroups in the fragment closely linked to the R1 resistance gene. Data pooled from all fragments show relatively fast decay of LD in the short range (r² = 0.208 at 1 kb) but slow decay afterward (r² = 0.137 at approximately 70 kb). The estimate from our data indicates that LD in potato declines below 0.10 at a distance of approximately 10 cM. We speculate that two conflicting factors play a vital role in shaping LD in potato: the outcrossing mating type and the very limited number of meiotic generations.
19.
Oliver JL Carpena P Román-Roldán R Mata-Balaguer T Mejías-Romero A Hackenberg M Bernaola-Galván P 《Gene》2002,300(1-2):117-127
The human genome is a mosaic of isochores, which are long DNA segments (300 kbp) relatively homogeneous in G+C. Human isochores were first identified by density-gradient ultracentrifugation of bulk DNA, and differ in important features, e.g. genes are found predominantly in the GC-richest isochores. Here, we use a reliable segmentation method to partition the longest contigs in the human genome draft sequence into long homogeneous genome regions (LHGRs), thereby revealing the isochore structure of the human genome. The advantages of the isochore maps presented here are: (1) sequence heterogeneities at different scales are shown in the same plot; (2) pair-wise compositional differences between adjacent regions are all statistically significant; (3) isochore boundaries are accurately defined to single base pair resolution; and (4) both gradual and abrupt isochore boundaries are simultaneously revealed. Taking advantage of the wide sample of genome sequence analyzed, we investigate the correspondence between LHGRs and true human isochores revealed through DNA centrifugation. LHGRs show many of the typical isochore features, mainly size distribution, G+C range, and proportions of the isochore classes. The relative density of genes, Alu and long interspersed nuclear element repeats and the different types of single nucleotide polymorphisms on LHGRs also coincide with expectations in true isochores. Potential applications of isochore maps range from the improvement of gene-finding algorithms to the prediction of linkage disequilibrium levels in association studies between marker genes and complex traits. The coordinates for the LHGRs identified in all the contigs longer than 2 Mb in the human genome sequence are available at the online resource on isochore mapping: http://bioinfo2.ugr.es/isochores.
20.
Using dominance relationship coefficients based on linkage disequilibrium and linkage with a general complex pedigree to increase mapping resolution
Dominance (intralocus allelic interactions) often plays an important role in quantitative trait variation. However, few studies about dominance in QTL mapping have been reported in outbred animal or human populations. This is because common dominance effects can be predicted mainly for many full sibs, which do not often occur in outbred or natural populations with a general pedigree. Moreover, incomplete genotypes for such a pedigree make it infeasible to estimate dominance relationship coefficients between individuals. In this study, identity-by-descent (IBD) coefficients are estimated on the basis of population-wide linkage disequilibrium (LD), which makes it possible to track dominance relationships between unrelated founders. Therefore, it is possible to use dominance effects in QTL mapping without full sibs. Incomplete genotypes with a complex pedigree and many markers can be efficiently dealt with by a Markov chain Monte Carlo method for estimating IBD and dominance relationship matrices (D(RM)). It is shown by simulation that the use of D(RM) increases the likelihood ratio at the true QTL position and the mapping accuracy and power with complete dominance, overdominance, and recessive inheritance modes when using 200 genotyped and phenotyped individuals.