Similar articles
1.
Genome-wide linkage disequilibrium (LD) was investigated in a set of 32 genotypes representing salt-tolerant improved varieties and landraces and six salt-sensitive genotypes of rice, using 64 microsatellite markers, to identify genomic regions associated with salt tolerance in rice. Of the SSR marker pairs formed from the 64 markers, 36% exhibited significant LD at the 0.05 level. A few regions with high r² values on 10 chromosomes were identified as targets of selection. The model-based groups from Bayesian clustering analysis are largely consistent with the known pedigrees of the lines. The increased percentage of associated SSR loci in the improved varieties indicates a role of selection in linkage disequilibrium, especially for salt tolerance. LD extended as far as 100 cM in the present study. Many of the markers (43.8%) with significant LD values were located in genomic regions of reported QTL for salt tolerance in rice.

2.
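Construction of a microsatellite genetic linkage map of chromosome 1 in Liangshan semi-fine-wool sheep   (Cited by: 1 total, 1 self, 0 others)
Zhang Mingya  Wu Dengjun 《Hereditas (Beijing)》2005,27(4):575-578
Nine microsatellite markers on sheep chromosome 1 were selected, and a paternal half-sib family population (387 individuals in total) was used to construct a genetic linkage map of chromosome 1 in Liangshan semi-fine-wool sheep. The pedigree of the resource reference families was verified with 20 microsatellite markers. The nine markers had 5 to 15 alleles each, heterozygosity ranged from 0.202 to 0.831 with a mean of 0.617, and the mean polymorphism information content (PIC) across markers was 0.604. The resulting linkage map of chromosome 1 has a total length of 311.0 cM, which is broadly consistent with the sheep chromosome 1 linkage maps constructed by the US Meat Animal Research Center (USDA) and the international sheep mapping center (IMF). The map can be used for subsequent QTL mapping studies.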
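As a rough illustration of the two diversity statistics reported above, the following sketch computes expected heterozygosity and PIC from a marker's allele frequencies. It assumes the commonly used definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and the Botstein-style PIC formula); the abstract does not state which formulas the authors used, and the function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2) over allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (assumed Botstein-style formula):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs) - 1):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross


# Hypothetical marker with two equally frequent alleles.
print(expected_heterozygosity([0.5, 0.5]), pic([0.5, 0.5]))
```

PIC is never larger than expected heterozygosity for the same marker, which is consistent with the mean values reported in the abstract (0.604 versus 0.617).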

3.
Genomic resources are sparse in most ecologically and economically important North American hardwood species. As part of the Hardwood Genomics project (http://www.hardwoodgenomics.org/), we evaluated the utility of restriction site associated DNA sequencing (RAD-Seq) for framework genetic linkage map construction in honey locust (Gleditsia triacanthos L.), a leguminous tree common in eastern North America. Starting with a large open-pollinated family of progeny from a single tree, a mapping pedigree of 92 putative full-sibs was identified by kin group assignment and paternity analyses with microsatellite markers. RAD-Seq using Illumina next-generation DNA sequencing (NGS) generated over 117 M reads among the 92 plants. De novo reference genome clustering and alignment of samples to the reference genome revealed 5849 candidate single nucleotide polymorphisms (SNPs), of which 1570 were retained after quality filtering. Of the 1570 SNPs, 236 were in pseudo-testcross mapping configuration in the maternal parent and segregated approximately in the expected 1:1 ratio. The final map has a total length of 815.57 cM and consists of 178 markers on 14 linkage groups, corresponding to the haploid chromosome number in honey locust. Synteny and collinearity between honey locust and the model legumes Glycine max, Medicago truncatula, and Phaseolus vulgaris were found for six of the honey locust linkage groups. RAD-Seq proved to be useful for framework linkage map construction in honey locust, a species for which no genomic resources had previously been available. However, greater sequence coverage and larger full-sib mapping pedigrees are necessary for the development of high-density linkage maps with future applications in quantitative trait locus (QTL) mapping.
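The check that a pseudo-testcross marker segregates "approximately in the expected 1:1 ratio" is commonly done with a chi-square goodness-of-fit test. The sketch below assumes that approach (the abstract does not name the test used); the genotype counts are invented for illustration, and the function names are hypothetical.

```python
# Critical value of the chi-square distribution with 1 degree of freedom
# at the 0.05 significance level.
CHI2_CRIT_DF1_P05 = 3.841


def chi_square_1to1(n_a, n_b):
    """Chi-square statistic for observed counts against a 1:1 expectation."""
    total = n_a + n_b
    if total == 0:
        raise ValueError("at least one observation is required")
    expected = total / 2.0
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def fits_1to1(n_a, n_b):
    """True if the counts do not deviate significantly from 1:1 at P = 0.05."""
    return chi_square_1to1(n_a, n_b) < CHI2_CRIT_DF1_P05


# Hypothetical marker scored in 92 progeny: 50 heterozygous, 42 homozygous.
print(fits_1to1(50, 42))
```

Markers that fail such a test show segregation distortion and are usually excluded or flagged before map construction.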

4.
We analysed linkage disequilibrium (LD) in Australian Holstein-Friesian cattle by genotyping a sample of 45 bulls for 15 closely spaced microsatellites in two regions of BTA6 reported to carry important QTL for dairy traits. The order and distance of markers were based on the USDA-MARC linkage map. Frequencies of haplotypes were estimated using the E-M approach and a more computationally intensive Bayesian approach as implemented in PHASE. LD was then estimated using the Hedrick multiallelic extension of Lewontin's normalised coefficient D'. Estimates of D' from the two approaches were in close agreement (r = 0.91). The mean estimates of D' for marker pairs with an inter-marker distance of less than 5 cM (n = 13) are 0.57 and 0.51, and for distances of more than 20 cM (n = 44) are 0.29 and 0.17, estimated from the E-M and Bayesian approaches, respectively. The Malecot model was fitted for the exponential decline of LD with map distance between markers. The swept radii (the distance at which LD has declined to 1/e (~37%) of its initial value) are 11.6 and 13.7 cM for the above two methods, respectively. The Malecot model was also fitted using map distance in Mb from the bovine integrated map (bovine location database, bLDB) in addition to cM from the MARC map. Overall, the results indicate a high level of LD on chromosome 6 in Australian dairy cattle.
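To make the swept radius concrete, the sketch below evaluates an exponential-decay curve of the form usually written for the Malecot model, rho(d) = (1 - L) * M * exp(-eps * d) + L, for which the swept radius is 1/eps. The parameter names eps, M and L follow common usage and are assumptions here, since the abstract does not give the fitted equation or the values of M and L.

```python
import math


def malecot_ld(d, eps, M=1.0, L=0.0):
    """Expected association at map distance d under an assumed Malecot form.

    eps: rate of exponential decline per unit distance
    M:   association at distance zero
    L:   residual association at large distance
    """
    return (1.0 - L) * M * math.exp(-eps * d) + L


def swept_radius(eps):
    """Distance at which the decaying component falls to 1/e of its start."""
    return 1.0 / eps


# Using the E-M swept radius from the abstract (11.6 cM); M = 1 and L = 0
# are illustrative defaults, not fitted values.
eps = 1.0 / 11.6
print(swept_radius(eps), malecot_ld(11.6, eps))
```

With L = 0 and M = 1, the value at d = 11.6 cM is exp(-1), matching the "~37% of its initial value" definition in the abstract.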

5.
Previously, we have reported linkage of markers from chromosome 1q22 to schizophrenia, a finding supported by several independent studies. We have now examined the region of strongest linkage for evidence of linkage disequilibrium (LD) in a sample of 24 Canadian familial-schizophrenia pedigrees. Analysis of 14 microsatellites and 15 single-nucleotide polymorphisms (SNPs) from the 5.4-Mb region between D1S1653 and D1S1677 produced significant evidence (nominal P<.05) of LD between schizophrenia and 2 microsatellites and 6 SNPs. All of the markers exhibiting significant LD to schizophrenia fall within the genomic extent of the gene for carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase (CAPON), making it a prime positional candidate for the schizophrenia-susceptibility locus on 1q22, although initial mutation analysis of this gene has not identified any schizophrenia-associated changes within exons. Consistent with several recently identified candidate genes for schizophrenia, CAPON is involved in signal transduction in the NMDA receptor system, highlighting the potential importance of this pathway in the etiology of schizophrenia.

6.
Tom Druet  Michel Georges 《Genetics》2010,184(3):789-798
Faithful reconstruction of haplotypes from diploid marker data (phasing) is important for many kinds of genetic analyses, including mapping of trait loci, prediction of genomic breeding values, and identification of signatures of selection. In human genetics, phasing most often exploits population information (linkage disequilibrium), while in animal genetics the primary source of information is familial (Mendelian segregation and linkage). We herein develop and evaluate a method that simultaneously exploits both sources of information. It builds on hidden Markov models that were initially developed to exploit population information only. We demonstrate that the approach improves the accuracy of allele phasing as well as imputation of missing genotypes. Reconstructed haplotypes are assigned to hidden states that are shown to correspond to clusters of genealogically related chromosomes. We show that these cluster states can directly be used to fine map QTL. The method is computationally effective at handling large data sets based on high-density SNP panels.

Array technology now allows genotyping of large cohorts for thousands to millions of single nucleotide polymorphisms (SNPs), which are becoming available for a growing list of organisms including human and domestic animals. Among other applications, these advances permit systematic scanning of the genome to map trait loci by association (e.g., Wellcome Trust Case Control Consortium 2007; Charlier et al. 2008), to predict genomic breeding values for complex traits (Meuwissen et al. 2001; Goddard and Hayes 2009), or to identify signatures of selection (e.g., Voight et al. 2006).

Present-day genotyping platforms do not directly provide information about linkage phase; i.e., co-inherited alleles at adjacent heterozygous markers (haplotypes) are not identified as such. As haplotype information may considerably empower genetic analyses, indirect phasing strategies have been devised: haplotypes can be reconstructed from unphased genotypes using familial information (Mendelian segregation and linkage) and/or population information (linkage disequilibrium, LD, and surrogate parents) (e.g., Windig and Meuwissen 2004; Scheet and Stephens 2006; Kong et al. 2008).

Haplotype-based approaches are routinely applied in animal genetics for combined linkage and LD mapping of QTL (e.g., Meuwissen and Goddard 2000; Blott et al. 2003). In these studies, phasing has so far relied on familial information provided by the extended pedigrees typical of livestock (e.g., Windig and Meuwissen 2004). This approach, however, leaves a nonnegligible proportion of genotypes unphased, especially for the less connected individuals. After phasing, identity-by-descent (IBD) probabilities conditional on haplotype data (needed for QTL mapping) are computed for all chromosome pairs, using familial as well as population information (hence combined linkage and LD mapping, L + LD) (e.g., Meuwissen and Goddard 2001). However, the use of high-density SNP chips and the analysis of ever larger cohorts render the computation of pairwise IBD probabilities a bottleneck.

We herein propose a more efficient, heuristic approach based on hidden Markov models (HMM). It simultaneously phases and sorts haplotypes in clusters that can be used directly for mapping or other purposes. The proposed method exploits familial as well as population information, and imputes missing genotypes. We herein describe the accuracy of the proposed method and its use for L + LD mapping of QTL.

7.
Spinal muscular atrophy (SMA) is a common autosomal recessive disorder resulting in loss of motor neurons. We have performed linkage analysis on a panel of families using nine markers that are closely linked to the SMA gene. The highest lod score was obtained with the marker D5S351 (Zmax = 10.04 at θ = 0 excluding two unlinked families, and Zmax = 8.77 at θ = 0.007 with all families). One type III family did not show linkage to the 5q13 markers, and in one type I consanguineous family the affected individual did not show homozygosity except for the marker D5S435. Three recombinants were identified with the closest centromeric marker, D5S435, which position the gene telomeric of this marker. These recombinants will facilitate finer mapping of the location of the SMA gene. Lastly, two families provide strong evidence for a remarkable variability in presentation of the SMA phenotype, with the age at onset in one family varying from 17 months to 13 years.
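For readers unfamiliar with lod scores, the following sketch computes a two-point lod score in the simplest textbook setting: fully informative, phase-known meioses, where Z(θ) = log10[θ^r (1 − θ)^(n − r) / 0.5^n]. This is an assumption made for illustration; real pedigree analyses such as the one above use likelihood software that handles unknown phase and missing data, and the counts below are invented.

```python
import math


def lod_score(theta, n_meioses, n_recombinants):
    """Two-point lod score for phase-known, fully informative meioses.

    theta is the recombination fraction being tested (0 <= theta <= 0.5).
    Returns -inf when the data are impossible under theta, e.g. an observed
    recombinant when theta = 0.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    r = n_recombinants
    n = n_meioses
    likelihood = (theta ** r) * ((1.0 - theta) ** (n - r))
    if likelihood == 0.0:
        return float("-inf")
    return math.log10(likelihood) - n * math.log10(0.5)


# Ten informative meioses with no recombinants, evaluated at theta = 0.
print(lod_score(0.0, 10, 0))
```

Under this simple model each non-recombinant meiosis contributes log10(2), about 0.30, to the lod score at θ = 0, while a single recombinant excludes θ = 0 entirely.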

8.
Alan R. Rogers 《Genetics》2014,197(4):1329-1341
The “LD curve” relates the linkage disequilibrium (LD) between pairs of nucleotide sites to the distance that separates them along the chromosome. The shape of this curve reflects natural selection, admixture between populations, and the history of population size. This article derives new results about the last of these effects. When a population expands in size, the LD curve grows steeper, and this effect is especially pronounced following a bottleneck in population size. When a population shrinks, the LD curve rises but remains relatively flat. As LD converges toward a new equilibrium, its time path may not be monotonic. Following an episode of growth, for example, it declines to a low value before rising toward the new equilibrium. These changes happen at different rates for different LD statistics. They are especially slow for estimates of σ_d², which therefore allow inferences about ancient population history. For the human population of Europe, these results suggest a history of population growth.

9.
Theo Meuwissen  Mike Goddard 《Genetics》2010,185(4):1441-1449
A novel method, called linkage disequilibrium multilocus iterative peeling (LDMIP), for the imputation of phase and missing genotypes is developed. LDMIP performs an iterative peeling step for every locus, which accounts for the family data, and uses a forward–backward algorithm to accumulate information across loci. Marker similarity between haplotype pairs is used to impute possible missing genotypes and phases, which relies on the linkage disequilibrium between closely linked markers. After this imputation step, the combined iterative peeling/forward–backward algorithm is applied again, until convergence. The calculations per iteration scale linearly with number of markers and number of individuals in the pedigree, which makes LDMIP well suited to large numbers of markers and/or large numbers of individuals. Per iteration calculations scale quadratically with the number of alleles, which implies biallelic markers are preferred. In a situation with up to 15% randomly missing genotypes, the error rate of the imputed genotypes was <1% and ∼99% of the missing genotypes were imputed. In another example, LDMIP was used to impute whole-genome sequence data consisting of 17,321 SNPs on a chromosome. Imputation of the sequence was based on the information of 20 (re)sequenced founder individuals and genotyping their descendants for a panel of 3000 SNPs. The error rate of the imputed SNP genotypes was 10%. However, if the parents of these 20 founders are also sequenced, >99% of missing genotypes are imputed correctly.

High-density SNP arrays are currently available for an increasing number of species. QTL mapping, marker-assisted selection (MAS), and other genetic analyses often require or benefit greatly from imputing missing genotypes and from knowing the phase of the SNP genotypes. Although many statistical methods have been developed for phasing in the literature, new methods are needed because of the spectacular improvements in the efficiency of high-throughput genotyping. The older phasing methods often use linkage information (e.g., Sobel and Lange 1996). However, due to the use of increasingly dense marker maps, the use of linkage disequilibrium (LD) information has become increasingly attractive. Moreover, the linkage analysis methods became computationally intractable as the number of SNPs and/or the number of individuals increased. Phasing methods that rely solely on LD, such as Fastphase (Scheet and Stephens 2006), tend to mistakenly introduce recombinations when applied to genotypes covering long genetic distances (Kong et al. 2008). Linkage information and use of LD are not fundamentally different: use of LD may be thought of as linkage analysis based on common ancestors that occur before the known pedigree. Kong et al. suggested a new approach called “long-range phasing (LRP)” that relies on detecting identical-by-descent (IBD) haplotypes in different individuals and can phase large numbers of SNPs. For situations where the pedigree back to the common ancestor is available this may be considered linkage analysis but it can also be used without a pedigree and would use LD information. Long-range phasing may be seen as a set of sensible, heuristic rules to determine the phase using linkage analysis information, without attempting to extract all information in an optimal way. The latter is typical for modern linkage-based phasing methods: they are less concerned about optimally using all information, since there is a surplus of information, and are more concerned with handling high SNP densities over large genetic distances and for many genotyped individuals. The surplus of information is especially large if whole-genome sequence data are used. We define here whole-genome (re)sequence data as all the SNPs in the genome, which ignores information from copy number variation and other non-SNP genetic polymorphisms.

Here we describe a new phasing method, called linkage disequilibrium multilocus iterative peeling (LDMIP), which combines linkage and linkage disequilibrium information and can handle tens of thousands of SNPs per chromosome and thousands of individuals. It was initially developed for the common situation where many individuals at the top of the pedigree are ungenotyped, but the method is general and can be applied in other situations. For instance, we apply it here to the situation where a few individuals have very dense genetic information (e.g., from genome sequencing) while most individuals have sparser or no genotype data. To make optimum use of the known pedigree, family information is used quite extensively in an iterative peeling approach (Elston and Stewart 1971; Van Arendonk et al. 1989; Janss et al. 1995). The use of LD information crudely follows the approach of Meuwissen and Goddard (2001, 2007).
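The forward–backward algorithm mentioned above can be illustrated on a toy hidden Markov model. The sketch below is a generic textbook version, not LDMIP itself: it assumes two hidden states (for example, which of two parental haplotypes was inherited at each locus), a fixed transition matrix standing in for recombination between adjacent loci, and an emission matrix standing in for genotyping error. All names and numbers are hypothetical.

```python
def forward_backward(init, trans, emit, observations):
    """Posterior state probabilities at each position of a discrete HMM.

    init[s]     : prior probability of state s at the first locus
    trans[k][s] : probability of moving from state k to state s
    emit[s][o]  : probability of observing symbol o in state s
    Assumes every observation has non-zero probability under the model.
    """
    n_states = len(init)
    n_obs = len(observations)

    # Forward pass, rescaled at each step to avoid numerical underflow.
    fwd = []
    prev = None
    for t, obs in enumerate(observations):
        if t == 0:
            cur = [init[s] * emit[s][obs] for s in range(n_states)]
        else:
            cur = [
                emit[s][obs] * sum(prev[k] * trans[k][s] for k in range(n_states))
                for s in range(n_states)
            ]
        scale = sum(cur)
        cur = [v / scale for v in cur]
        fwd.append(cur)
        prev = cur

    # Backward pass, also rescaled.
    bwd = [[1.0] * n_states for _ in range(n_obs)]
    for t in range(n_obs - 2, -1, -1):
        nxt = observations[t + 1]
        cur = [
            sum(trans[s][k] * emit[k][nxt] * bwd[t + 1][k] for k in range(n_states))
            for s in range(n_states)
        ]
        scale = sum(cur)
        bwd[t] = [v / scale for v in cur]

    # Combine both passes and normalise per position.
    posteriors = []
    for f, b in zip(fwd, bwd):
        raw = [fi * bi for fi, bi in zip(f, b)]
        total = sum(raw)
        posteriors.append([v / total for v in raw])
    return posteriors


# Toy example: alleles 0,0,1,1 observed along four loci.
post = forward_backward(
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    emit=[[0.9, 0.1], [0.1, 0.9]],
    observations=[0, 0, 1, 1],
)
print(post)
```

The work per locus depends only on the number of hidden states, so the total cost grows linearly with the number of loci, which is in line with the linear scaling in markers described in the abstract.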

10.

Background

The genome sequence and a high-density SNP map are now available for the chicken and can be used to identify genetic markers for use in marker-assisted selection (MAS). Effective MAS requires high linkage disequilibrium (LD) between markers and quantitative trait loci (QTL), and sustained marker-QTL LD over generations. This study used data from a 3,000 SNP panel to assess the level and consistency of LD between single nucleotide polymorphisms (SNPs) over consecutive years in two egg-layer chicken lines, and analyzed one line by two methods (SNP-wise association and genome-wise Bayesian analysis) to identify markers associated with egg-quality and egg-production phenotypes.

Results

The LD between marker pairs was high at short distances (r² > 0.2 at < 2 Mb) and remained high after one generation (correlations of 0.80 to 0.92 at < 5 Mb) in both lines. Single- and 3-SNP regression analyses using a mixed model with SNP as fixed effect resulted in 159 and 76 significant tests (P < 0.01), respectively, across 12 traits. BayesB, a Bayesian analysis that fits all SNPs simultaneously as random effects and uses model averaging procedures, identified 33 SNPs that were included in the model >20% of the time (φ > 0.2) and an additional ten 3-SNP windows that had a sum of φ greater than 0.35. Generally, SNPs included in the Bayesian model also had a small P-value in the 1-SNP analyses.

Conclusion

High LD correlations between markers at short distances across two generations indicate that such markers will retain high LD with linked QTL and be effective for MAS. The different association analysis methods used provided consistent results. Multiple single SNPs and 3-SNP windows were significantly associated with egg-related traits, providing genomic positions of QTL that can be useful both for MAS and for identifying causal mutations.

11.
Observed linkage disequilibrium (LD) between genetic markers in different populations descended independently from a common ancestral population can be used to estimate their absolute time of divergence, because the correlation of LD between populations will be reduced each generation by an amount that, approximately, depends only on the recombination rate between markers. Although drift leads to divergence in allele frequencies, it has less effect on divergence in LD values. We derived the relationship between LD and time of divergence and verified it with coalescent simulations. We then used HapMap Phase II data to estimate time of divergence between human populations. Summed over large numbers of pairs of loci, we find a positive correlation of LD between African and non-African populations at levels of up to ~0.3 cM. We estimate that the observed correlation of LD is consistent with an effective separation time of approximately 1,000 generations or ~25,000 years before present. The most likely explanation for such relatively low separation times is the existence of substantial levels of migration between populations after the initial separation. Theory and results from coalescent simulations confirm that low levels of migration can lead to a downward bias in the estimate of separation time.

12.
Genetic linkage heterogeneity in the fragile X syndrome   (Cited by: 8 total, 0 self, 8 others)
Summary: Genetic linkage between a factor IX DNA restriction fragment length polymorphism (RFLP) and the fragile X chromosome marker was analyzed in eight fragile X pedigrees and compared to eight previously reported pedigrees. A large pedigree with apparently full penetrance in all male members showed a high frequency of recombination. A lod score of -7.39 at θ = 0 and a maximum score of 0.26 at θ = 0.32 were calculated. A second large pedigree with a nonpenetrant male showed tight linkage with a maximum lod score of 3.13 at θ = 0, a result similar to one large pedigree with a nonpenetrant male previously reported. The differences in lod scores seen in these large pedigrees suggested there was genetic heterogeneity in linkage between families which appeared to relate to the presence of nonpenetrant males. The combined lod score for the three pedigrees with nonpenetrant males was 6.84 at θ = 0. For the 13 other pedigrees without nonpenetrant males the combined lod score was -21.81 at θ = 0, with a peak of 0.98 at θ = 0.28. When lod scores from all 16 families were combined, the value was -15.14 at θ = 0 and the overall maximum was 5.13 at θ = 0.17.

To determine whether genetic heterogeneity was present, three statistical tests for heterogeneity were employed. First, a predivided-sample test was used. The 16 pedigrees were divided into two classes, NP and P, based upon whether or not any nonpenetrant males were detected in the pedigree. This test gave evidence for significant genetic heterogeneity whether the three large pedigrees with seven or more informative males (P<0.005), the eight pedigrees with three informative males (P<0.001), or all 16 pedigrees (P<0.001) were included in the analysis. Second, Morton's large sample test was employed. Significant heterogeneity was present when the analysis was restricted to the three large pedigrees (P<0.025), or to the eight pedigrees with informative males (P<0.05) but not when smaller, less informative pedigrees were also included. Third, an admixture test for heterogeneity was employed which tests for linkage versus no linkage. A trend toward significance was seen (0.05<P<0.10) which increased when the analysis was restricted to the larger, more informative pedigrees.

The pedigrees where nonpenetrant males are detected appear to constitute one class (NP) where tight linkage to factor IX is predicted. The pedigrees where full penetrance is present appear to constitute a second class (P) where loose linkage to factor IX is predicted. Either the chromosomal location of the mutation or suppression of recombination to nearby genes may be different in the two classes of pedigrees. In the NP class of fra X pedigrees, information from DNA analysis should be useful for carrier detection, prenatal diagnosis, and genetic counseling.

13.
Linkage disequilibrium (LD) content was calculated for the Genetic Analysis Workshop 14 Affymetrix and Illumina single-nucleotide polymorphism (SNP) genome scans of the Collaborative Study on the Genetics of Alcoholism (COGA) samples. Pair-wise LD was measured as both D' and r² on 505 pedigree founder individuals. The r² estimates were then used to correct the multipoint identity by descent (MIBD) matrix calculation to account for LD, and LOD scores on chromosomes 3 and 18 were calculated for COGA's ttdt3 electrophysiological trait using those MIBDs. Extensive LD was observed throughout both marker sets, and it was higher in Affymetrix's denser SNP map. However, SNP density did not solely account for Affymetrix's higher LD. MIBD estimation procedures assume linkage equilibrium to construct genotypes of non-genotyped pedigree founder individuals, and dense SNP genotyping maps are likely to contain moderate to high LD between markers. LOD score plots calculated after correction for LD followed the same general pattern as uncorrected ones. Since almost half of the pedigree founders in our study were genotyped, it is possible that LD had only a minor impact on the LOD scores. Caution should probably be taken when using high-density SNP maps when many non-genotyped founders are present in the study pedigrees.
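The two pairwise measures named above have simple closed forms for a pair of biallelic SNPs when haplotype frequencies are known. The sketch below uses the standard definitions: D = p_AB − p_A p_B, D' = |D| / D_max, and r² = D² / [p_A (1 − p_A) p_B (1 − p_B)]. It assumes phased haplotype frequencies are already available, which in practice would have to be estimated from genotypes; the function name and example frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    # D_max depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Allele B (frequency 0.2) occurs only on haplotypes carrying allele A.
print(pairwise_ld(p_ab=0.2, p_a=0.5, p_b=0.2))
```

In this example D' reaches its maximum while r² stays well below 1, which shows why the two measures can rank marker pairs differently when allele frequencies are unequal.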

14.
Genotype data from the Illumina Linkage III SNP panel (n = 4,720 SNPs) and the Affymetrix 10 k mapping array (n = 11,120 SNPs) were used to test the effects of linkage disequilibrium (LD) between SNPs in a linkage analysis in the Collaborative Study on the Genetics of Alcoholism pedigree collection (143 pedigrees; 1,614 individuals). The average r² between adjacent markers across the genetic map was 0.099 ± 0.003 in the Illumina III panel and 0.17 ± 0.003 in the Affymetrix 10 k array. In order to determine the effect of LD between marker loci in a nonparametric multipoint linkage analysis, markers in strong LD with another marker (r² > 0.40) were removed (n = 471 loci in the Illumina panel; n = 1,804 loci in the Affymetrix panel) and the linkage analysis results were compared to the results using the entire marker sets. In all analyses using the ALDX1 phenotype, 8 linkage regions on 5 chromosomes (2, 7, 10, 11, X) were detected (peak markers p < 0.01), and the Illumina panel detected an additional region on chromosome 6. Analysis of the same pedigree set and ALDX1 phenotype using short tandem repeat markers (STRs) resulted in 3 linkage regions on 3 chromosomes (peak markers p < 0.01). These results suggest that in this pedigree set, LD between loci with spacing similar to the SNP panels tested may not significantly affect the overall detection of linkage regions in a genome scan. Moreover, since the data quality and information content are greatly improved in the SNP panels over STR genotyping methods, new linkage regions may be identified due to higher information content and data quality in a dense SNP linkage panel.
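The marker-removal step described above (dropping markers in strong LD, r² > 0.40, with another marker) can be sketched as a greedy pruning pass. This is only one plausible way to do it and is not taken from the study: it assumes genotypes coded as 0/1/2 allele counts, approximates r² by the squared Pearson correlation of those counts, and keeps the first marker of each correlated group. The marker names and genotype vectors are invented.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker carries no correlation signal
    return sxy * sxy / (sxx * syy)


def prune_by_ld(genotypes, threshold=0.40):
    """Greedy pruning: keep a marker only if its r^2 with every marker
    already kept is at or below the threshold.

    genotypes maps marker name -> list of 0/1/2 allele counts, in map order.
    """
    kept = []
    for name, dosages in genotypes.items():
        if all(pearson_r2(dosages, genotypes[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical data: snp2 duplicates snp1, snp3 is uncorrelated with snp1.
example = {
    "snp1": [0, 1, 2, 1, 0, 2],
    "snp2": [0, 1, 2, 1, 0, 2],
    "snp3": [1, 0, 1, 2, 1, 1],
}
print(prune_by_ld(example))
```

Comparing every candidate with all kept markers is quadratic in the number of markers; for panels of thousands of SNPs, restricting the comparison to a window of nearby markers would be the usual refinement.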

15.
Rhesus and cynomolgus macaques are frequently used in biomedical research, and the availability of their reference genomes now provides for their use in genome-wide association studies. However, little is known about linkage disequilibrium (LD) in their genomes, which can affect the design and success of such studies. Here we studied LD by using 1781 conserved single-nucleotide polymorphisms (SNPs) in 183 rhesus macaques (Macaca mulatta), including 97 purebred Chinese and 86 purebred Indian animals, and 96 cynomolgus macaques (M. fascicularis fascicularis). Correlation between loci pairs decayed to 0.02 at 1146.83, 2197.92, and 3955.83 kb for Chinese rhesus, Indian rhesus, and cynomolgus macaques, respectively. Differences between the observed heterozygosity and minor allele frequency (MAF) of pairs of these 3 taxa were highly statistically significant. These 3 nonhuman primate taxa have significantly different genetic diversities (heterozygosity and MAF) and rates of LD decay. Our study confirms a much lower rate of LD decay in Indian than in Chinese rhesus macaques relative to that previously reported. In contrast, the especially low rate of LD decay in cynomolgus macaques suggests the particular usefulness of this species in genome-wide association studies. Although conserved markers, such as those used here, are required for valid LD comparisons among taxa, LD can be assessed with less bias by using species-specific markers, because conserved SNPs may be ancestral and therefore not informative for LD.Abbreviations: GWAS, genome-wide association study; LD, linkage disequilibrium; MAF, minor allele frequencyContributing to the widespread use of nonhuman primates in biomedical research, captive-breeding programs such as those of the National Primate Research Center system in the United States were established initially by using animals imported from Asia. 
The 2 most commonly used primates are rhesus macaques (Macaca mulatta) and long-tailed or cynomolgus macaques (M. fascicularis fascicularis).After humans, rhesus macaques are the most widely distributed primate species.37,38 This species is found throughout mainland Asia, ranging from Afghanistan to India and eastward through Thailand and southern China to the Yellow Sea.31,34 In addition to their significant morphological differences,9 rhesus macaques of Indian and Chinese origins have been demonstrated to exhibit significant phenotypic differences that are directly relevant to their use as biomedical models in experimental studies.2,23,42 Cynomolgus macaques are found south of the subtropical and temperate geographic distributions of rhesus macaques, in the south and southeast Indo-Malayan regions.8,10The 2 species share a common ancestor that lived 1 to 2 million years ago.3,13,25 This ancestral population of rhesus macaques diverged from a fascicularis-like ancestor shared in common with both rhesus and cynomolgus macaques after cynomolgus macaques expanded from their homeland in Indonesia.36 For this reason, genetic markers present in Indian rhesus macaques are either highly derived or are conserved as ancestral markers shared with Chinese rhesus macaques. 
The interspecific boundaries of rhesus and cynomolgus macaques are delineated by a narrow zone of parapatry in northern Indochina [7,8,10], within which male-biased gene flow [37,39] and relatively high, but highly variable, levels of gene introgression [32] have occurred from rhesus to cynomolgus macaque groups [37,39]. Because cynomolgus macaques originated in Indonesia [36] and because rhesus macaques probably diverged from cynomolgus macaques in southwestern China [11], genetic markers shared between Indonesian cynomolgus macaques and Chinese rhesus macaques comprise a unique set of markers that are conserved in both macaque species.

The wide assortment of morphometric differences [8,9] and the broad geographic distribution of these 2 macaque species foster an expectation of high genetic diversity within and between them that could be exploited for mapping genes responsible for phenotypic differences between taxa. A better understanding of linkage disequilibrium (LD) in these nonhuman primate species can lead to a more informed selection of study subjects for, and more efficient conduct of, genome-wide association studies (GWAS) of particular diseases that macaques share with humans. LD is the nonrandom association of alleles at 2 or more adjacent loci that descend from single, ancestral chromosomes [29]. LD plays a critical role in gene mapping, both as a tool for fine mapping of complex disease genes and in GWAS-based approaches. GWAS facilitate the identification of genes associated with complex and common traits or diseases by examining LD estimates among large numbers of common genetic variants, typically single-nucleotide polymorphisms (SNPs), between pairs of different groups of subjects to determine whether any variant is associated with a trait or disease of interest. Because LD makes tightly linked variants strongly correlated, it underpins successful association studies. For instance, LD reduces the number of markers and the sample size of study subjects required to map genes influencing phenotypes to the genome because markers in LD are linked and inherited together [13]. In addition, differences in LD can be used to identify orthologs for detecting the signatures of selective sweeps [21], as defined by dN/dS ratios obtained through the McDonald–Kreitman neutrality test [24]. Furthermore, LD assessments can provide a more complete understanding of genome structure by defining the boundaries of haplotype blocks, within which recombination is rare or absent and which are separated by recombination ‘hotspots’ in genomes [43].

Evidence from a study based on 1476 SNPs identified in ENCODE regions of the Indian rhesus macaque genome [13] indicated that the rate of LD decay is higher in Chinese than in Indian rhesus macaques due to a hypothesized genetic bottleneck experienced by Indian rhesus macaques after diverging from the eastern subspecies and, therefore, that Indian rhesus macaques, having higher LD, may be more useful for GWAS than Chinese rhesus macaques. In that study [13], only 33% of the SNPs were shared between the 2 subspecies, with Chinese rhesus macaques contributing more than 60% of the remaining rhesus SNPs. Conversely, another study [41] reported a slower rate of decay of LD in 25 Chinese than in 25 Indian rhesus macaques on the basis of 4040 SNPs, only 2% of which fell in coding regions, but 68% of those SNPs were shared between the 2 subspecies, with Indian rhesus macaques contributing almost 60% of the remaining SNPs. The marked disparity between the 2 studies in the proportions of shared SNPs used, the subspecies with the most genetic diversity, the sample size of Chinese rhesus macaques, the proportions of SNPs located in or near coding regions that are subject to functional constraints, and the greater disparity in LD decay between the 2 subspecies of rhesus macaques might reflect biases in either or both studies. For example, the use of markers whose frequencies are uncharacteristically low in one subspecies relative to the other can lead to underestimates of the rate of LD decay because lower-frequency alleles, on average, are younger and have experienced less time for recombination [26]. To avoid the influence of such ascertainment biases, comparisons of LD between 2 taxa should involve only SNPs conserved in both taxa. Moreover, because 2 taxa alone cannot support a phylogenetic or cladistic analysis that assigns specific SNPs an origin on one phylogenetic line or the other, comparing just the Indian and Chinese rhesus macaques without an additional primate taxon makes it difficult to establish polarity and distinguish between derived and conserved SNPs. This limitation likely led to the contradictory conclusions of the 2 previously cited studies [13,41] regarding the rate of LD decay in Chinese and Indian rhesus macaques.

Because rhesus and cynomolgus macaques share a common fascicularis-like ancestor, a comparison of heterospecific SNPs among cynomolgus, Indian rhesus, and Chinese rhesus macaques would likely be fundamental to inferences regarding genome-wide LD estimates. The objective of the present study was to evaluate the conclusions of previous studies [13,41] by using our panel of 1781 autosomal SNPs that are conserved in both rhesus and cynomolgus macaques to estimate the rates at which genome-wide LD decays in Indian and Chinese rhesus macaques and cynomolgus macaques, the species ancestral to rhesus macaques, and to evaluate the suitability of these populations for GWAS.
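
The abstract above defines LD as the nonrandom association of alleles at nearby loci but does not say which statistic the study used. As a minimal sketch, assuming the common r² measure for two biallelic SNPs and assuming phased haplotypes coded 0/1 are available, the pairwise value can be computed as follows. The function name `ld_r2` and the input layout are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def ld_r2(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 between two biallelic loci.

    Each haplotype is a pair (a, b) holding the allele (0 or 1)
    carried at locus A and at locus B on one phased chromosome.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    # Frequencies of allele 1 at each locus and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D is the deviation of the observed haplotype frequency from the
    # value expected if the two loci were independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom
```

Under these assumptions, a sample in which allele 1 at one locus always travels with allele 1 at the other gives r² = 1, and a sample in which all four haplotypes are equally frequent gives r² = 0. LD decay is then usually summarized by plotting such pairwise values against the distance between the SNPs.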

16.
17.
Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.
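
The gene-counting idea mentioned in this abstract can be illustrated in a much simpler setting than the one the authors address. The sketch below is not Merlin's algorithm: it assumes unrelated individuals rather than pedigrees, only two biallelic SNPs, and unphased genotypes coded as the count (0, 1, or 2) of allele 1 at each SNP. The function name `em_haplotype_freqs` is hypothetical. Only double heterozygotes are phase-ambiguous, so each iteration splits them between the two possible phases in proportion to the current haplotype frequency estimates.

```python
from typing import List, Sequence, Tuple


def em_haplotype_freqs(
    genotypes: Sequence[Tuple[int, int]],
    n_iter: int = 100,
    tol: float = 1e-10,
) -> List[float]:
    """Estimate frequencies of haplotypes 00, 01, 10, 11 by gene counting."""
    if not genotypes:
        raise ValueError("at least one genotype is required")

    base = [0.0] * 4  # counts of haplotypes whose phase is unambiguous
    n_dh = 0          # number of double heterozygotes (phase unknown)
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_dh += 1
            continue
        # At most one locus is heterozygous, so the two chromosomes
        # can be written down directly.
        a = (1 if g1 >= 1 else 0, 1 if g1 == 2 else 0)
        b = (1 if g2 >= 1 else 0, 1 if g2 == 2 else 0)
        for allele_a, allele_b in zip(a, b):
            base[2 * allele_a + allele_b] += 1

    total = 2 * len(genotypes)
    freqs = [0.25] * 4
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11
        # rather than 01/10, given the current frequencies.
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: add the expected counts and renormalize.
        counts = base[:]
        counts[0] += n_dh * w
        counts[3] += n_dh * w
        counts[1] += n_dh * (1 - w)
        counts[2] += n_dh * (1 - w)
        new = [c / total for c in counts]

        converged = max(abs(x - y) for x, y in zip(new, freqs)) < tol
        freqs = new
        if converged:
            break
    return freqs
```

In this toy setting the estimates always sum to 1. Extending the idea to pedigrees requires summing over inheritance patterns, which this sketch does not attempt.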

18.

Background

Characterization of viruses in HIV-1 transmission pairs will help identify biological determinants of infectiousness and evaluate candidate interventions to reduce transmission. Although HIV-1 sequencing is frequently used to substantiate linkage between newly HIV-1 infected individuals and their sexual partners in epidemiologic and forensic studies, viral sequencing is seldom applied in HIV-1 prevention trials. The Partners in Prevention HSV/HIV Transmission Study (ClinicalTrials.gov #NCT00194519) was a prospective randomized placebo-controlled trial that enrolled serodiscordant heterosexual couples to determine the efficacy of genital herpes suppression in reducing HIV-1 transmission; as part of the study analysis, HIV-1 sequences were examined for genetic linkage between seroconverters and their enrolled partners.

Methodology/Principal Findings

We obtained partial consensus HIV-1 env and gag sequences from blood plasma for 151 transmission pairs and performed deep sequencing of env in some cases. We analyzed sequences with phylogenetic techniques and developed a Bayesian algorithm to evaluate the probability of linkage. For linkage, we required monophyletic clustering between enrolled partners' sequences and a Bayesian posterior probability of ≥50%. Adjudicators classified each seroconversion, finding 108 (71.5%) linked, 40 (26.5%) unlinked, and 3 (2.0%) indeterminate transmissions, with linkage determined by consensus env sequencing in 91 (84%). Male seroconverters had a higher frequency of unlinked transmissions than female seroconverters. The likelihood of transmission from the enrolled partner was related to time on study, with increasing numbers of unlinked transmissions occurring after longer observation periods. Finally, baseline viral load was found to be significantly higher among linked transmitters.

Conclusions/Significance

In this first use of HIV-1 sequencing to establish endpoints in a large clinical trial, more than one-fourth of transmissions were unlinked to the enrolled partner, illustrating the relevance of these methods in the design of future HIV-1 prevention trials in serodiscordant couples. A hierarchy of sequencing techniques, analysis methods, and expert adjudication contributed to the linkage determination process.

19.

Background

Both genome-wide association (GWA) studies and genomic selection depend on the level of non-random association of alleles at different loci, i.e. linkage disequilibrium (LD), across the genome. Therefore, characterizing LD is of fundamental importance to implement both approaches. In this study, using a 60K single nucleotide polymorphism (SNP) panel, we estimated LD and haplotype structure in crossbred broiler chickens and their component pure lines (one male and two female lines) and calculated the consistency of LD between these populations.

Results

The average level of LD (measured by r²) between adjacent SNPs across the chicken autosomes studied here ranged from 0.34 to 0.40 in the pure lines but was only 0.24 in the crossbred populations, with 28.4% of adjacent SNP pairs having an r² higher than 0.3. Compared with the pure lines, the crossbred populations consistently showed a lower level of LD, smaller haploblock sizes and lower haplotype homozygosity on macro-, intermediate and micro-chromosomes. Furthermore, correlations of LD between markers at short distances (0 to 10 kb) were high between crossbred and pure lines (0.83 to 0.94).

Conclusions

Our results suggest that using crossbred populations instead of pure lines can be advantageous for high-resolution QTL (quantitative trait loci) mapping in GWA studies and to achieve good persistence of accuracy of genomic breeding values over generations in genomic selection. These results also provide useful information for the design and implementation of GWA studies and genomic selection using crossbred populations.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0098-4) contains supplementary material, which is available to authorized users.
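
The abstract above reports correlations of LD between crossbred and pure lines without stating how they were computed. A minimal sketch, assuming consistency is measured as the Pearson correlation of signed r values for the same SNP pairs in two populations, is given below. The function name `ld_consistency` and the input lists are hypothetical.

```python
from math import sqrt
from typing import Sequence


def ld_consistency(r_pop1: Sequence[float], r_pop2: Sequence[float]) -> float:
    """Pearson correlation of per-SNP-pair LD values in two populations.

    r_pop1[i] and r_pop2[i] must refer to the same SNP pair.
    """
    n = len(r_pop1)
    if n != len(r_pop2) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")

    mean1 = sum(r_pop1) / n
    mean2 = sum(r_pop2) / n
    sxy = sum((x - mean1) * (y - mean2) for x, y in zip(r_pop1, r_pop2))
    sxx = sum((x - mean1) ** 2 for x in r_pop1)
    syy = sum((y - mean2) ** 2 for y in r_pop2)
    if sxx == 0 or syy == 0:
        raise ValueError("LD values must vary within each population")
    return sxy / sqrt(sxx * syy)
```

Using signed r rather than r² keeps track of whether the same alleles are associated in both populations, which is the property that matters when marker effects estimated in one population are applied to another.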

20.

Background

It is well known that genetic components play an important role in the etiology of mandibular prognathism, but few susceptibility loci have been mapped.

Methodology

To identify linkage regions for mandibular prognathism, we analyzed two Chinese pedigrees with 6,090 genome-wide single-nucleotide polymorphism (SNP) markers from the Illumina Linkage-12 DNA Analysis Kit (average spacing 0.58 cM). Multipoint parametric and non-parametric (model-free) linkage analyses were performed on the pedigrees.

Principal Finding

The most statistically significant linkage results were with markers on chromosome 4 (LOD = 3.166 and NPL = 3.65 with rs875864, 4p16.1, 8.38 cM). Candidate genes within 4p16.1 include EVC and EVC2.

Conclusion

We detected a novel suggestive linkage locus for mandibular prognathism in two Chinese pedigrees, and this linkage region provides a target for susceptibility gene identification, a process that will provide important insights into the molecular and cellular basis of mandibular prognathism.
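
The LOD score reported in this abstract comes from multipoint parametric analysis, which is not reproduced here. As a simplified illustration of what a LOD score measures, the sketch below handles only the two-point, phase-known case: with k recombinants among n informative meioses, LOD(θ) = log10[θ^k (1 − θ)^(n − k) / 0.5^n]. The function name `two_point_lod` is hypothetical.

```python
from math import log10


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for linkage at recombination fraction theta.

    Compares the likelihood of the data at theta with the likelihood
    under free recombination (theta = 0.5), for phase-known meioses.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")

    k, n = recombinants, meioses
    log_lik_theta = k * log10(theta) + (n - k) * log10(1 - theta)
    log_lik_null = n * log10(0.5)
    return log_lik_theta - log_lik_null
```

For example, 1 recombinant in 10 meioses evaluated at θ = 0.1 gives a LOD of about 1.60, meaning the data are roughly 40 times more likely under linkage at that distance than under free recombination.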
