Similar literature
 20 similar articles found; search time: 15 ms
1.
Mutations at the cystic fibrosis transmembrane conductance regulator gene (CFTR) cause cystic fibrosis, the most prevalent severe genetic disorder in individuals of European descent. We have analyzed normal allele and haplotype variation at four short tandem repeat polymorphisms (STRPs) and two single-nucleotide polymorphisms (SNPs) in CFTR in 18 worldwide population samples, comprising a total of 1,944 chromosomes. The rooted phylogeny of the SNP haplotypes was established by typing ape samples. STRP variation within SNP haplotype backgrounds was highest in most ancestral haplotypes, although, when STRP allele sizes were taken into account, differences among haplotypes became smaller. Haplotype background determines STRP diversity to a greater extent than populations do, which indicates that haplotype backgrounds are older than populations. Heterogeneity among STRPs can be understood as the outcome of differences in mutation rate and pattern. STRP sites had higher heterozygosities in Africans, although, when whole haplotypes were considered, no significant differences remained. Linkage disequilibrium (LD) shows a complex pattern not easily related to physical distance. The analysis of the fraction of possible different haplotypes not found may circumvent some of the methodological difficulties of LD measures. LD analysis showed a positive correlation with locus polymorphism, which could partly explain the unusual pattern of similar LD between Africans and non-Africans. The low values found in non-Africans may imply that the size of the modern human population that emerged "Out of Africa" may be larger than what previous LD studies suggested.

2.
We have developed a software analysis package, HapScope, which includes a comprehensive analysis pipeline and a sophisticated visualization tool for analyzing functionally annotated haplotypes. The HapScope analysis pipeline supports: (i) computational haplotype construction with an expectation-maximization or Bayesian statistical algorithm; (ii) SNP classification by protein coding change, homology to model organisms or putative regulatory regions; and (iii) minimum SNP subset selection by either a Brute Force Algorithm or a Greedy Partition Algorithm. The HapScope viewer displays genomic structure with haplotype information in an integrated environment, providing eight alternative views for assessing genetic and functional correlation. It has a user-friendly interface for: (i) haplotype block visualization; (ii) SNP subset selection; (iii) haplotype consolidation with subset SNP markers; (iv) incorporation of both experimentally determined haplotypes and computational results; and (v) data export for additional analysis. Comparison of haplotypes constructed by the statistical algorithms with those determined experimentally shows variation in haplotype prediction accuracies in genomic regions with different levels of nucleotide diversity. We have applied HapScope in analyzing haplotypes for candidate genes and genomic regions with extensive SNP and genotype data. We envision that the systematic approach of integrating functional genomic analysis with population haplotypes, supported by HapScope, will greatly facilitate current genetic disease research.
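The idea of selecting a small SNP subset that still tells all haplotypes apart can be illustrated with a generic greedy heuristic. This is only a sketch under stated assumptions: it is not HapScope's Greedy Partition Algorithm (whose details the abstract does not give), and the function name `greedy_snp_subset` and the string encoding of haplotypes are hypothetical.

```python
from itertools import combinations


def greedy_snp_subset(haplotypes):
    """Greedily pick SNP sites until every pair of haplotypes is distinguished.

    haplotypes: list of equal-length strings, one character per SNP site.
    Returns the list of chosen site indices (illustrative heuristic only).
    """
    n_sites = len(haplotypes[0])
    # Pairs of haplotypes that the chosen sites cannot yet tell apart.
    unresolved = set(combinations(range(len(haplotypes)), 2))
    chosen = []
    while unresolved:
        def n_split(site):
            # Number of unresolved pairs that differ at this site.
            return sum(1 for i, j in unresolved
                       if haplotypes[i][site] != haplotypes[j][site])

        best = max(range(n_sites), key=n_split)
        if n_split(best) == 0:
            break  # the remaining pairs are identical haplotypes
        chosen.append(best)
        unresolved = {(i, j) for i, j in unresolved
                      if haplotypes[i][best] == haplotypes[j][best]}
    return chosen
```

A greedy choice is fast but not guaranteed to return the smallest possible subset; an exhaustive search over subsets would give the minimum at exponential cost.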

3.
Single nucleotide polymorphisms (SNPs) are widely used when investigators try to map complex disease genes. Although biallelic SNP markers are less informative than microsatellite markers, one can increase their information content by using haplotypes. However, assigning haplotypes (i.e., assigning phase) correctly can be problematic in the presence of SNP heterozygosity. For example, a doubly heterozygous individual, with genotype 12, 12, could have haplotypes 1-1/2-2 or 1-2/2-1 with equal probability; in the absence of additional information, there is no way to determine which haplotype is correct. Thus an algorithm that assigns haplotypes to such an individual will assign the wrong one 50% of the time. We have studied the frequency of haplotype misassignments, i.e., haplotypes that are misassigned solely because of inherent marker ambiguity (not because of errors in genotyping or calculation). We examined both SNPs and microsatellite markers. We used the computer programs GENEHUNTER and SIMWALK to assign the haplotypes. We simulated (a) families with 1-5 children, (b) haplotypes involving different numbers of marker loci (3, 5, 7 and 10 loci, all in linkage equilibrium), and (c) different allele frequencies. Misassignment rates are highest (a) in small families, (b) with many SNP loci, and (c) for loci with the greatest heterozygosity (i.e., where both alleles have frequency 0.5). For example, for triads (i.e., one-child families with both parents genotyped), misassignment rates for SNPs can reach almost 50%. Family sizes of 4-5 children are required in order to ensure a misassignment frequency of ≤ 5% for ten-SNP haplotypes with allele frequencies of 0.25-0.5. For microsatellites, a family size of at least 2-3 children is necessary to keep haplotyping misassignments ≤ 5%. Finally, we point out that it is misleading for a computer program to yield haplotype assignments without indicating that they may have been misassigned, and we discuss the implications of these misassignments for association and linkage analysis.
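The phase ambiguity described above can be made concrete by enumerating every haplotype pair consistent with an unphased genotype. The following is a minimal sketch; the function name `compatible_haplotype_pairs` and the tuple encoding of genotypes are assumptions made for illustration, not part of GENEHUNTER or SIMWALK.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Return all unordered haplotype pairs consistent with an unphased genotype.

    genotype: list of 2-tuples of alleles, one tuple per locus,
              e.g. [(1, 2), (1, 2)] for a doubly heterozygous individual.
    """
    pairs = set()
    # At each locus, choose which of the two alleles goes on the first haplotype.
    for choice in product((0, 1), repeat=len(genotype)):
        hap_a = tuple(locus[c] for locus, c in zip(genotype, choice))
        hap_b = tuple(locus[1 - c] for locus, c in zip(genotype, choice))
        # Sorting makes the pair unordered, so mirror-image choices collapse.
        pairs.add(tuple(sorted((hap_a, hap_b))))
    return pairs
```

For the doubly heterozygous genotype in the abstract, the function returns the two pairs 1-1/2-2 and 1-2/2-1. In general, an individual heterozygous at k loci has 2^(k-1) possible phasings, which is why misassignment grows with the number of SNP loci.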

4.
Genetic variation in the human population may lead to functional variants of genes that contribute to risk for common chronic diseases such as cancer. In an effort to detect such possible predisposing variants, we constructed haplotypes for a candidate gene and tested their efficacy in association studies. We developed haplotypes consisting of 14 biallelic neutral-sequence variants that span 142 kb of the ATM locus. ATM is the gene responsible for the autosomal recessive disease ataxia-telangiectasia (AT). These ATM noncoding single-nucleotide polymorphisms (SNPs) were genotyped in nine CEPH families (89 individuals) and in 260 DNA samples from four different ethnic origins. Analysis of these data with an expectation-maximization algorithm revealed 22 haplotypes at this locus, with three major haplotypes having frequencies ≥ 0.10. Tests for recombination and linkage disequilibrium (LD) show reduced recombination and extensive LD at the ATM locus, in all four ethnic groups studied. The most striking example was found in the study population of European ancestry, in which no evidence for recombination could be discerned. The potential of ATM haplotypes for detection of genetic variants through association studies was tested by analysis of 84 individuals carrying one of three ATM coding SNPs. Each coding SNP was detected by association with an ATM haplotype. We demonstrate that association studies with haplotypes for candidate genes have significant potential for the detection of genetic backgrounds that contribute to disease.

5.
The linkage relationships between the cystic fibrosis locus and six marker loci allowed us to build 122 haplotypes bearing the CF gene and to compare them to the 122 normal haplotypes. We obtained 13 different marker haplotypes associated with the CF chromosomes and 22 with the normals in our population. To examine the possibility of a correlation between the clinical and the genetic polymorphism of the disease, haplotype analysis was carried out taking into account age at diagnosis, severity of the disease and a particular clinical subgroup such as meconium ileus or pancreatic sufficiency. The results showed that a particular haplotype (121) is predominant in the severe form of the disease. This may have implications for diagnosis and prognosis of the disease.

6.
Quantitative trait loci affecting clinical mastitis were detected and fine mapped to a narrow region on bovine chromosome 6 in the Norwegian Red cattle population. The region includes the casein gene cluster and several candidate genes thought to influence clinical mastitis. The most significant results were found for SNPs within the Mucin 7 gene. This gene encodes an antimicrobial peptide and constitutes part of the first line of defence for the mucosal immune system. Detection of long haplotypes extending several Mb may indicate that artificial selection has influenced the haplotype structures in the region. A search for selection sweeps supports this observation and coincides with association results found both by single SNP and haplotype analyses. Our analyses identified haplotypes carrying quantitative trait loci alleles associated with high protein yield and simultaneously fewer incidences of clinical mastitis. The fact that such haplotypes are found in relatively high frequencies in Norwegian Red may reflect the combined breeding goal that is characterized by selection for both milk production and disease resistance. The identification of these haplotypes raises the possibility of overcoming the unfavourable genetic correlation between these traits through haplotype-assisted selection.

7.
The identification of genes for monogenic disorders has proven to be highly effective for understanding disease mechanisms, pathways and gene function in humans. Nevertheless, while thousands of Mendelian disorders have not yet been mapped, there has been a trend away from studying single-gene disorders. In part, this is due to the fact that many of the remaining single-gene families are not large enough to map the disease locus to a single site in the genome. New tools and approaches are needed to allow researchers to effectively tap into this genetic gold-mine. Towards this goal, we have used haploid cell lines to experimentally validate the use of high-density single nucleotide polymorphism (SNP) arrays to define genome-wide haplotypes and candidate regions, using a small amyotrophic lateral sclerosis (ALS) family as a prototype. Specifically, we used haploid cell lines to determine if high-density SNP arrays accurately predict haplotypes across entire chromosomes and show that haplotype information significantly enhances the genetic information in small families. Panels of haploid cell lines were generated and a 5 centimorgan (cM) short tandem repeat polymorphism (STRP) genome scan was performed. Experimentally derived haplotypes for entire chromosomes were used to directly identify regions of the genome identical-by-descent in 5 affected individuals. Comparisons between experimentally determined and in silico haplotypes predicted from SNP arrays demonstrate that SNP analysis of diploid DNA accurately predicted chromosomal haplotypes. These methods precisely identified 12 candidate intervals, which are shared by all 5 affected individuals. Our study illustrates how genetic information can be maximized using readily available tools as a first step in mapping single-gene disorders in small families.

8.
The analysis of polymorphic markers within or closely linked to the cystic fibrosis transmembrane regulator (CFTR) gene is useful as a molecular tool for carrier detection of known and unknown mutations. To establish the association between mutations in the CFTR gene in western Mexican cystic fibrosis (CF) patients, the distribution of XV2c/KM19 haplotypes was analyzed by PCR and restriction enzyme digestion in 384 chromosomes from 74 CF patients, their unaffected parents, and normal subjects. The haplotype analysis revealed that haplotype B was present in 71.9% of CF chromosomes compared to 0% of non-CF chromosomes. The F508del and G542X mutations were strongly associated with haplotype B (96.7% and 100% of chromosomes, respectively). The haplotype distribution of the CF chromosomes carrying other CFTR mutations had a more heterogeneous background. Our results show that haplotype B is associated with CFTR mutations. Therefore, haplotype analysis is a suitable alternate strategy for screening CF patients with a heterogeneous clinical picture from populations with a high molecular heterogeneity where carrier detection programs are not available. In addition, it may be a helpful diagnostic tool for genetic counseling and carrier detection in the relatives of CF patients and in couples who are planning to have children.

9.
The definition of haplotype blocks of single-nucleotide polymorphisms (SNPs) has been proposed so that the haplotypes can be used as markers in association studies and to efficiently describe human genetic variation. The International Haplotype Map (HapMap) project to construct a comprehensive catalog of haplotypic variation in humans is underway. However, a number of factors have already been shown to influence the definition of blocks, including the population studied and the sample SNP density. Here, we examine the effect that marker selection has on the definition of blocks and the pattern of haplotypes by using comparable but complementary SNP sets and a number of block definition methods in various genomic regions and populations that were provided by the Encyclopedia of DNA Elements (ENCODE) project. We find that the chosen SNP set has a profound effect on the block-covered sequence and block borders, even at high marker densities. Our results question the very concept of discrete haplotype blocks and the possibility of generalizing block findings from the HapMap project. We comparatively apply the block-free tagging-SNP approach and discuss both the haplotype approach and the tagging-SNP approach as means to efficiently catalog genetic variation.

10.
A promising strategy for identifying disease susceptibility genes for both single- and multiple-gene diseases is to search patients' autosomes for shared chromosomal segments derived from a common ancestor. Such segments are characterized by the distinct identity of their haplotype. The methods and algorithms currently available have only a limited capability for determining a high-resolution haplotype genomewide. We herein introduce the homozygosity haplotype (HH), a haplotype described by the homozygous SNPs that are easily obtained from high-density SNP genotyping data. The HH represents haplotypes of both copies of homologous autosomes, allowing for direct comparisons of the autosomes among multiple patients and enabling the identification of the shared segments. The HH successfully detected the shared segments from members of a large family with Marfan syndrome, which is an autosomal dominant, single-gene disease. It also detected the shared segments from patients with model multigene diseases originating with common ancestors who lived 10-25 generations ago. The HH is therefore considered to be useful for the identification of disease susceptibility genes in both single- and multiple-gene diseases.
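One way to picture the homozygosity haplotype is to discard heterozygous calls and then look for stretches where no two patients are homozygous for different alleles, since such a conflict rules out a shared segment at that SNP. The sketch below is a simplified illustration, not the published HH algorithm; the function names, the two-character genotype strings and the `min_len` parameter are all assumptions.

```python
def homozygosity_haplotype(genotypes):
    """Keep only homozygous calls; heterozygous sites become None.

    genotypes: list of two-character strings such as "AA", "AB", "BB".
    """
    return [g[0] if g[0] == g[1] else None for g in genotypes]


def compatible_runs(hh_list, min_len=1):
    """Return (start, end) index ranges, end exclusive, in which no two
    homozygosity haplotypes carry different homozygous alleles."""
    n = len(hh_list[0])
    runs, start = [], None
    for pos in range(n):
        alleles = {hh[pos] for hh in hh_list if hh[pos] is not None}
        if len(alleles) > 1:  # opposite homozygotes: sharing is excluded here
            if start is not None and pos - start >= min_len:
                runs.append((start, pos))
            start = None
        elif start is None:
            start = pos
    if start is not None and n - start >= min_len:
        runs.append((start, n))
    return runs
```

A run without conflicts is only a candidate for a shared segment; in practice a length threshold (here the hypothetical `min_len`) would be needed to separate real sharing from chance compatibility.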

11.
Analysis of haplotypes based on multiple single-nucleotide polymorphisms (SNP) is becoming common for both candidate gene and fine-mapping studies. Before embarking on studies of haplotypes from genetically distinct populations, however, it is important to consider variation both in linkage disequilibrium (LD) and in haplotype frequencies within and across populations, as both vary. Such diversity will influence the choice of "tagging" SNPs for candidate gene or whole-genome association studies because some markers will not be polymorphic in all samples and some haplotypes will be poorly represented or completely absent. Here we analyze 11 genes, originally chosen as candidate genes for oral clefts, where multiple markers were genotyped on individuals from four populations. Estimated haplotype frequencies, measures of pairwise LD, and genetic diversity were computed for 135 European-Americans, 57 Chinese-Singaporeans, 45 Malay-Singaporeans, and 46 Indian-Singaporeans. Patterns of pairwise LD were compared across these four populations and haplotype frequencies were used to assess genetic variation. Although these populations are fairly similar in allele frequencies and overall patterns of LD, both haplotype frequencies and genetic diversity varied significantly across populations. Such haplotype diversity has implications for designing studies of association involving samples from genetically distinct populations.
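The abstract does not say which pairwise LD measures were used. As an illustration, the standard coefficients D, |D'| and r² for two biallelic loci can be computed from one haplotype frequency and the two allele frequencies. This is a minimal sketch; the function name `pairwise_ld` and its argument names are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Compute D, |D'| and r^2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2).
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    # The largest magnitude D can reach depends on its sign.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

Because both measures depend on allele frequencies, the same pair of SNPs can show different LD values in populations whose haplotype frequencies differ, which is the situation the abstract cautions about.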

12.
A haplotype is a sequence of single nucleotide polymorphisms (SNPs) and a representative genetic marker describing the diversity of biological organisms. Haplotypes have a wide range of applications, for example in pharmacology and medicine. In particular, haplotypes of Apis mellifera (the honeybee), a highly social species, benefit human health and medicine in diverse areas, including venom toxicology, infectious disease, and allergic disease. For this reason, the problem of assembling a pair of haplotypes from individual SNP fragments has driven research and produced various computational models. The minimum error correction (MEC) model is an important computational model for the individual haplotype assembly problem. However, the MEC model has been proved to be NP-hard; therefore, no efficient exact algorithm is known for this problem. In this study, we propose an improved version of a branch and bound algorithm that can assemble a pair of haplotypes with an optimal solution from SNP fragments of a honeybee specimen within a practical time bound. First, we designed a local search algorithm to calculate a good initial upper bound of feasible solutions for enhancing the efficiency of the branch and bound algorithm. Furthermore, to speed up the algorithm, we made use of the recursive property of the bounding function together with a lookup table. After conducting extensive experiments over honeybee SNP data released by the Human Genome Sequencing Center, we showed that our method is highly accurate and efficient for assembling haplotypes.
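The MEC objective itself is simple to state in code: each fragment is charged the number of mismatches against whichever of the two haplotypes it fits better. The sketch below pairs that score with an exhaustive search that is only usable for a handful of sites. It is not the branch and bound method of the abstract, and the string encoding ('0', '1', and '-' for uncovered sites), the function names, and the assumption that the two haplotypes are complementary at every site are all illustrative choices.

```python
from itertools import product


def mismatches(fragment, haplotype):
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != '-' and f != h)


def mec_score(fragments, hap1, hap2):
    """Minimum error correction score of a haplotype pair for the fragments."""
    return sum(min(mismatches(f, hap1), mismatches(f, hap2)) for f in fragments)


def complement(haplotype):
    """Flip every allele (assumes all sites are heterozygous)."""
    return ''.join('1' if c == '0' else '0' for c in haplotype)


def best_haplotype_exhaustive(fragments, n_sites):
    """Try all 2**n_sites haplotypes; feasible only for very small n_sites."""
    best = None
    for bits in product('01', repeat=n_sites):
        hap = ''.join(bits)
        score = mec_score(fragments, hap, complement(hap))
        if best is None or score < best[0]:
            best = (score, hap)
    return best
```

The exhaustive loop makes the exponential cost visible; a branch and bound search explores the same space but prunes partial haplotypes whose lower bound already exceeds the best score found so far.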

13.
Statistical estimation and pedigree analysis of CCR2-CCR5 haplotypes (total citations: 4; self-citations: 0; citations by others: 4)
As more SNP marker data becomes available, researchers have used haplotypes of markers, rather than individual polymorphisms, for association analysis of candidate genes. In order to perform haplotype analysis in a population-based case-control study, haplotypes must be determined by estimation in the absence of family information or laboratory methods for establishing phase. Here, we test the accuracy of the Expectation-Maximization (EM) algorithm for estimating haplotype state and frequency in the CCR2-CCR5 gene region by comparison with haplotype state and frequency determined by pedigree analysis. To do this, we have characterized haplotypes comprising alleles at seven biallelic loci in the CCR2-CCR5 chemokine receptor gene region, a span of 20 kb on chromosome 3p21. Three-generation CEPH families (n=40), totaling 489 individuals, were genotyped by the 5' nuclease assay (TaqMan). Haplotype states and frequencies were compared in 103 grandparents who were assumed to have mated at random. Both pedigree analysis and the EM algorithm yielded the same small number of haplotypes for which linkage disequilibrium was nearly maximal. The haplotype frequencies generated by the two methods were nearly identical. These results suggest that the EM algorithm estimation of haplotype states, frequency, and linkage disequilibrium analysis will be an effective strategy in the CCR2-CCR5 gene region. For genetic epidemiology studies, CCR2-CCR5 allele and haplotype frequencies were determined in African-American (n=30), Hispanic (n=24) and European-American (n=34) populations.
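The EM idea can be shown on the smallest interesting case: two biallelic loci, where only double heterozygotes have ambiguous phase. The sketch below is a simplification of the seven-locus setting in the abstract and assumes random mating; the function name `em_two_locus`, the 0/1/2 genotype coding and the fixed iteration count are illustrative assumptions.

```python
def em_two_locus(genotypes, n_iter=100):
    """Estimate two-locus haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), each the count (0, 1 or 2) of allele '1'
               at locus 1 and locus 2 for one individual.
    Returns a dict mapping haplotype (a, b) to its estimated frequency.
    """
    freqs = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
    n = len(genotypes)
    for _ in range(n_iter):
        counts = dict.fromkeys(freqs, 0.0)
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the double heterozygote between its two phasings.
                p_cis = freqs[(0, 0)] * freqs[(1, 1)]
                p_trans = freqs[(0, 1)] * freqs[(1, 0)]
                total = p_cis + p_trans
                w = p_cis / total if total > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1 - w
                counts[(1, 0)] += 1 - w
            else:
                # At most one locus is heterozygous, so phase is unambiguous.
                a = (0, 0) if g1 == 0 else (1, 1) if g1 == 2 else (0, 1)
                b = (0, 0) if g2 == 0 else (1, 1) if g2 == 2 else (0, 1)
                counts[(a[0], b[0])] += 1
                counts[(a[1], b[1])] += 1
        # M-step: haplotype frequencies are expected counts over 2n chromosomes.
        freqs = {h: c / (2 * n) for h, c in counts.items()}
    return freqs
```

With more loci the same two steps apply, but the E-step must weigh every phasing compatible with each multiply heterozygous genotype, so the number of candidate haplotype pairs grows quickly.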

14.
Data mining applied to linkage disequilibrium mapping (total citations: 11; self-citations: 0; citations by others: 11)
We introduce a new method for linkage disequilibrium mapping: haplotype pattern mining (HPM). The method, inspired by data mining methods, is based on discovery of recurrent patterns. We define a class of useful haplotype patterns in genetic case-control data and use the algorithm for finding disease-associated haplotypes. The haplotypes are ordered by their strength of association with the phenotype, and all haplotypes exceeding a given threshold level are used for prediction of disease susceptibility-gene location. The method is model-free, in the sense that it does not require (and is unable to utilize) any assumptions about the inheritance model of the disease. The statistical model is nonparametric. The haplotypes are allowed to contain gaps, which improves the method's robustness to mutations and to missing and erroneous data. Experimental studies with simulated microsatellite and SNP data show that the method has good localization power in data sets with large degrees of phenocopies and with lots of missing and erroneous data. The power of HPM is roughly identical for marker maps at a density of 3 single-nucleotide polymorphisms/cM or 1 microsatellite/cM. The capacity to handle high proportions of phenocopies makes the method promising for complex disease mapping. An example of correct disease susceptibility-gene localization with HPM is given with real marker data from families from the United Kingdom affected by type 1 diabetes. The method is extendable to include environmental covariates or phenotype measurements or to find several genes simultaneously.

15.
Beta-defensins are cationic antimicrobial peptides expressed by epithelial cells and exhibit antibacterial, antifungal, and antiviral properties. The defensins are part of the innate host defense network and may have a significant protective role in the oral cavity and other mucosa. Defects or alteration in expression of the beta-defensins may be associated with susceptibility to infection and mucosal disorders. We examined the occurrence of single-nucleotide polymorphisms (SNPs) in the human beta-defensin genes DEFB1 and DEFB2 encoding human beta-defensin-1 and -2 (hBD-1, hBD-2), respectively, in five ethnic populations and defined haplotypes in these populations. Fifteen SNPs were identified in both DEFB1 and DEFB2. Coding region SNPs were found in very low frequency in both genes. One nonsynonymous DEFB1 SNP, G1654A (Val → Ile), and one nonsynonymous DEFB2 SNP, T2312A (Leu → His), were identified. Seven sites in each gene exhibited statistically significant differences in frequency between ethnic groups, with the greatest variation in the promoter and in the 5'-untranslated region of DEFB1. DEFB1 displayed 10 common haplotypes, including one cosmopolitan haplotype. Eight common haplotypes were found in DEFB2, including one cosmopolitan haplotype shared among all five ethnic groups. Our results show that genotypic variability among ethnic groups will need to be addressed when performing associative genetic studies of innate defense mechanisms and susceptibility to disease.

16.
Soybean rust (SBR), caused by Phakopsora pachyrhizi Sydow, is one of the most economically important and destructive diseases of soybean [Glycine max (L.) Merr.] and the discovery of novel SBR resistance genes is needed because of virulence diversity in the pathogen. The objectives of this research were to map SBR resistance in plant introduction (PI) 561356 and to identify single nucleotide polymorphism (SNP) haplotypes within the region on soybean chromosome 18 where the SBR resistance gene Rpp1 maps. One-hundred F(2:3) lines derived from a cross between PI 561356 and the susceptible experimental line LD02-4485 were genotyped with genetic markers and phenotyped for resistance to P. pachyrhizi isolate ZM01-1. The segregation ratio of reddish brown versus tan lesion type in the population supported that resistance was controlled by a single dominant gene. The gene was mapped to a 1-cM region on soybean chromosome 18 corresponding to the same interval as Rpp1. A haplotype analysis of diverse germplasm across a 213-kb interval that included Rpp1 revealed 21 distinct haplotypes of which 4 were present among 5 SBR resistance sources that have a resistance gene in the Rpp1 region. Four major North American soybean ancestors belong to the same SNP haplotype as PI 561356 and seven belong to the same haplotype as PI 594538A, the Rpp1-b source. There were no North American soybean ancestors belonging to the SNP haplotypes found in PI 200492, the source of Rpp1, or PI 587886 and PI 587880A, additional sources with SBR resistance mapping to the Rpp1 region.

17.
Risk haplotype analysis for bovine paratuberculosis (total citations: 1; self-citations: 0; citations by others: 1)
Paratuberculosis (Johne’s disease), caused by Mycobacterium avium subsp. paratuberculosis, is an important disease of bovines, although its genetic basis is poorly understood. In this study, three candidate genes were typed to study the associations between single nucleotide polymorphisms (SNPs) and paratuberculosis susceptibility (coded as 1 or 0) at the haplotype level. A significant risk haplotype within the CARD15 gene, constructed from a variant allele (C) at the first SNP and a common allele (A) at the second SNP, was found to exert genetic effects on paratuberculosis infection in an overdominant manner. Marginally significant haplotypes were identified for the other two genes. The results obtained will provide scientific guidance about the selection and prediction of resistance types in bovines.

18.
In a de novo genotyping‐by‐sequencing (GBS) analysis of short, 64‐base tag‐level haplotypes in 4657 accessions of cultivated oat, we discovered 164741 tag‐level (TL) genetic variants containing 241224 SNPs. From this, the marker density of an oat consensus map was increased by the addition of more than 70000 loci. The mapped TL genotypes of a 635‐line diversity panel were used to infer chromosome‐level (CL) haplotype maps. These maps revealed differences in the number and size of haplotype blocks, as well as differences in haplotype diversity between chromosomes and subsets of the diversity panel. We then explored potential benefits of SNP vs. TL vs. CL GBS variants for mapping, high‐resolution genome analysis and genomic selection in oats. A combined genome‐wide association study (GWAS) of heading date from multiple locations using both TL haplotypes and individual SNP markers identified 184 significant associations. A comparative GWAS using TL haplotypes, CL haplotype blocks and their combinations demonstrated the superiority of using TL haplotype markers. Using a principal component‐based genome‐wide scan, genomic regions containing signatures of selection were identified. These regions may contain genes that are responsible for the local adaptation of oats to Northern American conditions. Genomic selection for heading date using TL haplotypes or SNP markers gave comparable and promising prediction accuracies of up to r = 0.74. Genomic selection carried out in an independent calibration and test population for heading date gave promising prediction accuracies that ranged between r = 0.42 and 0.67. In conclusion, TL haplotype GBS‐derived markers facilitate genome analysis and genomic selection in oat.

19.
To optimize the strategies for population-based pharmacogenetic studies, we extensively analyzed single-nucleotide polymorphisms (SNPs) and haplotypes in 199 drug-related genes, through use of 4,190 SNPs in 752 control subjects. Drug-related genes, like other genes, have a haplotype-block structure, and a few haplotype-tagging SNPs (htSNPs) could represent most of the major haplotypes constructed with common SNPs in a block. Because our data included 860 uncommon (frequency <0.1) SNPs with frequencies that were accurately estimated, we analyzed the relationship between haplotypes and uncommon SNPs within the blocks (549 SNPs). We inferred haplotype frequencies through use of the data from all htSNPs and one of the uncommon SNPs within a block and calculated four joint probabilities for the haplotypes. We show that, irrespective of the minor-allele frequency of an uncommon SNP, the majority (mean ± SD frequency 0.943 ± 0.117) of the minor alleles were assigned to a single haplotype tagged by htSNPs if the uncommon SNP was within the block. These results support the hypothesis that recombinations occur only infrequently within blocks. The proportion of a single haplotype tagged by htSNPs to which the minor alleles of an uncommon SNP were assigned was positively correlated with the minor-allele frequency when the frequency was <0.03 (P<.000001; n=233 [Spearman's rank correlation coefficient]). The results of simulation studies suggested that haplotype analysis using htSNPs may be useful in the detection of uncommon SNPs associated with phenotypes if the frequencies of the SNPs are higher in affected than in control populations, the SNPs are within the blocks, and the frequencies of the SNPs are >0.03.

20.
Genetic variation in the major histocompatibility complex (MHC) is known to affect disease resistance in many species. Investigations of MHC diversity in populations of wild species have focused on the antigen presenting class IIβ molecules due to the known polymorphic nature of these genes and the role these molecules play in pathogen recognition. Studies of MHC haplotype variation in the turkey (Meleagris gallopavo) are limited. This study was designed to examine MHC diversity in a group of Eastern wild turkeys (Meleagris gallopavo silvestris) collected during population expansion following reintroduction of the species in southern Wisconsin, USA. Southern blotting with BG and class IIβ probes and single nucleotide polymorphism (SNP) genotyping was used to measure MHC variation. SNP analysis focused on single copy MHC genes flanking the highly polymorphic class IIβ genes. Southern blotting identified 27 class IIβ phenotypes, whereas SNP analysis identified 13 SNP haplotypes occurring in 28 combined genotypes. Results show that genetic diversity estimates based on RFLP (Southern blot) analysis underestimate the level of variation detected by SNP analysis. Sequence analysis of the mitochondrial D-loop identified 7 mitochondrial haplotypes (mitotypes) in the sampled birds. Results show that wild turkeys located in southern Wisconsin have a genetically diverse MHC and originate from several maternal lineages.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号