Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
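Detection and correction of population stratification in association studies of human complex diseases   Total citations: 2, self-citations: 1, citations by others: 2
The case-control study is an important genetic-epidemiological method for identifying susceptibility loci of polygenic diseases, and population stratification is one of the main reasons why case-control association studies yield biased results or even spurious associations. This article describes the methods and principles for detecting and correcting population stratification, including the transmission/disequilibrium test (TDT), which is based on nuclear families, and genomic control (GC) and structured association (SA), which are based on unlinked genomic markers, and compares these methods.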
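
As a rough illustration of the family-based TDT named in the abstract above, the sketch below computes the standard McNemar-type statistic, (b - c)² / (b + c), from counts of heterozygous parents who did (b) or did not (c) transmit a given allele to an affected child. The function name `tdt_statistic` and its parameter names are hypothetical and are not taken from the article; the p-value uses the identity that a 1-df chi-square tail equals `erfc(sqrt(x / 2))`.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Return (chi-square, p-value) for a McNemar-type TDT.

    transmitted / untransmitted: how often heterozygous parents passed
    (or did not pass) the allele of interest to an affected offspring.
    Illustrative helper; names are assumptions, not from the article.
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p = tdt_statistic(60, 40)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Because the comparison is made within families (transmitted versus untransmitted parental alleles), differences in allele frequency between subpopulations do not enter the statistic, which is why the TDT is robust to stratification.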

2.
Ning QL  Ma XD  Jiao LZ  Niu XR  Li JP  Wang B  Zhang H  Ma J 《遗传》2012,34(3):307-314
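Studies have indicated that EGR3 (early growth response 3), located at chromosome 8p21.3, is an important susceptibility gene for schizophrenia; however, two case-control studies failed to replicate this finding. To investigate whether EGR3 is associated with the disease in Chinese patients, this study selected five SNPs at the EGR3 locus (rs1996147, rs1877670, rs3750192, rs35201266 and rs7009708) in Han Chinese nuclear families for genotyping and the transmission disequilibrium test (TDT). The markers rs1996147 and rs3750192 each showed significant transmission disequilibrium (χ² > 4.40, P < 0.05). In the linkage disequilibrium analysis, haplotypes constructed from two (rs3750192 and rs35201266), three (rs1877670, rs3750192 and rs7009708) and four (rs1996147, rs1877670, rs3750192 and rs7009708) SNPs were all significantly associated with schizophrenia (χ² > 7.10, global P < 0.05). In conclusion, EGR3 is associated with genetic susceptibility to schizophrenia in the Han Chinese population, and further functional studies of EGR3 will help clarify its role in the pathological mechanism of the disease.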

4.
Attention-deficit hyperactivity disorder (ADHD) is a common childhood-onset psychiatric condition with a strong genetic component. Evidence from pharmacological, clinical and animal studies has suggested that the nicotinic system could be involved in the disorder. Previous studies have implicated the nicotinic acetylcholine receptor α4 subunit gene, CHRNA4, in ADHD. In particular, a polymorphism in the exon 2–intron 2 junction of CHRNA4 has been associated with severe inattention defined by latent class analysis. In the current study, we used the transmission disequilibrium test (TDT) to investigate four polymorphisms encompassing this region of CHRNA4 for association with ADHD in a sample of 264 nuclear families from Toronto. No significant evidence of biased transmission was observed for any of the marker alleles for ADHD defined as a categorical trait (all subtypes included), although one haplotype showed marginal evidence of under-transmission. No association was found with the ADHD predominantly inattentive subtype or with symptom dimension scores of inattention. On the contrary, nominally significant evidence of association of individual markers was obtained for the ADHD combined subtype and with teacher-rated hyperactivity–impulsivity scores, with the same haplotype being under-transmitted. Based on our results and others, CHRNA4 may be involved in ADHD; however, its role in ADHD symptomatology remains to be clarified.

5.
Case-parent trio studies concerned with children affected by a disease and their parents aim to detect single nucleotide polymorphisms (SNPs) showing a preferential transmission of alleles from the parents to their affected offspring. A popular statistical test for detecting such SNPs associated with disease in this study design is the genotypic transmission/disequilibrium test (gTDT) based on a conditional logistic regression model, which usually needs to be fitted by an iterative procedure. In this article, we derive exact closed-form solutions for the parameter estimates of the conditional logistic regression models when testing for an additive, a dominant, or a recessive effect of a SNP, and show that such analytic parameter estimates also exist when considering gene-environment interactions with binary environmental variables. Because the genetic model underlying the association between a SNP and a disease is typically unknown, it might further be beneficial to use the maximum over the gTDT statistics for the possible effects of a SNP as test statistic. We therefore propose a procedure enabling a fast computation of the test statistic and the permutation-based p-value of this MAX gTDT. All these methods are applied to whole-genome scans of the case-parent trios from the International Cleft Consortium. These applications show that our procedures dramatically reduce the required computing time compared to the conventional iterative methods, allowing, for example, the analysis of hundreds of thousands of SNPs in a few minutes instead of several hours.

6.
Single nucleotide polymorphisms (SNPs) are the most abundant form of genomic polymorphism and, hence, are highly favorable markers for genetic map construction and genome-wide association studies. Based on specific-locus amplified fragment sequencing (SLAF-seq) for large-scale SNP detection, the genetic diversity and population structure of Salix gordejevii Y. L. Chang et Skv., a valuable sand-fixing shrub, were assessed in 199 accessions from 20 populations in Hunshandake Sandland of northern China. A total of 623.15 M reads resulted in an average sequencing depth of 30.49× and a mean Q30 of 95.70%, and 2,287,715 SNPs in 178,509 polymorphic SLAF tags were obtained. After filtering for minor allele frequency > 0.05 and integrity > 0.8, a total of 93,600 SNPs were retained for population genetic analyses, which revealed that the 199 individuals could be divided into six groups based on cross-validation errors. However, this grouping pattern did not match the geographical distribution, indicating that there is no apparent geographic barrier in the blank areas of Hunshandake Sandland where S. gordejevii is not distributed. In addition, the physical distance of linkage disequilibrium decay in the analyzed S. gordejevii individuals was 18.5 kb at r² = 0.1. The linkage disequilibrium decay distances for different chromosomes varied from 4.6 kb (chromosome 16) to 37.8 kb (chromosome 3). The obtained SNPs offer suitable marker resources for further genetic and genomic studies and will benefit S. gordejevii breeding programs.
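
The r² value used above to describe linkage disequilibrium decay can be illustrated with a small sketch. It assumes phased haplotypes at two biallelic loci, coded 0/1, which is a simplification: the abstract does not say how r² was estimated, and sequencing data are usually unphased. The function name `ld_r2` is hypothetical.

```python
def ld_r2(haplotypes: list[tuple[int, int]]) -> float:
    """Squared correlation r^2 between two biallelic loci.

    haplotypes: one (allele_at_locus_A, allele_at_locus_B) pair per
    phased haplotype, with alleles coded 0 or 1. Illustrative only.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one locus is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    print(ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)]))  # complete LD
    print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))  # no LD
```

An LD decay distance such as the 18.5 kb reported above is then the physical distance at which the average r² between marker pairs falls to the chosen threshold (here 0.1).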

7.
8.
9.
Bouaziz M  Ambroise C  Guedj M 《PloS one》2011,6(12):e28845
Genome-Wide Association Studies are powerful tools to detect genetic variants associated with diseases. Their results have, however, been questioned, in part because of the bias induced by population stratification. This is a consequence of systematic differences in allele frequencies due to differences in sample ancestry, which can lead to either false positive or false negative findings. Many strategies are available to account for stratification, but their performance differs, for instance according to the type of population structure, the minor allele frequency of the disease susceptibility locus, the degree of sampling imbalance, or the sample size. We focus on the type of population structure and propose a comparison of the most commonly used methods to deal with stratification, namely Genomic Control, Principal Component based methods such as those implemented in Eigenstrat, adjusted Regressions and Meta-Analysis strategies. Our assessment of the methods is based on a large simulation study involving several scenarios corresponding to many types of population structure. We focused on both false positive rate and power to determine which methods perform best. Our analysis showed that if there is no population structure, none of the tests introduced bias or decreased power, except for the Meta-Analyses. When the population is stratified, adjusted Logistic Regressions and Eigenstrat are the best solutions to account for stratification, even though only the Logistic Regressions are able to consistently maintain correct false positive rates. This study provides more details about these methods. Their advantages and limitations in different stratification scenarios are highlighted in order to propose practical guidelines to account for population stratification in Genome-Wide Association Studies.
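
Of the methods compared above, Genomic Control is the simplest to sketch. In its usual textbook form, the inflation factor λ is the median of the observed 1-df chi-square statistics divided by the theoretical median (about 0.4549), and each statistic is then divided by λ. The sketch below follows that form; it is not the implementation evaluated in the study, and the names `genomic_control` and `CHI2_1DF_MEDIAN` are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats: list[float]) -> tuple[float, list[float]]:
    """Return (lambda, corrected statistics) for 1-df association tests.

    lambda is floored at 1.0 so that statistics are never inflated,
    a common convention assumed here. Illustrative helper only.
    """
    if not chi2_stats:
        raise ValueError("no test statistics supplied")
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]


if __name__ == "__main__":
    lam, corrected = genomic_control([0.2, 0.9098728, 3.0])
    print(lam, corrected)
```

Because a single λ rescales every marker by the same amount, this correction cannot adapt to markers whose allele frequencies differ unusually strongly between subpopulations, which is one reason regression or principal-component adjustments are compared against it.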

10.
Simulation studies were performed to analyze factors affecting the population dynamics of the system with the greenhouse whitefly (Trialeurodes vaporariorum Westwood) and the parasitoid Encarsia formosa Gahan and to develop strategies for the introduction of E. formosa. The reduction of parasitization efficiency with an increase in parasitoid density promotes the stability of the system, which coincides with the prediction from current theory. The stability of the system is also shown to be promoted by the effect of host feeding. The population levels of the system are remarkably suppressed with an increase in searching efficiency and a decrease in host oviposition. The control effect of the parasitoids is enhanced when the number of parasitoids is divided among many introductions. An optimal time, an optimal density ratio of parasitoids to hosts and optimal densities of hosts and parasitoids exist in the introduction programme of parasitoids.

11.
Domestic dogs share a wide range of important disease conditions with humans, including cancers, diabetes and epilepsy. Many of these conditions have similar or identical underlying pathologies to their human counterparts and thus dogs represent physiologically relevant natural models of human disorders. Comparative genomic approaches whereby disease genes can be identified in dog diseases and then mapped onto the human genome are now recognized as a valid method and are increasing in popularity. The majority of dog breeds have been created over the past few hundred years and, as a consequence, the dog genome is characterized by extensive linkage disequilibrium (LD), extending usually from hundreds of kilobases to several megabases within a breed, rather than the tens of kilobases observed in the human genome. Genome-wide canine SNP arrays have been developed, and increasing success of using these arrays to map disease loci in dogs is emerging. No equivalent of the human HapMap currently exists for different canine breeds, and the LD structure for such breeds is far less understood than for humans. This study is a dedicated large-scale assessment of the functionalities (LD and SNP tagging performance) of canine genome-wide SNP arrays in multiple domestic dog breeds. We have used genotype data from 18 breeds as well as wolves and coyotes genotyped by the Illumina 22K canine SNP array and Affymetrix 50K canine SNP array. As expected, high tagging performance was observed with most of the breeds using both Illumina and Affymetrix arrays when multi-marker tagging was applied. In contrast, however, large differences in population structure, LD coverage and pairwise tagging performance were found between breeds, suggesting that study designs should be carefully assessed for individual breeds before undertaking genome-wide association studies (GWAS).

12.
It is well known that population substructure may lead to confounding in case–control association studies. Here, we examined genetic structure in a large racially and ethnically diverse sample consisting of five ethnic groups of the Multiethnic Cohort study (African Americans, Japanese Americans, Latinos, European Americans and Native Hawaiians) using 2,509 SNPs distributed across the genome. Principal component analysis on 6,213 study participants, 18 Native Americans and 11 HapMap III populations revealed four important principal components (PCs): the first two separated Asians, Europeans and Africans, and the third and fourth corresponded to Native American and Native Hawaiian (Polynesian) ancestry, respectively. Individual ethnic composition derived from self-reported parental information matched well to genetic ancestry for Japanese and European Americans. STRUCTURE-estimated individual ancestral proportions for African Americans and Latinos are consistent with previous reports. We quantified the East Asian (mean 27%), European (mean 27%) and Polynesian (mean 46%) ancestral proportions for the first time, to our knowledge, for Native Hawaiians. Simulations based on realistic settings of case–control studies nested in the Multiethnic Cohort found that the effect of population stratification was modest and readily corrected by adjusting for race/ethnicity or by adjusting for top PCs derived from all SNPs or from ancestry informative markers; the power of these approaches was similar when averaged across causal variants simulated based on allele frequencies of the 2,509 genotyped markers. The bias may be large in case-only analysis of gene by gene interactions but it can be corrected by top PCs derived from all SNPs.

13.
Several previous studies concluded that linkage disequilibrium (LD) in livestock populations from developed countries originated from the impact of strong selection. Here, we assessed the extent of LD in a cattle population from western Africa that was bred in an extensive farming system. The analyses were performed on 363 individuals in a Bos indicus × Bos taurus population using 42 microsatellite markers on BTA04, BTA07 and BTA13. A high level of expected heterozygosity (0.71), a high mean number of alleles per locus (9.7) and a mild shift in Hardy-Weinberg equilibrium were found. Linkage disequilibrium extended over shorter distances than what has been observed in cattle from developed countries. Effective population size was assessed using two methods; both methods produced large values: 1388 when considering heterozygosity (assuming a mutation rate of 10⁻³) and 2344 when considering LD on whole linkage groups (assuming a constant population size over generations). However, analysing the decay of LD as a function of marker spacing indicated a decreasing trend in effective population size over generations. This decrease could be explained by increasing selective pressure and/or by an admixture process. Finally, LD extended over small distances, which suggested that whole-genome scans will require a large number of markers. However, association studies using such populations will be effective.

14.
This study analyzes population structure and linkage disequilibrium (LD) among 187 commonly used Chinese maize inbred lines, representing the genetic diversity among public, commercial and historically important lines for corn breeding. Seventy SSR loci, evenly distributed over 10 chromosomes, were assayed for polymorphism. The identified 290 alleles served to estimate population structure and analyze the genome-wide LD. The population of lines was highly structured, showing 6 subpopulations: BSSS (American BSSS including Reid), PA (group A germplasm derived from modern U.S. hybrids in China), PB (group B germplasm derived from modern U.S. hybrids in China), Lan (Lancaster Surecrop), LRC (derivative lines from Lvda Red Cob, a Chinese landrace) and SPT (derivative lines from Si-ping-tou, a Chinese landrace). Forty lines, which formerly had an unknown and/or miscellaneous origin and pedigree record, were assigned to the appropriate group. Relationship estimates based on SSR marker data were quantified in a Q matrix, and this information will inform breeders' decisions regarding crosses. Extensive inter- and intra-chromosomal LD was detected between 70 microsatellite loci for the investigated maize lines (2109 loci pairs in LD with D′ > 0.1 and 93 out of them at P < 0.01). This suggests that rapidly evolving microsatellites may track recent population structure. Interlocus LD decay among the diverse maize germplasm indicated that association studies in QTLs and/or candidate genes might avoid nonfunctional and spurious associations since most of the LD blocks were broken between diverse germplasm. The defined population structure and the LD analysis present the basis for future association mapping.

15.
One of the leading biological models of obsessive-compulsive disorder (OCD) is the frontal-striatal-thalamic model. This study undertakes an extensive exploration of the variability in genes related to the regulation of the frontal-striatal-thalamic system in a sample of early-onset OCD trios. To this end, we genotyped 266 single nucleotide polymorphisms (SNPs) in 35 genes in 84 OCD probands and their parents. Finally, 75 complete trios were included in the analysis. Twenty SNPs were overtransmitted from parents to early-onset OCD probands and presented nominal pointwise P < 0.05 values. Three of these polymorphisms achieved P < 2 × 10⁻⁴, the significant P-value after Bonferroni corrections: rs8190748 and rs992990 localized in GAD2 and rs2000292 in HTR1B. When we stratified our sample according to gender, different trends were observed between males and females. In males, SNP rs2000292 (HTR1B) showed the lowest P-value (P = 0.0006), whereas the SNPs in GAD2 were only marginally significant (P = 0.01). In contrast, in females HTR1B polymorphisms were not significant, whereas rs8190748 (GAD2) showed the lowest P-value (P = 0.0006). These results are in agreement with several lines of evidence that indicate a role for the serotonin and γ-aminobutyric acid (GABA) pathways in the risk of early-onset OCD and with the gender differences in OCD pathophysiology reported elsewhere. However, our results need to be replicated in studies with larger cohorts in order to confirm these associations.
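
The Bonferroni threshold quoted above can be checked with a few lines of arithmetic. The sketch assumes a family-wise α of 0.05 divided over the 266 genotyped SNPs; the abstract does not state the exact number of tests used for the correction, so this is only a plausibility check. The function name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # Assumed inputs: alpha = 0.05 and one test per genotyped SNP (266).
    threshold = bonferroni_threshold(0.05, 266)
    print(f"{threshold:.2e}")
```

Under these assumptions the threshold is about 1.9 × 10⁻⁴, which agrees with the rounded 2 × 10⁻⁴ reported in the abstract.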

16.
In some occupational health studies, observations occur in both exposed and unexposed individuals. If the levels of all exposed individuals have been detected, a two-part zero-inflated log-normal model is usually recommended, which assumes that the data have a probability mass at zero for unexposed individuals and a continuous response for values greater than zero for exposed individuals. However, many quantitative exposure measurements are subject to left censoring due to values falling below assay detection limits. A zero-inflated log-normal mixture model is suggested in this situation since unexposed zeros are not distinguishable from those exposed with values below detection limits. In the context of this mixture distribution, the information contributed by values falling below a fixed detection limit is used only to estimate the probability of being unexposed. We consider sample size and statistical power calculation when comparing the median of exposed measurements to a regulatory limit. We calculate the required sample size for the data presented in a recent paper comparing the benzene TWA exposure data to a regulatory occupational exposure limit. A simulation study is conducted to investigate the performance of the proposed sample size calculation methods.

17.
Evaluating population structure in the marine environment is a challenging task when the species of interest is continuously distributed, and yet the use of population or stock structure is a crucial component of management and conservation strategies. The franciscana dolphin (Pontoporia blainvillei), a rare endangered coastal cetacean, suffers high levels of by-catch all along its distribution range in the Western South Atlantic, and questions have been raised about boundaries or divisions for population management. Here we apply genetic tools to better understand population structure and migration, sex-biased dispersal, and to assess potential genetic and demographic impacts of by-catch. Our analyses, based on mtDNA control region sequences, reveal significant genetic division at the regional level and fine-scale structure within our study area. These results suggest that the population in northern Buenos Aires is the most isolated population in Argentina. We found no significant departure from an equal sex ratio among the by-caught animals. A few cases of multiple entanglements appeared to be mother–calf pairs based on field observations and individuals sharing the same mtDNA control region lineage. The distribution of haplotype frequencies observed could imply that some maternal lineages are more prone to be subject to higher rates of by-catch, although biopsy sampling is necessary to fully evaluate whether maternal lineage distributions are the same for biopsy sampled and by-caught animals. A genetic indication of population size disequilibrium was detected for all populations in Argentina, which is consistent with available rates of by-catch and abundance estimates. Collectively, our findings support the current scheme of larger recognized Franciscana Management Areas (FMA), but argue for a finer-scale subdivision within Northern Buenos Aires region (FMA IV). Finally, an integrated approach to promote conservation of this endangered small cetacean has to involve identification of genetic and demographic threats, a more sustainable fishery strategy to reduce by-catch, and designation of protected areas that are supported by underlying population structure for franciscana dolphins.

18.
OBJECTIVES: This is the first of two articles discussing the effect of population stratification on the type I error rate (i.e., false positive rate). This paper focuses on the confounding risk ratio (CRR). It is accepted that population stratification (PS) can produce false positive results in case-control genetic association. However, which values of population parameters lead to an increase in type I error rate is unknown. Some believe PS does not represent a serious concern, whereas others believe that PS may contribute to contradictory findings in genetic association. We used computer simulations to estimate the effect of PS on type I error rate over a wide range of disease frequencies and marker allele frequencies, and we compared the observed type I error rate to the magnitude of the confounding risk ratio. METHODS: We simulated two populations and mixed them to produce a combined population, specifying 160 different combinations of input parameters (disease prevalences and marker allele frequencies in the two populations). From the combined populations, we selected 5000 case-control datasets, each with either 50, 100, or 300 cases and controls, and determined the type I error rate. In all simulations, the marker allele and disease were independent (i.e., no association). RESULTS: The type I error rate is not substantially affected by changes in the disease prevalence per se. We found that the CRR provides a relatively poor indicator of the magnitude of the increase in type I error rate. We also derived a simple mathematical quantity, Delta, that is highly correlated with the type I error rate. In the companion article (part II, in this issue), we extend this work to multiple subpopulations and unequal sampling proportions. CONCLUSION: Based on these results, realistic combinations of disease prevalences and marker allele frequencies can substantially increase the probability of finding false evidence of marker-disease associations. Furthermore, the CRR does not indicate when this will occur.
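
The kind of simulation described above can be sketched in a much reduced form. The code below is not the authors' procedure: it assumes two equally sized subpopulations, scores each person as a carrier or non-carrier of the marker, draws cases and controls from the mixed population, and applies an uncorrected 2×2 chi-square test at the 5% level (critical value 3.841). Marker and disease are independent within each subpopulation, so every rejection is a false positive. All function and parameter names are hypothetical.

```python
import random


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def type1_error(k1: float, k2: float, p1: float, p2: float,
                n_per_group: int = 100, n_datasets: int = 2000,
                seed: int = 0) -> float:
    """Estimate the false positive rate under population stratification.

    k1, k2: disease prevalence in subpopulations 1 and 2.
    p1, p2: marker carrier frequency in subpopulations 1 and 2.
    Assumes the two subpopulations are mixed in equal proportions.
    """
    rng = random.Random(seed)
    # Probability that a sampled case / control comes from subpopulation 1.
    w_case = k1 / (k1 + k2)
    w_ctrl = (1 - k1) / ((1 - k1) + (1 - k2))

    def count_carriers(w_pop1: float) -> int:
        carriers = 0
        for _ in range(n_per_group):
            freq = p1 if rng.random() < w_pop1 else p2
            if rng.random() < freq:
                carriers += 1
        return carriers

    rejections = 0
    for _ in range(n_datasets):
        a = count_carriers(w_case)   # carriers among cases
        c = count_carriers(w_ctrl)   # carriers among controls
        if chi2_2x2(a, n_per_group - a, c, n_per_group - c) > 3.841:
            rejections += 1
    return rejections / n_datasets


if __name__ == "__main__":
    print("no stratification:", type1_error(0.1, 0.1, 0.3, 0.3))
    print("stratified:       ", type1_error(0.2, 0.02, 0.5, 0.1))
```

With identical subpopulations the estimated rate should stay near the nominal 5%, while the deliberately extreme second setting (chosen here for illustration, not taken from the article) pushes it far higher, because cases are drawn mostly from the subpopulation in which the marker is common.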

19.
20.
The existence of a large-scale population structure was investigated in Arabidopsis thaliana by studying patterns of polymorphism in a set of 71 European accessions. We used sequence polymorphism surveyed in 10 fragments of approximately 600 nucleotides and a set of nine microsatellite markers. Population structure was investigated using a model-based inference framework. Among the accessions studied, the presence of four groups was inferred using genetic data, without using prior information on the geographical origin of the accessions. Significant genetic isolation by geographical distance was detected at the group level, together with a geographical gradient in allelic richness across groups. These results are discussed with respect to the previously proposed scenario of postglacial colonization of Europe from putative glacial refugia. Finally, the contribution of the inferred structure to linkage disequilibrium among 171 pairs of essentially unlinked markers was also investigated. Linkage disequilibrium analysis revealed that significant associations detected in the whole sample were mainly due to genetic differentiation among the inferred groups. We discuss the implication of this finding for future association studies in A. thaliana.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号