Similar articles
20 similar articles found
1.

Background

Genetic isolates such as the Ashkenazi Jews (AJ) potentially offer advantages in mapping novel loci in whole genome disease association studies. To analyze patterns of genetic variation in AJ, genotypes of 101 healthy individuals were determined using the Affymetrix EAv3 500 K SNP array and compared to 60 CEPH-derived HapMap (CEU) individuals. 435,632 SNPs overlapped and met annotation criteria in the two groups.

Results

A small but significant global difference in allele frequencies between AJ and CEU was demonstrated by a mean F_ST of 0.009 (P < 0.001); large regions that differed were found on chromosomes 2 and 6. Haplotype blocks inferred from pairwise linkage disequilibrium (LD) statistics (Haploview) as well as by expectation-maximization haplotype phase inference (HAP) showed a greater number of haplotype blocks in AJ compared to CEU by Haploview (50,397 vs. 44,169) or by HAP (59,269 vs. 54,457). Average haplotype blocks were smaller in AJ compared to CEU (e.g., 36.8 kb vs. 40.5 kb by HAP). Analysis of global patterns of local LD decay demonstrated more LD in CEU for closely spaced SNPs, while for SNPs further apart, LD was slightly greater in AJ. A likelihood ratio approach showed that runs of homozygous SNPs were approximately 20% longer in AJ. A principal components analysis was sufficient to completely resolve the CEU from the AJ.

Conclusion

LD in the AJ versus CEU was lower than expected by some measures and higher by others. Any putative advantage in whole genome association mapping using the AJ population will be highly dependent on regional LD structure.
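
The abstract reports a mean F_ST between the two samples but does not say which estimator was used. As a rough illustration only, the sketch below computes Wright's F_ST for a biallelic SNP from two population allele frequencies and averages it over SNPs. The function names and the example frequencies are hypothetical, and equal weighting of the two populations is an assumption.

```python
def fst_biallelic(p1, p2):
    """Wright's F_ST for one biallelic SNP, given the allele frequency
    in each of two populations (equal population weights assumed)."""
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # SNP is monomorphic in both populations; report no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


def mean_fst(freqs_a, freqs_b):
    """Unweighted mean of per-SNP F_ST values (one of several possible averages)."""
    values = [fst_biallelic(a, b) for a, b in zip(freqs_a, freqs_b)]
    return sum(values) / len(values)


# Hypothetical allele frequencies for three SNPs in two populations.
print(mean_fst([0.40, 0.10, 0.55], [0.60, 0.12, 0.50]))
```

A ratio-of-averages estimator, or one that corrects for sample size, would give somewhat different values, so this should not be read as a reproduction of the reported 0.009.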

2.

Background

The selection of markers in association studies can be informed through the use of haplotype blocks. Recent reports have determined the genomic architecture of chromosomal segments through different haplotype block definitions based on linkage disequilibrium (LD) measures or haplotype diversity criteria. The relative applicability of distinct block definitions to association studies, however, remains unclear. We compared different block definitions in 6.1 Mb of chromosome 17q in 189 unrelated healthy individuals. Using 137 single nucleotide polymorphisms (SNPs), at a median spacing of 15.5 kb, we constructed haplotype block maps using published methods and additional methods we have developed. Haplotype tagging SNPs (htSNPs) were identified for each map.

Results

Blocks were found to be shorter and coverage of the region limited with methods based on LD measures, compared to the method based on haplotype diversity. Although the distribution of blocks was highly variable, the number of SNPs that needed to be typed in order to capture the maximum number of haplotypes was consistent.

Conclusion

For the marker spacing used in this study, choice of block definition is not important when used as an initial screen of the region to identify htSNPs. However, choice of block definition has consequences for the downstream interpretation of association study results.

3.

Background

Fibroblast growth factor 20 (FGF20) is a neurotrophic factor preferentially expressed in the substantia nigra of rat brain and could be involved in the survival of dopaminergic neurons. Recently, a strong genetic association has been found between the FGF20 gene and the risk of Parkinson's disease (PD). Our aim was to replicate this association in two independent populations.

Methods

Allelic, genotypic, and haplotype frequencies of four biallelic polymorphisms were assessed in 151 sporadic PD cases and 186 controls from Greece, and 144 sporadic PD patients and 135 controls from Finland.

Results

No association was found in any of the populations studied.

Conclusion

Taken together, these findings suggest that common genetic variants in FGF20 are not a risk factor for PD in at least some European populations.

4.

Introduction

Sex allocation theory predicts that in small mating groups simultaneous hermaphroditism is the optimal form of gender expression. Under these conditions, male allocation is predicted to be very low and overall per-capita reproductive output maximal. This is particularly true for individuals that live in pairs, but monogamy is highly susceptible to cheating by both partners. However, certain conditions favour social monogamy in hermaphrodites. This study addresses the influence of group size on group stability and moulting cycles in singles, pairs, triplets and quartets of the socially monogamous shrimp Lysmata amboinensis, a protandric simultaneous hermaphrodite.

Results

The effect of group size was very strong: Exactly one individual in each triplet and exactly two individuals in each quartet were killed in aggressive interactions, resulting in group sizes of two individuals. All killed individuals had just moulted. No mortality occurred in single and pair treatments. The number of moults in the surviving shrimp increased significantly after changing from triplets and quartets to pairs.

Conclusion

Social monogamy in L. amboinensis is reinforced by aggressive expulsion of supernumerary individuals. We suggest that the high risk of mortality in triplets and quartets results in suppression of moulting in groups larger than two individuals and that the feeding ecology of L. amboinensis favours social monogamy.

5.

Background

In population association studies, standard methods of statistical inference assume that study subjects are independent samples. In genetic association studies, it is therefore of interest to diagnose undocumented close relationships in nominally unrelated study samples.

Results

We describe the R package CrypticIBDcheck to identify pairs of closely-related subjects based on genetic marker data from single-nucleotide polymorphisms (SNPs). The package is able to accommodate SNPs in linkage disequilibrium (LD), without the need to thin the markers so that they are approximately independent in the population. Sample pairs are identified by superposing their estimated identity-by-descent (IBD) coefficients on plots of IBD coefficients for pairs of simulated subjects from one of several common close relationships.

Conclusions

The methods implemented in CrypticIBDcheck are particularly relevant to candidate-gene association studies, in which dependent SNPs cluster in a relatively small number of genes spread throughout the genome. The accommodation of LD allows the use of all available genetic data, a desirable property when working with a modest number of dependent SNPs within candidate genes. CrypticIBDcheck is available from the Comprehensive R Archive Network (CRAN).

6.

Background

Current methods for haplotype inference without pedigree information assume random mating populations. In animal and plant breeding, however, mating is often not random. A particular form of nonrandom mating occurs when parental individuals of opposite sex originate from distinct populations. In animal breeding this is called crossbreeding, and in plant breeding, hybridization. In these situations, association between marker and putative gene alleles might differ between the founding populations, and origin of alleles should be accounted for in studies which estimate breeding values with marker data. The sequence of alleles from one parent constitutes one haplotype of an individual. Haplotypes thus reveal allele origin in data of crossbred individuals.

Results

We introduce a new method for haplotype inference without pedigree that allows nonrandom mating and that can use genotype data of the parental populations and of a crossbred population. The aim of the method is to estimate line origin of alleles. The method has a Bayesian set up with a Dirichlet Process as prior for the haplotypes in the two parental populations. The basic idea is that only a subset of the complete set of possible haplotypes is present in the population.

Conclusion

Line origin of approximately 95% of the alleles at heterozygous sites was assessed correctly in both simulated and real data. Comparing accuracy of haplotype frequencies inferred with the new algorithm to the accuracy of haplotype frequencies inferred with PHASE, an existing algorithm for haplotype inference, showed that the DP algorithm outperformed PHASE in situations of crossbreeding and that PHASE performed better in situations of random mating.

7.

Background

Understanding the evolution of biological networks can provide insight into how their modular structure arises and how they are affected by environmental changes. One approach to studying the evolution of these networks is to reconstruct plausible common ancestors of present-day networks, allowing us to analyze how the topological properties change over time and to posit mechanisms that drive the networks' evolution. Further, putative ancestral networks can be used to help solve other difficult problems in computational biology, such as network alignment.

Results

We introduce a combinatorial framework for encoding network histories, and we give a fast procedure that, given a set of gene duplication histories, in practice finds network histories with close to the minimum number of interaction gain or loss events to explain the observed present-day networks. In contrast to previous studies, our method does not require knowing the relative ordering of unrelated duplication events. Results on simulated histories and real biological networks both suggest that common ancestral networks can be accurately reconstructed using this parsimony approach. A software package implementing our method is available under the Apache 2.0 license at http://cbcb.umd.edu/kingsford-group/parana.

Conclusions

Our parsimony-based approach to ancestral network reconstruction is both efficient and accurate. We show that considering a larger set of potential ancestral interactions by not assuming a relative ordering of unrelated duplication events can lead to improved ancestral network inference.

8.

Background

Estimation of allele frequency is of fundamental importance in population genetic analyses and in association mapping. In most studies using next-generation sequencing, a cost effective approach is to use medium or low-coverage data (e.g., < 15X). However, SNP calling and allele frequency estimation in such studies is associated with substantial statistical uncertainty because of varying coverage and high error rates.

Results

We evaluate a new maximum likelihood method for estimating allele frequencies in low and medium coverage next-generation sequencing data. The method is based on integrating over uncertainty in the data for each individual rather than first calling genotypes. This method can be applied to directly test for associations in case/control studies. We use simulations to compare the likelihood method to methods based on genotype calling, and show that the likelihood method outperforms the genotype calling methods in terms of: (1) accuracy of allele frequency estimation, (2) accuracy of the estimation of the distribution of allele frequencies across neutrally evolving sites, and (3) statistical power in association mapping studies. Using real re-sequencing data from 200 individuals obtained from an exon-capture experiment, we show that the patterns observed in the simulations are also found in real data.

Conclusions

Overall, our results suggest that association mapping and estimation of allele frequencies should not be based on genotype calling in low to medium coverage data. Furthermore, if genotype calling methods are used, it is usually better not to filter genotypes based on the call confidence score.
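
To make concrete the idea of integrating over genotype uncertainty instead of calling genotypes first, the following sketch estimates an allele frequency from per-individual genotype likelihoods with an EM iteration. It is not the authors' implementation: the use of EM, the Hardy-Weinberg prior on genotypes, the function name and the example likelihoods are all assumptions made for illustration.

```python
def em_allele_frequency(genotype_likelihoods, start=0.25, tol=1e-8, max_iter=1000):
    """Maximum likelihood allele frequency from genotype likelihoods.

    genotype_likelihoods: list of (L0, L1, L2), where Lg is the likelihood
    of an individual's reads given g copies of the alternate allele.
    A Hardy-Weinberg genotype prior is assumed.
    """
    eps = 1e-9  # keep p away from 0 and 1 so genotype priors never vanish
    p = start
    n = len(genotype_likelihoods)
    for _ in range(max_iter):
        expected_alt = 0.0
        for l0, l1, l2 in genotype_likelihoods:
            # E-step: posterior weight of each genotype for this individual.
            w0 = l0 * (1.0 - p) ** 2
            w1 = l1 * 2.0 * p * (1.0 - p)
            w2 = l2 * p ** 2
            total = w0 + w1 + w2
            expected_alt += (w1 + 2.0 * w2) / total
        # M-step: expected alternate-allele count over 2n chromosomes.
        new_p = min(max(expected_alt / (2.0 * n), eps), 1.0 - eps)
        if abs(new_p - p) < tol:
            return new_p
        p = new_p
    return p


# Hypothetical low-coverage likelihoods for four individuals.
example = [(0.90, 0.10, 0.001), (0.30, 0.60, 0.10), (0.05, 0.50, 0.45), (0.80, 0.19, 0.01)]
print(em_allele_frequency(example))
```

Each individual contributes a posterior-weighted allele count, so uncertain individuals are never forced into a single genotype; a genotype-calling approach would instead pick the most likely genotype and count its alleles.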

9.

Background

The estimation of individual ancestry from genetic data has become essential to applied population genetics and genetic epidemiology. Software programs for calculating ancestry estimates have become essential tools in the geneticist's analytic arsenal.

Results

Here we describe four enhancements to ADMIXTURE, a high-performance tool for estimating individual ancestries and population allele frequencies from SNP (single nucleotide polymorphism) data. First, ADMIXTURE can be used to estimate the number of underlying populations through cross-validation. Second, individuals of known ancestry can be exploited in supervised learning to yield more precise ancestry estimates. Third, by penalizing small admixture coefficients for each individual, one can encourage model parsimony, often yielding more interpretable results for small datasets or datasets with large numbers of ancestral populations. Finally, by exploiting multiple processors, large datasets can be analyzed even more rapidly.

Conclusions

The enhancements we have described make ADMIXTURE a more accurate, efficient, and versatile tool for ancestry estimation.

10.

Background

Analysis of microarray data has been used for the inference of gene-gene interactions. If, however, the aim is the discovery of disease-related biological mechanisms, then the criterion for defining such interactions must be specifically linked to disease.

Results

Here we present a computational methodology that jointly analyzes two sets of microarray data, one in the presence and one in the absence of a disease, identifying gene pairs whose correlation with disease is due to cooperative, rather than independent, contributions of genes, using the recently developed information theoretic measure of synergy. High levels of synergy in gene pairs indicate possible membership of the two genes in a shared pathway and lead to a graphical representation of inferred gene-gene interactions associated with disease, in the form of a "synergy network." We apply this technique to a set of publicly available prostate cancer expression data and successfully validate our results, confirming that they cannot be due to pure chance and providing a biological explanation for gene pairs with exceptionally high synergy.

Conclusion

Thus, synergy networks provide a computational methodology helpful for deriving "disease interactomes" from biological data. When coupled with additional biological knowledge, they can also be helpful for deciphering biological mechanisms responsible for disease.
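
The abstract does not define the synergy measure. A common information theoretic formulation, assumed here, is the joint mutual information of a gene pair with the phenotype minus the sum of the two individual mutual informations. The sketch below estimates it from discretized data; the binarized expression values and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def synergy(gene1, gene2, phenotype):
    """I(G1, G2; C) - [I(G1; C) + I(G2; C)] for discretized expression levels."""
    pair = list(zip(gene1, gene2))
    return (
        mutual_information(pair, phenotype)
        - mutual_information(gene1, phenotype)
        - mutual_information(gene2, phenotype)
    )


# Toy case: the phenotype is the XOR of two binarized genes, so neither gene
# is informative alone but the pair determines the phenotype completely.
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
disease = [0, 1, 1, 0]
print(synergy(g1, g2, disease))
```

In the XOR toy case all of the information about the phenotype is cooperative. Real expression data would first need a discretization step, which this sketch leaves out.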

11.

Background

This article describes classical and Bayesian interval estimation of genetic susceptibility based on random samples with pre-specified numbers of unrelated cases and controls.

Results

Frequencies of genotypes in cases and controls can be estimated directly from retrospective case-control data. On the other hand, genetic susceptibility, defined as the expected proportion of cases among individuals with a particular genotype, depends on the population proportion of cases (prevalence). Given this design, prevalence is an external parameter and hence the susceptibility cannot be estimated based on only the observed data. Interval estimation of susceptibility that can incorporate uncertainty in prevalence values is explored from both classical and Bayesian perspectives. Similarity between classical and Bayesian interval estimates in terms of frequentist coverage probabilities for this problem allows an appealing interpretation of classical intervals as bounds for genetic susceptibility. In addition, it is observed that both the asymptotic classical and Bayesian interval estimates have comparable average length. These interval estimates serve as a very good approximation to the "exact" (finite sample) Bayesian interval estimates. Extension from genotypic to allelic susceptibility intervals shows dependency on phenotype-induced deviations from Hardy-Weinberg equilibrium.

Conclusions

The suggested classical and Bayesian interval estimates appear to perform reasonably well. Generally, the use of the exact Bayesian interval estimation method is recommended for genetic susceptibility; however, the asymptotic classical and approximate Bayesian methods are adequate for sample sizes of at least 50 cases and controls.
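
The dependence of susceptibility on prevalence follows from Bayes' rule: P(case | genotype) = P(genotype | case) K / [P(genotype | case) K + P(genotype | control) (1 - K)], where K is the prevalence. The sketch below evaluates this point estimate over a range of assumed prevalence values. It does not implement the interval estimators described in the abstract, and the genotype frequencies and prevalence values are hypothetical.

```python
def susceptibility(freq_in_cases, freq_in_controls, prevalence):
    """P(case | genotype) by Bayes' rule from genotype frequencies
    estimated in cases and in controls, plus an external prevalence."""
    numerator = freq_in_cases * prevalence
    denominator = numerator + freq_in_controls * (1.0 - prevalence)
    return numerator / denominator


# Hypothetical genotype frequencies: 20% of cases, 10% of controls.
for k in (0.01, 0.02, 0.05):
    print(f"prevalence {k:.2f}: susceptibility {susceptibility(0.20, 0.10, k):.4f}")
```

Because the same case-control frequencies map to very different susceptibilities as the prevalence changes, any interval for susceptibility has to carry the uncertainty in prevalence as well.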

12.

AIM:

The distribution of HLA class I and II alleles and haplotypes was studied in the Pakistani population and compared with data reported for Caucasoid, African, Oriental and Arab populations.

MATERIALS AND METHODS:

HLA class I and II polymorphisms in 1000 unrelated Pakistani individuals were studied using a polymerase chain reaction assay with sequence-specific primers.

RESULTS:

The most frequent class I alleles observed were A*02, B*35 and Cw*07, with frequencies of 19.2, 13.7 and 20%, respectively. Fifteen distinct HLA-DRB1 alleles and eight HLA-DQB1 alleles were recognized. The most frequently observed DRB1 alleles, which together represented more than 60% of the subjects, were DRB1*03, *07, *11 and *15. The rare DRB1 alleles detected in this study were HLA-DRB1*08 and *09, with frequencies of 0.9 and 1.7%, respectively. In addition, at the DRB1-DQB1 loci there were 179 different haplotypes and 285 unique genotypes; the most common haplotype was DRB1*15-DQB1*06, which represented 17% of the total DRB1-DQB1 haplotypes. In our population, haplotype A*33-B*58-Cw*03 comprised 2.8% of the total class I haplotypes observed. This haplotype has been seen only in Oriental populations and has not been reported in African or European Caucasoid populations.

CONCLUSION:

Our study showed a close similarity of HLA class I and II alleles with those of European Caucasoids and Orientals. In the Pakistani population, two rare loci and three haplotypes were identified, whereas haplotypes characteristic of Caucasians, Africans and Orientals were also found, suggesting an admixture of different races due to migration to and from this region.

13.

Background

DNA copy number alterations are one of the main characteristics of the cancer cell karyotype and can contribute to the complex phenotype of these cells. These alterations can lead to gains in cellular oncogenes as well as losses in tumor suppressor genes and can span small intervals as well as involve entire chromosomes. The ability to accurately detect these changes is central to understanding how they impact the biology of the cell.

Results

We describe a novel algorithm called CARAT (Copy Number Analysis with Regression And Tree) that uses probe intensity information to infer copy number in an allele-specific manner from high density DNA oligonucleotide arrays designed to genotype over 100,000 SNPs. Total and allele-specific copy number estimations using CARAT are independently evaluated for a subset of SNPs using quantitative PCR and allelic TaqMan reactions with several human breast cancer cell lines. The sensitivity and specificity of the algorithm are characterized using DNA samples containing differing numbers of X chromosomes as well as a test set of normal individuals. Results from the algorithm show a high degree of agreement with results from independent verification methods.

Conclusion

Overall, CARAT automatically detects regions with copy number variations and assigns a significance score to each alteration as well as generating allele-specific output. When coupled with SNP genotype calls from the same array, CARAT provides additional detail into the structure of genome wide alterations that can contribute to allelic imbalance.

14.

Background

Surfactant proteins (SP) are important for innate host defence and essential for physiological lung function. Several linkage and association studies have investigated the genes coding for different surfactant proteins in the context of pulmonary diseases such as chronic obstructive pulmonary disease or respiratory distress syndrome of preterm infants. In this study we tested whether SP-B was associated with two further pulmonary diseases in children, i.e., severe infections caused by respiratory syncytial virus (RSV) and bronchial asthma.

Methods

We chose to study five polymorphisms in SP-B: rs2077079 in the promoter region; rs1130866 leading to the amino acid exchange T131I; rs2040349 in intron 8; rs3024801 leading to L176F and rs3024809 resulting in R272H. Statistical analyses made use of Armitage's trend test for single polymorphisms and FAMHAP and FASTEHPLUS for haplotype analyses.

Results

The polymorphisms rs3024801 and rs3024809 were not present in our study populations. The three other polymorphisms were common and in tight linkage disequilibrium with each other. They did not show association with bronchial asthma or severe RSV infection in the analyses of single polymorphisms. However, haplotype analyses revealed association of SP-B with severe RSV infection (p = 0.034).

Conclusion

Thus our results indicate a possible involvement of SP-B in the genetic predisposition to severe RSV infections in the German population. In order to determine which of the three polymorphisms constituting the haplotypes is responsible for the association, further case control studies on large populations are necessary. Furthermore, functional analyses need to be conducted.
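
For readers unfamiliar with the single-polymorphism test named in the Methods, the sketch below computes the Cochran-Armitage trend statistic for genotype counts in cases and controls. The additive scores (0, 1, 2), the variance form using N instead of N - 1, and the genotype counts are assumptions; the counts are invented and are not taken from the study.

```python
import math


def armitage_trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for ordered genotype categories.

    cases, controls: counts per genotype (e.g. AA, Aa, aa).
    Returns the 1-df chi-square statistic and its p-value.
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    r = sum(cases)
    sum_rt = sum(c * t for c, t in zip(cases, scores))
    sum_nt = sum(m * t for m, t in zip(totals, scores))
    sum_nt2 = sum(m * t * t for m, t in zip(totals, scores))
    numerator = n * (n * sum_rt - r * sum_nt) ** 2
    denominator = r * (n - r) * (n * sum_nt2 - sum_nt ** 2)
    chi2 = numerator / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (AA, Aa, aa) in cases and controls.
chi2, p = armitage_trend_test([40, 45, 15], [55, 38, 7])
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

With only two non-empty genotype categories the statistic reduces to the ordinary Pearson chi-square for a 2 x 2 table, which is a convenient sanity check.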

15.

Background

Parkinson's disease (PD) is the most common neurodegenerative movement disorder, characterized clinically by resting tremor, bradykinesia, postural instability and rigidity. The prevalence of PD is approximately 2% of the population over 65 years of age, and 1.7 million PD patients (age ≥ 55 years) live in China. Recently, a common LRRK2 variant, Gly2385Arg, was reported in an ethnic Chinese PD population in Taiwan. We analyzed the frequency of this variant in our independent PD case-control population of Han Chinese from Taiwan.

Methods

305 patients and 176 genetically unrelated healthy controls were examined by neurologists and the diagnosis of PD was based on the published criteria. The region of interest was amplified with standard polymerase chain reaction (PCR). PCR fragments were then directly sequenced in both forward and reverse directions. Differences in genotype frequencies between groups were assessed by the χ² test, and χ² analysis was also used to test for Hardy-Weinberg equilibrium.

Results

Of the 305 patients screened, we identified 27 (9%) with the heterozygous G2385R variant. This mutation was found in only 1 (0.5%) of our healthy control samples (odds ratio = 16.99, 95% CI: 2.29 to 126.21, p = 0.0002). Sequencing of the entire open reading frame of LRRK2 in G2385R carriers revealed no other variants.

Conclusion

These data suggest that the G2385R variant contributes significantly to the etiology of PD in ethnic Han Chinese individuals. With consideration of the enormous and expanding aging Chinese population in mainland China and in Taiwan, this variant is probably the most common known genetic factor for PD worldwide.
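
The reported odds ratio and confidence interval can be checked from the carrier counts given in the Results (27 of 305 patients, 1 of 176 controls). The abstract does not say how the interval was computed; the sketch below assumes the standard log odds ratio (Woolf) interval with z = 1.96, and the function name is illustrative.

```python
import math


def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Odds ratio for carrier status with a Woolf (log-scale) confidence interval."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


or_value, lower, upper = odds_ratio_ci(27, 305, 1, 176)
print(f"OR = {or_value:.2f}, 95% CI: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the result agrees with the published figures to within rounding. The very wide interval reflects the single carrier among controls.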

16.

Background

The Global Programme to Eliminate Lymphatic Filariasis (GPELF) depends upon Mass Drug Administration (MDA) to interrupt transmission. Therefore, delimitation of transmission risk areas is an important step, and hence we attempted to define a geo-environmental risk model (GERM) for determining the areas of potential transmission of lymphatic filariasis.

Methods

A range of geo-environmental variables was selected and customized on a GIS platform to develop GERM for identifying the areas of filariasis transmission in terms of "risk" and "non-risk". The model was validated through a "ground truth study" following standard procedure, using GIS tools for sampling and the Immunochromatographic Test (ICT) for screening the individuals.

Results

A map for filariasis transmission was created and stratified into different spatial entities, "risk" and "non-risk", depending on the Filariasis Transmission Risk Index (FTRI). The model estimation corroborated well with the ground (observed) data.

Conclusion

The geo-environmental risk model developed on a GIS platform is useful for spatial delimitation on a macro scale.

17.
Knowledge of haplotype phase is valuable for many analysis methods in the study of disease, population, and evolutionary genetics. Considerable research effort has been devoted to the development of statistical and computational methods that infer haplotype phase from genotype data. Although a substantial number of such methods have been developed, they have focused principally on inference from unrelated individuals, and comparisons between methods have been rather limited. Here, we describe the extension of five leading algorithms for phase inference for handling father-mother-child trios. We performed a comprehensive assessment of the methods applied to both trios and to unrelated individuals, with a focus on genomic-scale problems, using both simulated data and data from the HapMap project. The most accurate algorithm was PHASE (v2.1). For this method, the percentages of genotypes whose phase was incorrectly inferred were 0.12%, 0.05%, and 0.16% for trios from simulated data, HapMap Centre d'Etude du Polymorphisme Humain (CEPH) trios, and HapMap Yoruban trios, respectively, and 5.2% and 5.9% for unrelated individuals in simulated data and the HapMap CEPH data, respectively. The other methods considered in this work had comparable but slightly worse error rates. The error rates for trios are similar to the levels of genotyping error and missing data expected. We thus conclude that all the methods considered will provide highly accurate estimates of haplotypes when applied to trio data sets. Running times differ substantially between methods. Although it is one of the slowest methods, PHASE (v2.1) was used to infer haplotypes for the 1 million-SNP HapMap data set. Finally, we evaluated methods of estimating the value of r² between a pair of SNPs and concluded that all methods estimated r² well when the estimated value was ≥ 0.8.
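
The r² statistic mentioned at the end of the abstract depends on haplotype frequencies, which is why phasing accuracy affects it. A minimal sketch of the standard definition follows, assuming two biallelic SNPs with alleles A/a and B/b; the function name and the example frequencies are hypothetical.

```python
def r_squared(p_ab, p_a, p_b):
    """Squared correlation between alleles at two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator == 0.0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denominator


# Hypothetical phased haplotype frequencies: AB = 0.45, A = 0.50, B = 0.50.
print(r_squared(0.45, 0.50, 0.50))
```

When haplotypes are inferred and not observed, p_ab is itself an estimate, so errors in phase inference propagate directly into D and hence into r².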

18.

Background

Frontotemporal lobar degeneration (FTLD) is a clinically, pathologically and genetically heterogeneous neurodegenerative disorder, often complicated by neurological signs such as motor neuron-related limb weakness, spasticity and paralysis, parkinsonism and gait disturbances. Linkage to chromosome 9p has been reported for pedigrees with FTLD and motor neuron disease (MND). The objective of this study was to identify the genetic locus in a multi-generational Australian family with FTLD-MND.

Methods

Clinical review and standard neuropathological analysis of brain sections from affected pedigree members. Genome-wide scan using microsatellite markers and single nucleotide polymorphism fine mapping. Examination of candidate genes by direct DNA sequencing.

Results

Neuropathological examination revealed cytoplasmic deposition of the TDP-43 protein in three affected individuals. Moreover, we identified a family member with clinical Alzheimer's disease and FTLD-Ubiquitin neuropathology. Genetic linkage and haplotype analyses defined a critical region between markers D9S169 and D9S1845 on chromosome 9p21. Screening of all candidate genes within this region did not reveal any novel genetic alterations that co-segregate with the disease haplotype, suggesting that one individual carrying a meiotic recombination may represent a phenocopy. Re-analysis of linkage data using the new affection status revealed a maximal two-point LOD score of 3.24 and a multipoint LOD score of 3.41 at marker D9S1817. This provides the highest reported LOD scores from a single FTLD-MND pedigree.

Conclusion

Our reported increase in the minimal disease region should inform other researchers that the chromosome 9 locus may be more telomeric than predicted by published recombination boundaries. Moreover, the existence of a family member with clinical Alzheimer's disease who shares the disease haplotype highlights the possibility that late-onset AD patients in the other linked pedigrees may be mis-classified as sporadic dementia cases.
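
As background for the LOD scores quoted above, the sketch below evaluates a two-point LOD score in the simplest textbook setting: phase-known, fully informative meioses, scored by counting recombinants. The pedigree analysis in the study is far more involved (unknown phase, penetrance models, multipoint calculations), so the function and the counts are purely illustrative assumptions.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """LOD score at recombination fraction theta for phase-known meioses:
    log10 of L(theta) / L(0.5)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    non_recombinants = meioses - recombinants
    if recombinants > 0 and theta == 0.0:
        # Any observed recombinant rules out complete linkage.
        return float("-inf")
    log_likelihood = 0.0
    if recombinants > 0:
        log_likelihood += recombinants * math.log10(theta)
    if non_recombinants > 0:
        log_likelihood += non_recombinants * math.log10(1.0 - theta)
    # Under no linkage every meiosis has probability 0.5 either way.
    return log_likelihood - meioses * math.log10(0.5)


# Hypothetical family: 1 recombinant among 12 informative meioses.
for theta in (0.01, 0.05, 0.10, 0.20):
    print(f"theta = {theta:.2f}: LOD = {two_point_lod(1, 12, theta):.2f}")
```

A LOD score above 3 is the conventional threshold for significant linkage, which is why reclassifying a single individual (as with the suspected phenocopy) can change the conclusion for a single pedigree.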

19.

Background

Imputation of missing genotypes and haplotype reconstruction are valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required.

Results

We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We compared the performance of ParaHaplo 3.0 on the Japanese in Tokyo, Japan, and Han Chinese in Beijing, China, samples in the HapMap dataset. A parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo.

Conclusions

ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. The need for faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.

20.

Introduction

Different variants of haplotype frequencies may lead to various frequencies of the same variants in individuals with drug resistance and disease susceptibility at the population level.

Materials and methods

In this study, the haplotype frequencies of 4 STR loci including the D8S1132, D8S1779, D8S514 and D8S1743, and 3 STR loci including D11S1304, D11S1998 and D11S934 were investigated in 563 individuals of four Iranian ethnic groups in the capital city of Iran, Tehran. One hundred thirty subjects had the metabolic syndrome. Haplotype frequencies of all markers were calculated.

Results

There were significant differences in the haplotype frequencies in short and long alleles between the metabolic affected subjects and controls. In addition, haplotype frequencies were significant in the four ethnic groups in both chromosomes 8 and 11.

Conclusion

Our findings show a relation between the short allele of D8S1743, in all related haplotypes, and metabolic syndrome. These findings may require more studies of some candidate genes, including the lipoprotein lipase gene, in this chromosomal region.
