Similar literature
 20 similar articles found
1.
SUMMARY: VeriScan is a software package for the analysis of DNA sequence polymorphisms at the whole genome scale. Among other features, the software (1) can conduct many population genetic analyses; (2) incorporates a multiresolution wavelet transform-based method that allows capturing relevant information from DNA polymorphism data; (3) facilitates the visualization of the results in the most commonly used genome browsers.

2.
The analysis of the structure of populations on the basis of genetic data is essential in population genetics. It is used, for instance, to study the evolution of species or to correct for population stratification in association studies. These genetic data, normally based on DNA polymorphisms, may contain irrelevant information that biases the inference of population structure. In this paper we adapt a recently proposed algorithm, named multistart EMA, to be used in the inference of population structure. This algorithm is able to deal with irrelevant information when obtaining the (probabilistic) population partition. Additionally, we present a marker selection test able to obtain the most relevant markers to retrieve that population partition. The proposed algorithm is compared with the widely used STRUCTURE software on the basis of the F_ST metric and the log-likelihood score. It is shown that the proposed algorithm improves the recovery of the population structure. Moreover, information about relevant markers obtained by the multistart EMA can be used to improve the results obtained by other methods, to correct for population stratification, or even to reduce the economic cost of sequencing new samples. The software presented in this paper is available online at http://www.sc.ehu.es/ccwbayes/members/guzman.

3.
We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills several analysis gaps in currently available computer programs in population genetics and molecular phylogenetics. Notably, (i) it calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be streamlined to MEGA and PHYLIP programs for further processing; (ii) it calculates genetic distances from STRUCTURE output and generates a MEGA file to reconstruct component trees; (iii) it provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm.

4.
Background: Epidemiological studies have identified potentially modifiable risks for colorectal cancer, including alcohol intake, diet and a sedentary lifestyle. Modelling these environmental factors alongside genetic risk is critical in obtaining accurate estimates of disease risk and improving our understanding of behavioural modifications. Methods: 14 independent single nucleotide polymorphisms identified through GWAS and reported on by the international consortium COGENT were used to model genetic disease risk at a population level. Six well validated environmental risks were selected for modelling together with the genetic risk factors (alcohol intake; smoking; exercise levels; BMI; fibre intake and consumption of red and processed meat). Through a simulation study using risk modelling software, we assessed the potential impact of behavioural modifications on disease risk. Results: Modelling the genetic data alone leads to 24% of the population being classified as reduced risk; 60% average risk; 10% elevated risk and 6% high risk for colorectal cancer. Adding alcohol consumption to the model reduced the elevated and high risk categories to 9% and 5% respectively. The simulation study suggests that a substantial proportion of individuals could reduce their disease risk profile by altering their behaviour, including reclassification of over 62% of heavy drinkers. Conclusion: Modelling lifestyle factors alongside genetic risk can provide useful strategies to select individuals for screening for colorectal cancer risk. Impact: Quantifying the impact of moderating behaviour, particularly related to alcohol intake and obesity levels, is beneficial for informing health campaigns and tailoring prevention strategies.

5.
The X chromosome is a singular source of information in population genetics, anthropological research and in forensic cases. Thus, many researchers have been interested in characterizing X chromosome markers in different populations. The Brazilian Genetic Database of Chromosome X (BGBX; Banco Genético Brasileiro do Cromossomo X) website is freely available in Portuguese and English versions and was developed with the main purpose of compiling all Brazilian population genetic data for X chromosome short tandem repeat (X-STR) markers published in scientific journals searchable via PubMed. Furthermore, this database presents other relevant information concerning X-STRs, such as genetic and physical locations, allele structure, nomenclature, mutation rates, primers described in the literature and likelihood ratio calculation. The entire scientific community is now encouraged to submit their X-STR population genetic data to this website, available at http://www.bgbx.com.br. Regarding future prospects of BGBX, the authors intend to expand the website with data and information on X-linked insertion–deletion polymorphisms.

6.
This paper outlines a PCR-based approach for population genetics that offers several advantages over conventional Southern blotting methods for revealing restriction-fragment-length polymorphisms (RFLPs) in nuclear DNA. Primers are constructed from clones isolated from a nuclear DNA library, and these primers subsequently are employed in in vitro syntheses of homologous regions. Amplified products are then screened directly for RFLPs by using gel-staining procedures. Population applications for this PCR-based approach, including potential strengths and weaknesses, are exemplified by two RFLP data sets generated to estimate (a) male-mediated gene flow in the green turtle (Chelonia mydas) and (b) geographic population genetic structure in the American oyster (Crassostrea virginica). Restriction assays of amplified products from 14 or 15 independent primer pairs in each species revealed polymorphisms at several loci that proved highly informative in the population genetic analyses. In general, the Mendelian polymorphisms produced by this PCR-based approach will provide useful genetic markers for population studies, particularly in situations where simpler and less expensive allozyme methods have failed, for whatever reason, to provide adequate information.

7.
We have developed a publicly accessible database (ALFRED, the ALlele FREquency Database) that catalogues allele frequency data for a wide range of population samples and DNA polymorphisms. This database is web-accessible through our laboratory (Kidd Lab) Web site: http://info.med.yale.edu/genetics/kkidd. ALFRED currently contains data on 60 populations and 156 genetic systems including single nucleotide polymorphisms (SNPs), short tandem repeat polymorphisms (STRPs), variable number of tandem repeats (VNTRs) and insertion-deletion polymorphisms. While data are not available for all population-DNA polymorphism combinations, over 2000 allele frequency tables have been entered. Our database is designed (i) to address our specific research requirements as well as broader scientific objectives; (ii) to allow researchers and interested educators to easily navigate and retrieve data of interest to them; and (iii) to integrate links to other related public databases such as dbSNP, GenBank and PubMed.

8.
Population-based genetic association studies, popularly known as case-control studies, have continued to be the most preferred method for deciphering the genetic basis of various complex diseases, even in the post-human genome sequencing era. However, interpopulation differences in allele, genotype, and haplotype frequencies and linkage disequilibrium patterns lead to inconsistent results in candidate gene association studies. Therefore, for any meaningful disease association study, knowledge of the normative genetic background of the baseline population is a prerequisite. In addition, such genetic variation data also provide a ready-made menu of allele frequencies and linkage disequilibrium patterns of various polymorphisms in specific candidate genes in a particular population, which is a useful reference for further genetic association studies. Such genetic variation data are lacking for the Indian population, which represents about one-sixth of the world's population. In the present study we report the allele, genotype, and haplotype frequencies, Hardy-Weinberg equilibrium status, and linkage disequilibrium patterns of 12 polymorphisms in six candidate genes from the renin-angiotensin-aldosterone system among Indians. Because of their different histories of origin, the Indian population is broadly divided into two subpopulations: North Indians (Caucasian Europeans) and South Indians (Dravidians). Considering this well-documented difference in gene pools, we present a comparative account of the normative genetic data of North Indian and South Indian populations with at least four individuals of urban and suburban origin from each of the representative states of northern and southern India.
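The Hardy-Weinberg equilibrium status mentioned in this abstract is commonly checked with a chi-square goodness-of-fit test at each biallelic locus. The following is a minimal sketch of that general technique, not the procedure used in the study; the function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for Hardy-Weinberg
    proportions at a biallelic locus with alleles A and B.

    Assumes both alleles are present (otherwise an expected count is zero).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)        # observed frequency of allele A
    q = 1.0 - p                            # observed frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts (AA, AB, BB), not taken from the study.
in_equilibrium = hwe_chi_square(25, 50, 25)   # counts match p^2, 2pq, q^2 exactly
no_heterozygotes = hwe_chi_square(50, 0, 50)  # strong heterozygote deficit
```

A statistic above 3.841 (the 5% critical value for one degree of freedom) would usually be read as a departure from Hardy-Weinberg proportions.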

9.

Background  

Population structure analysis is important to genetic association studies and evolutionary investigations. Parametric approaches, e.g. STRUCTURE and L-POP, usually assume Hardy-Weinberg equilibrium (HWE) and linkage equilibrium among loci in sample population individuals. However, the assumptions may not hold and allele frequency estimation may not be accurate in some data sets. The improved version of STRUCTURE (version 2.1) can incorporate linkage information among loci but is still sensitive to high background linkage disequilibrium. Nowadays, large-scale single nucleotide polymorphisms (SNPs) are becoming popular in genetic studies. Therefore, it is imperative to have software that makes full use of these genetic data to generate inference even when model assumptions do not hold or allele frequency estimation suffers from high variation.

10.
Temperature gradient capillary electrophoresis (TGCE) is a high-throughput method to detect segregating single nucleotide polymorphisms and InDel polymorphisms in genetic mapping populations. Existing software that analyzes TGCE data was, however, designed for mutation analysis rather than genetic mapping. Genetic recombinant analysis and mapping assistant (GRAMA) is a new tool that automates TGCE data analysis for the purpose of genetic mapping. Data from multiple TGCE runs are analyzed, integrated, and displayed in an intuitive visual format. GRAMA includes an algorithm to detect peaks in electropherograms and can automatically compare its peak calls with those produced by another software package. Consequently, GRAMA provides highly accurate results with a low false positive rate of 5.9% and an even lower false negative rate of 1.3%. Because of its accuracy and intuitive interface, GRAMA boosts user productivity more than twofold relative to previous manual methods of scoring TGCE data. GRAMA is written in Java and is freely available at .

11.
Inferring the demographic history of species and their populations is crucial to understand their contemporary distribution, abundance and adaptations. The high computational overhead of likelihood‐based inference approaches severely restricts their applicability to large data sets or complex models. In response to these restrictions, approximate Bayesian computation (ABC) methods have been developed to infer the demographic past of populations and species. Here, we present the results of an evaluation of the ABC‐based approach implemented in the popular software package diyabc using simulated data sets (mitochondrial DNA sequences, microsatellite genotypes and single nucleotide polymorphisms). We simulated population genetic data under five different simple, single‐population models to assess the model recovery rates as well as the bias and error of the parameter estimates. The ability of diyabc to recover the correct model was relatively low (0.49): 0.6 for the simplest models and 0.3 for the more complex models. The recovery rate improved significantly when reducing the number of candidate models from five to three (from 0.57 to 0.71). Among the parameters of interest, the effective population size was estimated at a higher accuracy compared to the timing of events. Increased amounts of genetic data did not significantly improve the accuracy of the parameter estimates. Some gains in accuracy and decreases in error were observed for scaled parameters (e.g., Neμ) compared to unscaled parameters (e.g., Ne and μ). We conclude that diyabc‐based assessments are not suited to capture a detailed demographic history, but might be efficient at capturing simple, major demographic changes.
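The core idea of approximate Bayesian computation can be illustrated with a toy rejection sampler: draw a parameter from the prior, simulate data, and keep the draw only if a summary statistic of the simulated data is close to the observed one. The sketch below estimates a single allele frequency and is not the diyabc implementation, which simulates full demographic models; the function name, the uniform prior, the tolerance and the observed counts are all assumptions chosen for illustration.

```python
import random


def abc_rejection(k_obs, n, n_draws=5000, tol=2, seed=0):
    """Toy ABC rejection sampler for an allele frequency p.

    k_obs copies of an allele were observed among n sampled chromosomes.
    Returns the accepted draws, an approximate posterior sample for p.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.random()                                  # prior: Uniform(0, 1)
        k_sim = sum(rng.random() < p for _ in range(n))   # simulate n chromosomes under p
        if abs(k_sim - k_obs) <= tol:                     # compare summary statistics
            accepted.append(p)
    return accepted


# Hypothetical observation: 30 copies of the allele among 100 chromosomes.
posterior = abc_rejection(k_obs=30, n=100)
posterior_mean = sum(posterior) / len(posterior)
```

A smaller tolerance gives a closer approximation to the true posterior at the cost of fewer accepted draws, which is one reason ABC remains expensive for complex models.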

12.
ABSTRACT: BACKGROUND: Extensive genetic diversity in viral populations within infected hosts and the divergence of variants from existing reference genomes impede the analysis of deep viral sequencing data. A de novo population consensus assembly is valuable both as a single linear representation of the population and as a backbone on which intra-host variants can be accurately mapped. The availability of consensus assemblies and robustly mapped variants is crucial to the genetic study of viral disease progression, transmission dynamics, and viral evolution. Existing de novo assembly techniques fail to robustly assemble ultra-deep sequence data from genetically heterogeneous populations such as viruses into full-length genomes due to the presence of extensive genetic variability, contaminants, and variable sequence coverage. RESULTS: We present VICUNA, a de novo assembly algorithm suitable for generating consensus assemblies from genetically heterogeneous populations. We demonstrate its effectiveness on Dengue, Human Immunodeficiency and West Nile viral populations, representing a range of intra-host diversity. Compared to state-of-the-art assemblers designed for haploid or diploid systems, VICUNA recovers full-length consensus and captures insertion/deletion polymorphisms in diverse samples. Final assemblies maintain a high base calling accuracy. The VICUNA program is publicly available at: http://www.broadinstitute.org/scientific-community/science/projects/viral-genomics/viral-genomics-analysis-software CONCLUSIONS: We developed VICUNA, a publicly available software tool that enables consensus assembly of ultra-deep sequence derived from diverse viral populations. While VICUNA was developed for the analysis of viral populations, its application to other heterogeneous sequence data sets such as metagenomic or tumor cell population samples may prove beneficial in these fields of research.

13.
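Objective: To study the distribution of methylenetetrahydrofolate reductase (MTHFR) gene polymorphisms in the Uyghur and Han ethnic groups of Xinjiang, and to obtain population genetic data for MTHFR position 1298 in the Xinjiang Uyghur and Han populations. Methods: PCR-RFLP was used to determine allele and genotype frequencies at the MTHFR 1298 polymorphic site in Xinjiang Uyghur and Han individuals. Results: The frequency of the MTHFR 1298 C allele was 12% in Xinjiang Uyghurs and 23% in Han, a statistically significant difference (P<0.05); the C allele frequency in Xinjiang Uyghurs also differed significantly from frequencies previously reported for the Miao and Bouyei ethnic minorities of Guizhou. Conclusion: The MTHFR 1298 polymorphism differs among ethnic groups: it differs between Xinjiang Uyghurs and Han, and the C allele frequency in Xinjiang Uyghurs differs from that in the Miao and Bouyei minorities of Guizhou.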

14.
Genomics, 2020, 112(6): 3837-3845
The genetic polymorphisms of diallelic deletion/insertion polymorphic (DIP) loci in the Shaanxi Han population are still not clearly characterized. Herein, allele frequencies and forensic application efficiencies for 30 diallelic DIP loci were investigated in 506 unrelated healthy Han individuals from Shaanxi province, China. Based on population data for the same 30 diallelic DIP loci, the genetic differentiations, hierarchical clustering relationships and population architectures among Shaanxi Han and 50 other populations were further dissected through genetic and bioinformatics analyses. Results indicated that most of the 30 diallelic DIP loci were relatively highly polymorphic in the Shaanxi Han population, and that the Shaanxi Han were genetically close to the East Asian populations. In summary, this study provided significant insights into the genetic background of the Shaanxi Han population, and the multiplex amplification of these 30 diallelic DIP loci was appropriate for forensic individual identification and population genetic research in the Shaanxi Han population.

15.
According to classical genetic theory, allelic genes at one locus are expected to segregate and be manifested independently of allelic genes at another locus. At the population level any significant deviation from this general hypothesis resulting from specific biologic and genetic effects can be recognized in the form of nonrandom associations between genetic markers. The present data, consisting of 24 genetic polymorphisms determined from a sample of 998 unselected and unrelated South African blacks, offer an opportunity to test whether or not any such nonrandom associations exist between the genetic markers. After appropriate statistical calculations on the population data, we found that 13 pairs of genetic polymorphisms demonstrate a statistically significant nonrandom association. Because the results cannot be explained in terms of known biologic mechanisms, we conclude that the associations observed could be due to random statistical effects (repeated application of the chi-square test) and/or to real (as yet unknown) biologic phenomena in the population studied. This tentative conclusion can serve as a guideline for more specific investigations.
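The remark about repeated application of the chi-square test can be made concrete with a short calculation. The sketch below counts the pairwise tests implied by 24 polymorphisms and the number of significant results expected by chance alone; the 0.05 significance level is an assumption (the abstract does not state it), and the calculation treats the tests as if they were independent, which is only approximately true.

```python
from math import comb


def expected_false_positives(n_markers, alpha):
    """Return (number of marker pairs, expected chance-significant pairs)."""
    n_pairs = comb(n_markers, 2)          # every unordered pair is tested once
    return n_pairs, alpha * n_pairs


N_MARKERS = 24              # from the abstract
OBSERVED_SIGNIFICANT = 13   # from the abstract
ALPHA = 0.05                # ASSUMPTION: per-test significance level

n_pairs, expected = expected_false_positives(N_MARKERS, ALPHA)
print(f"{n_pairs} pairs tested, about {expected:.1f} expected significant by chance")
```

Under these assumptions, 276 pairs are tested and about 13.8 would be expected to reach significance by chance, which is close to the 13 pairs observed and consistent with the authors' caution.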

16.
Understanding the factors that maintain genetic variation in natural populations is a foundational goal of evolutionary biology. To this end, population geneticists have developed a variety of models that can produce stable polymorphisms. In one of the earliest models, Owen (1953) demonstrated that differences in selection pressures acting on males and females could maintain multiple alleles of a gene at a stable equilibrium. If the selection pressures act in opposite directions in males and females, we refer to this as (inter‐)sexual conflict or sexual antagonism (Arnqvist & Rowe, 2005). Testing if sexual conflict maintains genetic variation in natural populations is a tremendous challenge: it requires both identifying loci that harbor sexually antagonistic alleles and determining whether those alleles are maintained as stable polymorphisms (Mank, 2017). Doing so genome‐wide is even harder because it is not tractable to identify sexually antagonistic alleles and test for stable polymorphisms at all loci. Dutoit et al. (2018) confront this challenge in a paper published in this issue of Molecular Ecology. Using gene expression and population genomic data from the collared flycatcher, Dutoit et al. (2018) identify associations and correlations between genomic signatures of balanced polymorphisms and sexual conflict.

17.
The role that balancing selection plays in the maintenance of genetic diversity remains unresolved. Here, we introduce a new test, based on the McDonald–Kreitman test, in which the number of polymorphisms that are shared between populations is contrasted to those that are private at selected and neutral sites. We show that this simple test is robust to a variety of demographic changes, and that it can also give a direct estimate of the number of shared polymorphisms that are directly maintained by balancing selection. We apply our method to population genomic data from humans and provide some evidence that hundreds of nonsynonymous polymorphisms are subject to balancing selection.

What maintains genetic variation remains an unresolved mystery. This study describes the development of a new test and its application to human population genomic data, suggesting that natural selection may have a much more important role than previously thought, with hundreds of non-synonymous polymorphisms subject to balancing selection.
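The contrast described in this abstract (shared versus private polymorphisms at selected versus neutral sites) has the shape of a 2x2 table, as in the McDonald–Kreitman test. The sketch below only illustrates that table layout by analogy and is not the authors' statistic: the odds ratio, the naive "excess" estimate, the function name and all counts are assumptions.

```python
def shared_private_contrast(shared_sel, private_sel, shared_neu, private_neu):
    """Contrast shared/private polymorphism counts at selected vs neutral sites.

    Returns (odds_ratio, excess_shared), where excess_shared is the number of
    shared selected polymorphisms beyond what the neutral shared:private ratio
    would predict. Both quantities are illustrative, by analogy with the
    McDonald-Kreitman 2x2 table.
    """
    neutral_ratio = shared_neu / private_neu
    selected_ratio = shared_sel / private_sel
    odds_ratio = selected_ratio / neutral_ratio
    expected_shared_sel = private_sel * neutral_ratio
    return odds_ratio, shared_sel - expected_shared_sel


# Hypothetical counts, not taken from the study.
odds_ratio, excess = shared_private_contrast(
    shared_sel=120, private_sel=1000, shared_neu=200, private_neu=2000
)
```

With these made-up counts the odds ratio is 1.2 and the excess is 20 shared polymorphisms; an odds ratio above 1 is the kind of pattern that balancing selection could produce, although demography and other forces would need to be ruled out.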

18.
The analysis of genetic marker data is increasingly being conducted in the context of the spatial arrangement of strata (e.g. populations), necessitating a more flexible set of analysis tools. GeneticStudio consists of four interacting programs: (i) Geno, a spreadsheet-like interface for the analysis of spatially explicit marker-based genetic variation; (ii) Graph, software for the analysis of Population Graph and network topologies; (iii) Manteller, a general-purpose matrix analysis program; and (iv) SNPFinder, a program for identifying single nucleotide polymorphisms. The GeneticStudio suite is available as source code as well as binaries for OSX and Windows and is distributed under the GNU General Public License.

19.
The power of genome-wide association studies (GWAS) rests on several foundations: (i) there is a significant amount of additive genetic variation, (ii) individual causal polymorphisms often have sizable effects and (iii) they segregate at moderate-to-intermediate frequencies, or will be effectively 'tagged' by polymorphisms that do. Each of these assumptions has recently been questioned. (i) Why should genetic variation appear additive given that the underlying molecular networks are highly nonlinear? (ii) A new generation of relatedness-based analyses directs us back to the nearly infinitesimal model for effect sizes that quantitative genetics was long based upon. (iii) Larger effect causal polymorphisms are often low frequency, as selection might lead us to expect. Here, we review these issues and other findings that appear to question many of the foundations of the optimism GWAS prompted. We then present a roadmap emerging as one possible future for quantitative genetics. We argue that, in future, GWAS should move beyond purely statistical grounds. One promising approach is to build upon the combination of population genetic models and molecular biological knowledge. This combined treatment, however, requires fitting experimental data to models that are very complex, as well as accurately capturing the uncertainty of the resulting inference. This problem can be resolved through Bayesian analysis and tools such as approximate Bayesian computation, a method growing in popularity in population genetic analysis. We show a case example of anterior–posterior segmentation in Drosophila, and argue that similar approaches will be helpful as a GWAS augmentation, in human and agricultural research.

20.
MOTIVATION: Polymorphisms in human genes are being described in remarkable numbers. Determining which polymorphisms and which environmental factors are associated with common, complex diseases has become a daunting task. This is partly because the effect of any single genetic variation will likely be dependent on other genetic variations (gene-gene interaction or epistasis) and environmental factors (gene-environment interaction). Detecting and characterizing interactions among multiple factors is both a statistical and a computational challenge. To address this problem, we have developed a multifactor dimensionality reduction (MDR) method for collapsing high-dimensional genetic data into a single dimension thus permitting interactions to be detected in relatively small sample sizes. In this paper, we describe the MDR approach and an MDR software package. RESULTS: We developed a program that integrates MDR with a cross-validation strategy for estimating the classification and prediction error of multifactor models. The software can be used to analyze interactions among 2-15 genetic and/or environmental factors. The dataset may contain up to 500 total variables and a maximum of 4000 study subjects. AVAILABILITY: Information on obtaining the executable code, example data, example analysis, and documentation is available upon request. SUPPLEMENTARY INFORMATION: All supplementary information can be found at http://phg.mc.vanderbilt.edu/Software/MDR.
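The collapsing step that gives MDR its name can be sketched as follows: each combination of genotypes across the chosen factors forms a cell, and each cell is labelled high or low risk according to its ratio of cases to controls, so that many dimensions reduce to one binary attribute. This is a minimal sketch of that idea only, not the MDR software package; it omits the cross-validation the abstract describes, and the function names, the threshold of 1.0 and the toy data are assumptions.

```python
from collections import Counter


def mdr_collapse(genotypes, labels, factors, threshold=1.0):
    """Label each multifactor genotype cell 'high' or 'low' risk.

    genotypes: list of dicts mapping factor name -> genotype code
    labels:    list of 1 (case) or 0 (control), in the same order
    factors:   tuple of factor names to combine
    """
    cases, controls = Counter(), Counter()
    for g, y in zip(genotypes, labels):
        cell = tuple(g[f] for f in factors)
        (cases if y == 1 else controls)[cell] += 1
    risk = {}
    for cell in set(cases) | set(controls):
        # A cell that contains cases but no controls is treated as high risk.
        ratio = cases[cell] / controls[cell] if controls[cell] else float("inf")
        risk[cell] = "high" if ratio >= threshold else "low"
    return risk


def classification_error(genotypes, labels, factors, risk):
    """Fraction of subjects whose case/control status the risk labels get wrong."""
    wrong = 0
    for g, y in zip(genotypes, labels):
        predicted = 1 if risk.get(tuple(g[f] for f in factors)) == "high" else 0
        wrong += predicted != y
    return wrong / len(labels)


# Toy data with a purely interactive (XOR-like) effect of two hypothetical SNPs.
genotypes = [
    {"snp1": 0, "snp2": 0}, {"snp1": 0, "snp2": 1},
    {"snp1": 1, "snp2": 0}, {"snp1": 1, "snp2": 1},
]
labels = [0, 1, 1, 0]
risk = mdr_collapse(genotypes, labels, ("snp1", "snp2"))
error = classification_error(genotypes, labels, ("snp1", "snp2"), risk)
```

In this toy data neither SNP is informative on its own, but the two-factor model separates cases from controls, which is the kind of interaction the method is meant to expose.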
