Found 20 similar articles; search took 15 ms
1.
Matthew R. L. Egyud Zofia K. Z. Gajdos Johannah L. Butler Sam Tischfield Loic Le Marchand Laurence N. Kolonel Christopher A. Haiman Brian E. Henderson Joel N. Hirschhorn 《Human genetics》2009,125(3):295-303
Many association methods use a subset of genotyped single nucleotide polymorphisms (SNPs) to capture or infer genotypes at
other untyped SNPs. We and others previously showed that tag SNPs selected to capture common variation using data from The
International HapMap Consortium (Nature 437:1299–1320, 2005), The International HapMap Consortium (Nature 449:851–861, 2007) could also capture variation in populations of similar ancestry to HapMap reference populations (de Bakker et al. in Nat
Genet 38:1298–1303, 2006; González-Neira et al. in Genome Res 16:323–330, 2006; Montpetit et al. in PLoS Genet 2:282–290, 2006; Mueller et al. in Am J Hum Genet 76:387–398, 2005). To capture variation in admixed populations or populations less similar to HapMap panels, a “cosmopolitan approach,” in
which all samples from HapMap are used as a single reference panel, was proposed. Here we refine this suggestion and show
that use of a “weighted reference panel,” constructed based on empirical estimates of ancestry in the target population (relative
to available reference panels), is more efficient than the cosmopolitan approach. Weighted reference panels capture, on average,
only slightly fewer common variants (minor allele frequency > 5%) than the cosmopolitan approach (mean r² = 0.977 vs. 0.989;
94.5% vs. 96.8% of variation captured at r² > 0.8) across the five populations of the Multiethnic Cohort, but entail approximately 25% fewer tag SNPs per panel (average
538 vs. 718). These results extend a recent study in two Indian populations (Pemberton et al. in Ann Hum Genet 72:535–546,
2008). Weighted reference panels are potentially useful for both the selection of tag SNPs in diverse populations and perhaps
in the design of reference panels for imputation of untyped genotypes in genome-wide association studies in admixed populations.
Electronic supplementary material: The online version of this article (doi:) contains supplementary material, which is available to authorized users.
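The abstract above reports tagging performance as the mean r² and as the share of variants captured at r² > 0.8. A minimal sketch of how those two quantities are commonly computed from phased haplotypes follows. The function names, the 0/1 haplotype coding, and the toy data are assumptions made for illustration and are not taken from the paper, which does not describe its software here.

```python
def r_squared(hap_a, hap_b):
    """Pairwise r^2 between two biallelic SNPs.

    hap_a and hap_b are equal-length lists of alleles coded 0/1,
    one entry per phased haplotype (an assumed input format).
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n                                   # allele-1 frequency at SNP A
    p_b = sum(hap_b) / n                                   # allele-1 frequency at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic SNP has undefined r^2; treated here as uncaptured
    d = p_ab - p_a * p_b                                   # linkage disequilibrium coefficient D
    return d * d / denom


def fraction_captured(haplotypes, tags, threshold=0.8):
    """Share of SNPs whose best r^2 with any tag SNP exceeds the threshold.

    haplotypes maps a SNP name to its 0/1 haplotype list; tags is a
    non-empty collection of SNP names. Tag SNPs count as captured,
    since each has r^2 = 1 with itself.
    """
    captured = 0
    for hap in haplotypes.values():
        best = max(r_squared(hap, haplotypes[t]) for t in tags)
        if best > threshold:
            captured += 1
    return captured / len(haplotypes)


# Hypothetical toy panel of eight haplotypes at three SNPs.
panel = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to snp1, so perfectly tagged by it
    "snp3": [0, 1, 0, 1, 0, 1, 0, 1],  # only weakly correlated with snp1
}
print(fraction_captured(panel, tags=["snp1"]))
```

In a weighted-reference-panel setting, the haplotype lists would be drawn from the reference populations in proportion to the estimated ancestry of the target population; that sampling step is not shown here.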
2.
Background
DNA sequencing is used ubiquitously: from deciphering genomes [1] to determining the primary sequence of small RNAs (smRNAs) [2–5]. The cloning of smRNAs is currently the most conventional method to determine the actual sequence of these important regulators of gene expression. Typical smRNA cloning projects involve the sequencing of hundreds to thousands of smRNA clones that are delimited at their 5' and 3' ends by fixed sequence regions. These primers result from the biochemical protocol used to isolate and convert the smRNA into clonable PCR products. Recently we completed a smRNA cloning project involving tobacco plants, where analysis was required for ~700 smRNA sequences [6]. Finding no easily accessible research tool to enter and analyze smRNA sequences, we developed Ebbie to assist us with our study.
3.
Emily T. Norris Lu Wang Andrew B. Conley Lavanya Rishishwar Leonardo Mariño-Ramírez Augusto Valderrama-Aguirre I. King Jordan 《BMC genomics》2018,19(8):861
Background
Modern Latin American populations were formed via genetic admixture among ancestral source populations from Africa, the Americas and Europe. We are interested in studying how combinations of genetic ancestry in admixed Latin American populations may impact genomic determinants of health and disease. For this study, we characterized the impact of ancestry and admixture on genetic variants that underlie health- and disease-related phenotypes in population genomic samples from Colombia, Mexico, Peru, and Puerto Rico.
Results
We analyzed a total of 347 admixed Latin American genomes along with 1102 putative ancestral source genomes from Africans, Europeans, and Native Americans. We characterized the genetic ancestry, relatedness, and admixture patterns for each of the admixed Latin American genomes, finding a spectrum of ancestry proportions within and between populations. We then identified single nucleotide polymorphisms (SNPs) with anomalous ancestry-enrichment patterns, i.e. SNPs that exist in any given Latin American population at a higher frequency than expected based on the population’s genetic ancestry profile. For this set of ancestry-enriched SNPs, we inspected their phenotypic impact on disease, metabolism, and the immune system. All four of the Latin American populations show ancestry-enrichment for a number of shared pathways, yielding evidence of similar selection pressures on these populations during their evolution. For example, all four populations show ancestry-enriched SNPs in multiple genes from immune system pathways, such as the cytokine receptor interaction, T cell receptor signaling, and antigen presentation pathways. We also found SNPs with excess African or European ancestry that are associated with ancestry-specific gene expression patterns and play crucial roles in the immune system and infectious disease responses. Genes from both the innate and adaptive immune system were found to be regulated by ancestry-enriched SNPs with population-specific regulatory effects.
Conclusions
Ancestry-enriched SNPs in Latin American populations have a substantial effect on health- and disease-related phenotypes. The concordant impact observed for the same phenotypes across populations points to a process of adaptive introgression, whereby ancestry-enriched SNPs with specific functional utility appear to have been retained in modern populations by virtue of their effects on health and fitness.
4.
Madoka Koyanagi Julie A Kerns Linda Chung Yan Zhang Scott Brown Tudor Moldoveanu Harmit S Malik Mark Bix 《BMC evolutionary biology》2010,10(1):223
Background
Interleukin-4 (IL4) is a secreted immunoregulatory cytokine critically involved in host protection from parasitic helminths [1]. Reasoning that helminths may have evolved mechanisms to antagonize IL4 to maximize their dispersal, we explored mammalian IL4 evolution.
5.
Background
It is commonly thought that large asexual populations evolve more rapidly than smaller ones, due to their increased rate of beneficial mutations. Less clear is how population size influences the level of fitness an asexual population can attain. Here, we simulate the evolution of bacteria in repeated serial passage experiments to explore how features such as fitness landscape ruggedness, the size of the mutational target under selection, and the mutation supply rate interact to affect the evolution of microbial populations of different sizes.
6.
QualitySNP: a pipeline for detecting single nucleotide polymorphisms and insertions/deletions in EST data from diploid and polyploid species (total citations: 2; self-citations: 0; citations by others: 2)
Jifeng Tang Ben Vosman Roeland E Voorrips C Gerard van der Linden Jack AM Leunissen 《BMC bioinformatics》2006,7(1):438
Background
Single nucleotide polymorphisms (SNPs) are important tools in studying complex genetic traits and genome evolution. Computational strategies for SNP discovery make use of the large number of sequences present in public databases (in most cases as expressed sequence tags (ESTs)) and are considered to be faster and more cost-effective than experimental procedures. A major challenge in computational SNP discovery is distinguishing allelic variation from sequence variation between paralogous sequences, in addition to recognizing sequencing errors. For the majority of the public EST sequences, trace or quality files are lacking, which makes detection of reliable SNPs even more difficult because it has to rely on sequence comparisons only.
7.
Sean Myles Dan Davison Jeffrey Barrett Mark Stoneking Nic Timpson 《BMC medical genomics》2008,1(1):1-10
Background
Recent genome-wide association (GWA) studies have provided compelling evidence of association between genetic variants and common complex diseases. These studies have made use of cases and controls almost exclusively from populations of European ancestry and little is known about the frequency of risk alleles in other populations. The present study addresses the transferability of disease associations across human populations by examining levels of population differentiation at disease-associated single nucleotide polymorphisms (SNPs).
Methods
We genotyped ~1000 individuals from 53 populations worldwide at 25 SNPs which show robust association with 6 complex human diseases (Crohn's disease, type 1 diabetes, type 2 diabetes, rheumatoid arthritis, coronary artery disease and obesity). Allele frequency differences between populations for these SNPs were measured using Fst. The Fst values for the disease-associated SNPs were compared to Fst values from 2750 random SNPs typed in the same set of individuals.
Results
On average, disease SNPs are not significantly more differentiated between populations than random SNPs in the genome. Risk allele frequencies, however, do show substantial variation across human populations and may contribute to differences in disease prevalence between populations. We demonstrate that, in some cases, risk allele frequency differences are unusually high compared to random SNPs and may be due to the action of local (i.e. geographically-restricted) positive natural selection. Moreover, some risk alleles were absent or fixed in a population, which implies that risk alleles identified in one population do not necessarily account for disease prevalence in all human populations.
Conclusion
Although differences in risk allele frequencies between human populations are not unusually large and are thus likely not due to positive local selection, there is substantial variation in risk allele frequencies between populations which may account for differences in disease prevalence between human populations.
8.
GongXin Yu 《BMC bioinformatics》2010,11(1):508
Background
Microevolution is the study of short-term changes of alleles within a population and their effects on the phenotype of organisms. The result of this below-species-level evolution is heterogeneity, where populations consist of subpopulations with a large number of structural variations. Heterogeneity analysis is thus essential to our understanding of how selective and neutral forces shape bacterial populations over a short period of time. The Solexa Genome Analyzer, a next-generation sequencing platform, allows millions of short sequencing reads to be obtained with great accuracy, making it possible to study the dynamics of the bacterial population at the whole genome level. The tool referred to as Gen Htr was developed for genome-wide heterogeneity analysis.
9.
Nikolaos Refenes Juliane Bolbrinker Georgios Tagaris Antonio Orlacchio Nikolaos Drakoulis Reinhold Kreutz 《BMC neurology》2009,9(1):26
Background
The extended tau haplotype (H1) that covers the entire human microtubule-associated protein tau (MAPT) gene has been implicated in Parkinson's disease (PD). Nevertheless, controversial results, such as two studies in Greek populations with opposite effects, have been reported. Therefore, we set out to determine whether the H1 haplotype and additional single nucleotide polymorphisms (SNPs) included in H1 are associated with PD in a sample of Greek patients.
10.
Background
Since single nucleotide polymorphisms (SNPs) are genetic variations which determine the difference between any two unrelated individuals, SNPs can be used to identify the correct source population of an individual. For efficient population identification with the HapMap genotype data, as few informative SNPs as possible are required from the original 4 million SNPs. Recently, Park et al. (2006) adopted the nearest shrunken centroid method to classify the three populations, i.e., Utah residents with ancestry from Northern and Western Europe (CEU), Yoruba in Ibadan, Nigeria in West Africa (YRI), and Han Chinese in Beijing together with Japanese in Tokyo (CHB+JPT), from which 100,736 SNPs were obtained and the top 82 SNPs could completely classify the three populations.
11.
Background
Recent studies have shown that the patterns of linkage disequilibrium observed in human populations have a block-like structure, and a small subset of SNPs (called tag SNPs) is sufficient to distinguish each pair of haplotype patterns in the block. In reality, some tag SNPs may be missing, and we may fail to distinguish two distinct haplotypes due to the ambiguity caused by missing data.
12.
13.
Diane M Martin Sébastien Aubourg Marina B Schouwey Laurent Daviet Michel Schalk Omid Toub Steven T Lund Jörg Bohlmann 《BMC plant biology》2010,10(1):226
Background
Terpenoids are among the most important constituents of grape flavour and wine bouquet, and serve as useful metabolite markers in viticulture and enology. Based on the initial 8-fold sequencing of a nearly homozygous Pinot noir inbred line, 89 putative terpenoid synthase genes (VvTPS) were predicted by in silico analysis of the grapevine (Vitis vinifera) genome assembly [1]. The finding of this very large VvTPS family, combined with the importance of terpenoid metabolism for the organoleptic properties of grapevine berries and finished wines, prompted a detailed examination of this gene family at the genomic level as well as an investigation into VvTPS biochemical functions.
14.
Liberles DA Schreiber DR Govindarajan S Chamberlin SG Benner SA 《Genome biology》2001,2(4):preprint00-18
Background
Developing an understanding of the molecular basis for the divergence of species lies at the heart of biology. The Adaptive Evolution Database (TAED) serves as a starting point to link events that occur at the same time in the evolutionary history (tree of life) of species, based upon coding sequence evolution analyzed with the Master Catalog. The Master Catalog is a collection of evolutionary models, including multiple sequence alignments, phylogenetic trees, and reconstructed ancestral sequences, for all independently evolving protein sequence modules encoded by genes in GenBank [1].
15.
Jeffrey T Foster Gerard J Allan Agnes P Chan Pablo D Rabinowicz Jacques Ravel Paul J Jackson Paul Keim 《BMC plant biology》2010,10(1):13
Background
Castor bean (Ricinus communis) is an agricultural crop and garden ornamental that is widely cultivated and has been introduced worldwide. Understanding population structure and the distribution of castor bean cultivars has been challenging because of limited genetic variability. We analyzed the population genetics of R. communis in a worldwide collection of plants from germplasm and from naturalized populations in Florida, U.S. To assess genetic diversity we conducted survey sequencing of the genomes of seven diverse cultivars and compared the data to a reference genome assembly of a widespread cultivar (Hale). We determined the population genetic structure of 676 samples using single nucleotide polymorphisms (SNPs) at 48 loci.
16.
17.
The Fast Changing Landscape of Sequencing Technologies and Their Impact on Microbial Genome Assemblies and Annotation (total citations: 1; self-citations: 0; citations by others: 1)
Konstantinos Mavromatis Miriam L. Land Thomas S. Brettin Daniel J. Quest Alex Copeland Alicia Clum Lynne Goodwin Tanja Woyke Alla Lapidus Hans Peter Klenk Robert W. Cottingham Nikos C. Kyrpides 《PloS one》2012,7(12)
Background
The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation.
Methodology/Principal Findings
In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although on average of high quality, need to be viewed critically, with their sources of error considered prior to analysis.
Conclusion
These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).
18.
Background
Single nucleotide polymorphisms (SNPs) provide an important tool in pinpointing susceptibility genes for complex diseases and in unveiling human molecular evolution. Selection and retrieval of an optimal SNP set from publicly available databases have emerged as the foremost bottlenecks in designing large-scale linkage disequilibrium studies, particularly in case-control settings.
19.
Background
Completed genomes and environmental genomic sequences are making a significant contribution to understanding the evolution of gene families, microbial metabolism and community eco-physiology. Here, we used comparative genomics and phylogenetic analyses in conjunction with enzymatic data to probe the evolution and functions of a microbial nitrilase gene family. Nitrilases are relatively rare in bacterial genomes, and their biological function is unclear.
20.
Tom M Conrad Andrew R Joyce M Kenyon Applebee Christian L Barrett Bin Xie Yuan Gao Bernhard Ø Palsson 《Genome biology》2009,10(10):R118-12