Found 20 similar articles (search time: 31 ms)
1.
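Screening and application of 30 ancestry informative markers (total citations: 3; self-citations: 0; citations by others: 3)
Abstract: Objective: To screen a set of ancestry informative SNP markers (AIMs, Ancestry Informative Markers) and build a multiplex assay for describing the genetic composition of East Asian, European, and African populations and for inferring the ethnic origin of individuals. Methods: Starting from genotype data for 658 samples from 9 populations in the HapMap database, 30 AIMs were selected from a total of 282 SNPs in 30 phenotype-related genes. A multiplex assay was built on minisequencing with universal-array technology, and a population allele-frequency database was established. The markers were first used to analyze the 658 HapMap population samples as a preliminary check of their discriminatory power; the assay was then used to test collected DNA samples from 194 unrelated individuals in 5 populations. Finally, population composition and individual genetic components were obtained with the Structure software, and the ethnic origin of each individual sample was inferred. Results: The 30 selected AIMs conformed to Hardy-Weinberg equilibrium (p > 0.01), and there was no linkage between loci (r² < 0.1). The ancestry-component results for the 658 HapMap samples and the 194 experimental samples agreed completely with the known results. Conclusion: The 30-AIM multiplex assay screened and established here can effectively analyze the population composition and individual genetic components of East Asian, European, African, and admixed populations, can effectively control the error caused by population stratification in genetic linkage analysis, and can also be used to infer individual ancestral origin in forensic DNA testing.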
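The two marker-quality filters quoted in the abstract (Hardy-Weinberg equilibrium with p > 0.01, and pairwise r² < 0.1) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, the equilibrium test is assumed to be a plain 1-d.f. chi-square test on genotype counts, and r² is assumed to be computed from known haplotype and allele frequencies.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 d.f.) for Hardy-Weinberg equilibrium at a biallelic SNP.

    Takes the three observed genotype counts and returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 d.f.
    return chi2, erfc(sqrt(chi2 / 2))


def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at locus 1 and
    allele B at locus 2; p_a and p_b are the two allele frequencies.
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def passes_filters(genotype_counts, r2_values, p_min=0.01, r2_max=0.1):
    """Apply the thresholds quoted in the abstract (assumed to be used this way)."""
    _, p_value = hwe_chi_square(*genotype_counts)
    return p_value > p_min and all(r2 < r2_max for r2 in r2_values)
```

For example, `passes_filters((25, 50, 25), [0.02, 0.05])` keeps a marker whose genotype counts match equilibrium expectations exactly and whose r² with every other marker is below 0.1.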
2.
3.
4.
Today biotechnology is perhaps the most important technology field because of its strong implications for health and food. However, owing to the nature of the technology, large investments are needed to sustain the costs of experimentation. Consequently, investors aim to safeguard their investments as much as possible. Intellectual property, and patents in particular, has been shown to be a powerful tool for doing so. Moreover, patents are an extremely important means of disclosing biotechnology inventions. Patentable biotechnology inventions include products such as nucleotide and amino acid sequences, microorganisms, processes or methods for modifying such products, uses for the manufacture of medicaments, etc. There are several ways to protect inventions, but all follow the three main patentability requirements: novelty, inventive step and industrial application.
5.
Chromosomes or other long DNA sequences contain many highly similar repeated sub-sequences. While there are efficient methods for detecting strict repeats or detecting already characterized repeats, there is no software available for detecting approximate repeats in large DNA sequences allowing for weighted substitutions and indels in a coherent statistical framework. Here, we present an implementation of a two-step method (seed detection followed by seed extension) that detects those approximate repeats. Our method is computationally efficient enough to handle large sequences and is flexible enough to account for influencing factors, such as sequence-composition biases both at the seed detection and alignment levels. AVAILABILITY: http://wwwabi.snv.jussieu.fr/public/RepSeek/
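The general seed-then-extend idea described in this abstract can be sketched in Python as follows. This is a toy illustration, not RepSeek's algorithm: seeds are assumed to be exact k-mers, the extension is ungapped (no indels), and the scoring values, the drop-off rule, and all function names are hypothetical choices made for the example.

```python
from collections import defaultdict


def find_seeds(seq, k):
    """Return position pairs (i, j), i < j, at which the same k-mer starts."""
    index = defaultdict(list)
    for pos in range(len(seq) - k + 1):
        index[seq[pos:pos + k]].append(pos)
    return [(i, j)
            for hits in index.values()
            for a, i in enumerate(hits)
            for j in hits[a + 1:]]


def extend_right(seq, i, j, k, match=1, mismatch=-2, x_drop=3):
    """Extend a seed to the right without gaps.

    Extension stops when the running score falls x_drop below the best score
    seen so far, when the sequence ends, or when the first copy would run
    into the second. Returns (length, score) of the best-scoring extension.
    """
    score = best = k * match
    length = best_len = k
    while j + length < len(seq) and i + length < j:
        score += match if seq[i + length] == seq[j + length] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x_drop:
            break
    return best_len, best


if __name__ == "__main__":
    sequence = "ACGTACGATT" + "CCCCC" + "ACGTACTATT"
    for i, j in find_seeds(sequence, 6):
        print(i, j, extend_right(sequence, i, j, 6))
```

In the demo sequence the two copies differ by one substitution, so the 6-mer seed is extended across the mismatch to cover both 10-base copies. A real tool would also extend to the left and allow weighted substitutions and indels, as the abstract describes.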
6.
Galanter JM Fernandez-Lopez JC Gignoux CR Barnholtz-Sloan J Fernandez-Rozadilla C Via M Hidalgo-Miranda A Contreras AV Figueroa LU Raska P Jimenez-Sanchez G Zolezzi IS Torres M Ponte CR Ruiz Y Salas A Nguyen E Eng C Borjas L Zabala W Barreto G González FR Ibarra A Taboada P Porras L Moreno F Bigham A Gutierrez G Brutsaert T León-Velarde F Moore LG Vargas E Cruz M Escobedo J Rodriguez-Santana J Rodriguez-Cintrón W Chapela R Ford JG Bustamante C Seminara D Shriver M Ziv E Burchard EG Haile R 《PLoS genetics》2012,8(3):e1002554
Most individuals throughout the Americas are admixed descendants of Native American, European, and African ancestors. Complex historical factors have resulted in varying proportions of ancestral contributions between individuals within and among ethnic groups. We developed a panel of 446 ancestry informative markers (AIMs) optimized to estimate ancestral proportions in individuals and populations throughout Latin America. We used genome-wide data from 953 individuals from diverse African, European, and Native American populations to select AIMs optimized for each of the three main continental populations that form the basis of modern Latin American populations. We selected markers on the basis of locus-specific branch length to be informative, well distributed throughout the genome, capable of being genotyped on widely available commercial platforms, and applicable throughout the Americas by minimizing within-continent heterogeneity. We then validated the panel in samples from four admixed populations by comparing ancestry estimates based on the AIMs panel to estimates based on genome-wide association study (GWAS) data. The panel provided balanced discriminatory power among the three ancestral populations and accurate estimates of individual ancestry proportions (R² > 0.9 for ancestral components with significant between-subject variance). Finally, we genotyped samples from 18 populations from Latin America using the AIMs panel and estimated variability in ancestry within and between these populations. This panel and its reference genotype information will be useful resources to explore population history of admixture in Latin America and to correct for the potential effects of population stratification in admixed samples in the region.
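Locus-specific branch length (LSBL), the selection criterion named in this abstract, apportions the pairwise genetic distances among three populations to each population's own branch. The Python sketch below assumes a simple heterozygosity-based F_ST computed from allele frequencies with equal weighting of the two populations; the abstract does not say which distance estimator the authors used, and the function names are hypothetical.

```python
def fst_pair(p1, p2):
    """Simple two-population F_ST at one biallelic locus: (H_T - H_S) / H_T."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0:
        return 0.0  # locus is fixed for the same allele in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    return (h_t - h_s) / h_t


def lsbl(p_a, p_b, p_c):
    """Locus-specific branch lengths for populations A, B and C.

    Each branch is half of (the two distances involving that population
    minus the distance between the other two populations).
    """
    d_ab = fst_pair(p_a, p_b)
    d_ac = fst_pair(p_a, p_c)
    d_bc = fst_pair(p_b, p_c)
    return (
        (d_ab + d_ac - d_bc) / 2,
        (d_ab + d_bc - d_ac) / 2,
        (d_ac + d_bc - d_ab) / 2,
    )
```

A marker fixed in population A and absent from B and C, `lsbl(1.0, 0.0, 0.0)`, puts the whole distance on A's branch, which is the kind of locus a panel optimized for A would favor.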
7.
With the influx of various SNP genotyping assays in recent years, there has been a need for an assay that is robust, yet cost effective, and could be performed using standard gel-based procedures. In this context, CAPS markers have been shown to meet these criteria. However, converting SNPs to CAPS markers can be a difficult process if done manually. In order to address this problem, we describe a computer program, SNP2CAPS, that facilitates the computational conversion of SNP markers into CAPS markers. 413 multiple aligned sequences derived from barley ESTs were analysed for the presence of polymorphisms in 235 distinct restriction sites. 282 (90%) of 314 alignments that contain sequence variation due to SNPs and InDels revealed at least one polymorphic restriction site. After reducing the number of restriction enzymes from 235 to 10, 31% of the polymorphic sites could still be detected. In order to demonstrate the usefulness of this tool for marker development, we experimentally validated some of the results predicted by SNP2CAPS.
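The core check behind converting a SNP into a CAPS marker is whether some restriction enzyme's recognition site is present in one allele and absent in the other. The Python sketch below illustrates that check only; it is not the SNP2CAPS implementation. The four enzymes are a small illustrative subset with their standard recognition sequences, and the function name is hypothetical.

```python
# Recognition sequences of four common restriction enzymes. All four sites are
# palindromic, so scanning one strand is enough for this illustration.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
}


def caps_candidates(allele_a, allele_b, enzymes=ENZYMES):
    """Return enzymes whose site count differs between two allele sequences.

    The result maps enzyme name -> (sites in allele A, sites in allele B).
    A difference means digestion would give distinguishable fragment patterns.
    """
    hits = {}
    for name, site in enzymes.items():
        count_a = allele_a.count(site)
        count_b = allele_b.count(site)
        if count_a != count_b:
            hits[name] = (count_a, count_b)
    return hits


if __name__ == "__main__":
    # A T/C SNP that destroys an EcoRI site in the second allele.
    print(caps_candidates("TTGAATTCAA", "TTGAATCCAA"))
```

Shrinking the `enzymes` dictionary mirrors the trade-off reported in the abstract: fewer enzymes are cheaper to stock, but fewer polymorphic sites remain detectable.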
8.
The completion of the human genome project, and other genome sequencing projects, has spearheaded the emergence of the field of bioinformatics. Using computer programs to analyse DNA and protein information has become an important area of life science research and development. While it is not necessary for most life science researchers to develop specialist bioinformatic skills (including software development), basic skills in the application of common bioinformatics software and the effective interpretation of results are increasingly required by all life science researchers. Training in bioinformatics is increasingly occurring within the university system as part of existing undergraduate science and specialist degrees. One difficulty in bioinformatics education is the sheer number of software programs required in order to provide a thorough grounding in the subject to the student. Teaching requires either a well-maintained internal server with all the required software, properly interfacing with student terminals, and with sufficient capacity to handle multiple simultaneous requests, or it requires the individual installation and maintenance of every piece of software on each computer. In both cases, there are difficult issues regarding site maintenance and accessibility. In this article, we discuss the use of BioManager, a web-based bioinformatics application integrating a variety of common bioinformatics tools, for teaching, including its role as the main bioinformatics training tool in some Australian and international universities. We discuss some of the issues with using a bioinformatics resource primarily created for research in an undergraduate teaching environment.
9.
Eberle MA Ng PC Kuhn K Zhou L Peiffer DA Galver L Viaud-Martinez KA Lawley CT Gunderson KL Shen R Murray SS 《PLoS genetics》2007,3(10):1827-1837
Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (λ ~ 1.8–2.0). Relative risks as low as λ ~ 1.1–1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power increases significantly to detect at least one locus such that relative risks 20%–35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.
10.
Silva MC Zuccherato LW Soares-Souza GB Vieira ZM Cabrera L Herrera P Balqui J Romero C Jahuira H Gilman RH Martins ML Tarazona-Santos E 《Genetics and molecular research : GMR》2010,9(4):2069-2085
Admixture occurs when individuals from parental populations that have been isolated for hundreds of generations form a new hybrid population. Currently, interest in measuring biogeographic ancestry has spread from anthropology to forensic sciences, direct-to-consumer personal genomics, and civil rights issues of minorities, and it is critical for genetic epidemiology studies of admixed populations. Markers with highly differentiated frequencies among human populations are informative of ancestry and are called ancestry informative markers (AIMs). For tri-hybrid Latin American populations, ancestry information is required for Africans, Europeans and Native Americans. We developed two multiplex panels of AIMs (for 14 SNPs) to be genotyped by two mini-sequencing reactions, suitable for investigators of medium-small laboratories to estimate admixture of Latin American populations. We tested the performance of these AIMs by comparing results obtained with our 14 AIMs with those obtained using 108 AIMs genotyped in the same individuals, for which DNA samples are available to other investigators. We emphasize that this type of comparison should be made when new admixture/population structure panels are developed. At the population level, our 14 AIMs were useful to estimate European admixture, though they overestimated African admixture and underestimated Native American admixture. Combined with more AIMs, our panel could be used to infer individual admixture. We used our panel to infer the pattern of admixture in two urban populations (Montes Claros and Manhuaçu) of the State of Minas Gerais (southeastern Brazil), obtaining a snapshot of their genetic structure in the context of their demographic history.
11.
Zhang J Wheeler DA Yakub I Wei S Sood R Rowe W Liu PP Gibbs RA Buetow KH 《PLoS computational biology》2005,1(5):e53
Identification of single nucleotide polymorphisms (SNPs) and mutations is important for the discovery of genetic predisposition to complex diseases. PCR resequencing is the method of choice for de novo SNP discovery. However, manual curation of putative SNPs has been a major bottleneck in the application of this method to high-throughput screening. Therefore it is critical to develop a more sensitive and accurate computational method for automated SNP detection. We developed a software tool, SNPdetector, for automated identification of SNPs and mutations in fluorescence-based resequencing reads. SNPdetector was designed to model the process of human visual inspection and has a very low false positive and false negative rate. We demonstrate the superior performance of SNPdetector in SNP and mutation analysis by comparing its results with those derived by human inspection, PolyPhred (a popular SNP detection tool), and independent genotype assays in three large-scale investigations. The first study identified and validated inter- and intra-subspecies variations in 4,650 traces of 25 inbred mouse strains that belong to either the Mus musculus species or the M. spretus species. Unexpected heterozygosity in CAST/Ei strain was observed in two out of 1,167 mouse SNPs. The second study identified 11,241 candidate SNPs in five ENCODE regions of the human genome covering 2.5 Mb of genomic sequence. Approximately 50% of the candidate SNPs were selected for experimental genotyping; the validation rate exceeded 95%. The third study detected ENU-induced mutations (at 0.04% allele frequency) in 64,896 traces of 1,236 zebra fish. Our analysis of three large and diverse test datasets demonstrated that SNPdetector is an effective tool for genome-scale research and for large-sample clinical studies. SNPdetector runs on Unix/Linux platform and is available publicly (http://lpg.nci.nih.gov).
12.
13.
Oinn T Addis M Ferris J Marvin D Senger M Greenwood M Carver T Glover K Pocock MR Wipat A Li P 《Bioinformatics (Oxford, England)》2004,20(17):3045-3054
MOTIVATION: In silico experiments in bioinformatics involve the co-ordinated use of computational tools and information repositories. A growing number of these resources are being made available with programmatic access in the form of Web services. Bioinformatics scientists will need to orchestrate these Web services in workflows as part of their analyses. RESULTS: The Taverna project has developed a tool for the composition and enactment of bioinformatics workflows for the life sciences community. The tool includes a workbench application which provides a graphical user interface for the composition of workflows. These workflows are written in a new language called the simple conceptual unified flow language (Scufl), whereby each step within a workflow represents one atomic task. Two examples are used to illustrate the ease with which in silico experiments can be represented as Scufl workflows using the workbench application.
14.
In recent years, virtual learning has grown rapidly. Universities, colleges, and secondary schools now deliver training and education over the internet. In addition, the resources available on the WWW are vast, and students find the various techniques employed in bioinformatics increasingly complex to understand when implementing them. Here, we discuss the importance of virtual learning in developing and delivering an educational system in bioinformatics based on an e-learning environment.
15.
Nameeta Shah Michael V Teplitsky Simon Minovitsky Len A Pennacchio Philip Hugenholtz Bernd Hamann Inna L Dubchak
Background
Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at [1].
16.
The use of AFLP to find an informative SNP: genetic differences across a migratory divide in willow warblers (total citations: 6; self-citations: 0; citations by others: 6)
We used the amplified fragment length polymorphism (AFLP) method to obtain genetic markers distinguishing two subspecies of willow warblers Phylloscopus trochilus that have different migratory behaviours but are not differentiated in mitochondrial DNA or at several microsatellite loci. With the inverse-polymerase chain reaction (PCR) approach we converted a dominant AFLP-marker to a codominant single nucleotide polymorphism (SNP). Across Scandinavia we typed 621 birds at the SNP locus AFLP-WW1 and we found a sigmoid change in allele frequencies centred around 62 degrees latitude. North of the latitudinal cline was a west-east cline. Both clines are narrower than one would expect from dispersal distances in willow warblers, which suggests that these are maintained by selection. The latitudinal cline at the locus AFLP-WW1 is paralleled by changes in several other traits, all of which might be maintained by a single selective force. The most plausible selection factor that we have identified is selection against hybrids because of inferior migratory behaviour. The selective force maintaining the east-west cline is less obvious. We discuss alternatives to the selection scenario, involving colonization history and asymmetric gene flow.
17.
Verena Ras Gerrit Botha Shaun Aron Katie Lennard Imane Allali Shantelle Claassen-Weitz Kilaza Samson Mwaikono Dane Kennedy Jessica R. Holmes Gloria Rendon Sumir Panji Christopher J Fields Nicola Mulder 《PLoS computational biology》2021,17(2)
With more microbiome studies being conducted by African-based research groups, there is an increasing demand for knowledge and skills in the design and analysis of microbiome studies and data. However, high-quality bioinformatics courses are often impeded by differences in computational environments, complicated software stacks, numerous dependencies, and versions of bioinformatics tools along with a lack of local computational infrastructure and expertise. To address this, H3ABioNet developed a 16S rRNA Microbiome Intermediate Bioinformatics Training course, extending its remote classroom model. The course was developed alongside experienced microbiome researchers, bioinformaticians, and systems administrators, who identified key topics to address. Development of containerised workflows has previously been undertaken by H3ABioNet, and Singularity containers were used here to enable the deployment of a standard replicable software stack across different hosting sites. The pilot ran successfully in 2019 across 23 sites registered in 11 African countries, with more than 200 participants formally enrolled and 106 volunteer staff for onsite support. The pulling, running, and testing of the containers, software, and analyses on various clusters were performed prior to the start of the course by hosting classrooms. The containers allowed the replication of analyses and results across all participating classrooms running a cluster and remained available posttraining ensuring analyses could be repeated on real data. Participants thus received the opportunity to analyse their own data, while local staff were trained and supported by experienced experts, increasing local capacity for ongoing research support. This provides a model for delivering topic-specific bioinformatics courses across Africa and other remote/low-resourced regions which overcomes barriers such as inadequate infrastructures, geographical distance, and access to expertise and educational materials.
18.
François-Olivier Desmet Dalil Hamroun Marine Lalande Gwenaëlle Collod-Béroud Mireille Claustres Christophe Béroud 《Nucleic acids research》2009,37(9):e67
Thousands of mutations are identified yearly. Although many directly affect protein expression, an increasing proportion of mutations is now believed to influence mRNA splicing. They mostly affect existing splice sites, but synonymous, non-synonymous or nonsense mutations can also create or disrupt splice sites or auxiliary cis-splicing sequences. To facilitate the analysis of the different mutations, we designed Human Splicing Finder (HSF), a tool to predict the effects of mutations on splicing signals or to identify splicing motifs in any human sequence. It contains all available matrices for auxiliary sequence prediction as well as new ones for binding sites of the 9G8 and Tra2-β Serine-Arginine proteins and the hnRNP A1 ribonucleoprotein. We also developed new Position Weight Matrices to assess the strength of 5′ and 3′ splice sites and branch points. We evaluated HSF efficiency using a set of 83 intronic and 35 exonic mutations known to result in splicing defects. We showed that the mutation effect was correctly predicted in almost all cases. HSF could thus represent a valuable resource for research, diagnostic and therapeutic (e.g. therapeutic exon skipping) purposes as well as for global studies, such as the GEN2PHEN European Project or the Human Variome Project.
19.
20.
Panitz F Stengaard H Hornshøj H Gorodkin J Hedegaard J Cirera S Thomsen B Madsen LB Høj A Vingborg RK Zahn B Wang X Wang X Wernersson R Jørgensen CB Scheibye-Knudsen K Arvin T Lumholdt S Sawera M Green T Nielsen BJ Havgaard JH Brunak S Fredholm M Bendixen C 《Bioinformatics (Oxford, England)》2007,23(13):i387-i391
MOTIVATION: Single nucleotide polymorphism (SNP) analysis is an important means to study genetic variation. A fast and cost-efficient approach to identify large numbers of novel candidates is the SNP mining of large scale sequencing projects. The increasing availability of sequence trace data in public repositories makes it feasible to evaluate SNP predictions on the DNA chromatogram level. MAVIANT, a platform-independent Multipurpose Alignment VIewing and Annotation Tool, provides DNA chromatogram and alignment views and facilitates evaluation of predictions. In addition, it supports direct manual annotation, which is immediately accessible and can be easily shared with external collaborators. RESULTS: Large-scale mining of polymorphisms based on porcine EST sequences yielded more than 7900 candidate SNPs in coding regions (cSNPs), which were annotated relative to the human genome. Non-synonymous SNPs were analyzed for their potential effect on the protein structure/function using the PolyPhen and SIFT prediction programs. Predicted SNPs and annotations are stored in a web-based database. Using MAVIANT, SNPs can be verified visually based on the DNA sequencing traces. A subset of candidate SNPs was selected for experimental validation by resequencing and genotyping. This study provides a web-based DNA chromatogram and contig browser that facilitates the evaluation and selection of candidate SNPs, which can be applied as genetic markers for genome wide genetic studies. AVAILABILITY: The stand-alone version of MAVIANT program for local use is freely available under GPL license terms at http://snp.agrsci.dk/maviant. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.