Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
3.
The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information, so computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The limited accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. Compared with the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, our method, ENTPRISE, exhibits significant improvement. In particular, on a testing dataset consisting only of proteins with balanced disease-associated and neutral variations, defined as having a ratio of neutral to disease-associated variations between 0.3 and 3, the Matthews Correlation Coefficient of ENTPRISE is 0.493, compared with 0.432 for PPH2-HumVar, 0.406 for SIFT, 0.403 for MUTATIONASSESSOR, 0.402 for PPH2-HumDiv, 0.305 for MUTATIONTASTER, and 0.181 for FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/.
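For readers unfamiliar with the metric quoted in this abstract, the Matthews correlation coefficient can be computed directly from a binary confusion matrix. The following is a minimal sketch in Python; the function names and the `is_balanced` helper (which mirrors the 0.3 to 3 neutral/disease ratio mentioned above) are illustrative assumptions and are not taken from the ENTPRISE software.

```python
import math

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

def is_balanced(n_neutral: int, n_disease: int) -> bool:
    """Hypothetical helper: neutral/disease ratio between 0.3 and 3."""
    if n_disease == 0:
        return False
    return 0.3 <= n_neutral / n_disease <= 3.0
```

A perfect classifier yields 1.0, a fully inverted one yields -1.0, and values near 0 indicate performance no better than chance, which is why the metric suits datasets whose two classes differ in size.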

4.
5.

Background

Large clinical genomics studies using next generation DNA sequencing require the ability to select and track samples from a large population of patients through many experimental steps. With the number of clinical genome sequencing studies increasing, it is critical to maintain adequate laboratory information management systems to manage the thousands of patient samples that are subject to this type of genetic analysis.

Results

To meet the needs of clinical population studies using genome sequencing, we developed a web-based laboratory information management system (LIMS) with a flexible configuration that is adaptable to the continuously evolving experimental protocols of next generation DNA sequencing technologies. Our system, MendeLIMS, is easily implemented with open source tools and is highly configurable and extensible. MendeLIMS has been invaluable in the management of our clinical genome sequencing studies.

Conclusions

We maintain a publicly available demonstration version of the application for evaluation purposes at http://mendelims.stanford.edu. MendeLIMS is programmed in Ruby on Rails (RoR) and accesses data stored in SQL-compliant relational databases. Software is freely available for non-commercial use at http://dna-discovery.stanford.edu/software/mendelims/.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-290) contains supplementary material, which is available to authorized users.

6.
7.
We present a de novo re-determination of the secondary (2°) structure and domain architecture of the 23S and 5S rRNAs, using 3D structures, determined by X-ray diffraction, as input. In the traditional 2° structure, the center of the 23S rRNA is an extended single strand, which in 3D is seen to be compact and double helical. Accurately assigning nucleotides to helices compels a revision of the 23S rRNA 2° structure. Unlike the traditional 2° structure, the revised 2° structure of the 23S rRNA shows architectural similarity with the 16S rRNA. The revised 2° structure also reveals a clear relationship with the 3D structure and is generalizable to rRNAs of other species from all three domains of life. The 2° structure revision required us to reconsider the domain architecture. We partitioned the 23S rRNA into domains through analysis of molecular interactions, calculations of 2D folding propensities and compactness. The best domain model for the 23S rRNA contains seven domains, not six as previously ascribed. Domain 0 forms the core of the 23S rRNA, to which the other six domains are rooted. Editable 2° structures mapped with various data are provided (http://apollo.chemistry.gatech.edu/RibosomeGallery).

8.
A genomic island (GI) is a segment of DNA sequence in a genome whose origin can be traced back to other organisms or viruses. The detection of GIs plays an indispensable role in biomedical research because GIs are often associated with special functions; pathogenicity islands, for example, are disease-causing GIs. It is also very important to visualize genomic islands, together with their supporting features, in the genome. We have developed a program, Genomic Island Visualization (GIV), which displays the locations of genomic islands in a genome along with the corresponding supporting feature information. GIV was implemented in C++ and was compiled and executed on Linux/Unix operating systems.

Availability

GIV is freely available for non-commercial use at http://www5.esu.edu/cpsc/bioinfo/software/GIV

9.
Although comparison of RNA-protein interaction profiles across different conditions has become increasingly important to understanding the function of RNA-binding proteins (RBPs), few computational approaches have been developed for quantitative comparison of CLIP-seq datasets. Here, we present an easy-to-use command line tool, dCLIP, for quantitative CLIP-seq comparative analysis. The two-stage method implemented in dCLIP, including a modified MA normalization method and a hidden Markov model, is shown to be able to effectively identify differential binding regions of RBPs in four CLIP-seq datasets, generated by HITS-CLIP, iCLIP and PAR-CLIP protocols. dCLIP is freely available at http://qbrc.swmed.edu/software/.
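The MA normalization mentioned in this abstract builds on the standard MA transform of paired counts. The sketch below, in Python, shows only that generic transform followed by plain median centering. It is not dCLIP's modified method, whose details are not given here, and the function names and the pseudocount of 1 are assumptions made for illustration.

```python
import math
import statistics

def ma_values(count_a, count_b, pseudocount=1.0):
    """Return (M, A) for a pair of read counts in one genomic bin.

    M is the log2 ratio between the two conditions; A is the mean log2 signal.
    The pseudocount (an assumption here) avoids log(0) for empty bins.
    """
    log_a = math.log2(count_a + pseudocount)
    log_b = math.log2(count_b + pseudocount)
    return log_a - log_b, 0.5 * (log_a + log_b)

def median_centered_m(count_pairs):
    """Subtract the median M so that a typical bin has a log ratio of zero.

    This is plain median centering, used as a stand-in for a real
    normalization step.
    """
    m_values = [ma_values(a, b)[0] for a, b in count_pairs]
    median_m = statistics.median(m_values)
    return [m - median_m for m in m_values]
```

After centering, bins with large positive or negative M are the candidates for differential binding, which a downstream model would then have to assess.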

10.
11.
12.
Tn-seq is a high-throughput technique for analysis of transposon mutant libraries. Tn-seq Explorer was developed as a convenient and easy-to-use package of tools for exploration of Tn-seq data. In a typical application, the user will have obtained a collection of sequence reads adjacent to transposon insertions in a reference genome. The reads are first aligned to the reference genome using one of the tools available for this task. Tn-seq Explorer reads the alignment and the gene annotation, and provides the user with a set of tools to investigate the data and identify possibly essential or advantageous genes as those that contain significantly low counts of transposon insertions. Emphasis is placed on providing flexibility in selecting the parameters and methodology most appropriate for each particular dataset. Tn-seq Explorer is written in Java as a menu-driven, stand-alone application. It was tested on Windows, Mac OS, and Linux operating systems. The source code is distributed under the terms of the GNU General Public License. The program and the source code are available for download at http://www.cmbl.uga.edu/downloads/programs/Tn_seq_Explorer/ and https://github.com/sina-cb/Tn-seqExplorer.
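The core idea described above, flagging genes that contain unusually few transposon insertions, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the inclusive coordinate convention, and the fixed density cutoff are hypothetical and do not reflect the statistics used by Tn-seq Explorer.

```python
from bisect import bisect_left, bisect_right

def insertions_per_gene(insertion_sites, genes):
    """Count unique insertion positions inside each gene.

    insertion_sites: iterable of genomic coordinates (ints).
    genes: dict mapping gene name -> (start, end), inclusive coordinates.
    """
    sites = sorted(set(insertion_sites))
    counts = {}
    for name, (start, end) in genes.items():
        # Binary search gives the number of sites with start <= site <= end.
        counts[name] = bisect_right(sites, end) - bisect_left(sites, start)
    return counts

def low_insertion_genes(counts, genes, max_density=0.001):
    """Flag genes whose insertions per bp are at or below max_density.

    max_density is a hypothetical cutoff chosen for illustration.
    """
    flagged = []
    for name, (start, end) in genes.items():
        length = end - start + 1
        if counts[name] / length <= max_density:
            flagged.append(name)
    return flagged
```

In practice a fixed cutoff is too crude, since insertion density varies with library saturation and gene length, so a real analysis would compare each gene against a genome-wide background.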

13.
The development of Next Generation Sequencing technologies, capable of sequencing hundreds of millions of short reads (25–70 bp each) in a single run, is opening the door to population genomic studies of non-model species. In this paper we present SHRiMP - the SHort Read Mapping Package: a set of algorithms and methods to map short reads to a genome, even in the presence of a large amount of polymorphism. Our method is based upon a fast read mapping technique, separate thorough alignment methods for regular letter-space as well as AB SOLiD (color-space) reads, and a statistical model for false positive hits. We use SHRiMP to map reads from a newly sequenced Ciona savignyi individual to the reference genome. We demonstrate that SHRiMP can accurately map reads to this highly polymorphic genome, while confirming high heterozygosity of C. savignyi in this second individual. SHRiMP is freely available at http://compbio.cs.toronto.edu/shrimp.

14.
A combined approach of whole genome shotgun sequencing and ultra-high density linkage mapping using skim sequencing of a segregating population is effective for assembling allopolyploid genomes. See related Research, http://dx.doi.org/10.1186/s13059-015-0582-8

15.
Recent studies have revealed that microRNAs (miRNAs), a class of small non-coding RNAs, down-regulate their mRNA targets. This regulation is regarded as playing an important role in various biological processes. Many studies have been devoted to predicting miRNA-target interactions. These studies indicate that the interactions may be functional only in some specific tissues, depending on the characteristics of the miRNA. No systematic methods have been established in the literature to investigate the correlation between miRNA-target interactions and tissue specificity through microarray data. In this study, we propose a method, based on experimentally validated miRNA-target interactions, to investigate the tissues in which miRNA-target interactions are supported. The tissue specificity results obtained by our method are in accordance with the experimental results in the literature.

Availability and Implementation

Our analysis results are available at http://tsmti.mbc.nctu.edu.tw/ and http://www.stat.nctu.edu.tw/hwang/tsmti.html.

16.

Background

Identification of DNA/Protein motifs is a crucial problem for biologists. Computational techniques could be of great help in this identification. In this direction, many computational models for motifs have been proposed in the literature.

Methods

One such important model is the motif model. In this paper we describe a motif search web tool that predominantly employs this motif model. This web tool exploits state-of-the-art algorithms for solving the motif search problem.

Results

The online tool has been helping scientists identify many unknown motifs. Many of our predictions have been successfully verified as well. We hope that this paper will expose this crucial tool to many more scientists.

Availability and requirements

Project name: PMS - Panoptic Motif Search Tool. Project home page: http://pms.engr.uconn.edu or http://motifsearch.com. Licence: PMS tools will be readily available to any scientist wishing to use them for non-commercial purposes, without restrictions. The online tool is freely available without login.

17.
The software tool PBEAM provides a parallel implementation of BEAM, the first algorithm for large-scale epistatic interaction mapping, including genome-wide studies with hundreds of thousands of markers. BEAM describes markers and their interactions with a Bayesian partitioning model and computes the posterior probability of each marker set via Markov Chain Monte Carlo (MCMC). PBEAM takes advantage of simulating multiple Markov chains simultaneously. With n CPUs, this design can reduce execution time by roughly n-fold. The implementation of PBEAM is based on MPI libraries.

Availability

PBEAM is available for download at http://bioinfo.au.tsinghua.edu.cn/pbeam/
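The idea of simulating several Markov chains at once, as described for PBEAM, can be illustrated with a toy example. The Python sketch below runs independent random-walk Metropolis chains concurrently on a standard normal target. It is not the BEAM model and does not use MPI: the target density, the function names, and the use of a thread pool are all assumptions made for illustration. Threads are used here for portability, and a real speedup on n CPUs would need separate processes or MPI ranks.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def run_chain(seed, n_steps=2000, step_size=1.0):
    """Random-walk Metropolis chain targeting a standard normal (toy target)."""
    rng = random.Random(seed)  # independent, reproducible stream per chain
    x = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step_size, step_size)
        # Log acceptance ratio for the unnormalized density exp(-x^2 / 2).
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples

def run_chains(seeds, n_steps=2000):
    """Run one chain per seed concurrently and return results in seed order."""
    with ThreadPoolExecutor(max_workers=max(1, len(seeds))) as pool:
        return list(pool.map(lambda s: run_chain(s, n_steps), seeds))
```

Because each chain owns its random number generator, the chains share no state, so the work divides cleanly across workers and the pooled samples can simply be concatenated afterwards.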

18.
Bats account for ~20% of mammalian species and are the only mammals capable of true powered flight. Because of their specialized phenotypic traits, much research has been devoted to examining the evolution of bats. Whole genome sequences of several bats have now been assembled and annotated; however, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with bat genomes accessible to the general biological community, we established the Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data on bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using an efficient search engine and browse tracks of the bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool is provided to facilitate online phylogenetic study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of bat evolution and benefit the analysis of bat sequences. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.

19.
Jordan D. Ward. Genetics, 2015, 199(2): 363-377
As in other organisms, CRISPR/Cas9 methods provide a powerful approach for genome editing in the nematode Caenorhabditis elegans. Oligonucleotides are excellent repair templates for introducing substitutions and short insertions, as they are cost effective, require no cloning, and appear in other organisms to target changes by homologous recombination at DNA double-strand breaks (DSBs). Here, I describe a methodology in C. elegans to efficiently knock in epitope tags in 8–9 days, using a temperature-sensitive lethal mutation in the pha-1 gene as a co-conversion marker. I demonstrate that 60mer oligos with 29 bp of homology drive efficient knock-in of point mutations, and that disabling nonhomologous end joining by RNAi inactivation of the cku-80 gene significantly improves knock-in efficiency. Homology arms of 35–80 bp are sufficient for efficient editing and DSBs up to 54 bp away from the insertion site produced knock-ins. These findings will likely be applicable for a range of genome editing approaches in C. elegans, which will improve editing efficiency and minimize screening efforts.

20.
The frequent emergence of drug-resistant bacteria has created an urgent demand for new antimicrobial agents. Traditional methods of novel antibiotic development are almost obsolete. Antimicrobial peptides (AMPs) are now regarded as a potential solution to revive the traditional methods of antibiotic development, although, until now, many AMPs have failed in clinical trials. A comprehensive database of AMPs with information about their antimicrobial activity and cytotoxicity will help promote the process of finding novel AMPs with improved antimicrobial activity and reduced cytotoxicity, and eventually accelerate the translation of newly discovered AMPs into clinical or preclinical trials. LAMP, a database linking AMPs, serves as a tool to aid the discovery and design of AMPs as new antimicrobial agents. The current version of LAMP has 5,547 entries, comprising 3,904 natural AMPs and 1,643 synthetic peptides. The database can be queried using either simple keyword searches or combinatorial condition searches. Equipped with detailed antimicrobial activity and cytotoxicity data, the cross-linking and top-similar-AMPs functions implemented in LAMP will help enhance our current understanding of AMPs, and this may speed up the development of new AMPs for medical applications. LAMP is freely available at: http://biotechlab.fudan.edu.cn/database/lamp.
