Similar articles
 Found 20 similar articles (search time: 15 ms)
1.

Background  

Analysis of high-throughput (HTP) data, such as microarray and proteomics data, has provided a powerful methodology for studying patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is assembling the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data.

2.
CLIP-seq is widely used to study genome-wide interactions between RNA-binding proteins and RNAs. However, there are few tools available to analyze CLIP-seq data, thus creating a bottleneck to the implementation of this methodology. Here, we present PIPE-CLIP, a Galaxy framework-based comprehensive online pipeline for reliable analysis of data generated by three types of CLIP-seq protocol: HITS-CLIP, PAR-CLIP and iCLIP. PIPE-CLIP provides both data processing and statistical analysis to determine candidate cross-linking regions, which are comparable to those regions identified from the original studies or using existing computational tools. PIPE-CLIP is available at http://pipeclip.qbrc.org/.

3.
PhyloBLAST is an internet-accessed application based on CGI/Perl programming that compares a user's protein sequence to a SwissProt/TREMBL database using BLAST2 and then allows phylogenetic analyses to be performed on selected sequences from the BLAST output. Flexible features, such as the ability to input your own multiple sequence alignment and to use PHYLIP program options, provide additional web-based phylogenetic analysis functionality beyond the analysis of a BLAST result.

4.
MaskerAid: a performance enhancement to RepeatMasker   Cited by: 14 (self-citations: 0, citations by others: 14)
SUMMARY: Identifying and masking repetitive elements is usually the first step when analyzing vertebrate genomic sequence. Current repeat identification software is sensitive but slow, creating a costly bottleneck in large-scale analyses. We have developed MaskerAid, a software enhancement to RepeatMasker that increased the speed of masking more than 30-fold at the most sensitive setting. AVAILABILITY: On request from the authors (see http://sapiens.wustl.edu/MaskerAid). CONTACT: maskeraid@watson.wustl.edu

5.
SUMMARY: BLAST2GENE is a program that allows a detailed analysis of genomic regions containing completely or partially duplicated genes. From a BLAST (or BL2SEQ) comparison of a protein or nucleotide query sequence with any genomic region of interest, BLAST2GENE processes all high scoring pairwise alignments (HSPs) and provides the disposition of all independent copies along the genomic fragment. The results are provided in text and PostScript formats to allow an automatic and visual evaluation of the respective region. AVAILABILITY: The program is available upon request from the authors. A web server of BLAST2GENE is maintained at http://www.bork.embl.de/blast2gene

6.
The adaptive immune system includes populations of B and T cells capable of binding foreign epitopes via antigen-specific receptors, called immunoglobulin (IG) for B cells and the T cell receptor (TCR) for T cells. In order to provide protection from a wide range of pathogens, these cells display highly diverse repertoires of IGs and TCRs. This is achieved through combinatorial rearrangement of multiple gene segments and, for B cells, additionally through somatic hypermutation. Deep sequencing technologies have revolutionized analysis of the diversity of these repertoires; however, accurate TCR/IG diversity profiling requires specialist bioinformatics tools. Here we present LymAnalyzer, a software package that significantly improves the completeness and accuracy of TCR/IG profiling from deep sequence data and includes procedures to identify novel alleles of gene segments. On real and simulated data sets LymAnalyzer produces highly accurate and complete results. Although to date we have applied it to TCR/IG data from human and mouse, it can be applied to data from any species for which an appropriate database of reference genes is available. Implemented in Java, it includes both a command line version and a graphical user interface and is freely available at https://sourceforge.net/projects/lymanalyzer/.

7.
BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences   Cited by: 49 (self-citations: 0, citations by others: 49)
'BLAST 2 Sequences', a new BLAST-based tool for aligning two protein or nucleotide sequences, is described. While the standard BLAST program is widely used to search for homologous sequences in nucleotide and protein databases, one often needs to compare only two sequences that are already known to be homologous, coming from related species or, e.g. different isolates of the same virus. In such cases searching the entire database would be unnecessarily time-consuming. 'BLAST 2 Sequences' utilizes the BLAST algorithm for pairwise DNA-DNA or protein-protein sequence comparison. A World Wide Web version of the program can be used interactively at the NCBI WWW site (http://www.ncbi.nlm.nih.gov/gorf/bl2.html). The resulting alignments are presented in both graphical and text form. The variants of the program for PC (Windows), Mac and several UNIX-based platforms can be downloaded from the NCBI FTP site (ftp://ncbi.nlm.nih.gov).

8.

Background  

BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming.

9.
Prokaryotic cells display a striking subcellular organization. Studies of the underlying mechanisms in different species have greatly enhanced our understanding of the morphological and physiological adaptation of bacteria to different environmental niches. The image analysis software tool BacStalk is designed to extract comprehensive quantitative information from the images of morphologically complex bacteria with stalks, flagella, or other appendages. The resulting data can be visualized in interactive demographs, kymographs, cell lineage plots, and scatter plots to enable fast and thorough data analysis and representation. Notably, BacStalk can generate demographs and kymographs that display fluorescence signals within the two-dimensional cellular outlines, to accurately represent their subcellular location. Beyond organisms with visible appendages, BacStalk is also suitable for established, non-stalked model organisms with common or uncommon cell shapes. BacStalk, therefore, contributes to the advancement of prokaryotic cell biology and physiology, as it widens the spectrum of easily accessible model organisms and enables highly intuitive and interactive data analysis and visualization.

10.
Investigating chromatin interactions between regulatory regions such as enhancer and promoter elements is vital for understanding the regulation of gene expression. Compared to Hi-C and its variants, the emerging 3D mapping technologies focusing on enriched signals, such as TrAC-looping, reduce the sequencing cost and provide higher interaction resolution for cis-regulatory elements. A robust pipeline is needed for the comprehensive interpretation of these data, especially for loop-centric analysis. Therefore, we have developed a new versatile tool named cLoops2 for the full-stack analysis of these 3D chromatin interaction data. cLoops2 consists of core modules for peak-calling, loop-calling, calling of differentially enriched loops, and loop annotation. It also contains multiple modules for interaction resolution estimation, data similarity estimation, feature quantification, feature aggregation analysis, and visualization. cLoops2, with documentation and example data, is open source and freely available on GitHub: https://github.com/KejiZhaoLab/cLoops2.

11.
12.

Background

Large-scale sequence studies requiring BLAST-based analysis produce huge amounts of data to be parsed. BLAST parsers are available, but they are often missing some important features, such as keeping all information from the raw BLAST output, allowing direct access to single results, and performing logical operations over them.

Findings

We implemented BlaSTorage, a Python package that parses multi BLAST results and returns them in a purpose-built object-database format. Unlike other BLAST parsers, BlaSTorage retains and stores all parts of BLAST results, including alignments, without loss of information; a complete API allows access to all the data components.

Conclusions

BlaSTorage achieves speed comparable to that of more basic parsers written in compiled languages such as C++ and can be easily integrated into web applications or software pipelines.
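The features the abstract asks of a parser (direct access to single results and logical operations over them) can be illustrated with a minimal Python sketch. This is not BlaSTorage's API, which the abstract does not show; the names `Hit`, `parse_tabular` and `select` are hypothetical. The sketch also assumes BLAST+ default tabular output (`-outfmt 6`), which, unlike the format described above, does not carry the alignments themselves.

```python
from collections import defaultdict
from dataclasses import dataclass

# Types of the 12 default -outfmt 6 columns, in order:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
CASTS = (str, str, float, int, int, int, int, int, int, int, float, float)


@dataclass(frozen=True)
class Hit:
    """One row of tabular BLAST output (hypothetical container, for illustration)."""
    qseqid: str
    sseqid: str
    pident: float
    length: int
    mismatch: int
    gapopen: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float
    bitscore: float


def parse_tabular(lines):
    """Group hits by query ID so that a single result can be fetched directly."""
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comment lines
            continue
        cols = line.split("\t")
        hit = Hit(*(cast(col) for cast, col in zip(CASTS, cols)))
        hits[hit.qseqid].append(hit)
    return dict(hits)


def select(hits, predicate):
    """Apply a logical condition to every hit, keeping the per-query grouping."""
    return {q: [h for h in hs if predicate(h)] for q, hs in hits.items()}


# Made-up rows, used only to exercise the functions.
example = [
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t11\t210\t1e-100\t370",
    "q1\ts2\t71.0\t150\t40\t2\t5\t154\t1\t150\t1e-5\t55.1",
    "q2\ts9\t88.2\t90\t10\t1\t1\t90\t300\t389\t2e-30\t120",
]
hits = parse_tabular(example)
strong = select(hits, lambda h: h.evalue < 1e-10 and h.pident >= 80)
```

Here `hits["q1"]` gives direct access to one query's results, and `strong` keeps only the hits that satisfy both conditions of the predicate.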

13.

Background

Sequencing of EST and BAC end datasets is no longer limited to large research groups. Drops in per-base pricing have made high throughput sequencing accessible to individual investigators. However, there are few options available which provide a free and user-friendly solution to the BLAST result storage and data mining needs of biologists.

14.
Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) is a new technology to study genome-wide long-range chromatin interactions bound by protein factors. Here we present ChIA-PET Tool, a software package for automatic processing of ChIA-PET sequence data, including linker filtering, mapping tags to reference genomes, identifying protein binding sites and chromatin interactions, and displaying the results on a graphical genome browser. ChIA-PET Tool is fast, accurate, comprehensive, user-friendly, and open source (available at http://chiapet.gis.a-star.edu.sg).

15.
We report here the release of a web-based tool (MDDNA) to study and model the fine structural details of DNA on the basis of data extracted from a set of molecular dynamics (MD) trajectories of DNA sequences involving all the unique tetranucleotides. The dynamic web interface can be employed to analyze the first neighbor sequence context effects on the 10 unique dinucleotide steps of DNA. Functionality is included to build all atom models of any user-defined sequence based on the MD results. The backend of this interface is a relational database storing the conformational details of DNA obtained in 39 different MD simulation trajectories comprising all the 136 unique tetranucleotide steps. Examples of the use of this data to predict DNA structures are included. Availability: http://humphry.chem.wesleyan.edu:8080/MDDNA. Supplementary information: Supplementary data including color figures are available at Bioinformatics online.
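The counts of 10 unique dinucleotide steps and 136 unique tetranucleotide steps follow from treating a sequence and its reverse complement as the same step in double-stranded DNA. The short Python sketch below checks that arithmetic by enumeration; the function names are illustrative and are not part of MDDNA.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence read 5'->3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_steps(k):
    """List k-mers, counting a k-mer and its reverse complement once."""
    canonical = set()
    for letters in product("ACGT", repeat=k):
        seq = "".join(letters)
        # Keep the alphabetically smaller of the two equivalent spellings.
        canonical.add(min(seq, reverse_complement(seq)))
    return sorted(canonical)


dinucleotides = unique_steps(2)
tetranucleotides = unique_steps(4)
print(len(dinucleotides), len(tetranucleotides))  # prints "10 136"
```

Equivalently, for even k the count is (4^k + 4^(k/2)) / 2, because the 4^(k/2) self-complementary k-mers pair with themselves: (16 + 4) / 2 = 10 and (256 + 16) / 2 = 136.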

16.
Rawlings ND, Morton FR. Biochimie, 2008, 90(2): 243-259
Many of the 181 families of peptidases contain homologues that are known to have functions other than peptide bond hydrolysis. Distinguishing an active peptidase from a homologue that is not a peptidase requires specialist knowledge of the important active site residues, because replacement or lack of one of these catalytic residues is an important clue that the homologue in question is unlikely to hydrolyse peptide bonds. Now that the rate at which proteins are characterized is outstripped by the rate that genome sequences are determined, many genes are being incorrectly annotated because only sequence similarity is taken into consideration. We present a tool called the MEROPS batch BLAST which not only performs a comparison against the MEROPS sequence collection, but also does a pair-wise alignment with the closest homologue detected and calculates the position of the active site residues. A non-peptidase homologue can be distinguished by the absence or unacceptable replacement of any of these residues. An analysis of peptidase homologues in the genome of the bacterium Erythrobacter litoralis is presented as an example.

17.
Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

18.
Quantitative flow visualization has many roots and has taken several approaches. The advent of digital image processing has made it possible to practically extract useful information from every kind of flow image. In a direct approach, the image intensity or color (wavelength or frequency) can be used as an indication of concentration, density and temperature fields or gradients of these scalar fields in the flow (Merzkirch, 1987). For whole-field velocity measurement, the method of choice by experimental fluid mechanicians has been the technique of Particle Image Velocimetry (DPIV). This paper presents a novel approach to extend the DPIV technique from a planar method to a full three-dimensional volume mapping technique useful in both engineering and biological applications.

19.
Confident identification of peptides via tandem mass spectrometry underpins modern high-throughput proteomics. This has motivated considerable recent interest in the postprocessing of search engine results to increase confidence and calculate robust statistical measures, for example through the use of decoy databases to calculate false discovery rates (FDR). FDR-based analyses allow for multiple testing and can assign a single confidence value for both sets and individual peptide spectrum matches (PSMs). We recently developed an algorithm for combining the results from multiple search engines, integrating FDRs for sets of PSMs made by different search engine combinations. Here we describe a web server and a downloadable application that make this routinely available to the proteomics community. The web server offers a range of outputs including informative graphics to assess the confidence of the PSMs and any potential biases. The underlying pipeline also provides a basic protein inference step, integrating PSMs into protein ambiguity groups where peptides can be matched to more than one protein. Importantly, we have also implemented full support for the mzIdentML data standard, recently released by the Proteomics Standards Initiative, providing users with the ability to convert native formats to mzIdentML files, which are available to download.
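The decoy-database idea mentioned in the abstract can be sketched in a few lines of Python. This shows only the simple target-decoy estimate (decoy count divided by target count above a score threshold) for a single search engine; it is not the authors' multi-engine combination algorithm, and the function names and scores are made up for illustration.

```python
def decoy_fdr(psms):
    """Estimate the FDR at each score threshold.

    psms: iterable of (score, is_decoy) pairs, higher score = better match.
    Returns a list of (score, is_decoy, fdr) sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    rows = []
    targets = decoys = 0
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Decoy hits approximate the number of false target hits.
        fdr = decoys / targets if targets else 1.0
        rows.append((score, is_decoy, min(fdr, 1.0)))
    return rows


def q_values(rows):
    """q-value: the lowest FDR at this threshold or any more permissive one."""
    out = []
    best = 1.0
    for score, is_decoy, fdr in reversed(rows):
        best = min(best, fdr)
        out.append((score, is_decoy, best))
    return list(reversed(out))


# Made-up PSM scores; True marks a match to the decoy database.
psms = [(9.1, False), (8.7, False), (7.9, True),
        (7.5, False), (6.0, False), (5.2, True)]
result = q_values(decoy_fdr(psms))
```

A set-level FDR is read off at the chosen score cut-off, while the q-value gives each individual PSM a single confidence value, which matches the two uses the abstract describes.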

20.
We carry out an extensive statistical study of the applicability of normal modes to the prediction of mobile regions in proteins. In particular, we assess the degree to which the observed motions found in a comprehensive data set of 377 nonredundant motions can be modeled by a single normal-mode vibration. We describe each motion in our data set by vectors connecting corresponding atoms in two crystallographically known conformations. We then measure the geometric overlap of these motion vectors with the displacement vectors of the lowest-frequency mode, for one of the conformations. Our study suggests that the lowest mode contains useful information about the parts of a protein that move most (i.e., have the largest amplitudes) and about the direction of this movement. Based on our findings, we developed a Web tool for motion prediction (available from http://molmovdb.org/nma) and apply it here to four representative motions, from bacteriorhodopsin, calmodulin, insulin, and T7 RNA polymerase.
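One common way to quantify the geometric overlap between observed motion vectors and a mode's displacement vectors is the normalized absolute dot product over all atoms. The abstract does not give the exact formula used, so the Python sketch below is an assumption, and the coordinates in it are invented.

```python
import math


def overlap(motion, mode):
    """Normalized overlap |sum_i d_i . m_i| / (||d|| * ||m||), between 0 and 1.

    motion, mode: equal-length sequences of per-atom (x, y, z) vectors.
    This is an assumed definition; the abstract does not state its formula.
    """
    if len(motion) != len(mode):
        raise ValueError("motion and mode must cover the same atoms")
    dot = sum(a * b for dv, mv in zip(motion, mode) for a, b in zip(dv, mv))
    norm_motion = math.sqrt(sum(a * a for dv in motion for a in dv))
    norm_mode = math.sqrt(sum(b * b for mv in mode for b in mv))
    if norm_motion == 0 or norm_mode == 0:
        raise ValueError("vectors must be non-zero")
    # A vibration has no preferred sign, so the absolute value is taken.
    return abs(dot) / (norm_motion * norm_mode)


# Invented two-atom example: displacement between two conformations.
motion = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
antiparallel_mode = [(-2.0, 0.0, 0.0), (0.0, -4.0, 0.0)]
orthogonal_mode = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]

same_direction = overlap(motion, antiparallel_mode)  # close to 1
unrelated = overlap(motion, orthogonal_mode)         # 0
```

A value near 1 means the mode points along the observed motion (up to sign), and a value near 0 means the mode says little about its direction.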
