Similar literature
1.
SRS (Sequence Retrieval System) is a widely used keyword search engine for querying biological databases. BLAST2 is the most widely used tool to query databases by sequence similarity search. These tools allow users to retrieve sequences by shared keyword or by shared similarity, with many public web servers available. However, with the increasingly large datasets available it is now quite common that a user is interested in some subset of homologous sequences but has no efficient way to restrict retrieval to that set. By allowing the user to control SRS from the BLAST output, BLAST2SRS (http://blast2srs.embl.de/) aims to meet this need. This server therefore combines the two ways to search sequence databases: similarity and keyword.
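
As a rough illustration of combining the two search modes, the sketch below restricts a list of similarity hits to those whose description also matches a keyword. The hit records, field names, identifiers and E-value cutoff are all hypothetical; this is not the BLAST2SRS interface, only a minimal sketch of the idea.

```python
def filter_hits(hits, keyword, max_evalue):
    """Keep IDs of hits that pass the E-value cutoff AND mention the keyword."""
    keyword = keyword.lower()
    return [
        hit["id"]
        for hit in hits
        if hit["evalue"] <= max_evalue and keyword in hit["description"].lower()
    ]


# Hypothetical similarity-search hits (IDs and descriptions are made up).
hits = [
    {"id": "seqA", "description": "Tyrosine kinase, human", "evalue": 1e-30},
    {"id": "seqB", "description": "Protein phosphatase 2A", "evalue": 1e-20},
    {"id": "seqC", "description": "Serine/threonine Kinase, yeast", "evalue": 1e-8},
    {"id": "seqD", "description": "Putative kinase fragment", "evalue": 0.5},
]

kinase_hits = filter_hits(hits, keyword="kinase", max_evalue=1e-5)
```

The keyword match is case-insensitive, and the similarity cutoff is applied in the same pass, so a hit must satisfy both criteria to be retained.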

2.

Background  

Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary.

3.
Background: Traditional Chinese medicine (TCM) treats diseases in a holistic manner, while TCM formulae are multi-component, multi-target agents at the molecular level. Thus there are many parallels between the key ideas of TCM pharmacology and network pharmacology. In recent years, TCM network pharmacology has developed as an interdisciplinary field combining TCM science and network pharmacology, which studies the mechanism of TCM at the molecular level and in the context of biological networks. It provides a new research paradigm that can use modern biomedical science to interpret the mechanism of TCM, and it promises to accelerate the modernization and internationalization of TCM. Results: In this paper we introduce state-of-the-art free data sources, web servers and software that can be used in TCM network pharmacology, including databases of TCM, drug targets and diseases, web servers for the prediction of drug targets, and tools for network and functional analysis. Conclusions: This review could help experimental pharmacologists make better use of the existing data and methods in their study of TCM.

4.
del Campo J  Massana R 《Protist》2011,162(3):435-448
In recent years, a substantial amount of data on aquatic protists has been obtained from culture-independent molecular approaches, unveiling a large diversity and the existence of new lineages. However, sequences affiliated with minor groups (in terms of clonal abundance) have often been under-analyzed, and this hides a potentially relevant source of phylogenetic information. Here we have searched public databases for 18S rDNA sequences of chrysophytes, choanoflagellates and bicosoecids retrieved from molecular surveys of protists. These three groups are often considered to account for most of the heterotrophic flagellates, an important functional component in microbial food webs. They represented a significant fraction of clones in freshwater studies, whereas their relative clonal abundance was low in marine studies. The novelty displayed by this dataset was notable. Most environmental sequences were distant to sequences of cultured organisms, indicating a significant bias in the representation of taxa in culture. Moreover, they were often distant to sequences from other molecular surveys, suggesting an insufficient sequencing effort to characterize the in situ diversity of these groups. Phylogenetic trees with complete sequences present the most accurate representation of the diversity of these groups, with the emergence of several new clades formed exclusively by environmental sequences. Exhaustive data mining in sequence databases allowed the identification of new diversity hidden inside chrysophytes, choanoflagellates and bicosoecids.

5.
Photosynthetic eukaryotes have a critical role as the main producers in most ecosystems of the biosphere. The ongoing environmental metabarcoding revolution opens the perspective for holistic ecosystems biological studies of these organisms, in particular the unicellular microalgae that often lack distinctive morphological characters and have complex life cycles. To interpret environmental sequences, metabarcoding necessarily relies on taxonomically curated databases containing reference sequences of the targeted gene (or barcode) from identified organisms. To date, no such reference framework exists for photosynthetic eukaryotes. In this study, we built the PhytoREF database that contains 6490 plastidial 16S rDNA reference sequences that originate from a large diversity of eukaryotes representing all known major photosynthetic lineages. We compiled 3333 amplicon sequences available from public databases and 879 sequences extracted from plastidial genomes, and generated 411 novel sequences from cultured marine microalgal strains belonging to different eukaryotic lineages. A total of 1867 environmental Sanger 16S rDNA sequences were also included in the database. Stringent quality filtering and a phylogeny‐based taxonomic classification were applied for each 16S rDNA sequence. The database mainly focuses on marine microalgae, but sequences from land plants (representing half of the PhytoREF sequences) and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats. PhytoREF, accessible via a web interface ( http://phytoref.fr ), is a new resource in molecular ecology to foster the discovery, assessment and monitoring of the diversity of photosynthetic eukaryotes using high‐throughput sequencing.

6.
Fold recognition techniques assist the exploration of protein structures, and web-based servers are part of the standard set of tools used in the analysis of biochemical problems. Despite their success, current methods are only able to predict the correct fold in a relatively small number of cases. We propose an approach that improves the selection of correct folds from among the results of two methods implemented as web servers (SAMT99 and 3DPSSM). Our approach is based on the training of a system of neural networks with models generated by the servers and a set of associated characteristics such as the quality of the sequence-structure alignment, distribution of sequence features (sequence-conserved positions and apolar residues), and compactness of the resulting models. Our results show that it is possible to detect adequate folds to model 80% of the sequences with a high level of confidence. The improvements achieved by taking into account sequence characteristics open the door to future improvements by directly including such factors in the step of model generation. This approach has been implemented as an automatic system LIBELLULA, available as a public web server at http://www.pdg.cnb.uam.es/servers/libellula.html.

7.
Peña C  Malm T 《PloS one》2012,7(6):e39071
An ever-growing number of molecular phylogenetic studies are being published, due in part to the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the accumulating DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. Here we report new software that facilitates easy management of voucher and sequence data, consisting of a relational database as back-end for a graphical user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

8.
EVA (http://cubic.bioc.columbia.edu/eva/) is a web server for evaluation of the accuracy of automated protein structure prediction methods. The evaluation is updated automatically each week, to cope with the large number of existing prediction servers and the constant changes in the prediction methods. EVA currently assesses servers for secondary structure prediction, contact prediction, comparative protein structure modelling and threading/fold recognition. Every day, sequences of newly available protein structures in the Protein Data Bank (PDB) are sent to the servers and their predictions are collected. The predictions are then compared to the experimental structures once a week; the results are published on the EVA web pages. Over time, EVA has accumulated prediction results for a large number of proteins, ranging from hundreds to thousands, depending on the prediction method. This large sample assures that methods are compared reliably. As a result, EVA provides useful information to developers as well as users of prediction methods.

9.
Thousands of new vertebrate genes have been discovered and genetic systems are needed to address their functions at the cellular level. The chicken B cell line DT40 allows efficient gene disruptions due to its high homologous recombination activity. However, cloning the gene of interest is often cumbersome, since relatively few chicken cDNA sequences are present in the public databases. In addition, the accumulation of multiple mutations within the same cell clone is limited by the consumption of one drug-resistance marker for each transfection. Here, we present the DT40 web site (http://genetics.hpi.uni-hamburg.de/dt40.html), which includes a comprehensive database of chicken bursal ESTs to identify disruption candidate genes and recyclable marker cassettes based on the loxP system. These freely available resources greatly facilitate the analysis of genes and genetic networks.

10.
We describe several algorithms and public servers that were developed to analyze and predict various features of protein structures. These servers provide information about the covalent state of cysteine (CYSREDOX), as well as about residues involved in non-covalent cross links that play an important role in the structural stability of proteins (SCIDE and SCPRED). We also discuss methods and servers developed to identify helical transmembrane proteins from large databases and rough genomic data, including two of the most popular transmembrane prediction methods, DAS and HMMTOP. Several biologically interesting applications of these servers are also presented. The servers are available through http://www.enzim.hu/servers.html.

11.

Background  

Very often genome-wide data analysis requires the interoperation of multiple databases and analytic tools. A large number of genome databases and bioinformatics applications are available through the web, but it is difficult to automate interoperation because: 1) the platforms on which the applications run are heterogeneous, 2) their web interface is not machine-friendly, 3) they use a non-standard format for data input and output, 4) they do not exploit standards to define application interface and message exchange, and 5) existing protocols for remote messaging are often not firewall-friendly. To overcome these issues, web services have emerged as a standard XML-based model for message exchange between heterogeneous applications. Web services engines have been developed to manage the configuration and execution of a web services workflow.

12.
Phosphorylation is a crucial way to control the activity of proteins in many eukaryotic organisms in vivo. Experimental methods to determine phosphorylation sites in substrates are usually restricted by the in vitro conditions of the enzymes and are time- and labor-intensive. Although some in silico methods and web servers have been introduced for automatic detection of phosphorylation sites, more sophisticated methods are still urgently needed to further improve prediction performance. Protein primary sequences can help predict phosphorylation sites catalyzed by different protein kinases, and most computational approaches use a short local peptide to make predictions. However, useful information may be lost if conserved residues that are not close to the phosphorylation site are ignored, which would hamper prediction. A novel prediction method named IEPP (Information-Entropy based Phosphorylation Prediction) is presented in this paper for automatic detection of potential phosphorylation sites. In prediction, the sites around the phosphorylation sites are selected or excluded according to their entropy values. The algorithm was compared with other methods such as GSP and PPSP on the ABL, MAPK and PKA protein kinase (PK) families. Superior prediction accuracy was obtained on various measures such as sensitivity (Sn) and specificity (Sp). Furthermore, compared with some online prediction web servers on newly discovered phosphorylation sites, IEPP also yielded the best performance. IEPP is another useful computational resource for identification of PK-specific phosphorylation sites, and it has the advantages of simplicity, efficiency and convenience.
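
To make the idea of selecting flanking sites by entropy concrete, the sketch below computes the Shannon entropy of each position in a set of aligned peptide windows and keeps the positions at or below a cutoff. The abstract does not give IEPP's actual formula, window size or threshold, nor whether low- or high-entropy positions are retained, so all of these are assumptions made for illustration, and the peptides are invented.

```python
import math
from collections import Counter


def column_entropy(peptides, position):
    """Shannon entropy (in bits) of the residues observed at one window position."""
    counts = Counter(p[position] for p in peptides)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def select_positions(peptides, max_entropy):
    """Indices of positions whose entropy is at most max_entropy (assumed rule)."""
    width = len(peptides[0])
    return [i for i in range(width) if column_entropy(peptides, i) <= max_entropy]


# Hypothetical 5-residue windows with the phosphorylated serine at index 3.
windows = ["RRASL", "RKASV", "RRGSI", "KRASL"]
selected = select_positions(windows, max_entropy=1.0)
```

With these four windows the last position is the most variable (three different residues), so it is the only one excluded at a cutoff of 1 bit; the invariant serine column has zero entropy.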

13.
GenBank.
The GenBank sequence database continues to expand its data coverage, quality control, annotation content and retrieval services. GenBank is comprised of DNA sequences submitted directly by authors as well as sequences from the other major public databases. An integrated retrieval system, known as Entrez, contains data from GenBank and from the major protein sequence and structural databases, as well as related MEDLINE abstracts. Users may access GenBank over the Internet through the World Wide Web and through special client-server programs for text and sequence similarity searching. FTP, CD-ROM and e-mail servers are alternate means of access.

14.
1H NMR is now a standard method for de novo determination of the primary sequence of all sorts of glycans. Over the last 30 years, tens of thousands of oligosaccharide sequences have been elucidated by NMR spectroscopy in conjunction with other physico-chemical methods including mass spectrometry and gas chromatography. Most of these sequences are now compiled and available in several web databases, recently unified in the publicly available GlycomeDB, along with sets of experimental data. However, because searching for an exact sequence based exclusively on proton chemical shifts can be difficult for non-specialists in NMR, we worked out a new type of query, named SOACS, which allows the easy retrieval of existing sequences. This query is based on the readily distinguished 1H chemical shifts from any 1H NMR spectrum, and was designed to be usable by the widest possible scientific community.

15.
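The development of the Internet provides an effective way for networked computer users to exchange information. Molecular biologists can not only send and receive messages through e-mail systems but, more importantly, can also access a large number of molecular biology databases and software. A variety of sequence analysis tasks can be carried out over the Internet, including similarity searches of sequence databases, identification of gene coding regions and analysis of protein secondary structure. A database such as GenBank can be accessed in several ways: (a) e-mail file servers, (b) the File Transfer Protocol (FTP), and (c) server-client systems such as Gopher, WAIS or WWW. The BIOSCI electronic bulletin board, designed specifically for molecular biologists, makes it much easier for researchers to hold scholarly discussions, seek help from others and communicate with database staff.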

16.
The last decades have seen an upsurge in ecological studies incorporating phylogenetic information with increasing species samples, motivated by the common conjecture that species with common ancestors should share some ecological characteristics due to niche conservatism. This has been carried out using various methods of increasing complexity and reliability: using only taxonomical classification; constructing supertrees that incorporate only topological information from previously published phylogenies; or building supermatrices of molecular data that are used to estimate phylogenies with evolutionarily meaningful branch lengths. Although the latter option is more informative than the others, it remains under‐used in ecology because ecologists are generally unaware of or unfamiliar with modern molecular phylogenetic methods. However, a solid phylogenetic hypothesis is necessary to conduct reliable ecological analyses that integrate evolutionary aspects. Our aim here is to clarify the concepts and methodological issues associated with the reconstruction of dated megaphylogenies, and to show that it is now possible to obtain accurate and well-sampled megaphylogenies with informative branch lengths for large species samples. This is possible thanks to improved phylogenetic methods, vast amounts of molecular data available from databases such as GenBank, and consensus knowledge on deep phylogenetic relationships for an increasing number of groups of organisms. Finally, we include a detailed step‐by‐step workflow pipeline (Supplementary material), from data acquisition to phylogenetic inference, mainly based on the R environment (widely used by ecologists) and the use of free web servers, that has been applied to the reconstruction of a species‐level phylogeny of all breeding birds of Europe.

17.
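Bioinformatics resources for molecular breeding of horticultural plants and their applications
In molecular breeding of horticultural plants, bioinformatics is a new technology. Databases such as GenBank, EMBL, DDBJ and Swiss-Prot, together with their sequence query systems, sequence alignment software and sequence submission software, are important bioinformatics resources for molecular breeding of horticultural plants. This paper reviews these bioinformatics resources and their applications in cloning new genes, predicting the function of new sequences, identifying germplasm resources and performing pedigree analysis.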

19.
Movement is crucial to the biological function of many proteins, yet crystallographic structures of proteins can give us only a static snapshot. The protein dynamics that are important to biological function often happen on a timescale that is unattainable through detailed simulation methods such as molecular dynamics as they often involve crossing high-energy barriers. To address this coarse-grained motion, several methods have been implemented as web servers in which a set of coordinates is usually linearly interpolated from an initial crystallographic structure to a final crystallographic structure. We present a new morphing method that does not extrapolate linearly and can therefore go around high-energy barriers and which can produce different trajectories between the same two starting points. In this work, we evaluate our method and other established coarse-grained methods according to an objective measure: how close a coarse-grained dynamics method comes to a crystallographically determined intermediate structure when calculating a trajectory between the initial and final crystal protein structure. We test this with a set of five proteins with at least three crystallographically determined on-pathway high-resolution intermediate structures from the Protein Data Bank. For simple hinging motions involving a small conformational change, segmentation of the protein into two rigid sections outperforms other more computationally involved methods. However, large-scale conformational change is best addressed using a nonlinear approach and we suggest that there is merit in further developing such methods.
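
For reference, the linear interpolation baseline mentioned in this abstract can be sketched in a few lines: every atom moves along a straight line from its initial to its final position. This is only the simple baseline, not the nonlinear morphing method the authors present; the coordinates and the function name are made up for illustration, and the two structures are assumed to list the same atoms in the same order.

```python
def interpolate(start, end, steps):
    """Return steps + 1 frames, moving each atom on a straight line from start to end."""
    frames = []
    for k in range(steps + 1):
        t = k / steps  # fraction of the way from the initial to the final structure
        frames.append([
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ])
    return frames


# Two hypothetical atoms, given as (x, y, z) coordinates.
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
end = [(0.0, 2.0, 0.0), (1.0, 0.0, 4.0)]
frames = interpolate(start, end, steps=4)
```

Because each atom follows a straight path regardless of its neighbours, intermediate frames from this scheme can pass through sterically unfavourable conformations, which is the limitation that motivates nonlinear approaches.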
