Similar literature
 20 similar documents found (search time: 0 ms)
1.

Background  

Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful.
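The idea behind domain fusion analysis can be sketched in a few lines of Python. This is a minimal illustration, not the method of the paper above: it assumes domain annotations are already available as a mapping from protein identifier to a set of domain identifiers, and every name in it (`predict_fusion_links`, `PF_A`, `protA`, and so on) is hypothetical.

```python
from itertools import combinations


def predict_fusion_links(domains_by_protein):
    """Predict functional links between proteins from fused-domain evidence.

    domains_by_protein: dict mapping a protein identifier to the set of
    domain identifiers annotated on it (assumed input format).
    Returns a dict mapping frozenset({p, q}) to the set of composite
    proteins that carry one domain from p and one from q.
    """
    # Index: domain -> proteins that carry it.
    proteins_by_domain = {}
    for protein, domains in domains_by_protein.items():
        for domain in domains:
            proteins_by_domain.setdefault(domain, set()).add(protein)

    links = {}
    for composite, domains in domains_by_protein.items():
        # Every pair of domains found together on one protein is a
        # candidate fusion event.
        for a, b in combinations(sorted(domains), 2):
            # Component candidates carry one domain of the pair but not the other.
            only_a = [p for p in proteins_by_domain[a] if b not in domains_by_protein[p]]
            only_b = [p for p in proteins_by_domain[b] if a not in domains_by_protein[p]]
            for p in only_a:
                for q in only_b:
                    links.setdefault(frozenset((p, q)), set()).add(composite)
    return links


# Hypothetical annotations: one composite protein and three single-domain proteins.
annotations = {
    "fusedAB": {"PF_A", "PF_B"},
    "protA": {"PF_A"},
    "protB": {"PF_B"},
    "protC": {"PF_C"},
}
links = predict_fusion_links(annotations)
```

Here `protA` and `protB` are linked because their domains co-occur on `fusedAB`, while `protC` is left unlinked. A real analysis would usually also require the composite and component proteins to come from different genomes and would filter out promiscuous domains.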

2.
3.
Protein sequence databases   Total citations: 2, self-citations: 0, citations by others: 2
A variety of protein sequence databases exist, ranging from simple sequence repositories, which store data with little or no manual intervention in the creation of the records, to expertly curated universal databases that cover all species and in which the original sequence data are enhanced by the manual addition of further information in each sequence record. As the focus of researchers moves from the genome to the proteins encoded by it, these databases will play an even more important role as central comprehensive resources of protein information. Several of the leading protein sequence databases are discussed here, with special emphasis on the databases now provided by the Universal Protein Knowledgebase (UniProt) consortium.

4.
5.
Recent years have seen an explosion in the amount of available biological data. More and more genomes are being sequenced and annotated, and protein and gene interaction data are accumulating. Biological databases have been invaluable for managing these data and for making them accessible. Depending on the data that they contain, the databases fulfil different functions. But, although they are architecturally similar, so far their integration has proved problematic.

6.
7.
8.

Background  

SRS (Sequence Retrieval System) has proven to be a valuable platform for storing, linking, and querying biological databases. Due to the availability of a broad range of different scientific databases in SRS, it has become a useful platform to incorporate and mine microarray data to facilitate the analyses of biological questions and non-hypothesis driven quests. Here we report various solutions and tools for integrating and mining annotated expression data in SRS.

9.
Nicholas HB, Deerfield DW, Ropelewski AJ. BioTechniques 2000, 28(6): 1174-8, 1180, 1182 passim
We provide a detailed overview of the choices inherent in performing a sequence database search, including the choice of algorithm, substitution matrix and gap model. Each of these choices has implications that can be described as restrictions on the underlying model of sequence evolution, the expected degree of divergence between the query sequence and the database sequences (if one uses an evolutionary based matrix), as well as the sensitivity and selectivity of the search. We conclude with a series of recommendations for researchers performing these searches based on our experience and literature studies.

10.
11.
Publicly available cDNA sequence data of Citrullus lanatus were searched for simple sequence repeats (SSRs). Nineteen microsatellites were identified and primer pairs were designed to amplify those loci. Primers were evaluated for their ability to detect polymorphisms within a set of several watermelon varieties and local landraces, C. colocynthis, and interspecific hybrids. Eighteen polymorphic SSR loci were identified. These polymorphic loci can be used for varietal identification and other uses.
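Searching sequence data for SSRs can be illustrated with a regular expression that looks for a short motif repeated in tandem. The sketch below is only an assumption about how such a scan might look; the abstract does not say which tool or thresholds were used, so the function name `find_ssrs` and the default motif lengths and repeat counts are hypothetical.

```python
import re


def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    The thresholds are illustrative defaults, not values from the study.
    """
    # Lazy motif match so the shortest repeating unit is reported
    # (e.g. "AT" rather than "ATAT"); \1 requires exact tandem copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Made-up sequence containing an (AT)5 and a (GA)5 repeat.
example = "GGCATATATATATCCGAGAGAGAGAGTT"
ssrs = find_ssrs(example)
```

For the made-up sequence, `ssrs` holds one entry for the AT repeat starting at position 3 and one for the GA repeat starting at position 15. Primer design around the flanking regions, as described in the abstract, is a separate step that this sketch does not cover.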

12.
MOTIVATION: At present, mapping of sequence identifiers across databases is a daunting, time-consuming and computationally expensive process, usually achieved by sequence similarity searches with strict threshold values. SUMMARY: We present a rapid and efficient method to map sequence identifiers across databases. The method uses the MD5 checksum algorithm for message integrity to generate sequence fingerprints and uses these fingerprints as hash strings to map sequences across databases. The program, called MagicMatch, is able to cross-link any of the major sequence databases within a few seconds on a modest desktop computer.
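The fingerprinting approach described above can be sketched with Python's standard `hashlib` module. This is not MagicMatch itself: the function names are hypothetical, and the normalization step (removing whitespace and upper-casing before hashing) is an assumption about what is needed for two databases to yield the same checksum for the same sequence.

```python
import hashlib


def fingerprint(sequence):
    """MD5 hex digest of a sequence after a simple, assumed normalization."""
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


def index_by_fingerprint(records):
    """Map fingerprint -> list of identifiers sharing that exact sequence."""
    index = {}
    for identifier, sequence in records.items():
        index.setdefault(fingerprint(sequence), []).append(identifier)
    return index


def map_identifiers(db_a, db_b):
    """For each identifier in db_a, list identifiers in db_b with an identical sequence."""
    index_b = index_by_fingerprint(db_b)
    return {
        id_a: index_b.get(fingerprint(sequence), [])
        for id_a, sequence in db_a.items()
    }


# Made-up records standing in for two sequence databases.
db_a = {"A1": "MKTAYIAK", "A2": "MSSSS"}
db_b = {"B9": "mktay iak\n", "B7": "MKQQ"}
mapping = map_identifiers(db_a, db_b)
```

In this toy case `A1` maps to `B9` and `A2` has no counterpart. Because the lookup is a dictionary access on a fixed-length digest, the cost grows with the number of records rather than with pairwise sequence comparisons, which is the contrast with similarity searches that the abstract draws. Only identical sequences are linked; near-identical entries are missed by design.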

13.
14.

Background

Bisulfite sequencing using next generation sequencers yields genome-wide measurements of DNA methylation at single nucleotide resolution. Traditional aligners are not designed for mapping bisulfite-treated reads, where the unmethylated Cs are converted to Ts. We have developed BS Seeker, an approach that converts the genome to a three-letter alphabet and uses Bowtie to align bisulfite-treated reads to a reference genome. It uses sequence tags to reduce mapping ambiguity. Post-processing of the alignments removes non-unique and low-quality mappings.

Results

We tested our aligner on synthetic data, a bisulfite-converted Arabidopsis library, and human libraries generated from two different experimental protocols. We evaluated the performance of our approach and compared it to other bisulfite aligners. The results demonstrate that among the aligners tested, BS Seeker is more versatile and faster. When mapping to the human genome, BS Seeker generates alignments significantly faster than RMAP and BSMAP. Furthermore, BS Seeker is the only alignment tool that can explicitly account for tags which are generated by certain library construction protocols.

Conclusions

BS Seeker provides fast and accurate mapping of bisulfite-converted reads. It can work with BS reads generated from the two different experimental protocols, and is able to efficiently map reads to large mammalian genomes. The Python program is freely available at http://pellegrini.mcdb.ucla.edu/BS_Seeker/BS_Seeker.html.
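The three-letter-alphabet idea from the Background paragraph can be shown on a toy example. The sketch below is not BS Seeker's code: it handles only the forward strand, uses exact string matching as a stand-in for Bowtie, ignores sequence tags and quality filtering, and assumes the genome string is upper-case. All function names are hypothetical.

```python
def c_to_t(sequence):
    """Convert a sequence to the three-letter alphabet (every C becomes T)."""
    return sequence.upper().replace("C", "T")


def map_read(read, genome):
    """Return all start positions where the converted read matches the converted genome.

    Exact matching is used here purely as a placeholder for a real aligner.
    """
    converted_genome = c_to_t(genome)
    converted_read = c_to_t(read)
    positions = []
    start = converted_genome.find(converted_read)
    while start != -1:
        positions.append(start)
        start = converted_genome.find(converted_read, start + 1)
    return positions


def call_methylation(read, genome, start):
    """For each genomic C covered by the read, report True if the read kept a C."""
    calls = {}
    window = genome[start:start + len(read)]
    for offset, (ref_base, read_base) in enumerate(zip(window, read.upper())):
        if ref_base == "C":
            # A retained C means the cytosine was protected from conversion.
            calls[start + offset] = read_base == "C"
    return calls


# Toy genome and a read from positions 2-9 in which the first C stayed C
# (methylated) and the second C was converted to T (unmethylated).
genome = "AACGTTCGGACTA"
read = "CGTTTGGA"
positions = map_read(read, genome)
calls = call_methylation(read, genome, positions[0]) if len(positions) == 1 else {}
```

Mapping in the converted space lets a read align regardless of its methylation state; the original bases are consulted only afterwards to make the calls. Keeping a read only when `positions` has exactly one entry mirrors, in a very simplified way, the removal of non-unique mappings mentioned in the abstract.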

15.
Nucleic acid and protein sequences contain a wealth of information of interest to molecular biologists. The advent of molecular sequence databases provides a unique opportunity for the computer analysis of all available sequences. Sequence databases serve two main functions: (i) to facilitate comparisons with newly determined sequences, and (ii) to act as a source of data for the generation and testing of hypotheses concerning molecular sequence organisation and evolution. The large amounts of sequence data now becoming available require that algorithms for database searching be fast and efficient, and considerable progress is being made in this area.

16.

Background  

Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression, an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil.
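One way to see why general-purpose gzip is a weak baseline for nucleotide data is to compare it with naive 2-bit packing, which already gives a fixed 4:1 ratio for plain A/C/G/T text. The sketch below says nothing about how coil works; it uses a synthetic random sequence as a stand-in for real data (an assumption, since real sequences contain repeats and flat files also carry annotation text), and the helper names are hypothetical.

```python
import gzip
import random

CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack_2bit(sequence):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    packed = bytearray()
    for i in range(0, len(sequence), 4):
        chunk = sequence[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        packed.append(byte)
    return bytes(packed)


def compression_ratio(original_size, compressed_size):
    """Ratio of uncompressed to compressed size (higher is better)."""
    return original_size / compressed_size


# Synthetic sequence with a fixed seed so the comparison is repeatable.
rng = random.Random(0)
sequence = "".join(rng.choice("ACGT") for _ in range(100_000))
raw = sequence.encode("ascii")

gzip_ratio = compression_ratio(len(raw), len(gzip.compress(raw)))
packed_ratio = compression_ratio(len(raw), len(pack_2bit(sequence)))
```

On this synthetic input the packed ratio is exactly 4, and gzip stays below it because a uniformly random four-letter sequence cannot be encoded in fewer than two bits per base. Results on real database files will differ, so this is a motivation for sequence-aware compression rather than a benchmark.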

17.
This paper aims to give an overview of current resources on human sequence variations and give an idea about the direction in which these services are moving.

18.
Light-weight integration of molecular biological databases   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: Due to the increasing number of molecular biological databases and the exponential growth of their contents, database integration is an important topic of research in bioinformatics. Existing approaches in this area have in common that considerable efforts are needed to provide integrated access to heterogeneous data sources. RESULTS: This article describes the LIMBO architecture as a light-weight approach to molecular biological database integration. By building systems upon this architecture, the efforts needed for database integration can be significantly lowered. AVAILABILITY: As an illustration of the principal usefulness of the underlying ideas, a prototypical implementation based upon the LIMBO architecture is described. This implementation is exclusively based on freely available open source components like the PostgreSQL database management system and the BioRuby project. Additional files and modified components are available upon request from the author.

19.
The relational phenomena exhibited by metabolizing systems may be considered as special cases of those exhibited by a more general class of systems. This class is specified, and some of its properties developed. An attempt is then made to apply these properties to a theory of metabolism by suitable specialization. A number of biologically significant theorems are obtained which apply directly to the theory of the free-living single cell. Among the results obtained are the following: On the basis of our model, there must always exist a component of the system which cannot be replaced or repaired by the system in the event of its inhibition or destruction. Under certain conditions, a metabolizing system possesses a component the inhibition of which will completely terminate the metabolic activity of the system. Furthermore, a number of other diverse phenomena, such as the effects of a deficient environment, encystment phenomena, and even an indication of why a metabolizing system which represents a cell should possess a nucleus, follow in a straightforward fashion from our model.

20.
BIOSILICO 2003, 1(4): 134-142
The increasing amount of data produced by large-scale biological experiments has highlighted the inadequacies of traditional scientific data management methods such as laboratory notebooks. Databases designed to store biological information are becoming increasingly common, but there is little guidance in the literature about the best practices of biological database design. This paper suggests best practices, and provides examples for the implementation of these practices.

