1.

SMART: a web-based tool for the study of genetically mobile domains (total citations: 61; self-citations: 2; citations by others: 59)

Schultz J Copley RR Doerks T Ponting CP Bork P 《Nucleic acids research》2000,28(1):231-234

SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures (http://SMART.embl-heidelberg.de). More than 400 domain families found in signalling, extra-cellular and chromatin-associated proteins are detectable. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues. Each domain found in a non-redundant protein database as well as search parameters and taxonomic information are stored in a relational database system. User interfaces to this database allow searches for proteins containing specific combinations of domains in defined taxa.
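The kind of search described in the abstract above (proteins containing a given combination of domains within a defined taxon) can be sketched as a relational query. This is a minimal illustration only: the table names, column names, and toy rows are hypothetical and are not SMART's actual schema or data.

```python
import sqlite3

# Hypothetical two-table schema: one row per protein, one row per domain hit.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein (id TEXT PRIMARY KEY, taxon TEXT);
CREATE TABLE domain_hit (protein_id TEXT, domain TEXT);
""")
# Toy data, invented for illustration.
conn.executemany("INSERT INTO protein VALUES (?, ?)", [
    ("P1", "Metazoa"), ("P2", "Metazoa"), ("P3", "Fungi"),
])
conn.executemany("INSERT INTO domain_hit VALUES (?, ?)", [
    ("P1", "SH2"), ("P1", "SH3"),
    ("P2", "SH2"),
    ("P3", "SH2"), ("P3", "SH3"),
])

def proteins_with_domains(conn, domains, taxon):
    """Return ids of proteins in `taxon` that contain every domain in `domains`."""
    wanted = sorted(set(domains))
    placeholders = ",".join("?" for _ in wanted)
    sql = f"""
        SELECT p.id
        FROM protein AS p
        JOIN domain_hit AS d ON d.protein_id = p.id
        WHERE p.taxon = ? AND d.domain IN ({placeholders})
        GROUP BY p.id
        HAVING COUNT(DISTINCT d.domain) = ?
        ORDER BY p.id
    """
    rows = conn.execute(sql, [taxon, *wanted, len(wanted)]).fetchall()
    return [row[0] for row in rows]

result = proteins_with_domains(conn, ["SH2", "SH3"], "Metazoa")
print(result)
```

The `HAVING COUNT(DISTINCT ...)` clause is what enforces "all of these domains" rather than "any of them"; a production system would also need to handle domain order and repeats, which this sketch ignores.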

2.

CDD: a curated Entrez database of conserved domain alignments

Marchler-Bauer A Anderson JB DeWeese-Scott C Fedorova ND Geer LY He S Hurwitz DI Jackson JD Jacobs AR Lanczycki CJ Liebert CA Liu C Madej T Marchler GH Mazumder R Nikolskaya AN Panchenko AR Rao BS Shoemaker BA Simonyan V Song JS Thiessen PA Vasudevan S Wang Y Yamashita RA Yin JJ Bryant SH 《Nucleic acids research》2003,31(1):383-387

The Conserved Domain Database (CDD) is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE®. This allows users to search for domain types by name, for example, or to view the domain architecture of any protein in Entrez's sequence database. CDD can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=cdd. Users may also employ the CD-Search service to identify conserved domains in new sequences, at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. CD-Search results, and pre-computed links from Entrez's protein database, are calculated using the RPS-BLAST algorithm and Position Specific Score Matrices (PSSMs) derived from CDD alignments. CD-Searches are also run by default for protein-protein queries submitted to BLAST® at http://www.ncbi.nlm.nih.gov/BLAST. CDD mirrors the publicly available domain alignment collections SMART and PFAM, and now also contains alignment models curated at NCBI. Structure information is used to identify the core substructure likely to be present in all family members, and to produce sequence alignments consistent with structure conservation. This alignment model allows NCBI curators to annotate 'columns' corresponding to functional sites conserved among family members.
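To make the role of a PSSM concrete, the sketch below scores every ungapped window of a query sequence against a tiny position-specific score matrix and reports the best one. It is not RPS-BLAST, which additionally handles gapped alignment and statistical significance; the matrix values, the default penalty, and the function names are all invented for illustration and are not taken from CDD.

```python
# Toy PSSM: one dict per motif position, mapping residue -> score.
# All values are made up for illustration.
PSSM = [
    {"G": 2, "A": 1},
    {"K": 3, "R": 2},
    {"S": 2, "T": 2},
]
DEFAULT_SCORE = -1  # assumed penalty for residues not listed at a position

def score_window(window, pssm=PSSM):
    """Sum the position-specific scores of each residue in the window."""
    return sum(column.get(residue, DEFAULT_SCORE)
               for residue, column in zip(window, pssm))

def best_hit(sequence, pssm=PSSM):
    """Return (start_index, score) of the highest-scoring ungapped window."""
    width = len(pssm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PSSM")
    best_start, best_score = 0, score_window(sequence[:width], pssm)
    for start in range(1, len(sequence) - width + 1):
        score = score_window(sequence[start:start + width], pssm)
        if score > best_score:  # ties keep the earliest window
            best_start, best_score = start, score
    return best_start, best_score

print(best_hit("MAGKSLL"))
```

The point of a position-specific matrix, as opposed to a single substitution matrix, is that the same residue can score differently at each column of the alignment model.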

3.

Mendel-GFDb and Mendel-ESTS: databases of plant gene families and ESTs annotated with gene family numbers and gene family names

Lonsdale D Crowe M Arnold B Arnold BC 《Nucleic acids research》2001,29(1):120-122

4.

ProRule: a new database containing functional and structural information on PROSITE profiles

Sigrist CJ De Castro E Langendijk-Genevaux PS Le Saux V Bairoch A Hulo N 《Bioinformatics (Oxford, England)》2005,21(21):4060-4066

MOTIVATION: Increase the discriminatory power of PROSITE profiles to facilitate function determination and provide biologically relevant information about domains detected by profiles for the annotation of proteins. SUMMARY: We have created a new database, ProRule, which contains additional information about PROSITE profiles. ProRule contains notably the position of structurally and/or functionally critical amino acids, as well as the condition they must fulfill to play their biological role. These supplementary data should help function determination and annotation of the UniProt Swiss-Prot knowledgebase. ProRule also contains information about the domain detected by the profile in the Swiss-Prot line format. Hence, ProRule can be used to make Swiss-Prot annotation more homogeneous and consistent. The format of ProRule can be extended to provide information about combination of domains. AVAILABILITY: ProRule can be accessed through ScanProsite at http://www.expasy.org/tools/scanprosite. A file containing the rules will be made available under the PROSITE copyright conditions on our ftp site (ftp://www.expasy.org/databases/prosite/) by the next PROSITE release.

5.

Computational analysis of protein tyrosine phosphatases: practical guide to bioinformatics and data resources

Andersen JN Del Vecchio RL Kannan N Gergel J Neuwald AF Tonks NK 《Methods (San Diego, Calif.)》2005,35(1):90-114

The exponential growth of sequence data has become a challenge to database curators and end-users alike, and biologists seeking to utilize the data effectively are faced with numerous analysis methods. Here, with practical examples from our bioinformatics analysis of the protein tyrosine phosphatases (PTPs), we show how computational analysis can be exploited to fuel hypothesis-driven experimental research through the exploration of online databases. We cover the following elements: (i) similarity searches and strategies to collect a non-redundant database of tyrosine-specific PTP domains; (ii) utilization of this database to classify human, fly, and worm PTPs (based on alignments and phylogenetic analysis); (iii) three-dimensional structural analysis to identify conserved regions (structure-function) and non-conserved selectivity-determining regions (substrate specificity); and (iv) genomic analysis, including mapping of exon structure, identification of pseudogenes, and exploration of disease databases. We discuss the importance of manual curation, illustrating examples in which pseudogenes give rise to predicted proteins in GenBank, and note that domain servers, such as PFAM and SMART, erroneously include dual-specificity and lipid phosphatases in their collection of tyrosine-specific PTPs. To capitalize on our annotated set of 402 PTP domains (from 47 species and five phyla), we identify sequence conservation across taxonomic categories and explore structure-function relationships among tandem domain receptor-like PTPs. We define three Src homology 2 domain-containing PTP genes in stingray, zebrafish, and fugu and speculate on their evolutionary relationship with human pseudogenes. Our annotated sequences, along with a web service for phylogenetic classification of PTP domains, are available online (http://ptp.cshl.edu and http://science.novonordisk.com/ptp).

6.

NIFAS: visual analysis of domain evolution in proteins

Storm CE Sonnhammer EL 《Bioinformatics (Oxford, England)》2001,17(4):343-348

MOTIVATION: Multi-domain proteins have evolved by insertions or deletions of distinct protein domains. Tracing the history of a certain domain combination can be important for functional annotation of multi-domain proteins, and for understanding the function of individual domains. In order to analyze the evolutionary history of the domains in modular proteins it is desirable to inspect a phylogenetic tree based on sequence divergence with the modular architecture of the sequences superimposed on the tree. RESULT: A Java applet, NIFAS, that integrates graphical domain schematics for each sequence in an evolutionary tree was developed. NIFAS retrieves domain information from the Pfam database and uses CLUSTAL W to calculate a tree for a given Pfam domain. The tree can be displayed with symbolic bootstrap values, and to allow the user to focus on a part of the tree, the layout can be altered by swapping nodes, changing the outgroup, and showing/collapsing subtrees. NIFAS is integrated with the Pfam database and is accessible over the internet (http://www.cgr.ki.se/Pfam). As an example, we use NIFAS to analyze the evolution of domains in Protein Kinases C.

7.

Automated Improvement of Domain ANnotations using context analysis of domain arrangements (AIDAN)

Beaussart F Weiner J Bornberg-Bauer E 《Bioinformatics (Oxford, England)》2007,23(14):1834-1836

MOTIVATION: Since protein domains are the units of evolution, databases of domain signatures such as ProDom or Pfam enable both a sensitive and selective sequence analysis. However, manually curated databases have a low coverage and automatically generated ones often miss relationships which have not yet been discovered between domains or cannot display similarities between domains which have drifted apart. METHODS: We present a tool which makes use of the fact that overall domain arrangements are often conserved. AIDAN (Automated Improvement of Domain ANnotations) identifies potential annotation artifacts and domains which have drifted apart. The underlying database supplements ProDom and is interfaced by a graphical tool allowing the localization of single domain deletions or annotations which have been falsely made by the automated procedure. AVAILABILITY: http://www.uni-muenster.de/Evolution/ebb/Services/AIDAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

8.

PIR: a new resource for bioinformatics (total citations: 3; self-citations: 0; citations by others: 3)

McGarvey PB Huang H Barker WC Orcutt BC Garavelli JS Srinivasarao GY Yeh LS Xiao C Wu CH 《Bioinformatics (Oxford, England)》2000,16(3):290-291

SUMMARY: The Protein Information Resource (PIR) has greatly expanded its Web site and developed a set of interactive search and analysis tools to facilitate the analysis, annotation, and functional identification of proteins. New search engines have been implemented to combine sequence similarity search results with database annotation information. The new PIR search systems have proved very useful in providing enriched functional annotation of protein sequences, determining protein superfamily-domain relationships, and detecting annotation errors in genomic database archives. AVAILABILITY: http://pir.georgetown.edu/. CONTACT: mcgarvey@nbrf.georgetown.edu

9.

The Pfam Protein Families Database (total citations: 17; self-citations: 0; citations by others: 17)

Alex Bateman Ewan Birney Lorenzo Cerruti Richard Durbin Laurence Etwiller Sean R. Eddy Sam Griffiths-Jones Kevin L. Howe Mhairi Marshall Erik L. L. Sonnhammer 《Nucleic acids research》2002,30(1):276-280

Pfam is a large collection of protein multiple sequence alignments and profile hidden Markov models. Pfam is available on the World Wide Web in the UK at http://www.sanger.ac.uk/Software/Pfam/, in Sweden at http://www.cgb.ki.se/Pfam/, in France at http://pfam.jouy.inra.fr/ and in the US at http://pfam.wustl.edu/. The latest version (6.6) of Pfam contains 3071 families, which match 69% of proteins in SWISS-PROT 39 and TrEMBL 14. Structural data, where available, have been utilised to ensure that Pfam families correspond with structural domains, and to improve domain-based annotation. Predictions of non-domain regions are now also included. In addition to secondary structure, Pfam multiple sequence alignments now contain active site residue mark-up. New search tools, including taxonomy search and domain query, greatly add to the functionality and usability of the Pfam resource.

10.

PartiGene--constructing partial genomes (total citations: 4; self-citations: 0; citations by others: 4)

Parkinson J Anthony A Wasmuth J Schmid R Hedley A Blaxter M 《Bioinformatics (Oxford, England)》2004,20(9):1398-1404

Expressed sequence tags (ESTs) offer a low-cost approach to gene discovery and are being used by an increasing number of laboratories to obtain sequence information for a wide variety of organisms. The challenge lies in processing and organizing this data within a genomic context to facilitate large scale analyses. Here we present PartiGene, an integrated sequence analysis suite that uses freely available public domain software to (1) process raw trace chromatograms into sequence objects suitable for submission to dbEST; (2) place these sequences within a genomic context; (3) perform customizable first-pass annotation of the data; and (4) present the data as HTML tables and an SQL database resource. PartiGene has been used to create a number of non-model organism database resources including NEMBASE (http://www.nematodes.org) and LumbriBase (http://www.earthworms.org/). The packages are readily portable, freely available and can be run on simple Linux-based workstations. AVAILABILITY: PartiGene is available from http://www.nematodes.org/PartiGene and also forms part of the EST analysis software, associated with the Natural Environmental Research Council (UK) Bio-Linux project (http://envgen.nox.ac.uk/biolinux.html).

11.

VIP DB - A viral protein domain usage and distribution database

Chen TW Gan RR Wu TH Lin WC Tang P 《Genomics》2012,100(3):149-156

During the viral infection and replication processes, viral proteins are highly regulated and may interact with host proteins. However, the functions and interaction partners of many viral proteins have yet to be explored. Here, we compiled a VIral Protein domain DataBase (VIP DB) to associate viral proteins with putative functions and interaction partners. We systematically assign domains and infer the functions of proteins and their protein interaction partners from their domain annotations. A total of 2,322 unique domains that were identified from 2,404 viruses are used as a starting point to correlate GO classification, KEGG metabolic pathway annotation and domain-domain interactions. Of the unique domains, 42.7% have GO records, 39.6% have at least one domain-domain interaction record and 26.3% can also be found in either mammals or plants. This database provides a resource to help virologists identify potential roles for viral proteins. All of the information is available at http://vipdb.cgu.edu.tw.

12.

Automatic annotation of protein function based on family identification

Abascal F Valencia A 《Proteins》2003,53(3):683-692

13.

RiceGAAS: an automated annotation system and database for rice genome sequence (total citations: 27; self-citations: 0; citations by others: 27)

Katsumi Sakata Yoshiaki Nagamura Hisataka Numa Baltazar A. Antonio Hideki Nagasaki Atsuko Idonuma Wakako Watanabe Yuji Shimizu Ikuo Horiuchi Takashi Matsumoto Takuji Sasaki Kenichi Higo 《Nucleic acids research》2002,30(1):98-102

An extensive effort of the International Rice Genome Sequencing Project (IRGSP) has resulted in rapid accumulation of genome sequence, and >137 Mb has already been made available to the public domain as of August 2001. This requires a high-throughput annotation scheme to extract biologically useful and timely information from the sequence data on a regular basis. A new automated annotation system and database called Rice Genome Automated Annotation System (RiceGAAS) has been developed to execute a reliable and up-to-date analysis of the genome sequence as well as to store and retrieve the results of annotation. The system has the following functional features: (i) collection of rice genome sequences from GenBank; (ii) execution of gene prediction and homology search programs; (iii) integration of results from various analyses and automatic interpretation of coding regions; (iv) re-execution of analysis, integration and automatic interpretation with the latest entries in reference databases; (v) integrated visualization of the stored data using a web-based graphical view. RiceGAAS also has a data submission mechanism that allows public users to perform fully automated annotation of their own sequences. The system can be accessed at http://RiceGAAS.dna.affrc.go.jp/.

14.

MMDB: Entrez's 3D-structure database

Chen J Anderson JB DeWeese-Scott C Fedorova ND Geer LY He S Hurwitz DI Jackson JD Jacobs AR Lanczycki CJ Liebert CA Liu C Madej T Marchler-Bauer A Marchler GH Mazumder R Nikolskaya AN Rao BS Panchenko AR Shoemaker BA Simonyan V Song JS Thiessen PA Vasudevan S Wang Y Yamashita RA Yin JJ Bryant SH 《Nucleic acids research》2003,31(1):474-477

Three-dimensional structures are now known within most protein families and it is likely, when searching a sequence database, that one will identify a homolog of known structure. The goal of Entrez's 3D-structure database is to make structure information and the functional annotation it can provide easily accessible to molecular biologists. To this end, Entrez's search engine provides several powerful features: (i) links between databases, for example between a protein's sequence and structure; (ii) pre-computed sequence and structure neighbors; and (iii) structure and sequence/structure alignment visualization. Here, we focus on a new feature of Entrez's Molecular Modeling Database (MMDB): Graphical summaries of the biological annotation available for each 3D structure, based on the results of automated comparative analysis. MMDB is available at: http://www.ncbi.nlm.nih.gov/Entrez/structure.html.

15.

RAD and the RAD Study-Annotator: an approach to collection, organization and exchange of all relevant information for high-throughput gene expression studies (total citations: 2; self-citations: 0; citations by others: 2)

Manduchi E Grant GR He H Liu J Mailman MD Pizarro AD Whetzel PL Stoeckert CJ 《Bioinformatics (Oxford, England)》2004,20(4):452-459

16.

The aquatic animals’ transcriptome resource for comparative functional analysis

Chih-Hung Chou Hsi-Yuan Huang Wei-Chih Huang Sheng-Da Hsu Chung-Der Hsiao Chia-Yu Liu Yu-Hung Chen Yu-Chen Liu Wei-Yun Huang Meng-Lin Lee Yi-Chang Chen Hsien-Da Huang 《BMC genomics》2018,19(2):103

17.

Involving undergraduates in the annotation and analysis of global gene expression studies: creation of a maize shoot apical meristem expression database (total citations: 3; self-citations: 0; citations by others: 3)

Buckner B Beck J Browning K Fritz A Grantham L Hoxha E Kamvar Z Lough A Nikolova O Schnable PS Scanlon MJ Janick-Buckner D 《Genetics》2007,176(2):741-747

Through a multi-university and interdisciplinary project we have involved undergraduate biology and computer science research students in the functional annotation of maize genes and the analysis of their microarray expression patterns. We have created a database to house the results of our functional annotation of >4400 genes identified as being differentially regulated in the maize shoot apical meristem (SAM). This database is located at http://sam.truman.edu and is now available for public use. The undergraduate students involved in constructing this unique SAM database received hands-on training in an intellectually challenging environment, which has prepared them for graduate and professional careers in biological sciences. We describe our experiences with this project as a model for effective research-based teaching of undergraduate biology and computer science students, as well as for a rich professional development experience for faculty at predominantly undergraduate institutions.

18.

Architectures of the unique domains associated with the DEAD-box helicase motif

《Cell cycle (Georgetown, Tex.)》2013,12(20):4228-4235

Helicases are motor proteins of biological systems that catalyze the opening of energetically stable duplex nucleic acids in an ATP-dependent manner and are thereby involved in almost all aspects of nucleic acid metabolism, including cell cycle progression. They contain several conserved domains, including the DEAD-box, and also several unique domains associated with these. The Pfam database (http://pfam.janelia.org/) is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). A diverse range of proteins is found in nature, and the functional specificity of each protein is, to a large extent, imparted by its domain architecture. To this end, a DEAD-box ATP-dependent RNA helicase (LOC_Os01g36890; Genomic sequence length: 6284 nucleotides; CDS length: 1299 nucleotides; Protein length: 432 amino acids) was studied. The protein sequence was imported for domain search on Pfam. This particular Pfam entry, after covering a large proportion of the sequences in the underlying database, has generated a more comprehensive coverage across a wide range of phyla of the known domains that are associated with the typical DEAD-box helicase motif. A total of 362 domain architectures were recollected from the Pfam database for the Family: DEAD (PF00270). We have therefore systematically analyzed the domains closely associated with the DEAD-motif, which occur in a variety of proteins and can provide insights into their function.
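The CDS and protein lengths quoted in the abstract above are mutually consistent if the 1299-nucleotide CDS is assumed to include a terminal stop codon: 1299 / 3 = 433 codons, of which 432 encode amino acids. The helper below is a small sanity check of that arithmetic; the function name and the stop-codon assumption are ours, not the authors'.

```python
def protein_length_from_cds(cds_nt, has_stop_codon=True):
    """Number of amino acids encoded by a CDS of `cds_nt` nucleotides.

    Assumes the CDS is in frame; if `has_stop_codon` is True, the final
    codon is treated as a stop codon and does not encode a residue.
    """
    if cds_nt <= 0 or cds_nt % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_nt // 3
    return codons - 1 if has_stop_codon else codons

# Values quoted in the abstract for LOC_Os01g36890.
print(protein_length_from_cds(1299))
```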

19.

EXProt: a database for proteins with an experimentally verified function (total citations: 1; self-citations: 1; citations by others: 0)

Björn M. Ursing Frank H. J. van Enckevort Jack A. M. Leunissen Roland J. Siezen 《Nucleic acids research》2002,30(1):50-51

EXProt is a non-redundant protein database containing a selection of entries from genome annotation projects and public databases, aimed at including only proteins with an experimentally verified function. In EXProt release 2.0 we have collected entries from the Pseudomonas aeruginosa community annotation project (PseudoCAP), the Escherichia coli genome and proteome database (GenProtEC) and the translated coding sequences from the Prokaryotes division of EMBL nucleotide sequence database, which are described as having an experimentally verified function. Each entry in EXProt has a unique ID number and contains information about the species, amino acid sequence, functional annotation and, in most cases, links to references in MEDLINE/PubMed and to the entry in the original database. EXProt is indexed in SRS at CMBI (http://www.cmbi.kun.nl/srs/) and can be searched with BLAST and FASTA through the EXProt web page (http://www.cmbi.kun.nl/EXProt/).

20.

DDBASE2.0: updated domain database with improved identification of structural domains

Vinayagam A Shi J Pugalenthi G Meenakshi B Blundell TL Sowdhamini R 《Bioinformatics (Oxford, England)》2003,19(14):1760-1764

MOTIVATION: Although many methods are available for the identification of structural domains from protein three-dimensional structures, accurate definition of protein domains and the curation of such data for a large number of proteins are often possible only after manual intervention. The availability of domain definitions for protein structural entries is useful for the sequence analysis of aligned domains, structure comparison, fold recognition procedures and understanding protein folding, domain stability and flexibility. RESULTS: We have improved our method of domain identification starting from the concept of clustering secondary structural elements, but with the intention of reducing the number of discontinuous segments in identified domains. The results of our modified and automatic approach have been compared with the domain definitions from other databases. On a test data set of 55 proteins, this method achieves high agreement (88%) in the number of domains with the crystallographers' definition and resources such as SCOP, CATH, DALI, 3Dee and PDP databases. This method also obtains a 98% overlap score with the other resources in the definition of domain boundaries of the 55 proteins. We have examined the domain arrangements of 4592 non-redundant protein chains using the improved method to include 5409 domains, leading to an update of the structural domain database. AVAILABILITY: The latest version of the domain database and online domain identification methods are available from http://www.ncbs.res.in/~faculty/mini/ddbase/ddbase.html. SUPPLEMENTARY INFORMATION: http://www.ncbs.res.in/~faculty/mini/ddbase/supplementary/supplementary.html