Similar literature
1.
A Web-based database system containing 174 tumor suppressor genes was constructed and implemented. The database homepage lists these genes in a pull-down window so that each gene can be viewed individually on a separate Web page. Information displayed on each page includes gene name, aliases, source organism, chromosome location, expression cells/tissues, gene structure, protein size, gene functions and major reference sources. Queries to the database can be conducted through a user-friendly interface, and query results are returned in HTML format on dynamically generated web pages. AVAILABILITY: The database is available at http://www.cise.ufl.edu/~yy1/HTML-TSGDB/Homepage.html (data files also at http://www.patcar.org/Databases/Tumor_Suppressor_Genes)

2.
Assigning functions to proteins of unknown function is of considerable interest to proteomics researchers, as the genes encoding them are conserved across various species. Here, we describe HypoDB, a database of hypothetical genes and proteins in six eukaryotes. The database was collected and organized based on the number of entries in each chromosome with few annotations. The hypothetical protein database contains information related to gene and protein sequences, chromosome number and location, and secondary and tertiary structure. AVAILABILITY: The database is available for free at http://www.trimslabs.com/database/hypodb/index.html.

3.
Shi L, Zhang Q, Rui W, Lu M, Jing X, Shang T, Tang J. Regulatory Peptides, 2004, 120(1-3): 1-3
The Bioactive Peptide Database (BioPD) is a web-based knowledge base that contains more than 1100 protein sequences from human, mouse and rat that are putative or known bioactive peptides. In addition to peptide sequences and their annotation, the database also contains annotated gene sequences, protein interactions and disease data related to the peptides. Each entry has as many references as possible to support the information presented. BioPD consists of six parts: PROTEIN, GENE, DISEASE, LINKS, INTERACTION, and REFERENCE. The database is searchable by keyword, gene and protein name, receptor name, etc. Links to PDB, InterPro, Pfam, OMIM, etc. are provided in each entry. BioPD thus serves as an information center for bioactive peptides and as a gateway for exploring them. The database can be accessed at http://biopd.bjmu.edu.cn.

4.
The inaugural version of the InGaP database (Integrative Gene and Protein expression database; http://www.kazusa.or.jp/ingap/index.html) is a comprehensive database of gene/protein expression profiles of 127 mKIAA genes/proteins related to hypothetical ones obtained in our ongoing cDNA project. Information about each gene/protein consists of cDNA microarray analysis, subcellular localization of the ectopically expressed gene, and experimental data using anti-mKIAA antibody such as Western blotting and immunohistochemical analyses. KIAA cDNAs and their mouse counterparts, mKIAA cDNAs, were mainly isolated from cDNA libraries derived from brain tissues, thus we expect our database to contribute to the field of neuroscience. In fact, cDNA microarray analysis revealed that nearly half of our gene collection is predominantly expressed in brain tissues. Immunohistochemical analysis of the mouse brain provides functional insight into the specific area and/or cell type of the brain. This database will be a resource for the neuroscience community by seamlessly integrating the genomic and proteomic information about the mouse KIAA genes/proteins.

5.
GEO (Gene Expression Omnibus): a high-throughput gene expression database   Cited by: 2 (self-citations: 0, citations by others: 2)
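The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance and experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies, such as serial analysis of gene expression (SAGE) and protein identification technologies, are also archived. To date, the data in GEO cover 10 000 hybridization experiments and SAGE libraries from 30 different organisms. This article gives an overview of querying and browsing GEO, data download and formats, data analysis, and data storage and updating. It focuses on the use of controlled vocabularies in the GEO data browser and discusses data mining of GEO and the prospects for applying GEO in molecular biology. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.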

6.
BioThesaurus is a web-based system designed to map a comprehensive collection of protein and gene names to protein entries in the UniProt Knowledgebase. Currently covering more than two million proteins, BioThesaurus consists of over 2.8 million names extracted from multiple molecular biological databases according to the database cross-references in iProClass. The BioThesaurus web site allows the retrieval of synonymous names of given protein entries and the identification of protein entries sharing the same names. AVAILABILITY: BioThesaurus is accessible for online searching at http://pir.georgetown.edu/iprolink/biothesaurus

7.
MOTIVATION: The advent of genomics yields thousands of reading frames in search of function. Identification of conserved functional motifs in protein sequences can be helpful for function prediction. RESULTS: A database and a classification of reported DNA-binding protein motifs have been designed. A program ('TranScout') has been developed for the detection and evaluation of conserved motifs in prokaryotic and eukaryotic sequences of proteins with a gene regulatory function. The efficiency of the program is shown in a benchmark against a database obtained from SWISS-PROT without the protein sequences used to train the program. All motifs were detected with a mean average sensitivity of 0.98 and a mean average specificity of 0.92. AVAILABILITY: The program is freely available for use on the internet at http://luz.uab.es/transcout/. The user can find additional information at this site.

8.
Random clones from a cDNA library made from mRNA purified from dissected salivary glands of feeding female Amblyomma variegatum ticks were subjected to single pass sequence analysis. A total of 3992 sequences with an average read length of 580 nucleotides have been used to construct a gene index called AvGI that consists of 2109 non-redundant sequences. A provisional gene identity has been assigned to 39% of the database entries by sequence similarity searches against a non-redundant amino acid database and a protein database that has been assigned gene ontology terms. Homologs of genes encoding basic cellular functions including previously characterised enzyme activities, such as stearoyl CoA saturase and protein phosphatase, of ixodid tick salivary glands were found. Several families of abundant cDNA sequences that may code for protein components of tick cement, A. variegatum proteins that may contribute to anti-haemostatic and anti-inflammatory responses, and one with potential immunosuppressive activity were also identified. Interference with the function of such proteins might disrupt the life cycle of A. variegatum and help to control this ectoparasite or to reduce its ability to transmit disease-causing organisms. AvGI represents an electronic knowledge base, which can be used to launch investigations of the biology of the salivary glands of this tick species. The database may be accessed via the World Wide Web at http://www.tigr.org/tdb/tgi.shtml.

9.
In pursuit of a better, up-to-date source that includes 'omics' information for breast cancer, the Breast Cancer Database (BCDB) has been developed to give researchers a quick overview of breast cancer and other relevant information. The database comprises a wide range of information about genes involved in breast cancer, their functions, and drug molecules currently used in the treatment of breast cancer. The data available in BCDB are retrieved from the biomedical research literature. Users can search for information on a gene, its chromosomal location, its functions and its importance in cancer. Broadly, the database can be queried by gene name, protein name and drug name. It is platform independent, user friendly and freely accessible through the internet. The data in BCDB are directly linked to other on-line resources such as NCBI, PDB and PubMed. Hence, it can act as a complete web resource comprising gene sequences, drug structures and literature information related to breast cancer, which is not available in any other breast cancer database. AVAILABILITY: The database is freely available at http://122.165.25.137/bioinfo/breastcancerdb/

10.
Post-processing of BLAST results using databases of clustered sequences   Cited by: 1 (self-citations: 0, citations by others: 1)
Motivation: When evaluating the results of a sequence similarity search, there are many situations where it can be useful to determine whether sequences appearing in the results share some distinguishing characteristic. Such dependencies between database entries are often not readily identifiable, but can yield important new insights into the biological function of a gene or protein. Results: We have developed a program called CBLAST that sorts the results of a BLAST sequence similarity search according to sequence membership in user-defined 'clusters' of sequences. To demonstrate the utility of this application, we have constructed two cluster databases. The first describes clusters of nucleotide sequences representing the same gene, as documented in the UNIGENE database, and the second describes clusters of protein sequences which are members of the protein families documented in the PROSITE database. Cluster databases and the CBLAST post-processor provide an efficient mechanism for identifying and exploring relationships and dependencies between new sequences and database entries. Availability: The software described in this article is available free of charge from the EBI software archive at ftp://ftp.ebi.ac.uk/pub/software/unix. Contact: E-mail: rainer_fuchs@glaxowellcome.com
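
The idea of sorting search hits by cluster membership can be illustrated with a minimal Python sketch. This is not the CBLAST implementation: the function name `group_hits_by_cluster`, the `(sequence_id, score)` hit format, the `cluster_of` lookup table and the `"unclustered"` label are all assumptions made for illustration only.

```python
from collections import defaultdict


def group_hits_by_cluster(hits, cluster_of):
    """Group (sequence_id, score) hits by cluster, best-scoring cluster first.

    hits       -- list of (sequence_id, score) tuples (hypothetical format)
    cluster_of -- dict mapping sequence_id -> cluster name (hypothetical table)
    """
    groups = defaultdict(list)
    for seq_id, score in hits:
        # Hits with no known cluster are collected under a catch-all label.
        groups[cluster_of.get(seq_id, "unclustered")].append((seq_id, score))
    # Within each cluster, list the highest-scoring hit first.
    for members in groups.values():
        members.sort(key=lambda hit: hit[1], reverse=True)
    # Order the clusters by the score of their best hit.
    return sorted(groups.items(), key=lambda item: item[1][0][1], reverse=True)


hits = [("seqA", 210.0), ("seqB", 95.5), ("seqC", 180.0), ("seqD", 60.0)]
cluster_of = {"seqA": "cluster1", "seqC": "cluster1", "seqB": "cluster2"}
grouped = group_hits_by_cluster(hits, cluster_of)
for name, members in grouped:
    print(name, members)
```

Grouping by a plain dictionary lookup keeps the sketch independent of any particular cluster database; a real cluster database such as those built from UNIGENE or PROSITE would supply the `cluster_of` mapping.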

11.
With the exponentially increasing amount of information in the biomedical field, the significance of advanced information retrieval and information extraction, as well as the role of databases, has been increasing. PRIME is an integrated gene/protein informatics database based on natural language processing. It provides automatically extracted protein/family/gene/compound interaction information including both physical and genetic interactions, gene ontology based functions, and graphic pathway viewers. Gene/protein/family names and functional terms are recognized based on dictionaries developed in our laboratory. The interaction and functional information are extracted by syntactic dependencies and various phrase patterns. We have included about 920,000 (non-redundant) protein interactions and 360,000 annotated gene-function relationships for major eukaryotes. By combining the sequence and text information, the pathway comparison between two organisms and simple pathway deduction based on other organism interaction data, and pathway filtering using tissue expression data, are also available. This database is accessible at http://prime.ontology.ims.u-tokyo.ac.jp:8081.

12.
13.
SUMMARY: Alternative translational initiation is an important cellular mechanism contributing to the diversity of protein products and functions. We developed a database that provides a comprehensive collection of alternative translational initiation events. The purpose of this alternative translational initiation database (ATID) is to facilitate the systematic study of alternative translational initiation of genes. The current version of the database contains 300 genes from Homo sapiens, Mus musculus and other species. Each of the genes has two or more isoforms due to alternative translational initiation. Resources in ATID, including gene information, alternative products of genes and domain structures of isoforms, are provided through a user-friendly web interface. AVAILABILITY: The ATID database is available for public use at http://bioinfo.au.tsinghua.edu.cn/atie/.

14.
BACKGROUND: Dystrophin is a large essential protein of skeletal and heart muscle. It is a filamentous scaffolding protein with numerous binding domains. Mutations in the DMD gene, which encodes dystrophin, mostly result in the deletion of one or several exons and cause Duchenne (DMD) and Becker (BMD) muscular dystrophies. The most common DMD mutations are frameshift mutations resulting in an absence of dystrophin from tissues. In-frame DMD mutations are less frequent and result in a protein with partial wild-type dystrophin function. The aim of this study was to highlight structural and functional modifications of dystrophin caused by in-frame mutations. METHODS AND RESULTS: We developed a dedicated database for dystrophin, the eDystrophin database. It contains 209 different non-frameshifting mutations found in 945 patients from a French cohort and previous studies. Bioinformatics tools provide models of the three-dimensional structure of the protein at deletion sites, making it possible to determine whether the mutated protein retains the typical filamentous structure of dystrophin. An analysis of the structure of mutated dystrophin molecules showed that hybrid repeats were reconstituted at the deletion site in some cases. These hybrid repeats harbored the typical triple coiled-coil structure of native repeats, which may be correlated with better function in muscle cells. CONCLUSION: This new database focuses on the dystrophin protein and its modification due to in-frame deletions in BMD patients. The observation of hybrid repeat reconstitution in some cases provides insight into phenotype-genotype correlations in dystrophin diseases and possible strategies for gene therapy. The eDystrophin database is freely available: http://edystrophin.genouest.org/.

15.
The apple (Malus domestica) is one of the most economically important fruit crops in the world, due to its importance to human nutrition and health. To analyze the function and evolution of different apple genes, we developed the apple gene function and gene family database (AppleGFDB) for collecting, storing, arranging, and integrating functional genomics information of the apple. The AppleGFDB provides several layers of information about the apple genes, including nucleotide and protein sequences, chromosomal locations, gene structures, and any publications related to these annotations. To further analyze the functional genomics data of apple genes, the AppleGFDB was designed to enable users to easily retrieve information through a suite of interfaces, including gene ontology, protein domain and InterPro. In addition, the database provides tools for analyzing the expression profiles and microRNAs of the apple. Moreover, all of the analyzed and collected data can be downloaded from the database. The database can also be accessed using a convenient web server that supports a full-text search, a BLAST sequence search, and database browsing. Furthermore, to facilitate cooperation among apple researchers, AppleGFDB is presented in a user-interactive platform, which provides users with the opportunity to modify apple gene annotations and submit publication information for related genes. AppleGFDB is available at http://www.applegene.org or http://gfdb.sdau.edu.cn/.

16.
MOTIVATION: A comprehensive gene expression database is essential for computer modeling and simulation of biological phenomena, including development. Development is a four-dimensional (4D; 3D structure and time course) phenomenon. We are constructing a 4D database of gene expression for the early embryogenesis of the nematode Caenorhabditis elegans. As a framework of the 4D database, we have constructed computer graphics (CG), into which we will incorporate the expression data of a number of genes at the subcellular level. However, the assignment of 3D distribution of gene products (protein, mRNA), of embryos at various developmental stages, is both difficult and tedious. We need to automate this process. For this purpose, we developed a new system, named SPI after superimposing fluorescent confocal microscopic data onto a CG framework. RESULTS: The scheme of this system comprises the following: (1) acquisition of serial sections (40 slices) of fluorescent confocal images of three colors (4',6-diamidino-2-phenylindole (DAPI) for nuclei, indodicarbocyanine (Cy-3) for the internal marker, which is a germline-specific protein POS-1 and indocarbocyanine (Cy-5) for the gene product to be examined); (2) identification of several features of the stained embryos, such as contour, developmental stage and position of the internal marker; (3) selection of CG images of the corresponding stage for template matching; (4) superimposition of serial sections onto the CG; (5) assignment of the position of superimposed gene products. The Snakes algorithm identified the embryo contour. The detection accuracy of embryo contours was 92.1% when applied to 2- to 28-cell-stage embryos. The accuracy of the developmental stage prediction method was 81.2% for 2- to 8-cell-stage embryos. We manually judged only the later stage embryos because the accuracy for embryos at the later stages was unsatisfactory due to experimental noise effects. Finally, our system chose the optimal CG and performed the superposition and assignment of gene product distribution. We established an initial 4D gene expression database with 56 maternal gene products. AVAILABILITY: This system is available at http://anti.lab.nig.ac.jp/spi/ and http://anti.lab.nig.ac.jp/4ddb/

17.
In a database search for homologs of acyl-coenzyme A oxidases (ACX) in Arabidopsis, we identified a partial genomic sequence encoding an apparently novel member of this gene family. Using this sequence information we then isolated the corresponding full-length cDNA from etiolated Arabidopsis cotyledons and have characterized the encoded recombinant protein. The polypeptide contains 675 amino acids. The 34 residues at the amino terminus have sequence similarity to the peroxisomal targeting signal 2 of glyoxysomal proteins, including the R-[I/Q/L]-X5-HL-XL-X15-22-C consensus sequence, suggesting a possible microsomal localization. Affinity purification of the encoded recombinant protein expressed in Escherichia coli, followed by enzymatic assay, showed that this enzyme is active on C8:0- to C14:0-coenzyme A with maximal activity on C12:0-coenzyme A, indicating that it has medium-chain-specific activity. These data indicate that the protein reported here is different from previously characterized classes of ACX1, ACX2, and short-chain ACX (SACX), both in sequence and substrate chain-length specificity profile. We therefore designate this new gene AtACX3. The temporal and spatial expression patterns of AtACX3 during development and in various tissues were similar to those of the AtSACX and other genes expressed in glyoxysomes. Currently available database information indicates that AtACX3 is present as a single copy gene.
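
A consensus pattern such as R-[I/Q/L]-X5-HL-XL-X15-22-C can be checked against a sequence with a regular expression. The sketch below is only one possible reading of the notation and is an assumption throughout: X is taken as any single residue, X5 as exactly five residues, and X15-22 as 15 to 22 residues. The names `PTS2_LIKE` and `find_motif` and the test sequence are made up for illustration and are not the AtACX3 sequence.

```python
import re

# One reading of R-[I/Q/L]-X5-HL-XL-X15-22-C (an assumption, see above):
# R, then I/Q/L, five residues, HL, one residue, L, 15-22 residues, C.
PTS2_LIKE = re.compile(r"R[IQL].{5}HL.L.{15,22}C")


def find_motif(sequence):
    """Return the (start, end) span of the first match, or None if absent."""
    match = PTS2_LIKE.search(sequence)
    return match.span() if match else None


# Synthetic peptide built to contain the pattern starting at index 2.
synthetic = "MA" + "R" + "L" + "AAAAA" + "HL" + "G" + "L" + "S" * 15 + "C" + "KK"
print(find_motif(synthetic))
print(find_motif("MKKLLAA"))
```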

18.
EcoGene: a genome sequence database for Escherichia coli K-12   Cited by: 5 (self-citations: 1, citations by others: 4)
The EcoGene database provides a set of gene and protein sequences derived from the genome sequence of Escherichia coli K-12. EcoGene is a source of re-annotated sequences for the SWISS-PROT and Colibri databases. EcoGene is used for genetic and physical map compilations in collaboration with the Coli Genetic Stock Center. The EcoGene12 release includes 4293 genes. EcoGene12 differs from the GenBank annotation of the complete genome sequence in several ways, including (i) the revision of 706 predicted or confirmed gene start sites, (ii) the correction or hypothetical reconstruction of 61 frame-shifts caused by either sequence error or mutation, (iii) the reconstruction of 14 protein sequences interrupted by the insertion of IS elements, and (iv) predictions that 92 genes are partially deleted gene fragments. A literature survey identified 717 proteins whose N-terminal amino acids have been verified by sequencing. 12 446 cross-references to 6835 literature citations and s are provided. EcoGene is accessible at a new website: http://bmb.med.miami.edu/EcoGene/EcoWeb. Users can search and retrieve individual EcoGene GenePages or they can download large datasets for incorporation into database management systems, facilitating various genome-scale computational and functional analyses.

19.
EXProt (database for EXPerimentally verified Protein functions) is a new non-redundant database containing protein sequences for which the function has been experimentally verified. It is a selection of 3976 entries from the Prokaryotes section of the EMBL Nucleotide Sequence Database, Release 66, and 375 entries from the Pseudomonas Community Annotation Project (PseudoCAP). The entries in EXProt all have a unique ID number and provide information about the organism, protein sequence, functional annotation, link to entry in original database, and if known, gene name and link to references in PubMed/Medline. The EXProt web page (http://www.cmbi.nl/EXProt) provides further details of the database and a link to a BLAST search (blastp & blastx) of the database. The EXProt entries are indexed in SRS (http://www.cmbi.nl/srs/) and can be searched by means of keywords. Authors can be reached by email (exprot@cmbi.kun.nl).

20.
The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological functions. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. The ontology terms and protein families and subfamilies, as well as Drosophila gene classifications, can be browsed and searched for free. Due to outstanding contractual obligations, access to human gene classifications and to protein family trees and multiple sequence alignments will temporarily require a nominal registration fee. PANTHER is publicly available on the web at http://panther.celera.com.
