Similar articles
1.
Our current biological knowledge is spread over many independent bioinformatics databases where many different types of gene and protein identifiers are used. The heterogeneous and redundant nature of these identifiers limits data analysis across different bioinformatics resources. It is an even more serious bottleneck of data analysis for larger datasets, such as gene lists derived from microarray and proteomic experiments. The DAVID Gene ID Conversion Tool (DICT), a web-based application, is able to convert users' input gene or gene product identifiers from one type to another in a more comprehensive and high-throughput manner with a uniquely enhanced ID-ID mapping database.

2.
Array CGH enables the detection of pathogenic copy number variants (CNVs) in 5–15% of individuals with intellectual disability (ID), making it a promising tool for uncovering ID candidate genes. However, most CNVs encompass multiple genes, making it difficult to identify key disease gene(s) underlying ID etiology. Using array CGH we identified 47 previously unreported unique CNVs in 45/255 probands. We prioritized ID candidate genes using five bioinformatic gene prioritization web tools. Gene priority lists were created by comparing integral genes from each CNV from our ID cohort with sets of training genes specific either to ID or randomly selected. Our findings suggest that different training sets alter gene prioritization only moderately; however, only the ID gene training set resulted in significant enrichment of genes with nervous system function (19%) in prioritized versus non-prioritized genes from the same de novo CNVs (7%, p < 0.05). This enrichment further increased to 31% when the five web tools were used in concert and included genes within mitogen-activated protein kinase (MAPK) and neuroactive ligand-receptor interaction pathways. Gene prioritization web tools enrich for genes with relevant function in ID and more readily facilitate the selection of ID candidate genes for functional studies, particularly for large CNVs.

3.
4.
Lee S, Kim B, Kim H, Lee H, Yu U. BMB reports, 2011, 44(2): 107-112
We have developed a biologist-friendly, stand-alone Java GUI application, IdBean, for ID conversion. Our tool integrates most of the widely used ID conversion services that provide programmatic access. It is the first GUI ID conversion application that supports the direct merging as well as comparison of conversion results from multiple ID conversion services without manual effort. This tool will greatly help biologists who handle multiple ID types for the analyses of gene or gene product lists. By referring to multiple conversion services, the number of failed IDs can be reduced. By accessing ID conversion services online, it will potentially provide the most up-to-date conversion results. The application was developed in modular form and can be re-packaged into plug-in form. For the development of a bioinformatics analysis tool, the module can be used as a built-in ID conversion component. It is available at http://neon.gachon.ac.kr/IdBean/.
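
The merging idea in this abstract can be illustrated with a minimal Python sketch. It is not IdBean's code (IdBean is a Java application whose internals are not described here); the function names, the dictionary layout of a service result, and the placeholder identifiers are all assumptions made for illustration. The sketch only shows why consulting several services can reduce the number of failed IDs: an ID fails only if every service returns nothing for it.

```python
from typing import Dict, Iterable, List, Set


def merge_conversions(
    query_ids: Iterable[str],
    service_results: Iterable[Dict[str, Set[str]]],
) -> Dict[str, Set[str]]:
    """Union the target IDs that each (hypothetical) service returned per query ID."""
    merged: Dict[str, Set[str]] = {qid: set() for qid in query_ids}
    for result in service_results:
        for qid, targets in result.items():
            if qid in merged:  # ignore IDs that were never queried
                merged[qid].update(targets)
    return merged


def failed_ids(merged: Dict[str, Set[str]]) -> List[str]:
    """Query IDs for which no service produced any target ID."""
    return sorted(qid for qid, targets in merged.items() if not targets)


if __name__ == "__main__":
    # Placeholder identifiers, not real accessions.
    service_a = {"geneA": {"P1"}, "geneB": set()}
    service_b = {"geneA": {"P1", "P3"}, "geneB": {"P2"}}
    merged = merge_conversions(["geneA", "geneB", "geneC"], [service_a, service_b])
    print(merged)
    print(failed_ids(merged))
```

Keeping the per-service results separate until the final union also makes it straightforward to compare services, which is the other operation the abstract mentions.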

5.
We introduce the PSSH ('Protein Sequence-to-Structure Homologies') database derived from HSSP2, an improved version of the HSSP ('Homology-derived Secondary Structure of Proteins') database [Dodge et al. (1998) Nucleic Acids Res., 26, 313-315]. Whereas each HSSP entry lists all protein sequences related to a given 3D structure, PSSH is the 'inverse', with each entry listing all structures related to a given sequence. In addition, we introduce two other derived databases: HSSPchain, in which each entry lists all sequences related to a given PDB chain, and HSSPalign, in which each entry gives details of one sequence aligned onto one PDB chain. This re-organization makes it easier to navigate from sequence to structure, and to map sequence features onto 3D structures. Currently (September 2002), PSSH provides structural information for over 400 000 protein sequences, covering 48% of SWALL and 61% of SWISS-PROT sequences; HSSPchain provides sequence information for over 25 000 PDB chains, and HSSPalign gives over 14 million sequence-to-structure alignments. The databases can be accessed via SRS 3D, an extension to the SRS system, at http://srs3d.ebi.ac.uk/.
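
The 'inverse' relationship between HSSP and PSSH described above amounts to inverting a one-to-many mapping. The following Python sketch shows that operation on toy data; the function name and the placeholder identifiers are assumptions, and the real databases store alignments and other details that are not modelled here.

```python
from collections import defaultdict
from typing import Dict, List


def invert_index(structure_to_sequences: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Turn a structure -> sequences mapping into a sequence -> structures mapping."""
    sequence_to_structures: Dict[str, List[str]] = defaultdict(list)
    for structure_id, sequence_ids in structure_to_sequences.items():
        for sequence_id in sequence_ids:
            sequence_to_structures[sequence_id].append(structure_id)
    # Sort each list so the result does not depend on input ordering.
    return {seq: sorted(structs) for seq, structs in sequence_to_structures.items()}


if __name__ == "__main__":
    # Placeholder IDs: two structures, three related sequences.
    hssp_like = {"struct1": ["seqA", "seqB"], "struct2": ["seqB", "seqC"]}
    print(invert_index(hssp_like))
```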

6.
Yau SS, Yu C, He R. DNA and cell biology, 2008, 27(5): 241-250
Graphical representation of gene sequences provides a simple way of viewing, sorting, and comparing various gene structures. Here we first report a two-dimensional graphical representation for protein sequences. With this method, we constructed the moment vectors for protein sequences, and mathematically proved that the correspondence between moment vectors and protein sequences is one-to-one. Therefore, each protein sequence can be represented as a point in a map, which we call protein map, and cluster analysis can be used for comparison between the points. Sixty-six proteins from five protein families were analyzed using this method. Our data showed that for proteins in the same family, their corresponding points in the map are close to each other. We also illustrate the efficiency of this approach by performing an extensive cluster analysis of the protein kinase C family. These results indicate that this protein map could be used to mathematically specify the similarity of two proteins and predict properties of an unknown protein based on its amino acid sequence.

7.
Networks are proving to be central to the study of gene function, protein-protein interaction, and biochemical pathway data. Visualization of networks is important for their study, but visualization tools are often inadequate for working with very large biological networks. Here, we present an algorithm, called large graph layout (LGL), which can be used to dynamically visualize large networks on the order of hundreds of thousands of vertices and millions of edges. LGL applies a force-directed iterative layout guided by a minimal spanning tree of the network in order to generate coordinates for the vertices in two or three dimensions, which are subsequently visualized and interactively navigated with companion programs. We demonstrate the use of LGL in visualizing an extensive protein map summarizing the results of approximately 21 billion sequence comparisons between 145579 proteins from 50 genomes. Proteins are positioned in the map according to sequence homology and gene fusions, with the map ultimately serving as a theoretical framework that integrates inferences about gene function derived from sequence homology, remote homology, gene fusions, and higher-order fusions. We confirm that protein neighbors in the resulting map are functionally related, and that distinct map regions correspond to distinct cellular systems, enabling a computational strategy for discovering proteins' functions on the basis of the proteins' map positions. Using the map produced by LGL, we infer general functions for 23 uncharacterized protein families.
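
The abstract says the layout is guided by a minimal spanning tree of the network. As a rough illustration of that single ingredient, the Python sketch below computes a minimum spanning tree with Kruskal's algorithm and a union-find structure. This is not the LGL implementation: the choice of Kruskal's algorithm, the integer vertex labels, and the edge weights are assumptions, and the force-directed layout step itself is not shown.

```python
from typing import List, Tuple

Edge = Tuple[float, int, int]  # (weight, vertex_u, vertex_v)


def minimum_spanning_tree(num_vertices: int, edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm: add edges in increasing weight unless they close a cycle."""
    parent = list(range(num_vertices))

    def find(v: int) -> int:
        # Find the set representative, halving paths along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree: List[Edge] = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # u and v are in different components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


if __name__ == "__main__":
    toy_edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
    print(minimum_spanning_tree(4, toy_edges))
```

A spanning tree of this kind gives an order in which vertices can be introduced from a root outward, which is one plausible way a tree could guide an iterative layout.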

8.
The INDETERMINATE protein, ID1, plays a key role in regulating the transition to flowering in maize. ID1 is the founding member of a plant-specific zinc finger protein family that is defined by a highly conserved amino acid sequence called the ID domain. The ID domain includes a cluster of three different types of zinc fingers separated from a fourth C2H2 finger by a long spacer; ID1 is distinct from other ID domain proteins by having a much longer spacer. In vitro DNA selection and amplification binding assays and DNA binding experiments showed that ID1 binds selectively to an 11 bp consensus motif via the ID domain. Unexpectedly, site-directed mutagenesis of the ID1 protein showed that zinc fingers located at each end of the ID domain are not required for binding to the consensus motif despite the fact that one of these zinc fingers is a canonical C2H2 DNA binding domain. In addition, an ID1 in vitro deletion mutant that lacks the extra spacer between zinc fingers binds the same 11 bp motif as normal ID1, suggesting that all ID domain-containing proteins recognize the same DNA target sequence. Our results demonstrate that maize ID1 and ID domain proteins have novel zinc finger configurations with unique DNA binding properties.

9.
Autism spectrum disorders (ASD) are a group of related neurodevelopmental disorders with significant combined prevalence (~1%) and high heritability. Dozens of individually rare genes and loci associated with high risk for ASD have been identified, which overlap extensively with genes for intellectual disability (ID). However, studies indicate that there may be hundreds of genes that remain to be identified. The advent of inexpensive massively parallel nucleotide sequencing can reveal the genetic underpinnings of heritable complex diseases, including ASD and ID. However, whole exome sequencing (WES) and whole genome sequencing (WGS) provide an embarrassment of riches, where many candidate variants emerge. It has been argued that genetic variation for ASD and ID will cluster in genes involved in distinct pathways and protein complexes. For this reason, computational methods that prioritize candidate genes based on additional functional information such as protein-protein interactions or association with specific canonical or empirical pathways, or other attributes, can be useful. In this study we applied several supervised learning approaches to prioritize ASD or ID disease gene candidates based on curated lists of known ASD and ID disease genes. We implemented two network-based classifiers and one attribute-based classifier to show that we can rank and classify known, and predict new, genes for these neurodevelopmental disorders. We also show that ID and ASD share common pathways that perturb an overlapping synaptic regulatory subnetwork. We also show that features relating to neuronal phenotypes in mouse knockouts can help in classifying neurodevelopmental genes. Our methods can be applied broadly to other diseases, helping to prioritize newly identified genetic variation that emerges from disease gene discovery based on WES and WGS.

10.
Homology Gene List (HOMGL) is a web-based tool for comparing gene lists with different accession numbers and identifiers and between different organisms. UniGene, LocusLink, HomoloGene and Ensembl databases are utilized to map between these lists and to retrieve upstream or transcribed sequences for genes in these lists. We illustrate the use of HOMGL with respect to microarray studies and promoter analysis. AVAILABILITY: http://homgl.biologie.hu-berlin.de/

11.
Many changes in neuronal gene expression occur in response to ischemia, and these may play a role in determining the fate of ischemic neurons. To identify genes induced in the rat brain following cerebral ischemia, a strategy was used that combines subtractive hybridization and differential screening. Among the genes identified was one referred to as global ischemia-inducible gene 11 (Giig11). Sequence analysis indicated that Giig11 exhibited 97% and 91% identity to the known Ero1-L (S. cerevisiae ero1-like oxidoreductase) of mouse and human origin, which is involved in oxidative endoplasmic reticulum protein folding. Rat Ero1-L/Giig11 also contains a 107-bp sequence that is nearly identical (>95%) to the known dispersed repetitive identifier (ID), but which is lacking in mouse and human Ero1-L. Northern blotting showed that expression of the ID element and Ero1-L/Giig11 mRNA increased after global cerebral ischemia. In situ hybridization demonstrated increased expression of Ero1-L/Giig11 in the brain following ischemic injury, with the highest levels in the vulnerable hippocampal CA1 pyramidal neurons. Transfection of cultured primary hippocampal neurons with a plasmid containing green fluorescent protein (gfp) and Ero1-L/Giig11 cDNA (with and without the ID element) produced a gfp-Ero1-L/Giig11 fusion protein, and more fusion protein was localized into dendrites in the presence of the ID element, suggesting that the ID element promotes Ero1-L/Giig11 protein localization to dendrites. Therefore, Ero1-L/Giig11 may have a role in ischemia-induced neuronal repair or survival mechanisms directed at counteracting abnormalities in protein folding, maturation and distribution.

12.
The basal component of the nematode dense-body is vinculin   Cited by: 30 (self-citations: 0; citations by others: 30)
We have constructed a genomic DNA expression library and screened it with antibodies in order to clone the deb-1 gene from the nematode Caenorhabditis elegans. This gene encodes a protein found at the base of the muscle dense-bodies, structures which attach actin thin filaments to the sarcolemma. We report the complete sequence of the deb-1 gene, its localization on the C. elegans genetic map, and the finding that it encodes a protein with a sequence very similar to chicken vinculin. We also show that the difference in size between this nematode protein and chicken vinculin is due in part to the absence from the nematode sequence of one of the three internal repeats found in the chicken sequence.

13.
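Construction of a plant expression vector for the Ling Zhi-8 gene driven by a tomato fruit-specific promoter   Cited by: 1 (self-citations: 0; citations by others: 1)
We constructed a recombinant vector containing the Ling Zhi-8 (LZ-8) gene and the tomato fruit-specific E8 promoter and transformed it into Agrobacterium tumefaciens. The LZ-8 gene and the E8 promoter sequence were obtained by PCR and cloned into the plant expression vector pBI121, yielding a recombinant plasmid for fruit-specific expression of the LZ-8 protein. The recombinant plasmid was verified by PCR, restriction enzyme digestion, and sequencing, and was then introduced into A. tumefaciens GV3101. PCR, restriction digestion patterns, and sequencing all indicated that the recombinant plasmid for tomato fruit-specific expression of the LZ-8 protein was constructed successfully. A recombinant plasmid containing the LZ-8 gene and the E8 promoter was thus obtained and successfully transformed into A. tumefaciens, laying the groundwork for fruit-specific expression of the LZ-8 protein in tomato.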

14.
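MGAP: a microbial genome annotation system   Cited by: 6 (self-citations: 0; citations by others: 6)
Using bioinformatics methods and tools, we developed the Microbial Genome Annotation Package (MGAP) and applied it to the annotation of the genome of the cyanobacterium PCC7002. The system consists of two parts: a genome annotation system and a Web-based user interface. The genome annotation system integrates multiple programs for gene identification, function prediction, and sequence analysis, together with protein sequence databases, a protein information resource system, and a database of orthologous protein families. The user interface includes a circular genome map display, maps of the distribution of genes and open reading frames on the chromosome, and tools for searching annotation information. The system runs on a PC under Linux, with MySQL as the database management system, Apache as the Web server, and the application programming interface written in Perl; all of this software is freely available.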

15.
The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. We purified the Mu gam gene product to apparent homogeneity from cells in which it is over-produced from a plasmid clone. The purified protein is a dimer of identical subunits of 18.9 kd. It can aggregate DNA into large, rapidly sedimenting complexes and is a potent exonuclease inhibitor when bound to DNA. The N-terminal amino acid sequence of the purified protein was determined by automated degradation and the nucleotide sequence of the Mu gam gene is presented to accurately map its position in the Mu genome.

16.
17.
PISCES: a protein sequence culling server   Cited by: 21 (self-citations: 0; citations by others: 21)
PISCES is a public server for culling sets of protein sequences from the Protein Data Bank (PDB) by sequence identity and structural quality criteria. PISCES can provide lists culled from the entire PDB or from lists of PDB entries or chains provided by the user. The sequence identities are obtained from PSI-BLAST alignments with position-specific substitution matrices derived from the non-redundant protein sequence database. PISCES therefore provides better lists than servers that use BLAST, which is unable to identify many relationships below 40% sequence identity and often overestimates sequence identity by aligning only well-conserved fragments. PDB sequences are updated weekly. PISCES can also cull non-PDB sequences provided by the user as a list of GenBank identifiers, a FASTA format file, or BLAST/PSI-BLAST output.
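
Culling a sequence set by an identity cutoff can be sketched as follows. The abstract does not describe the procedure PISCES actually uses, so the greedy strategy below is an assumption, as are the function name, the placeholder chain labels, and the convention that the input list is already ordered from best to worst structural quality. Pairwise identities are supplied as precomputed numbers; computing them (PSI-BLAST in the case of PISCES) is outside the sketch.

```python
from typing import Dict, FrozenSet, List


def cull_by_identity(
    chains_best_first: List[str],
    identity: Dict[FrozenSet[str], float],
    max_identity: float,
) -> List[str]:
    """Greedily keep a chain only if it is no more than max_identity identical
    to every chain already kept. Missing pairs are treated as unrelated (0.0)."""
    kept: List[str] = []
    for chain in chains_best_first:
        if all(identity.get(frozenset((chain, k)), 0.0) <= max_identity for k in kept):
            kept.append(chain)
    return kept


if __name__ == "__main__":
    # Placeholder chains and identities (fractions between 0 and 1).
    pairwise = {
        frozenset(("chainA", "chainB")): 0.90,
        frozenset(("chainA", "chainC")): 0.30,
        frozenset(("chainB", "chainC")): 0.20,
    }
    print(cull_by_identity(["chainA", "chainB", "chainC"], pairwise, 0.40))
```

Because the pass is greedy, the result depends on the input order, which is why the sketch assumes the best-quality chains come first.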

18.
19.
20.
Fanconi anemia (FA) is a developmental and cancer-predisposition syndrome caused by mutations in genes controlling DNA interstrand crosslink repair. Several FA proteins form a ubiquitin ligase that controls monoubiquitination of the FANCD2 protein in an ATR-dependent manner. Here we describe the FA protein FANCI, identified as an ATM/ATR kinase substrate required for resistance to mitomycin C. FANCI shares sequence similarity with FANCD2, likely evolving from a common ancestral gene. The FANCI protein associates with FANCD2 and, together, as the FANCI-FANCD2 (ID) complex, localize to chromatin in response to DNA damage. Like FANCD2, FANCI is monoubiquitinated and, unexpectedly, ubiquitination of each protein is important for the maintenance of ubiquitin on the other, indicating the existence of a dual ubiquitin-locking mechanism required for ID complex function. Mutation in FANCI is responsible for loss of a functional FA pathway in a patient with Fanconi anemia complementation group I.
