Similar articles
20 similar articles found (search time: 15 ms)
1.
DigiNorthern, digital expression analysis of query genes based on ESTs   Total citations: 1 (self-citations: 0, citations by others: 1)
DigiNorthern (DN) is a web-based tool for virtually displaying expression profiles of query genes based on EST sequences. Two utilities are available: DN1 takes one query gene and quantitatively displays its expression levels in the tissues/organs that express the gene, comparing the normal and neoplastic status of each tissue; DN2 takes two sequences as query genes and compares their expression profiles side by side.

2.
The intestinal crypt/villus in situ hybridization database (CVD) query interface is a web-based tool to search for genes with similar relative expression patterns along the crypt/villus axis of the mammalian intestine. The CVD is an online database holding information for relative gene expression patterns in the mammalian intestine and is based on the scoring of in situ hybridization experiments reported in the literature. CVD contains expression data for 88 different genes collected from 156 different in situ hybridization profiles. The web-based query interface allows execution of both single gene queries and pattern searches. The query results provide links to the most relevant public gene databases. AVAILABILITY: http://pc113.imbg.ku.dk/ps/

3.
MOTIVATION: Life science researchers often require an exhaustive list of protein coding genes similar to a given query gene. To find such genes, homology search tools, such as BLAST or PatternHunter, return a set of high-scoring pairs (HSPs). These HSPs then need to be correlated with existing sequence annotations, or assembled manually into putative gene structures. This process is error-prone and labor-intensive, especially in genomes without reliable gene annotation. RESULTS: We have developed a homology search solution that automates this process, and instead of HSPs returns complete gene structures. We achieve better sensitivity and specificity by adapting a hidden Markov model for gene finding to reflect features of the query gene. Compared to traditional homology search, our novel approach identifies splice sites much more reliably and can even locate exons that were lost in the query gene. On a testing set of 400 mouse query genes, we report 79% exon sensitivity and 80% exon specificity in the human genome based on orthologous genes annotated in NCBI HomoloGene. In the same set, we also found 50 (12%) gene structures with better protein alignment scores than the ones identified in HomoloGene. AVAILABILITY: The Java implementation is available for download from http://www.bioinformatics.uwaterloo.ca/software.

4.
In the genetic code, the UGA codon has a dual function as it encodes selenocysteine (Sec) and serves as a stop signal. However, only the translation terminator function is used in gene annotation programs, resulting in misannotation of selenoprotein genes. Here, we applied two independent bioinformatics approaches to characterize a selenoprotein set in prokaryotic genomes. One method searched for selenoprotein genes by identifying RNA stem-loop structures, selenocysteine insertion sequence elements; the second approach identified Sec/Cys pairs in homologous sequences. These analyses identified all or almost all selenoproteins in completely sequenced bacterial and archaeal genomes and provided a view on the distribution and composition of prokaryotic selenoproteomes. In addition, lineage-specific and core selenoproteins were detected, which provided insights into the mechanisms of selenoprotein evolution. Characterization of selenoproteomes allows interpretation of other UGA codons in completed genomes of prokaryotes as terminators, addressing the UGA dual-function problem.

5.
6.
7.
In the present study, we examined GC nucleotide composition, relative synonymous codon usage (RSCU), effective number of codons (ENC), codon adaptation index (CAI) and gene length for 308 prokaryotic mechanosensitive ion channel (MSC) genes from six evolutionary groups: Euryarchaeota, Actinobacteria, Alphaproteobacteria, Betaproteobacteria, Firmicutes, and Gammaproteobacteria. Results showed that: (1) a wide variation of overrepresentation of nucleotides exists in the MSC genes; (2) codon usage bias varies considerably among the MSC genes; (3) both nucleotide constraint and gene length play an important role in shaping codon usage of the bacterial MSC genes; and (4) synonymous codon usage of prokaryotic MSC genes is phylogenetically conserved. Knowledge of codon usage in prokaryotic MSC genes may benefit the study of MSC genes in eukaryotes, in which few MSC genes have been identified and functionally analysed.
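Of the measures listed in this abstract, RSCU is the simplest to illustrate. The sketch below is not the authors' code; it assumes the standard genetic code and the usual definition of RSCU (observed codon count divided by the count expected if all synonymous codons for that amino acid were used equally). The function name `rscu` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the standard table applies to the sequence being analysed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """Return RSCU per sense codon for one coding sequence.

    Amino acids that never occur in the sequence are omitted.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by amino acid, skipping stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)  # equal usage within the family
        for c in family:
            values[c] = counts[c] / expected
    return values
```

For example, `rscu("GCTGCTGCCGCA")` gives 2.0 for GCT, 1.0 for GCC and GCA, and 0.0 for GCG, since alanine has four synonymous codons and GCT accounts for two of the four observed.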

8.
SUMMARY: The Gandr (gene annotation data representation) knowledgebase is an ontological framework for laboratory-specific gene annotation. Gandr uses Protege 2000 for editing, querying and visualizing microarray data and annotations. Genes can be annotated with provided, newly created or imported ontological concepts. Annotated genes can inherit assigned concept properties and can be related to each other. The resulting knowledgebase can be visualized as an interactive network of nodes and edges representing genes and their functional relationships. This allows for immediate and associative gene context exploration. Ontological query techniques allow for powerful data access.

9.
Studies into the genetic origins of tumor cell chemoactivity pose significant challenges to bioinformatic mining efforts. Connections between measures of gene expression and chemoactivity have the potential to identify clinical biomarkers of compound response, cellular pathways important to efficacy and potential toxicities; all vital to anticancer drug development. An investigation has been conducted that jointly explores tumor-cell constitutive NCI60 gene expression profiles and small-molecule NCI60 growth inhibition chemoactivity profiles, viewed from novel applications of self-organizing maps (SOMs) and pathway-centric analyses of gene expressions, to identify subsets of over- and under-expressed pathway genes that discriminate chemo-sensitive and chemo-insensitive tumor cell types. Linear Discriminant Analysis (LDA) is used to quantify the accuracy of discriminating genes to predict tumor cell chemoactivity. LDA results find 15% higher prediction accuracies, using ∼30% fewer genes, for pathway-derived discriminating genes when compared to genes derived using conventional gene expression-chemoactivity correlations. The proposed pathway-centric data mining procedure was used to derive discriminating genes for ten well-known compounds. Discriminating genes were further evaluated using gene set enrichment analysis (GSEA) to reveal a cellular genetic landscape, composed of small numbers of key over- and under-expressed on- and off-target pathway genes, as important for a compound's tumor cell chemoactivity. Literature-based validations are provided as support for chemo-important pathways derived from this procedure. Qualitatively similar results are found when using gene expression measurements derived from different microarray platforms. The data used in this analysis are available at http://pubchem.ncbi.nlm.nih.gov/ and http://www.ncbi.nlm.nih.gov/projects/geo (GPL96, GSE32474).

10.
MOTIVATION: Tightly packed prokaryotic genes frequently overlap with each other. This feature, rarely seen in eukaryotic DNA, makes detection of translation initiation sites and, therefore, exact predictions of prokaryotic genes notoriously difficult. Improving the accuracy of precise gene prediction in prokaryotic genomic DNA remains an important open problem. RESULTS: A software program implementing a new algorithm utilizing a uniform Hidden Markov Model for prokaryotic gene prediction was developed. The algorithm analyzes a given DNA sequence in each of six possible global reading frames independently. Twelve complete prokaryotic genomes were analyzed using the new tool. The accuracy of gene finding (predicting the locations of protein-coding ORFs) and the accuracy of precise gene prediction (detecting the whole gene, including the translation initiation codon) were assessed by comparison with existing annotation. It was shown that in terms of gene finding, the program performs at least as well as the previously developed tools, such as GeneMark and GLIMMER. In terms of precise gene prediction the new program was shown to be more accurate, by several percentage points, than earlier developed tools, such as GeneMark.hmm, ECOPARSE and ORPHEUS. The results of testing the program indicated the possibility of systematic bias in start codon annotation in several early sequenced prokaryotic genomes. AVAILABILITY: The new gene-finding program can be accessed through the Web site: http://dixie.biology.gatech.edu/GeneMark/fbf.cgi CONTACT: mark@amber.gatech.edu.
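The idea of treating each of the six global reading frames independently can be illustrated with a plain ORF scan. This is only a sketch and not the Hidden Markov Model described in the abstract: it assumes ATG as the sole start codon and the three standard stop codons, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq: str, frame: int, min_codons: int = 1):
    """Yield (start, end) for ATG...stop ORFs in one reading frame.

    Coordinates are 0-based on `seq`, and `end` is exclusive and
    includes the stop codon. `min_codons` counts codons before the stop.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i  # first ATG after the previous stop
        elif start is not None and codon in STOPS:
            if (i - start) // 3 >= min_codons:
                yield (start, i + 3)
            start = None


def six_frame_orfs(seq: str, min_codons: int = 1):
    """Scan three frames on each strand, each frame independently.

    Minus-strand coordinates refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    strands = {"+": seq, "-": reverse_complement(seq)}
    found = []
    for strand, s in strands.items():
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_codons):
                found.append((strand, frame, start, end))
    return found
```

Because each frame is scanned on its own, overlapping genes in different frames are reported separately, which is the property the abstract highlights for tightly packed prokaryotic genomes. A scan like this always picks the first ATG, so it does not address the start-site ambiguity that the HMM is designed to resolve.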

11.
Choi K, Kim S. Proteins 2011, 79(4): 1118-1131
The two-component system (TCS) is a signal transduction system that involves a histidine kinase (HK) and a response regulator (RR). Although up to hundreds of TCSs may operate in parallel in a bacterial cell, the high fidelity of TCS signaling is well maintained, minimizing irrelevant crosstalk between TCSs. When an HK gene and an RR gene in a given TCS exist in neighboring positions, it is almost certain that their protein products (i.e., HK and RR) are interacting partners. However, large bacterial genomes often have multiple HK genes and/or cognate RR genes that are not in neighboring positions. In many partially assembled genomes, some HK genes and RR genes belong to different contigs. In these cases, it is not clear which HK(s) and RR(s) interact. By combining information-theoretic and graph-theoretic approaches, we developed a computational method identifying co-evolving residue pairs between HKs and cognate RRs and predicting the interacting HK:RR pairs for each TCS. In addition, we built a TCSppWWW webserver ( http://compath.org/platcom/tcs ) that takes query sequences of pairing candidates and predicts their HK:RR pairing using precomputed models. The current release of TCSppWWW provides predictors for 48 TCSs using over 20,000 protein sequences from about 900 bacterial genomes. Three different types of predictors using Random Forest, RBF Network, and Naïve Bayes are provided. Once a set of HK and RR candidate sequences is submitted, TCSppWWW aligns query sequences to the precomputed multiple sequence alignment of HK:RR pairs, extracts co-evolving column positions, then returns prediction results with prediction margin and additional information. Proteins 2011. © 2010 Wiley-Liss, Inc.
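The abstract does not say which information-theoretic score is used to find co-evolving residue pairs. A common choice for this kind of analysis is the mutual information between two alignment columns, sketched below purely as an assumption; `mutual_information` is a hypothetical helper, and each column is passed as a string with one residue per paired HK:RR sequence.

```python
import math
from collections import Counter


def mutual_information(col_a: str, col_b: str) -> float:
    """Mutual information (in bits) between two alignment columns.

    Position i of each column must come from the same HK:RR pair.
    """
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")

    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        p_a = freq_a[a] / n
        p_b = freq_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

Two perfectly coupled columns such as `"AAKK"` and `"DDEE"` score 1 bit, while `"AAKK"` against `"DEDE"` scores 0 because the residues vary independently. Real analyses usually add corrections for phylogenetic background and small sample size, which this sketch omits.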

12.
The Prostate Gene Database (PGDB: http://www.ucsf.edu/pgdb) is a curated and integrated database of genes or genomic loci related to the human prostate and prostatic diseases. Currently, PGDB covers genes involved in a number of molecular and genetic events of the prostate including gene amplification, mutation, gross deletion, methylation, polymorphism, linkage and over-expression, as published in the literature. Genes that are specifically expressed in prostate, as evidenced by analysis of data from expressed sequence tags (ESTs) and serial analysis of gene expression (SAGE), are also included. There are a total of 165 unique entries in the database. Users can either browse or query the PGDB through a web interface. For each gene, in addition to basic gene information and rich cross-references to other databases, inclusive and relevant literature references are provided to support the inclusion of the gene in the database. Detailed expression data calculated from the UniGene and SAGEmap databases are also presented.

13.
DNA microarray technology enables high-throughput gene expression analysis and allows researchers to test the activity of thousands of genes at one time in multiple cellular conditions. This approach is based on principal curves of oriented points (PCOP) analysis and minimum spanning trees to analyze temporal and nontemporal series data to relate the genes. PCOP is a non-hypothesis-driven method that is well suited to recognizing nonlinear relationships between multivariable sets of data. Initially, a gene-relations tree is generated from the correlation between each pair of genes, calculated by PCOP analysis. Next, the researcher can introduce the query genes to be studied into the zoom-in operation, and the system selects the genes which connect the previously provided ones, beyond the activation pathways, using the minimum spanning tree. Thus, this zoom-in operation generates the nonlinear pattern of the intraset expression behavior for the new gene set. This inner expression pattern relates the query and selected genes to study their mutual interdependence in detail. This detailed information is especially useful in the biomedical environment, where such information cannot be obtained with current analytical methods.

14.
MOTIVATION: At present the computational gene identification methods in microbial genomes have a high prediction accuracy of verified translation termination site (3' end), but a much lower accuracy of the translation initiation site (TIS, 5' end). The latter is important to the analysis and the understanding of the putative protein of a gene and the regulatory machinery of the translation. Improving the accuracy of prediction of TIS is one of the remaining open problems. RESULTS: In this paper, we develop a four-component statistical model to describe the TIS of prokaryotic genes. The model incorporates several features with biological meanings, including the correlation between translation termination site and TIS of genes, the sequence content around the start codon, the sequence content of the consensus signal related to ribosomal binding sites (RBSs), and the correlation between TIS and the upstream consensus signal. An entirely non-supervised training system is constructed, which takes as input a set of annotated coding open reading frames (ORFs) by any gene finder, and gives as output a set of organism-specific parameters (without any prior knowledge or empirical constants and formulas). The novel algorithm is tested on a set of reliable datasets of genes from Escherichia coli and Bacillus subtilis. MED-Start correctly predicts 95.4% of the start sites of 195 experimentally confirmed E. coli genes and 96.6% of 58 reliable B. subtilis genes. Moreover, the test results indicate that the algorithm gives higher accuracy for more reliable datasets, and is robust to the variation of gene length. MED-Start may be used as a postprocessor for a gene finder. After processing by our program, the improvement of gene start prediction of gene finder system is remarkable, e.g. the accuracy of TIS predicted by MED 1.0 increases from 61.7 to 91.5% for 854 E. coli verified genes, while that by GLIMMER 2.02 increases from 63.2 to 92.0% for the same dataset. These results show that our algorithm is one of the most accurate methods to identify TIS of prokaryotic genomes. AVAILABILITY: The program MED-Start can be accessed through the website of CTB at Peking University: http://ctb.pku.edu.cn/main/SheGroup/MED_Start.htm.

15.
Codon usage in bacteria: correlation with gene expressivity   Total citations: 153 (self-citations: 53, citations by others: 100)
The nucleic acid sequence bank now contains over 600 protein coding genes of which 107 are from prokaryotic organisms. Codon frequencies in each new prokaryotic gene are given. Analysis of genetic code usage in the 83 sequenced genes of the Escherichia coli genome (chromosome, transposons and plasmids) is presented, taking into account new data on gene expressivity and regulation as well as iso-tRNA specificity and cellular concentration. The codon composition of each gene is summarized using two indexes: one is based on the differential usage of iso-tRNA species during gene translation, the other on choice between Cytosine and Uracil for third base. A strong relationship between codon composition and mRNA expressivity is confirmed, even for genes transcribed in the same operon. The influence of codon use on peptide elongation rate and protein yield is discussed. Finally, the evolutionary aspect of codon selection in mRNA sequences is studied.

16.
The extent to which prokaryotic evolution has been influenced by horizontal gene transfer (HGT) and therefore might be more of a network than a tree is unclear. Here we use supertree methods to ask whether a definitive prokaryotic phylogenetic tree exists and whether it can be confidently inferred using orthologous genes. We analysed an 11-taxon dataset spanning the deepest divisions of prokaryotic relationships, a 10-taxon dataset spanning the relatively recent gamma-proteobacteria and a 61-taxon dataset spanning both, using species for which complete genomes are available. Congruence among gene trees spanning deep relationships is not better than random. By contrast, a strong, almost perfect phylogenetic signal exists in gamma-proteobacterial genes. Deep-level prokaryotic relationships are difficult to infer because of signal erosion, systematic bias, hidden paralogy and/or HGT. Our results do not preclude levels of HGT that would be inconsistent with the notion of a prokaryotic phylogeny. This approach will help decide the extent to which we can say that there is a prokaryotic phylogeny and where in the phylogeny a cohesive genomic signal exists.

17.
18.
SUMMARY: BLAST2GENE is a program that allows a detailed analysis of genomic regions containing completely or partially duplicated genes. From a BLAST (or BL2SEQ) comparison of a protein or nucleotide query sequence with any genomic region of interest, BLAST2GENE processes all high scoring pairwise alignments (HSPs) and provides the disposition of all independent copies along the genomic fragment. The results are provided in text and PostScript formats to allow an automatic and visual evaluation of the respective region. AVAILABILITY: The program is available upon request from the authors. A web server of BLAST2GENE is maintained at http://www.bork.embl.de/blast2gene

19.
The sexually dimorphic expression of genes across 26 somatic rat tissues was examined using Affymetrix RAE-230 genechips. We considered probesets to be sexually dimorphically expressed (SDE) if they were measurably expressed above background in at least one sex, there was at least a two-fold difference in expression (dimorphism) between the sexes, and the differences were statistically significant after correcting for false discovery. 14.5% of expressed probesets were SDE in at least one tissue, with higher expression nearly twice as prevalent in males compared to females. Most were SDE in a single tissue. Surprisingly, nearly half of the probesets that were SDE in multiple tissues were oppositely sex biased in different tissues, and most SDE probesets were also expressed without sex bias in other tissues. Two genes were widely SDE: Xist (female-only) and Eif2s3y (male-only). The frequency of SDE probesets varied widely between tissues; it was highest in the duodenum (6.2%) but below 0.05% in over half of the surveyed tissues. The occurrence of SDE probesets was not strongly correlated between tissues. Within individual tissues, however, relational networks of SDE genes were identified. In the liver, networks relating to differential metabolism between the sexes were seen. The estrogen receptor was implicated in differential gene expression in the duodenum. To conclude, sexually dimorphic gene expression is common, but highly tissue-dependent. Sexually dimorphic gene expression may provide insights into mechanisms underlying phenotypic sex differences. Online data are provided as a resource for further analyses (GEO reference GSE63362).
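The three SDE criteria in this abstract translate directly into a filter. The sketch below makes several assumptions that the abstract does not state: the false discovery correction is taken to be the Benjamini-Hochberg procedure, expression values are on a linear (not log) scale, and the `background` and `alpha` thresholds are illustrative placeholders. Only the two-fold cutoff comes from the text.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_sde(male, female, q_value, background=1.0, min_fold=2.0, alpha=0.05):
    """Apply the three criteria to one probeset in one tissue.

    `male` and `female` are mean linear-scale expression values;
    `background` and `alpha` are hypothetical thresholds.
    """
    expressed = max(male, female) > background       # criterion 1
    lower = max(min(male, female), 1e-9)             # guard against zero
    fold = max(male, female) / lower                 # criterion 2
    significant = q_value <= alpha                   # criterion 3
    return expressed and fold >= min_fold and significant
```

In use, the raw p-values for all probesets in a tissue would be passed to `benjamini_hochberg` once, and each adjusted value then fed to `is_sde` together with that probeset's male and female means.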

20.
The repeated occurrence of genes in each other's neighbourhood on genomes has been shown to indicate a functional association between the proteins they encode. Here we introduce STRING (search tool for recurring instances of neighbouring genes), a tool to retrieve and display the genes a query gene repeatedly occurs with in clusters on the genome. The tool performs iterative searches and visualises the results in their genomic context. By finding the genomically associated genes for a query, it delineates a set of potentially functionally associated genes. The usefulness of STRING is illustrated with an example that suggests a functional context for an RNA methylase with unknown specificity. STRING is available at http://www.bork.embl-heidelberg.de/STRING
