Similar literature
1.
2.
3.
4.
With the complete sequencing of multiple genomes, methods of sequence analysis have been extended from single-gene or single-protein approaches to the simultaneous analysis of multiple genes and proteins. There is therefore a demand for user-friendly software tools that allow mining of these enormous datasets. PPD is a WWW-based database for comparative analysis of protein lengths in completely sequenced prokaryotic and eukaryotic genomes. PPD's core objective is to create protein classification tables based on protein length for a user-specified set of organisms and parameters. The interface can also generate information on changes in proteins of specific length distributions. This feature is important when the user's interest is focused on evolutionarily related organisms or on organisms with similar or related tissue specificity or lifestyle. PPD is available at: PPD Home.
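
The length-based classification described above can be illustrated with a minimal Python sketch. It is not PPD's implementation: the function name `length_class_table`, the 100-residue bin width, and the toy organisms and lengths are all assumptions made for illustration.

```python
from collections import Counter


def length_class_table(proteomes, bin_width=100):
    """Count proteins per length class for each organism.

    proteomes: dict mapping organism name -> list of protein lengths (residues).
    Returns dict mapping organism -> {(low, high): count}, with inclusive bounds.
    """
    table = {}
    for organism, lengths in proteomes.items():
        # Lengths 1..bin_width fall in bin 0, the next bin_width in bin 1, etc.
        counts = Counter((length - 1) // bin_width for length in lengths)
        table[organism] = {
            (b * bin_width + 1, (b + 1) * bin_width): n
            for b, n in sorted(counts.items())
        }
    return table


# Hypothetical toy data, not taken from any real genome.
proteomes = {
    "organism_A": [95, 120, 180, 310, 99],
    "organism_B": [250, 260, 1020],
}
table = length_class_table(proteomes)
for organism, classes in table.items():
    print(organism, classes)
```

Comparing the resulting tables across organisms shows which length classes are over- or under-represented in a given genome.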

5.
6.
7.
Xiao Y, Xu C, Xu L, Guan J, Ping Y, Fan H, Li Y, Zhao H, Li X. Gene, 2012, 499(2): 332-338
The development of heart failure (HF) is a complex process that can be initiated by multiple etiologies. Identifying common functional modules associated with HF is a challenging task. Here, we developed a systems method to identify these common functional modules by integrating multiple expression profiles, protein interactions from four species, gene function annotations, and text information. We identified 1439 consistently differentially expressed genes (CDEGs) across HF with different etiologies by applying three meta-analysis methods to multiple HF-related expression profiles. Using a weighted human interaction network constructed by combining interaction data from multiple species, we extracted 60 candidate CDEG modules. We further evaluated the functional relevance of each module by using expression, interaction network, functional annotation, and text information together. Finally, five functional modules with significant biological relevance were identified. We found that almost half of the genes in these modules are hubs in the weighted network, and that these modules can accurately distinguish HF patients from healthy subjects. We also identified many significantly enriched biological processes that contribute to the pathophysiology of HF, including two new ones, RNA splicing and vesicle-mediated protein transport. In summary, we proposed a novel framework to analyze common functional modules related to HF with different etiologies. Our findings provide important insights into the complex mechanism of HF. Further biological experiments are required to validate these novel biological processes.

8.
9.
10.
The ultimate goal of functional genomics is to define the function of all the genes in the genome of an organism. A large body of information on the biological roles of genes has been accumulated and aggregated over the past decades of research, both from traditional experiments detailing the role of individual genes and proteins, and from newer experimental strategies that aim to characterize gene function on a genomic scale. It is clear that the goal of functional genomics can only be achieved by integrating information and data sources from these different experiments. Integration of different data is thus an important challenge for bioinformatics. The integration of different data sources often helps to uncover non-obvious relationships between genes, but there are also two further benefits. First, whenever information from multiple independent sources agrees, it is likely to be more valid and reliable. Second, by looking at the union of multiple sources, one can cover larger parts of the genome. This is obvious for integrating results from multiple single-gene or single-protein experiments, but it is also necessary for many of the results from genome-wide experiments, since they are often confined to certain (although sizable) subsets of the genome. In this paper, we explore an example of such a data integration procedure. We focus on the prediction of membership in protein complexes for individual genes. For this, we recruit six different data sources that include expression profiles, interaction data, essentiality and localization information. Each of these data sources individually contains some weakly predictive information with respect to protein complexes, but we show how this prediction can be improved by combining all of them. Supplementary information is available at http://bioinfo.mbb.yale.edu/integrate/interactions/. Abbreviations: TP: true positive; TN: true negative; FP: false positive; FN: false negative; Y2H: yeast two-hybrid.
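
The idea that several weakly predictive sources can be combined into a stronger predictor can be sketched in Python. This is only an illustration using simple majority voting, which is an assumption and not the integration procedure of the paper; the helper names `vote` and `confusion` and all data values are hypothetical. The output uses the TP, FP, FN and TN counts defined in the abbreviations above.

```python
def vote(sources, min_votes):
    """Predict True for an item when at least min_votes sources say True."""
    return [sum(votes) >= min_votes for votes in zip(*sources)]


def confusion(predicted, actual):
    """Return TP, FP, FN and TN counts for boolean predictions."""
    pairs = list(zip(predicted, actual))
    return {
        "TP": sum(1 for p, a in pairs if p and a),
        "FP": sum(1 for p, a in pairs if p and not a),
        "FN": sum(1 for p, a in pairs if not p and a),
        "TN": sum(1 for p, a in pairs if not p and not a),
    }


# Hypothetical toy data: complex membership for six genes and three
# imperfect sources, each of which is wrong on two of the six genes.
actual = [True, True, True, False, False, False]
source_a = [True, True, False, True, False, False]
source_b = [True, False, True, False, True, False]
source_c = [False, True, True, False, False, True]

single = confusion(source_a, actual)
combined = confusion(vote([source_a, source_b, source_c], min_votes=2), actual)
print("source_a alone:", single)
print("2-of-3 vote:   ", combined)
```

In this toy case the errors of the three sources do not coincide, so the vote recovers every label. Real sources are correlated and noisier, so the gain is smaller.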

11.
Predicting the behavior of living organisms is an enormous challenge given their vast complexity. Efforts to model biological systems require large datasets generated by physical binding experiments and perturbation studies. Genetic perturbations have proven important and are greatly facilitated by the advent of comprehensive mutant libraries in model organisms. Small-molecule chemical perturbagens provide a complementary approach, especially for systems that lack mutant libraries, and can easily probe the function of essential genes. Though single chemical or genetic perturbations provide crucial information associating individual components (for example, genes, proteins or small molecules) with pathways or phenotypes, functional relationships between pathways and modules of components are most effectively obtained from combined perturbation experiments. Here we review the current state of and discuss some future directions for 'combination chemical genetics', the systematic application of multiple chemical or mixed chemical and genetic perturbations, both to gain insight into biological systems and to facilitate medical discoveries.

12.
13.
Within the past five years genome-scale gene essentiality data sets have been published for ten diverse bacterial species. These data are a rich source of information about cellular networks that we are only beginning to explore. The analysis of these data, very heterogeneous in nature, is a challenging task. Even the definition of 'essential genes' in various genome-scale studies varies from genes 'absolutely required for survival' to those 'strongly contributing to fitness' and robust competitive growth. A comparative analysis of gene essentiality across multiple organisms based on projection of experimentally observed essential genes to functional roles in a collection of metabolic pathways and subsystems is emerging as a powerful tool of systems biology.

14.
Genetic and biochemical analysis of Saccharomyces cerevisiae containing a disruption of the nuclear gene (AAC1) encoding the mitochondrial ADP/ATP carrier has revealed a second gene for this protein. The second gene, designated AAC2, has been isolated by genetic complementation and sequenced. AAC2 contains a 954-base pair open reading frame coding for a protein of 318 amino acids which is highly homologous to the AAC1 gene product except that it is nine amino acids longer at the NH2 terminus. The two yeast genes are highly conserved at the level of DNA and protein and share identity with the ADP/ATP carriers from other organisms. Both genes complement an ADP/ATP carrier defect (op1 or pet9). However, the newly isolated gene AAC2 need be present in only one or two copies, while the previously isolated AAC1 gene must be present in multiple copies to support growth dependent on a functional carrier protein. This gene dosage-dependent complementation, combined with the high degree of conservation, suggests that these two functionally equivalent genes may be differentially expressed.

15.
Expression profile analysis of genes provides valuable information concerning the genetic response of cells to stimuli. We describe an adaptation of this technology that can be used to probe for the expression of specific families of genes in microbial species. In our method a combination of sets of oligonucleotide probes representing fingerprint sequences specific to protein families is used to identify the presence and expression levels of family homologs in a microbial cell. We demonstrate computationally, using exemplars, that when the cDNA complement from an organism is sequentially screened against a set of specific motif oligonucleotides, statistically significant information can be obtained concerning the expression of the corresponding genes. This method can be used to identify specific genes and pathways simultaneously in several organisms of interest even in the absence of sequence information from the organisms.

16.
MOTIVATION: Numerous annotations are available that functionally characterize genes and proteins with regard to molecular process, cellular localization, tissue expression, protein domain composition, protein interaction, disease association and other properties. Searching this steadily growing amount of information can lead to the discovery of new biological relationships between genes and proteins. To facilitate the searches, methods are required that measure the annotation similarity of genes and proteins. However, most current similarity methods are focused only on annotations from the Gene Ontology (GO) and do not take other annotation sources into account. RESULTS: We introduce the new method BioSim that incorporates multiple sources of annotations to quantify the functional similarity of genes and proteins. We compared the performance of our method with four other well-known methods adapted to use multiple annotation sources. We evaluated the methods by searching for known functional relationships using annotations based only on GO or on our large data warehouse BioMyn. This warehouse integrates many diverse annotation sources of human genes and proteins. We observed that the search performance improved substantially for almost all methods when multiple annotation sources were included. In particular, our method outperformed the other methods in terms of recall and average precision.
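
To make the notions of annotation similarity and average precision concrete, here is a minimal Python sketch. It does not reproduce BioSim: the mean Jaccard index over annotation sources is a stand-in measure chosen for simplicity, and the function names, source labels and gene data are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two annotation term collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def multi_source_similarity(gene_x, gene_y):
    """Mean Jaccard index over every annotation source seen for either gene.

    Each gene is a dict mapping source name -> set of annotation terms.
    """
    sources = set(gene_x) | set(gene_y)
    if not sources:
        return 0.0
    total = sum(jaccard(gene_x.get(s, ()), gene_y.get(s, ())) for s in sources)
    return total / len(sources)


def average_precision(ranked, relevant):
    """Average of precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical annotations from two sources.
gene_a = {"GO": {"transport", "binding"}, "domain": {"kinase"}}
gene_b = {"GO": {"transport"}, "domain": {"kinase"}}
similarity = multi_source_similarity(gene_a, gene_b)

# Hypothetical ranked search result with two known related genes.
ap = average_precision(["g1", "g2", "g3", "g4"], {"g1", "g3"})
print(similarity, ap)
```

A search evaluation of the kind described would rank all candidate genes by similarity to a query gene and score the ranking against known functional relationships.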

17.
Abstract

Complete functional annotation of proteins is essential to understand their roles and mechanisms in pathogenesis. Aminoglycoside nucleotidyltransferases are a subclass of aminoglycoside-modifying enzymes that confer antibiotic resistance on organisms. Insight into the structure and function of the nucleotidyltransferase family protein provides vital information to combat pathogenesis. Phylogenetic analysis was employed to identify the evolutionary significance of, and the common motifs present among, the homologs of the nucleotidyltransferase family protein. Structure-based and sequence-based approaches and molecular docking were used to predict the exact function of the protein. The wide distribution of the nucleotidyltransferase family protein in gram-positive and gram-negative organisms is evident from the phylogenetic analysis. Five common motifs were present in all the homologs of the nucleotidyltransferase family protein. Sequence- and structure-based functional annotations predict that the targeted protein functions as an ATP-Mg-dependent streptomycin adenylyltransferase. Structural comparisons and docking studies correlate well with the identified function. The complete function of the nucleotidyltransferase family protein was identified as streptomycin adenylyltransferase, and it could serve as a potential therapeutic target to overcome antibiotic resistance.

Communicated by Ramaswamy H. Sarma

Abbreviations
AAC: aminoglycoside acetyltransferases
AME: aminoglycoside modifying enzyme
ANT: aminoglycoside nucleotidyltransferases
APH: aminoglycoside phosphotransferases
ATP: adenosine triphosphate
CASTp: computed atlas of surface topography of proteins
DUF: domains of unknown function
Glide: grid-based ligand docking with energetics
HMM: hidden Markov model
MAST: motif alignment and search tool
MEGA: molecular evolutionary genetics analysis
MEME: multiple EM for motif elicitation
MSA: multiple sequence alignment
NMP: nucleoside monophosphate
NTP: nucleoside triphosphate
NT: nucleotidyltransferase
OPLS: optimized potentials for liquid simulations
XP: extra precision

18.
Molecular characterizations of bacteria often employ ribosomal DNA (rDNA) to establish the identity and relationships among organisms, but the use of rRNA sequences can be problematic as a result of alignment ambiguities caused by indels, the lack of informative characters, and varying functional constraints over the molecule. Although protein-coding regions have been used as an alternative to rRNA, there is neither consensus among the genes examined nor a way to rapidly obtain sequence information for such genes from uncharacterized bacterial species. To standardize the set of protein-coding loci assayed in bacterial genomes, we examined over 100 widely distributed genes to identify sets of universal primers for use in the PCR amplification of protein-coding regions that are common to virtually all bacteria. From this set, we developed primer sets that target each of 10 genes spanning an array of genomic locations and functional categories. Although many of the primers contain sequence degeneracies that aid in targeting genes across diverse taxa, most are adequate for direct sequencing of amplification products, thereby eliminating intermediate cloning before sequence determination. We foresee the analysis of these protein-coding regions as being complementary to ribosomal DNA for answering questions pertaining to bacterial identification, classification, phylogenetics and evolution.
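
The sequence degeneracies mentioned above are conventionally written with IUPAC nucleotide codes, where one letter stands for several possible bases. The following Python sketch expands a degenerate primer into the concrete sequences it represents. The primer `ATGRYN` and the function names are hypothetical and are not taken from the paper's primer sets.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence a degenerate primer can represent."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer):
    """Number of distinct sequences encoded by the primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


primer = "ATGRYN"  # hypothetical example primer
variants = expand_primer(primer)
print(degeneracy(primer), variants[:4])
```

Degeneracy grows multiplicatively with each ambiguous position, which is one reason primer designs tend to limit the number of ambiguous positions.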

19.
Functional classification of proteins from sequences alone has become a critical bottleneck in understanding the myriad of protein sequences that accumulate in our databases. The great diversity of homologous sequences hides, in many cases, a variety of functional activities that cannot be anticipated. Their identification appears critical for a fundamental understanding of the evolution of living organisms and for biotechnological applications. ProfileView is a sequence-based computational method, designed to functionally classify sets of homologous sequences. It relies on two main ideas: the use of multiple profile models whose construction explores evolutionary information in available databases, and a novel definition of a representation space in which to analyze sequences with multiple profile models combined together. ProfileView classifies protein families by enriching known functional groups with new sequences and discovering new groups and subgroups. We validate ProfileView on seven classes of widespread proteins involved in the interaction with nucleic acids, amino acids and small molecules, and in a large variety of functions and enzymatic reactions. ProfileView agrees with the large set of functional data collected for these proteins from the literature regarding the organization into functional subgroups and residues that characterize the functions. In addition, ProfileView resolves undefined functional classifications and extracts the molecular determinants underlying protein functional diversity, showing its potential to select sequences towards accurate experimental design and discovery of novel biological functions. On protein families with complex domain architecture, ProfileView functional classification reconciles domain combinations, unlike phylogenetic reconstruction. 
ProfileView proves to outperform the functional classification approach PANTHER, the two k-mer-based methods CUPP and eCAMI and a neural network approach based on Restricted Boltzmann Machines. It overcomes time complexity limitations of the latter.

20.
In recent years, genomics has been extended to functional genomics. Toward the characterization of organisms or species on the genome level, changes on the metabolite and protein level have been shown to be essential to assign functions to genes and to describe the dynamic molecular phenotype. Gas chromatography (GC) and liquid chromatography (LC) coupled to mass spectrometry (GC-MS and LC-MS) are well suited for the fast and comprehensive analysis of ultracomplex metabolite samples. For the integration of metabolite profiles with quantitative protein profiles, a high-throughput (HTP) shotgun proteomics approach using LC-MS and label-free quantification of unique proteins in a complex protein digest is described. Multivariate statistics are applied to examine sample pattern recognition based on data-dimensionality reduction and biomarker identification in plant systems biology. The integration of the data reveals multiple correlative biomarkers, providing evidence for an increase of information in such holistic approaches. With computational simulation of metabolic networks and experimental measurements, it can be shown that biochemical regulation is reflected by metabolite network dynamics measured in a metabolomics approach. Examples in molecular plant physiology are presented to substantiate the integrative approach.
