Similar articles (20 found)
5.
Competing endogenous RNA database   (total citations: 1; self-citations: 0; citations by others: 1)
A given mRNA can be regulated by interactions with miRNAs, and in turn the availability of these miRNAs can be regulated by their interactions with alternate mRNAs. The concept of regulation of a given mRNA by alternate mRNAs (competing endogenous mRNAs) by virtue of interactions with miRNAs through shared miRNA response elements is poised to become a fundamental genetic regulatory mechanism. The molecular basis of this mRNA-mRNA cross-talk is the miRNA response element, which can be predicted based on both molecular interaction and evolutionary conservation. By examining the co-occurrence of miRNA response elements in mRNAs on a genome-wide basis, we predict competing endogenous RNAs for specific mRNAs targeted by miRNAs. Comparison of the mRNAs predicted to regulate PTEN with recently published work indicates that the results presented within the competing endogenous RNA database (ceRDB) have biological relevance.
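
The co-occurrence idea in this abstract can be illustrated with a minimal Python sketch. It is not the ceRDB scoring scheme, which the abstract does not specify; it simply assumes that each mRNA has already been mapped to the set of miRNAs with predicted response elements in it, and ranks candidates by how many miRNAs they share with a target. The function name `rank_cerna_candidates` and the gene and miRNA labels are hypothetical.

```python
def rank_cerna_candidates(mre_map, target):
    """Rank other mRNAs by the number of miRNAs they share with `target`.

    mre_map maps an mRNA name to the set of miRNAs predicted to bind it.
    """
    target_mirnas = mre_map[target]
    scores = {}
    for mrna, mirnas in mre_map.items():
        if mrna == target:
            continue
        shared = target_mirnas & mirnas  # miRNAs with MREs in both transcripts
        if shared:
            scores[mrna] = len(shared)
    # Most shared miRNAs first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy input, not real predictions.
toy_mre_map = {
    "GENE_A": {"miR-1", "miR-2", "miR-3"},
    "GENE_B": {"miR-1", "miR-2"},
    "GENE_C": {"miR-3"},
    "GENE_D": {"miR-9"},
}
ranking = rank_cerna_candidates(toy_mre_map, "GENE_A")
print(ranking)
```

A real analysis would probably weight shared elements by site count or prediction confidence rather than using a raw count; the raw count is used here only to keep the sketch short.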

Availability

http://www.oncomir.umn.edu/cefinder/

6.
Protein-protein interaction (PPI) networks provide insights into biological processes, function and the complex evolutionary mechanisms underlying the cell. Modeling PPI networks is an important and fundamental problem in systems biology, where a major open concern is finding a better-fitting model that requires fewer structural assumptions and is more robust against the large fraction of noisy PPIs. In this paper, we propose a new approach called t-logistic semantic embedding (t-LSE) to model PPI networks. t-LSE adaptively learns a metric embedding under a simple geometric assumption about PPI networks, and adopts a non-convex cost function to deal with the noise in PPI networks. The experimental results show that t-LSE fits PPI data better than other network models. Furthermore, the robust loss function adopted here leads to large improvements in dealing with the noise in PPI networks. The proposed model could thus facilitate further graph-based studies of PPIs and may help infer the hidden underlying biological knowledge. The Matlab code implementing the proposed method is freely available from the web site: http://home.ustc.edu.cn/~yzh33108/PPIModel.htm.

8.
The dispersal of yeast clumps to the unicellular state by certain sugars does not increase the percentage survival after freeze-drying. Nor are the changes in cell-wall composition that occur upon ageing of the cell, and that are detectable by means of snail-gut enzymes, related to this cellular property. However, pre-treatment of a strain of Saccharomyces carlsbergensis with β-mercaptoethanol increased the survival rate. This may be due to the reduction of certain sites in the cell wall. The oxygen consumption of yeast cultures before and after freeze-drying agrees with the hypothesis that low viabilities can arise from localized cellular damage which prevents cell reproduction by budding.

9.

Background

Current technologies have led to the availability of multiple genomic data types in sufficient quantity and quality to serve as a basis for automatic global network inference. Accordingly, there is currently a large variety of network inference methods that learn regulatory networks to varying degrees of detail. These methods have different strengths and weaknesses and thus can be complementary. However, combining different methods in a mutually reinforcing manner remains a challenge.

Methodology

We investigate how three scalable methods can be combined into a useful network inference pipeline. The first is a novel t-test-based method that relies on a comprehensive steady-state knock-out dataset to rank regulatory interactions. The remaining two are previously published mutual information and ordinary differential equation based methods (tlCLR and Inferelator 1.0, respectively) that use both time-series and steady-state data to rank regulatory interactions; the latter has the added advantage of also inferring dynamic models of gene regulation which can be used to predict the system's response to new perturbations.
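
As a rough illustration of how a steady-state knock-out dataset can be used to rank regulatory interactions, the sketch below scores each candidate edge by how far the target's expression in the regulator's knock-out deviates from the target's expression across the other knock-outs. This is an assumed z-score-style statistic chosen for brevity, not the authors' exact t-test; the function name and the toy expression values are hypothetical.

```python
import statistics


def rank_knockout_interactions(expr, genes):
    """Rank (regulator, target) pairs from steady-state knock-out data.

    expr[ko][g] is the expression of gene g when gene ko is knocked out.
    """
    scored = []
    for target in genes:
        # Expression of the target across all knock-outs except its own.
        values = [expr[ko][target] for ko in genes if ko != target]
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        for regulator in genes:
            if regulator == target or sd == 0:
                continue  # no self-edges; constant targets carry no signal
            z = abs(expr[regulator][target] - mean) / sd
            scored.append((z, regulator, target))
    scored.sort(reverse=True)  # strongest deviation first
    return [(reg, tgt, z) for z, reg, tgt in scored]


# Hypothetical toy data: knocking out A strongly lowers B.
genes = ["A", "B", "C", "D"]
expr = {
    "A": {"A": 0.0, "B": 0.1, "C": 1.0, "D": 1.0},
    "B": {"A": 1.0, "B": 0.0, "C": 1.0, "D": 1.0},
    "C": {"A": 1.0, "B": 1.0, "C": 0.0, "D": 1.0},
    "D": {"A": 1.0, "B": 1.0, "C": 1.0, "D": 0.0},
}
ranking = rank_knockout_interactions(expr, genes)
print(ranking[0][:2])
```

With replicate wild-type measurements available, a proper two-sample t-test per edge would be the more natural choice; the sketch only shows the ranking step.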

Conclusion/Significance

Our t-test-based method proved powerful at ranking regulatory interactions, tying for first place among the methods in the DREAM4 100-gene in-silico network inference challenge. We demonstrate complementarity between this method and the two methods that take advantage of time-series data by combining the three into a pipeline whose ability to rank regulatory interactions is markedly improved compared to any of the methods alone. Moreover, the pipeline is able to accurately predict the response of the system to new conditions (in this case new double knock-out genetic perturbations). Our evaluation of the performance of multiple methods for network inference suggests avenues for future methods development and provides simple considerations for genomic experimental design. Our code is publicly available at http://err.bio.nyu.edu/inferelator/.

10.
Liang Y, Zhang F, Wang J, Joshi T, Wang Y, Xu D. PLoS ONE. 2011;6(7):e21750

Background

Identifying genes with essential roles in resisting environmental stress ranks high in agronomic importance. Although massive DNA microarray gene expression data have been generated for plants, current computational approaches underutilize these data for studying genotype-trait relationships. Some advanced gene identification methods have been explored for human diseases, but typically these methods have not been converted into publicly available software tools and cannot be applied to plants for identifying genes associated with agronomic traits.

Methodology

In this study, we used 22 sets of Arabidopsis thaliana gene expression data from GEO to predict the key genes involved in water tolerance. We applied an SVM-RFE (Support Vector Machine-Recursive Feature Elimination) feature selection method for the prediction. To address small sample sizes, we developed a modified approach for SVM-RFE by using bootstrapping and leave-one-out cross-validation. We also expanded our study to predict genes involved in water susceptibility.

Conclusions

We analyzed the top 10 genes predicted to be involved in water tolerance. Seven of them are connected to known biological processes in drought resistance. We also analyzed the top 100 genes in terms of their biological functions. Our study shows that the SVM-RFE method is highly promising for analyzing plant microarray data to study genotype-phenotype relationships. The software is freely available with source code at http://ccst.jlu.edu.cn/JCSB/RFET/.

11.

Background

Gene expression, as governed by the interplay of the components of regulatory networks, is one of the most complex fundamental processes in biological systems. Although several methods have been published to unravel the hierarchical structure of regulatory networks, weaknesses remain unsolved: the incorrect or inconsistent assignment of elements to their hierarchical levels, the inability to cope with cyclic dependencies within the networks, and the need for manual curation to retrieve non-overlapping levels.

Methodology/Results

We developed HiNO as a significant improvement of the so-called breadth-first-search (BFS) method. While BFS is capable of determining the overall hierarchical structure of gene regulatory networks, it has particular difficulty with feed-forward loops, which lead to conflicts within the level assignments. We resolved these problems by adding a recursive correction approach consisting of two steps. First, each vertex is placed on the lowest level that this vertex and its regulating vertices are assigned to (downgrade procedure). Second, vertices are assigned to the next higher level (upgrade procedure) if they have successors with the same level assignment and have themselves no regulators. We evaluated HiNO by comparing it with the BFS method, applying both to the regulatory networks of Saccharomyces cerevisiae and Escherichia coli. The comparison shows clearly how conflicts in level assignment are resolved in HiNO in order to produce correct hierarchical structures even on the local levels in an automated fashion.
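
The kind of conflict that plain BFS produces on a feed-forward loop can be reproduced in a few lines of Python. The sketch below is only the baseline BFS level assignment plus a conflict check; it does not implement HiNO's downgrade and upgrade procedures, whose details go beyond this abstract. The function names and the three-gene network are hypothetical.

```python
from collections import deque


def bfs_levels(edges):
    """Assign each vertex its BFS depth from the vertices without regulators.

    edges is a list of (regulator, target) pairs. Vertices that cannot be
    reached from a regulator-free vertex (e.g. pure cycles) get no level.
    """
    targets = {}
    vertices = set()
    has_regulator = set()
    for reg, tgt in edges:
        targets.setdefault(reg, []).append(tgt)
        vertices.update((reg, tgt))
        has_regulator.add(tgt)
    roots = sorted(vertices - has_regulator)
    level = {v: 0 for v in roots}
    queue = deque(roots)
    while queue:
        v = queue.popleft()
        for t in targets.get(v, []):
            if t not in level:  # first visit fixes the level
                level[t] = level[v] + 1
                queue.append(t)
    return level


def same_level_conflicts(edges, level):
    """Return edges whose regulator and target ended up on the same level."""
    return [(r, t) for r, t in edges
            if r in level and t in level and level[r] == level[t]]


# Feed-forward loop: A regulates B and C, and B also regulates C.
ffl = [("A", "B"), ("B", "C"), ("A", "C")]
levels = bfs_levels(ffl)
conflicts = same_level_conflicts(ffl, levels)
print(levels, conflicts)
```

Here B and C both land one step below A even though B regulates C, which is the inconsistency a correction step has to resolve.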

Conclusions

We showed that the resolution of conflicting assignments clearly improves the BFS method. While we restricted our analysis to gene regulatory networks, our approach is suitable for any directed hierarchical network structure, such as the interaction of microRNAs or the action of non-coding RNAs in general. Furthermore, we provide a user-friendly web interface for HiNO that enables the extraction of the hierarchical structure of any directed regulatory network.

Availability

HiNO is freely accessible at http://mips.helmholtz-muenchen.de/hino/.

12.
Meiotic recombination is an important biological process. As a main driving force of evolution, recombination provides natural new combinations of genetic variations. Rather than occurring randomly across a genome, meiotic recombination takes place in some genomic regions (the so-called ‘hotspots’) with higher frequencies, and in other regions (the so-called ‘coldspots’) with lower frequencies. Therefore, information about the hotspots and coldspots would provide useful insights for in-depth study of the mechanism of recombination and of the genome evolution process as well. So far, the recombination regions have been mainly determined by experiments, which are both expensive and time-consuming. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying the recombination regions. In this study, a predictor, called ‘iRSpot-PseDNC’, was developed for identifying the recombination hotspots and coldspots. In the new predictor, the samples of DNA sequences are formulated by a novel feature vector, the so-called ‘pseudo dinucleotide composition’ (PseDNC), into which six local DNA structural properties, i.e. three angular parameters (twist, tilt and roll) and three translational parameters (shift, slide and rise), are incorporated. The rigorous jackknife test showed that the overall success rate achieved by iRSpot-PseDNC was >82% in identifying recombination spots in Saccharomyces cerevisiae, indicating that the new predictor is promising or at least may become a complementary tool to the existing methods in this area. Although the benchmark data set used to train and test the current method was from S. cerevisiae, the basic approaches can also be extended to deal with all the other genomes. In particular, it has not escaped our notice that the PseDNC approach can also be used to study many other DNA-related problems.
As a user-friendly web-server, iRSpot-PseDNC is freely accessible at http://lin.uestc.edu.cn/server/iRSpot-PseDNC.
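
To make the feature vector more concrete, the following Python sketch builds a vector in the general style of a pseudo dinucleotide composition: the 16 dinucleotide frequencies followed by a few sequence-order correlation terms computed from per-dinucleotide property values, all normalised to sum to one. The exact formulation, the parameters `lam` and `w`, and above all the property table are assumptions for illustration; the toy table below is a placeholder and does NOT contain the real twist, tilt, roll, shift, slide and rise values.

```python
from itertools import product

DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]


def pse_dnc(seq, props, lam=2, w=0.5):
    """Return a (16 + lam)-dimensional pseudo dinucleotide composition vector.

    props maps each dinucleotide to a tuple of numeric property values.
    Only sequences over the alphabet A, C, G, T are supported.
    """
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(pairs)
    if n <= lam:
        raise ValueError("sequence too short for the chosen lam")
    # Local information: normalised dinucleotide counts.
    freqs = [pairs.count(d) / n for d in DINUCS]
    # Sequence-order information: mean squared property difference between
    # dinucleotides that are j steps apart, for j = 1..lam.
    thetas = []
    for j in range(1, lam + 1):
        total = 0.0
        for i in range(n - j):
            a, b = props[pairs[i]], props[pairs[i + j]]
            total += sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
        thetas.append(total / (n - j))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


# Placeholder properties (GC count, homodimer flag); NOT real structural data.
toy_props = {d: (float(d.count("G") + d.count("C")), float(d[0] == d[1]))
             for d in DINUCS}
vec = pse_dnc("ACGTTGCAAC", toy_props)
print(len(vec))
```

In practice the property values would be standardised before use, and the resulting vectors passed to a classifier; neither step is shown here.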

13.
A new program (Multiple Motif Scanning) was developed to scan the Saccharomyces cerevisiae proteome for Class I S-adenosylmethionine-dependent methyltransferases. Conserved Motifs I, Post I, II, and III were identified and expanded in known methyltransferases by primary sequence and secondary structural analysis through hidden Markov model profiling of both a yeast reference database and a reference database of methyltransferases with solved three-dimensional structures. The roles of the conserved amino acids in the four motifs of the methyltransferase structure and function were then analyzed to expand the previously defined motifs. Fisher-based negative log statistical matrix sets were developed from the prevalence of amino acids in the motifs. Multiple Motif Scanning is able to scan the proteome and score different combinations of the top fitting sequences for each motif. In addition, the program takes into account the conserved number of amino acids between the motifs. The output of the program is a ranked list of proteins that can be used to identify new methyltransferases and to reevaluate the assignment of previously identified putative methyltransferases. The Multiple Motif Scanning program can be used to develop a putative list of enzymes for any type of protein that has one or more motifs conserved at variable spacings and is freely available (www.chem.ucla.edu/files/MotifSetup.Zip). Finally, hidden Markov model profile clustering analysis was used to subgroup Class I methyltransferases into groups that reflect their methyl-accepting substrate specificity.

Enzymes that catalyze the transfer of a methyl group from S-adenosylmethionine to protein, nucleic acid, lipid, and small molecule substrates are widely distributed in nature and function in a variety of biological pathways including metabolic regulation, gene expression, the repair of aging biomolecules, and biosynthesis (1). There are several classes of AdoMet-dependent methyltransferases.
Class I enzymes are the most abundant and share a common three-dimensional structural core that includes a seven-strand twisted β sheet (2–5). It has been estimated that about 0.6–1.6% of genes in organisms including Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Arabidopsis thaliana, and humans encode Class I methyltransferases (6). These results suggest that there are some 50 species in yeast and some 300 species in humans. However, most of these assignments have not been confirmed, and only a relatively small fraction of them have been functionally identified. Previous bioinformatics studies on Class I methyltransferases have taken advantage of the fact that the structure of the AdoMet binding site is conserved in the primary sequences of four short signature motifs designated Motifs I, Post I, II, and III (6–8). These motifs are present with variable, but often conserved, spacing in the primary sequences. Initial identifications of Class I methyltransferases in the yeast proteome were based on searches with a “Motifs in Protein Data Bases” program with individual motifs to generate a list of putative methyltransferases (8). More recently, Katz et al. (6) performed a search using the Motif Alignment and Search Tool (MAST) with Multiple Em for Motif Elicitation (MEME)-generated matrices combining Motifs I/Post I as well as PSI-BLAST searches to identify methyltransferases in both yeast and a variety of other organisms. However, the success of these methods was limited by difficulties in identifying these motifs in the known methyltransferases. In fact, it was not possible to take advantage of the information content of Motifs II and III in the latter study (6).

In recent years, two developments have provided new windows to improve the screening of proteomes to identify the complete cast of Class I methyltransferases in various organisms.
In the first place, three-dimensional structures are now known for a large number of Class I methyltransferases. Because the motifs are closely linked to structural features (2–4), their identification is unambiguous. Second, secondary structure prediction algorithms can now be used as independent confirmations of the structure-linked sequence motifs in methyltransferases whose structures have not been determined. We wanted to take advantage of the improved motif identification by developing advanced software for searching proteomes for multiple motifs at varying, but partially conserved, spacing. We have now used secondary structure prediction to obtain the motifs of a reference group of 32 known yeast methyltransferases and a search of the Research Collaboratory for Structural Bioinformatics Protein Data Bank to identify the motifs of a second reference group of 33 distinct types of Class I methyltransferases with three-dimensional structures. We describe a method for generating search matrices for each of these reference groups and a program “Multiple Motif Scanning” to score these matrices in the yeast proteome. This approach not only identified new methyltransferases but allowed us to reject some of the previously described candidate methyltransferases. Additionally, we used HMM profile clustering analysis to extract more information about the possible substrates of these putative methyltransferases (9).
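
The general idea of scoring several motifs at variable but bounded spacing can be sketched as follows. This is not the Multiple Motif Scanning program: it assumes a simple negative log frequency matrix with a pseudocount rather than the Fisher-based matrices mentioned in the abstract, it handles only two motifs, and the motif instances, the test sequence and the gap limits are invented for illustration. Lower scores mean a better fit.

```python
import math

AMINO = "ACDEFGHIKLMNPQRSTVWY"


def neg_log_matrix(instances, pseudo=1.0):
    """Build a per-position negative log frequency matrix from aligned motifs."""
    length = len(instances[0])
    total = len(instances) + pseudo * len(AMINO)
    matrix = []
    for pos in range(length):
        column = [inst[pos] for inst in instances]
        matrix.append({aa: -math.log((column.count(aa) + pseudo) / total)
                       for aa in AMINO})
    return matrix


def window_scores(seq, matrix):
    """Score every window of the sequence against one motif matrix."""
    width = len(matrix)
    return [(sum(matrix[k][seq[i + k]] for k in range(width)), i)
            for i in range(len(seq) - width + 1)]


def best_pair(seq, m1, m2, min_gap, max_gap):
    """Find the lowest-scoring placement of motif 1 followed by motif 2.

    The gap is the number of residues between the end of motif 1 and the
    start of motif 2. Returns (score, start1, start2) or None.
    """
    best = None
    for s1, i in window_scores(seq, m1):
        for s2, j in window_scores(seq, m2):
            gap = j - (i + len(m1))
            if min_gap <= gap <= max_gap:
                cand = (s1 + s2, i, j)
                if best is None or cand < best:
                    best = cand
    return best


# Hypothetical motif instances and sequence, not real methyltransferase data.
motif1 = neg_log_matrix(["GAGTG", "GCGSG", "GCGTG"])
motif2 = neg_log_matrix(["DVV", "DIV", "DLV"])
result = best_pair("MKKLLGCGTGAAAAPPDIVRR", motif1, motif2, min_gap=3, max_gap=10)
print(result)
```

Ranking a proteome would amount to computing this best combined score for every protein and sorting; extending to four motifs adds one nested placement and one gap constraint per extra motif.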

14.
The demand for phenomics, a high-dimensional and high-throughput phenotyping method, has been increasing in many fields of biology. The budding yeast Saccharomyces cerevisiae, a unicellular model organism, provides an invaluable system for dissecting complex cellular processes using high-resolution phenotyping. Moreover, the addition of spatial and temporal attributes to subcellular structures based on microscopic images has rendered this cell phenotyping system more reliable and amenable to analysis. A well-designed experiment followed by appropriate multivariate analysis can yield a wealth of biological knowledge. Here we review recent advances in cell imaging and illustrate their broad applicability to eukaryotic cells by showing how these techniques have advanced our understanding of budding yeast.

15.
Understanding the molecular basis of common traits is a primary challenge of modern genetics. One model holds that rare mutations in many genetic backgrounds may often phenocopy one another, together explaining the prevalence of the resulting trait in the population. For the vast majority of phenotypes, the role of rare variants and the evolutionary forces that underlie them are unknown. In this work, we use a population of Saccharomyces paradoxus yeast as a model system for the study of common trait variation. We observed an unusual flocculation and invasive-growth phenotype in one-third of S. paradoxus strains, which were otherwise unrelated. In crosses with each strain in turn, these morphologies segregated as a recessive Mendelian phenotype, mapping either to IRA1 or to IRA2, yeast homologs of the hypermutable human neurofibromatosis gene NF1. The causal IRA1 and IRA2 haplotypes were of distinct evolutionary origin and, in addition to their morphological effects, were associated with hundreds of stress-resistance and growth traits, both beneficial and disadvantageous, across S. paradoxus. Single-gene molecular genetic analyses confirmed variant IRA1 and IRA2 haplotypes as causal for these growth characteristics, many of which were independent of morphology. Our data make clear that common growth and morphology traits in yeast result from a suite of variants in master regulators, which function as a mutation-driven switch between phenotypic states.

16.
RNA-Seq techniques generate hundreds of millions of short RNA reads using next-generation sequencing (NGS). These RNA reads can be mapped to reference genomes to investigate changes in gene expression, but improved procedures for mining large RNA-Seq datasets to extract valuable biological knowledge are needed. RNAMiner, a multi-level bioinformatics protocol and pipeline, has been developed for such datasets. It includes five steps: mapping RNA-Seq reads to a reference genome, calculating gene expression values, identifying differentially expressed genes, predicting gene functions, and constructing gene regulatory networks. To demonstrate its utility, we applied RNAMiner to datasets generated from human, mouse, Arabidopsis thaliana, and Drosophila melanogaster cells, and successfully identified differentially expressed genes, clustered them into cohesive functional groups, and constructed novel gene regulatory networks. The RNAMiner web service is available at http://calla.rnet.missouri.edu/rnaminer/index.html.

17.
The availability of genomic sequences of many organisms has opened new challenges, particularly in genome analysis. Sequence extraction is a vital step, and many tools have been developed to address it. These tools are publicly available but have limitations with respect to sequence extraction, the length of the sequence that can be extracted, organism specificity and the lack of a user-friendly interface. We have developed a Java-based software package with three modules that can be used independently or sequentially. The tool efficiently extracts sequences from large datasets in a few simple steps. It can efficiently extract multiple sequences of any desired length from the genome of any organism. The results were cross-checked against published data.

Availability

URL 1: http://ww3.comsats.edu.pk/bio/ResearchProjects.aspx
URL 2: http://ww3.comsats.edu.pk/bio/SequenceManeuverer.aspx

19.
The innate immune system is an ancient component of host defense. Since innate immunity pathways are well conserved throughout many eukaryotes, immune genes in model animals can be used to putatively identify homologous genes in newly sequenced genomes of non-model organisms. With the initiation of the “i5k” project, which aims to sequence 5,000 insect genomes by 2016, many novel insect genomes will soon become publicly available, yet few annotation resources are currently available for insects. Thus, we developed an online tool called the Insect Innate Immunity Database (IIID) to provide an open access resource for insect immunity and comparative biology research (http://www.vanderbilt.edu/IIID). The database provides users with simple exploratory tools to search the immune repertoires of five insect models (including Nasonia), spanning three orders, for specific immunity genes or genes within a particular immunity pathway. As a proof of principle, we used an initial database with only four insect models to annotate potential immune genes in the parasitoid wasp genus Nasonia. Results specify 306 putative immune genes in the genomes of N. vitripennis and its two sister species N. giraulti and N. longicornis. Of these genes, 146 were not found in previous annotations of Nasonia immunity genes. Combining these newly identified immune genes with those in previous annotations, Nasonia possess 489 putative immunity genes, the largest immune repertoire found in insects to date. While these computational predictions need to be complemented with functional studies, the IIID database can help initiate and augment annotations of the immune system in the plethora of insect genomes that will soon become available.

20.
Domains are instrumental in facilitating protein interactions with DNA, RNA, small molecules, ions and peptides. Identifying ligand-binding domains within sequences is a critical step in protein function annotation, and the ligand-binding properties of proteins are frequently analyzed based upon whether they contain one of these domains. To date, however, knowledge of whether and how protein domains interact with ligands has been limited to domains that have been observed in co-crystal structures; this leaves approximately two-thirds of human protein domain families uncharacterized with respect to whether and how they bind DNA, RNA, small molecules, ions and peptides. To fill this gap, we introduce dSPRINT, a novel ensemble machine learning method for predicting whether a domain binds DNA, RNA, small molecules, ions or peptides, along with the positions within it that participate in these types of interactions. In stringent cross-validation testing, we demonstrate that dSPRINT has an excellent performance in uncovering ligand-binding positions and domains. We also apply dSPRINT to newly characterize the molecular functions of domains of unknown function. dSPRINT’s predictions can be transferred from domains to sequences, enabling predictions about the ligand-binding properties of 95% of human genes. The dSPRINT framework and its predictions for 6503 human protein domains are freely available at http://protdomain.princeton.edu/dsprint.
