Similar literature
 20 similar documents found (search time: 8 ms)
1.
2.

Background

This paper is devoted to distance measures for leaf-labelled trees on a free leafset. A leaf-labelled tree is a special type of tree in which only the leaf (terminal) nodes are labelled. This data structure is used in bioinformatics to model the evolutionary history of genes and species, and in linguistics to model the evolutionary history of languages. Many domain-specific problems arise that need to be solved with the help of tree postprocessing techniques such as distance measures.

Results

Here we introduce a tree edit distance designed for leaf-labelled trees on a free leafset, which turns out to be a metric. It is presented together with the notion of a tree edit consensus tree. We provide a statistical evaluation of the proposed measure against R-F, MAST and frequent-subsplit-based dissimilarity measures as reference measures.

Conclusions

The tree edit distance was proven to be a metric and has the advantage of using different costs for contraction and pruning, so its properties can be tuned to the needs of the user. Two of the presented methods have the most interesting properties. E(3,1) is very discriminative (it has a wide range of values) and has a very regular distance distribution, similar in shape to a normal distribution, that works well for both similar and dissimilar trees. NFC(2,1), on the other hand, is proportional or nearly proportional to the number of mutation operations used, irrespective of their type.
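
As a point of comparison for the reference measures named above, the following is a minimal sketch of a Robinson-Foulds-style distance (taking R-F to mean Robinson-Foulds, as is usual for tree distances). It is not the tree edit distance of the paper: it assumes rooted trees written as nested Python tuples over the same leaf set, so it does not handle the free-leafset case. The function names `clusters` and `rf_distance` are hypothetical, chosen for illustration.

```python
def clusters(tree):
    """Return the non-trivial leaf clusters of a rooted tree given as nested tuples.

    Assumption: internal nodes are tuples, leaves are any hashable non-tuple label.
    """
    found = set()

    def leaves(node):
        # A leaf contributes only its own label and defines no cluster.
        if not isinstance(node, tuple):
            return frozenset([node])
        # An internal node's cluster is the union of its children's leaf sets.
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    root = leaves(tree)
    # The root cluster is shared by all trees on the same leaf set, so drop it.
    found.discard(root)
    return found


def rf_distance(tree_a, tree_b):
    """Count the clusters that occur in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


balanced = (("A", "B"), ("C", "D"))
swapped = (("A", "C"), ("B", "D"))
caterpillar = ((("A", "B"), "C"), "D")

print(rf_distance(balanced, balanced))
print(rf_distance(balanced, swapped))
print(rf_distance(balanced, caterpillar))
```

Because this count only registers whether a cluster is present or absent, it cannot weight contraction and pruning differently, which is the tuning ability the abstract attributes to the tree edit distance.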

3.

Background  

Co-expression network-based approaches have become popular in analyzing microarray data, for example in detecting functional gene modules. However, co-expression networks are often constructed by ad hoc methods, and network-based analyses have not been shown to outperform conventional cluster analyses, partly due to the lack of an unbiased evaluation metric.
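
To make the "ad hoc" construction concrete, here is a minimal sketch of one common approach: link two genes when the absolute Pearson correlation of their expression profiles exceeds a fixed cutoff. This is an illustrative assumption, not the method evaluated in the abstract. The function names, the toy profiles and the 0.8 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(profiles, threshold=0.8):
    """Return gene pairs whose absolute correlation reaches the (arbitrary) threshold."""
    return {
        (g, h)
        for g, h in combinations(sorted(profiles), 2)
        if abs(pearson(profiles[g], profiles[h])) >= threshold
    }


# Toy expression profiles, invented for illustration only.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, 1.0, -1.0],
}
edges = coexpression_edges(profiles)
print(edges)
```

The resulting network depends entirely on the chosen cutoff, which illustrates why an unbiased evaluation metric is needed to compare such constructions.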

4.
Independent of the platform and the analysis methods used, the result of a microarray experiment is, in most cases, a list of differentially expressed genes. An automatic ontological analysis approach has been recently proposed to help with the biological interpretation of such results. Currently, this approach is the de facto standard for the secondary analysis of high throughput experiments and a large number of tools have been developed for this purpose. We present a detailed comparison of 14 such tools using the following criteria: scope of the analysis, visualization capabilities, statistical model(s) used, correction for multiple comparisons, reference microarrays available, installation issues and sources of annotation data. This detailed analysis of the capabilities of these tools will help researchers choose the most appropriate tool for a given type of analysis. More importantly, in spite of the fact that this type of analysis has been generally adopted, this approach has several important intrinsic drawbacks. These drawbacks are associated with all tools discussed and represent conceptual limitations of the current state-of-the-art in ontological analysis. We propose these as challenges for the next generation of secondary data analysis tools.
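
The abstract lists "statistical model(s) used" and "correction for multiple comparisons" among its comparison criteria without naming specific models. As a hedged illustration, the sketch below uses one widely used choice, a one-sided hypergeometric test for over-representation of an ontology term in a gene list, followed by a Bonferroni correction. Whether any of the 14 tools works exactly this way is an assumption. The function names and the numbers are hypothetical.

```python
from math import comb


def hypergeom_tail(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the ontology term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / total


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical numbers: 10 genes on the array, 3 annotated with the term,
# 3 differentially expressed genes, all 3 of which carry the term.
p_term = hypergeom_tail(population=10, annotated=3, sample=3, hits=3)
adjusted = bonferroni([p_term, 0.5])
print(p_term, adjusted)
```

Here the population is the set of genes on the reference microarray, which is why the abstract treats the available reference arrays as a separate criterion: the same gene list yields a different p-value against a different background.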

5.
6.
7.
Luciferases have been widely utilized as sensitive reporters to monitor gene expression and protein-protein interactions. Compared to firefly luciferase (Fluc), a recently developed luciferase, Nanoluciferase (NanoLuc or Nluc), has several superior properties such as a smaller size and stronger luminescence activity. We compared the reporter properties of Nluc and Fluc in rice (Oryza sativa). In both plant-based two-hybrid and split luc complementation (SLC) assays, Nluc activity was detected with higher sensitivity and specificity than Fluc activity. To apply Nluc to research involving the photoperiodic regulation of flowering, we made a knock-in rice plant in which the Nluc coding region was inserted in-frame with the OsMADS15 gene, a target of the rice florigen Hd3a. Strong Nluc activity in response to Hd3a, and in response to a change in day length, was detected in rice protoplasts and in a single shoot apical meristem, respectively. Our results indicate that Nluc assay systems will be powerful tools to monitor gene expression and protein-protein interaction in plant research.

8.
9.
10.
MOTIVATION: An important application of microarray technology is to relate gene expression profiles to various clinical phenotypes of patients. Success has been demonstrated in molecular classification of cancer in which the gene expression data serve as predictors and different types of cancer serve as a categorical outcome variable. However, there has been less research in linking gene expression profiles to censored survival data such as patients' overall survival time or time to cancer relapse. It would be desirable to have models that combine good prediction accuracy with parsimony.
RESULTS: We propose to use L(1) penalized estimation for the Cox model to select genes that are relevant to patients' survival and to build a predictive model for future prediction. The computational difficulty associated with estimation in high-dimensional, low-sample-size settings can be handled efficiently by using the recently developed least-angle regression (LARS) method. Our simulation studies and application to real datasets on predicting survival after chemotherapy for patients with diffuse large B-cell lymphoma demonstrate that the proposed procedure, which we call the LARS-Cox procedure, can be used for identifying important genes that are related to time to death due to cancer and for building a parsimonious model for predicting the survival of future patients. The LARS-Cox regression gives better predictive performance than the L(2) penalized regression and a few other dimension-reduction based methods.
CONCLUSIONS: We conclude that the proposed LARS-Cox procedure can be very useful in identifying genes relevant to survival phenotypes and in building a parsimonious predictive model that can be used for classifying future patients into clinically relevant high- and low-risk groups based on the gene expression profile and survival times of previous patients.
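
For readers unfamiliar with the estimator, the L(1) penalized Cox problem is usually written as a constrained maximization of the partial log-likelihood. The notation below is an assumption, since the abstract does not give formulas: $x_i$ is the expression vector of patient $i$, $t_i$ the observed time, $\delta_i$ the event indicator, $R(t_i)$ the set of patients still at risk at $t_i$, and $s$ the tuning parameter; tied event times are ignored for simplicity.

```latex
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left( x_j^{\top}\beta \right) \right],
\qquad
\hat{\beta} = \arg\max_{\beta}\; \ell(\beta)
  \quad \text{subject to} \quad \sum_{j=1}^{p} \lvert \beta_j \rvert \le s .
```

The absolute-value constraint drives many coefficients exactly to zero, so the genes with nonzero $\hat{\beta}_j$ form the selected set; this is the source of the parsimony mentioned in the abstract. An L(2) penalty shrinks coefficients but does not set them to zero.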

11.
12.

Background

Reconstruction of protein-protein interaction or metabolic networks based on expression data often involves in silico predictions, while on the other hand, there are unspecific networks of in vivo interactions derived from knowledge bases. We analyze networks designed to come as close as possible to data measured in vivo, both with respect to the set of nodes which were taken to be expressed in experiment as well as with respect to the interactions between them which were taken from manually curated databases.

Results

A signaling network derived from the TRANSPATH database and a metabolic network derived from KEGG LIGAND are each filtered onto expression data from breast cancer (SAGE) considering different levels of restrictiveness in edge and vertex selection. We perform several validation steps; in particular, we define pathway over-representation tests based on refined null models to recover functional modules. The prominent role of the spindle checkpoint-related pathways in breast cancer is exhibited. High-ranking key nodes cluster in functional groups retrieved from literature. Results are consistent between several functional and topological analyses and between signaling and metabolic aspects.

Conclusions

This construction involved as a crucial step the passage to a mammalian protein identifier format as well as to a reaction-based semantics of metabolism. This yielded good connectivity but also led to the need to perform benchmark tests to exclude loss of essential information. Such validation, albeit tedious due to limitations of existing methods, turned out to be informative, and in particular provided biological insights as well as information on the degrees of coherence of the networks despite fragmentation of experimental data. Key node analysis exploited the networks for potentially interesting proteins in view of drug target prediction.

13.
This review describes the recently developed GeneChip technology that provides efficient access to genetic information using miniaturised, high-density arrays of DNA or oligonucleotide probes. Such microarrays are powerful tools to study the molecular basis of interactions on a scale that would be impossible using conventional analysis. The recent development of the microarray technology has greatly accelerated the investigation of gene regulation. Arrays are mostly used to identify which genes are turned on or off in a cell or tissue, and also to evaluate the extent of a gene's expression under various conditions. Indeed, this technology has been successfully applied to investigate the simultaneous expression of many thousands of genes and to detect mutations or polymorphisms, as well as to map and sequence them.

14.
An intricate network of interactions between organisms and their environment forms the ecosystems that sustain life on earth. With a detailed understanding of these interactions, ecologists and biologists can make better informed predictions about the ways different environmental factors will impact ecosystems. Despite the abundance of research data on biotic and abiotic interactions, no comprehensive and easily accessible data collection is available that spans taxonomic, geospatial, and temporal domains. Biotic-interaction datasets are effectively siloed, inhibiting cross-dataset comparisons. In order to pool resources and bring to light individual datasets, specialized research tools are needed to aggregate, normalize, and integrate existing datasets with standard taxonomies, ontologies, vocabularies, and structured data repositories. Global Biotic Interactions (GloBI) provides such tools by way of an open, community-driven infrastructure designed to lower the barrier for researchers to perform ecological systems analysis and modeling. GloBI provides a tool that (a) ingests, normalizes, and aggregates datasets, (b) integrates interoperable data with accepted ontologies (e.g., OBO Relations Ontology, Uberon, and Environment Ontology), vocabularies (e.g., Coastal and Marine Ecological Classification Standard), and taxonomies (e.g., Integrated Taxonomic Information System and National Center for Biotechnology Information Taxonomy Database), (c) makes data accessible through an application programming interface (API) and various data archives (Darwin Core, Turtle, and Neo4j), and (d) houses a data collection of about 700,000 species interactions across about 50,000 taxa, covering over 1100 references from 19 data sources. GloBI has taken an open-source and open-data approach in order to make integrated species-interaction data maximally accessible and to encourage users to provide feedback, contribute data, and improve data access methods. The GloBI collection of datasets is currently used in the Encyclopedia of Life (EOL) and Gulf of Mexico Species Interactions (GoMexSI).

15.
16.
Microbial gene expression in soil: methods, applications and challenges   Total citations: 10, self-citations: 0, citations by others: 10
About 99% of soil microorganisms are unculturable. However, advances in molecular biology techniques allow for the analysis of living microorganisms. With the advent of new technologies and the optimization of previous methods, various approaches to studying gene expression are expanding the fields of microbiology and molecular biology. RNA extraction, DNA microarrays, real-time PCR, competitive RT-PCR, stable isotope probing and the use of reporter genes provide means of detecting and quantifying gene expression. Through the use of these methods, researchers can study the influence of soil environmental factors such as nutrients, oxygen status, pH, pollutants, agro-chemicals, moisture and temperature on gene expression and some of the mechanisms involved in the responses of cells to their environment. This review will also address information gaps in bacterial gene expression in soil and possible future research to develop an understanding of microbial activities in soil environments.

17.
18.
19.
20.
A 365-bp fragment from the 5' region of the human transferrin receptor gene has been subcloned and sequenced. This fragment contains 115 bp of flanking sequence, the first exon, and a portion of the first intron. It contains a TATA box and several GC-rich regions, and is able to efficiently promote expression of the bacterial CAT gene in mouse 3T3 cells. Sequence comparisons demonstrate that this DNA segment has homology to the promoter regions of the human dihydrofolate reductase gene and the mouse interleukin 3 gene, as well as to a monkey DNA sequence that has homology to the SV40 origin and promotes expression of an unidentified gene product. Several high molecular mass proteins that interact with the transferrin receptor gene promoter have been identified. The activity of these proteins is transiently increased in 3T3 cells that have been stimulated by serum addition. This increase precedes a rise in transferrin receptor mRNA levels in the cytoplasm, which in turn precedes entry of the cells into S phase. DNase I footprinting of the transferrin receptor promoter reveals several protein binding sites. Two of the sites are within the conserved GC-rich region of the promoter. One of these binding sites probably interacts with Sp1, while the second interacts with an uncharacterized protein.

