Similar articles
 20 similar articles found (search time: 15 ms)
1.

Background

Biomedical ontologies are increasingly instrumental in the advancement of biological research primarily through their use to efficiently consolidate large amounts of data into structured, accessible sets. However, ontology development and usage can be hampered by the segregation of knowledge by domain that occurs due to independent development and use of the ontologies. The ability to infer data associated with one ontology to data associated with another ontology would prove useful in expanding information content and scope. We here focus on relating two ontologies: the Gene Ontology (GO), which encodes canonical gene function, and the Mammalian Phenotype Ontology (MP), which describes non-canonical phenotypes, using statistical methods to suggest GO functional annotations from existing MP phenotype annotations. This work is in contrast to previous studies that have focused on inferring gene function from phenotype primarily through lexical or semantic similarity measures.

Results

We have designed and tested a set of algorithms that represents a novel methodology to define rules for predicting gene function by examining the emergent structure and relationships between the gene functions and phenotypes rather than inspecting the terms semantically. The algorithms inspect relationships among multiple phenotype terms to deduce if there are cases where they all arise from a single gene function. We apply this methodology to data about genes in the laboratory mouse that are formally represented in the Mouse Genome Informatics (MGI) resource. From the data, 7444 rule instances were generated from five generalized rules, resulting in 4818 unique GO functional predictions for 1796 genes.

Conclusions

We show that our method is capable of inferring high-quality functional annotations from curated phenotype data. As well as creating inferred annotations, our method has the potential to allow for the elucidation of unforeseen, biologically significant associations between gene function and phenotypes that would be overlooked by a semantics-based approach. Future work will include the implementation of the described algorithms for a variety of other model organism databases, taking full advantage of the abundance of available high quality curated data.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0405-z) contains supplementary material, which is available to authorized users.

2.
3.
4.

Background

We have developed a high-throughput amplification method for generating robust gene expression profiles using single cell or low RNA inputs.

Methodology/Principal Findings

The method uses tagged priming and template-switching, resulting in the incorporation of universal PCR priming sites at both ends of the synthesized cDNA for global PCR amplification. Coupled with a whole-genome gene expression microarray platform, we routinely obtain expression correlation values of R² ∼ 0.76–0.80 between individual cells and R² ∼ 0.69 between 50 pg total RNA replicates. Expression profiles generated from single cells or 50 pg total RNA correlate well with those generated with higher input (1 ng total RNA) (R² ∼ 0.80). Also, the assay is sufficiently sensitive to detect, in a single cell, approximately 63% of the number of genes detected with 1 ng input, with approximately 97% of the genes detected in the single-cell input also detected in the higher input.

Conclusions/Significance

In summary, our method facilitates whole-genome gene expression profiling in contexts where starting material is extremely limiting, particularly in areas such as the study of progenitor cells in early development and tumor stem cell biology.
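The two kinds of figures quoted in this abstract (R² between expression profiles, and the share of genes detected at low versus high input) can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function names `r_squared` and `detection_overlap` are hypothetical, R² is taken here to be the squared Pearson correlation, and the gene sets are toy values rather than data from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return (cov * cov) / (var_x * var_y)


def detection_overlap(low_input, high_input):
    """Compare two sets of detected gene IDs.

    Returns (count_ratio, shared_fraction):
      count_ratio     = genes detected at low input / genes detected at high input
      shared_fraction = fraction of low-input genes that are also seen at high input
    """
    shared = low_input & high_input
    return len(low_input) / len(high_input), len(shared) / len(low_input)


# Toy example (made-up gene IDs, not data from the abstract).
single_cell = {"g1", "g2"}
one_nanogram = {"g1", "g2", "g3", "g4"}
count_ratio, shared_fraction = detection_overlap(single_cell, one_nanogram)
```

In the abstract's terms, `count_ratio` corresponds to the "approximately 63%" figure and `shared_fraction` to the "approximately 97%" figure.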

5.
6.
We systematically analyzed the relationships between gene fitness profiles (co-fitness) and drug inhibition profiles (co-inhibition) from several hundred chemogenomic screens in yeast. Co-fitness predicted gene functions distinct from those derived from other assays and identified conditionally dependent protein complexes. Co-inhibitory compounds were weakly correlated by structure and therapeutic class. We developed an algorithm predicting protein targets of chemical compounds and verified its accuracy with experimental testing. Fitness data provide a novel, systems-level perspective on the cell.
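A minimal sketch of what a co-fitness score might look like follows. The abstract does not say how co-fitness is computed, so the use of Pearson correlation across screens is an assumption, and the names `pearson`, `cofitness`, and the toy gene profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length fitness profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cofitness(profiles):
    """Score every gene pair by the similarity of their fitness profiles.

    profiles: dict mapping gene name -> list of fitness scores, one per screen.
    Returns a dict keyed by (gene1, gene2) with gene1 < gene2 alphabetically.
    """
    return {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Toy profiles over three screens (assumed values, for illustration only).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [2.0, 4.0, 6.0],
    "geneC": [3.0, 2.0, 1.0],
}
scores = cofitness(toy_profiles)
```

The same pairwise scoring could be applied to compound inhibition profiles to obtain a co-inhibition score, again as an assumption about the metric.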

7.
8.
9.

Background

Genetically modified organisms (GMOs) have numerous biomedical, agricultural and environmental applications. Development of accurate methods for the detection of GMOs is a prerequisite for the identification and control of authorized and unauthorized release of these engineered organisms into the environment and into the food chain. Current detection methods are unable to detect uncharacterized GMOs, since either the DNA sequence of the transgene or the amino acid sequence of the protein must be known for DNA-based or immunological-based detection, respectively.

Methods

Here we describe the application of an epigenetics-based approach for the detection of mammalian GMOs via analysis of chromatin structural changes occurring in the host nucleus upon the insertion of foreign or endogenous DNA.

Results

Immunological methods combined with DNA next generation sequencing enabled direct interrogation of chromatin structure and identification of insertions of various size foreign (human or viral) DNA sequences, DNA sequences often used as genome modification tools (e.g. viral sequences, transposon elements), or endogenous DNA sequences into the nuclear genome of a model animal organism.

Conclusions

The results provide a proof-of-concept that epigenetic approaches can be used to detect the insertion of endogenous and exogenous sequences into the genome of higher organisms where the method of genetic modification, the sequence of inserted DNA, and the exact genomic insertion site(s) are unknown.

General significance

Measurement of chromatin dynamics as a sensor for detection of genomic manipulation and, more broadly, organism exposure to environmental or other factors affecting the epigenomic landscape are discussed.

10.
11.
In higher plants, many extracellular proteins are involved in developmental processes, including cell-cell signaling and cell wall construction. Xylogen is an extracellular arabinogalactan protein (AGP) isolated from Zinnia elegans xylogenic culture medium, which promotes xylem cell differentiation. Xylogen has a unique structure, containing a non-specific lipid transfer protein (nsLTP) domain and AGP domains. We searched for xylogen-type genes in the genomes of land plants, including Arabidopsis thaliana, to further our knowledge of xylogen-type genes as functional extracellular proteins in plants. We found that many xylogen-type genes, including 13 Arabidopsis genes, comprise a gene family in land plants, including Populus trichocarpa, Vitis vinifera, Lotus japonicus, Oryza sativa, Selaginella moellendorffii and Physcomitrella patens. The genes shared an N-terminal signal peptide sequence, a distinct nsLTP domain, one or more AGP domains and a glycosylphosphatidylinositol (GPI)-anchored sequence. We analyzed transgenic plants harboring promoter::GUS (β-glucuronidase) constructs to test expression of the 13 Arabidopsis xylogen-type genes, and detected a diversity of gene family members with related expression patterns. AtXYP2 was the best candidate as the Arabidopsis counterpart of the Zinnia xylogen gene. We observed two distinct expression patterns for several genes, with some anther specific and others preferentially expressed in the endodermis/pericycle. We conclude that xylogen-type genes, which may have diverse functions, form a novel chimeric AGP gene family with a distinct nsLTP domain.

12.
13.
14.
15.

Background  

Gene expression is a two-step synthesis process that ends with the necessary amount of each protein required to perform its function. Since the protein is the final product, the main focus of gene regulation should be centered on it. However, because mRNA is an intermediate step and the amounts of both mRNA and protein are controlled by their synthesis and degradation rates, the desired amount of protein can be achieved following different strategies.

16.

Background  

Single Nucleotide Polymorphism (SNP) analysis only captures a small proportion of associated genetic variants in Genome-Wide Association Studies (GWAS) partly due to small marginal effects. Pathway level analysis incorporating prior biological information offers another way to analyze GWAS of complex diseases, and promises to reveal the mechanisms leading to complex diseases. Biologically defined pathways are typically comprised of numerous genes. If only a subset of genes in the pathways is associated with disease then a joint analysis including all individual genes would result in a loss of power. To address this issue, we propose a pathway-based method that allows us to test for joint effects by using a pre-selected gene subset. In the proposed approach, each gene is considered as the basic unit, which reduces the number of genetic variants considered and hence reduces the degrees of freedom in the joint analysis. The proposed approach also can be used to investigate the joint effect of several genes in a candidate gene study.

17.
18.

Background

Microarray gene expression data are accumulating in public databases. The expression profiles contain valuable information for understanding human gene expression patterns. However, the effective use of public microarray data requires integrating the expression profiles from heterogeneous sources.

Results

In this study, we have compiled a compendium of microarray expression profiles of various human tissue samples. The microarray raw data generated in different research laboratories have been obtained and combined into a single dataset after data normalization and transformation. To demonstrate the usefulness of the integrated microarray data for studying human gene expression patterns, we have analyzed the dataset to identify potential tissue-selective genes. A new method has been proposed for genome-wide identification of tissue-selective gene targets using both microarray intensity values and detection calls. The candidate genes for brain, liver and testis-selective expression have been examined, and the results suggest that our approach can select some interesting gene targets for further experimental studies.

Conclusion

A computational approach has been developed in this study for combining microarray expression profiles from heterogeneous sources. The integrated microarray data can be used to investigate tissue-selective expression patterns of human genes.
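As a rough illustration of combining intensity values with detection calls, the sketch below flags a gene as tissue-selective when it is called present in exactly one tissue and its intensity there exceeds every other tissue by a fold threshold. The abstract does not give the actual selection criteria, so the rule, the `"P"`/`"A"` call labels, the `min_fold` parameter, and the function name `tissue_selective` are all assumptions made for this example.

```python
def tissue_selective(intensity, calls, min_fold=4.0):
    """Return {gene: tissue} for genes that look selective for a single tissue.

    intensity: gene -> {tissue: normalized intensity value}
    calls:     gene -> {tissue: "P" (present) or "A" (absent)}
    min_fold:  hypothetical fold-change threshold over the next-highest tissue
    """
    selected = {}
    for gene, by_tissue in intensity.items():
        present = [t for t, call in calls[gene].items() if call == "P"]
        if len(present) != 1:
            continue  # detected in no tissue, or in more than one
        tissue = present[0]
        others = [v for t, v in by_tissue.items() if t != tissue]
        if others and by_tissue[tissue] >= min_fold * max(others):
            selected[gene] = tissue
    return selected


# Toy data (made-up values, not from the compendium described above).
toy_intensity = {
    "g1": {"brain": 800.0, "liver": 50.0, "testis": 40.0},
    "g2": {"brain": 300.0, "liver": 280.0, "testis": 310.0},
    "g3": {"brain": 60.0, "liver": 900.0, "testis": 500.0},
}
toy_calls = {
    "g1": {"brain": "P", "liver": "A", "testis": "A"},
    "g2": {"brain": "P", "liver": "P", "testis": "P"},
    "g3": {"brain": "A", "liver": "P", "testis": "A"},
}
candidates = tissue_selective(toy_intensity, toy_calls)
```

Requiring both conditions is one way to keep a gene from being selected on intensity alone when its signal is not reliably detected.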

19.
Analysis of the fate of retrovirally transduced cells after transplantation is often hampered by the scarcity of available DNA. We evaluated a promising method for whole-genome amplification, called multiple displacement amplification (MDA), with respect to even and accurate representation of retrovirally transduced genomic DNA. We proved that MDA is a suitable method to subsequently quantify engraftment efficiencies by quantitative real-time PCR by analyzing retrovirally transduced DNA in a background of untransduced DNA and retroviral integrations found in primary material from a retroviral transplantation model. The portion of these retroviral integrations in the amplified samples was 1.02-fold (range 0.2- to 2.1-fold) the portion determined in the original genomic DNA. Integration site analysis by ligation-mediated PCR (LM-PCR) is essential for the detection of retroviral integrations. The combination of MDA and LM-PCR showed an increase in the sensitivity of integration site analysis, as a specific integration site could be detected in a background of untransduced DNA, while the transduced DNA made up only 0.001%. These results show for the first time that MDA enables large-scale sensitive detection and reliable quantification of retrovirally transduced human genomic DNA and therefore facilitates follow-up analysis in gene therapy studies even from the smallest amounts of starting material.

20.
The constant bombardment of mammalian genomes by transposable elements (TEs) has resulted in TEs comprising at least 45% of the human genome. Because of their great age and abundance, TEs are important in comparative phylogenomics. However, estimates of TE age were previously based on divergence from derived consensus sequences or phylogenetic analysis, which can be unreliable, especially for older more diverged elements. Therefore, a novel genome-wide analysis of TE organization and fragmentation was performed to estimate TE age independently of sequence composition and divergence or the assumption of a constant molecular clock. Analysis of TEs in the human genome revealed approximately 600,000 examples where TEs have transposed into and fragmented other TEs, covering >40% of all TEs or approximately 542 Mbp of genomic sequence. The relative age of these TEs over evolutionary time is implicit in their organization, because newer TEs have necessarily transposed into older TEs that were already present. A matrix of the number of times that each TE has transposed into every other TE was constructed, and a novel objective function was developed that derived the chronological order and relative ages of human TEs spanning >100 million years. This method has been used to infer the relative ages across all four major TE classes, including the oldest, most diverged elements. Analysis of DNA transposons over the history of the human genome has revealed the early activity of some MER2 transposons, and the relatively recent activity of MER1 transposons during primate lineages. The TEs from six additional mammalian genomes were defragmented and analyzed. Pairwise comparison of the independent chronological orders of TEs in these mammalian genomes revealed species phylogeny, the fact that transposons shared between genomes are older than species-specific transposons, and a subset of TEs that were potentially active during periods of speciation.
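The idea that insertion counts imply a chronological order can be sketched as follows. The abstract does not describe its objective function, so this example substitutes a simple stand-in: it counts the insertions that contradict a proposed oldest-to-youngest order and picks the order with the fewest contradictions by brute force. The family names `A`, `B`, `C`, the counts, and the function names are hypothetical, and exhaustive search is only practical for a handful of families.

```python
from itertools import permutations


def inconsistency(order, counts):
    """Count insertions that contradict `order` (listed oldest to youngest).

    counts maps (inserted, target) -> number of times family `inserted`
    was found inside family `target`. An insertion is contradictory when
    the inserted family is ranked older than the family it landed in.
    """
    rank = {te: i for i, te in enumerate(order)}
    return sum(
        n for (inserted, target), n in counts.items()
        if rank[inserted] < rank[target]
    )


def best_order(counts):
    """Brute-force the oldest-to-youngest order with the fewest contradictions."""
    families = sorted({te for pair in counts for te in pair})
    return min(permutations(families), key=lambda order: inconsistency(order, counts))


# Toy insertion matrix (made-up counts): C mostly inserts into A and B,
# and B mostly inserts into A, so A should come out oldest and C youngest.
toy_counts = {
    ("C", "A"): 40,
    ("C", "B"): 25,
    ("B", "A"): 10,
    ("A", "B"): 3,
    ("B", "C"): 1,
}
order = best_order(toy_counts)
```

A few contradictory counts remain even in the best order, which is expected when families were active over overlapping periods.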
