Similar articles
 20 similar articles found (search time: 812 ms)
1.
Lyu Chuqiao, Wang Lei, Zhang Juhua. BMC Genomics, 2018, 19(10): 905-165

Background

DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method for identifying DHSs can improve our understanding of chromatin accessibility. Despite the many resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.

Methods

Here, we address this challenge with a state-of-the-art machine learning approach. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism to capture multiple patterns and long-term associations in DNA sequences, and we use it to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.

Results

Our method achieves an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.

Conclusions

Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.

2.

Background

In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families.

Aim

Building on the attribution of gene sequences to clusters of orthologous groups of proteins (COGs), we aimed to provide a method for visualization and comparative analysis of the genomic neighborhoods of evolutionarily related genes, together with a corresponding web server.

Results

Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002).

Reviewers

This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.

3.

Background

Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.

Results

We applied our method to several sets of co-expressed genes and were able to define subsets with enrichment in particular biological processes and specific upstream regulatory motifs.

Conclusions

These results show the potential of our technique for functional prediction and regulatory motif identification from microarray data.

4.
5.
Comparison of the canine and human olfactory receptor gene repertoires   Total citations: 2, self-citations: 1, citations by others: 1

Background

Olfactory receptors (ORs), the first dedicated molecules with which odorants physically interact to arouse an olfactory sensation, constitute the largest gene family in vertebrates, including around 900 genes in human and 1,500 in the mouse. Whereas dogs, like many other mammals, have a much keener olfactory potential than humans, only 21 canine OR genes have been described to date.

Results

In this study, 817 novel canine OR sequences were identified, and 640 have been characterized. Of the 661 characterized OR sequences, representing half of the canine repertoire, 18% are predicted to be pseudogenes, compared with 63% in human and 20% in mouse. Phylogenetic analysis of 403 canine OR sequences identified 51 families, and radiation-hybrid mapping of 562 showed that they are distributed on 24 dog chromosomes, in 37 distinct regions. Most of these regions constitute clusters of 2 to 124 closely linked genes. The two largest clusters (124 and 109 OR genes) are located on canine chromosomes 18 and 21. They are orthologous to human clusters located on human chromosomes 11q11-q13 and HSA11p15, containing 174 and 115 ORs respectively.

Conclusions

This study shows a strongly conserved genomic distribution of OR genes between dog and human, suggesting that OR genes evolved from a common mammalian ancestral repertoire by successive duplications. In addition, the dog repertoire appears to have expanded relative to that of humans, leading to the emergence of specific canine OR genes.

6.

Background

Human cancers are complex ecosystems composed of cells with distinct molecular signatures. Such intratumoral heterogeneity poses a major challenge to cancer diagnosis and treatment. Recent advances in single-cell techniques such as scRNA-seq have brought unprecedented insights into cellular heterogeneity. A resulting computational challenge is to cluster high-dimensional, noisy datasets that contain substantially fewer cells than genes.

Methods

In this paper, we introduce a consensus clustering framework, conCluster, for cancer subtype identification from single-cell RNA-seq data. Using an ensemble strategy, conCluster fuses multiple basic partitions into consensus clusters.
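To illustrate what fusing basic partitions into consensus clusters can mean, here is a minimal sketch of one common ensemble scheme, a co-association matrix followed by thresholded linking. The abstract does not describe conCluster's actual consensus function, so the function names, the 0.5 threshold and the toy partitions below are all assumptions made for illustration.

```python
def co_association(partitions):
    """Fraction of base partitions in which each pair of cells shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Link cells co-clustered in more than `threshold` of the partitions,
    then label each connected component as one consensus cluster."""
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over strongly co-clustered cells
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base partitions of six cells (label names are arbitrary).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
consensus = consensus_clusters(partitions)
```

The co-association step makes the result independent of how each base partition happens to name its clusters, which is why the first two toy partitions count as full agreement.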

Results

Applied to real cancer scRNA-seq datasets, conCluster can more accurately detect cancer subtypes than the widely used scRNA-seq clustering methods. Further, we conducted co-expression network analysis for the identified melanoma subtypes.

Conclusions

Our analysis demonstrates that these subtypes exhibit distinct gene co-expression networks and significant gene sets with different functional enrichment.

7.

Background and aims

Microalgae are ubiquitous in paddy soils. However, their roles in arsenic (As) accumulation and transport in rice plants remain unknown.

Methods

Two green algae and five cyanobacteria were used in pot experiments under continuously flooded conditions to ascertain whether a microalgal inoculation could influence rice growth and rice grain As accumulation in plants grown in As-contaminated soils.

Results

The microalgal inoculation greatly enhanced nutrient uptake and rice growth. The presence of the representative microalga Anabaena azotica did not significantly alter the grain inorganic As concentrations but markedly decreased the rice root and grain DMA concentrations. The translocation of As from roots to grains was also markedly decreased in rice inoculated with A. azotica. This subsequently led to a decrease in the total As concentration in rice grains.

Conclusions

The results of the study indicate that the microalgal inoculation had a strong influence on soil pH, soil As speciation, and soil nutrient bioavailability, which significantly affected the rice growth, nutrient uptake, and As accumulation and translocation in rice plants. The results suggest that algae inoculation can be an effective strategy for improving nutrient uptake and reducing As translocation from roots to grains by rice grown in As-contaminated paddy soils.

8.
Xu M, Zhu M, Zhang L. BMC Genomics, 2008, 9(Z2): S18

Background

Microarray technology is often used to identify the genes that are differentially expressed between two biological conditions. On the other hand, since microarray datasets contain a small number of samples and a large number of genes, it is usually desirable to identify small gene subsets with distinct patterns between sample classes. Such gene subsets are highly discriminative in phenotype classification because of their tightly coupled features. Unfortunately, classifiers identified in this way tend to generalize poorly to test samples because of overfitting.

Results

We propose a novel approach that combines supervised and unsupervised learning techniques to generate increasingly discriminative gene clusters in an iterative manner. Our experiments on both simulated and real datasets show that our method can produce a series of robust gene clusters with good classification performance compared with existing approaches.

Conclusion

This backward approach for refining a series of highly discriminative gene clusters for classification purposes proves to be very consistent and stable when applied to various types of training samples.

9.

Introduction

Collecting feces is easy, and it offers direct access to endogenous and microbial metabolites.

Objectives

Given the lack of consensus on fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

10.

Introduction

Aromatic rices are culturally and economically important for many countries in Asia. Investigation of the volatile compounds emitted by rice during cooking is the key to understanding the flavour of elite aromatic rice varieties.

Objectives

The objectives of this study were to compare Jasmine-type aromatic rices from the Greater Mekong Subregion and Australia in terms of their metabolomics and sensory profiles and to draw out associations between the volatile organic compounds and human sensory perception of rice aroma.

Methods

A set of aromatic rice varieties from South East Asia and Australia, along with non-aromatic controls, was grown in tropical and temperate areas of Australia. Untargeted metabolite profiling of volatile compounds, from the heated rice flour, by static headspace extraction and separation by two dimensional gas chromatography time-of-flight mass spectrometry was performed. Volatile compounds were also assayed in the standard references used in the sensory evaluation and compared to the compounds detected in the headspace of rice.

Results

While 2-acetyl-1-pyrroline (2-AP) was a discriminating compound, we identified several of its structural homologues, and a number of other metabolites that were consistently detected in fragrant Jasmine rice. 2-AP producing rice varieties have different sensory properties and these variations were defined by the discriminating compounds identified in each rice type.

Conclusions

The results of this study are valuable in understanding the aspects of aromatic rice that are important to consumers, and in the identification of compounds that breeding programs can use to select for pleasant aromas, enabling breeding programs to target markets with greater accuracy.

11.

Background

Most genes in Arabidopsis thaliana are members of gene families. How do the members of gene families arise, and how are gene family copy numbers maintained? Some gene families may evolve primarily through tandem duplication and high rates of birth and death in clusters, and others through infrequent polyploidy or large-scale segmental duplications and subsequent losses.

Results

Our approach to understanding the mechanisms of gene family evolution was to construct phylogenies for 50 large gene families in Arabidopsis thaliana, identify large internal segmental duplications in Arabidopsis, map gene duplications onto the segmental duplications, and use this information to identify which nodes in each phylogeny arose due to segmental or tandem duplication. Examples of six gene families exemplifying characteristic modes are described. Distributions of gene family sizes and patterns of duplication by genomic distance are also described in order to characterize patterns of local duplication and copy number for large gene families. Both gene family size and duplication by distance closely follow power-law distributions.

Conclusions

Combining information about genomic segmental duplications, gene family phylogenies, and gene positions provides a method to evaluate contributions of tandem duplication and segmental genome duplication in the generation and maintenance of gene families. These differences appear to correspond meaningfully to differences in functional roles of the members of the gene families.

12.

Aims

This study aimed to determine the capacity of Si to mitigate Al toxicity in upland rice plants (Oryza sativa L.) by evaluating plant growth and the Si and Al uptake kinetics.

Methods

Plants were grown for 40 days, after which the Si and Al uptake kinetics (Cmin, Km and Imax) were analyzed. Then, the shoots and roots were separated, and the dry matter, root morphology and Si and Al concentration and accumulation in the plant were evaluated.
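For readers unfamiliar with the three kinetic parameters, the sketch below uses the modified Michaelis-Menten form often applied to root ion uptake, in which net influx is zero at Cmin, half-maximal at Cmin + Km, and saturates towards Imax. The abstract does not state which model the authors fitted, so this equation, the function name `net_influx` and the numeric values are assumptions for illustration only.

```python
def net_influx(c, imax, km, cmin):
    """Net uptake rate at external concentration c.

    imax: maximum uptake rate
    km:   concentration above cmin at which uptake is half of imax
    cmin: concentration at which net uptake is zero
    """
    return imax * (c - cmin) / (km + c - cmin)


# Hypothetical parameter values in arbitrary units.
imax, km, cmin = 10.0, 2.0, 1.0
rate_at_cmin = net_influx(cmin, imax, km, cmin)       # no net uptake at Cmin
rate_at_half = net_influx(cmin + km, imax, km, cmin)  # half of Imax
```

Under this form, a higher Km or a higher Cmin lowers uptake at a given external concentration, while Imax sets the ceiling.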

Results

Aluminum decreased plant growth and the Si uptake capacity by decreasing the root growth and Si transport system efficiency in the upland rice roots (> Km and > Cmin). Silicon mitigated Al toxicity in the upland rice plants by decreasing Al transport to the plant shoots, although it did not reduce the Al uptake rate (Imax). Si treatment increased the growth of upland rice plant shoots grown in the presence of Al without influencing the root growth. The alleviation of Al toxicity by Si is more evident in the susceptible upland rice cultivar Maravilha.

Conclusions

Silicon mitigated Al toxicity in the upland rice plants by decreasing Al transport to the plant shoots but did not reduce the Al uptake rate by roots.

13.

Background

Bacterial genomes develop new mechanisms to tide them over the imposing conditions they encounter during the course of their evolution. Acquisition of new genes by lateral gene transfer may be one of the dominant ways of adaptation in bacterial genome evolution. Lateral gene transfer provides the bacterial genome with a new set of genes that help it to explore and adapt to new ecological niches.

Methods

A maximum likelihood analysis was done on the five sequenced corynebacterial genomes to model the rates of gene insertions/deletions at various depths of the phylogeny.

Results

The study shows that most of the laterally acquired genes are transient and the inferred rates of gene movement are higher on the external branches of the phylogeny and decrease as the phylogenetic depth increases. The newly acquired genes are under relaxed selection and evolve faster than their older counterparts. Analysis of some of the functionally characterised LGTs in each species has indicated that they may have a possible adaptive role.

Conclusion

The five corynebacterial genomes sequenced to date have evolved by acquiring between 8 and 14% of their genomes by LGT, and some of these genes may have a role in adaptation.

14.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

15.
16.

Background

Protein synthetic lethal genetic interactions are useful to define functional relationships between proteins and pathways. However, the molecular mechanism of synthetic lethal genetic interactions remains unclear.

Results

In this study we used clusters of short polypeptide sequences, which are typically shorter than classically defined protein domains, to characterize the functionalities of proteins. We developed a framework to identify significant short polypeptide clusters from yeast protein sequences, and then used these clusters as features to predict yeast synthetic lethal genetic interactions. The short-polypeptide-cluster-based approach provides much higher coverage for predicting yeast synthetic lethal genetic interactions. Evaluation on experimental data sets showed that it is superior to the previous protein-domain-based approach.

Conclusion

We achieved higher performance in predicting yeast synthetic lethal genetic interactions by using short polypeptide clusters as features. Our study suggests that short polypeptide clusters may help to better understand the functionalities of proteins.

17.

Background

Bacterial carbohydrate metabolism is extremely diverse, since carbohydrates serve as a major energy source and are involved in a variety of cellular processes. Bacterial genes belonging to the same metabolic pathway are often co-localized in the chromosome, but this is not a strict rule. Gene co-localization is linked to co-evolution and co-regulation. This study focuses on a large-scale analysis of bacterial genomic loci related to carbohydrate metabolism.

Results

We demonstrate that only 53% of 148,000 studied genes from over six hundred bacterial genomes are co-localized with other carbohydrate metabolism genes, which points to a significant role of singleton genes. Co-localized genes form cassettes, ranging in size from two to fifteen genes. Two major factors influencing the cassette-forming tendency are gene function and bacterial phylogeny. We have obtained a comprehensive picture of co-localization preferences of genes for nineteen major carbohydrate metabolism functional classes, over two hundred gene orthologous clusters, and thirty bacterial classes, and characterized the cassette variety in size and content among different species, highlighting a significant role of short cassettes. The preference towards co-localization of carbohydrate metabolism genes varies between 40 and 76% for bacterial taxa. Analysis of frequently co-localized genes yielded forty-five significant pairwise links between genes belonging to different functional classes. The number of such links per class ranges from zero to eight, demonstrating varying preferences of the respective genes towards a specific chromosomal neighborhood. Genes from eleven functional classes tend to co-localize with genes from the same class, indicating an important role of clustering of genes with similar functions. Moreover, in most cases such co-localization does not originate from local duplication events.

Conclusions

Overall, we describe a complex web formed by evolutionary relationships of bacterial carbohydrate metabolism genes, manifested as co-localization patterns.

Reviewers

This article was reviewed by Daria V. Dibrova (A.N. Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia), nominated by Armen Mulkidjanian (University of Osnabrück, Germany), Igor Rogozin (NCBI, NLM, NIH, USA) and Yuri Wolf (NCBI, NLM, NIH, USA).

18.
19.

Objectives

To characterize biomarkers that underlie osteosarcoma (OS) metastasis based on an ego-network.

Results

From the microarray data, we obtained 13,326 genes. By combining PPI data and microarray data, 10,520 shared genes were found and constructed into ego-networks. 17 significant ego-networks were identified with p < 0.05. In the pathway enrichment analysis, seven ego-networks were identified with the most significant pathway.
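As a reminder of what an ego-network is, the sketch below extracts one from an edge list: the chosen ego gene, its direct interaction partners, and every edge among those nodes. It does not reproduce the study's scoring or significance testing; the function name `ego_network` and the gene labels A to E are hypothetical placeholders.

```python
def ego_network(edges, ego):
    """Return (nodes, edges) of the one-step ego-network around `ego`.

    edges: iterable of undirected (gene_a, gene_b) interaction pairs
    """
    edges = list(edges)
    nodes = {ego}
    for a, b in edges:
        if a == ego:
            nodes.add(b)
        elif b == ego:
            nodes.add(a)
    # Keep only interactions whose two endpoints both lie inside the ego-network.
    kept = [(a, b) for a, b in edges if a in nodes and b in nodes]
    return nodes, kept


# Toy protein-protein interaction list with placeholder gene names.
ppi = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
nodes, kept = ego_network(ppi, "A")
```

In this toy graph the ego-network of A contains B and C plus the B-C interaction, while D and E are excluded because they are more than one step away.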

Conclusions

These significant ego-modules were potential biomarkers that reveal the potential mechanisms in OS metastasis, which may contribute to understanding cancer prognoses and providing new perspectives in the treatment of cancer.

20.

Background

In recent years, the visualization of biomagnetic measurement data by so-called pseudo current density maps or Hosaka-Cohen (HC) transformations has become popular.

Methods

The physical basis of these intuitive maps is clarified by means of analytically solvable problems.
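The HC transformation is commonly written as c = (∂Bz/∂y) e_x − (∂Bz/∂x) e_y, where Bz is the measured field component normal to the sensor plane. The sketch below evaluates that expression with finite differences on a regular grid. It assumes this common definition rather than anything stated in the abstract, and the function name `hosaka_cohen`, the grid layout and the spacings are illustrative choices.

```python
def hosaka_cohen(bz, dx=1.0, dy=1.0):
    """Pseudo current density components (cx, cy) from a grid of B_z values.

    bz[row][col]: rows index y, columns index x; needs at least a 2 x 2 grid.
    Central differences are used in the interior, one-sided ones at the edges.
    """
    ny, nx = len(bz), len(bz[0])

    def d_dx(r, c):
        lo, hi = max(c - 1, 0), min(c + 1, nx - 1)
        return (bz[r][hi] - bz[r][lo]) / ((hi - lo) * dx)

    def d_dy(r, c):
        lo, hi = max(r - 1, 0), min(r + 1, ny - 1)
        return (bz[hi][c] - bz[lo][c]) / ((hi - lo) * dy)

    cx = [[d_dy(r, c) for c in range(nx)] for r in range(ny)]   # +dBz/dy
    cy = [[-d_dx(r, c) for c in range(nx)] for r in range(ny)]  # -dBz/dx
    return cx, cy


# Toy field that rises linearly along x: Bz = 3 * x on a 3 x 4 grid.
field = [[3.0 * col for col in range(4)] for _ in range(3)]
cx, cy = hosaka_cohen(field)
```

For this toy field the arrows point uniformly in the negative y direction, perpendicular to the field gradient, which is what gives the maps their current-like appearance.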

Results

Examples in magnetocardiography, magnetoencephalography and magnetoneurography demonstrate the usefulness of this method.

Conclusion

Hardware realizations of the HC-transformation and some similar transformations are discussed which could advantageously support cross-platform comparability of biomagnetic measurements.
