Found 20 similar articles; search took 15 ms
1.
Background
Identifying protein complexes plays an important role in understanding cellular organization and functional mechanisms. Because ample evidence indicates that dense sub-networks in dynamic protein-protein interaction networks (DPINs) usually correspond to protein complexes, identifying protein complexes is formulated as a density-based clustering problem.
Methods
In this paper, a new approach named iOPTICS-GSO is developed. It is an improved Ordering Points to Identify the Clustering Structure (OPTICS) algorithm that uses the glowworm swarm optimization (GSO) algorithm to optimize the OPTICS parameters when finding dense sub-networks. In iOPTICS-GSO, the concept of a core node is redefined, the Euclidean distance in OPTICS is replaced with an improved similarity between nodes in the PPI network based on their interaction strength, and dense sub-networks are considered to be protein complexes.
Results
The experimental results show that iOPTICS-GSO outperforms algorithms such as DBSCAN, CFinder, MCODE, CMC, COACH, ClusterOne, MCL and OPTICS_PSO in terms of f-measure and p-value on four DPINs, which are from the DIP, Krogan, MIPS and Gavin datasets. In addition, the predicted protein complexes have small p-values and are thus highly likely to be true protein complexes.
Conclusion
The proposed iOPTICS-GSO obtains optimal clustering results by adopting the GSO algorithm to optimize the parameters in OPTICS, and the results on four datasets show superior performance. Moreover, the results provide clues for biologists to verify and find new protein complexes.
2.
3.
Chunyu Hou Yuan Li Huiqin Liu Mengjiao Dang Guoxuan Qin Ning Zhang Ruibing Chen 《Proteome science》2018,16(1):5
Background
Protein kinase C ζ (PKCζ), an isoform of the atypical protein kinase C, is a pivotal regulator in cancer. However, the molecular and cellular mechanisms whereby PKCζ regulates tumorigenesis and metastasis are still not fully understood. In this study, proteomics and bioinformatics analyses were performed to establish a protein-protein interaction (PPI) network associated with PKCζ, laying a stepping stone to further understand the diverse biological roles of PKCζ.
Methods
Protein complexes associated with PKCζ were purified by co-immunoprecipitation from the breast cancer cell line MDA-MB-231 and identified by LC-MS/MS. Two biological replicates and two technical replicates were analyzed. The observed proteins were filtered using the CRAPome database to eliminate potential false positives. The proteomics identification results were combined with a PPI database search to construct the interactome network. Gene ontology (GO) and pathway analyses were performed using the PANTHER database and DAVID. Next, the interaction between PKCζ and protein phosphatase 2 catalytic subunit alpha (PPP2CA) was validated by co-immunoprecipitation, Western blotting and immunofluorescence. Furthermore, the TCGA database and the COSMIC database were used to analyze the expression of these two proteins in clinical samples.
Results
A PKCζ-centered PPI network containing 178 nodes and 1225 connections was built. Network analysis showed that the identified proteins were significantly associated with several key signaling pathways regulating cancer-related cellular processes.
Conclusions
By combining proteomics and bioinformatics analyses, a PKCζ-centered PPI network was constructed, providing a more complete picture of the biological roles of PKCζ in both cancer regulation and other aspects of cellular biology.
4.
Background
Most phylogenetic studies using molecular data treat gaps in multiple sequence alignments as missing data or even completely exclude alignment columns that contain gaps.
Results
Here we show that gap patterns in large-scale, genome-wide alignments are themselves phylogenetically informative and can be used to infer reliable phylogenies, provided the gap data are properly filtered to reduce noise introduced by the alignment method. We introduce the notion of split-inducing indels (splids) that define an approximate bipartition of the taxon set. We show both in simulated data and in case studies on real-life data that splids can be efficiently extracted from phylogenomic data sets.
Conclusions
Suitably processed gap patterns extracted from genome-wide alignments provide a surprisingly clear phylogenetic signal and allow the inference of accurate phylogenetic trees.
5.
Zexi Cai Trine Michelle Villumsen Torben Asp Bernt Guldbrandtsen Goutam Sahana Mogens Sandø Lund 《BMC genetics》2018,19(1):103
Background
Identification of genes underlying production traits is a key aim of the mink research community. The recent availability of genomic tools has opened the possibility of faster genetic progress in mink breeding. The availability of a mink genome assembly allows genome-wide association studies in mink.
Results
In this study, we used genotyping-by-sequencing to obtain single nucleotide polymorphism (SNP) genotypes of 2496 mink. After multiple rounds of filtering, we retained 28,336 high-quality SNPs and 2352 individuals for a genome-wide association study (GWAS). We performed the first GWAS for body weight and behavior, along with 10 traits related to fur quality in mink.
Conclusions
Combining association results with existing functional information of genes and mammalian phenotype databases, we proposed WWC3, MAP2K4, SLC7A1 and USP22 as candidate genes for body weight and pelt length in mink.
6.
Background
Identifying complexes from PPI networks has become a key problem in elucidating protein functions and identifying signaling and biological processes in a cell. Proteins that bind together as complexes play important roles in life activities. Accurate determination of complexes in PPI networks is crucial for understanding the principles of cellular organization.
Results
We propose a novel method to identify complexes in PPI networks, based on different co-expression information. First, we use the Markov Cluster Algorithm with an edge-weighting scheme to calculate complexes in PPI networks. Then, we propose some significant features, such as graph information and gene expression analysis, to filter and modify the complexes predicted by the Markov Cluster Algorithm. To evaluate our method, we test it on two experimental yeast PPI networks.
Conclusions
On the DIP network, our method has Precision and F-Measure values of 0.6004 and 0.5528. On the MIPS network, our method has F-Measure and Sn values of 0.3774 and 0.3453. Compared with existing methods, our method improves the Precision value by at least 0.1752, the F-Measure value by at least 0.0448, and the Sn value by at least 0.0771. Experiments show that our method achieves better results than some state-of-the-art methods for identifying complexes in PPI networks, with the prediction quality improved in terms of the evaluation criteria.
7.
Background
DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method of identifying DHSs can enhance the understanding of chromatin accessibility. Despite a multitude of resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.
Methods
Here, we address this challenge using an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism for responding to multiple patterns and long-term associations in DNA sequences to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.
Results
Our method achieves an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.
Conclusions
Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.
8.
Ling Bai Wei He Tianpeng Li Cuiting Yang Yingping Zhuang Shu Quan 《Biotechnology letters》2017,39(8):1191-1199
Objective
To investigate the application of the TEM-1 β-lactamase protein fragment complementation assay (PCA) in detecting weak and unstable protein–protein interactions as typically observed during chaperone-assisted protein folding in the periplasm of Escherichia coli.
Results
The TEM-1 β-lactamase PCA system effectively captured the interactions of three pairs of chaperones and substrates. Moreover, the strength of the interactions can be quantitatively analyzed by comparing different levels of penicillin resistance, and the assay can be performed under 0.5% butanol, a stress condition thought to be physiologically relevant.
Conclusions
The β-lactamase PCA system faithfully reports chaperone-substrate interactions in the bacterial cell envelope, and therefore this system has the potential to map the complex protein homeostasis network under a fluctuating environment.
9.
Yuji Sawada Hirokazu Tsukaya Yimeng Li Muneo Sato Kensuke Kawade 《Metabolomics : Official journal of the Metabolomic Society》2017,13(6):75
Introduction
In plant metabolomics, metabolite contents are often normalized by sample weight. However, accurate weighing of very small samples, such as individual Arabidopsis thaliana seeds (approximately 20 µg), is difficult, which may lead to irreproducible results.
Objectives
We aimed to establish alternative normalization methods for seed-grain-based comparative metabolomics of A. thaliana.
Methods
Arabidopsis thaliana seeds were assumed to have a prolate spheroid shape. Using a microscope image of each seed, the lengths of the major and minor axes were measured by fitting the projected 2-dimensional shape of each seed as an ellipse. Metabolic profiles of individual diploid or tetraploid A. thaliana seeds were measured by our highly sensitive protocol (“widely targeted metabolomics”) that uses liquid chromatography coupled with tandem quadrupole mass spectrometry. Mass spectrometric analysis of 1 µL of solution extract identified more than 100 metabolites. The data were normalized by various seed-size measures, including seed volume (single-grain-based analysis). For comparison, metabolites were extracted from 4 mg of diploid and tetraploid A. thaliana seeds and their metabolic profiles were analyzed with normalization by weight (weight-based analysis).
Results
A small number of metabolites showed statistically significant differences in the single-grain-based analysis compared to the weight-based analysis. A total of 17 metabolites showed statistically different accumulation between ploidy types with similar fold changes in both analyses.
Conclusion
Seed-size measures obtained by microscopic imaging were useful for data normalization. Single-grain-based analysis enables evaluation of the metabolism of each seed and elucidates the metabolic profiles of precious bioresources by using small amounts of samples.
10.
Thijs Welle Anna T. Hoekstra Ineke A. J. J. M. Daemen Celia R. Berkers Matheus O. Costa 《Metabolomics : Official journal of the Metabolomic Society》2017,13(7):83
Introduction
Swine dysentery caused by Brachyspira hyodysenteriae is a production-limiting disease in pig farming. Currently, antimicrobial therapy is the only treatment and control method available.
Objective
The aim of this study was to characterize the metabolic response of porcine colon explants to infection by B. hyodysenteriae.
Methods
Porcine colon explants exposed to B. hyodysenteriae were analyzed for histopathological, metabolic and pro-inflammatory gene expression changes.
Results
Significant epithelial necrosis and increased levels of L-citrulline and IL-1α were observed in explants infected with B. hyodysenteriae.
Conclusions
The spirochete induces necrosis in vitro, likely through an inflammatory process mediated by IL-1α and NO.
11.
Yibin Zhuang Jingjie Jiang Huiping Bi Hua Yin Shaowei Liu Tao Liu 《Biotechnology letters》2016,38(4):619-627
Objectives
To produce rosmarinic acid analogues in the recombinant Escherichia coli BLRA1, harboring a 4-coumarate:CoA ligase from Arabidopsis thaliana (At4CL) and a rosmarinic acid synthase from Coleus blumei (CbRAS).
Results
Incubation of the recombinant E. coli strain BLRA1 with exogenously supplied phenyllactic acid (PL) and analogues as acceptor substrates, and coumaric acid and analogues as donor substrates led to production of 18 compounds, including 13 unnatural RA analogues.
Conclusion
This work demonstrates the viability of synthesizing a broad range of rosmarinic acid analogues in E. coli, and sheds new light on the substrate specificity of CbRAS.
12.
Nicholas J. Bond Albert Koulman Julian L. Griffin Zoe Hall 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):128
Introduction
Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.
Objectives
We have developed massPix, an R package for analysing and interpreting data from MSI of lipids in tissue.
Methods
massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.
Results
Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.
Conclusion
massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.
13.
Korey J. Brownstein Mahmoud Gargouri William R. Folk David R. Gang 《Metabolomics : Official journal of the Metabolomic Society》2017,13(11):133
Introduction
Botanicals containing iridoid and phenylethanoid/phenylpropanoid glycosides are used worldwide for the treatment of inflammatory musculoskeletal conditions that are primary causes of human years lived with disability, such as arthritis and lower back pain.
Objectives
We report the analysis of candidate anti-inflammatory metabolites of several endemic Scrophularia species and Verbascum thapsus used medicinally by peoples of North America.
Methods
Leaves, stems, and roots were analyzed by ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) and partial least squares-discriminant analysis (PLS-DA) was performed in MetaboAnalyst 3.0 after processing the datasets in Progenesis QI.
Results
Comparison of the datasets revealed significant and differential accumulation of iridoid and phenylethanoid/phenylpropanoid glycosides in the tissues of the endemic Scrophularia species and Verbascum thapsus.
Conclusions
Our investigation identified several species of pharmacological interest as good sources for harpagoside and other important anti-inflammatory metabolites.
14.
Yinglei Lai 《BMC bioinformatics》2017,18(3):69
Background
The q-value is a widely used statistical method for estimating the false discovery rate (FDR), which is a conventional significance measure in the analysis of genome-wide expression data. The q-value is a random variable and it may underestimate the FDR in practice. An underestimated FDR can lead to unexpected false discoveries in the follow-up validation experiments. This issue has not been well addressed in the literature, especially in situations where a permutation procedure is necessary for p-value calculation.
Results
We proposed a statistical method for the conservative adjustment of the q-value. In practice, it is usually necessary to calculate the p-value by a permutation procedure. This was also considered in our adjustment method. We used simulation data as well as experimental microarray or sequencing data to illustrate the usefulness of our method.
Conclusions
The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of the q-value, particularly in situations where the proportion of differentially expressed genes is small or the overall differential expression signal is weak.
15.
Background
Identification of common genes associated with comorbid diseases can be critical in understanding their pathobiological mechanism. This work presents a novel method to predict missing common genes associated with a disease pair. Searching for missing common genes is formulated as an optimization problem to minimize the network-based module separation of two subgraphs produced by mapping genes associated with each disease onto the interactome.
Results
Using cross validation on more than 600 disease pairs, our method achieves a significantly higher average receiver operating characteristic (ROC) score of 0.95, compared with a baseline ROC score of 0.60 using randomized data.
Conclusion
Prediction of missing common genes aims to complete the gene set associated with comorbid diseases for a better understanding of biological intervention. It will also be useful for gene-targeted therapeutics related to comorbid diseases. This method can be further considered for the prediction of missing edges to complete the subgraph associated with a disease pair.
16.
Objectives
To develop a versatile Trichoderma reesei (teleomorph Hypocrea jecorina) expression system for the high-purity production of heterologous proteins.
Results
The versatile T. reesei expression system is based on the xyn1 and xyn2 promoters, the A824V transition in XYRI, and a bicomponent carbon source strategy. The red fluorescent protein gene rfp and the alkaline endoglucanase EGV gene egv3 from Humicola insolens were used as reporter genes to test our versatile expression system.
Conclusions
The versatile T. reesei expression system can be applied to produce heterologous proteins with high purity and high yield.
17.
18.
Vasantika Suryawanshi Ina N. Talke Michael Weber Roland Eils Benedikt Brors Stephan Clemens Ute Krämer 《BMC genomics》2016,17(13):1034
Background
Gene copy number divergence between species is a form of genetic polymorphism that contributes significantly to both genome size and phenotypic variation. In plants, copy number expansions of single genes were implicated in cultivar- or species-specific tolerance of high levels of soil boron, aluminium or calamine-type heavy metals, respectively. Arabidopsis halleri is a zinc- and cadmium-hyperaccumulating extremophile species capable of growing on heavy-metal contaminated, toxic soils. In contrast, its non-accumulating sister species A. lyrata and the closely related reference model species A. thaliana exhibit merely basal metal tolerance.
Results
For a genome-wide assessment of the role of copy number divergence (CND) in lineage-specific environmental adaptation, we conducted cross-species array comparative genome hybridizations of three plant species and developed a global signal scaling procedure to adjust for sequence divergence. In A. halleri, transition metal homeostasis functions are enriched twofold among the genes detected as copy number expanded. Moreover, biotic stress functions including mostly disease Resistance (R) gene-related genes are enriched twofold among genes detected as copy number reduced, when compared to the abundance of these functions among all genes.
Conclusions
Our results provide genome-wide support for a link between evolutionary adaptation and CND in A. halleri as shown previously for Heavy metal ATPase4. Moreover, our results support the hypothesis that elemental defences, which result from the hyperaccumulation of toxic metals, allow the reduction of classical defences against biotic stress as a trade-off.
19.
The WD-repeat protein superfamily in Arabidopsis: conservation and divergence in structure and function
Background
The WD motif (also known as the Trp-Asp or WD40 motif) is found in a multitude of eukaryotic proteins involved in a variety of cellular processes. Where studied, repeated WD motifs act as a site for protein-protein interaction, and proteins containing WD repeats (WDRs) are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. In the model plant Arabidopsis thaliana, members of this superfamily are increasingly being recognized as key regulators of plant-specific developmental events.
Results
We analyzed the predicted complement of WDR proteins from Arabidopsis, and compared this to those from budding yeast, fruit fly and human to illustrate both conservation and divergence in structure and function. This analysis identified 237 potential Arabidopsis proteins containing four or more recognizable copies of the motif. These were classified into 143 distinct families, 49 of which contained more than one Arabidopsis member. Approximately 113 of these families or individual proteins showed clear homology with WDR proteins from the other eukaryotes analyzed. Where conservation was found, it often extended across all of these organisms, suggesting that many of these proteins are linked to basic cellular mechanisms. The functional characterization of conserved WDR proteins in Arabidopsis reveals that these proteins help adapt basic mechanisms for plant-specific processes.
Conclusions
Our results show that most Arabidopsis WDR proteins are strongly conserved across eukaryotes, including those that have been found to play key roles in plant-specific processes, with diversity in function conferred at least in part by divergence in upstream signaling pathways, downstream regulatory targets and/or structure outside of the WDR regions.
20.
Biswapriya B. Misra Evaldo de Armas Sixue Chen 《Metabolomics : Official journal of the Metabolomic Society》2016,12(4):61